\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename ctlseqs.info
@include version.texi
@settitle ctlseqs @value{VERSION}
@c %**end of header

@copying
This manual is for ctlseqs, a helper library for control sequences.

Copyright @copyright{} 2021,2022 CismonX

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
Texts.  A copy of the license is included in the section entitled
``GNU Free Documentation License''.
@end quotation
@end copying

@titlepage
@title ctlseqs
@subtitle Helper Library for Control Sequences, version @value{VERSION}
@author CismonX
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@summarycontents
@contents

@ifnottex
@node Top
@top ctlseqs

This manual is for ctlseqs, a helper library for control sequences.

Permission is granted to copy, distribute and/or modify this document
under the terms of the @ref{GNU Free Documentation License}, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
Texts.

ctlseqs is free software.  You can redistribute it and/or modify it under the
terms of the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option) any later
version.
@end ifnottex

@menu
* Overview::                    Brief overview of ctlseqs.
* Helper Macros::               Helper macros provided by ctlseqs.
* Control Sequence Matching::   Using ctlseqs for matching control sequences.
* Control Sequence Reading::    Using ctlseqs for reading control sequences.
* Tips::                        Tips & hints for using ctlseqs.
* Example Programs::            Example programs using ctlseqs.

Appendices

* API Reference::               C API reference for ctlseqs.
* General Index::               Index of general concepts of ctlseqs.
* GNU Free Documentation License::  Copying conditions of this manual.
@end menu

@node Overview
@chapter Overview of ctlseqs

The name ``ctlseqs'' is an abbreviation of ``control sequences'', as defined in
section 5.4 of ECMA-48.

As the name suggests, this library focuses on handling control sequences.
However, it only cares about the bit combinations, while the actual meaning
and implementation of a control sequence is up to the user.

The C API provided by ctlseqs is composed of three major parts: the helper
macros, the control sequence matcher, and the control sequence reader.
Each of them can be used separately or in combination with the others, after
including the header file @code{ctlseqs.h} in a source file.

@menu
* Contributing::                Contributing to ctlseqs.
* Use Scenarios::               When to use ctlseqs.
@end menu

@node Contributing
@section Contributing
@set ctlseqs-repo-url https://savannah.nongnu.org/projects/ctlseqs

We welcome any form of contribution to ctlseqs (as well as this manual),
including bug reports, patches, etc.

Source code of ctlseqs is @url{@value{ctlseqs-repo-url}, hosted on Savannah}.
You can contribute to ctlseqs using the bug tracker and patch manager,
or discuss with the community using the mailing lists.

@node Use Scenarios
@section Use Scenarios of ctlseqs

Control sequences, as well as other control functions, were once commonly used
in computer terminals.  Terminals exchange control information with the host
regarding colors, font styles, cursor position, etc., using control functions
embedded in normal text.

Such physical terminals are no longer used today; however, popular ones like
the DEC VT100 are widely emulated by modern terminal emulators.

The primary purpose of the ctlseqs library is to provide developers with
a simple and easy-to-use API for handling control functions when working on
terminal emulators and text-based programs.
However, since there is no de facto standard, control functions used in
terminals are largely vendor-specific, and terminal emulators like to add
their own private controls.  That makes ctlseqs unsuitable for writing
text-based programs which are intended to be portable.  Instead of using raw
control codes, developers of such programs should stick to ncurses or terminfo.

@cindex List of common use cases of ctlseqs
There are still cases where dealing with raw escape sequences is inevitable,
and ctlseqs may come in handy:

@itemize @bullet
@item
Development of text-based programs which rely heavily on special control
sequences that are not supported by libraries like ncurses.
@item
Implementing a terminal emulator.
@item
Experimenting with or debugging the features of text-based programs or
terminal emulators.
@end itemize

@node Helper Macros
@chapter Helper Macros

ctlseqs provides C preprocessor macros representing control functions,
which expand to C string literals.

@cindex List of control function types in ctlseqs helper macros
The control function can be one of the following three types:

@itemize @bullet
@item
Elements from the C0 or C1 set.
@item
Control sequences.
@item
Other control functions (such as device control functions).
@end itemize

The name of a helper macro is the control function name with @code{CTLSEQS_}
as prefix.  For a control function other than elements from the C0 or C1 set,
the corresponding helper macro is a function-like macro which may or may not
take arguments.

Control sequences listed in the helper macros are primarily excerpted from
@url{https://invisible-island.net/xterm/ctlseqs/ctlseqs.html, XTerm's manual},
which may differ across implementations.

As ctlseqs does not currently support 8-bit controls, 2-character 7-bit codes
from the C1 set are used instead of their 1-character 8-bit representation.
For example, @code{CTLSEQS_CSI} expands to @code{"\x1b["}.
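
Because the macros expand to string literals, an argument such as
@code{"%d"} is joined to the surrounding literals by ordinary C string
concatenation.  The following is only a minimal sketch of that mechanism.
The @code{DEMO_} macros and the function @code{demo_format_cup} are
hypothetical stand-ins written for this illustration; they are not the
definitions found in @code{ctlseqs.h}, which may differ.

```c
#include <stdio.h>

/* Hypothetical stand-ins for illustration only; NOT taken from ctlseqs.h. */
#define DEMO_CSI            "\x1b["
#define DEMO_CUP(row, col)  DEMO_CSI row ";" col "H"

/*
 * Writes a cursor position (CUP) sequence for the given row and column
 * into buf, and returns the value reported by snprintf.
 */
int demo_format_cup(char *buf, size_t size, int row, int col)
{
    /* Adjacent literals are concatenated, giving the format "\x1b[%d;%dH". */
    return snprintf(buf, size, DEMO_CUP("%d", "%d"), row, col);
}
```

Passing string literals rather than integers is what allows the same macro to
serve both as a @code{printf} format and as a fixed sequence such as
@code{DEMO_CUP("3", "4")}.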
@cindex Helper macro usage example
The following code snippet is an example usage of helper macros:

@example
printf(CTLSEQS_BEL);
printf(CTLSEQS_XTVERSION());
printf(CTLSEQS_CUP("%d", "%d"), 3, 4);
@end example

Keep in mind that the standard output stream is by default line buffered
in a terminal.  In order to make the control functions take effect immediately,
either call @code{fflush(stdout)} after printing, or disable output buffering
with @code{setvbuf(stdout, NULL, _IONBF, 0)}.

@node Control Sequence Matching
@chapter Control Sequence Matching

Given a character string, checking whether it matches a control sequence is
quite trivial with only the standard C library:

@example
char const *str /* = ... */;
int row, col;
if (0 == strcmp(str, CTLSEQS_XTVERSION())) @{
    // ...
@} else if (2 == sscanf(str, CTLSEQS_CUP("%d", "%d"), &row, &col)) @{
    // ...
@} else @{
    // ...
@}
@end example

However, as the number of patterns grows, this naive implementation becomes
less efficient and harder to maintain.  This problem can be easily solved
by using the control sequence matcher provided by ctlseqs.

The @code{struct ctlseqs_matcher *} is a pointer to an opaque type which
represents an instance of control sequence matcher.  Before use, the matcher
should be initialized with @code{ctlseqs_matcher_init}.  After use, it should
be deallocated with @code{ctlseqs_matcher_free}.

@cindex Control sequence matcher initialization example
@example
struct ctlseqs_matcher *matcher = ctlseqs_matcher_init();
// ...
ctlseqs_matcher_free(matcher);
@end example

On the rare occasions when ctlseqs fails to allocate enough memory, function
@code{ctlseqs_matcher_init} may return @code{NULL}.  It is safe to pass a
null pointer to @code{ctlseqs_matcher_free}.
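
Besides efficiency and maintainability, the naive @code{sscanf} approach shown
at the start of this chapter has a correctness pitfall: the return value of
@code{sscanf} only counts completed conversions, so it cannot tell whether the
final byte of the sequence was present.  The sketch below shows one way to
check this with the standard @code{%n} directive.  The macro @code{DEMO_CUP}
and the function @code{demo_parse_cup} are hypothetical names introduced for
this illustration and are not part of ctlseqs.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for illustration only; NOT taken from ctlseqs.h. */
#define DEMO_CUP(row, col)  "\x1b[" row ";" col "H"

/*
 * Returns 1 and fills *row and *col only if str consists of exactly one
 * cursor position sequence; returns 0 otherwise.
 */
int demo_parse_cup(char const *str, int *row, int *col)
{
    int end = -1;

    /*
     * %n stores the number of characters consumed so far.  It is only
     * reached if everything before it, including the final "H", matched,
     * so end stays -1 for a truncated sequence.
     */
    if (sscanf(str, DEMO_CUP("%d", "%d") "%n", row, col, &end) != 2) {
        return 0;
    }
    /* Reject truncated input and input with trailing bytes. */
    return end >= 0 && (size_t)end == strlen(str);
}
```

Note that @code{%d} still accepts leading white space and a sign, so this
remains a loose check rather than a strict parser.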
@menu
* Matcher Configuration::       Configuring a control sequence matcher
* Matching String::             Matching a string with control sequence matcher
@end menu

@node Matcher Configuration
@section Matcher Configuration

Matcher configuration consists of two parts: the number of matching patterns,
and the pattern values.  Invoke function @code{ctlseqs_matcher_config} to
configure a matcher.

@cindex Control sequence matcher configuration example
@example
struct ctlseqs_matcher *matcher /* = ... */;
char const *patterns[] = @{
    // ...
@};
struct ctlseqs_matcher_options options = @{
    .patterns = patterns,
    .npatterns = sizeof(patterns) / sizeof(char const *),
@};
int result = ctlseqs_matcher_config(matcher, &options);
// ...
@end example

Each invocation of @code{ctlseqs_matcher_config} on the same matcher
overwrites the data generated by the previous invocation.

Upon success, the function returns @code{CTLSEQS_OK}.  If the function
fails to allocate enough memory, it returns @code{CTLSEQS_NOMEM}.

@quotation Caution
If the @code{patterns} field in @code{struct ctlseqs_matcher_options} is
invalid, the behaviour of the function is undefined.  See @ref{Patterns} for
details.
@end quotation

@menu
* Patterns::                    Supported control sequence pattern formats
@end menu

@node Patterns
@subsection Patterns

The @code{patterns} field in @code{struct ctlseqs_matcher_options} is an array
of NUL-terminated strings which indicate the desired patterns of control
functions for the current matcher.

@cindex Control functions supported by the matcher
The following types of control functions are recognized by the matcher:

@itemize @bullet
@item
Control sequences: @code{CSI [param...] [intmd...]
final}
@item
C1 functions with command string: @code{(APC|DCS|OSC|PM) [cmdstr] ST}
@item
Single shifts: @code{(SS2|SS3) ch}
@item
SOS function: @code{SOS [chrstr] ST}
@end itemize

According to ECMA-48, CSI parameter bytes are in the range @code{0x30} to
@code{0x3f}, intermediate bytes @code{0x20} to @code{0x2f}, and the final byte
@code{0x40} to @code{0x7e}.  A command string consists of printable characters
and characters in the range @code{0x08} to @code{0x0d}.  A character string
can be any bit combination which does not represent @code{SOS} or @code{ST}.

A supported control function, either verbatim or combined with placeholders,
can be specified as a valid pattern.  The terminating @code{NUL} character
is not part of the pattern.

@cindex List of supported placeholders
A placeholder indicates that when matching a string against the pattern,
the value at the placeholder's location should conform to its rules.
A placeholder can only appear in the @code{param}, @code{intmd},
@code{cmdstr} or @code{chrstr} fields, and can be one of the following values:

@itemize @bullet
@item
@code{CTLSEQS_PH_NUM}: An unsigned integer.
@item
@code{CTLSEQS_PH_NUMS}: Multiple unsigned integers separated by semicolons
(ASCII value @code{0x3b}).
@item
@code{CTLSEQS_PH_STR}: A string of printable characters.
@item
@code{CTLSEQS_PH_CMDSTR}: A string containing only printable characters and
characters in the range @code{0x08} to @code{0x0d}.
@item
@code{CTLSEQS_PH_CSI_PARAM}: A string of CSI parameter bytes.
@item
@code{CTLSEQS_PH_CSI_INTMD}: A string of CSI intermediate bytes.
@item
@code{CTLSEQS_PH_HEXNUM}: A string representing a hexadecimal number.
@item
@code{CTLSEQS_PH_CHRSTR}: A string of any bit combination which does not
represent @code{SOS} or @code{ST}.
@end itemize

@cindex Control sequence matcher pattern example
The following code is a valid example of patterns:

@example
const char *patterns[] = @{
    CTLSEQS_CUP(CTLSEQS_PH_NUM, CTLSEQS_PH_NUM),
    CTLSEQS_XTVERSION(),
    CTLSEQS_DECRQM("1000"),
    // ...
@};
@end example

@node Matching String
@section Matching String

Function @code{ctlseqs_match} matches a given character string against a
matcher.  The function accepts four arguments: the matcher, the string to
match, the length of the string to match, and a buffer which stores the match
result.

Before matching, a buffer which is large enough to store the match result
should be allocated.  The buffer is an array of @code{union ctlseqs_value},
whose definition is shown below:

@example
union ctlseqs_value @{
    char const    *str;
    size_t         len;
    unsigned long  num;
@};
@end example

If the string contains a recognizable control function, or part of a control
function which is not yet terminated by the end of the string, the length of
the control function is stored in the @code{len} field of the match result
buffer at offset 0, and the pointer to the first character of the control
function is stored in the @code{str} field at offset 1.

If @code{ctlseqs_match} fails to find any control function, it returns
@code{CTLSEQS_NOSEQ}.  For a partial control function, it returns
@code{CTLSEQS_PARTIAL}.  If the matcher is not configured with a pattern
matching the control function, it returns @code{CTLSEQS_NOMATCH}.

If the control function matches a pattern configured in the matcher, the
function returns the offset of the matched pattern, and stores the extracted
values in the result buffer according to each of the placeholders, starting
from offset 2:

@itemize @bullet
@item
@code{CTLSEQS_PH_NUM}, @code{CTLSEQS_PH_HEXNUM}: The integer is stored in
field @code{num}.
@item
@code{CTLSEQS_PH_NUMS}: The number of integers is stored in field @code{len},
followed by that many integers stored in field @code{num}.
@item
@code{CTLSEQS_PH_CSI_PARAM}, @code{CTLSEQS_PH_CSI_INTMD},
@code{CTLSEQS_PH_STR}, @code{CTLSEQS_PH_CMDSTR}, @code{CTLSEQS_PH_CHRSTR}:
The length of the string is stored in field @code{len}, followed by a pointer
to the first character of the string stored in field @code{str}.
@end itemize

The following code is an example of invoking @code{ctlseqs_match}:

@example
union ctlseqs_value buffer[4];
struct ctlseqs_matcher *matcher /* = ... */;
// ...
char const *str = "foo" CTLSEQS_CUP("2", "4");
size_t str_len = sizeof("foo" CTLSEQS_CUP("2", "4")) - 1;
ssize_t result = ctlseqs_match(matcher, str, str_len, buffer);
assert(result == 0);
assert(buffer[0].len == 6 && buffer[1].str == str + 3);
assert(buffer[2].num == 2 && buffer[3].num == 4);
@end example

Function @code{ctlseqs_match} allows a @code{NULL} value for argument
@code{matcher}, in which case it behaves as if a matcher configured with zero
patterns were provided.

@quotation Caution
If the given string can match multiple patterns in the matcher, it is
unspecified which one of them will be the final match result.
@end quotation

@node Control Sequence Reading
@chapter Control Sequence Reading

In practice, control sequences are often read from a stream (e.g. standard
input).  The ctlseqs library provides a control sequence reader utility, which
reads data from a file descriptor, buffers it, and matches the data against
the patterns specified in the given matcher.

Like the matcher, @code{struct ctlseqs_reader *} is a pointer to an opaque type
which represents an instance of control sequence reader.  It is initialized
with @code{ctlseqs_reader_init} and deallocated with
@code{ctlseqs_reader_free} after use.  It is safe to pass a null pointer to
@code{ctlseqs_reader_free}.

@cindex Control sequence reader initialization example
@example
struct ctlseqs_reader *reader = ctlseqs_reader_init();
// ...
ctlseqs_reader_free(reader);
@end example

@node Reader Configuration
@section Reader Configuration

Options of a control sequence reader can be set with function
@code{ctlseqs_reader_config}.

@cindex Control sequence reader configuration example
@example
union ctlseqs_value buffer[4];
struct ctlseqs_reader *reader /* = ... */;
struct ctlseqs_reader_options options = @{
    .result = buffer,
    .maxlen = 1024,
    .fd = STDIN_FILENO,
    .flags = 0,
@};
int result = ctlseqs_reader_config(reader, &options);
// ...
@end example

In @code{struct ctlseqs_reader_options}:

@itemize @bullet
@item
Field @code{result} is the buffer where match results are stored, as defined
in @ref{Matching String}.
@item
Field @code{maxlen} is the maximum length of a control sequence to process
before the reader gives up reading and returns an error.
@item
Field @code{fd} is the file descriptor to read from.
@item
Field @code{flags} is a bitmask of boolean options that affect the reader's
behaviour.  See @ref{Reader Flags} for details.
@end itemize

Upon success, the function returns @code{CTLSEQS_OK}.  If the function fails
to allocate enough memory (for the internal read buffer), it returns
@code{CTLSEQS_NOMEM}.  If @code{maxlen} is set to a value smaller than the
size of the data currently held in the internal read buffer, it returns
@code{CTLSEQS_ERROR}.

@node Reader Flags
@subsection Reader Flags

@node Tips
@chapter Tips & Hints

@node Example Programs
@chapter Example Programs

In the source code repository of ctlseqs, there are a few programs that
demonstrate the basic usage of ctlseqs.

See ``example/sixdraw.c'' for a simple TUI program in which you can draw lines
on the terminal window using a mouse.

See ``tests/tcsgrep.c'' for a program which parses control functions into
human-readable text.  It was originally written as a tool to run tests for
ctlseqs, but can also be considered a crude alternative to
@url{https://www.gnu.org/software/teseq/, GNU Teseq}.
@node API Reference
@appendix API Reference
@set url-man-pages https://nongnu.org/ctlseqs/man-pages/man3

This appendix contains a complete list of functions exposed by ctlseqs.
See the corresponding man pages for details.

@url{@value{url-man-pages}/ctlseqs_matcher_init.3.html, Initialize matcher}:

@example
struct ctlseqs_matcher *ctlseqs_matcher_init(void);
@end example

@url{@value{url-man-pages}/ctlseqs_matcher_config.3.html, Configure matcher}:

@example
int ctlseqs_matcher_config(
    struct ctlseqs_matcher               *matcher,
    struct ctlseqs_matcher_options const *options
);
@end example

@url{@value{url-man-pages}/ctlseqs_match.3.html, Match string}:

@example
ssize_t ctlseqs_match(
    struct ctlseqs_matcher const *matcher,
    char const                   *str,
    size_t                        str_len,
    union ctlseqs_value          *result
);
@end example

@url{@value{url-man-pages}/ctlseqs_matcher_free.3.html, Destroy matcher}:

@example
void ctlseqs_matcher_free(
    struct ctlseqs_matcher *matcher
);
@end example

@url{@value{url-man-pages}/ctlseqs_reader_init.3.html, Initialize reader}:

@example
struct ctlseqs_reader *ctlseqs_reader_init(void);
@end example

@url{@value{url-man-pages}/ctlseqs_reader_config.3.html, Configure reader}:

@example
int ctlseqs_reader_config(
    struct ctlseqs_reader               *reader,
    struct ctlseqs_reader_options const *options
);
@end example

@url{@value{url-man-pages}/ctlseqs_read.3.html, Read and match}:

@example
ssize_t ctlseqs_read(
    struct ctlseqs_reader        *reader,
    struct ctlseqs_matcher const *matcher,
    int                           timeout
);
@end example

@url{@value{url-man-pages}/ctlseqs_purge.3.html, Purge reader}:

@example
void ctlseqs_purge(
    struct ctlseqs_reader *reader,
    size_t                 nbytes
);
@end example

@url{@value{url-man-pages}/ctlseqs_reader_free.3.html, Destroy reader}:

@example
void ctlseqs_reader_free(
    struct ctlseqs_reader *reader
);
@end example

@node General Index
@appendix General Index

@printindex cp

@node GNU Free Documentation License
@appendix GNU Free Documentation License

@include fdl.texi

@bye