\input texinfo @c -*-texinfo-*-
@c %**start of header
@include version.texi
@settitle ctlseqs @value{VERSION} Manual
@c %**end of header
@copying
This manual is for ctlseqs, a helper library for control sequences.

Copyright @copyright{} 2021 CismonX <>

@quotation
Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.3 or any later
version published by the Free Software Foundation; with no Invariant Sections,
with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the
license is included in the section entitled ``GNU Free Documentation License''.
@end quotation
@end copying
@titlepage
@title ctlseqs
@subtitle Helper Library for Control Sequences, version @value{VERSION}
@author CismonX
@vskip 0pt plus 1filll
@end titlepage
@ifnottex
@node Top
@top ctlseqs
This manual is for ctlseqs, a helper library for control sequences.
Permission is granted to copy, distribute and/or modify this document under the
terms of the @pxref{GNU Free Documentation License}, Version 1.3 or any later
version published by the Free Software Foundation; with no Invariant Sections,
with no Front-Cover Texts, and with no Back-Cover Texts.
ctlseqs is free software. You can redistribute it and/or modify it under the
terms of the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option) any later
version.
@end ifnottex
@menu
* Overview:: Brief overview of ctlseqs.
* Helper Macros:: Helper macros provided by ctlseqs.
* Control Sequence Matching:: Using ctlseqs for matching control sequences.
* Control Sequence Reading:: Using ctlseqs for reading control sequences.
* Tips:: Tips & hints for using ctlseqs.
* Example Programs:: Example programs using ctlseqs.
* API Reference:: C API reference for ctlseqs.
* GNU Free Documentation License:: Copying conditions of this manual.
@end menu
@node Overview
@chapter Overview of ctlseqs
The name ``ctlseqs'' is an abbreviation of ``control sequences'', as defined
in section 5.4 of ECMA-48.
As the name suggests, this library focuses on handling control sequences.
However, it only cares about the bit combinations, while the actual meaning
and implementation of a control sequence is up to the user.
The C API provided by ctlseqs is composed of three major parts: the helper
macros, the control sequence matcher, and the control sequence reader. Any of
them can be used separately or in combination, after including the header file
@code{ctlseqs.h} in a source file.
@menu
* Contributing:: Contributing to ctlseqs.
* Use Scenarios:: When to use ctlseqs.
@end menu
@node Contributing
@section Contributing
We welcome any form of contribution to ctlseqs (as well as this manual),
including bug reports, patches, etc.
As ctlseqs is primarily @url{, hosted on Savannah},
it is recommended to contribute using the bug tracker and patch manager.
Sending an email to @email{} is also a viable option.
@cindex Checklist for bug reports
An effective bug report should contain enough information to reproduce the bug,
which may include:
@itemize @bullet
@item The version number of ctlseqs involved.
@item A minimal code snippet to reproduce the bug.
@item Expected and actual behaviour of the program.
@item A core file for the crashed program.
@item Name of the operating system and hardware.
@end itemize
@cindex Checklist for patch submission
Before you submit a patch for ctlseqs, it is recommended to:
@itemize @bullet
@item Follow the existing coding style.
@item Discuss new features or breaking changes with the community.
@item Write test cases, documentation and changelogs for your code.
@end itemize
@node Use Scenarios
@section Use Scenarios of ctlseqs
Control sequences, as well as other control functions, were once commonly used
in computer terminals. Terminals exchange control information with the host
regarding colors, font styles, cursor position, etc., using control functions
embedded in normal text. Such physical terminals are no longer used today;
however, popular ones like the DEC VT100 are widely emulated by modern terminal
emulators.

The primary purpose of the ctlseqs library is to provide developers with a
simple and easy-to-use API for handling control functions when working on
terminal emulators and text-based programs.

However, since there is no de facto standard, control functions used in
terminals are largely vendor-specific, and terminal emulators like to add their
own private controls. That makes ctlseqs unsuitable for writing text-based
programs which intend to be portable. For such programs, the developer should
stick to ncurses or terminfo instead of raw control codes.
@cindex List of common use cases of ctlseqs
There are still cases when dealing with raw escape sequences is inevitable, and
ctlseqs may come in handy:
@itemize @bullet
@item Developing text-based programs which rely heavily on special control
sequences that are not supported by libraries like ncurses.
@item Implementing a terminal emulator.
@item Experimenting with or debugging the features of text-based programs or
terminal emulators.
@end itemize
@node Helper Macros
@chapter Helper Macros
A helper macro in ctlseqs is a C preprocessor macro representing a control
function, which expands to a C string literal.
@cindex List of control function types in ctlseqs helper macros
The control function can be one of the following three types:
@itemize @bullet
@item Elements from the C0 or C1 set.
@item Control Sequences.
@item Other control functions (such as device control functions).
@end itemize
The name of a helper macro is the function name prefixed with @code{CTLSEQS_}.
For a control function other than elements from the C0 or C1 set, the
corresponding helper macro is a function-like macro which may or may not take
arguments.

Control sequences listed in the helper macros are primarily excerpted from
@url{, XTerm's manual},
which may differ across implementations.
As ctlseqs does not currently support 8-bit controls, 2-character 7-bit codes
from the C1 set are used instead of their 1-character 8-bit representation.
For example, @code{CTLSEQS_CSI} expands to @code{"\x1b["}.
@cindex Helper macro usage example
The following code snippet is an example usage of helper macros:
@example
printf(CTLSEQS_CUP("%d", "%d"), 3, 4);
@end example

Remember that the standard output stream is line buffered when connected to a
terminal. Either call @code{fflush(stdout)} after printing, or disable output
buffering with @code{setvbuf(stdout, NULL, _IONBF, 0)}.
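
The way such macros compose can be sketched without ctlseqs itself. The
following is a minimal illustration, assuming hypothetical macros
@code{MY_CSI} and @code{MY_CUP} which are not the definitions used by ctlseqs.
It relies on adjacent string literals being concatenated by the C compiler, so
that a function-like macro taking string arguments expands to a single literal
which can serve as a @code{printf}-style format string.

@example
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the definitions used by ctlseqs. */
#define MY_CSI           "\x1b["
#define MY_CUP(row, col) MY_CSI row ";" col "H"

/* Formats a cursor position sequence into buf; returns the value
   of snprintf (the length the full sequence needs). */
static int format_cup(char *buf, size_t size, int row, int col)
@{
    return snprintf(buf, size, MY_CUP("%d", "%d"), row, col);
@}
@end example

With these definitions, @code{format_cup(buf, sizeof buf, 3, 4)} writes the six
bytes @code{ESC [ 3 ; 4 H} into @code{buf}.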
@node Control Sequence Matching
@chapter Control Sequence Matching
Given a character string, checking whether it matches a control sequence is
quite trivial, with only the standard C library:
@example
char const *str /* = ... */;
int row, col;
if (0 == strcmp(str, CTLSEQS_XTVERSION())) @{
    // ...
@} else if (2 == sscanf(str, CTLSEQS_CUP("%d", "%d"), &row, &col)) @{
    // ...
@} else /* ... */
@end example
However, as the number of possible matches grows, this naive implementation
becomes less efficient and harder to maintain.
Such problems can be easily solved by using the control sequence matcher
provided by ctlseqs.
The @code{struct ctlseqs_matcher *} type is a pointer to an opaque type which
represents an instance of a control sequence matcher. Before use, the matcher
should be initialized with @code{ctlseqs_matcher_init}. After use, it should
be deallocated with @code{ctlseqs_matcher_free}.
@cindex Control sequence matcher initialization example
@example
struct ctlseqs_matcher *matcher = ctlseqs_matcher_init();
// ...
ctlseqs_matcher_free(matcher);
@end example
On the rare occasions when ctlseqs fails to allocate enough memory,
@code{ctlseqs_matcher_init} may return @code{NULL}. It is nevertheless safe to
pass a null pointer to @code{ctlseqs_matcher_free}, which then does nothing.
@menu
* Matcher Configuration:: Configuring a control sequence matcher
* Matching String:: Matching a string with control sequence matcher
@end menu
@node Matcher Configuration
@section Matcher Configuration
Matcher configuration consists of two parts: the number of matching patterns,
and the patterns themselves. Invoke function @code{ctlseqs_matcher_config} to
configure a matcher.
@cindex Control sequence matcher configuration example
@example
struct ctlseqs_matcher *matcher /* = ... */;
char const *patterns[] = @{
    // ...
@};
struct ctlseqs_matcher_options options = @{
    .patterns = patterns,
    .npatterns = sizeof(patterns) / sizeof(char const *),
@};
int result = ctlseqs_matcher_config(matcher, &options);
// ...
@end example
Each invocation of @code{ctlseqs_matcher_config} on the same matcher overwrites
the data generated from the last invocation. Upon success, the function returns
@code{CTLSEQS_OK}. If the function fails to allocate enough memory, returns
@quotation Caution
If the @code{patterns} field in @code{struct ctlseqs_matcher_options} is
invalid, function behaviour is undefined. See @ref{Patterns} for details.
@end quotation
@menu
* Patterns:: Supported control sequence pattern formats
@end menu
@node Patterns
@subsection Patterns
The @code{patterns} field in @code{struct ctlseqs_matcher_options} is an array
of NUL-terminated strings which indicates the desired patterns of control
functions for the current matcher.
@cindex Control functions supported by the matcher
The following types of control functions are recognizable by the matcher:
@itemize @bullet
@item Control sequences: @code{CSI [param...] [intmd...] final}
@item C1 functions with command string: @code{(APC|DCS|OSC|PM) [cmdstr] ST}
@item Single shifts: @code{(SS2|SS3) ch}
@item SOS function: @code{SOS [chrstr] ST}
@end itemize
According to ECMA-48, CSI parameter bytes are in the range @code{0x30} to
@code{0x3f}, intermediate bytes @code{0x20} to @code{0x2f}, and the final byte
@code{0x40} to @code{0x7e}. A command string consists of printable characters
and characters in the range @code{0x08} to @code{0x0d}. A character string can
be any bit combination which does not represent @code{SOS} or @code{ST}.
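
The byte ranges above translate directly into code. The following is a minimal
sketch of a validity check for a 7-bit control sequence, assuming the sequence
is introduced by the two bytes @code{ESC [}. The function name
@code{is_csi_sequence} is hypothetical and is not part of ctlseqs.

@example
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of ctlseqs: returns true if the len
   bytes at s form exactly one control sequence,
   ESC [ [param...] [intmd...] final. */
static bool is_csi_sequence(char const *s, size_t len)
@{
    unsigned char const *p = (unsigned char const *)s;
    size_t i = 2;

    if (len < 3 || p[0] != 0x1b || p[1] != '[') @{
        return false;
    @}
    /* Parameter bytes: 0x30 to 0x3f. */
    while (i < len && p[i] >= 0x30 && p[i] <= 0x3f) @{
        i++;
    @}
    /* Intermediate bytes: 0x20 to 0x2f. */
    while (i < len && p[i] >= 0x20 && p[i] <= 0x2f) @{
        i++;
    @}
    /* Exactly one final byte, 0x40 to 0x7e, must end the sequence. */
    return i + 1 == len && p[i] >= 0x40 && p[i] <= 0x7e;
@}
@end example

A sequence that ends before its final byte is rejected by this check, whereas
the matcher described in this chapter reports such input as a partial control
function.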
A supported control function, either verbatim or combined with placeholders,
can be specified as a valid pattern. The terminating @code{NUL} character does
not count into the pattern.
@cindex List of supported placeholders
A placeholder indicates that when matching a string against the pattern, the
value at the placeholder's location should conform to its rules. A placeholder
can only take place in the @code{param}, @code{intmd}, @code{cmdstr} or
@code{chrstr} fields, and can be one of the following values:
@itemize @bullet
@item @code{CTLSEQS_PH_NUM}: An unsigned integer.
@item @code{CTLSEQS_PH_NUMS}: Multiple unsigned integers separated with the
semicolon ASCII character (value @code{0x3b}).
@item @code{CTLSEQS_PH_STR}: A string of printable characters.
@item @code{CTLSEQS_PH_CMDSTR}: A string containing only printable characters
and characters of range @code{0x08} to @code{0x0d}.
@item @code{CTLSEQS_PH_CSI_PARAM}: A string of CSI parameter bytes.
@item @code{CTLSEQS_PH_CSI_INTMD}: A string of CSI intermediate bytes.
@item @code{CTLSEQS_PH_HEXNUM}: A string representing a hexadecimal number.
@item @code{CTLSEQS_PH_CHRSTR}: A string of any bit combination which does not
represent @code{SOS} or @code{ST}.
@end itemize
@cindex Control sequence matcher pattern example
The following code is a valid example of patterns:
@example
const char *patterns[] = @{
    // ...
@};
@end example
@node Matching String
@section Matching String
Function @code{ctlseqs_match} matches a given character string against a matcher.
The function accepts four arguments: the matcher, the string to match, length
of the string to match, and a buffer which stores the match result.
Before matching, a buffer which is large enough to store the match result
should be allocated. The buffer is an array of @code{union ctlseqs_value},
whose definition is shown below:
@example
union ctlseqs_value @{
    char const    *str;
    size_t         len;
    unsigned long  num;
@};
@end example
If the string contains a recognizable control function, or part of a control
function which is not yet terminated by the end of the string, the length of
the control function is stored in the @code{len} field of the match result
buffer at offset 0, and a pointer to the first character of the control
function in the @code{str} field at offset 1.

If @code{ctlseqs_match} fails to find any control function, it returns
@code{CTLSEQS_NESEQ}. For a partial control function, it returns
@code{CTLSEQS_PARTIAL}. If the matcher is not configured with a pattern
matching the control function, it returns @code{CTLSEQS_NOMATCH}.

If the control function matches a pattern configured in the matcher, the
function returns the offset of the matched pattern, and stores the extracted
values in the result buffer according to each of the placeholders, starting
from offset 2:
@itemize @bullet
@item @code{CTLSEQS_PH_NUM}, @code{CTLSEQS_PH_HEXNUM}: The integer is stored in
field @code{num}.
@item @code{CTLSEQS_PH_NUMS}: The number of integers is stored in field
@code{len}, followed by that many integers stored in field @code{num}.
@item The remaining placeholders, which match strings: The
length of the string is stored in field @code{len}, followed by a pointer to
the first character of the string stored in field @code{str}.
@end itemize
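
The layout described above can be illustrated without calling into ctlseqs.
The following sketch uses a stand-in union with the same fields as
@code{union ctlseqs_value}, and a hypothetical helper, @code{sum_nums}, which
walks the slots produced for one @code{CTLSEQS_PH_NUMS} placeholder: a count in
@code{len}, followed by that many values in @code{num}. It only illustrates the
layout and is not code from ctlseqs.

@example
#include <stddef.h>

/* Stand-in with the same fields as union ctlseqs_value. */
union value @{
    char const    *str;
    size_t         len;
    unsigned long  num;
@};

/* Hypothetical helper: sums the integers of one NUMS entry starting
   at slot *pos, then advances *pos past the entry. */
static unsigned long sum_nums(union value const *buf, size_t *pos)
@{
    size_t count = buf[(*pos)++].len;
    unsigned long sum = 0;

    for (size_t i = 0; i < count; i++) @{
        sum += buf[(*pos)++].num;
    @}
    return sum;
@}
@end example

Starting with @code{pos} set to 2, the first slot after the two header entries,
the helper leaves @code{pos} pointing at the slot of the next placeholder.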
The following code is an example of invoking @code{ctlseqs_match}:
@example
struct ctlseqs_matcher *matcher /* = ... */;
// ...
union ctlseqs_value buffer[4];
char const *str = "foo" CTLSEQS_CUP("2", "4");
size_t str_len = sizeof("foo" CTLSEQS_CUP("2", "4")) - 1;
ssize_t result = ctlseqs_match(matcher, str, str_len, buffer);
assert(result == 0);
assert(buffer[0].len == 6 && buffer[1].str == str + 3);
assert(buffer[2].num == 2 && buffer[3].num == 4);
@end example
Function @code{ctlseqs_match} accepts @code{NULL} for argument
@code{matcher}, in which case it behaves as if a matcher configured with zero
patterns were provided.
@quotation Caution
If the given string can match multiple patterns in the matcher, it is
unspecified which of them will be the final match result.
@end quotation
@node Control Sequence Reading
@chapter Control Sequence Reading
@node Tips
@chapter Tips & Hints
@node Example Programs
@chapter Example Programs
@node API Reference
@appendix API Reference
This appendix contains a complete list of the functions exposed by ctlseqs,
intended as a quick reference. See the corresponding man(3) pages for full
details.
Initialize matcher:

@example
struct ctlseqs_matcher *ctlseqs_matcher_init(void);
@end example

Configure matcher:

@example
int ctlseqs_matcher_config(
    struct ctlseqs_matcher               *matcher,
    struct ctlseqs_matcher_options const *options);
@end example

Match string:

@example
ssize_t ctlseqs_match(
    struct ctlseqs_matcher const *matcher,
    char const                   *str,
    size_t                        str_len,
    union ctlseqs_value          *result);
@end example

Destroy matcher:

@example
void ctlseqs_matcher_free(struct ctlseqs_matcher *matcher);
@end example

Initialize reader:

@example
struct ctlseqs_reader *ctlseqs_reader_init(void);
@end example

Configure reader:

@example
int ctlseqs_reader_config(
    struct ctlseqs_reader               *reader,
    struct ctlseqs_reader_options const *options);
@end example

Read and match:

@example
ssize_t ctlseqs_read(
    struct ctlseqs_reader        *reader,
    struct ctlseqs_matcher const *matcher,
    int                           timeout);
@end example

Purge reader:

@example
void ctlseqs_purge(
    struct ctlseqs_reader *reader,
    size_t                 nbytes);
@end example

Destroy reader:

@example
void ctlseqs_reader_free(struct ctlseqs_reader *reader);
@end example
@node GNU Free Documentation License
@appendix GNU Free Documentation License
@include fdl.texi