Character handling functions
This header declares a set of functions to classify and transform individual characters.
All of these functions take the int equivalent of one character as their parameter and return an int. The returned value is either another character or a boolean result: 0 means false, and any nonzero value means true. The argument must be representable as an unsigned char or be equal to EOF; any other value causes undefined behavior.
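The calling convention described above can be sketched as follows. The helper names `is_digit_char` and `to_upper_char` are hypothetical and not part of the header. The cast to unsigned char is there because plain char may be signed on some platforms, and a negative value other than EOF is not a valid argument.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: lets callers pass a plain char to isdigit.
   The cast to unsigned char keeps the argument in the valid range
   even when char is signed and holds a negative value. */
bool is_digit_char(char c)
{
    /* The result is only guaranteed to be zero or nonzero,
       so compare against 0 rather than against 1. */
    return isdigit((unsigned char)c) != 0;
}

/* Hypothetical helper: toupper returns an int that holds a character,
   so the result is converted back to char. */
char to_upper_char(char c)
{
    return (char)toupper((unsigned char)c);
}
```

Comparing the result with `!= 0` rather than `== 1` matters because an implementation is free to return any nonzero value for true.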
There are two sets of functions:
First, a set of classifying functions that check whether the character passed as a parameter belongs to a certain category. These are:
isalnum | Check if character is alphanumeric (function) |
isalpha | Check if character is alphabetic (function) |
iscntrl | Check if character is a control character (function) |
isdigit | Check if character is decimal digit (function) |
isgraph | Check if character has graphical representation (function) |
islower | Check if character is lowercase letter (function) |
isprint | Check if character is printable (function) |
ispunct | Check if character is a punctuation character (function) |
isspace | Check if character is white-space (function) |
isupper | Check if character is uppercase letter (function) |
isxdigit | Check if character is hexadecimal digit (function) |
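As an illustration of the classifying functions listed above, here is a minimal sketch that tallies the characters of a string by category. The `char_counts` type and the `count_classes` function are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical tally type, not part of the header. */
struct char_counts {
    size_t letters;
    size_t digits;
    size_t spaces;
    size_t punct;
};

/* Counts how many characters of s fall into each category.
   Characters matching none of the four tests (for example other
   control codes) are ignored. */
struct char_counts count_classes(const char *s)
{
    struct char_counts n = {0, 0, 0, 0};
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        unsigned char c = (unsigned char)*s;  /* keep the argument valid */
        if (isalpha(c))
            n.letters++;
        else if (isdigit(c))
            n.digits++;
        else if (isspace(c))
            n.spaces++;
        else if (ispunct(c))
            n.punct++;
    }
    return n;
}
```

In the default "C" locale these four tests do not overlap, so the order of the `else if` chain does not change the counts.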
And secondly, two functions to convert between letter cases:
tolower | Convert uppercase letter to lowercase (function) |
toupper | Convert lowercase letter to uppercase (function) |
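The two conversion functions above might be used as in the following sketch. The names `upper_in_place` and `equal_ignore_case` are hypothetical. Characters that have no counterpart in the other case, such as digits and punctuation, are returned unchanged.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: converts a NUL-terminated string to uppercase
   in place. Non-letters are left as they are. */
void upper_in_place(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s)
        *s = (char)toupper((unsigned char)*s);
}

/* Hypothetical helper: returns 1 if a and b are equal when letter case
   is ignored, 0 otherwise. */
int equal_ignore_case(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == *b;
}
```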
For the first set, here is a map of how each function classifies the original 128-character ASCII set (an x indicates that the function returns true for that character):
ASCII values | characters | iscntrl | isspace | isupper | islower | isalpha | isdigit | isxdigit | isalnum | ispunct | isgraph | isprint |
0x00 .. 0x08 | NUL, (other control codes) | x | | | | | | | | | | |
0x09 .. 0x0D | (white-space control codes: '\t','\n','\v','\f','\r') | x | x | | | | | | | | | |
0x0E .. 0x1F | (other control codes) | x | | | | | | | | | | |
0x20 | space (' ') | | x | | | | | | | | | x |
0x21 .. 0x2F | !"#$%&'()*+,-./ | | | | | | | | | x | x | x |
0x30 .. 0x39 | 0123456789 | | | | | | x | x | x | | x | x |
0x3A .. 0x40 | :;<=>?@ | | | | | | | | | x | x | x |
0x41 .. 0x46 | ABCDEF | | | x | | x | | x | x | | x | x |
0x47 .. 0x5A | GHIJKLMNOPQRSTUVWXYZ | | | x | | x | | | x | | x | x |
0x5B .. 0x60 | [\]^_` | | | | | | | | | x | x | x |
0x61 .. 0x66 | abcdef | | | | x | x | | x | x | | x | x |
0x67 .. 0x7A | ghijklmnopqrstuvwxyz | | | | x | x | | | x | | x | x |
0x7B .. 0x7E | {|}~ | | | | | | | | | x | x | x |
0x7F | (DEL) | x | | | | | | | | | | |
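Several relationships between the columns of the table can be checked mechanically. The sketch below assumes the default "C" locale and an ASCII execution character set; `count_table_violations` is a hypothetical name, and the function returns how many codes in 0x00 .. 0x7F break one of the relationships.

```c
#include <ctype.h>

/* Hypothetical checker: counts the codes in 0x00..0x7F that break one of
   the column relationships visible in the table. Assumes the default "C"
   locale and an ASCII execution character set. */
int count_table_violations(void)
{
    int violations = 0;
    for (int c = 0x00; c <= 0x7F; ++c) {
        /* Normalize every result to 0 or 1 before comparing. */
        int upper  = isupper(c)  != 0;
        int lower  = islower(c)  != 0;
        int alpha  = isalpha(c)  != 0;
        int digit  = isdigit(c)  != 0;
        int xdigit = isxdigit(c) != 0;
        int alnum  = isalnum(c)  != 0;
        int punct  = ispunct(c)  != 0;
        int graph  = isgraph(c)  != 0;
        int print  = isprint(c)  != 0;

        if (alpha != (upper || lower))    ++violations; /* letters are upper or lower */
        if (alnum != (alpha || digit))    ++violations; /* alnum = letters + digits */
        if (xdigit && !alnum)             ++violations; /* hex digits are alphanumeric */
        if (graph != (print && c != ' ')) ++violations; /* graph = print minus space */
        if (punct != (graph && !alnum))   ++violations; /* punct = graph minus alnum */
    }
    return violations;
}
```

Under the stated assumptions the function is expected to return 0; a different locale or character set may give another result.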
The characters in the extended character set (above 0x7F) may belong to diverse categories depending on the locale and the platform. As a general rule, ispunct, isgraph and isprint return true on these for the standard C locale on most platforms supporting extended character sets.
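Because the classification of codes above 0x7F varies, it can be useful to inspect what the current platform reports, or to bypass the locale entirely when only ASCII letters should match. The sketch below shows both; the names `count_printable_high_bytes` and `is_ascii_letter` are hypothetical, and the second helper assumes an ASCII-compatible character set in which the letters are contiguous.

```c
#include <ctype.h>

/* Hypothetical helper: counts the codes 0x80..0xFF that isprint accepts
   under whatever locale is currently in effect. The result depends on
   the locale and the platform. */
int count_printable_high_bytes(void)
{
    int n = 0;
    for (int c = 0x80; c <= 0xFF; ++c)
        if (isprint(c))
            ++n;
    return n;
}

/* Hypothetical helper: a locale-independent test that accepts only the
   52 ASCII letters. Assumes an ASCII-compatible character set, where
   'A'..'Z' and 'a'..'z' are contiguous ranges. */
int is_ascii_letter(int c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}
```

No expected count is given for `count_printable_high_bytes`, since it differs between platforms and locales.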