D.10.3 Conversions
The following functions convert between the two string representations.
- Function: int utf8_mbtowc_internal (void *data, int (*read) (void*), unsigned int *pwc)
Internal function for converting a single UTF-8 character to a corresponding wide character representation. The character to convert is obtained by calling the function pointed to by read with data as its only argument. If that call returns a non-positive value, the function sets errno to ‘ENODATA’ and returns -1.
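The callback-driven decoding described above can be sketched as follows. This is an illustration only, not the library's implementation: `struct byte_source`, `byte_source_read` and `sketch_mbtowc_cb` are hypothetical names, and the sketch assumes that read returns one byte per call and that a successful conversion returns the number of bytes consumed (the text does not state the success value). It also does not reject overlong encodings.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical byte source: a cursor over an in-memory buffer. */
struct byte_source {
    const unsigned char *p;   /* next byte to hand out */
    size_t left;              /* bytes remaining */
};

/* Callback in the shape the text describes: it receives only the
   opaque data pointer.  ASSUMPTION: it returns the next byte as a
   positive int, or a non-positive value when no data is left. */
static int
byte_source_read(void *data)
{
    struct byte_source *src = data;
    if (src->left == 0)
        return -1;
    src->left--;
    return *src->p++;
}

/* Sketch of a decoder that pulls its input through the callback.
   ASSUMPTION: returns the number of bytes consumed on success. */
static int
sketch_mbtowc_cb(void *data, int (*read)(void *), unsigned int *pwc)
{
    int c = read(data);
    unsigned int wc;
    int extra;                /* continuation bytes still expected */

    if (c <= 0) {             /* documented behaviour: no data */
        errno = ENODATA;
        return -1;
    }
    if (c < 0x80) {
        wc = (unsigned int) c;
        extra = 0;
    } else if (c <= 0xFF && (c & 0xE0) == 0xC0) {
        wc = c & 0x1F;
        extra = 1;
    } else if (c <= 0xFF && (c & 0xF0) == 0xE0) {
        wc = c & 0x0F;
        extra = 2;
    } else if (c <= 0xFF && (c & 0xF8) == 0xF0) {
        wc = c & 0x07;
        extra = 3;
    } else {                  /* not a valid lead byte */
        errno = EILSEQ;
        return -1;
    }

    for (int i = 0; i < extra; i++) {
        c = read(data);
        if (c <= 0) {
            errno = ENODATA;
            return -1;
        }
        if ((c & 0xC0) != 0x80) {   /* must be 10xxxxxx */
            errno = EILSEQ;
            return -1;
        }
        wc = (wc << 6) | (unsigned int) (c & 0x3F);
    }
    *pwc = wc;
    return extra + 1;
}
```

Because a non-positive return from read means "no data", a source built this way cannot deliver a NUL byte as a character.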
- Function: int utf8_mbtowc (unsigned int *pwc, const char *r, size_t len)
Converts the first len characters from the multi-byte string r to wide character representation. On success, returns 0 and stores the result in pwc. The result pointer is allocated using malloc(3). On error (an invalid multi-byte sequence was encountered), returns -1 and sets errno to ‘EILSEQ’.
- Function: int utf8_wctomb (unsigned char *r, unsigned int wc)
Stores the UTF-8 representation of the Unicode character wc in r[0..5]. Returns the number of bytes stored. If wc is out of range, returns -1 and sets errno to ‘EILSEQ’.
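As an illustration of what storing a character in r[0..5] can involve, here is a minimal encoder sketch. It is not the library's code: `sketch_wctomb` is a hypothetical name, and the sketch assumes the original six-byte UTF-8 scheme (values up to 0x7FFFFFFF), which is what a six-byte output buffer suggests. The range the real function accepts may be narrower.

```c
#include <errno.h>

/* Sketch: encode wc into r[0..5] and return the byte count.
   ASSUMPTION: values up to 0x7FFFFFFF are accepted, anything
   larger is treated as out of range. */
static int
sketch_wctomb(unsigned char *r, unsigned int wc)
{
    int count;

    if (wc < 0x80)
        count = 1;
    else if (wc < 0x800)
        count = 2;
    else if (wc < 0x10000)
        count = 3;
    else if (wc < 0x200000)
        count = 4;
    else if (wc < 0x4000000)
        count = 5;
    else if (wc <= 0x7FFFFFFF)
        count = 6;
    else {
        errno = EILSEQ;
        return -1;
    }

    if (count == 1) {
        r[0] = (unsigned char) wc;
        return 1;
    }

    /* Fill continuation bytes from the end, six payload bits each. */
    for (int i = count - 1; i > 0; i--) {
        r[i] = (unsigned char) (0x80 | (wc & 0x3F));
        wc >>= 6;
    }
    /* Lead byte: `count` one-bits, a zero bit, then the remaining bits. */
    r[0] = (unsigned char) ((0xFF00 >> count) | wc);
    return count;
}
```

A caller would pass a buffer of at least six bytes and use the return value to know how many of them are meaningful; the output is not NUL-terminated.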
- Function: int utf8_wc_to_mbstr (const unsigned *word, size_t wordlen, char **retptr)
Converts the first wordlen characters of the wide character string word to multi-byte representation. The result is returned in retptr. It is allocated using malloc(3). Returns 0 on success. On error, returns -1 and sets errno to one of the following values:
- ENOMEM
Not enough memory to allocate the return buffer.
- EILSEQ
An invalid wide character is encountered.
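The allocate-convert-return contract described above can be sketched like this. The names `encode_one` and `sketch_wc_to_mbstr` are hypothetical, and the sketch rests on several assumptions not stated in the text: the buffer is sized for the worst case of six bytes per character, the result is NUL-terminated, and nothing is stored in retptr on failure. The caller releases the result with free(3).

```c
#include <errno.h>
#include <stdlib.h>

/* Compact encoder used by the sketch; returns byte count or -1. */
static int
encode_one(unsigned char *r, unsigned int wc)
{
    int n = wc < 0x80 ? 1 : wc < 0x800 ? 2 : wc < 0x10000 ? 3 :
            wc < 0x200000 ? 4 : wc < 0x4000000 ? 5 :
            wc <= 0x7FFFFFFF ? 6 : -1;

    if (n < 0)
        return -1;
    if (n == 1) {
        r[0] = (unsigned char) wc;
        return 1;
    }
    for (int i = n - 1; i > 0; i--) {
        r[i] = (unsigned char) (0x80 | (wc & 0x3F));
        wc >>= 6;
    }
    r[0] = (unsigned char) ((0xFF00 >> n) | wc);
    return n;
}

/* Sketch: convert wordlen wide characters into a freshly allocated,
   NUL-terminated multi-byte string returned through retptr. */
static int
sketch_wc_to_mbstr(const unsigned *word, size_t wordlen, char **retptr)
{
    /* Worst case: 6 bytes per character plus the terminating NUL. */
    unsigned char *buf = malloc(wordlen * 6 + 1);
    size_t len = 0;

    if (!buf) {
        errno = ENOMEM;
        return -1;
    }
    for (size_t i = 0; i < wordlen; i++) {
        int n = encode_one(buf + len, word[i]);
        if (n < 0) {
            free(buf);        /* do not leak on invalid input */
            errno = EILSEQ;
            return -1;
        }
        len += (size_t) n;
    }
    buf[len] = '\0';
    *retptr = (char *) buf;
    return 0;
}
```

Checking the return value before touching the output pointer, then calling free on it when done, is the usage pattern this kind of interface implies.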
- Function: int utf8_mbstr_to_wc (const char *str, unsigned **wptr, size_t *plen)
Converts the multi-byte string str to its wide character representation. The result is returned in wptr and its length in plen. It is allocated using malloc(3). Returns 0 on success. On error, returns -1 and sets errno to one of the following values:
- ENOMEM
Not enough memory to allocate the return buffer.
- EILSEQ
An invalid multi-byte sequence is encountered.
- Function: int utf8_mbstr_to_norm_wc (const char *str, unsigned **wptr, size_t *plen)
Converts the multi-byte string str to its wide character representation, replacing each run of one or more whitespace characters with a single space character (ASCII 32). The result is returned in wptr and its length in plen. It is allocated using malloc(3). Returns 0 on success. On error, returns -1 and sets errno to one of the following values:
- ENOMEM
Not enough memory to allocate the return buffer.
- EILSEQ
An invalid multi-byte sequence is encountered.
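The whitespace normalization step can be illustrated separately from the decoding. The sketch below works in place on an already decoded wide character array; `is_ws` and `sketch_normalize_ws` are hypothetical names, and the sketch assumes that "whitespace" means the ASCII whitespace set only. Which characters the library itself treats as whitespace is not stated here.

```c
#include <stddef.h>

/* ASSUMPTION: whitespace is limited to the ASCII set
   (space, \t, \n, \v, \f, \r). */
static int
is_ws(unsigned wc)
{
    return wc == ' ' || (wc >= '\t' && wc <= '\r');
}

/* Collapse each whitespace run to a single space (ASCII 32),
   in place.  Returns the new length. */
static size_t
sketch_normalize_ws(unsigned *w, size_t len)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        if (is_ws(w[i])) {
            /* Emit one space per run; skip the rest of the run. */
            if (out == 0 || w[out - 1] != ' ')
                w[out++] = ' ';
        } else
            w[out++] = w[i];
    }
    return out;
}
```

Note that leading and trailing runs are reduced to one space rather than removed, which matches the wording "replacing each run" but is still an interpretation.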