std.utf
Encode and decode UTF-8, UTF-16 and UTF-32 strings.
For more information on UTF-8, see
http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8.
Note: For Win32 systems, the C wchar_t type
is UTF-16 and corresponds to the D wchar type.
For Linux systems, the C wchar_t type
is UTF-32 and corresponds to the D std.utf.dchar type.
UTF character support is restricted to (0 <= character <= 0x10FFFF).
- class UtfError
- Exception class thrown on any error in this module.
The members are:
- idx
- Set to the index of the start of the offending UTF sequence.
- alias ... dchar
- An alias for a single UTF-32 character. This may
become a D basic type in the future.
- bit isValidDchar(dchar c)
- Test if c is a valid UTF-32 character.
Returns true if it is, false if not.
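A minimal sketch of how isValidDchar might be used to screen UTF-32 data, assuming the signature listed above. The helper name allValid is hypothetical and not part of std.utf.

```d
import std.utf;

// Hypothetical helper (not part of std.utf): returns true only if
// every element of s is a valid UTF-32 character.
bit allValid(dchar[] s)
{
    foreach (dchar c; s)
    {
        if (!isValidDchar(c))
            return false;   // e.g. a value above 0x10FFFF
    }
    return true;
}
```

An empty array passes the check, since it contains no invalid character.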
- dchar decode(char[] s, inout uint idx)
- dchar decode(wchar[] s, inout uint idx)
- dchar decode(dchar[] s, inout uint idx)
- Decodes and returns the character starting at s[idx].
idx is advanced past the decoded character.
If the character is not well formed, a UtfError
is thrown and idx remains unchanged.
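The way idx advances can be illustrated with a short loop. This is a sketch under the signatures listed above. The helper name countChars is hypothetical.

```d
import std.utf;

// Hypothetical helper (not part of std.utf): counts the characters
// in a UTF-8 string by decoding one character at a time.
uint countChars(char[] s)
{
    uint idx = 0;      // index of the next undecoded code unit
    uint count = 0;
    while (idx < s.length)
    {
        decode(s, idx);   // advances idx past one whole character
        count++;
    }
    return count;
}
```

Because decode moves idx by the full width of each character, the loop visits a multi-byte sequence once rather than once per byte.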
- void encode(inout char[] s, dchar c)
- void encode(inout wchar[] s, dchar c)
- void encode(inout dchar[] s, dchar c)
- Encodes character c and appends it to array s.
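As an illustration of the append behaviour, the following sketch builds a UTF-8 string one character at a time. It assumes the encode signature listed above. The helper name buildUtf8 is hypothetical.

```d
import std.utf;

// Hypothetical helper (not part of std.utf): builds a UTF-8 string
// from UTF-32 characters, one encode call per character.
char[] buildUtf8(dchar[] chars)
{
    char[] result;
    foreach (dchar c; chars)
        encode(result, c);   // appends 1 to 4 UTF-8 code units
    return result;
}
```

The result array grows by a different amount per call, depending on how many code units the character needs.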
- void validate(char[] s)
- void validate(wchar[] s)
- void validate(dchar[] s)
- Checks whether string s is well formed.
Throws a UtfError if it is not.
Use it to check all untrusted input for correctness.
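Checking untrusted input might look like the sketch below. It assumes that the idx member of UtfError is an integer index, as described above. The helper name isWellFormed is hypothetical.

```d
import std.utf;

// Hypothetical helper (not part of std.utf): returns true if s is
// well-formed UTF-8. On failure, badIdx receives the index carried
// by the UtfError.
bit isWellFormed(char[] s, out uint badIdx)
{
    try
    {
        validate(s);
    }
    catch (UtfError e)
    {
        badIdx = cast(uint) e.idx;   // start of the offending sequence
        return false;
    }
    return true;
}
```

A caller can use badIdx to report where the input went wrong or to truncate the string at the last good character.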
- char[] toUTF8(char[] s)
- char[] toUTF8(wchar[] s)
- char[] toUTF8(dchar[] s)
- Encodes string s into UTF-8 and returns the encoded string.
- wchar[] toUTF16(char[] s)
- wchar* toUTF16z(char[] s)
- wchar[] toUTF16(wchar[] s)
- wchar[] toUTF16(dchar[] s)
- Encodes string s into UTF-16 and returns the encoded string.
toUTF16z is suitable for calling the 'W' functions in the
Win32 API that take an LPWSTR or LPCWSTR argument.
- dchar[] toUTF32(char[] s)
- dchar[] toUTF32(wchar[] s)
- dchar[] toUTF32(dchar[] s)
- Encodes string s into UTF-32 and returns the encoded string.
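The conversion functions can be chained. The following sketch converts UTF-8 to UTF-16 and back, using only the toUTF16 and toUTF8 overloads listed above. The helper name roundTrip is hypothetical.

```d
import std.utf;

// Hypothetical helper (not part of std.utf): converts a UTF-8 string
// to UTF-16 and back again.
char[] roundTrip(char[] s)
{
    wchar[] w = toUTF16(s);   // UTF-8 -> UTF-16
    return toUTF8(w);         // UTF-16 -> UTF-8
}
```

For well-formed input the round trip should reproduce the original string. The intermediate lengths differ, because each encoding uses a different number of code units per character.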