Unicode Utility Routines

The %UTF2HEX and %HEX2UTF M utility routines convert between UTF-8 encoded character strings and the hexadecimal notation of their byte encoding. Both utilities run only in UTF-8 mode; in M mode, they both trigger a run-time error.

%UTF2HEX

The GT.M %UTF2HEX utility returns the hexadecimal notation of the internal byte encoding of a UTF-8 encoded GT.M character string. This routine has entry points for both interactive and non-interactive use.

DO ^%UTF2HEX converts the string stored in %S to the hexadecimal byte notation and stores the result in %U.

DO INT^%UTF2HEX converts the interactively entered string to the hexadecimal byte notation and stores the result in %U.

$$FUNC^%UTF2HEX(s) returns the hexadecimal byte representation of the character string s.

Example:

GTM>write $zchset
UTF-8
GTM>SET %S=$CHAR($$FUNC^%HD("0905"))_$CHAR($$FUNC^%HD("091A"))_$CHAR($$FUNC^%HD("094D"))_$CHAR($$FUNC^%HD("091B"))_$CHAR($$FUNC^%HD("0940"))

GTM>zwrite
%S="अच्छी"

GTM>DO ^%UTF2HEX

GTM>zwrite
%S="अच्छी"
%U="E0A485E0A49AE0A58DE0A49BE0A580"

GTM>write $$FUNC^%UTF2HEX("ABC")
414243
GTM>

Note that %UTF2HEX provides functionality similar to the UNIX binary dump utility (od -x).
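As a rough cross-check outside GT.M, the same conversion can be sketched in Python. This is only an illustration of the byte-level mapping shown in the example above, not how %UTF2HEX is implemented; the function names utf8_to_hex and code_points_hex are hypothetical. The second helper is included to show that the result is the hexadecimal form of the UTF-8 bytes, which differs from the hexadecimal code points passed to $$FUNC^%HD when the string was built.

```python
def utf8_to_hex(s: str) -> str:
    """Return the uppercase hexadecimal notation of the UTF-8 bytes of s.

    Hypothetical helper that mirrors the output shown for $$FUNC^%UTF2HEX.
    """
    return s.encode("utf-8").hex().upper()


def code_points_hex(s: str) -> list[str]:
    """Return the Unicode code point of each character as 4-digit hex.

    Shown only for contrast: these are code points, not UTF-8 bytes.
    """
    return [format(ord(ch), "04X") for ch in s]


# Same five code points as the %S string built in the GT.M example.
sample = "\u0905\u091A\u094D\u091B\u0940"

print(utf8_to_hex(sample))      # E0A485E0A49AE0A58DE0A49BE0A580
print(utf8_to_hex("ABC"))       # 414243
print(code_points_hex(sample))  # ['0905', '091A', '094D', '091B', '0940']
```

Each Devanagari character in the sample occupies three bytes in UTF-8, which is why five characters yield thirty hexadecimal digits, while each ASCII character in "ABC" yields two.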

%HEX2UTF

The GT.M %HEX2UTF utility returns the GT.M character string encoded by a given byte stream in hexadecimal notation. This routine has entry points for both interactive and non-interactive use.

DO ^%HEX2UTF converts the hexadecimal byte stream stored in %U into a GT.M character string and stores the result in %S.

DO INT^%HEX2UTF converts the interactively entered hexadecimal byte stream into a GT.M character string and stores the result in %S.

$$FUNC^%HEX2UTF(s) returns the GT.M character string specified by the hexadecimal byte values in s (each character is specified by the bytes of its UTF-8 encoding, not by its Unicode code point).

Example:

GTM>set u="E0A485" write $$FUNC^%HEX2UTF(u)
अ
GTM>set u="40E0A485" write $$FUNC^%HEX2UTF(u)
@अ
GTM>
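The reverse direction can be sketched the same way in Python, again purely as an illustration outside GT.M rather than a description of how %HEX2UTF works internally. The name hex_to_utf8 is hypothetical. This section does not describe how %HEX2UTF treats malformed input; the Python version simply raises an exception for odd-length, non-hexadecimal, or invalid UTF-8 input.

```python
def hex_to_utf8(h: str) -> str:
    """Decode a hexadecimal byte stream into the string it encodes in UTF-8.

    Hypothetical helper that mirrors the output shown for $$FUNC^%HEX2UTF.
    Raises ValueError for malformed hex and UnicodeDecodeError for byte
    sequences that are not valid UTF-8.
    """
    return bytes.fromhex(h).decode("utf-8")


print(hex_to_utf8("E0A485"))    # the single character U+0905
print(hex_to_utf8("40E0A485"))  # "@" followed by U+0905

# Round trip: encoding the decoded string gives back the original hex.
assert hex_to_utf8("40E0A485").encode("utf-8").hex().upper() == "40E0A485"
```

In the second call, the byte 40 is the ASCII character "@" and the remaining three bytes E0 A4 85 form one character, so the eight hexadecimal digits decode to a two-character string.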