
Converting 8 bit chars

In a UDO source file you can use "higher" (8-bit) characters without having to know how they must be written in a destination format such as LaTeX or Windows Help. You can enter a German `ß' without any fear: UDO converts it for you, and it knows that it has to become &szlig; for HTML or {\ss} for LaTeX.
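As an illustration of the idea only, the following Python sketch replaces a character according to the destination format. The names `REPLACEMENTS` and `convert_text` are hypothetical and this is not UDO's actual implementation; only the two `ß` replacements are taken from the text above.

```python
# Hypothetical per-format replacement table (an assumption for illustration).
# Only the entries for "ß" come from the text above.
REPLACEMENTS = {
    "ß": {"html": "&szlig;", "latex": "{\\ss}"},
}


def convert_text(text: str, fmt: str) -> str:
    """Replace every character that has an entry for fmt; keep all others."""
    return "".join(REPLACEMENTS.get(ch, {}).get(fmt, ch) for ch in text)


print(convert_text("Straße", "html"))   # Stra&szlig;e
print(convert_text("Straße", "latex"))  # Stra{\ss}e
```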

UDO expects source files to use the system character set of the operating system it runs on. If you run UDO on an MS-DOS computer, it expects text files written in the IBM PC character set by default. If UDO runs on an Atari computer, it expects the TOS character set by default.

UDO can also handle files that are written in another character set. You simply have to tell UDO which character set your source file uses with !code [<charset>].

Below is an overview of the character sets UDO knows about:

cp437       IBM-PC character set, codepage 437, see http://www.kostis.net/charsets/cp437.htm
cp850       IBM-PC character set, codepage 850, see http://www.kostis.net/charsets/cp850.htm
dos         IBM-PC character set, same as !code [cp437]
hp8         HP-Roman-8 character set, see http://www.kostis.net/charsets/hproman8.htm
iso         ISO-Latin-1 character set, see http://www.kostis.net/charsets/iso8859.1.htm
iso-8859-1  same as !code [iso]
mac         Apple-Macintosh character set, see http://www.kostis.net/charsets/applerom.htm
next        NeXTStep character set, see http://www.kostis.net/charsets/nextstep.htm
latin1      same as !code [iso]
os2         OS/2 character set, same as !code [cp850]
tos         Atari-ST character set (TOS), see http://www.kostis.net/charsets/atarist.htm
utf8        UTF-8, see http://www.nada.kth.se/i18n/ucs/unicode-iso10646-oview.html
utf-8       UTF-8, same as !code [utf8]
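To show why the declared character set matters, here is a minimal Python sketch that decodes the same byte under two of the names listed above. The mapping `PYTHON_CODECS` and the function `decode_source` are assumptions made for this example, not part of UDO. The `tos` and `next` names are left out because I am not aware of a matching codec in the Python standard library.

```python
# Assumed mapping from the UDO charset names to Python codec names.
# "tos" and "next" are omitted: no standard-library codec is assumed for them.
PYTHON_CODECS = {
    "cp437": "cp437", "dos": "cp437",
    "cp850": "cp850", "os2": "cp850",
    "hp8": "hp_roman8",
    "iso": "latin_1", "iso-8859-1": "latin_1", "latin1": "latin_1",
    "mac": "mac_roman",
    "utf8": "utf_8", "utf-8": "utf_8",
}


def decode_source(data: bytes, charset: str) -> str:
    """Decode raw source bytes using the charset name given to !code."""
    if charset not in PYTHON_CODECS:
        raise ValueError(f"no Python codec assumed for charset {charset!r}")
    return data.decode(PYTHON_CODECS[charset])


# The same byte stands for different characters in different charsets.
dos_char = decode_source(b"\xe1", "dos")  # "ß" (U+00DF) in codepage 437
iso_char = decode_source(b"\xe1", "iso")  # "á" (U+00E1) in ISO Latin 1
```

A file that is read with the wrong character set would therefore silently turn every `ß` into an `á`, which is the kind of error the !code command is there to prevent.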

There are some things you have to keep in mind. Some character sets contain characters that aren't available in others. You shouldn't use the graphic characters of the PC character set or the Hebrew characters of the Atari character set, because they can't be output in formats like LaTeX, Windows Help, RTF or HTML. If you do, UDO prints an error message. You should remove these characters from your source file and find another solution.
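If you want to check a text for such characters yourself, a rough Python sketch could look like this. The helper `unrepresentable` is hypothetical, and ISO Latin 1 is used here only as a stand-in for a target that has no PC graphic characters; UDO's own checks may work differently.

```python
def unrepresentable(text: str, target_codec: str) -> list:
    """Return the distinct characters of text that target_codec cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(target_codec)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


# Codepage 437 bytes: "Grüße" followed by three box-drawing characters.
source = b"Gr\x81\xe1e \xc9\xcd\xbb".decode("cp437")

# The umlaut and the ß exist in ISO Latin 1, the box-drawing characters do not.
missing = unrepresentable(source, "latin_1")
```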

If you convert source files that don't use the character set of the operating system UDO is running on, the limitations are even greater. In the first step UDO converts the characters into ISO Latin 1. In the second step UDO converts the ISO Latin 1 characters into the character set of the current operating system. In some cases the characters cannot be converted without loss. In such a case UDO prints an error message.
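The two steps can be sketched in Python as follows, to show where a character can get lost. This is a simplified model under stated assumptions, not UDO's code: the function `convert_via_latin1` is hypothetical, and Python's own codecs stand in for UDO's conversion tables.

```python
def convert_via_latin1(data: bytes, source_codec: str, system_codec: str):
    """Convert data in two steps and return (converted bytes, lost characters)."""
    converted = bytearray()
    lost = []
    for ch in data.decode(source_codec):
        try:
            # Step 1: the character must exist in ISO Latin 1.
            pivot = ch.encode("latin_1")
            # Step 2: the Latin 1 character must exist in the system charset.
            converted += pivot.decode("latin_1").encode(system_codec)
        except UnicodeEncodeError:
            lost.append(ch)
    return bytes(converted), lost


# Macintosh source bytes for "ß", "•" and "Ã", converted for a codepage 437 system.
converted, lost = convert_via_latin1(b"\xa7\xa5\xcc", "mac_roman", "cp437")
# "ß" survives both steps; "•" is lost in step 1 (not in ISO Latin 1);
# "Ã" is lost in step 2 (in ISO Latin 1, but not in codepage 437).
```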

Please note:

If any character has been forgotten or is converted incorrectly, please send a bug report!


Copyright © www.udo-open-source.org
Last updated on November 5, 2006
