MULTEXT - Document MSC 1. MtScript/Languages.


MtScript: languages and character sets supported

MtScript currently supports the languages and character sets listed in the table below. For each language, the table gives the character set it is mapped to and the corresponding language code.

Language      Character set     Language code
Arabic        iso_8859_6        ar
Bulgarian     iso_8859_5        bu
Chinese GB    gb_2312_80        zh_CN
Chinese BIG5  big5_0            zh_TW
Czech         iso_8859_2        cs
Dutch         iso_8859_1        nl
English       iso_8859_1        en
Estonian      iso_8859_4        et
French        iso_8859_1        fr
German        iso_8859_1        de
Greek         iso_8859_7        el
Hebrew        iso_8859_8        iw
Hungarian     iso_8859_2        hu
Romanian      iso_8859_2        ro
Italian       iso_8859_1        it
Japanese      jisx_0208_1983_0  ja
Korean        ksc_5601_1987_0   ko
Russian       iso_8859_5        ru
Slovak        iso_8859_2        sk
Slovene       iso_8859_2        sl
Spanish       iso_8859_1        es
Swedish       iso_8859_1        sv
Ukrainian     iso_8859_5        uk
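
For readers who want to use the table programmatically, the mapping can be sketched as a simple lookup structure. The Python snippet below is only an illustration: the names SUPPORTED, charset_for and codes_for_charset are hypothetical and are not part of MtScript, and the data is transcribed from the table above.

```python
# Hypothetical lookup table (not part of MtScript):
# language code -> (language, character set), transcribed from the table.
SUPPORTED = {
    "ar": ("Arabic", "iso_8859_6"),
    "bu": ("Bulgarian", "iso_8859_5"),
    "zh_CN": ("Chinese GB", "gb_2312_80"),
    "zh_TW": ("Chinese BIG5", "big5_0"),
    "cs": ("Czech", "iso_8859_2"),
    "nl": ("Dutch", "iso_8859_1"),
    "en": ("English", "iso_8859_1"),
    "et": ("Estonian", "iso_8859_4"),
    "fr": ("French", "iso_8859_1"),
    "de": ("German", "iso_8859_1"),
    "el": ("Greek", "iso_8859_7"),
    "iw": ("Hebrew", "iso_8859_8"),
    "hu": ("Hungarian", "iso_8859_2"),
    "ro": ("Romanian", "iso_8859_2"),
    "it": ("Italian", "iso_8859_1"),
    "ja": ("Japanese", "jisx_0208_1983_0"),
    "ko": ("Korean", "ksc_5601_1987_0"),
    "ru": ("Russian", "iso_8859_5"),
    "sk": ("Slovak", "iso_8859_2"),
    "sl": ("Slovene", "iso_8859_2"),
    "es": ("Spanish", "iso_8859_1"),
    "sv": ("Swedish", "iso_8859_1"),
    "uk": ("Ukrainian", "iso_8859_5"),
}


def charset_for(code):
    """Return the character set for a language code, or None if it is not listed."""
    entry = SUPPORTED.get(code)
    return entry[1] if entry else None


def codes_for_charset(charset):
    """Return the sorted language codes that are mapped to the given character set."""
    return sorted(code for code, (_, cs) in SUPPORTED.items() if cs == charset)
```

With this sketch, charset_for("ru") gives the character set listed for Russian, and codes_for_charset("iso_8859_5") gives the codes of all languages that share the Cyrillic character set.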

Other languages and character sets are under development. If you have expertise in a language that is not listed above, you can help us design its language rules so that other users can benefit from an additional language in the next release. Thanks!

Copyright © Centre National de la Recherche Scientifique, 1996.