
##Adobe File Version: 1.000
#=======================================================================
#   FTP file name:  THAI.TXT
#
#   Contents:       Map (external version) from Mac OS Thai
#                   character set to Unicode 2.1
#
#   Copyright:      (c) 1995-1999 by Apple Computer, Inc., all rights
#                   reserved.
#
#   Contact:        charsets@apple.com
#
#   Changes:
#
#       b02  1999-Sep-22    Update contact e-mail address. Matches
#                           internal utom<b1>, ufrm<b2>, and Text
#                           Encoding Converter version 1.5.
#       n07  1998-Feb-05    Update to match internal utom<n5>, ufrm<n13>
#                           and Text Encoding Converter version 1.3:
#                           Use standard Unicodes plus transcoding hints
#                           instead of single corporate characters; see
#                           details below. Also update header comments
#                           to new format.
#       n04  1995-Nov-17    First version (after fixing some typos).
#                           Matches internal ufrm<n6>.
#
# Standard header:
# ----------------
#
#   Apple, the Apple logo, and Macintosh are trademarks of Apple
#   Computer, Inc., registered in the United States and other countries.
#   Unicode is a trademark of Unicode Inc. For the sake of brevity,
#   throughout this document, "Macintosh" can be used to refer to
#   Macintosh computers and "Unicode" can be used to refer to the
#   Unicode standard.
#
#   Apple makes no warranty or representation, either express or
#   implied, with respect to these tables, their quality, accuracy, or
#   fitness for a particular purpose. In no event will Apple be liable
#   for direct, indirect, special, incidental, or consequential damages
#   resulting from any defect or inaccuracy in this document or the
#   accompanying tables.
#
#   These mapping tables and character lists are subject to change.
#   The latest tables should be available from the following:
#
#   <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
#   <ftp://dev.apple.com/devworld/Technical_Documentation/Misc._Standards/>
#
#   For general information about Mac OS encodings and these mapping
#   tables, see the file "README.TXT".
#
# Format:
# -------
#
#   Three tab-separated columns;
#   '#' begins a comment which continues to the end of the line.
#     Column #1 is the Mac OS Thai code (in hex as 0xNN)
#     Column #2 is the corresponding Unicode or Unicode sequence
#       (in hex as 0xNNNN or 0xNNNN+0xNNNN).
#     Column #3 is a comment containing the Unicode name
#
#   The entries are in Mac OS Thai code order.
#
#   Some of these mappings require the use of corporate characters.
#   See the file "CORPCHAR.TXT" and notes below.
#
#   Control character mappings are not shown in this table, following
#   the conventions of the standard UTC mapping tables. However, the
#   Mac OS Thai character set uses the standard control characters at
#   0x00-0x1F and 0x7F.
#
# Notes on Mac OS Thai:
# ---------------------
#
#   Codes 0xA1-0xDA and 0xDF-0xFB are the character set from Thai
#   standard TIS 620-2533, except that the following changes are made:
#     0xEE is TRADE MARK SIGN (instead of THAI CHARACTER YAMAKKAN)
#     0xFA is REGISTERED SIGN (instead of THAI CHARACTER ANGKHANKHU)
#     0xFB is COPYRIGHT SIGN (instead of THAI CHARACTER KHOMUT)
#
#   Codes 0x80-0x82, 0x8D-0x8E, 0x91, 0x9D-0x9E, and 0xDB-0xDE are
#   various additional punctuation marks (e.g. curly quotes, ellipsis),
#   no-break space, and two special characters "word join" and "word
#   break".
#
#   Codes 0x83-0x8C, 0x8F, and 0x92-0x9C are for positional variants of
#   the upper vowels, tone marks, and other signs at 0xD1, 0xD4-0xD7,
#   and 0xE7-0xED. The positional variants would normally be considered
#   presentation forms only and not characters. In most cases they are
#   not typed directly; they are selected automatically at display time
#   by the WorldScript software. However, using the Thai-DTP keyboard,
#   the presentation forms can in fact be typed directly using dead
#   keys. Thus they must be treated as real characters in the Mac OS
#   Thai encoding. They are mapped using variant tags; see below.
#
#   Several code points are undefined and unused (they cannot be typed
#   using any of the Mac OS Thai keyboard layouts): 0x90, 0x9F,
#   0xFC-0xFE. These are not shown in the table below.
#
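# Illustration (not part of the Apple table): the relation to TIS 620-2533
# described above can be sketched in Python. The sketch assumes the usual
# correspondence in which a TIS 620 code c maps to U+0E00 + (c - 0xA0),
# which agrees with the rows of this table; the names MAC_THAI_OVERRIDES
# and thai_range_to_unicode are hypothetical.

```python
# Codes where Mac OS Thai departs from TIS 620-2533, per the notes above.
MAC_THAI_OVERRIDES = {
    0xEE: 0x2122,  # TRADE MARK SIGN (instead of THAI CHARACTER YAMAKKAN)
    0xFA: 0x00AE,  # REGISTERED SIGN (instead of THAI CHARACTER ANGKHANKHU)
    0xFB: 0x00A9,  # COPYRIGHT SIGN (instead of THAI CHARACTER KHOMUT)
}


def thai_range_to_unicode(code: int) -> int:
    """Map a Mac OS Thai code in 0xA1-0xDA or 0xDF-0xFB to a code point.

    Assumption: outside the three overrides, these ranges follow the
    TIS 620 layout, i.e. U+0E00 + (code - 0xA0).
    """
    if not (0xA1 <= code <= 0xDA or 0xDF <= code <= 0xFB):
        raise ValueError(f"0x{code:02X} is outside the TIS 620-derived ranges")
    if code in MAC_THAI_OVERRIDES:
        return MAC_THAI_OVERRIDES[code]
    return 0x0E00 + (code - 0xA0)


if __name__ == "__main__":
    for c in (0xA1, 0xDF, 0xEE, 0xFA, 0xFB):
        print(f"0x{c:02X} -> 0x{thai_range_to_unicode(c):04X}")
```

# Codes outside those two ranges (0x80-0xA0, 0xDB-0xDE) have no such
# arithmetic rule and need the explicit table rows.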
# Unicode mapping issues and notes:
# ---------------------------------
#
#   The goals in the Apple mappings provided here are:
#   - Ensure roundtrip mapping from every character in the Mac OS Thai
#     character set to Unicode and back
#   - Use standard Unicode characters as much as possible, to maximize
#     interchangeability of the resulting Unicode text. Whenever
#     possible, avoid having content carried by private-use characters.
#
#   To satisfy both goals, we use private use characters to mark
#   variants that are similar to a sequence of one or more standard
#   Unicode characters.
#
#   Apple has defined a block of 32 corporate characters as "transcoding
#   hints." These are used in combination with standard Unicode
#   characters to force them to be treated in a special way for mapping
#   to other encodings; they have no other effect. Sixteen of these
#   transcoding hints are "grouping hints" - they indicate that the next
#   2-4 Unicode characters should be treated as a single entity for
#   transcoding. The other sixteen transcoding hints are "variant tags"
#   - they are like combining characters, and can follow a standard
#   Unicode (or a sequence consisting of a base character and other
#   combining characters) to cause it to be treated in a special way for
#   transcoding. These always terminate a combining-character sequence.
#
#   The transcoding hints used in this mapping table are three variant
#   tags in the range 0xF873-0xF875. Since these are combined with
#   standard Unicode characters, some characters in the Mac OS Thai
#   character set map to a sequence of two Unicodes instead of a single
#   Unicode character. For example, the Mac OS Thai character at 0x83 is
#   a low-left positional variant of THAI CHARACTER MAI EK (the standard
#   mapping is for the abstract character at 0xE8). So 0x83 is mapped to
#   0x0E48 (THAI CHARACTER MAI EK) + 0xF875 (a variant tag).
#
# Details of mapping changes in each version:
# -------------------------------------------
#
#   Changes from version n04 to version n07:
#
#   - Changed mappings of the positional variants to use standard
#     Unicodes + transcoding hint, instead of using single corporate
#     zone characters. This affected the mappings for the following:
#     0x83-0x8C, 0x8F, 0x92-0x9C
#
#   - Just comment out unused code points in the table, instead
#     of mapping them to U+FFFD.
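# Illustration (not part of the Apple table): a minimal Python sketch of
# reading the three-column format described in the header, including the
# 0xNNNN+0xNNNN sequences used for variant tags. The function name
# parse_mapping and the sample lines are hypothetical; the sketch splits
# on any whitespace so that it also tolerates copies of the table in
# which tabs were converted to spaces.

```python
from typing import Dict, Iterable, Tuple


def parse_mapping(lines: Iterable[str]) -> Dict[int, Tuple[int, ...]]:
    """Return {Mac OS Thai code: tuple of Unicode code points}."""
    table: Dict[int, Tuple[int, ...]] = {}
    for raw in lines:
        # '#' begins a comment which continues to the end of the line.
        data = raw.split("#", 1)[0].strip()
        if not data:
            continue  # comment-only or blank line
        fields = data.split()
        if len(fields) < 2:
            raise ValueError(f"malformed mapping line: {raw!r}")
        code = int(fields[0], 16)  # column 1, e.g. 0x83
        # Column 2 is one code point or a '+'-joined sequence.
        sequence = tuple(int(part, 16) for part in fields[1].split("+"))
        table[code] = sequence
    return table


if __name__ == "__main__":
    sample = [
        "# comment line",
        "0x41\t0x0041\t# LATIN CAPITAL LETTER A",
        "0x83\t0x0E48+0xF875\t# THAI CHARACTER MAI EK, low left position",
        "#",
    ]
    for code, seq in parse_mapping(sample).items():
        print(f"0x{code:02X}", [f"U+{cp:04X}" for cp in seq])
```

# Undefined codes (0x90, 0x9F, 0xFC-0xFE) simply never appear as keys,
# since they are commented out rather than mapped to U+FFFD.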
#
##################
0x20	0x0020	# SPACE
0x21	0x0021	# EXCLAMATION MARK
0x22	0x0022	# QUOTATION MARK
0x23	0x0023	# NUMBER SIGN
0x24	0x0024	# DOLLAR SIGN
0x25	0x0025	# PERCENT SIGN
0x26	0x0026	# AMPERSAND
0x27	0x0027	# APOSTROPHE
0x28	0x0028	# LEFT PARENTHESIS
0x29	0x0029	# RIGHT PARENTHESIS
0x2A	0x002A	# ASTERISK
0x2B	0x002B	# PLUS SIGN
0x2C	0x002C	# COMMA
0x2D	0x002D	# HYPHEN-MINUS
0x2E	0x002E	# FULL STOP
0x2F	0x002F	# SOLIDUS
0x30	0x0030	# DIGIT ZERO
0x31	0x0031	# DIGIT ONE
0x32	0x0032	# DIGIT TWO
0x33	0x0033	# DIGIT THREE
0x34	0x0034	# DIGIT FOUR
0x35	0x0035	# DIGIT FIVE
0x36	0x0036	# DIGIT SIX
0x37	0x0037	# DIGIT SEVEN
0x38	0x0038	# DIGIT EIGHT
0x39	0x0039	# DIGIT NINE
0x3A	0x003A	# COLON
0x3B	0x003B	# SEMICOLON

0x3C	0x003C	# LESS-THAN SIGN
0x3D	0x003D	# EQUALS SIGN
0x3E	0x003E	# GREATER-THAN SIGN
0x3F	0x003F	# QUESTION MARK
0x40	0x0040	# COMMERCIAL AT
0x41	0x0041	# LATIN CAPITAL LETTER A
0x42	0x0042	# LATIN CAPITAL LETTER B
0x43	0x0043	# LATIN CAPITAL LETTER C
0x44	0x0044	# LATIN CAPITAL LETTER D
0x45	0x0045	# LATIN CAPITAL LETTER E
0x46	0x0046	# LATIN CAPITAL LETTER F
0x47	0x0047	# LATIN CAPITAL LETTER G
0x48	0x0048	# LATIN CAPITAL LETTER H
0x49	0x0049	# LATIN CAPITAL LETTER I
0x4A	0x004A	# LATIN CAPITAL LETTER J
0x4B	0x004B	# LATIN CAPITAL LETTER K
0x4C	0x004C	# LATIN CAPITAL LETTER L
0x4D	0x004D	# LATIN CAPITAL LETTER M
0x4E	0x004E	# LATIN CAPITAL LETTER N
0x4F	0x004F	# LATIN CAPITAL LETTER O
0x50	0x0050	# LATIN CAPITAL LETTER P
0x51	0x0051	# LATIN CAPITAL LETTER Q
0x52	0x0052	# LATIN CAPITAL LETTER R
0x53	0x0053	# LATIN CAPITAL LETTER S
0x54	0x0054	# LATIN CAPITAL LETTER T
0x55	0x0055	# LATIN CAPITAL LETTER U
0x56	0x0056	# LATIN CAPITAL LETTER V
0x57	0x0057	# LATIN CAPITAL LETTER W
0x58	0x0058	# LATIN CAPITAL LETTER X
0x59	0x0059	# LATIN CAPITAL LETTER Y
0x5A	0x005A	# LATIN CAPITAL LETTER Z
0x5B	0x005B	# LEFT SQUARE BRACKET
0x5C	0x005C	# REVERSE SOLIDUS
0x5D	0x005D	# RIGHT SQUARE BRACKET
0x5E	0x005E	# CIRCUMFLEX ACCENT
0x5F	0x005F	# LOW LINE
0x60	0x0060	# GRAVE ACCENT
0x61	0x0061	# LATIN SMALL LETTER A
0x62	0x0062	# LATIN SMALL LETTER B
0x63	0x0063	# LATIN SMALL LETTER C
0x64	0x0064	# LATIN SMALL LETTER D
0x65	0x0065	# LATIN SMALL LETTER E
0x66	0x0066	# LATIN SMALL LETTER F
0x67	0x0067	# LATIN SMALL LETTER G
0x68	0x0068	# LATIN SMALL LETTER H
0x69	0x0069	# LATIN SMALL LETTER I
0x6A	0x006A	# LATIN SMALL LETTER J
0x6B	0x006B	# LATIN SMALL LETTER K
0x6C	0x006C	# LATIN SMALL LETTER L
0x6D	0x006D	# LATIN SMALL LETTER M
0x6E	0x006E	# LATIN SMALL LETTER N
0x6F	0x006F	# LATIN SMALL LETTER O
0x70	0x0070	# LATIN SMALL LETTER P
0x71	0x0071	# LATIN SMALL LETTER Q
0x72	0x0072	# LATIN SMALL LETTER R
0x73	0x0073	# LATIN SMALL LETTER S
0x74	0x0074	# LATIN SMALL LETTER T
0x75	0x0075	# LATIN SMALL LETTER U
0x76	0x0076	# LATIN SMALL LETTER V
0x77	0x0077	# LATIN SMALL LETTER W

0x78	0x0078	# LATIN SMALL LETTER X
0x79	0x0079	# LATIN SMALL LETTER Y
0x7A	0x007A	# LATIN SMALL LETTER Z
0x7B	0x007B	# LEFT CURLY BRACKET
0x7C	0x007C	# VERTICAL LINE
0x7D	0x007D	# RIGHT CURLY BRACKET
0x7E	0x007E	# TILDE
#
0x80	0x00AB	# LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
0x81	0x00BB	# RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
0x82	0x2026	# HORIZONTAL ELLIPSIS
0x83	0x0E48+0xF875	# THAI CHARACTER MAI EK, low left position
0x84	0x0E49+0xF875	# THAI CHARACTER MAI THO, low left position
0x85	0x0E4A+0xF875	# THAI CHARACTER MAI TRI, low left position
0x86	0x0E4B+0xF875	# THAI CHARACTER MAI CHATTAWA, low left position
0x87	0x0E4C+0xF875	# THAI CHARACTER THANTHAKHAT, low left position
0x88	0x0E48+0xF873	# THAI CHARACTER MAI EK, low position
0x89	0x0E49+0xF873	# THAI CHARACTER MAI THO, low position
0x8A	0x0E4A+0xF873	# THAI CHARACTER MAI TRI, low position
0x8B	0x0E4B+0xF873	# THAI CHARACTER MAI CHATTAWA, low position
0x8C	0x0E4C+0xF873	# THAI CHARACTER THANTHAKHAT, low position
0x8D	0x201C	# LEFT DOUBLE QUOTATION MARK
0x8E	0x201D	# RIGHT DOUBLE QUOTATION MARK
0x8F	0x0E4D+0xF874	# THAI CHARACTER NIKHAHIT, left position
#
0x91	0x2022	# BULLET
0x92	0x0E31+0xF874	# THAI CHARACTER MAI HAN-AKAT, left position
0x93	0x0E47+0xF874	# THAI CHARACTER MAITAIKHU, left position
0x94	0x0E34+0xF874	# THAI CHARACTER SARA I, left position
0x95	0x0E35+0xF874	# THAI CHARACTER SARA II, left position
0x96	0x0E36+0xF874	# THAI CHARACTER SARA UE, left position
0x97	0x0E37+0xF874	# THAI CHARACTER SARA UEE, left position
0x98	0x0E48+0xF874	# THAI CHARACTER MAI EK, left position
0x99	0x0E49+0xF874	# THAI CHARACTER MAI THO, left position
0x9A	0x0E4A+0xF874	# THAI CHARACTER MAI TRI, left position
0x9B	0x0E4B+0xF874	# THAI CHARACTER MAI CHATTAWA, left position
0x9C	0x0E4C+0xF874	# THAI CHARACTER THANTHAKHAT, left position
0x9D	0x2018	# LEFT SINGLE QUOTATION MARK
0x9E	0x2019	# RIGHT SINGLE QUOTATION MARK
#
0xA0	0x00A0	# NO-BREAK SPACE
0xA1	0x0E01	# THAI CHARACTER KO KAI
0xA2	0x0E02	# THAI CHARACTER KHO KHAI
0xA3	0x0E03	# THAI CHARACTER KHO KHUAT
0xA4	0x0E04	# THAI CHARACTER KHO KHWAI
0xA5	0x0E05	# THAI CHARACTER KHO KHON
0xA6	0x0E06	# THAI CHARACTER KHO RAKHANG
0xA7	0x0E07	# THAI CHARACTER NGO NGU
0xA8	0x0E08	# THAI CHARACTER CHO CHAN
0xA9	0x0E09	# THAI CHARACTER CHO CHING
0xAA	0x0E0A	# THAI CHARACTER CHO CHANG
0xAB	0x0E0B	# THAI CHARACTER SO SO
0xAC	0x0E0C	# THAI CHARACTER CHO CHOE
0xAD	0x0E0D	# THAI CHARACTER YO YING
0xAE	0x0E0E	# THAI CHARACTER DO CHADA
0xAF	0x0E0F	# THAI CHARACTER TO PATAK
0xB0	0x0E10	# THAI CHARACTER THO THAN
0xB1	0x0E11	# THAI CHARACTER THO NANGMONTHO
0xB2	0x0E12	# THAI CHARACTER THO PHUTHAO
0xB3	0x0E13	# THAI CHARACTER NO NEN

0xB4	0x0E14	# THAI CHARACTER DO DEK
0xB5	0x0E15	# THAI CHARACTER TO TAO
0xB6	0x0E16	# THAI CHARACTER THO THUNG
0xB7	0x0E17	# THAI CHARACTER THO THAHAN
0xB8	0x0E18	# THAI CHARACTER THO THONG
0xB9	0x0E19	# THAI CHARACTER NO NU
0xBA	0x0E1A	# THAI CHARACTER BO BAIMAI
0xBB	0x0E1B	# THAI CHARACTER PO PLA
0xBC	0x0E1C	# THAI CHARACTER PHO PHUNG
0xBD	0x0E1D	# THAI CHARACTER FO FA
0xBE	0x0E1E	# THAI CHARACTER PHO PHAN
0xBF	0x0E1F	# THAI CHARACTER FO FAN
0xC0	0x0E20	# THAI CHARACTER PHO SAMPHAO
0xC1	0x0E21	# THAI CHARACTER MO MA
0xC2	0x0E22	# THAI CHARACTER YO YAK
0xC3	0x0E23	# THAI CHARACTER RO RUA
0xC4	0x0E24	# THAI CHARACTER RU
0xC5	0x0E25	# THAI CHARACTER LO LING
0xC6	0x0E26	# THAI CHARACTER LU
0xC7	0x0E27	# THAI CHARACTER WO WAEN
0xC8	0x0E28	# THAI CHARACTER SO SALA
0xC9	0x0E29	# THAI CHARACTER SO RUSI
0xCA	0x0E2A	# THAI CHARACTER SO SUA
0xCB	0x0E2B	# THAI CHARACTER HO HIP
0xCC	0x0E2C	# THAI CHARACTER LO CHULA
0xCD	0x0E2D	# THAI CHARACTER O ANG
0xCE	0x0E2E	# THAI CHARACTER HO NOKHUK
0xCF	0x0E2F	# THAI CHARACTER PAIYANNOI
0xD0	0x0E30	# THAI CHARACTER SARA A
0xD1	0x0E31	# THAI CHARACTER MAI HAN-AKAT
0xD2	0x0E32	# THAI CHARACTER SARA AA
0xD3	0x0E33	# THAI CHARACTER SARA AM
0xD4	0x0E34	# THAI CHARACTER SARA I
0xD5	0x0E35	# THAI CHARACTER SARA II
0xD6	0x0E36	# THAI CHARACTER SARA UE
0xD7	0x0E37	# THAI CHARACTER SARA UEE
0xD8	0x0E38	# THAI CHARACTER SARA U
0xD9	0x0E39	# THAI CHARACTER SARA UU
0xDA	0x0E3A	# THAI CHARACTER PHINTHU
0xDB	0xFEFF	# ZERO WIDTH NO-BREAK SPACE
0xDC	0x200B	# ZERO WIDTH SPACE
0xDD	0x2013	# EN DASH
0xDE	0x2014	# EM DASH
0xDF	0x0E3F	# THAI CURRENCY SYMBOL BAHT
0xE0	0x0E40	# THAI CHARACTER SARA E
0xE1	0x0E41	# THAI CHARACTER SARA AE
0xE2	0x0E42	# THAI CHARACTER SARA O
0xE3	0x0E43	# THAI CHARACTER SARA AI MAIMUAN
0xE4	0x0E44	# THAI CHARACTER SARA AI MAIMALAI
0xE5	0x0E45	# THAI CHARACTER LAKKHANGYAO
0xE6	0x0E46	# THAI CHARACTER MAIYAMOK
0xE7	0x0E47	# THAI CHARACTER MAITAIKHU
0xE8	0x0E48	# THAI CHARACTER MAI EK
0xE9	0x0E49	# THAI CHARACTER MAI THO
0xEA	0x0E4A	# THAI CHARACTER MAI TRI
0xEB	0x0E4B	# THAI CHARACTER MAI CHATTAWA
0xEC	0x0E4C	# THAI CHARACTER THANTHAKHAT
0xED	0x0E4D	# THAI CHARACTER NIKHAHIT
0xEE	0x2122	# TRADE MARK SIGN
0xEF	0x0E4F	# THAI CHARACTER FONGMAN

0xF0	0x0E50	# THAI DIGIT ZERO
0xF1	0x0E51	# THAI DIGIT ONE
0xF2	0x0E52	# THAI DIGIT TWO
0xF3	0x0E53	# THAI DIGIT THREE
0xF4	0x0E54	# THAI DIGIT FOUR
0xF5	0x0E55	# THAI DIGIT FIVE
0xF6	0x0E56	# THAI DIGIT SIX
0xF7	0x0E57	# THAI DIGIT SEVEN
0xF8	0x0E58	# THAI DIGIT EIGHT
0xF9	0x0E59	# THAI DIGIT NINE
0xFA	0x00AE	# REGISTERED SIGN
0xFB	0x00A9	# COPYRIGHT SIGN
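#
# Illustration (not part of the Apple table): a minimal Python sketch of
# the roundtrip goal stated in the header, using only a six-row excerpt
# of the table above. The names EXCERPT, decode, encode and
# strip_variant_tags are hypothetical, and the greedy two-character
# match in encode is an assumption that suffices here because every
# sequence in this table is at most two code points long; it is not
# Apple's Text Encoding Converter algorithm.

```python
# Excerpt of the table: Mac OS Thai code -> Unicode string.
EXCERPT = {
    0x41: "\u0041",        # LATIN CAPITAL LETTER A
    0x83: "\u0e48\uf875",  # MAI EK, low left position
    0x88: "\u0e48\uf873",  # MAI EK, low position
    0x98: "\u0e48\uf874",  # MAI EK, left position
    0xA1: "\u0e01",        # KO KAI
    0xE8: "\u0e48",        # MAI EK (standard position)
}


def decode(data: bytes, table: dict) -> str:
    """Decode Mac OS Thai bytes; raises KeyError for unmapped codes."""
    return "".join(table[byte] for byte in data)


def encode(text: str, table: dict) -> bytes:
    """Encode Unicode text, preferring base + variant tag pairs."""
    reverse = {seq: code for code, seq in table.items()}
    out = bytearray()
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in reverse:
            out.append(reverse[pair])
            i += 2
        elif text[i] in reverse:
            out.append(reverse[text[i]])
            i += 1
        else:
            raise ValueError(f"no mapping for U+{ord(text[i]):04X}")
    return bytes(out)


def strip_variant_tags(text: str) -> str:
    """Drop the variant tags 0xF873-0xF875, keeping standard characters."""
    return "".join(ch for ch in text if not 0xF873 <= ord(ch) <= 0xF875)


if __name__ == "__main__":
    original = bytes([0xA1, 0x83, 0xE8, 0x41])
    text = decode(original, EXCERPT)
    print([f"U+{ord(ch):04X}" for ch in text])
    print(encode(text, EXCERPT) == original)
```

# Stripping the tags yields text made only of standard Unicode
# characters, at the cost of losing the positional distinction, so the
# result no longer encodes back to the original Mac OS Thai bytes.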
