
Proceedings of the 7th Workshop on Asian Language Resources, ACL-IJCNLP 2009, pages 96-102, Suntec, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP


A Syntactic Resource for Thai: CG Treebank
Taneth Ruangrajitpakorn Kanokorn Trakultaweekoon Thepchai Supnithi
Human Language Technology Laboratory
National Electronics and Computer Technology Center
112 Thailand Science Park, Phahonyothin Road, Klong 1,
Klong Luang Pathumthani, 12120, Thailand
+66-2-564-6900 Ext.2547, Fax.: +66-2-564-6772
{taneth.ruangrajitpakorn, kanokorn.trakultaweekoon, thepchai.supnithi}@nectec.or.th
Abstract
This paper presents a Thai syntactic resource: the Thai CG treebank, a categorial approach to language resources. Since there are very few Thai syntactic resources, we set out to create a treebank based on the CG formalism. A Thai corpus was parsed with an existing CG syntactic dictionary and an LALR parser. The correctly parsed trees were collected as a preliminary CG treebank. It consists of 50,346 trees from 27,239 utterances. The trees can be split into three grammatical types: 12,876 sentential trees, 13,728 noun phrasal trees, and 18,342 verb phrasal trees. There are 17,847 utterances that obtain exactly one tree, and the average number of trees per utterance is 1.85.
1 Introduction
Syntactic lexical resources such as POS-tagged corpora and treebanks play an important role in NLP tools, for instance machine translation (MT), automatic POS taggers, and statistical parsers. Because of the heavy workload and the lack of the linguistic expertise needed to manually assign syntactic annotation to sentences, we are currently limited to a few syntactic resources. A few studies (Satayamas and Kawtrakul, 2004) have focused on developing a system to build a treebank. Unfortunately, there has been no further report on the existing treebank for Thai so far. Thai is an analytic language, which means that grammatical information is carried by words rather than by inflection (Richard, 1964). Function words represent grammatical information such as tense, aspect, and modality. Therefore, recognising word order is a key to syntactic analysis for Thai. Categorial Grammar (CG) is a formalism that focuses on the principles of syntactic behaviour. It can be applied to solve word-order issues in Thai. To apply CG in machine-learning and statistical approaches, a CG treebank is required first.
CG is a base concept that can be extended to more advanced grammars such as Combinatory Categorial Grammar (CCG) (Steedman, 2000). Moreover, CCG tags have been reported to be superior to POS tags, because they consist of fine-grained lexical categories, and in terms of accuracy rate (Curran et al., 2006; Clark and Curran, 2007).
Nowadays, CG and CCG have become popular in NLP research. Several studies in Asia use them as their main theoretical approach. For example, research in China uses CG with type lifting (Dowty, 1988) to find feature interpretations of undefined words as syntactic-semantic analysis (Jiangsheng, 2000). In Japan, researchers also work on Japanese categorial grammar (JCG), which provides a foundation for semantic parsing of Japanese (Komatsu, 1999). Moreover, there is research in Japan on improving CG to handle the Japanese particle-shifting phenomenon and on using CG to focus on Japanese particles (Nishiguchi, 2008).
This paper is organised as follows. Section 2 reviews categorial grammar and its function. Section 3 explains the resources for building the Thai CG treebank. Section 4 describes the experimental results. Section 5 discusses issues of the Thai CG treebank. Last, Section 6 summarises the paper and lists future work.
2 Categorial Grammar
Categorial grammar (aka CG or classical categorial grammar) (Ajdukiewicz, 1935; Carpenter, 1992; Buszkowski, 1998; Steedman, 2000) is a formalism in natural language syntax motivated by the principle of compositionality and organised according to the syntactic elements. The syntactic elements are categorised in terms of their ability to combine with one another to form larger constituents, as functions or according to a function-argument relationship. All syntactic categories in CG are distinguished by a syntactic category identifying them as one of the following two types:
1. Argument: this type is a basic category, such as s (sentence) and np (noun phrase).
2. Functor (or functor category): this category type is a combination of arguments and the operator(s) '/' and '\'. A functor is assigned to a complex lexical item that needs arguments to complete a sentence; for example, s\np (intransitive verb) requires a noun phrase from the left side to complete a sentence.
CG captures the same information by associating a functional type or category with all grammatical entities. The notation X/Y is a rightward-combining functor over a domain of Y into a range of X. The notation X\Y is a leftward-combining functor over Y into X. X and Y are both argument syntactic categories (Hockenmaier and Steedman, 2002; Baldridge and Kruijff, 2003). The basic concept is to find the core of the combination and replace the grammatical modifier and complement with a set of categories, following the same idea as fractions. For example, an intransitive verb needs to combine with a subject to complete a sentence; therefore an intransitive verb is written as s\np, which means that it needs a noun phrase from the left side to complete a sentence. If a noun phrase exists on the left side, the rule of fraction cancellation is applied as np * s\np = s. With CG, each lexical item can be annotated with its own syntactic category. However, a lexical item may have more than one syntactic category if it can be used in different contexts.
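The cancellation rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation; the names `Functor`, `parse_category`, and `combine` are hypothetical, and the sketch assumes that unparenthesised operators associate to the left, so that s\np/np is read as (s\np)/np.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Functor:
    result: "Category"  # category produced once the argument is found
    slash: str          # "/" expects the argument on the right, "\" on the left
    arg: "Category"     # category of the expected argument


Category = Union[str, Functor]


def parse_category(text: str) -> Category:
    """Parse a category string; operators associate to the left (assumption)."""
    tokens: List[str] = []
    buf = ""
    for ch in text.replace(" ", ""):
        if ch in ("(", ")", "/", "\\"):
            if buf:
                tokens.append(buf)
                buf = ""
            tokens.append(ch)
        else:
            buf += ch
    if buf:
        tokens.append(buf)
    pos = 0

    def atom() -> Category:
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = expr()
            pos += 1  # skip the closing ")"
            return inner
        return tok

    def expr() -> Category:
        nonlocal pos
        left = atom()
        while pos < len(tokens) and tokens[pos] in ("/", "\\"):
            slash = tokens[pos]
            pos += 1
            left = Functor(left, slash, atom())
        return left

    return expr()


def combine(left: Category, right: Category) -> Optional[Category]:
    """Forward application (X/Y then Y gives X) and backward application (Y then X\\Y gives X)."""
    if isinstance(left, Functor) and left.slash == "/" and left.arg == right:
        return left.result
    if isinstance(right, Functor) and right.slash == "\\" and right.arg == left:
        return right.result
    return None


np, iv, tv = (parse_category(c) for c in ("np", r"s\np", r"s\np/np"))
print(combine(np, iv))               # subject + intransitive verb
print(combine(np, combine(tv, np)))  # subject + (transitive verb + object)
```

Both calls reduce to the atomic category s, while an ill-ordered pair such as `combine(iv, np)` returns `None`, which mirrors the word-order sensitivity that motivates using CG for Thai.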
Furthermore, CG not only constructs a purely syntactic structure but also delivers a compositional interpretation. The identification of derivation with interpretation is an advantage over other formalisms.
An example of a CG derivation of a Thai sentence is illustrated in Figure 1.
Recently, there has been much research on combinatory categorial grammar (CCG), which is an improved version of CG. With the CG-based concept and notation, it is possible to upgrade easily to the more advanced formalism. However, Thai syntax remains unclear, since several points of Thai grammar have not yet been completely researched and have no definitive solution (Ruangrajitpakorn et al., 2007). Therefore, CG is currently adopted for Thai to significantly reduce the over-generation rate caused by complex composition or ambiguous usage.
Figure 1. CG derivation tree of a Thai sentence
3 Resources
To collect a CG treebank, a CG dictionary and a parser are essential. First, a Thai corpus was parsed with the parser, using the CG dictionary as a syntactic resource. Then, the correct trees for each sentence were manually determined by linguists and collected together as the treebank.
3.1 Thai CG Dictionary
Recently, we developed a Thai CG dictionary to serve as a syntactic dictionary for several purposes, since CG is new to Thai NLP. CG was adopted for our syntactic dictionary because of its focus on lexical behaviour and its fine-grained lexicalised grammar. CG suits the nature of the Thai language, since Thai belongs to the analytic language typology; that is, its syntax and meaning depend on the use of particles and word order rather than inflection (Boonkwan and Supnithi, 2008). Moreover, pronouns and other grammatical information, such as tense, aspect, number, and voice, are expressed by function words such as determiners, auxiliary verbs, adverbs and adjectives, which occur in a fixed word order. With CG, it is possible to capture Thai grammatical information well. Currently we aim only to improve the accuracy of Thai syntactic parsing, since unresearched ambiguities remain in Thai syntax. A list of grammatical Thai word orders that are handled with CG is shown in Table 1.
Thai utilisation | Word order
Sentence | Subject + Verb + (Object) [1] [rigid order]
Compound noun | Core noun + Attachment
Adjective modification [2] | Noun + Adjective
Predicate adjective [3] | Noun + Adjective
Determiner | Noun + (Classifier) + Determiner
Numeral expression | Noun + (Modifier) + Number + Classifier + (Modifier)
Adverb modification | Sentence + Adverb; Adverb + Sentence
Several auxiliary verbs | Subject + (Aux verbs) + VP + (Aux verbs)
Negation | Subject + Negator + VP; Subject + (Aux verb) + Negator + (Aux verb) + VP; Subject + VP + (Aux verb) + Negator + (Aux verb)
Passive | Actee + Passive marker + (Actor) + Verb
Ditransitive | Subject + Ditransitive verb + Direct object + Indirect object
Relative clause | Noun + Relative marker + Clause
Compound sentence | Sentence + Conjunction + Sentence; Conjunction + Sentence + Sentence
Complex sentence | Sentence + Conjunction + Sentence; Conjunction + Sentence + Sentence
Subordinate clause that begins with the word “[…]” | Subject + Verb + “[…]” + Sentence
Table 1. Thai word orders that CG can solve
[1] Information in parentheses can be omitted.
[2] Adjective modification is a form in which an adjective acts as a modifier of a noun, and the two combine into a noun phrase.
[3] Predicate adjective is a form in which an adjective acts as the predicate of a sentence.
In addition, there are many multi-sense words in Thai. These words have the same surface form but different meanings and different usages. This issue can be handled with the CG formalism, because the different usages are separated by the annotation of syntactic information. For example, the Thai word “[…]” can refer to a noun meaning 'island', in which case it is marked as np, and the same word can also denote an action meaning 'to cling' or 'to attach', in which case it is marked as s\np/np.
After observing Thai word usage, the list of CG categories was created according to the CG theory explained in Section 2.
Thai argument syntactic categories were created first. For the Thai language, six argument syntactic categories were determined. The Thai CG arguments are listed with definitions and examples in Table 2. Of these, np, num, and spnum are the Thai CG arguments that can be assigned directly to a word; the others cannot, and can only be used as components of other categories.
With the arguments, other types of words are created as functors by combining the arguments according to each word's behaviour and environmental requirements. The first argument in a functor is the result of the combination. There are only two main operators in CG, the slash '/' and the backslash '\', each placed before an argument. A slash '/' indicates that an argument is required from the right, and a backslash '\' indicates that an argument is required from the left. For instance, a transitive verb requires one np from the left and one np from the right to complete a sentence. Therefore, it can be written as s\np/np in CG form. However, several Thai words have many functions even within the same word sense. For example, the Thai word “[…]” (to believe) can be used as an intransitive verb, a transitive verb, and a verb that can be followed by a subordinate clause. This word therefore has three different syntactic categories. Currently, there are 72 functors for Thai.
With the arguments and functors, each word in the word list is annotated with CG. This information is sufficient for the parser to analyse an input sentence into a grammatical tree. In conclusion, the CG dictionary presently contains 42,564 lexical entries with 75 CG syntactic categories. All Thai CG categories are shown in Appendix A.
Thai argument category | definition | example
np | a noun phrase | […] (elephant), […] (I, me)
num | a cardinal number, written either in digits or in words | […] (one), 2 (two)
spnum | a number that follows the classifier instead of preceding it like an ordinary number | […] (one), […] (one) [4]
pp | a prepositional phrase | […] (in car), […] (on table)
s | a sentence | […] (elephant eats banana)
ws | a specific category for Thai, assigned to a sentence that begins with the Thai word […] (that: subordinate clause marker) | * […] [5] 'that he will come late'
Table 2. List of Thai CG arguments
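The difference between num and spnum can be made concrete with a small Python sketch. It assumes the classifier categories (np\np)\num and (np\np)/spnum listed in Appendix A, encodes categories as hypothetical nested tuples `(result, slash, argument)`, and uses a greedy stack reduction that is enough for these short examples but is not a complete parser.

```python
from typing import List, Optional, Tuple, Union

Cat = Union[str, Tuple]  # an atom such as "np", or (result, slash, argument)

NP_MOD: Cat = ("np", "\\", "np")               # np\np
CL_AFTER_NUM: Cat = (NP_MOD, "\\", "num")      # (np\np)\num: classifier preceded by a number
CL_BEFORE_SPNUM: Cat = (NP_MOD, "/", "spnum")  # (np\np)/spnum: classifier followed by a number


def apply(left: Cat, right: Cat) -> Optional[Cat]:
    """Forward or backward application; None when the pair does not combine."""
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None


def reduce_all(cats: List[Cat]) -> List[Cat]:
    """Greedy shift-reduce: combine the two topmost categories whenever possible."""
    stack: List[Cat] = []
    for cat in cats:
        stack.append(cat)
        while len(stack) >= 2:
            merged = apply(stack[-2], stack[-1])
            if merged is None:
                break
            stack[-2:] = [merged]
    return stack


print(reduce_all(["np", "num", CL_AFTER_NUM]))       # noun + number + classifier
print(reduce_all(["np", CL_BEFORE_SPNUM, "spnum"]))  # noun + classifier + spnum
print(reduce_all(["np", "spnum", CL_AFTER_NUM]))     # wrong number type for this classifier
```

The first two sequences reduce to a single np, while the third cannot be reduced because the classifier category asks for num rather than spnum.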
3.2 Parser
Our implementation of a lookahead LR (LALR) parser (Aho and Johnson, 1974; Knuth, 1965) was used as the tool to syntactically parse input from the corpus. For our LALR parser, grammar rules are not written manually; they are produced automatically from whatever syntactic notations are aligned with the lexical entries in a dictionary, so the parser's coverage includes parsing with the CG formalism. Furthermore, our LALR parser can produce trees for sentences, noun phrases and verb phrases. The parser does not return only the first-best tree; it returns all parsable trees in order to gather every ambiguous tree, since Thai tends to be ambiguous because it lacks explicit sentence and word boundaries.
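How grammar rules might be derived automatically from the categories in a dictionary can be sketched as follows. The paper does not describe the actual procedure, so this is only an assumption: each functor is split at its last top-level operator, and one binary rule is emitted per functor. The function names `strip_parens`, `split_functor`, and `rules_for` are hypothetical.

```python
from typing import List, Optional, Tuple


def strip_parens(cat: str) -> str:
    """Remove one pair of parentheses if it encloses the whole category."""
    if cat.startswith("(") and cat.endswith(")"):
        depth = 0
        for i, ch in enumerate(cat):
            depth += ch == "("
            depth -= ch == ")"
            if depth == 0 and i < len(cat) - 1:
                return cat  # the first "(" closes early, so the pair is not outermost
        return cat[1:-1]
    return cat


def split_functor(cat: str) -> Optional[Tuple[str, str, str]]:
    """Split at the last top-level operator, e.g. s\\np/np -> (s\\np, /, np)."""
    depth = 0
    for i in range(len(cat) - 1, -1, -1):
        ch = cat[i]
        if ch == ")":
            depth += 1
        elif ch == "(":
            depth -= 1
        elif ch in ("/", "\\") and depth == 0:
            return strip_parens(cat[:i]), ch, strip_parens(cat[i + 1:])
    return None  # an argument category such as np contains no operator


def rules_for(categories: List[str]) -> List[str]:
    """Emit one binary rewrite rule per functor category."""
    rules = []
    for cat in categories:
        parts = split_functor(cat)
        if parts is None:
            continue
        result, slash, arg = parts
        # "/" looks for its argument on the right, "\" on the left.
        rhs = f"[{cat}] [{arg}]" if slash == "/" else f"[{arg}] [{cat}]"
        rules.append(f"[{result}] -> {rhs}")
    return rules


for rule in rules_for(["np", r"s\np", r"(s\np)/np"]):
    print(rule)
```

Each rule treats a category as a grammar symbol, so a dictionary with more categories yields more rules without any manual grammar writing.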
3.3 Tree Visualiser
To reduce the linguists' workload in searching for the correct tree among all outputs, we developed a tree visualiser. This tool was developed using an open-source library provided by NLTK: The Natural Language Toolkit (http://www.nltk.org/Home; Bird and Loper, 2004).
The tree visualiser transforms a textual tree structure into a graphic tree. It reads a tree written in parenthesised form and converts it into a graphic. The tool can transform all output tree types, including sentence trees, noun phrase trees, and verb phrase trees. For example, the Thai sentence “[…]” 'lit: Tiger hunting is an adventure' was parsed into the tree shown in Figure 2. With the tree visualiser, the tree in Figure 2 was transformed into the graphic tree illustrated in Figure 3.
[4] This spnum category has a different usage from other numerical uses, e.g. [noun,'horse'] [classifier] [spnum,'one'] 'lit: one horse'. This case is different from normal numerical usage, e.g. [noun,'horse'] [num,'one'] [classifier] 'lit: one horse'.
[5] This example is a part of a sentence “[…]” 'lit: I believe that he will come late'.
4 Experiment Result
In the preliminary experiment, 27,239 Thai utterances, a mix of sentences and phrases from a general-domain corpus, were tested. The input was word-segmented by JwordSeg (http://www.suparsit.com/nlp-tools) and approved by linguists. In the test corpus, the longest utterance contains seventeen words, and the shortest contains two words.
s(
  np(
    np/(s\np)[…]
    s\np(
      (s\np)/np[…]
      np[…]
    )
  )
  s\np(
    (s\np)/np[…]
    np(
      np/(s\np)[…]
      s\np[…]
    )
  )
)
Figure 2. An example of CG tree output
Figure 3. An example of graphic tree
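A reader for the parenthesised format of Figure 2 can be sketched in plain Python. This is an illustration using only the standard library, not the NLTK-based tool described in the paper. It assumes that leaves are written as category[word], that internal nodes are written as category(children), and that a "(" belongs to a category name only at its start or directly after "/", "\" or "(". The English glosses in the sample are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    category: str
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def read_tree(text: str) -> Node:
    """Read one tree written as category(child child ...) with leaves category[word]."""
    pos = 0

    def skip_space() -> None:
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def node() -> Node:
        nonlocal pos
        skip_space()
        start = pos
        # Scan the category name; stop at "[" (leaf) or at a "(" that opens the children.
        while text[pos] != "[" and not (
            text[pos] == "(" and pos > start and text[pos - 1] not in ("/", "\\", "(")
        ):
            pos += 1
        category = text[start:pos]
        if text[pos] == "[":
            end = text.index("]", pos)
            word = text[pos + 1:end]
            pos = end + 1
            return Node(category, word=word)
        pos += 1  # consume the "(" that opens the children
        children = []
        skip_space()
        while text[pos] != ")":
            children.append(node())
            skip_space()
        pos += 1  # consume the closing ")"
        return Node(category, children=children)

    return node()


def render(node: Node, depth: int = 0) -> str:
    """Return an indented, one-node-per-line view of the tree."""
    label = node.category if node.word is None else f"{node.category} [{node.word}]"
    lines = ["  " * depth + label]
    lines += [render(child, depth + 1) for child in node.children]
    return "\n".join(lines)


sample = r"s(np[elephant] s\np((s\np)/np[eats] np[banana]))"
print(render(read_tree(sample)))
```

An indented text view is a rough stand-in for the graphic tree of Figure 3; a drawing library could consume the same `Node` structure.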
All trees were manually inspected by linguists to evaluate the accuracy of the parser. The criteria for accuracy are:
- A tree is correct if the sentence is successfully parsed and syntactically correct according to Thai grammar.
- In cases of syntactic ambiguity, such as the usage of a preposition or phrase-versus-sentence ambiguity, any tree consistent with the ambiguity is acceptable and counted as correct.
The parser returned 50,346 trees from 27,239 utterances, an average of 1.85 trees per input. There are 17,874 utterances that return one tree. The outputs can be divided into three output types: 12,876 sentential trees, 13,728 noun phrasal trees, and 18,342 verb phrasal trees. The number of trees collected in the CG treebank from the parser output is detailed in Table 3.
Tree type | Utterance amount | Tree amount | Average
Only S | 8,184 | 12,798 | 1.56
Only NP | 7,211 | 12,407 | 1.72
Only VP | 8,006 | 11,339 | 1.42
Both NP and S | 1,583 | 5,188 | 3.28
Both VP and S | 1,725 | 6,816 | 3.95
Both NP and VP | 397 | 1,140 | 2.87
S, NP, VP | 133 | 658 | 4.95
Total | 27,239 | 50,346 | 1.85
Table 3. Number of trees categorised by kind of grammatical tree
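The averages in Table 3 are simply the tree amount divided by the utterance amount. The short Python check below recomputes them from the table's own figures; the dictionary layout and variable names are illustrative only.

```python
# (utterance amount, tree amount) per tree type, copied from Table 3
rows = {
    "Only S": (8184, 12798),
    "Only NP": (7211, 12407),
    "Only VP": (8006, 11339),
    "Both NP and S": (1583, 5188),
    "Both VP and S": (1725, 6816),
    "Both NP and VP": (397, 1140),
    "S, NP, VP": (133, 658),
}

total_utterances = sum(utts for utts, _ in rows.values())
total_trees = sum(trees for _, trees in rows.values())

# Average trees per utterance for each row, rounded as in the table
averages = {name: round(trees / utts, 2) for name, (utts, trees) in rows.items()}

for name, avg in averages.items():
    print(f"{name:<15} {avg:.2f}")
print(f"{'Total':<15} {total_utterances} utterances, {total_trees} trees, "
      f"{total_trees / total_utterances:.2f} trees per utterance")
```

The row sums reproduce the totals of 27,239 utterances and 50,346 trees, and the overall ratio rounds to the reported 1.85.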
5 Discussion
After examining our results, we found two main issues.
First, some Thai inputs were parsed into several correct outputs because of ambiguity in the input. An input containing an adjective can be parsed both as a noun phrase and as a sentence, since a Thai adjective can be used either as a noun modifier or as a predicate. For example, the Thai input “[…]” can be literally translated as follows:
1. Children are cheerful on a playground.
2. Cheerful children on a playground
For this problem, we decided to keep both trees in our treebank, since both are grammatically correct.
Second, the next issue is the variety of syntactic usages of Thai words. Thai has a narrow range of surface word forms but many polysemous words. The more generally a Thai word is used, the more varied its usage becomes. With the many possible combinations, there are more chances of generating trees with a wrong conceptual meaning even though they follow a correct syntactic word order. For example, the Thai noun phrase “[…]” '… power' can automatically be parsed into three trees, for a sentence, a noun phrase, and a verb phrase, because of the polysemy of the first word. The first word “[…]” has two syntactic usages: as a noun that conceptually refers to power, and as a pre-auxiliary verb implying progressive aspect. The word “[…]” is an adjective, which can serve two roles in Thai: noun modifier and predicate. These lead the parser to produce three trees as follows:
np: np(np[…] np\np[…])
s: s(np[…] s\np[…])
vp: s\np((s\np)/(s\np)[…] s\np[…])
Even though all trees are syntactically correct, only the noun phrasal tree is fully acceptable in terms of semantic sense, as great power. The other trees are awkward and have no definite meaning in Thai. Therefore, only the noun phrase tree is collected into our CG treebank in such cases.
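The three-way ambiguity above can be reproduced with a small chart-based enumeration. The sketch below uses a CKY-style chart rather than the LALR parser of Section 3.2, and it assigns the categories named in the discussion to two placeholder words, `word1` and `word2`, which stand in for the Thai words; the tuple encoding and function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple, Union

Cat = Union[str, Tuple]  # an atom such as "np", or (result, slash, argument)

VP: Cat = ("s", "\\", "np")  # s\np
LEXICON: Dict[str, List[Cat]] = {
    "word1": ["np", (VP, "/", VP)],     # noun, or pre-auxiliary verb (s\np)/(s\np)
    "word2": [("np", "\\", "np"), VP],  # adjective as modifier np\np, or as predicate s\np
}


def apply(left: Cat, right: Cat) -> Optional[Cat]:
    """Forward or backward application; None when the pair does not combine."""
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None


def show(cat: Cat, top: bool = True) -> str:
    """Print a category, parenthesising nested functors."""
    if isinstance(cat, str):
        return cat
    text = f"{show(cat[0], False)}{cat[1]}{show(cat[2], False)}"
    return text if top else f"({text})"


def all_parses(words: List[str]) -> List[Tuple[Cat, str]]:
    """Return every (category, bracketed tree) that spans the whole input."""
    n = len(words)
    chart = {
        (i, i + 1): [(c, f"{show(c)}[{w}]") for c in LEXICON[w]]
        for i, w in enumerate(words)
    }
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            cell = []
            for mid in range(start + 1, end):
                for (lc, lt), (rc, rt) in product(chart[(start, mid)], chart[(mid, end)]):
                    result = apply(lc, rc)
                    if result is not None:
                        cell.append((result, f"{show(result)}({lt} {rt})"))
            chart[(start, end)] = cell
    return chart[(0, n)]


for cat, tree in all_parses(["word1", "word2"]):
    print(f"{show(cat)}: {tree}")
```

The enumeration yields one noun phrase, one sentence, and one verb phrase (s\np) analysis, matching the three trees listed above. Choosing the semantically acceptable one still requires the manual judgement described in the text.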
6 Conclusion and Future Work
This paper presents the Thai CG treebank, a language resource for developing Thai NLP applications. The treebank consists of 50,346 syntactic trees from 27,239 utterances, with CG tags and composition. The trees can be split into three grammatical types: 12,876 sentential trees, 13,728 noun phrasal trees, and 18,342 verb phrasal trees. There are 17,847 utterances that obtain one tree, and the average number of trees per utterance is 1.85.
In the future, we plan to upgrade the Thai CG treebank to a Thai CCG treebank. We also plan to reduce the variety of trees by adding semantic features to CG. We will improve our LALR parser into a GLR and then a PGLR parser to reduce the missing-word and named-entity problems. Moreover, we will develop a parallel Thai-English treebank by adding a parallel English treebank aligned with the Thai one, since a parallel treebank is a useful resource for learning in statistical machine translation. Furthermore, we will apply the resulting CG treebank to the development of automatic CG tagging.
References
Alfred V. Aho, and Stephen C. Johnson. 1974. LR Parsing, In Proceedings of Computing Surveys, Vol. 6, No. 2.
Bob Carpenter. 1992. “Categorial Grammars, Lexical Rules, and the English Predicative”, In R. Levine, ed., Formal Grammar: Theory and Implementation. OUP.
David Dowty. 1988. Type raising, functional composition, and non-constituent conjunction, In Richard Oehrle et al., ed., Categorial Grammars and Natural Language Structures. D. Reidel.
Donald E. Knuth. 1965. On the translation of languages from left to right, Information and Control 8(6).
Hisashi Komatsu. 1999. “Japanese Categorial Grammar Based on Term and Sentence”. In Proceeding of The 13th Pacific Asia Conference on Language, Information and Computation, Taiwan.
James R. Curran, Stephen Clark, and David Vadas. 2006. Multi-Tagging for Lexicalized-Grammar Parsing. In Proceedings of the Joint Conference of the International Committee on Computational Linguistics and the Association for Computational Linguistics (ACL), Paris, France.
Jason Baldridge, and Geert-Jan M. Kruijff. 2003. “Multimodal combinatory categorial grammar”. In Proceeding of 10th Conference of the European Chapter of the ACL-2003, Budapest, Hungary.
Julia Hockenmaier, and Mark Steedman. 2002. “Acquiring Compact Lexicalized Grammars from a Cleaner Treebank”. In Proceeding of 3rd International Conference on Language Resources and Evaluation (LREC-2002), Las Palmas, Spain.
JWordSeg, word-segmentation toolkit. 2007. Available from: http://www.suparsit.com/nlp-tools
Kazimierz Ajdukiewicz. 1935. Die Syntaktische Konnexität, Polish Logic.
Mark Steedman. 2000. The Syntactic Process, The MIT Press, Cambridge Mass.
NLTK: The Natural Language Toolkit. Available from: http://www.nltk.org/Home
Noss B. Richard. 1964. Thai Reference Grammar, U. S. Government Printing Office, Washington DC.
Prachya Boonkwan, and Thepchai Supnithi. 2008. Memory-inductive categorial grammar: An approach to gap resolution in analytic-language translation. In Proceeding of 3rd International Joint Conference on Natural Language Processing (IJCNLP-2008), Hyderabad, India.
Stephen Clark and James R. Curran. 2007. Formalism-Independent Parser Evaluation with CCG and DepBank. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), Prague, Czech Republic.
Steven G. Bird, and Edward Loper. 2004. NLTK: The Natural Language Toolkit, In Proceedings of 42nd Meeting of the Association for Computational Linguistics (Demonstration Track), Barcelona, Spain.
Sumiyo Nishiguchi. 2008. Continuation-based CCG of Japanese Quantifiers. In Proceeding of 6th ICCS, The Korean Society of Cognitive Science, Seoul, South Korea.
Taneth Ruangrajitpakorn, Wasan na Chai, Prachya Boonkwan, Montika Boriboon, and Thepchai Supnithi. 2007. The Design of Lexical Information for Thai to English MT, In Proceeding of SNLP 2007, Pattaya, Thailand.
Vee Satayamas, and Asanee Kawtrakul. 2004. Wide-Coverage Grammar Extraction from Thai Treebank. In Proceedings of Papillon 2004 Workshops on Multilingual Lexical Databases, Grenoble, France.
Wojciech Buszkowski, Witold Marciszewski, and Johan van Benthem, ed. 1998. Categorial Grammar, John Benjamin, Amsterdam.
Xu Jiangsheng. 2000. Categorial Grammar Based on Feature Structures, dissertation in Institute of Computational Linguistics, Peking University.
Appendix A
Type | CG Category
Conjoiner | ws/s
Verb | (s\np)/ws
Function word | ((s\np)\(s\np))/(np\np)
Conjoiner | ws/(s/np)
Verb, Adjective | (s\np)/pp
Function word | ((s\np)\(s\np))/((s\np)\(s\np))
Function word | spnum
Determiner | (s\np)/num
Verb | ((s\np)/ws)/pp
Particle, Adverb | s\s
Verb, Adjective | (s\np)/np
Verb | ((s\np)/ws)/np
Verb | s\np/(s\np)/np
Function word, Verb, Adverb, Auxiliary verb | (s\np)/(s\np)
Adverb, Auxiliary verb | ((s\np)/pp)\((s\np)/pp)
Verb | s\np
Function word | (s\np)/(np\np)
Verb | ((s\np)/pp)/np
Function word, Particle | s/s
Auxiliary verb | (s\np)/((s\np)/np)
Function word, Adverb | ((s\np)/pp)/((s\np)/pp)
Function word | s/np
Conjunction | (s/s)/s
Auxiliary verb | ((s\np)/np)\((s\np)/np)
Auxiliary verb | s/(s/np)
Function word | (s/s)/np
Verb | ((s\np)/np)/np
Sentence | s
Function word | (s/s)/(s/np)
Verb | ((s\np)/np)/(s\np)
Conjoiner | pp/s
Classifier | (np\np)\num
Adverb […]erb | ((s\np)/(s\np))\((s\np)/(s\np))
Conjoiner | pp/np
Function word, Adverb, Auxiliary verb | (np\np)\(np\np)
Function word | ((np\np)\(np\np))/np
Conjoiner | pp/(s\np)
Classifier | (np\np)/spnum
Conjoiner | ((np\np)\(np\np))/(np\np)
Function word | num
Function word | (np\np)/s
Adverb, Auxiliary verb | ((np\np)/pp)\((np\np)/pp)
Classifier | np\num
Determiner | (np\np)/num
Adverb, Function word | ((np\np)/pp)/((np\np)/pp)
Adjective | np\np
Adjective, Conjoiner | (np\np)/np
Auxiliary verb | ((np\np)/np)\((np\np)/np)
Noun, Pronoun | np/pp
Function word | (np\np)/(s\np)
Conjoiner | ((np/pp)\(np/pp))/(np/pp)
Adjective, Determiner | np/np
Classifier, Function word, Adverb, Auxiliary verb | (np\np)/(np\np)
Verb | (((s\np)\np)
Function word | np/(s\np)
Auxiliary verb | (np\np)/((np\np)/np)
Verb | (((s\np)/ws)/pp)/np
Auxiliary verb | np/(np/np)
Adjective, Determiner | (np/pp)\(np/pp)
Conjoiner | (((s\np)/pp)\((s\np)/pp))/((s\np)/pp)
Function word | np/((s\np)/np)
Determiner | (np/pp)/(np/pp)
Function word | (((s\np)/pp)\((s\np)/pp))/(((s\np)/pp)\((s\np)/pp))
Noun, Pronoun | np
Classifier | ((s\np)\(s\np))\num
Verb | (((s\np)/pp)/np)/np
Conjunction | (s\s)/s
Classifier | ((s\np)\(s\np))/spnum
Function word | (((s\np)/pp)/np)/(((s\np)/pp)/np)
Adverb, Auxiliary verb | (s\np)\(s\np)
Function word | ((s\np)\(s\np))/np
Verb | (((s\np)/np)/(s\np))/pp
Conjoiner | ((s\np)\(s\np))/(s\np)
Conjoiner | (((np\np)/pp)\((np\np)/pp))/((np\np)/pp)