
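Research, Development and Application on Information and Communication Technology (Các công trình nghiên cứu, phát triển và ứng dụng CNTT-TT)

Volume V-1, No. 7 (27), May 2012

A Solution for Answer Reasoning in a QA System
(Một giải pháp suy diễn câu trả lời trong hệ thống hỏi đáp thông tin)
Phan Thị Tươi, Nguyễn Chánh Thành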
Abstract: Question answering (QA) is an important research field. Research groups have developed useful QA systems for many languages (English, Chinese, Japanese, ...) but only a few for Vietnamese. One of the most difficult tasks in QA is answer reasoning for natural-language questions, especially Vietnamese ones. This paper introduces an approach to reasoning out answers to Vietnamese questions based on graph theory and artificial intelligence (AI). The experiments, carried out on Vietnamese questions in an initial phase, show that the proposed approach is feasible for Vietnamese and can be extended to other languages in the future.
Keywords: reasoning, CG, question answering, QA.
I. INTRODUCTION
Question answering (QA) is a practical need of users all over the world. Research groups worldwide have proposed many methods for this problem, and some of their results demonstrate that these methods are feasible.
One line of QA research is carried out within artificial intelligence, in connection with expert systems, and has achieved a certain degree of success.
In addition, to improve the processing capability of QA systems and the quality of their answers, QA research has moved toward integration with natural language. This line of research is usually combined with other fields such as information retrieval (online or offline) and expert systems.

In the 1960s, some of the first natural-language question answering systems were built, such as ELIZA, Lunar and Baseball. QA systems were continuously refined and developed alongside computational linguistics and text comprehension in the following decade, the 1970s. In the 1990s, TREC (trec.nist.gov) officially introduced research topics and activities on QA systems. Several English-language QA systems exist today, such as AquaLog [1] and START [2].
In QA research, answer reasoning has always been a difficult and complex problem. Some studies restrict the scope of processing and rely mainly on matching the syntactic structure of the query against a set of predefined structural patterns, as in [1], [3], [4], [5], [6], [7] and [9]. Projects such as S-CREAM [11], MnM [12] and AquaLog [1] use a range of machine learning techniques to extract relations between objects, but only semi-automatically. IBM Watson introduces a different approach based on natural language [10]. The authors of [13] and [14] use conceptual graphs to represent the semantics of the relations between entities and, from these, determine candidate answers to a question.
In addition, some software packages address reasoning as a general problem (see footnote 1) and can therefore be applied in a QA system to support answer reasoning for user questions. They are listed in Table 1, where (*) marks open-source software and (**) marks software that is not free.
For QA with Vietnamese natural-language queries, [19] introduces a method for exploiting information in a database through a user interface that supports natural-language queries. That study concentrates on analyzing queries with a semantic grammar, reaching 91.91% accuracy and producing a semantics-oriented syntax tree (syntactic-semantic tree); it is nevertheless one of the approaches that points toward question analysis techniques for Vietnamese.
In addition, [20] presents an ontology-based approach to answer generation that answers 95% of a set of 60 Vietnamese questions correctly. Its content generation method concentrates on reasoning over the semantic relations in the ontology in order to match the concepts related to the interrogative phrase.

Footnote 1: Listed in detail at en.wikipedia.org/wiki/Semantic_reasoner
[26] presents an in-depth study of an ontology-based QA system model that supports Vietnamese. Its answer reasoning relies on the is-a semantic relation to search for suitable results. The system's promising experimental results show the potential of this approach. However, automatic reasoning with transitive inference for complex questions remains a goal not only of [26]; it is also a problem addressed in this paper.
The approach of [27], although not implemented for Vietnamese questions, analyzes queries and represents them, with phrases at their core, as conceptual graphs. It then reasons out the answer content and, notably, supports transitive inference, achieving 93% precision and 98% recall on a test set of 346 queries. This approach has strongly influenced our research. However, automatically generating the whole chain of inferences that leads to a reasonable answer is a hard problem, and this paper presents our approach to it.
[21] proposes a non-syntactic method for understanding natural-language queries that are phrased as ill-formed questions. The method exploits the knowledge in an ontology to recognize entities and determine the relations among them in a query, which simplifies the rules for transforming natural-language queries by tightly linking natural language with conceptual graphs.
Other well-known works on conceptual graphs and their applications, such as [22], [23], [24] and [25], provide a sound and rigorous orientation that underlies the conceptual graph issues presented in this paper.
Based on the survey above, we develop a Vietnamese question answering system that selectively inherits the strengths of related work, in particular [1], [14], [21] and [27], and applies them effectively to Vietnamese in question answering and information retrieval, with the goal of building an information query system with better semantic support.
This paper introduces an approach to reasoning out answer content for a semantics-oriented QA system that supports Vietnamese. The approach analyzes the reasoning options identified from the knowledge base for the content of a question, which is itself analyzed on the basis of Dependency Grammar ([16], [17] and [18]).
The rest of the paper is organized as follows. This first section gives an overview of the proposed method and of related work in Vietnam and abroad. Section II describes the general model of the QA system and introduces how questions and answers are processed in the proposed system. Section III presents the main content of the paper: a different approach to conceptual graphs and an automatic reasoning method based on the CGBAR algorithm. Section IV summarizes the experimental results. Finally, we give our conclusions and directions for future research.
II. MODEL OF THE VIETNAMESE QUESTION ANSWERING SYSTEM
Our Vietnamese question answering system operates in the digital library domain and helps users look up in-depth information about scientific and technical documents. The system is described in Figure 1 ([17]).

Table 1. Summary of reasoning support software published worldwide


BaseVISor
**

Bossam
*

FaCT *

FaCT++
*

HermiT
*

Hoolet
*

Jena *

KAON2 *

OntoBroker **

OWLIM
**

Pellet
**

RacerPro
*

SweetRules
*

Reasoning algorithm

Rulebased,
Rete

Rulebased

Tableau

Tableau

Hypertableau

Firstorder
prover

Rulebased

Resolution
& Datalog

Rulebased

Tableau

Tableau

Rulebased

Version

2.0

0.9b45

1.1.8

1.2.4

2.0.2

2.0

2.1

Khng
bit
Khng
bit

2008-0629
C

2.x/3.x

Khng

Khng
bit
C

2.5.4

OWL-DL entailment
Supported expressivity for reasoning
Consistency checking

Khng
bit
C

OWL: Resolution
& Datalog; Flogic: Rule-based
(BottomUp,
MagicSet, QSQ,
DynamicFiltering)
6.0
C

Khng

Khng

SHIQ

SROIQ
(D)

SROIQ
(D)

Khng
bit

SHIQ (D)

OWL: SHIQ (D)


(for OntoBroker
6.1); F-logic

SHIQ
(D-)

Khng
bit

Rentailment,
OWL 2
RL
C

SROIQ
(D)

Khng
bit

Khng

Khng

Khng

Khng

Khng

Cc
dng
khc
nhau
Cha
xong
cho
OWL
DL
C

Khng

Khng

C,t nh
dng

C,
SWRL
& t
nh
dng

Khng

Khng

C,
SWRL
- DL
Safe
Rules

C,
SWRL

C,
SWRL DL Safe
Rules

C, SWRL, RIF,
F-logic,
ObjectLogic

C, t
nh dng

C,
SWRL
-DL
Safe
Rules

C,
SWRL&
t nh
dng

C,
SWRL,
RuleML,
Jess

DIG support
Rule support

Rentailment,
OWL 2
RL
C

Khng

C, t
nh
dng


Figure 1. Model of the Vietnamese question answering system

The system has three main functional blocks: the Vietnamese question analyzer, the component that analyzes and determines the answer content, and the component that selects and generates the answer.
In the first block, the user's natural-language Vietnamese question is analyzed in the following order:
- Word segmentation and part-of-speech tagging for each component of the question. This step lays the groundwork for the subsequent tasks in the system.
- Analysis of the question type, based on the interrogative word and the information gathered in the previous step. The original question is then stored in a structured form and also represented as a parse tree.
Depending on whether the user's question is simple or complex, this analysis step outputs one or more corresponding linguistic tuples. Here a linguistic tuple represents the storage structure of the original question, with components for the agent, the action and the object, as in the example <ai?, viết, sách AI> (<who?, write, AI book>). Depending on the question type, one or more components of the tuple are expressed by an interrogative word or by a question mark (the symbol ?).
In the next block, the linguistic tuple can serve as the basic cell of the knowledge base, where it is also called a knowledge tuple. The system extracts the knowledge tuples that match the content and the interrogative words of the linguistic tuple. A knowledge tuple describes a group of information about an agent, an action and an object, as in the example <Aho, là tác giả, sách Compiler> (<Aho, is author of, Compiler book>).
The answer selection and generation component in the last module lets the system give users Vietnamese answers in natural language, in a friendlier and more understandable way. This is also a planned function of the authors' Vietnamese question answering system (see [17], [18]).

III. REASONING OUT THE ANSWER CONTENT
The authors' approach to this reasoning is based on graph theory and the path-finding problem on graphs (from artificial intelligence and discrete mathematics). The first step is to build and develop a graph of concept classes. This graph is the foundation for the automatic reasoning method in the next step, which finds the answer content for a question from the possibilities that correspond to suitable paths in the graph.
1. Graph of concept classes
Research on conceptual graphs (Conceptual Graph) is very active, with many research groups and related works (see footnote 2). In this paper, the authors define a graph of concept classes as a connected directed graph whose vertex set E consists of concept classes and whose edge set V consists of the semantic relations between the vertices in E. The graph is denoted Gcc = <V, E>.
The graph Gcc illustrated in Figure 2 consists of:
E = {e_0:Author, e_1:Conference, ..., e_8:Topic}, with the set of concept classes {Author, Conference, ..., Topic}
V = {v_{8:Topic-4:Paper}, v_{0:Author-4:Paper}, ..., v_{5:Publication-6:Publisher}}
When each edge v_{e1-e2} joining the vertices (that is, concept classes) e1 and e2 of Gcc is additionally given a weight, namely the conditional probability of a concept of class e2 given a concept of class e1, the graph is called a weighted conceptual graph and is denoted Gccw = <Vw, E>.
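The definitions of Gcc and Gccw can be sketched as a small adjacency structure. This is only an illustration: the class name `ConceptClassGraph`, its methods and the probability values are assumptions made for the example, and only a few of the edges named in the text are included.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptClassGraph:
    """Directed graph of concept classes (Gcc); edge weights turn it into a Gccw."""
    # adjacency map: source class -> {target class: weight}
    edges: dict = field(default_factory=dict)

    def add_relation(self, src, dst, weight=1.0):
        """Add the edge src -> dst; weight stands for P(dst concept | src concept)."""
        self.edges.setdefault(src, {})[dst] = weight
        self.edges.setdefault(dst, {})  # register dst as a vertex even without out-edges

    def classes(self):
        """Return the vertex set E as a sorted list of class names."""
        return sorted(self.edges)

    def successors(self, cls):
        """Return the classes reachable from cls over one edge."""
        return sorted(self.edges.get(cls, {}))


# Edges named in the text; the probabilities are invented for illustration.
gccw = ConceptClassGraph()
gccw.add_relation("Author", "Paper", 0.9)
gccw.add_relation("Topic", "Paper", 0.7)
gccw.add_relation("Paper", "Publication", 0.8)
gccw.add_relation("Publication", "Publisher", 0.5)

print(gccw.classes())
print(gccw.successors("Paper"))
```

Leaving every weight at its default of 1.0 gives the unweighted graph Gcc under the same representation.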

Footnote 2: A detailed list is given at en.wikipedia.org/wiki/Conceptual_graph

Figure 2. The graph of concept classes Gcc


These kinds of graphs can be applied to problems in various fields, such as building semantics-oriented indexes, designing the structure of a knowledge base, or supporting search.
In the graph Gcc, for any two vertices e1 and e2 there is always a path p from e1 to e2, made up of the edges that link the intermediate vertices between them. The path p expresses a semantic relation between an element of class e1 and an element of class e2 through the transitive relations among the intermediate classes.
For k vertices e1, e2, ..., ek in Gcc, there is at least one path among these vertices, in one of two cases:
- The vertices all lie on a single path;
- The vertices lie on edges that belong to different paths.
For example, in the graph of Figure 2:
- The first case occurs for the vertices e0:Author, e5:Publication and e6:Publisher, which lie on the same path p: e0:Author-e4:Paper-e5:Publication-e6:Publisher
- The second case occurs for the vertices e0:Author, e3:Keyword and e7:Reference, with two related paths p1: e0:Author-e4:Paper-e3:Keyword and p2: e0:Author-e4:Paper-e7:Reference
Similarly, a path pw in the graph Gccw expresses a weighted semantic relation whose weight is a probability obtained by combining the intermediate probabilities.
2. Patterns in the graph Gcc
In the graph Gcc or Gccw, between any two vertices e1 and e2 there is always at least one path through intermediate vertices, in the direction from e1 to e2 or the reverse. In either direction, the path forms a chain of consecutive semantic relations. This chain represents the relations among the core components in the content of one or more sentences of a text. Consequently, a question or an answer (about the content of the text) can be associated with a frame of transitively related information classes. This frame is called a pattern. For example, the pattern Author-Paper-Publication-Publisher corresponds to the path from the vertex Author to the vertex Publisher (as in Figure 2).
From the path of a given pattern we can find the reverse pattern (see footnote 3). For example, Publisher-Publication-Paper-Author is the reverse pattern obtained from the example above.
Based on the graph Gcc (or Gccw), the set of patterns can be determined by the following procedure:
Procedure for building patterns from the graph Gcc
Input: the graph Gcc representing an ontology O
Output: the set of patterns representing the information classes in O
Processing:
1. For i = 1..n (n is the number of vertices of Gcc)
1.1. Find all paths through i vertices in Gcc
1.2. Determine and store a pattern from the nodes (classes) on each path found above.
2. Return the stored list of patterns.
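The procedure above can be sketched as a depth-first enumeration of simple paths. This is a minimal illustration, not the authors' implementation: the names `GCC`, `build_patterns` and `reverse_pattern` are hypothetical, the graph contains only edges named in the text rather than the full graph of Figure 2, and the paths are collected in one traversal instead of looping over the length i, which yields the same set.

```python
# Partial Gcc built only from edges mentioned in the text (assumption:
# edge directions follow the pattern examples, e.g. Author -> Paper).
GCC = {
    "Author": ["Paper"],
    "Topic": ["Paper"],
    "Paper": ["Publication", "Keyword", "Reference"],
    "Publication": ["Publisher"],
    "Keyword": [],
    "Reference": [],
    "Publisher": [],
}


def build_patterns(graph):
    """Return every simple directed path with at least two classes as 'A-B-C'."""
    patterns = []

    def extend(path):
        if len(path) >= 2:
            patterns.append("-".join(path))
        for nxt in graph.get(path[-1], ()):
            if nxt not in path:  # keep the path simple: no class is visited twice
                extend(path + [nxt])

    for start in graph:
        extend([start])
    return sorted(patterns)


def reverse_pattern(pattern):
    """Return the reverse pattern, e.g. 'A-B-C' -> 'C-B-A'."""
    return "-".join(reversed(pattern.split("-")))


patterns = build_patterns(GCC)
print(len(patterns))
print(reverse_pattern("Author-Paper-Publication-Publisher"))
```

Because a reverse pattern is derived from its pattern on demand, only one direction needs to be stored.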

From the graph in Figure 2, the set of patterns is determined as shown in Table 2.
These patterns can be developed further to build sets of questions (Yes/No or WH) or answers in the question answering system.
Table 2. List of basic patterns
No.   Related components
1.    Author-Paper
2.    Author-Paper-Publication
3.    Author-Paper-Publication-Publisher
4.    Author-Paper-Reference
...   ...
70.   Topic-Publication
71.   Topic-Publication-Publisher
72.   Topic-Paper-Reference

Footnote 3: For simplicity, only the patterns are listed; the reverse patterns are not shown.


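For example, some questions related to the pattern Author-Paper-Reference are:
- WH: Who wrote the paper OPQ? (Ai viết bài báo OPQ?)
- Y/N: Did the author ABC write the paper OPQ? (Tác giả ABC viết bài báo OPQ phải không?)
- WH: Which references does the paper OPQ written by the author ABC have? (Tác giả ABC viết bài báo OPQ có những tham khảo nào?)
- Y/N: Does the paper OPQ written by the author ABC cite the reference XYZ? (Tác giả ABC viết bài báo OPQ có tham khảo XYZ không?)
For a question in the active or passive voice, the analysis step (see [18]) identifies the corresponding components of the question. Depending on which component is the interrogative one, either the pattern or the reverse pattern is selected.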
3. Reasoning out the answer content
After questions have been analyzed with the patterns described above, the reasoning step that finds the answer content (in Figure 1) is carried out with a graph-based path reasoning algorithm (Conceptual Graph-based answering reasoning algorithm, CGBAR). This algorithm is developed from path-finding algorithms in graph theory and artificial intelligence (see footnote 4).
Here q is the user's question; t_i is a sub-question clause of q and also the basis for determining a linguistic tuple; D_i is the set of linguistic tuples determined from q; v_ik is the content of a component of t_i in q, which corresponds to the class c_ik and may have the property a_ik; c_ij is a vertex (point) of the graph; and n(i) is the length of the path in the graph that corresponds to t_i.
CGBAR algorithm
Input:
- a question q with component question clauses Q = {q_i}, i = 1..n,
- an ontology O containing the set of classes C = {c_j}, j = 1..m
Output: the answer content
Processing:
1. Build the graph Gcc for the set C of the ontology O.
2. For each question clause q_i (i = 1, 2, ..., n)
2.1. D_i ← ∅
2.2. For each component t_ik of the question clause q_i
2.2.1. Determine the value v_ik, the class c_ik and the property a_ik.
2.2.2. D_i ← D_i ∪ {<v_ik, c_ik, a_ik>}
3. For D_i = {<v_ik, c_ik, a_ik> / k = 1, 2, ..., n(i)} (i = 1..n)
3.1. Order the class names so that c_i1 is associated with a v_i1 that is non-empty and is not an interrogative word.
3.2. Search for the shortest path p from the start point c_i1 to the end point c_in(i) in the graph Gcc.
3.3. If n(i) = 2, reason by combining the results found in the tuple t_i1 associated with v_i1 and the tuple t_i2 associated with v_i2, return the intermediate result, and go to step 4.
3.4. If n(i) > 2, reason by combining the results found in the tuple t_ik associated with v_ik and the tuple t_ik+1 associated with v_ik+1 (k < n(i)-1) along the route of the path.
3.4.1. If p does not exist, decompose into the component shortest sub-paths sp_1 = c_11...c_1l(1), sp_2 = c_21...c_2l(2), ..., sp_x = c_x1...c_xl(x) such that c_1 = c_11 = c_21 = ... = c_x1.
3.4.2. Create the partial result sets s_1 from sp_1, ..., s_x from sp_x.
3.4.3. Link the partial sets s_1, ..., s_x into the set s, return the intermediate result, and go to step 4.
4. Create and return the answer content from the intermediate result sets of the steps above.

Footnote 4: A reference taken from en.wikipedia.org/wiki/A*_search_algorithm
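A much reduced sketch of steps 3.2 to 3.4 may help to see how a path in the graph drives the reasoning. It is not the authors' implementation and rests on several assumptions: a question clause is reduced to one known value with its class and one asked class, knowledge tuples are stored as value pairs per edge, the shortest path is found by breadth-first search, and the decomposition of steps 3.4.1 to 3.4.3 is left out. The names `EDGES`, `FACTS`, `shortest_path`, `follow` and `answer`, and all data values, are hypothetical.

```python
from collections import deque

# Edges of a partial Gcc and toy knowledge tuples, stored per edge as
# (value of source class, value of target class). All data is invented.
EDGES = [("Author", "Paper"), ("Paper", "Publication"),
         ("Publication", "Publisher"), ("Paper", "Reference")]
FACTS = {
    ("Author", "Paper"): [("ABC", "OPQ"), ("DEF", "RST")],
    ("Paper", "Publication"): [("OPQ", "Proc-1")],
    ("Publication", "Publisher"): [("Proc-1", "Pub-X")],
    ("Paper", "Reference"): [("OPQ", "XYZ")],
}


def shortest_path(edges, start, goal):
    """Breadth-first search; edges are walked in both directions so that
    reverse patterns are covered as well."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in adjacency.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def follow(values, src, dst, facts):
    """Move a set of values from class src to class dst over one edge."""
    if (src, dst) in facts:
        return {b for a, b in facts[(src, dst)] if a in values}
    return {a for a, b in facts.get((dst, src), []) if b in values}


def answer(known_value, known_class, asked_class):
    """Chain the knowledge tuples along the shortest path between the classes."""
    path = shortest_path(EDGES, known_class, asked_class)
    if path is None:
        return set()
    values = {known_value}
    for src, dst in zip(path, path[1:]):
        values = follow(values, src, dst, FACTS)
    return values


print(answer("ABC", "Author", "Publisher"))
print(answer("OPQ", "Paper", "Author"))
```

The second call walks the Author-Paper edge backwards, which corresponds to choosing the reverse pattern when the interrogative component is the author.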

If a graph Gccw can be built from the ontology O, the CGBAR algorithm is improved into the WCGBAR algorithm by performing the search on the weighted graph Gccw in steps 1 and 3.2.
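The text does not spell out how the weights enter the search. One plausible reading, given here only as an assumption, is that the weight of a path is the product of its edge probabilities and that WCGBAR prefers the most probable path. Under that reading, Dijkstra's algorithm on the costs -log(p) finds such a path. The function name, the weights and the direct Paper-Publisher edge are hypothetical.

```python
import heapq
from math import exp, log

# Hypothetical conditional probabilities on the edges of a small Gccw.
WEIGHTS = {
    ("Author", "Paper"): 0.9,
    ("Paper", "Publication"): 0.8,
    ("Publication", "Publisher"): 0.5,
    ("Paper", "Publisher"): 0.3,  # invented shortcut edge, for contrast
}


def most_probable_path(weights, start, goal):
    """Dijkstra on -log(p): minimising the summed cost maximises the product
    of the edge probabilities. Every probability must lie in (0, 1]."""
    adjacency = {}
    for (a, b), p in weights.items():
        adjacency.setdefault(a, []).append((b, -log(p)))
    heap = [(0.0, [start])]
    settled = {}
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == goal:
            return path, exp(-cost)
        if node in settled and settled[node] <= cost:
            continue  # this vertex was already reached at least as cheaply
        settled[node] = cost
        for nxt, step_cost in adjacency.get(node, []):
            heapq.heappush(heap, (cost + step_cost, path + [nxt]))
    return None, 0.0


path, probability = most_probable_path(WEIGHTS, "Author", "Publisher")
print(path, round(probability, 2))
```

With these numbers the three-edge path through Publication (0.9 x 0.8 x 0.5 = 0.36) is preferred over the shorter path through the invented direct edge (0.9 x 0.3 = 0.27), which shows how a weighted search can differ from an unweighted shortest-path search.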
The two proposed algorithms differ in certain respects from the method of Salloum [14]. That work proposes a distinctive solution for reasoning on conceptual graphs, with fairly convincing results. However, the foundation of the method is built manually by experts, which limits its processing capability. What distinguishes the two algorithms proposed here is that reasoning is handled automatically, as a path-finding problem on a graph. This overcomes the limitations of the method in [14] and at the same time improves the processing capability of the system proposed by the authors.
IV. EXPERIMENTS
1. Developing an ontology for question answering reasoning
We analyzed and mined data from scientific papers obtained from ACM (www.acm.org), comprising 31679 papers on information technology (149 MB), as follows.

Table 3. Statistics of the training data

Type                           Count     Distinct (exact) count   Ratio
Links                          144981    144978                   100.00%
Authors                        111736    47458                    42.47%
General terms (GeneralTerms)   222858    118995                   53.39%
Keywords                       559448    273360                   48.86%
Paper content (Papers)         27412     27412                    100.00%
References                     309466    248540                   80.31%
Paper topics (Topics)          126997    7350                     5.79%

Figure 3. Content of the CGBAR algorithm

2. Experiments with the answer content reasoning method
In the CGBAR algorithm, steps 3.3, 3.4.3 and 4 can be implemented with various software engineering techniques. Figure 3 illustrates the implementation result and the intermediate products of processing. The complete implementation of the reasoning engine is shown in Figure 4.
While implementing the CGBAR algorithm and the reasoning engine, the following issues arose:
- (V1) Optimizing the execution time of answer reasoning, because the intermediate results generated are too complex and queries take too long when there is a lot of data.
- (V2) Optimizing the content of the answer reasoning step when the question contains many predetermined keywords.
- (V3) Finding a way to handle inference for a question that involves an arbitrary action (verb).
- (V4) Expanding the question and advising the user when the question is ambiguous.
The solution for V1 is to optimize the intermediate results by reorganizing their data structure as a hash table, which improves lookup and reduces processing time by approximately 60%, thereby increasing the performance of the reasoning engine.
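The idea behind the V1 solution can be illustrated with a small sketch that compares a linear scan over intermediate result pairs with a hash-table index. The data and the names `AUTHOR_PAPER`, `lookup_scan` and `build_index` are invented for the example; the sketch shows only the change of data structure and says nothing about the measured speed-up.

```python
from collections import defaultdict

# Intermediate result of one reasoning step as (author, paper) pairs (toy data).
AUTHOR_PAPER = [("ABC", "OPQ"), ("ABC", "RST"), ("DEF", "UVW")]


def lookup_scan(pairs, key):
    """Baseline: scan every pair, so each lookup costs O(len(pairs))."""
    return {value for k, value in pairs if k == key}


def build_index(pairs):
    """Reorganise the pairs as a hash table: key -> set of values.
    Built once in O(len(pairs)); afterwards a lookup is O(1) on average."""
    index = defaultdict(set)
    for k, value in pairs:
        index[k].add(value)
    return index


INDEX = build_index(AUTHOR_PAPER)
print(lookup_scan(AUTHOR_PAPER, "ABC") == INDEX["ABC"])
```

Both lookups return the same set; the index pays off when the same intermediate result is probed many times while chaining along a path.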
One way to handle V2 is to keep only the keywords related to the components that appear in the aggregated result set of step 4 or step 5.3 of the CGBAR algorithm.

Figure 4. Complete implementation of the reasoning engine

For V3, determining the semantic relations (is-a, part-of, similar, hypernymy, ...) between the verb and a class of the ontology O helps identify the main class to which the action relates. The synonyms in that class are then used as substitutes when searching for the answer content.
V4 occurs when at least one piece of information in the question cannot be assigned to any class of the ontology O. Applying relevance feedback in query expansion is a reasonable choice for this problem. It lets the reasoning engine learn from experts (the users); the engine can then analyze and count the choices that are selected most frequently and suggest them to the user, which helps resolve the ambiguity.
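The frequency-based suggestion described for V4 could look roughly like the sketch below. The class `FeedbackLog`, its methods and the sample term are hypothetical; the text does not describe how the feedback is actually stored or ranked.

```python
from collections import Counter


class FeedbackLog:
    """Counts which ontology class users pick for a term the system could not classify."""

    def __init__(self):
        self._choices = {}  # unknown term -> Counter of chosen classes

    def record(self, term, chosen_class):
        """Store one user decision for an ambiguous term."""
        self._choices.setdefault(term, Counter())[chosen_class] += 1

    def suggest(self, term, k=2):
        """Return up to k classes for the term, most frequently chosen first."""
        return [cls for cls, _ in self._choices.get(term, Counter()).most_common(k)]


log = FeedbackLog()
log.record("sách", "Publication")
log.record("sách", "Paper")
log.record("sách", "Publication")
print(log.suggest("sách"))
```

A term with no recorded feedback simply yields an empty suggestion list, in which case the system would have to ask the user directly.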



In the experiments, the question analysis step, which maps the content of a question to classes of the ontology O, was run on 210 questions in 5 comparable test groups, and 90.52% of them were analyzed correctly. In some cases, however, part of the question content was not recognized, either because the ontology has a limited number of vocabulary entries or because Vietnamese words in the question were not recognized correctly, so the number of correct results (according to the program and according to experts) is still limited. Reasoning was then performed on the output of this step; the results are summarized in the following table.
Table 4. Results of answer content reasoning
Experiment   (1)   (2)   (3)   (4)   (5)        (6)
Group 1      31    31    29    30    93.548%    96.667%
Group 2      38    38    36    37    94.737%    97.297%
Group 3      54    54    52    54    96.296%    94.545%
Group 4      40    40    37    39    92.500%    94.872%
Group 5      47    47    44    46    93.617%    95.652%
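The percentages in columns (5) and (6) of Table 4 are consistent with reading precision as column (3) divided by column (2) and recall as column (3) divided by column (4). This reading is inferred from the numbers and the table legend, not stated as a formula in the text. The short check below reproduces the first two rows; the names `ROWS`, `precision` and `recall` are illustrative.

```python
# (2) answers inferred, (3) valid answers, (4) valid answers according to experts
ROWS = {
    "Group 1": (31, 29, 30),
    "Group 2": (38, 36, 37),
}


def precision(inferred, valid):
    """Share of inferred answers that are valid: (3) / (2)."""
    return valid / inferred


def recall(valid, expert_valid):
    """Share of expert-validated answers that were found: (3) / (4)."""
    return valid / expert_valid


for name, (inferred, valid, expert_valid) in ROWS.items():
    p = round(100 * precision(inferred, valid), 3)
    r = round(100 * recall(valid, expert_valid), 3)
    print(name, p, r)
```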

Where:
(1) Number of Vietnamese questions analyzed correctly
(2) Number of Vietnamese questions for which an answer could be inferred
(3) Number of Vietnamese questions for which a valid answer was inferred
(4) Number of Vietnamese questions for which a valid answer was inferred, according to experts
(5) Precision, (6) Recall

In the test groups, some cases did not meet the requirements; they relate to one of the issues V1 to V4 analyzed above and also to ambiguity that has not been fully resolved. Overall, the reasoning that produces intermediate results shows fairly good precision for this processing step, although these measures depend on the results of the preceding analysis.
V. CONCLUSION
This paper presents an approach to developing a reasoning engine for a question answering system that supports Vietnamese, focusing on a method for reasoning out answer content based on the A* algorithm and graph theory.
The CGBAR and WCGBAR algorithms are proposed in this study to guide the automatic search for candidate answers to user questions. Optimizing these algorithms is part of the authors' future work.
The experimental results are limited by the simple form of the questions (Y/N and WH questions with a single question clause), by the size of the question set (only 210 questions) and by the knowledge base used for the experiments. They nevertheless show, through the precision and recall values obtained, that the proposed methods are feasible. Optimizing the proposed method and model is therefore the research that our group will continue in the near future, with the goal of building a question answering system for Vietnamese that is truly effective and useful not only in information technology but also in other fields.
REFERENCES
[1]. Vanessa Lopez, Victoria Uren, Enrico Motta, Michele Pasin, AquaLog: An ontology-driven question answering system for organizational semantic intranets, Journal of Web Semantics, 31 March 2007.
[2]. START, start.csail.mit.edu
[3]. Lexxe, www.lexxe.com
[4]. Ask, www.ask.com
[5]. W5hanswers Q&A, www.w5hanswers.com
[6]. www.mshd.net
[7]. Hesitation, www.hesitation.co.uk
[8]. Google Answers, knol.google.com/k/google-answers
[9]. Google Answers (Chinese), enda.tianya.cn/wenda
[10]. IBM Watson, www.ibm.com/innovation/us/watson
[11]. S. Handschuh, S. Staab, F. Ciravegna, S-CREAM: Semi-automatic CREAtion of Metadata, 13th Int. Conference on Knowledge Engineering and Management, 2002, Spain.
[12]. M. Vargas-Vera et al., MnM: Ontology driven semi-automatic support for semantic markup, 13th Int. Conference on Knowledge Engineering and Management, 2002, Spain.
[13]. Hồng Trung Dũng, Cao Hoàng Trụ, Dịch tự động truy vấn tiếng Việt sang đồ thị ý niệm (Automatic translation of Vietnamese queries into conceptual graphs), Tạp chí Tin học và Điều khiển học, vol. 23, no. 3, 2007 (pp. 272-283).

[14]. Salloum, Wael, A Question Answering System based on Conceptual Graph Formalism, 2nd Int. Symposium on Knowledge Acquisition & Modeling, IEEE CS Press, 2009.
[15]. Jiří Mírovský, Netgraph Query Language for the Prague Dependency Treebank 2.0, The Prague Bulletin of Mathematical Linguistics, number 90, 12/2008 (pp. 5-32).
[16]. Tuoi Phan, Thanh Nguyen, Thuy Huynh, Question Semantic Analysis in Vietnamese QA System, ACIIDS 2010, Vietnam.
[17]. Tuoi T. Phan, Thanh C. Nguyen, Vietnamese knowledge base development and exploitation, International Journal of Business Intelligence and Data Mining, 2010. ISSN: 1743-8195.
[18]. V THANH HNG, Nghiên cứu và xây dựng tập các câu truy vấn phục vụ cho hệ thống hỏi đáp tiếng Việt (Research on and construction of a query set for a Vietnamese question answering system), undergraduate thesis, Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, 2010.
[19]. Anh K. Nguyen, Huong T. Le, Natural Language Interface Construction using Semantic Grammars, PRICAI08, Hanoi, 2008, Vietnam.
[20]. Dai Q. Nguyen, Dat Q. Nguyen, Son B. Pham, A Vietnamese Question Answering System, KSE09, IEEE CS, 2009.
[21]. Cao, T.H. & Anh, M.H., Ontology-Based Understanding of Natural Language Queries using Nested Conceptual Graphs, 18th Int. Conference on Conceptual Structures, 2010, Malaysia, LNCS 6208.
[22]. Philip H. P. Nguyen, Dan Corbett, A basic mathematical framework for conceptual graphs, IEEE Transactions on Knowledge and Data Engineering, Volume 18, Issue 2, February 2006.
[23]. Cao, T.H., Conceptual Graphs and Fuzzy Logic: A Fusion for Representing and Reasoning with Linguistic Information, Studies in Computational Intelligence, Vol. 306, Springer-Verlag, 2010.
[24]. Cao, T.H., Fuzzy Conceptual Graph Programs for Knowledge Representation and Reasoning, Tech. Report 400, University of Queensland, Australia, 1997.
[25]. Croitoru and Van Deemter, A Conceptual Graph Approach to the Generation of Referring Expressions, IJCAI, 2007, Hyderabad, India.
[26]. Dang T. Nguyen and Tri Phi-M. Nguyen, A Question Answering Model Based Evaluation for OVL (Ontology for Vietnamese Language), International Journal of Computer Theory and Engineering, Vol. 3, No. 3, June 2011.
[27]. Tho Thanh Quan, Siu Cheung Hui, Ontology-based Natural Query Retrieval using Conceptual Graphs, PRICAI08, Hanoi, 2008.

Received: 18/05/2011

ABOUT THE AUTHORS
PHAN THỊ TƯƠI
Graduated in Computer Science from the Technical University in Czechoslovakia in 1976. Received a PhD in Computer Science from Charles University, Czech Republic, in 1985.
Currently works at the Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology.
Research interests: natural language processing and text processing; information retrieval; information extraction.
Email: tuoi@cse.hcmut.edu.vn

NGUYỄN CHÁNH THÀNH
Graduated in Mathematics from Ho Chi Minh City University of Education in 1994. Received the Engineer, Master of Engineering and Doctor of Engineering degrees in Computer Science in 1998, 2003 and 2011 from Ho Chi Minh City University of Technology.
Research interests: natural language processing, information retrieval, information extraction, semantic web.
Email: chanh.thanh@yahoo.com.vn