Session 2009-10
• Interpretation of text.
• The XML parser's job is to load the document, check that it
follows all necessary rules (at minimum, for
well-formedness), and build a document tree structure that
can be passed on to the application.
• The application is any program (e.g. browser, reader,
middleware) that acts upon the tree structure, processing
the data it contains.
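The parser/application split described above can be sketched with Python's standard xml.dom.minidom module. This is only an illustration under assumptions: the helper names load_tree and element_names and the sample <order> document are hypothetical and do not come from the slides.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def load_tree(xml_text):
    """Parser step: return a document tree, or None if the text is not well-formed."""
    try:
        return minidom.parseString(xml_text)
    except ExpatError:
        return None


def element_names(node):
    """Application step: walk the tree and collect element names in document order."""
    names = []
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            names.append(child.tagName)
            names.extend(element_names(child))
    return names


# Hypothetical sample documents: one well-formed, one with an unclosed <item>.
GOOD = "<order><item>pen</item><item>ink</item></order>"
BAD = "<order><item>pen</order>"

tree = load_tree(GOOD)
print(element_names(tree))
print(load_tree(BAD) is None)
```

The application code never touches the raw text; it only sees the tree (or the failure) that the parser hands over.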
How is parsing done?
[Figure: XML Document → XML Parser → packets of parsed XML data → Application to manipulate XML data (the XML Application)]
Disadvantage:
(1) It is memory inefficient
(2) It seems complicated, although it is not really
The Trouble With DOM
• Written by C programmers
• Cumbersome API
– Node does double-duty as collection
• Open source
• Classes / Interfaces
Event-driven
• This is an event-driven, serial-access mechanism
that does element-by-element processing.
• It calls a method you provide to process each
construct it encounters.
• It is more efficient for handling large XML documents.
• It gives you the information in bits and pieces.
[Figure: SAX call flow]
• Your program: main(...) calls parse(...)
• The SAX parser calls back into your program, in order:
startDocument(...), startElement(...), characters(...),
endElement( ), endDocument( )
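The callback sequence can be observed with a small handler that records each event. This is a minimal sketch using Python's standard xml.sax module, whose ContentHandler uses the same method names as the slides. The class name EventLogger and the sample <note> document are assumptions made for illustration.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Records every callback the SAX parser makes, in the order it makes them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append(("startDocument",))

    def startElement(self, name, attrs):
        self.events.append(("startElement", name))

    def characters(self, content):
        # The parser may deliver text in more than one chunk.
        self.events.append(("characters", content))

    def endElement(self, name):
        self.events.append(("endElement", name))

    def endDocument(self):
        self.events.append(("endDocument",))


def main():
    """Your program: create a handler, then hand control to the parser."""
    handler = EventLogger()
    xml.sax.parseString(b"<note><to>Ann</to></note>", handler)
    return handler.events


for event in main():
    print(event)
```

Control stays inside parseString until the document ends; your code runs only when the parser calls one of the handler methods.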
Simple SAX program
ContentHandler
DTDHandler
EntityResolver
ErrorHandler
[Figure: SAX parser components]
• XMLReader Factory: used to create a SAX parser
• Content Handler: handles document events (start tag, end tag, etc.)
• Error Handler: handles parser errors
• DTD Handler: handles the DTD
• Entity Resolver: handles entities
The XML document is read by the XML Reader, which calls these handlers.
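A rough sketch of how these pieces are wired together, using Python's standard xml.sax module: make_parser() plays the role of the reader factory, and the reader is given a content handler and an error handler. The class names TagCounter and CollectingErrorHandler and the function parse_with_handlers are hypothetical. The DTD handler and entity resolver are left at their defaults here.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler


class TagCounter(ContentHandler):
    """Content handler: counts start tags."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


class CollectingErrorHandler(ErrorHandler):
    """Error handler: remembers fatal parser errors, then stops the parse."""

    def __init__(self):
        self.messages = []

    def fatalError(self, exception):
        self.messages.append(exception.getMessage())
        raise exception


def parse_with_handlers(xml_bytes):
    reader = xml.sax.make_parser()      # factory creates the XML reader
    content = TagCounter()
    errors = CollectingErrorHandler()
    reader.setContentHandler(content)   # document events go here
    reader.setErrorHandler(errors)      # parser errors go here
    # reader.setDTDHandler(...) and reader.setEntityResolver(...) also exist;
    # this sketch keeps the defaults for both.
    try:
        reader.parse(io.BytesIO(xml_bytes))
    except xml.sax.SAXParseException:
        pass                            # already recorded by the error handler
    return content.count, errors.messages


print(parse_with_handlers(b"<a><b/><c/></a>"))
print(parse_with_handlers(b"<a><b></a>"))
```

Keeping error handling in its own object means the content handler does not need to know what happens when the document is malformed.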
Advantage:
(1) It is simple
Disadvantage:
(1) The data is broken into pieces, and clients never have all the
information as a whole unless they create their own data
structure
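One way a client can build its own data structure from the pieces is to accumulate them inside the handler. The sketch below assumes a hypothetical <library>/<book> document layout and uses Python's standard xml.sax module; RecordBuilder and load_books are illustrative names, not part of any SAX API.

```python
import xml.sax


class RecordBuilder(xml.sax.ContentHandler):
    """Collects each <book> into a dict so the full data set exists after parsing."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._current = None   # dict for the <book> being read, if any
        self._field = None     # name of the child element being read, if any
        self._text = []        # text chunks received for that child

    def startElement(self, name, attrs):
        if name == "book":
            self._current = {"id": attrs.get("id")}
        elif self._current is not None:
            self._field = name
            self._text = []

    def characters(self, content):
        if self._field is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "book":
            self.records.append(self._current)
            self._current = None
        elif name == self._field:
            self._current[self._field] = "".join(self._text).strip()
            self._field = None


def load_books(xml_bytes):
    builder = RecordBuilder()
    xml.sax.parseString(xml_bytes, builder)
    return builder.records


SAMPLE = (b'<library><book id="1"><title>XML</title><author>Ray</author></book>'
          b'<book id="2"><title>SAX</title></book></library>')
print(load_books(SAMPLE))
```

The handler has to track where it is in the document by hand, which is the bookkeeping a tree-building parser would otherwise do for the client.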
StAX better than SAX
• However,
– StAX is new, so most existing projects use SAX
– Many ideas, such as the use of factories, are the same for each
SAX vs DOM Parsing: Efficiency
QUERIES?