Java Architecture for XML Binding (JAXB)


By Ed Ort and Bhakti Mehta, March 2003

You will find the following topics covered in this article:

What's JAXB?
An Example: Accessing an XML Document
    Bind the Schema
    Unmarshal the Document
Another Example: Building an XML Document
    Bind the Schema
    Create the Content Tree
    Marshal the Content Tree
A Final Example: Updating an XML Document
Binding Can Be Customized
Distinct Advantages
Run the Examples

What's JAXB?
The Extensible Markup Language (XML) and Java technology are natural partners in helping developers exchange data and programs across the Internet. That's because XML has emerged as the standard for exchanging data across disparate systems, and Java technology provides a platform for building portable applications. This partnership is particularly important for Web services, which promise users and application developers program functionality on demand from anywhere to anywhere on the Web. XML and Java technology are recognized as ideal building blocks for developing Web services and applications that access Web services.

But how do you couple these partners in practice? More specifically, how do you access and use an XML document (that is, a file containing XML-tagged data) through the Java programming language? One way to do this, perhaps the most typical way, is through parsers that conform to the Simple API for XML (SAX) or the Document Object Model (DOM). Both of these parsers are provided by Java API for XML Processing (JAXP). Java developers can invoke a SAX or DOM parser in an application through the JAXP API to parse an XML document -- that is, scan the document and logically break it up into discrete pieces. The parsed content is then made available to the application.

In the SAX approach, the parser starts at the beginning of the document and passes each piece of the document to the application in the sequence it finds it. Nothing is saved in memory. The application can take action on the data as it gets it from the parser, but it can't do any in-memory manipulation of the data. For example, it can't update the data in memory and return the updated data to the XML file.

In the DOM approach, the parser creates a tree of objects that represents the content and organization of data in the document. In this case, the tree exists in memory. The application can then navigate through the tree to access the data it needs, and if appropriate, manipulate it.

Now developers have another Java API at their disposal that can make it easier to access XML documents: Java Architecture for XML Binding (JAXB). A Reference Implementation of the API is now available in the Java Web Services Developer Pack V 1.1. Let's look at JAXB in action, and compare it to SAX and DOM-based processing.
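Before turning to JAXB, the SAX model described above can be made concrete with a minimal sketch that uses the JAXP SAX parser shipped with the JDK. The document content, the `name` element, and the `SaxNameSketch` and `NameHandler` class names are illustrative assumptions, not the article's sample program. The handler sees each piece of the document once, in order, and keeps only what it chooses to copy.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxNameSketch {

    /** Callback methods: collects the text of every <name> element (hypothetical tag). */
    static class NameHandler extends DefaultHandler {
        final List<String> names = new ArrayList<>();
        private StringBuilder text; // non-null only while inside a <name> element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("name".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (text != null) {
                text.append(ch, start, length); // text may arrive in several chunks
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("name".equals(qName)) {
                names.add(text.toString().trim());
                text = null;
            }
        }
    }

    /** Parses the XML text and returns the names in document order. */
    static List<String> bookNames(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            NameHandler handler = new NameHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.names;
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<collection><book><name>Book A</name></book>"
                   + "<book><name>Book B</name></book></collection>";
        System.out.println(bookNames(xml)); // prints [Book A, Book B]
    }
}
```

Note that the list of names exists only because the handler built it; the parser itself retains nothing, which is why SAX alone cannot update a document and write it back.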

An Example: Accessing an XML Document


Suppose you need to develop a Java application that accesses and displays data in XML documents such as books.xml. These documents contain data about books, such as book name, author, description, and ISBN identification number. You could use the SAX or DOM approach to access an XML document and then display the data. For example, suppose you took the SAX approach. In that case, you would need to:

Write a program that creates a SAX parser and then uses that parser to parse the XML document. The SAX parser starts at the beginning of the document. When it encounters something significant (in SAX terms, an "event") such as the start of an XML tag, or the text inside of a tag, it makes that data available to the calling application.

Create a content handler that defines the methods to be notified by the parser when it encounters an event. These methods, known as callback methods, take the appropriate action on the data they receive.

As an example, the sample program for this approach uses JAXP to create and use a SAX parser to parse an XML document, and uses a content handler to display the data passed to it by the SAX parser.

Now let's look at how you use JAXB to access the same kind of XML document and display its data. Using JAXB, you would:

Bind the schema for the XML document.

Unmarshal the document into Java content objects. The Java content objects represent the content and organization of the XML document, and are directly available to your program.

After unmarshalling, your program can access and display the data in the XML document simply by accessing the data in the Java content objects and then displaying it. There is no need to create and use a parser and no need to write a content handler with callback methods. What this means is that developers can access and process XML data without having to know XML or XML processing.

Bind the Schema

JAXB simplifies access to an XML document from a Java program by presenting the XML document to the program in a Java format. The first step in this process is to bind the schema for the XML document into a set of Java classes that represents the schema.

Schema: A schema is an XML specification that governs the allowable components of an XML document and the relationships between the components. For example, a schema identifies the elements that can appear in an XML document, in what order they must appear, what attributes they can have, and which elements are subordinate (that is, are child elements) to other elements. An XML document does not have to have a schema, but if it does, it must conform to that schema to be a valid XML document. JAXB requires that the XML document you want to access has a schema, and that schema is written in the W3C XML Schema Language (see the box "Why W3C XML Schema Language?").
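As an illustration of what conforming to a schema means, the sketch below checks two small documents against an inline W3C XML Schema using the JDK's `javax.xml.validation` API (a JAXP facility, not part of JAXB). The schema and its `book`, `name`, and `ISBN` elements are hypothetical stand-ins, not the article's books.xsd.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaCheckSketch {

    // Hypothetical schema: a <book> must contain <name> followed by a numeric <ISBN>.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='book'><xs:complexType><xs:sequence>"
        + "<xs:element name='name' type='xs:string'/>"
        + "<xs:element name='ISBN' type='xs:long'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    static Schema loadSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("The schema itself is malformed", e);
        }
    }

    /** Returns true if the XML text is well formed and conforms to the schema. */
    static boolean conforms(String xml) {
        try {
            loadSchema().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the default error handling rejects the first violation
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Elements in the required order: conforms.
        System.out.println(conforms("<book><name>Book A</name><ISBN>123</ISBN></book>"));
        // Same elements in the wrong order: does not conform.
        System.out.println(conforms("<book><ISBN>123</ISBN><name>Book A</name></book>"));
    }
}
```

The second document is well-formed XML but is not valid against this schema, because the schema fixes the order of the child elements.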

Assume, for this example, that the document has a schema, books.xsd, that is written in the W3C XML Schema Language. This schema defines an element that has a complex type, which means that it has child elements. Each of those child elements also has a complex type, in this case a named one, and has child elements of its own. Some of these have their own child elements.

Why W3C XML Schema Language?

The W3C XML Schema Language is not the only schema language. In fact, the XML specification describes document-type definitions (DTDs) as the way to express a schema. In addition, pre-release versions of the JAXB Reference Implementation worked only with DTDs -- that is, not with schemas written in the XML Schema Language. However, the XML Schema Language is much richer than DTDs. For example, schemas written in the XML Schema Language can describe structural relationships and data types that can't be expressed (or can't easily be expressed) in DTDs. There are tools available to convert DTDs to the W3C XML Schema Language, so if you have DTD-based schemas that you used with an earlier version of the JAXB Reference Implementation, you can use these tools to convert the schemas to XML Schema Language.

Binding: Binding a schema means generating a set of Java classes that represents the schema. All JAXB implementations provide a tool called a binding compiler to bind a schema (the way the binding compiler is invoked can be implementation-specific). For example, the JAXB Reference Implementation provides a binding compiler that you can invoke through scripts. Suppose, for example, you want to bind the schema using the binding compiler provided by the JAXB Reference Implementation. Suppose too that you're working in the Solaris Operating Environment. Here's a command you can use to run the script that binds the schema:

One option identifies a package for the generated classes, and another option identifies a target directory. So for this command, the classes are packaged in the specified package within the specified directory. In response, the binding compiler generates a set of interfaces and a set of classes that implement the interfaces. The interfaces it generates for the schema are:

An interface that represents the unnamed complex type of the schema's element that has one.
An interface that represents that element.
An interface that represents the named complex type.
A factory that contains methods for generating instances of the interfaces.

The binding compiler also generates the classes that implement the interfaces (these are generated in a subdirectory), one class for each of the three interfaces described above. Note that these classes are implementation-specific -- in this example, they are specific to the Reference Implementation. Because the classes are implementation-specific, classes generated by the binding compiler in one JAXB implementation will probably not work with another JAXB implementation. So if you change to another JAXB implementation, you should rebind the schema with the binding compiler provided by that implementation.

In total, the generated classes represent the entire schema. Notice that the classes define get and set methods that are used to respectively obtain and specify data for each type of element and attribute in the schema. You then compile the generated interfaces and classes. For example:

This compiles all of the interfaces and classes in the package generated by the binding compiler.
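To show the general shape of such code, here is a small hand-written sketch, not output of any binding compiler. The names `Book`, `BookCollection`, `BookImpl`, `BookCollectionImpl`, and `ObjectFactory`, and the `name` and `author` properties, are assumptions chosen for illustration. It also uses generics for readability, which the Java platform of the article's era did not have.

```java
import java.util.ArrayList;
import java.util.List;

class BindingShapeSketch {

    // Interfaces: what application code is written against.
    interface Book {
        String getName();
        void setName(String name);
        String getAuthor();
        void setAuthor(String author);
    }

    interface BookCollection {
        List<Book> getBooks();
    }

    // Implementation classes: with JAXB these would be generated and vendor-specific.
    static class BookImpl implements Book {
        private String name;
        private String author;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public String getAuthor() { return author; }
        public void setAuthor(String author) { this.author = author; }
    }

    static class BookCollectionImpl implements BookCollection {
        private final List<Book> books = new ArrayList<>();
        public List<Book> getBooks() { return books; }
    }

    // Factory: callers obtain instances without naming the implementation classes.
    static class ObjectFactory {
        Book createBook() { return new BookImpl(); }
        BookCollection createBookCollection() { return new BookCollectionImpl(); }
    }

    /** Builds a one-book content tree using only the factory and the set methods. */
    static BookCollection sampleTree() {
        ObjectFactory factory = new ObjectFactory();
        BookCollection collection = factory.createBookCollection();
        Book book = factory.createBook();
        book.setName("Book A");
        book.setAuthor("Author A");
        collection.getBooks().add(book);
        return collection;
    }

    public static void main(String[] args) {
        Book first = sampleTree().getBooks().get(0);
        System.out.println(first.getName() + " by " + first.getAuthor()); // prints Book A by Author A
    }
}
```

Keeping application code on the interface side of this split is what allows the implementation classes to differ from one JAXB implementation to another.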

Unmarshal the Document

Unmarshalling an XML document means creating a tree of content objects that represents the content and organization of the document. The content tree is not a DOM-based tree. In fact, content trees produced through JAXB can be more efficient in terms of memory use than DOM-based trees. The content objects are instances of the classes produced by the binding compiler.

In addition to providing a binding compiler, a JAXB implementation must provide runtime APIs for JAXB-related operations such as marshalling. The APIs are provided as part of a binding framework. The binding framework comprises three packages. The primary package contains classes and interfaces for performing operations such as unmarshalling, marshalling, and validation (marshalling and validation will be covered later). A second package contains a number of utility classes. The third package is designed for JAXB implementation providers.

To unmarshal an XML document, you:

Create a context object. This object provides the entry point to the JAXB API. When you create the object, you need to specify a context path. This is a list of one or more package names that contain interfaces generated by the binding compiler. By allowing multiple package names in the context path, JAXB allows you to unmarshal a combination of XML data elements that correspond to different schemas. For example, the following code snippet creates a context object whose context path is the package that contains the interfaces generated for the schema:

Create an unmarshaller object. This object controls the process of unmarshalling. In particular, it contains methods that perform the actual unmarshalling operation. For example, the following code snippet creates an unmarshaller object:

Call the unmarshalling method. This method does the actual unmarshalling of the XML document. For example, the following statement unmarshals the XML data in the file:

Note that a

here is a

, not a

Use the get methods in the schema-derived classes to access the XML data. Recall that the classes that a JAXB compiler generates for a schema include get and set methods you can use to respectively obtain and specify data for each type of element and attribute in the schema. For example, the following statement gets the data in two of the elements:

After obtaining the data, you can display it directly from your program. Here, for example, is a program that unmarshals the data in the XML file and then displays the data. If you run the program, you should see the following result:

Validating the Source Data: Notice that the program includes the following statement:

This statement highlights an important feature of JAXB: you can have it validate the source data against the associated schema as part of the unmarshalling operation. In this case, the statement asks JAXB to validate the source data against its schema. If the data is found to be invalid (that is, it doesn't conform to the schema) the JAXB implementation can report it and might take further action. JAXB providers have a lot of flexibility here. The JAXB specification mandates that all provider implementations report validation errors when the errors are encountered, but the implementation does not have to stop processing the data. Some provider implementations might stop processing when the first error is found; others might continue even if many errors are found. In other words, it is possible for a JAXB implementation to successfully unmarshal an invalid XML document, and build a Java content tree. However, the result won't be valid. The main requirement is that all JAXB implementations must be able to unmarshal valid documents. You also have the flexibility of turning the validation switch off if you don't want to incur the additional validation processing overhead.

Unmarshalling Other Sources: Although the example described in this section shows how to unmarshal XML data in a file, you can unmarshal XML data from other input sources such as an input stream, a URL, or a DOM node. You can even unmarshal transformed XML data. For example, you can unmarshal a transformed source object. You can also unmarshal SAX events -- in other words, you can do a SAX parse of a document and then pass the events to JAXB for unmarshalling.

An Alternative: Accessing Data without Unmarshalling: JAXB also allows you to access XML data without having to unmarshal it. One of the classes generated from a schema, a factory class, contains methods to generate objects for each of the schema-derived interfaces and classes. For example, the package generated for the schema includes a factory class that has a create method for each kind of content object. You can use these methods to create a tree of content objects without doing any unmarshalling. All your program needs is access to the factory class that's in the package for the pertinent schema. Then you can use the appropriate create methods in the class to create the objects you need. After you create the objects, you need to provide their content. To do that, you use the set methods in the objects.

Another Example: Building an XML Document


Instead of accessing data in an XML document, suppose you need to build an XML document through a Java application. Here too using JAXB is easier. Let's investigate.

You could use the DOM approach to build an XML document, but not SAX. That's because you would need to build and populate the content of the document in memory -- recall that SAX does not allow you to perform any in-memory manipulation of data.

Using the DOM approach, your program needs to create and use DOM objects and methods to build the document. DOM is designed to represent the content and organization of data in a document as a tree of objects. To build the document, your program uses DOM to create a document object that represents the document. Your program then uses object methods to create other objects that represent the nodes of the tree. Each node contains content for the XML document. You then append the nodes in an order that reflects the organization of the tree. In other words, your program uses DOM object methods to create a root node, and append the root node to the document object. Then it creates child nodes and appends them to the root node. If a child node has children of its own, your program uses DOM object methods to create those nodes and append them to their parent node.

Unlike the SAX approach, there is no need in DOM to write a content handler and callback methods. However, the DOM approach requires you to understand the organization of the document tree. In fact, if you use DOM to access data, you create a parser that builds a tree, and then you use DOM methods to navigate to the appropriate object in the tree that contains the data you need. So an understanding of the tree's organization is a requirement. Compare this to JAXB, where you have direct access to unmarshalled XML data through objects in the content tree. As in DOM-based processing, JAXB allows access to data in non-sequential order, but it doesn't force an application to navigate through a tree to access the data.

In addition, with all the creating and appending of objects that represent the nodes of the tree, the DOM approach can be tedious. Here, for example, is a program that uses DOM to build and populate a document, and then write the document to an XML file. Notice that the type of data that gets populated into the document is similar to the data in the XML file that was used in the first example, Accessing an XML Document. In fact, the program validates the document it builds against the books.xsd schema that was used in the first example.

Now let's look at how you use JAXB to build the same document, validate it against the schema, and write the document to an XML file. Using JAXB, you would:

Bind the schema for the XML document (if it isn't already bound).
Create the content tree.
Marshal the content tree into the XML document.

In this process, you don't deal with the intricacies of the DOM object model or even need to know XML.
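For comparison, the DOM steps described above (create a document object, create nodes, append them in tree order, then write the result) might be sketched as follows with the JAXP classes in the JDK. The element names `collection`, `book`, `name`, and `author` and the class name `DomBuildSketch` are assumptions for illustration; this is not the article's sample program and it does no schema validation.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomBuildSketch {

    /** Builds a one-book document node by node and returns it as XML text. */
    static String buildCollection(String name, String author) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

            Element root = doc.createElement("collection"); // root node
            doc.appendChild(root);

            Element book = doc.createElement("book");        // child of the root
            root.appendChild(book);

            book.appendChild(textElement(doc, "name", name)); // grandchildren
            book.appendChild(textElement(doc, "author", author));

            return serialize(doc);
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build the document", e);
        }
    }

    /** Creates an element that contains a single text node. */
    static Element textElement(Document doc, String tag, String text) {
        Element element = doc.createElement(tag);
        element.appendChild(doc.createTextNode(text));
        return element;
    }

    /** Writes the DOM tree out as a string using an identity transformation. */
    static String serialize(Document doc) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildCollection("Book A", "Author A"));
    }
}
```

Even for a single book, the code has to mirror the tree structure by hand, which is the tedium the article refers to.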

Bind the Schema


This is the same operation you perform prior to unmarshalling a document. In this case, the schema is for the XML document you want to build. Of course, if you've already bound the schema (for instance, you unmarshalled an XML document, updated the data, and now want to write the updated data back to the XML document), you don't have to bind the schema again.

Create the Content Tree


The content tree represents the content that you want to build into the XML document. You can create the content tree by unmarshalling XML data, or you can create it using the factory class that's generated by binding the appropriate schema. Let's use the factory approach. First, create an instance of the factory class:

Next, use create methods in the factory object to create each of the objects in the content tree. For example:

Then use set methods in the created objects to specify data values. For example:

Marshal the Content Tree

Marshalling is the opposite of unmarshalling. It creates an XML document from a content tree. To marshal a content tree, you:

Create a context object, and specify the appropriate context path -- that is, the package that contains the classes and interfaces for the bound schema. As is the case for unmarshalling, you can specify multiple package names in the context path. That gives you a way of building an XML document using a combination of XML data elements that correspond to different schemas.

Create a marshaller object. This object controls the process of marshalling. In particular, it contains methods that perform the actual marshalling operation.

The marshaller object has properties that you can set. For example, you can specify the output encoding to be used when marshalling the XML data. Or you can tell the marshaller to format the resulting XML data with line breaks and indentation. The following statement turns this output format property on -- line breaks and indentation will appear in the output format:

Call the marshalling method. This method does the actual marshalling of the content tree. When you call the method, you specify an object that contains the root of the content tree, and the output target. For example, the following statement marshals the content tree and writes it as an output stream to an XML file:

Here, for example, is a program that creates a content tree, fills it with data, and then marshals the content tree to an XML file.

Validating the Content Tree: Notice that validation is not performed as part of the marshalling operation. In other words, unlike the case for unmarshalling, there is no validation switch for marshalling. Instead, when marshalling data, you use a validator class that is a part of the binding framework to validate a content tree against a schema. For example:

Validating the data as a separate operation from marshalling gives you a lot of flexibility. For example, you can do the validating at one point in time, and do the marshalling at another time. Or you can do some additional processing in between the two operations. Note that the JAXB specification doesn't require a content tree to be valid before it's marshalled. That doesn't necessarily mean that a JAXB implementation will allow invalid data to be marshalled -- it might marshal part or all of the invalid data, or not. But all JAXB implementations must be able to marshal valid data.

Marshalling to Other Targets: Although the example described in this section shows how to marshal data to an XML file, you can marshal to other output targets such as an output stream object or a DOM node. You can also marshal to a transformed data format. You can even marshal to a content handler. This allows you to process the data as SAX events.

A Final Example: Updating an XML Document


Here's a final example, one that logically combines elements of accessing an XML document and building an XML document. Suppose you need to update an XML document. In the DOM approach, you would create and use a DOM parser to navigate to the appropriate object in the tree that contains the data you need, update the data, and then write the updated data to an XML file. Here, for example, is a program that uses DOM to update an XML document. As discussed in Building an XML Document, the DOM approach is relatively tedious and forces you to know the organization of the content tree.

Here, by comparison, is a JAXB program that updates an XML document. Specifically, it updates an unmarshalled content tree and then marshals it back to an XML document. Notice how JAXB simplifies the process. The program has direct access to the object it needs to update. The program uses a get method to access the data it needs, and a set method to update the data.

Although it's tempting to think that the XML data can make a "roundtrip" unchanged, there's no guarantee of that. In other words, if you use JAXB to unmarshal an XML document and then marshal it back to the same XML file, there's no guarantee that the XML document will look exactly the same as it did originally. For example, the indentation of the resulting XML document might be a bit different than the original. The JAXB specification does not require the preservation of the XML information set in a roundtrip from XML document-to-Java representation-to XML document. But it also doesn't forbid the preserving of it.
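The DOM side of this comparison can be sketched with the JAXP classes in the JDK: parse into a tree, navigate to the node, change it, and serialize the tree again. The `price` element, the sample document, and the `DomUpdateSketch` class are hypothetical; the article does not say which data its sample program updates.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomUpdateSketch {

    /** Replaces the text of the first element with the given tag and returns the new XML. */
    static String updateFirst(String xml, String tag, String newValue) {
        try {
            // 1. Parse: the DOM parser builds the whole tree in memory.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // 2. Navigate: the caller has to know where the data lives in the tree.
            NodeList matches = doc.getElementsByTagName(tag);
            if (matches.getLength() == 0) {
                throw new IllegalArgumentException("No <" + tag + "> element found");
            }

            // 3. Update the node in place.
            matches.item(0).setTextContent(newValue);

            // 4. Write the updated tree back out as XML text.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException | TransformerException e) {
            throw new IllegalStateException("Could not update the document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<collection><book><name>Book A</name><price>10</price></book></collection>";
        System.out.println(updateFirst(xml, "price", "12"));
    }
}
```

With schema-derived classes, steps 2 and 3 would reduce to a get call followed by a set call on a typed object, which is the simplification the article describes.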

Binding Can Be Customized

The JAXB specification describes the default behavior for binding a subset of XML schema components to Java components. The specification identifies which XML schema components must be bound and to what Java representations these components are bound. For example, each XML built-in datatype that the specification covers must be bound to a particular Java data type. All JAXB compiler implementations must implement the default binding specifications.

However, there are times when the default behavior might not be what you want. For example, suppose you want an XML data type mapped to a Java data type that is different than the type called for by the default binding specification. Or you want the binding compiler to assign a name of your choice to a class that it generates. To meet these and other customization needs, JAXB allows you to annotate a schema with binding declarations that override or extend the default binding behavior. JAXB allows these declarations to be made "inline" -- that is, in the schema, or in a separate document.

Let's look at a customization example. Here is an annotated version of the schema that was used in the previous examples. The annotations in this example are inline. Notice the annotation element near the top of the schema:

All binding declarations are in an annotation element and a subordinate element within it. In fact, all inline binding declarations must be made this way.

This block of code demonstrates a number of customizations that you can make to a schema:

Make global customizations. One element in the block specifies binding declarations that have global scope. In JAXB, binding declarations can be specified at different levels, or "scopes." Each scope inherits from the scopes above it, and binding declarations in a scope override binding declarations in scopes above it. Global scope is at the top of the scope hierarchy. It covers all the schema elements in the source schema and (recursively) any schemas that are included or imported by the source schema. Global scope is followed in the hierarchy by Schema scope (covers all the schema elements in the target namespace of a schema), Definition scope (covers all schema elements that reference a specified type definition or a global declaration), and Component scope (applies only to a specific schema element that was annotated with a binding declaration). Notice that the namespace prefix for the element is bound to http://java.sun.com/xml/ns/jaxb. This URI contains the core schema for binding declarations.

Add method signatures. One declaration tells the binding compiler to generate additional methods for the properties of all generated classes. These methods are used to determine if a property in a class is set or has a default value.

Change binding style. By default, schema components that have complex types and that have a content type property of mixed or element-only are bound with a style called element binding. In element binding, each element in the complex type is mapped to a unique content property. Alternatively, you can change the binding style to model group binding. In model group binding, schema components that have complex type and that are nested in the schema are mapped to Java interfaces. This gives users a way to specifically customize these nested components. For example, the following component is nested in the customized schema:

As a result of the global declarations made earlier, the binding compiler will generate the following methods for the affected elements:

Include vendor-specific extensions. One declaration is an extension binding declaration. Its prefix binds to a namespace for extension binding declarations. These declarations are vendor-specific extensions to the binding declarations defined in http://java.sun.com/xml/ns/jaxb. Here, the vendor-specific declaration covers the binding of serializable classes. The serial version uid 12343 will be assigned to each generated class.

Customize the binding of a simple data type. One declaration binds an XML datatype to a Java data type of your choice. This overrides the default binding behavior, which is to bind that XML datatype to a Java primitive data type. An additional declaration tells the binding compiler to use a parse method supplied with JAXB to convert a lexical representation of the XML data type into the Java data type. The parse method is invoked by the JAXB provider's implementation during unmarshalling. A further declaration tells the binding compiler to use a print method supplied with JAXB to convert the Java data type into a lexical representation of the XML data type. The print method is invoked by the JAXB provider's implementation during marshalling.

Additional customizations. Other annotations in the schema illustrate additional types of customization, such as binding a specific schema element to a Java Content Interface or Java Element Interface. This is done through a binding declaration on that element. In the annotated schema example, one such binding declaration is used to specify the name for a generated interface. Another binding declaration in the annotated schema example binds an element to its Java representation as a typesafe enumeration class. Although not illustrated in the annotated schema, another type of customization you can make is to specify javadoc for a generated package or class.

These are only some of the many binding customizations that JAXB allows. You can see the impact of the binding declarations by binding the annotated schema. When you do the binding, specify the option that enables vendor extensions, as in the following command:

This option allows you to use vendor-provided extensions. You need it to enable the extension binding declaration in the schema. If you don't specify the option, the binding compiler will run in "strict" mode. In this mode, it allows only for default bindings, and will produce an error message when it comes to the extension binding declaration. After you run the binding compiler, examine the interfaces and classes that it generates, and compare them to the interfaces and classes generated from the uncustomized schema. For example, here is the file generated for the unnamed complex type. Notice the additional methods that have been added because of the binding customizations.

Distinct Advantages
Let's reiterate a number of important advantages of using JAXB:

JAXB simplifies access to an XML document from a Java program:
    JAXB allows you to access and process XML data without having to know XML or XML processing. Unlike SAX-based processing, there's no need to create a SAX parser or write callback methods.
    JAXB allows you to access data in non-sequential order, but unlike DOM-based processing, it doesn't force you to navigate through a tree to access the data.
    By unmarshalling XML data through JAXB, Java content objects that represent the content and organization of the data are directly available to your program.

JAXB uses memory efficiently:
    The tree of content objects produced through JAXB can be more efficient in terms of memory use than DOM-based trees.

JAXB is flexible:
    You can unmarshal XML data from a variety of input sources, including a file, an input stream object, a URL, a DOM node, or a transformed source object.
    You can marshal a content tree to a variety of output targets, including an XML file, an output stream object, a DOM node, or a transformed data object.
    You can unmarshal SAX events -- for example, you can do a SAX parse of a document and then pass the events to JAXB for unmarshalling.
    JAXB allows you to access XML data without having to unmarshal it. Once a schema is bound you can use the factory's create methods to create the objects and then use set methods in the generated objects to create content.
    You can validate source data against an associated schema as part of the unmarshalling operation, but you can turn validation off if you don't want to incur the additional validation overhead.
    You can validate a content tree, using the validator class, separately from marshalling. For example, you can do the validating at one point in time, and do the marshalling at another time.
    JAXB's binding behavior can be customized in a variety of ways.

Java developers should find JAXB a welcome aid in developing Web services and other Java-XML applications.

Run the Examples


If you'd like to run the examples in this article with Java Web Services Developer Pack V 1.1, you need to:

1. Install Java Web Services Developer Pack V 1.1 (if you haven't already done so).

2. Set the appropriate environment variable to the Java Web Services Developer Pack V 1.1 installation directory. For example, if you're using the C shell in the Solaris Operating Environment, enter the command:

Replace install_dir with the Java Web Services Developer Pack V 1.1 installation directory.

3. Set the class paths for JAXB, JAXP, and Java Web Services Developer Pack V 1.1. For example, if you're using the C shell in the Solaris Operating Environment, enter the commands:

and on one line:

4. Set the path for JAXB. For example, if you're using the C shell in the Solaris Operating Environment, enter the command:

For More Information


Java Architecture for XML Binding (JAXB)
Java Web Services Developer Pack V 1.1
W3C XML Schema Language

About the Authors


Ed Ort is a java.sun.com staff member. He has written extensively about Java technology and Web services.

Bhakti Mehta is a Member of Technical Staff at Sun Microsystems Inc. She is in the Web Technologies and Standards Interoperability and Quality team, and has worked with JAXP, JAXB, JAXR and JAXM.
