XML Processing on z/OS
Overview of XML generation and parsing technologies available on z/OS
Mike Ebbers
Mogens Conrad
Hans-Dieter Mertiens
Nagesh Subrahmanyam
Michael Todd
ibm.com/redbooks
International Technical Support Organization
December 2009
SG24-7810-00
Note: Before using this information and the product it supports, read the information in “Notices” on
page xv.
Figures
Tables
Examples
Notices
Trademarks
Preface
The team who wrote this book
Become a published author
Comments welcome
10.1 Where and when to validate
10.2 XPLINK
10.2.1 XML Toolkit for z/OS
10.2.2 z/OS XML System Services
10.2.3 COBOL and PL/I built-in parsers
10.3 zIIPs and zAAPs
10.3.1 zAAP
10.3.2 zIIP
10.3.3 zAAP on zIIP
10.4 Encoding
10.4.1 XML Toolkit for z/OS
10.4.2 z/OS XML System Services
10.4.3 COBOL and PL/I built-in parsers
10.5 Application language
10.5.1 COBOL
10.5.2 PL/I
10.5.3 C and C++
10.5.4 Assembler
10.5.5 Java
Glossary
Index
Figures
1-1 XML from Example 1-1 on page 2 transformed to HTML using XSLT stylesheet
2-1 XML Toolkit for z/OS architecture
2-2 Parsing architecture for Enterprise COBOL
2-3 PL/I and z/OS XML System Services
2-4 CICS Transaction Server XML architecture
2-5 IMS DB XML
2-6 DB2 with pureXML
3-1 CICS TS and XML/SOAP
3-2 CICS transform statement
3-3 IMS Web Services connectivity solutions
3-4 IMS and Web Services
3-5 DB2 as a Web services provider
3-6 DB2 as a Web services consumer
3-7 WebSphere Application Server and SOAP
3-8 Enterprise Service Bus
3-9 DB2 as an XML warehouse
3-10 IMS as a middle tier or warehouse
3-11 IMS as a Java/XML application server
3-12 MQ messaging triggered listener applications
4-1 CICS as a service requester
5-1 HTML result from XSLT transformation
5-2 CICS as Web services requester
5-3 CICS Web Services Assistant
6-1 z/OS XML System Services parsed data stream
6-2 Buffers usages and flow in XML System Services
6-3 Flowchart for invoking z/OS XML System Services
6-4 One way of invoking z/OS XML System Services parser in case of x’1302’ reason code
6-5 String ID exit processing
6-6 Flow of parsing with existing parsers (top) and z/OS-specific parsers (bottom)
6-7 Validating parse with z/OS-specific classes
7-1 COBOL and z/OS XML System Services combinations
7-2 PL/I and z/OS XML System Services combinations
7-3 Invoking the XML toolkit from within COBOL-PL/I, single document input
7-4 Invoking the XML Toolkit from within COBOL-PL/I for multiple input documents
7-5 Invoking the XML toolkit from within COBOL-PL/I for a single output document
8-1 Simple non-validating parsing using z/OS XML System Services
8-2 Non-validating parsing with z/OS XML System Services using String IDs
8-3 Non-validating z/OS XML System Services parsing with a predefined set of String ID values
8-4 Two-step validating parse
8-5 One-step validating parse
8-6 Validating parse with predefined String ID assignments
8-7 Validating parse with predefined String ID assignments returning String IDs
8-8 Validating parse returning predefined String IDs and the OSR’s String ID table
8-9 SAX parsing main program overview
8-10 Overview of document handler in SAX parsing
8-11 DOM parsing
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not give you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of
express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any
manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the
materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring
any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs.
The following terms are trademarks of the International Business Machines Corporation in the United States,
other countries, or both:
CICS® OS/390® System z9®
DB2® OS/400® System z®
DRDA® pureXML® WebSphere®
eServer™ Rational® z/OS®
i5/OS® Redbooks® z9®
IBM® Redpaper™ zSeries®
IMS™ Redbooks (logo) ®
Language Environment® System z10™
Interchange, and the Shadowman logo are trademarks or registered trademarks of Red Hat, Inc. in the U.S.
and other countries.
Java, and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other
countries, or both.
Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other
countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.
This IBM® Redbooks publication presents a broad perspective of the XML processing
capabilities of z/OS®. It begins with a high-level view of the IBM products currently
implementing XML-specific features. It covers common design patterns and the products that
use them. It provides an overview and in-depth coverage of the two primary XML activities:
Generating valid XML
Parsing XML
The authors have included examples of simple and complex procedures, all of which have
been tested. They have included cautions and alternatives for common issues and pitfalls.
This book is helpful to anyone trying to learn about the various IBM products that provide
XML-oriented services and how they fit into existing applications. It is also valuable to
developers needing to gauge the pros and cons of the ways of generating and consuming
XML. It provides working examples to those needing a fast path to coding XML applications.
Mike Ebbers is a Consulting IT Specialist and an ITSO project leader. He has worked for IBM
since 1974, mostly on mainframe systems, and has been with the ITSO since 1994. He
produces Redbooks and educational materials on a wide variety of topics.
Mogens Conrad is an IT Architect working for IBM Denmark. He has more than 30 years of
experience working with different types of mainframes, primarily for the financial industry. His
areas of expertise include CICS®, DB2®, MQ, Business Continuity, and Infrastructure. He is
a member of the CICS Architectural Forum (CAF).
Michael Todd is an Application Architect with DST Systems, Inc. in Kansas City, Missouri,
U.S.A. He has 34 years of experience in a variety of computing related fields including 19
years with IBM mainframes. His areas of expertise include financial applications and real-time
process control systems.
Figure 1 The XML team: Mike Todd, Mogens Conrad, Hans-Dieter Mertiens, Nagesh Subrahmanyam, Mike Ebbers
Special thanks to the following people for their assistance and reviews:
Joseph Bostian
Matthew Cousens
Chris Larsson
Stephen Dulin
Bill Carey
IBM Systems & Technology Group, System z Software, Poughkeepsie NY
David Cargill
Susan Malaika
Jane Man
Gary Mazo
Tom Ross
Christian Strauer
Bob Haimowitz
International Technical Support Organization, Poughkeepsie Center
Susann Thomas
IBM Application Integration & Middleware Solutions Specialist, Germany
Peter Elderon
IBM Software Group, Rational® PL/I Compilers and Architecture Software Architect
Your efforts will help increase product acceptance and customer satisfaction. As a bonus, you
will develop a network of contacts in IBM development labs, and increase your productivity
and marketability.
Find out more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
We want our books to be as helpful as possible. Send us your comments about this book or
other IBM Redbooks® publications in one of the following ways:
Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
Send your comments in an e-mail to:
redbooks@us.ibm.com
Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
1
XML became a W3C Recommendation in 1998 and is now widely used. It is one of the most
flexible ways to automate Web transactions. XML is derived as a subset of Standard
Generalized Markup Language (SGML) and is designed to be simple, concise, human readable, and relatively
easy to use in programs on different platforms. For more information about the XML standard,
see the following Web page:
http://www.w3.org/XML
As with other markup languages, XML is built using tags. Basic XML consists of start tags,
end tags, and a data value between the two. In XML you create your own tags, with a few
restrictions. Example 1-1 shows a simple XML document.
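The relationship between start tags, end tags, and data values can be sketched with a few lines of Python and its standard library parser. This is only an illustration: the document and its tag names (book, title, author) are invented here and are not the book's Example 1-1.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tag names are invented for illustration.
DOC = """<book>
  <title>XML on the mainframe</title>
  <author>A. Writer</author>
</book>"""


def child_values(xml_text):
    """Return {tag: data value} for each child of the root element."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


values = child_values(DOC)
print(values)
```

Each child element contributes one entry: the tag name comes from the start and end tags, and the value is the text between them.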
While the XML syntax is simple, it is difficult to parse and transform an XML document into a
form that is usable by programming languages. Therefore, it is essential to have access to
efficient parsing and transformation tools.
An XML document can be accompanied by a document type definition or a schema definition.
These are used to specify the allowable grammar (the permitted structure and content) for an
XML document. We discuss these next.
1.1.1 DTD
A document type definition (DTD) specifies the kinds of tags that can be included in your
XML document, the valid arrangements of those tags, and the structure of the XML
document. The DTD defines the type of elements, attributes, and entities allowed in the
documents, and can also specify some limitations to their arrangement. You use a DTD to
ensure you do not create an invalid XML structure. The DTD defines how elements relate to
one another within the document’s tree structure. You can also use it to define which
attributes can be used to define an element and which are not allowed. In other words, a DTD
defines your own language for a specific application.
The DTD can be stored in a separate file or embedded within the same XML file. If it is stored
in a separate file, it might be shared with other documents.
XML documents referencing a DTD will contain a <!DOCTYPE> declaration, which either
contains the entire DTD declaration for an internal DTD, or specifies the location of an
external DTD.
In Example 1-4 we have the same XML document, but the DTD declaration is included in the
document.
It is difficult to give a general outline of the elements of a schema, due to the large number of
elements that can be used. The purpose of the W3C XML Schema Definition Language is to
provide an inventory of XML markup constructs with which to write schemas. Example 1-5 is
a simple document that describes the information about a book. Throughout this book,
references to schemas are to W3C definitions of a schema.
Because the XML Schema is a language, there are several options to build a possible
schema that covers the XML document. Example 1-6 is a simple and feasible design.
It is clear that Example 1-6 on page 4 is an XML document because it begins with an XML
document declaration. The schema element opens our schema that contains the definition of
the schema namespace. Then we define an element named “book.” This is the root element
in the XML document. We decided it is a complex type because it has attributes and non-text
children. We begin to declare the children elements of the root element book. W3C XML
Schema lets us define the type of data, as well as the number of possible occurrences of an
element. For more information about possible values for these types, refer to the specification
documents from W3C. The schema URLs are:
http://www.w3.org/TR/xmlschema-1/
http://www.w3.org/TR/xmlschema-2/
W3C XML Schema allows us to define data types and use these types to define our attributes
and elements. It also allows the definition of groups of elements and attributes. In addition,
there are several ways to arrange relationships between elements.
Documentation for XML schemas can be defined by the xsd:documentation element, and
processing instructions for applications can be included with the xsd:appinfo element. More
details are available on the Web at the following Web page:
http://www.w3.org/TR/NOTE-xml-schema-req
1.2.2 xsd
xsd is used both as an acronym for an XML schema definition (also called xsd schema or
simply schema) and as the file extension for files containing XML schema definitions.
Clearly there is a problem with the element <title>. It appears here in two different contexts.
This situation complicates things for processors and might cause ambiguities. We need a
mechanism to distinguish between the two and apply the correct semantic description to
each tag. The cause of this problem is that this document uses only one common
namespace.
The solution to this problem is namespaces. Namespaces are a simple and straightforward
way to distinguish names used in XML documents. By providing the related namespace when
an element is being validated, the problem is solved.
As you can see in Example 1-8, the <title> tag is used twice, but in different contexts: within
the <author> element and within the <book> element. Note the use of the xmlns keyword in
the namespace declaration.
After a prefix is defined on an element it can then be used by all descendants of that element.
In Example 1-8, we specified the relevant namespace prefix before each element to illustrate
the relationship of each element to a given namespace. However, if the prefix is not specified,
the element will be in the default namespace if one has been specified or in no namespace if
Example 1-9 is similar to Example 1-8 on page 6, but it uses default namespaces to produce
the same namespace relationships.
For more information about namespaces, refer to the following Web page:
http://www.w3.org/TR/REC-xml-names
The namespace is defined using a Namespace declaration. In Example 1-10 you see the
namespace declarations from the previous example.
A namespace is declared using the attribute xmlns, followed by the namespace prefix (here
authr and bk). The namespace declaration is given a value to further identify the declaration.
The namespace value must be an IRI or URI identifier, but the value is only used as a unique
string. It can point to existing files or Web sites but will not reference those files.
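The following Python sketch illustrates how prefixes and a default namespace can yield the same element names. The namespace values and the document content are hypothetical stand-ins, not the book's Examples 1-8 to 1-10; only the prefixes bk and authr are taken from the text. The parser reports each element as {namespace}local-name, which makes the two <title> elements distinguishable.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace values: they serve only as unique strings.
BK = "http://example.com/ns/book"
AUTHR = "http://example.com/ns/author"

# Every element carries an explicit prefix.
PREFIXED = f"""<bk:book xmlns:bk="{BK}" xmlns:authr="{AUTHR}">
  <bk:title>Book title</bk:title>
  <authr:author><authr:title>Dr.</authr:title></authr:author>
</bk:book>"""

# Same structure, but the book namespace is declared as the default.
DEFAULTED = f"""<book xmlns="{BK}" xmlns:authr="{AUTHR}">
  <title>Book title</title>
  <authr:author><authr:title>Dr.</authr:title></authr:author>
</book>"""


def qualified_names(xml_text):
    """List every element name in {namespace}local form, in document order."""
    return [elem.tag for elem in ET.fromstring(xml_text).iter()]


names = qualified_names(PREFIXED)
print(names)
print(qualified_names(DEFAULTED) == names)
```

The namespace value is never dereferenced by the parser; it is compared as a plain string.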
1.4 Encoding
Document encoding refers to the character scheme used to create document content. The
XML standard requires that, by default, XML documents be created using one of three
Unicode character sets. However, you can use any character set you like, so long as the XML
declaration at the beginning of the document declares the character set used. This is done by
providing an encoding= declaration. It is essential that the encoding declaration provide a
name that sending and receiving programs understand and on whose meaning they agree.
Because Unicode is the default encoding, many parsers handle Unicode XML documents.
However, if you are certain all documents delivered to an application will be in EBCDIC-only
or ASCII-only, using an EBCDIC-only or ASCII-only parser is acceptable. Keep in mind that
there are many EBCDIC character sets, not just one. There are many ASCII character sets as
well. Consequently, it is of paramount importance that parsers be able to discern which
character set encoding was used to create the document.
Unicode documents can (but are not required to) include an encoding declaration such as in
Example 1-11.
To enable the parser to recognize the encoding of a non-Unicode document, the XML
declaration at the start of the document must include an encoding= definition. Example 1-12
shows the encoding declaration for some common EBCDIC character sets.
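As a rough sketch of how a receiving program might read the encoding= declaration before decoding a document, the Python function below looks at the first bytes to decide whether the declaration itself is written in EBCDIC or in an ASCII-compatible encoding. It is a simplification under stated assumptions: it ignores byte order marks and UTF-16, it falls back to UTF-8 when no declaration is found, and the encoding name IBM037 is chosen because Python recognizes it, not because it appears in Example 1-12.

```python
import re

# "<?xm" encoded in EBCDIC; used to recognize an EBCDIC XML declaration.
EBCDIC_SIGNATURE = "<?xm".encode("cp037")

ENCODING_PATTERN = re.compile(
    r"""encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def declared_encoding(data: bytes) -> str:
    """Return the encoding named in the XML declaration of `data`."""
    if data.startswith(EBCDIC_SIGNATURE):
        # The declaration uses only characters common to EBCDIC code pages,
        # so cp037 is assumed to be good enough to read it.
        head = data[:200].decode("cp037", errors="replace")
    else:
        head = data[:200].decode("ascii", errors="replace")
    match = ENCODING_PATTERN.search(head)
    return match.group(1) if match else "UTF-8"


# Hypothetical EBCDIC document for illustration.
doc = '<?xml version="1.0" encoding="IBM037"?><a>x</a>'.encode("cp037")
name = declared_encoding(doc)
text = doc.decode(name)
print(name, text)
```

A real parser performs this detection internally; the point of the sketch is that the declaration must be readable before the rest of the document can be decoded.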
The subject of character encoding and how to determine a document’s encoding is broad
enough to merit an entire chapter. See Chapter 11, “XML and character encoding issues” on
page 145 for more details regarding document encoding.
XML 1.1 was first published in 2004, and the current version of XML 1.1 is a second edition
from 2006. Read more about the XML 1.1 second edition at:
http://www.w3.org/TR/2006/REC-xml11-20060816/
The two current versions of XML (XML 1.0 fifth edition and XML 1.1 second edition) are
nearly identical, but XML 1.1 allows a wider range of characters to be used in data and
attribute values.
For mainframe developers it is important to know that XML 1.1 allows the mainframe
end-of-line character (NEL). While some operating systems use carriage return (CR) as the
end-of-line terminator, others use line feed (LF) or CR with LF. z/OS UNIX uses the new line
(NEL) character. Inclusion of the NEL character as a valid whitespace character means that
XML documents created on the mainframe can be parsed on any platform supporting XML
1.1 without first converting the NEL characters to CR or LF.
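When a document must be handed to a consumer that supports only XML 1.0, the conversion mentioned above can be sketched as follows. This is a minimal illustration that assumes the document has already been decoded to Unicode text, where NEL is the character U+0085; the function name is hypothetical.

```python
NEL = "\u0085"  # Unicode NEXT LINE character


def normalize_line_ends(text: str) -> str:
    """Convert CR LF, CR, and NEL line ends to LF."""
    # Replace CR LF first so that it becomes one LF, not two.
    return text.replace("\r\n", "\n").replace("\r", "\n").replace(NEL, "\n")


sample = "<a>" + NEL + "<b/>\r\n</a>"
normalized = normalize_line_ends(sample)
print(repr(normalized))
```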
1.6.1 Well-formedness
To be well-formed, documents must conform to the basic rules for XML documents. Table 1-1
lists a subset of the requirements for a well-formed XML document.
About Table 1-1: A binary zero (null) character is not part of well-formed XML. Binary
zeroes can interfere with parsing of documents in unexpected ways. For example, if you
generate or copy an XML document into a variable, and the variable contained null
characters before the generate/copy, there might be nulls following your document. Even
though you might think these nulls are not part of your document, the parser would give
you an error anyway.
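The effect of trailing nulls can be reproduced with any conforming parser. The Python sketch below uses the standard library parser rather than a z/OS parser, and the 32-byte buffer is a hypothetical stand-in for a fixed-size variable that contained binary zeros before the document was copied into it.

```python
import xml.etree.ElementTree as ET


def is_well_formed(data: bytes) -> bool:
    """Return True if the parser accepts `data` as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


def trim_buffer(buffer: bytes) -> bytes:
    """Drop trailing binary zeros left over from a fixed-size buffer."""
    return buffer.rstrip(b"\x00")


# Hypothetical 32-byte buffer, pre-filled with nulls, holding a short document.
document = b"<note>hello</note>"
buffer = document + b"\x00" * (32 - len(document))

print(is_well_formed(buffer))               # nulls follow the document
print(is_well_formed(trim_buffer(buffer)))  # only the document itself
```

Passing the parser the actual document length, rather than the length of the variable, avoids the problem without any trimming.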
1.6.2 Validation
The process of checking to see if an XML document conforms to a schema or DTD is called
validation. This is in addition to checking a document for compliance with XML's core concept
of syntactic well-formedness. All XML documents must be well-formed, but it is not required
that a document be valid unless the XML parser is validating. When validating, the document
is also checked for conformance with its associated schema.
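The difference between the two checks can be seen with a non-validating parser. In the Python sketch below, the standard library parser (which does not validate) accepts a document that contradicts its own internal DTD, yet rejects a document that breaks a well-formedness rule. The documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD says <book> contains one <title>,
# but the content is an <author>. It is well-formed, yet not valid.
DOC = """<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (title)>
  <!ELEMENT title (#PCDATA)>
]>
<book><author>A. Writer</author></book>"""

# A non-validating parse succeeds because only well-formedness is checked.
root = ET.fromstring(DOC)
children = [child.tag for child in root]

# A mismatched end tag breaks well-formedness and is always an error.
try:
    ET.fromstring("<book><title>x</book>")
    mismatch_rejected = False
except ET.ParseError:
    mismatch_rejected = True

print(children, mismatch_rejected)
```

A validating parser given the same first document would report that <author> is not allowed inside <book>.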
Non-extractive parsing has the goal of overcoming the limitations of DOM and SAX.
VTD-XML is an example of non-extractive XML parsing.
Data binding is a form of XML processing where data is made available as a hierarchy of
custom, strongly typed classes.
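A minimal sketch of the data binding idea, written by hand in Python: the XML is mapped onto typed classes so that the application works with attributes such as book.year rather than with parser events or tree nodes. The class names, the field names, and the document layout are all hypothetical; data binding tools typically generate such classes from a schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class Author:
    name: str


@dataclass
class Book:
    title: str
    year: int              # converted from the textual attribute value
    authors: List[Author]


def bind_book(xml_text: str) -> Book:
    """Map a hypothetical <book> document onto typed objects."""
    root = ET.fromstring(xml_text)
    return Book(
        title=root.findtext("title", default=""),
        year=int(root.get("year", "0")),
        authors=[Author(name=a.text or "") for a in root.findall("author")],
    )


book = bind_book(
    '<book year="2009"><title>T</title><author>A</author><author>B</author></book>'
)
print(book)
```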
A stylesheet converts the XML in Example 1-1 on page 2 to an HTML page. The stylesheet
selects some of the elements in the XML document, discards others, and adds the necessary
text to make the result a legal HTML document. See Figure 1-1 on page 13.
z/OS XML System Services does not use or provide SAX or DOM interfaces. It is a
buffer-oriented interface that allows the input XML to be provided in pieces and the parsed
result to be consumed in pieces. There is no limit on document size using this approach.
Documents of many gigabytes can be processed.
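The idea of supplying input in pieces and consuming parsed output in pieces can be illustrated by analogy with Python's XMLPullParser. This is not the z/OS XML System Services API, and unlike that interface this sketch still builds an element tree in memory; it only shows the calling pattern. The function name and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_in_pieces(chunks):
    """Feed input chunk by chunk and yield element names as they are parsed."""
    parser = ET.XMLPullParser(events=("start",))
    for chunk in chunks:
        parser.feed(chunk)                         # supply the next piece of input
        for _event, elem in parser.read_events():  # consume what is ready so far
            yield elem.tag
    parser.close()                                 # signal the end of the input
    for _event, elem in parser.read_events():      # drain anything still pending
        yield elem.tag


# The chunk boundaries deliberately fall in the middle of tags.
chunks = ["<orders><ord", "er id='1'/><order i", "d='2'/></orders>"]
tags = list(parse_in_pieces(chunks))
print(tags)
```

The caller never needs the whole document in one buffer, which is the property that lets a buffer-oriented parser handle very large inputs.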
While z/OS XML System Services does not provide SAX or DOM interfaces, the intermediate
format produced by z/OS XML System Services can be used to support SAX and DOM
parsers as well as non-traditional parsing requirements.
A number of XML-enabled products for System z use z/OS XML System Services behind the
scenes to gain advantage from its high-performance and consistent parsing results. For
example, COBOL and PL/I provide the option of using z/OS XML System Services. DB2
pureXML support also uses z/OS XML System Services for some of its XML processing.
z/OS XML System Services provides the option to offload most of the XML parsing to a zAAP
or zIIP specialty engine, when present.
An illustration and description of the returned XML parsed data stream is available in 6.1.1,
“Parsed XML data stream” on page 67.
The XML Toolkit Parser is an implementation of the IBM XML4C parser. XML4C is based
upon open source code from the Xerces Apache project of the Apache Software Foundation.
The Toolkit Parser allows C++ applications to parse XML documents using either the DOM or
SAX 2.0 interfaces. Some C++ programming experience is required to use the Toolkit Parser,
C++ Edition.
The Toolkit Parser, C++ Edition provides document validation against an XML schema or
DTD, as well as non-validating parsing that checks only well-formedness. The Toolkit parser
provides the option (using special classes) of using z/OS XML System Services to perform
the parsing without validation or parsing with validation (XML Schema validation only). This
provides the option to offload most of the XML parsing to a zAAP specialty engine when
present.
One unique aspect of the XML Toolkit for z/OS parser is that it always returns the parsed
names and values in UTF-16 regardless of the character encoding of source XML. This might
require customization of the invoking programs.
[Figure 2-1: XML Toolkit for z/OS architecture. Labels: XML, XSLT, XSLT processor, HTML or text, C++ application, XML Toolkit parser (SAX), z/OS XML System Services]
COBOL programs can acquire XML documents from numerous sources (sequential files, MQ
messages, HTTP requests), use the XML PARSE statement to parse the XML document into
COBOL data structures, and then process those data structures.
It is also possible to invoke the z/OS XML Toolkit XSLT processor from COBOL. You can also
invoke the XML Toolkit for z/OS Parser from COBOL, but some glue code is recommended to
deal with differences between the C and COBOL run-time error handling semantics. There is
an example of this glue code within XML Toolkit for z/OS User’s Guide, SA22-7932.
PLISAXA parses documents where the entire document is available in memory. PLISAXB
parses documents where the entire document is available within a file. PLISAXB limits the
document size to less than 2 GB. Additionally, PLISAXB copies the entire document file into
memory before parsing. Therefore, sufficient memory must be available to the task to perform
that file-to-memory copy.
PLISAXC provides parsing using XML System Services and parses documents available in
one or more buffers. PLISAXC is the only PL/I built-in XML parsing subroutine which provides
the ability to offload most of the parsing work to a zAAP.
PL/I applications can also invoke z/OS XML System Services directly. This might be required
to access parsing functionality which is not available through the language’s parser interface.
It is also possible to invoke the z/OS XML Toolkit XSLT processor from PL/I. You can also
invoke the XML Toolkit for z/OS Parser from PL/I, but some glue code is recommended to
deal with differences between the C and PL/I run-time error handling semantics. There is an
example of the glue code and COBOL calling the glue code within the XML Toolkit for z/OS
User’s Guide, SA22-7932.
[Figure 2-3: PL/I and z/OS XML System Services. Labels: PL/I, XML Toolkit Parser C++ Edition, z/OS XML System Services]
Both methods require that you prepare for the parsing or generation by using the CICS Web
Services Assistant or RDz to generate conversion modules.
CICS Web Services can process and respond to incoming SOAP requests. In addition, it can
issue SOAP requests to—and process the SOAP responses from—other Web Services.
SOAP data structures are built upon XML and require XML parsing. SOAP requests and
responses often contain application provided data in XML format.
CICS can parse and preserve binary data within XML by using the XML-binary Optimized
Packaging (XOP) specification for XML and the SOAP Message Transmission
Optimization Mechanism (MTOM) specification for SOAP.
Support for CICS Web Services pipeline is available beginning in V4R1 using z/OS XML
System Services for their on-demand parsing.
Business processes where the primary use of XML data will be Java applications that also
use IMS might benefit from IMS XML DB when integrating new XML processing requirements
within existing IMS applications. See Figure 2-5.
[Figure: XML schema (book, author, seq) and IMS DBD]
RDz XML converters can be created from COBOL and PL/I applications in either bottom-up
mode (starting from an existing COBOL or PL/I data structure) or in meet-in-the-middle
mode, by mapping between an existing XML schema or WSDL file and an existing COBOL or
PL/I data structure. In meet-in-the-middle mode, the latest RDz versions (in addition to
mapping XML elements) allow mapping of XML attributes or even a mixture of elements and
attributes. In top-down mode, RDz integrates CICS-provided tooling that supports CICS Web
services and CICS XML Transformation.
For both COBOL and PL/I, RDz can supply code that allows generation of XML documents
without requiring the use of XML GENERATE language. XML generation capabilities of the
code produced by RDz are also more flexible in that they allow for customization (such as
selective element omission or renaming of the default element names that are generated from
the data structure names).
RDz can produce XML converters in either COBOL or PL/I. When COBOL is used, RDz provides
the option to use z/OS XML System Services and to offload XML processing to a zAAP
processor.
WebSphere Application Server has the ability to parse and preserve binary XML content
through MTOM and XOP. Many WebSphere products have the ability to initiate SOA
processing using XML-based SOAP and REST protocols.
XML parsing using Java is currently distinct from other IBM XML-enabled products. For example,
Java XML does not currently employ z/OS XML System Services. However, all Java processing
is offloadable to zAAP processors. The use of SAX, DOM, JDOM, JAXP, TrAX, XPath, and XSLT
from Java is beyond the scope of this IBM Redbooks publication.
Business processes where the primary use of XML data will be within applications that also
use SQL might realize significant synergy and savings by using DB2's pureXML capabilities.
pureXML can decrease the cost of development, deployment, maintenance, and support
when integrating new XML processing requirements within existing SQL/DB2-based
applications.
DB2 for z/OS transparently uses the capabilities of z/OS XML System Services for some of its
parsing. The portion of XML processing performed by z/OS XML can be directed to a zAAP or
a zIIP processor, depending on whether DB2 is executing in task or SRB mode, respectively.
Additionally, for distributed database requests, additional non-XML processing can be
directed to a zIIP processor.
[Figure: XML queries and relational table queries joined into query results]
pureXML features are also available in DB2 for LUW. The many capabilities of pureXML are
not discussed in this book. For more information about pureXML, see the IBM Redbooks web
site at the following Web page:
http://www.redbooks.ibm.com
The Linux for System z environment provides no support for XML in the operating system
itself. However, many packages are available for Java and other programming languages.
Beginning with version 1.4, Java contains the API for XML Processing (JAXP), so it also
includes the SAX and DOM parsing interfaces. Xerces is also available on Linux, so you can
integrate SAX or DOM style parsing into a C++ program. The Apache Xerces Project also
provides packages for Java and Perl. Another package is libxml2, which supports C and Perl,
another important programming language on Linux. Of course, a well-defined strategy for
using open source in your production environment is key to a successful implementation.
There also are commercial products available that support XML on Linux for System z. These
include databases such as Tamino from Software AG and DB2 for Linux (including Linux on
System z). Other products in the WebSphere family also deal with XML documents.
Files can be created by FTP and other file transmission methods, traditional batch jobs, and
newer workloads such as native batch Java, WebSphere Application Server applications,
WebSphere Message Broker, WebSphere Process Server, and applications running under
z/OS UNIX. XML document files are not always newly created files. Configuration information
is a common use for permanent XML documents that are parsed at regular intervals such as
every application startup.
Generated XML is often stored within sequential data sets until it is transmitted or copied
to external media such as tape or CD. XML might also appear as stream-out or SYSOUT
data from a job step.
Messaging for request-reply and publish-subscribe are common exit points on System z.
System z can be the sender or receiver in a send-and-forget application.
System z is now a frequent consumer of SOA and Web services. In these scenarios, XML
exiting System z with no corresponding input request has become commonplace.
Traditional batch jobs can generate XML in one or more steps and consume it in other steps.
It is important to recognize that z/OS XML System Services is being used by new IBM and
third-party products every day. Even if a processing model discussed here does not use z/OS
XML System Services today, it might do so soon.
It is also possible to use empty files to trigger the submission of a batch job. In this scenario,
the XML input exists elsewhere (such as within a database or messaging queue).
Exchanging files between programming languages, hardware and software platforms and
external systems always carries some risk of incorrect character translations. This can
happen due to automatic character conversions such as an FTP non-binary transfer or
deliberate conversions such as when using DRDA®. See 11.4, “Exchanging XML documents
between heterogeneous systems” on page 158 for additional information.
[Figure: CICS TS V3R1 and CICS Web Services. Elements shown: WSDL; COBOL, PL/I, C, or C++ commarea definition; Web Services Assistant; WSBind file]
After the CICS application has completed, the updated commarea is converted from
commarea format into SOAP/XML and returned to the requester. Newer releases of CICS
allow channels and containers to replace commareas. Channels and containers address the
32 KB size restriction of commareas.
[Figure: CICS TS V4.1 with channels and containers. Elements shown: application program; XML string; EXEC CICS TRANSFORM XMLTODATA; data in program structure; COBOL, PL/I, C, or C++ commarea definition; Web Services Assistant; WSBind file; XML Schema or WSDL definition]
[Figure: WebSphere access to IMS. Elements shown: client; TCP/IP; WebSphere (WAS); EJB/bean; Java component; JDBC; DRA; ODBA; IMS DB]
When IMS receives requests for Web Services using the SOAP protocol, XML converters are
used to convert XML-based requests into program-friendly data areas which are passed to
IMS application programs through an internal queue. After the application has completed, the
updated data area is then converted to XML and returned to the requester. See Figure 3-4.
[Figure: SOAP request flow into IMS. Elements shown: XML; WebSphere Application Server V4 or above; IMS Connector; inbound XML converter; input queue; traditional COBOL program (IMS driver); output queue; outbound XML converter]
[Figure: Web services requester, SOAP server (HTTP), ODBC and JDBC, and DB2 stored procedure]
As a Web services consumer, SQL statements request a Web service through DB2
user-defined functions (UDFs). The UDF is requested by including it within an SQL statement,
and the UDF in turn invokes the service. The parameters passed to the UDF can be any
combination of native DB2 data types, including XML. See Figure 3-6.
[Figure: WebSphere, Connector, J2EE Java applications, MQ Messaging, ODBC, JDBC, and DB2]
Message enrichment (adding to the content) and transformation are common tasks and are
often performed upon the XML portion of these messages. These products not only generate
and consume XML internally, they also exchange messages with each other and other
products, as shown in Figure 3-8.
Some applications require an XML warehouse. The XML data will be stored and retrieved, but
there is no requirement to parse or validate the data. Other applications might require parsing
and processing of the data in addition to storing and retrieving the original XML string.
In scenarios where the original XML string is to be retrieved, storing it as XML ensures that
the original text can be returned. If the XML were converted into relational tables, it might be
possible to regenerate the original XML documents or data.
DB2 9 for z/OS provides new XML data types that allow XML to be stored as XML. The stored
XML can also be parsed and processed as needed. Applications that must frequently access
the XML as XML can benefit from storing the data in XML form.
For applications that must frequently access the data as XML and frequently access the data
as relational tables, it might be worthwhile to store the data in both its original XML form and a
relational form. See Figure 3-9.
[Figure: the original XML is stored in DB2 for XML queries and is also shredded into relational data]
In scenarios where the original XML string is to be retrieved, storing the XML as XML ensures
that the original XML can be returned. If the XML is converted into IMS structures, it might be
possible to regenerate the original XML document, as illustrated in Figure 3-10.
[Figure: mapping between an XML schema and an IMS DBD. Schema side: book element with @year (xs:date), title (xs:string), choice, publisher (xs:string), price (xs:decimal), and author sequences (0:∞). IMS side: PCB BIB21, segment BOOK (YEAR, TITLE, PUBLISH, PRICE) with child segments AUTH (LAST, FIRST) and EDIT (LAST, FIRST, AFFL). XML documents correspond to IMS data.]
IMS/Java applications can also parse and process the XML data as needed, as shown in
Figure 3-11.
[Figure: IMS Java application structure. Elements shown: IMS dependent region; transaction and message processing; customer code (business logic); IMS DB metadata; XML shredder and XML materializer code; JDBC/SQL and XML-DB database view; IMS Java class library (base, mapping to DL/I, application APIs); JNI Java-to-C interface; JDBC and JCA interfaces]
DB2 V8 for System z provides an MQ reading and processing facility called MQListener.
MQListener monitors designated MQ queues and executes a listener program when
messages are present. The listener program then reads a queue message, calls a
user-provided DB2 stored procedure providing the message content as a parameter, then
commits. Each message is an independent unit of work. See Figure 3-12.
DB2 stored procedures can be written in assembler, COBOL, PL/I, REXX, C, C++ and SQL
Procedure Language. MQListener provides the MQ reading mechanism while the user
provided stored procedures provide the XML parsing and business processing. These stored
procedures might be reusable within your Web service and other business applications.
When large volumes of messages are possible, reading and processing multiple messages
simultaneously might be required. MQListener supports multi-threading and can initiate more
than one listener thread per queue.
For more information regarding MQListener, see DB2 Version 9.1 for z/OS Application
Programming and SQL Guide, SC18-9841.
[Figure: MQ triggering. Elements shown: message; trigger; trigger monitor; initiate; listener application]
See 5.1, “Generating XML from COBOL” on page 42 for details regarding the syntax, options,
and results of XML GENERATE.
[Figure: CICS TS. Elements shown: CICS transaction (business application); COMMAREA or container; CICS Web Services; SOAP message; service provider]
A batch utility helps you generate the necessary XML generator modules and commareas,
either from an XML schema definition or from an existing commarea.
Note: In the recently released version of CICS TS 4.1, there is a new feature, CICS XML
Assistants (DFHLS2SC and DFHSC2LS), that allows a CICS application to do XML
processing (generation and parsing) separate from any Web services connotations.
The XML Toolkit XSLT Processor, C++ Edition is designed to transform XML to XML, XML to
HTML, and XML to text.
Tip: A binary zeros (null) character is not part of well-formed XML. Do not introduce binary
zeros or any other illegal character into XML, whether by failing to initialize FILLER or by
setting the parameter length or null indicator incorrectly for a DB2 stored procedure
parameter containing XML.
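As a rough illustration of the tip above, the following Python sketch scans a string for characters that fall outside the XML 1.0 Char production before the string is handed to a generator or a stored procedure. The function name find_illegal_xml_chars and the sample field are hypothetical; the same check could be written in any host language.

```python
import re

# Anything outside the XML 1.0 "Char" production, binary zero included.
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def find_illegal_xml_chars(text):
    """Return (offset, code point) pairs for characters XML 1.0 does not allow."""
    return [(m.start(), ord(m.group())) for m in _ILLEGAL_XML_CHARS.finditer(text)]

# Hypothetical field whose trailing FILLER was never initialized.
field = "lastnName1\x00\x00"
problems = find_illegal_xml_chars(field)
for offset, code_point in problems:
    print(f"illegal character U+{code_point:04X} at offset {offset}")
```

Running such a check before generation lets the program reject or repair the field instead of discovering the problem in the generated document.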
There are several options for generating XML strings. Each of the examples in this section
uses the COBOL data description shown in Example 5-1 and accompanying initialization
procedure as the source for XML string generation.
* Initializing Procedure
MOVE 'lastnName1' to lastn (1)
MOVE 'firstnName1' to firstn (1)
MOVE 'DE' to country (1)
MOVE 32767 to redbooks (1).
MOVE 'lastnName2' to lastn (2)
MOVE 'firstnName<' to firstn (2)
MOVE 'DK' to country (2)
MOVE 1 to redbooks (2).
basic element-style
The simplest use of XML GENERATE generates an XML string, but not a complete
document. Example 5-2 illustrates the required COBOL syntax and the resulting XML string.
Produces:
<personnel-redbooks-list>
<personnel>
<name>
<lastn>lastnName1</lastn>
<firstn>firstnName1</firstn>
</name>
<a2>
<country>DE</country>
<redbooks>32767</redbooks>
</a2>
By default, the generated XML string does not contain a standard XML document header. The
optional COUNT IN clause returns the number of generated character encoding units:
number of bytes for UTF-8 and single byte character sets or number of double-bytes for
UTF-16.
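The difference between the counting rules can be seen in a small Python sketch. It is only an illustration of what "encoding units" means, not of how COBOL computes the value; the helper name count_encoding_units and the sample string are assumptions.

```python
def count_encoding_units(xml_text, encoding):
    """Bytes for UTF-8 and single-byte code pages, 16-bit units for UTF-16."""
    if encoding.lower().replace("-", "") == "utf16":
        # Encode without a byte order mark, then count 2-byte units.
        return len(xml_text.encode("utf-16-be")) // 2
    return len(xml_text.encode(encoding))

# Hypothetical sample with one non-ASCII character.
sample = "<lastn>M\u00fcller</lastn>"
counts = {enc: count_encoding_units(sample, enc) for enc in ("cp037", "utf-8", "utf-16")}
print(counts)
```

The single-byte EBCDIC code page and UTF-16 report one unit per character here, while UTF-8 needs an extra byte for the non-ASCII letter.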
The generated XML element names are identical to the COBOL data element names.
COBOL data names often contain hyphens, and while use of hyphens is accepted within
XML, common XML convention uses underscores in place of hyphens. In Enterprise COBOL
V4R2 and later, users can use underscores in data names. For older programs, a simple
conversion of hyphen to underscore in element-names-only is shown in 5.1.4, “Hyphens in
element names” on page 46.
Note that the XML string is generated as a single contiguous string with no whitespace. The
generated string is presented here as it is displayed by an XML editor.
Produces:
<?xml version="1.0" encoding="IBM-037" ?>
<personnel-redbooks-list>
<personnel>
<name>
<lastn>lastnName1</lastn>
<firstn>firstnName1</firstn>
</name>
<a2>
<country>DE</country>
<redbooks>32767</redbooks>
</a2>
</personnel>
<personnel>
<name>
<lastn>lastnName2</lastn>
attribute-style
The WITH ATTRIBUTES clause causes XML GENERATE to produce an attribute-style XML
string rather than the default element-style string. See Example 5-4.
Produces:
<personnel-redbooks-list>
<personnel>
<name lastn="lastnName1" firstn="firstnName1" />
<a2 country="DE" redbooks="32767" />
</personnel>
<personnel>
<name lastn="lastnName2" firstn="firstnName&lt;" />
<a2 country="DK" redbooks="1" />
</personnel>
</personnel-redbooks-list>
Attribute values are always designated with an equals symbol (=) and enclosed within
quotation marks. Because attributes have no end-tags, attribute-style requires fewer
characters. This reduction in space is sometimes attractive for large documents.
Produces:
This automatic conversion is not part of the XML standard, but it is extremely useful for finding
and removing program defects. Where unintentional inclusion of illegal characters is likely,
programs should check for XML-CODE 417 and execute the appropriate recovery.
Alternatively, programs can validate fields in advance of the XML GENERATE. As we show in
Example 5-6, you can define a set of allowable characters to use within a COBOL class test
to identify problem characters.
dash-o-underscore.
    perform varying pos from 1 by 1 until pos > charcnt
      if xmldoc (pos:1) = '<'
        move 1 to tagstate
      end-if
      if tagstate = 1
        if xmldoc (pos:1) = '"'
          move 1 to quotestate
        else
          move 0 to quotestate
        end-if
      end-if
      if tagstate = 1 and quotestate = 0 and xmldoc (pos:1) = '-'
        move '_' to xmldoc (pos:1)
      else
        if xmldoc (pos:1) = '>'
          move 0 to tagstate
        end-if
      end-if
    end-perform.
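The same idea can be sketched in Python as a small state machine. This is an illustration under stated assumptions, not a translation of the COBOL paragraph above: the function name hyphens_to_underscores is hypothetical, and the sketch treats only the name that directly follows `<` or `</` as an element name, so attribute names, attribute values, and element content are left alone. It does not handle comments or CDATA sections that contain `<`.

```python
def hyphens_to_underscores(xml_text):
    """Replace '-' with '_' in element names only."""
    out = []
    in_name = False              # True while scanning an element name
    i, n = 0, len(xml_text)
    while i < n:
        ch = xml_text[i]
        if ch == "<":
            out.append(ch)
            i += 1
            # An end tag starts with "</"; keep the slash and go on to the name.
            if i < n and xml_text[i] == "/":
                out.append("/")
                i += 1
            # "<?" and "<!" open declarations or comments, not element names.
            in_name = i < n and xml_text[i] not in "?!"
            continue
        if in_name and (ch.isspace() or ch in "/>"):
            in_name = False      # the element name has ended
        out.append("_" if in_name and ch == "-" else ch)
        i += 1
    return "".join(out)

source = ('<?xml version="1.0" encoding="IBM-037" ?>'
          '<personnel-redbooks-list><a-b c-d="x-y">p-q</a-b>'
          '</personnel-redbooks-list>')
converted = hyphens_to_underscores(source)
print(converted)
```

Ending the name at the first whitespace is what keeps hyphens in attribute names and in the encoding name of the declaration untouched.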
XML GENERATE generates element names that are identical to their source field names. To
produce lower-case element names, use lower-case COBOL data item names.
Alternatively, you can use a procedure similar to Example 5-7 on page 46 to shift upper-case
to lower case by replacing
if tagstate = 1 and quotestate = 0 and xmldoc (pos:1) = '-'
  move '_' to xmldoc (pos:1)
with
if tagstate = 1 and quotestate = 0
  move function lower-case (xmldoc (pos:1)) to xmldoc (pos:1)
As with Example 5-7 on page 46, only element names are affected. Attribute names, attribute
values and element values will not be changed.
If one or more input data items is defined with PIC N, USAGE NATIONAL or GROUP USAGE
NATIONAL, then the output data item must be a national item.
When provided, the ENCODING clause must specify a codepage from the supported list of
code pages. See the appropriate revision of the Enterprise COBOL for z/OS Programming
Guide for the list which applies to your currently installed release.
We created a simple PL/I program to show the usage of XMLCHAR with which we build a
simple XML string. The complete example is given in B.1, “PL/I example to generate XML” on
page 165.
The output of this example is given in Example 5-8. It has been edited from the 131 character
wide listing into a more readable form here.
Through Enterprise PL/I 3.6, the language offered no facility for converting data into another
code page such as UTF-8. Version 3.7 of Enterprise PL/I introduced a new built-in function,
MEMCONVERT. It converts the data in a source buffer from the specified source code page to a
specified target code page, stores the result in a target buffer, and returns an unscaled REAL
FIXED BINARY value specifying the number of bytes written to the target buffer.
MEMCONVERT is based on CUNLCNV, which comes with z/OS support for Unicode. It might
be helpful to look at z/OS Support for Unicode: Using Unicode Services, SA22-7649.
We added a small piece of code according to the description above. The result of the
conversion is shown in Example 5-9. The first part (1) of the hexadecimal printout shows the
input to MEMCONVERT, which is encoded in 037. The second part (2) shows the same string
encoded in UTF-8. For better reading we have added the clear text to the first row of the 037
encoded text.
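For readers without a PL/I compiler at hand, the effect of such a conversion can be approximated in Python, whose cp037 codec corresponds to EBCDIC code page 037. This is only a stand-in for MEMCONVERT, not its interface: the function name convert_codepage and the sample string are assumptions.

```python
def convert_codepage(data, source, target):
    """Decode bytes from the source code page and re-encode them in the target.

    Returns the converted bytes and the number of bytes written.
    """
    converted = data.decode(source).encode(target)
    return converted, len(converted)

# Hypothetical stand-in for a string produced in code page 037.
ebcdic = "<?xml version='1.0'?>".encode("cp037")
utf8, written = convert_codepage(ebcdic, "cp037", "utf-8")

print(ebcdic.hex(" "))   # hexadecimal view of the 037-encoded input
print(utf8.hex(" "))     # hexadecimal view of the same string in UTF-8
print(written)
```

Comparing the two hexadecimal lines shows the same kind of before-and-after view that the example printout describes.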
With XMLCHAR and MEMCONVERT, you should be able to handle XML generation in such a way
that the final XML document can also be sent to other platforms such as UNIX or Windows®.
See the XML Toolkit for z/OS user’s guide for detailed information about how to set up your
environment to enable XSLT transformation processing on z/OS:
http://www-03.ibm.com/servers/resources/ixmza290.pdf
With the toolkit you receive programming samples to exploit the SAX, DOM, and XALAN API
for XSLT processing. The samples are shipped in a non-XPLINK version, but you can
build your own versions and bind them with XPLINK for better performance.
In Example 5-10, we have used the XALAN API to transform an XML document to an HTML
page. It shows the XML document used as input to the transformation.
</xsl:stylesheet>
First we used the XALAN command line interface to perform the transformation.
Example 5-12 shows the command line used.
After running the Xalan command, the resulting file library.htm contains the HTML code
seen in Example 5-13.
The Xalan command line interface is the simplest way to activate the Toolkit and transform an
XML document. On z/OS you have two other alternatives: call the XSLT APIs from a C or C++
program in the z/OS UNIX environment, or call them from a program in the MVS environment.
The Toolkit provides a number of sample programs to be used as a starter, and you might
enhance these examples with your own code.
In Example 5-14, we have moved the source code for sample program SimpleTransform to an
MVS data set and compiled and linked it on MVS. To comply with MVS naming rules,
SimpleTransform is renamed SMPLTRNS and the Xalan memory manager module used in the
program is renamed XALANMMI.
For both the compile and the link-edit jobs, consult your systems programmer to get the
library names used on your system.
To run SimpleTransform in the MVS environment, we made a small source code change. The
original SimpleTransform assumes that the program is invoked with a current directory
pointing to the input files. When running as a batch job on z/OS, there is no current
directory, so the program source is changed to include the entire path to the input files.
Example 5-16 shows the JCL.
#include <xalanc/Include/PlatformDefinitions.hpp>
#if defined(XALAN_CLASSIC_IOSTREAMS)
#include <iostream.h>
#else
#include <iostream>
#endif
#include <xercesc/util/PlatformUtils.hpp>
#include <xalanc/XalanTransformer/XalanTransformer.hpp>
#include "XALANMMI.hpp"
/**
* Example of the ICU's customizable memory management which can
* be used conjunction with Xalan's pluggable memory management feature
*/
#if defined(XALAN_USE_ICU)
#include "unicode/uclean.h"
void*
icu_malloc(const void * /* context */, size_t size)
{
return s_memoryManager.allocate(size);
}
void*
icu_realloc(const void * /* context */, void * mem, size_t size)
{
s_memoryManager.deallocate(mem);
#endif
int
main(
int argc,
char* /* argv */[])
{
XALAN_USING_STD(cerr)
XALAN_USING_STD(endl)
if (argc != 1)
{
cerr << "Usage: SimpleTransform"
<< endl
<< endl;
}
else
{
try
{
XALAN_USING_XERCES(XMLPlatformUtils)
XALAN_USING_XERCES(XMLUni)
XALAN_USING_XALAN(XalanTransformer)
XalanMemoryManagerImpl memoryManager;
#ifdef XALAN_USE_ICU
UErrorCode status = U_ZERO_ERROR;
if(U_FAILURE(status))
{
cerr << "Initialization of ICU failed! "
<< endl
<< endl;
return -1;
}
#endif
&memoryManager );
// Initialize Xalan.
XalanTransformer::initialize( memoryManager );
{
// Create a XalanTransformer.
XalanTransformer theXalanTransformer( memoryManager );
theResult = theXalanTransformer.transform
("/u/conrad/SimpleTransform/library.xml",
"/u/conrad/SimpleTransform/library.xsl",
"/u/conrad/SimpleTransform/library.htm");
if(theResult != 0)
{
cerr << "SimpleTransform Error: \n" << theXalanTransformer.getLastError()
<< endl
<< endl;
}
}
// Terminate Xalan...
XalanTransformer::terminate();
// Terminate Xerces...
XMLPlatformUtils::Terminate();
return theResult;
}
[Figure: CICS TS as a service requester. Elements shown: user transaction; data mapping; pipeline with handlers; SOAP request to and SOAP response from the service provider]
The application program passes data to the CICS Web Services Interface using the
command EXEC CICS INVOKE SERVICE. The request is passed through a pipeline and
sent to the target provider. The information in the wsbind file associated with this Web service
enables CICS to convert data delivered in the language structure into an XML document. The
pipeline handlers convert the XML document into a SOAP message. You can add extra
pipeline handlers for security, journaling, or other purposes.
If the application expects to receive a reply from the provider, the response is returned
through the pipeline, converted from XML to the corresponding language structure, and
returned to the application program through the same channel as used for submitting the
request.
[Figure: CICS Web Services Assistant. Elements shown: bottom up (DFHLS2WS); top down; language structure (PDS); data mapping; WSBind and WSDL files (HFS)]
The Web Services Assistant supports COBOL, PL/I, C, and C++. Example 5-18 shows a
COBOL call.
For more information about CICS Web Services, see the following publications at the
following Web page:
http://www.redbooks.ibm.com
Implementing CICS Web Services, SG24-7657
Securing CICS Web Services, SG24-7658
Developing Web Services Using CICS, WMQ and WMB, SG24-7425
The conversion modules needed by the TRANSFORM statement are prepared with the same
tools as are used for CICS Web Services. See 5.4.1, “CICS Web Services Assistant” on page 58.
Validation
By default, the TRANSFORM statement checks only that the message is well-formed. For testing
purposes, you can also have the XML checked for validity. To
enable validation you must:
Ensure that the XML binding and the schema are in the same location on z/OS UNIX. The
XMLTRANSFORM resource defines these files to CICS. You can use the INQUIRE
XMLTRANSFORM command to check the location of each file.
Turn validation on for the application. Use the CEMT or SPI command SET
XMLTRANSFORM(name) VALIDATION, where name is the XMLTRANSFORM resource.
The result from the validation control is not returned to the application, but only
communicated through written messages. Check the system log to learn if the XML
transformation is valid:
Message DFHML0508 indicates that the XML was successfully validated
Message DFHML0507 indicates that the validation failed.
These functions can be combined to create well-formed XML strings or documents. We have
a series of examples, beginning with Example 5-20, and we show at the bottom of each what
the instructions produce.
Note: All of the results for these examples have been formatted with white space for ease
of interpretation. The DB2 functions actually return the XML string as one contiguous
stream with no white space.
Produces:
<dept>2.</dept>
By default, all DB2 XML generating functions return an internal XML data type that cannot be
placed directly into a host variable. XMLSERIALIZE allows us to convert the internal XML
data type into a character data type. This character version can be returned to a program host
variable. XMLSERIALIZE also replaces special characters with the appropriate escape
sequences (for example, < into &lt;).
Produces:
<dept>2.</dept>
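The escaping of special characters described above for XMLSERIALIZE can be tried out with Python's standard library. This is an analogy only, since DB2 performs the replacement internally; the sample values are assumptions.

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical column values containing XML special characters.
values = ["firstnName<", "R&D", "plain"]
escaped = [escape(v) for v in values]

for original, safe in zip(values, escaped):
    print(f"{original!r} -> {safe!r}")

# Unescaping restores the original text, so no information is lost.
round_trip = [unescape(v) for v in escaped]
```

A program that receives the serialized string sees the escape sequences, and an XML parser turns them back into the original characters.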
Arrays of elements or arrays of strings are returned with the XMLAGG function, as shown in
Example 5-23.
Produces:
<lname>CONRAD</lname>
<lname>MERTIENS</lname>
<lname>SUBRAHMANYAM</lname>
Produces:
<lname>CONRAD</lname><fname>MOGENS</fname>
<lname>MERTIENS</lname><fname>HANS-DIETER</fname>
<lname>SUBRAHMANYAM</lname><fname>NAGESH</fname>
Produces:
<emp>
<lname>CONRAD</lname><fname>MOGENS</fname>
</emp>
<emp>
<lname>MERTIENS</lname><fname>HANS-DIETER</fname>
</emp>
<emp>
<lname>SUBRAHMANYAM</lname><fname>NAGESH</fname>
</emp>
The XMLELEMENT function (Example 5-26 on page 63) can return binary values in either
base64 or hex.
Produces:
<emp>
<lname>CONRAD</lname>
<fname>MOGENS</fname>
<bioid>8PLx8fHyQEA=</bioid>
</emp>
<emp>
<lname>MERTIENS</lname>
<fname>HANS-DIETER</fname>
<bioid>8PHx8fHxQEA=</bioid>
</emp>
<emp>
<lname>SUBRAHMANYAM</lname>
<fname>NAGESH</fname>
<bioid>9/Hw8PPyQEA=</bioid>
</emp>
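To see what the base64 text in the bioid elements above stands for, it can be decoded with a few lines of Python. The sketch assumes that the underlying bytes are EBCDIC (code page 037) characters, which the output suggests but the text does not state; the function name decode_bioid is hypothetical.

```python
import base64

def decode_bioid(encoded, codepage="cp037"):
    """Return the hex form and the character form of a base64 value."""
    raw = base64.b64decode(encoded)
    return raw.hex().upper(), raw.decode(codepage)

# The three bioid values from the output above.
decoded = [decode_bioid(v) for v in ("8PLx8fHyQEA=", "8PHx8fHxQEA=", "9/Hw8PPyQEA=")]
for hex_form, text_form in decoded:
    print(hex_form, repr(text_form))
```

The hex form is what the hex option would carry for the same bytes; base64 needs 12 characters for these 8 bytes where hex needs 16.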
The XMLATTRIBUTES function (shown in Example 5-27) can be used to return attribute-style
XML rather than element-style.
Example 5-28 DB2 Complete document with declaration and root element
exec sql select
xmlserialize ( xmlelement (name "emp_list",
xmlagg ( xmlelement (name "emp",
xmlattributes ( emplst.last_name as lname,
emplst.first_name as fname
)
) order by
emplst.last_name
)
) as clob(1000)
including xmldeclaration
) as "xmlout"
into :xml-string
from testdb.employees emplst
end-exec.
Produces
<?xml version="1.0" encoding="UTF-8"?>
<emp_list>
<emp LNAME="CONRAD" FNAME="MOGENS"/>
<emp LNAME="MERTIENS" FNAME="HANS-DIETER"/>
<emp LNAME="SUBRAHMANYAM" FNAME="NAGESH"/>
</emp_list>
Notice that the preceding XML declaration specified encoding as UTF-8. DB2 assumes all
documents will be output in UTF-8. If you are outputting the document with some other
character encoding, you might want to generate your declaration as a simple text string and
concatenate it in your program to your generated XML string.
DB2 for z/OS includes three other XML functions that are not typically used to directly
generate XML:
XMLCAST allows generated XML text to be converted into other host variable types; for
example, a numeric XML element can be cast into a COBOL numeric host variable
XMLTABLE extracts data from an XML document and converts it into one or more rows of
data
XMLPARSE parses input XML documents, with optional schema validation. The resulting
document can be stored as an XML column or shredded into conventional relational data
Note: The z/OS XML System Services parser does not have any Language Environment
dependencies. You can use the z/OS XML parser even in service request block (SRB)
mode.
z/OS XML System Services uses a buffer-in buffer-out technique. Input to and output from the
parser might span multiple buffers. This allows the application to handle large XML strings
that would not fit in its memory. Consequently, the XML parser does not provide a SAX or
DOM type interface.
Further, the application is responsible for managing the buffers. However, buffer management
is simplified because the input text stream can be broken at any point, even in the middle of a
multi-byte character. More information regarding buffer management can be found in 6.1.3,
“Buffer handling with XML System Services” on page 68.
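The effect of breaking the input at arbitrary points can be tried out with any parser that accepts input in pieces. The following Python sketch uses the expat bindings from the standard library purely as a stand-in; it is not the z/OS XML System Services interface, and the function name parse_in_chunks and the sample document are assumptions. With a window of one byte, the two-byte UTF-8 character in the sample is split across calls.

```python
import xml.parsers.expat

def parse_in_chunks(document, window):
    """Feed the document to the parser in fixed-size windows of bytes."""
    events, text = [], []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.CharacterDataHandler = text.append
    # Slide the window over the input; a split may fall inside a character.
    for offset in range(0, len(document), window):
        parser.Parse(document[offset:offset + window], False)
    parser.Parse(b"", True)      # signal the end of the input
    return events, "".join(text)

document = "<a><b>caf\u00e9</b></a>".encode("utf-8")
whole = parse_in_chunks(document, len(document))
byte_by_byte = parse_in_chunks(document, 1)
print(whole == byte_by_byte)
```

Because the parser keeps incomplete input between calls, the application only has to supply the next window and does not need to look for safe split points.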
The application is responsible for reading in the XML string, which gives the application
flexibility. The XML string can be read from an MVS data set, a file in the z/OS UNIX file
system, or a VSAM cluster, to name a few possible sources.
z/OS XML System Services does not provide services for generating XML streams or
documents. XML generation must be accommodated through other services.
z/OS XML System Services are compatible with multi-threaded and multi-tasking
environments. The interfaces are written so that they communicate through a thread/task
level control block (parser instance memory area or PIMA). This provides the thread/task level
separation required for reliable multi-threading and multi-tasking.
[Figure: layout of the parsed data stream. A GXLHXEC_TOK_BUFFER_INFO record (record length, datastream options, parse status, buffer length used, offset to error record) is followed by GXLHXEC_TOK_START_ELEM, GXLHXEC_TOK_CHAR_DATA, GXLHXEC_TOK_ATTR_NAME, GXLHXEC_TOK_ATTR_VALUE, and GXLHXEC_TOK_END_ELEM records, each carrying a record length]
The parsed XML data stream contains unique record type identifiers for the different
components within an XML document. Among the record types are:
XML declaration, when present
Start of an element
Character value for an element, when present
End of an element
Name of an attribute
Value of an attribute
Namespace declaration
These record types are similar to the SAX parsing callback events. Not coincidentally, the
records tend to appear in the same order SAX callback events occur. However, not every SAX
event is represented in the XML parsed data stream.
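The similarity to SAX callbacks can be made concrete with a small Python sketch that turns SAX events into a flat list of typed records. The record labels are borrowed loosely from the token names in the figure for illustration only; the real parsed data stream is a binary format that this sketch does not reproduce, and the class and function names are hypothetical.

```python
import xml.sax

class RecordCollector(xml.sax.ContentHandler):
    """Collect SAX events as (record type, value) pairs, in arrival order."""

    def __init__(self):
        super().__init__()
        self.records = []

    def startElement(self, name, attrs):
        self.records.append(("START_ELEM", name))
        for attr_name in attrs.getNames():
            self.records.append(("ATTR_NAME", attr_name))
            self.records.append(("ATTR_VALUE", attrs.getValue(attr_name)))

    def characters(self, content):
        # SAX may deliver text in pieces; merge them into one record.
        if self.records and self.records[-1][0] == "CHAR_DATA":
            content = self.records.pop()[1] + content
        self.records.append(("CHAR_DATA", content))

    def endElement(self, name):
        self.records.append(("END_ELEM", name))

def to_records(document):
    handler = RecordCollector()
    xml.sax.parseString(document, handler)
    return handler.records

records = to_records(b'<book year="2009"><title>XML</title></book>')
for record in records:
    print(record)
```

An application that walks such a list in order processes the document much as a SAX handler would, but without callbacks.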
Applications can invoke z/OS XML System Services directly using a simple API. This
approach is useful within languages lacking direct XML parsing support, such as REXX and
assembler. It is also useful when z/OS XML System Services features are required but not
available within the host language’s built-in features. It can also facilitate non-traditional
parsing needs. For example, a complete SAX parsing implementation might be far more
A key benefit of z/OS XML System Services is the ability to redirect portions of XML parsing
to zIIP or zAAP specialty engines, which can lower software costs. Most software products
using z/OS XML System Services run in task mode and will offload most of the XML
processing to a zAAP when available.
z/OS V1R10 added the ability to parse a document with validation. A document can be
validated against an XSD schema. For z/OS V1R9, the validation capability was added to the
parser with PTF UA44802.
The interface between the application and the XML parser is a call/return structure. The
application invokes services provided by the XML System Services with the necessary
parameters. The called service then returns the results. Based on the return code and reason
code provided by the parser, the application then controls its flow. Data is exchanged through
an input buffer and an output buffer. More details on this can be found in 9.1.4, “Basic loop to manage
the input and output buffer” on page 119.
Figure 6-2 on page 69 shows the buffers flowing between the application and the z/OS XML
System Services. We assume that every initialization has been done and the parse can take
place. For the moment we also assume that validation is not required. On the left side an XML
string is provided to the parser. In this example we assume that the complete document is too
large to be read into a single buffer. The reader function of the application fills a buffer and
advances this as parsing progresses. Logically, the reader slides a buffer window over the
XML string. The address of the position of the window is passed to z/OS XML System
Services routine gxlprs(), which is the parsing routine. Also passed to gxlprs() is the address
of an output buffer where the results are stored.
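[Figure 6-2: buffers flowing between the application and z/OS XML System Services. The
application reads the XML string and fills the input buffer; XML System Services fills the
output buffer and returns a return code and a reason code, on which the application acts.]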
Upon return from the parser the application must be able to handle spanned buffers. Spanned
buffers occur because either the text in the input buffer is consumed, or the parsed data
stream completely fills the output buffer, but there is residual data left in the input buffer. In
these cases the z/OS XML System Services parser returns a conditional success return code
(XRC_WARNING), and a reason code that indicates which buffer caused the spanning
condition. The parser also informs the caller about the number of bytes not parsed or unused
in each buffer. The parser also advances pointers to the next byte to work on (input) or
unused by the parser (output).
After the parsed data stream is processed, the application should then handle the spanning
buffer, and can optionally manage the other buffer as well. The reason codes (in hexadecimal)
to look for are x’1301’, x’1303’, and x’1304’. More details can be found in 9.1.4, “Basic loop
to manage the input and output buffer” on page 119.
There is no requirement that the application needs to break the input buffer at logical
boundaries. That is, even if the current input buffer ends in the middle of a tag name, z/OS
XML System Services will handle that break and resume with the next byte when it is made
available. z/OS XML System Services handles buffers ending between bytes of a multi-byte
character. This keeps program design much simpler.
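The effect of feeding input in arbitrary slices can be sketched in Python. This is only an analogy: it uses Python's incremental UTF-8 decoder rather than z/OS XML System Services, and the window size of 4 bytes is an arbitrary choice that happens to split a two-byte character. It shows that a consumer which keeps state between calls can accept a buffer that ends in the middle of a multi-byte character.

```python
import codecs

WINDOW = 4  # arbitrary window size chosen so that a character is split


def read_windows(data, size):
    """Slide a fixed-size window over the input, like the reader function."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


# The character 'ü' occupies two bytes in UTF-8 (0xC3 0xBC).
document = "<name>Jürgen</name>".encode("utf-8")

decoder = codecs.getincrementaldecoder("utf-8")()
pieces = []
for window in read_windows(document, WINDOW):
    # The decoder holds back an incomplete trailing byte sequence and
    # completes it when the next window arrives.
    pieces.append(decoder.decode(window))
pieces.append(decoder.decode(b"", final=True))

text = "".join(pieces)
```

The second window ends with the first byte of 'ü', yet the reassembled text is identical to the original string.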
This section describes another way to parse an XML document (after the reason code x’1302’
was reported) from the point of the last successful parsing attempt. To implement this option,
two items must be tracked after every successful parse. Note that a parse that ends with a
return code of 4 also counts as successful. First, the number of bytes (of the XML document)
parsed successfully is accumulated in an integer variable. Second, after every successful
parse, the PIMA is saved in a storage location similar to the original PIMA. This is done in
case the first attempt to parse the document fails with this reason code. If so, then the PIMA
to be used for subsequent retries is identical to the one returned by the initialization routine:
GXL1INI (GXL4INI).
At the point when x’1302’ is encountered, the following steps need to be taken:
1. From the beginning of the XML document (stored in a file, and so forth), read (but do not
process) as many bytes as saved in the integer variable for count of bytes successfully
processed until then.
2. Read the XML document further with as many bytes as required to fill the input buffer.
3. Pass the saved PIMA to the argument for PIMA of GXL1PRS (GXL4PRS).
4. Call GXL1PRS (GXL4PRS).
Before the parser is invoked, the arguments are set out as:
PIMA is set to the saved PIMA
input_buffer_addr is set to the input buffer having bytes from the document beyond the
point of last successful parse.
The flowchart for using this option is shown in Figure 6-4 on page 74. The dashed arrows
mean some more processing might be required but has been left out from this flowchart as it
is out of scope.
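The bookkeeping described above can be sketched as follows. The sketch does not call z/OS XML System Services: ToyParser is a made-up stand-in whose state dictionary plays the role of the PIMA, BufferError stands in for the x’1302’ condition, and the retry simply uses a larger output capacity. All names and sizes are assumptions for illustration.

```python
import copy
import io


class ToyParser:
    """Stand-in for a parser instance; `state` plays the role of the PIMA."""

    def __init__(self):
        self.state = {"tags_closed": 0}

    def parse(self, chunk, out_capacity):
        """Count '>' bytes; fail part-way if more are found than fit the output."""
        produced = 0
        for byte in chunk:
            if byte == ord(">"):
                produced += 1
                if produced > out_capacity:
                    raise BufferError("output area too small")
                self.state["tags_closed"] += 1
        return len(chunk)  # bytes consumed


def parse_with_restart(source, chunk_size, out_capacity, bigger_capacity):
    parser = ToyParser()
    consumed = 0                                  # bytes parsed successfully
    saved_state = copy.deepcopy(parser.state)     # state as returned by initialization
    stream = io.BytesIO(source)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return parser.state["tags_closed"], consumed
        try:
            consumed += parser.parse(chunk, out_capacity)
            saved_state = copy.deepcopy(parser.state)   # save after each success
        except BufferError:
            if out_capacity >= bigger_capacity:
                raise                             # already retried, give up
            stream = io.BytesIO(source)           # start again from the beginning
            stream.read(consumed)                 # read, but do not process
            parser.state = copy.deepcopy(saved_state)
            out_capacity = bigger_capacity


result = parse_with_restart(b"<ab>text</ab><c/><d/>", 8, 1, 4)
```

In this run the third chunk fails after the toy parser has already altered its state, so restoring the saved copy is what keeps the final count correct.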
An OSR contains a binary representation of a schema. Use of this binary representation makes
the validation process faster and more efficient. Schemas are sometimes long. Among
the other efficiencies brought by using OSRs, there is no need to parse the schema before
parsing the document you are trying to validate. OSRs are meant to be reused. That is, you
can load an OSR and validate many documents. You can validate only one document at a
time, of course.
z/OS XML System Services provides a tool to convert one or more XSD files into an OSR.
This tool is provided as a command for z/OS UNIX, and it can also be run from batch. A
callable service, gxluGenOSR, is also provided for generating OSRs from within a program.
This service is only available as a function for C/C++ callers; there is no assembler
equivalent. Further, the program from which the generator is
called must establish an environment in which Java can run. For an example of how to
achieve this, see 9.1.1, “Creating an OSR” on page 114.
By default, a z/OS XML System Services parse will return complete names each time, no
matter how many times they appear. By providing a User StringIDHandler exit routine, you
enable the z/OS XML System Services parser to replace those names with 4-byte StringIDs.
Use of 4-byte values in place of reoccurring text strings can substantially reduce the overall
size of the returned parsed data stream. Another advantage to StringIDs is that they are
much cheaper to search for and compare.
Note: Because the number of strings to handle leads to many calls to the StringIDHandler
exit, and potentially to many searches in a StringIDTable, adequate measures need to be
taken to ensure proper performance.
When StringIDs are in use, the application requesting the parse can, when it needs to, use
the provided string lookup routine to convert StringIDs to their original text value.
Note: The communication between the application and the StringIDHandler exit is
achieved through the system services parameter area. This area is not only a parameter
area, it can also be the storage for the StringIDTable. Examples in 9.1.6, “Get a StringID
exit working” on page 122 use this area in this manner. By this means the application can
provide the exit with a table already established when the OSR was generated or with
predefined string/StringID pairs.
Note: String IDs persist across multiple parses within a parser instance
Figure 6-5 String ID exit processing
When a parser instance is used to parse several documents, the StringID table (and its string
IDs) persists until the end of the parser instance.
The XML Parser, C++ Edition has implementations of SAX2 and DOM on z/OS V1R9 or later,
with slight alterations that allow it to use z/OS XML System Services. Support for
validation using z/OS XML System Services is provided from z/OS V1R10 onwards. Usage of
z/OS XML System Services can, in many cases, improve performance and allows the
workload to be offloaded to a zAAP specialty engine, if present. Thus, an application has two
choices for parsing. First, parse with the existing parser classes for which z/OS XML System
Services are not used. Second, parse with z/OS-specific parser classes where z/OS XML
System Services will be used. The flow for these options is depicted in Figure 6-6 on page 77.
This is done through a new function, PLISAXC. This routine uses XML System Services
under the covers. It is similar to the older routines PLISAXA and PLISAXB. It supports:
Documents larger than 2 GB
Documents coded in UTF-8
Namespaces
The application program provides an XML string in a buffer, and its length, to PLISAXC. The
buffer does not need to contain a complete XML string. If the parser finds that the XML string
is incomplete, it will invoke an event to trigger the application program to provide more XML.
Enterprise COBOL for z/OS V4R2 accepts COBOL data item names and program names
with underscores. XML GENERATE automatically produces element and attribute names that
exactly match the source COBOL data names, so data names that contain underscores yield
element and attribute names with underscores.
SPECIAL-NAMES.
XML-SCHEMA schema-name1 IS 'file name or path',
schema-name2 IS ddname-or-environment-variable.
osr-data-item, when provided, must contain a complete OSR. When referencing an OSR with
the FILE clause, the FILE clause must reference a SPECIAL-NAMES XML-SCHEMA entry.
XML-SCHEMA entries can reference a literal containing the file or path name, or they can
reference a ddname or environment variable name.
A CLOB host variable is assumed to be in the program’s default CCSID. XML is always stored
in UTF-8 (CCSID 1208). Therefore, XML within CLOB host variables will typically undergo a
conversion to UTF-8. When selecting XML into a CLOB, it will typically undergo a conversion
from UTF-8.
A DBCLOB host variable is assumed to be in UTF-16 CCSID (1200). These will also undergo
a conversion to and from UTF-16.
A BLOB host variable is binary and assumed unsafe for conversions. Therefore, a host
variable of type SQL TYPE IS XML AS BLOB is assumed to contain XML with character encoding
UTF-8.
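The conversions described above can be illustrated with Python's built-in codecs. This is a sketch of the encoding steps only, not of DB2 processing. It assumes a program default CCSID of 37, which corresponds to the Python codec cp037; your installation might use a different CCSID.

```python
# Assumption: the program's default CCSID is 37 (Python codec "cp037").
xml_text = '<author country="DE">Hans-Dieter</author>'

# CLOB host variable: text in the program's default CCSID.
clob_bytes = xml_text.encode("cp037")
# On input the CLOB content is converted to UTF-8 (CCSID 1208).
stored_from_clob = clob_bytes.decode("cp037").encode("utf-8")

# DBCLOB host variable: text in UTF-16 (CCSID 1200), also converted to UTF-8.
dbclob_bytes = xml_text.encode("utf-16-be")
stored_from_dbclob = dbclob_bytes.decode("utf-16-be").encode("utf-8")

# BLOB host variable: treated as binary and taken as UTF-8, no conversion.
blob_bytes = xml_text.encode("utf-8")
stored_from_blob = blob_bytes

# Selecting XML into a CLOB converts from UTF-8 back to the default CCSID.
selected_clob = stored_from_clob.decode("utf-8").encode("cp037")
```

All three input paths end with the same UTF-8 bytes, while the CLOB bytes before conversion differ from the stored form.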
CLOB_FILE, BLOB_FILE and DBCLOB_FILE are file reference locators. They allow you to
provide the data set name or path. When used for input, DB2 will read the contents of the
files. When used for output, DB2 will write to these files. XMLPARSE can parse data directly
from a file reference locator.
For detailed information about XML host variable types, see the DB2 Version 9.1 for z/OS
Application Programming and SQL Guide. Additional information can be found in DB2 9
pureXML Guide, SG24-7315 and DB2 9: pureXML Overview and Fast Start, SG24-7298.
7.1.1 Enterprise COBOL for z/OS applications and z/OS XML System Services
COBOL applications can invoke an XML parse through the XML PARSE statement. See
Figure 7-1. The native COBOL parser will be invoked for programs compiled using the
XMLPARSE(COMPAT) option, while the z/OS XML System Services parser will be invoked by
programs using the XMLPARSE(XMLSS) option. Neither use of XML PARSE provides
document validation prior to V4R2. Using Enterprise COBOL V4R2 with the
XMLPARSE(XMLSS) compiler option plus the VALIDATING WITH phrase of the XML PARSE
statement will invoke the validating parser. When using XMLPARSE(XMLSS), the portion of
the processing performed by z/OS XML System Services will be offloaded to a zAAP when
one is present.
Alternatively, as Figure 7-1 shows, COBOL applications can invoke the z/OS XML System
Services parser through the provided assembler API using COBOL CALL ... USING or the
provided C API using COBOL CALL ... USING ... RETURNING .... Scenarios where
document validation is required could use this approach in place of, or in addition to, the use
of XML PARSE. When using z/OS XML System Services through APIs for parsing, the
application must navigate the intermediate format created by z/OS XML System Services.
Additionally, the application must manage both input and output buffers for z/OS XML System
Services.
[Figure 7-1: COBOL applications reaching the parser through XML PARSE or through CALL, with
zAAP offload]
The native COBOL XML parser is a high speed, low function option. It should be considered
when maximum performance is a significant requirement. The native COBOL parser’s
well-formedness checking does not identify all cases of improper XML form. The native COBOL
parser cannot do document validation.
Use of the z/OS XML System Services parser through COBOL’s XML PARSE verb provides
much more functionality than the native COBOL parser.
7.1.2 Enterprise PL/I for z/OS programs and XML System Services on z/OS
Enterprise PL/I for z/OS provides three SAX-style parsing routines (see Figure 7-2):
PLISAXA
PLISAXB
PLISAXC
PLISAXC uses the z/OS XML System Services parser while PLISAXA and PLISAXB use
PL/I-provided routines. PLISAXC is the only built-in routine with the option of offloading most
of the XML parsing work to a zAAP.
Alternatively, PL/I applications can invoke the z/OS XML System Services parser using the
provided assembler or C API. Scenarios where document validation is required could use this
approach in place of, or in addition to, the use of PLISAXC. When using z/OS XML System
Services for parsing, the application must navigate the intermediate format created by z/OS
XML System Services. The application must also manage both input and output buffers for
the z/OS XML System Services.
[Figure 7-2: PL/I applications reaching the parser through CALL, with zAAP offload]
C can use either the z/OS XML System Services assembler interface or the z/OS XML
System Services C++ interface depending upon whether XPLINK is needed.
The XML Toolkit for z/OS provides to C++ applications both DOM and SAX interfaces. When
the zSAX2XMLReader virtual class is used, z/OS XML System Services will be used and
XML parsing is eligible for offload to a zAAP/zIIP.
In case your software has the requirement to run under control of an SRB, you need to fulfill
particular requirements. One requirement is that you cannot use program code that is
dependent on Language Environment. This can be achieved by writing a C program and
using the bare metal C code variant of it. Another option is to use Assembler. In either case,
you can use z/OS XML System Services in such an environment because these services are
not using the Language Environment.
7.2.3 Invoking the XML Toolkit XSLT Processor from within COBOL-PL/I
applications
Figure 7-3 Invoking the XML toolkit from within COBOL-PL/I, single document input
To process many documents, the application could iteratively position non-transformed XML,
invoke the toolkit, and parse the result, as shown in Figure 7-4.
Figure 7-4 Invoking the XML Toolkit from within COBOL-PL/I for multiple input documents
Figure 7-5 Invoking the XML toolkit from within COBOL-PL/I for a single output document
7.2.4 Invoking the XML Toolkit Parser from within COBOL-PL/I applications
COBOL and PL/I applications can invoke the XML Toolkit for z/OS Parser, C++ Edition. It is a
good practice to invoke a C glue routine to invoke the appropriate C error handling semantics.
Examples of COBOL calling the toolkit for z/OS Parser with a C routine are available within
the XML Toolkit for z/OS User’s Guide, SA22-7932.
7.3.3 When to use the z/OS XML parser rather than the PL/I native parser
Some design scenarios require use of the z/OS XML System Services parser rather than the
native parser. There are also scenarios where use of the z/OS XML System Services parser
provides benefit, though it is not required.
Large documents
The PLISAXA parser requires the entire document be available in contiguous memory.
The z/OS XML System Services parser will accept documents in one or more buffers.
PLISAXB requires the entire document be available in a file and places a 2 GB limit upon
the total text length. Additionally, PLISAXB reads the entire content of the file into memory.
Therefore, there must be sufficient memory available to contain the entire document.
z/OS XML System Services’ multi-buffer arrangement allows processing of large
documents.
Document text arrives in pieces
When whole documents are constructed from multiple segments (MQ messages, HTTP
requests, sockets, file records, and so forth), it might be easier to provide each segment
separately to the z/OS XML System Services parser rather than merge multiple segments into
a single large memory area for the native PL/I parser.
Document is provided in UTF-8
The z/OS XML System Services parser processes documents encoded in UTF-8 directly.
The native PL/I parser requires converting the entire document from UTF-8 to UTF-16 prior
to parsing.
Document containing namespaces
The z/OS XML System Services parser parses namespaces. As of Enterprise PL/I for
z/OS V3R8, the native parser does not parse namespaces.
7.3.4 When to change from COBOL and PL/I native parsers to z/OS XML
System Services parsers
Changing existing applications to z/OS XML System Services parsers should be considered
when sufficient zAAP/zIIP eligible work exists to justify purchasing zAAP or zIIP engines. Use
of z/OS XML System Services parsers will increase the benefit gained from the specialty
engines.
There are some differences between the output of z/OS XML System Services parsers and
non-z/OS XML System Services parsers. Application changes are sometimes required to
change from one to the other. For example, for COBOL there are detailed instructions on how
to migrate from the native parser to the z/OS XML System Services parser in Enterprise COBOL for
z/OS Version 4.2 Compiler and Runtime Migration Guide, GC23-8527.
Figure 8-1 Simple non-validating parsing using z/OS XML System Services
Figure 8-2 Non-validating parsing with z/OS XML System Services using String IDs
z/OS XML System Services will invoke the user-provided User String Handler routine for each
unique text value. The User String Handler searches the StringID table and returns the
assigned StringID if one already exists, or adds the new string to the StringID table and
returns the newly assigned StringID value.
While processing the XML-parsed data stream, the application can use the GXLSYM31
(GXLSYM64) StringID service (lookup) routine to retrieve the text string value for a given
StringID.
One of the advantages of providing your own User String Handler routine is that it allows you
to use a predefined set of StringID assignments. This allows a set of business applications to
use the same StringID values for the same text values in every XML document they parse.
See Figure 8-3.
Figure 8-3 Non-validating z/OS XML System Services parsing with a predefined set of String ID values
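The division of work between a User String Handler, the StringID table, and the lookup routine can be sketched in Python. The class and method names here are hypothetical and do not correspond to the actual exit interface or to the GXLSYM31 (GXLSYM64) service; the sketch only shows the search-or-assign logic, the reverse lookup, and the use of predefined assignments.

```python
import struct


class StringIDTable:
    """Maps each distinct string to a small integer ID and back."""

    def __init__(self, predefined=None):
        self._ids = dict(predefined or {})
        self._strings = {sid: text for text, sid in self._ids.items()}
        self._next = max(self._strings, default=0) + 1

    def handler(self, text):
        """Role of the User String Handler: return the existing ID or assign one."""
        sid = self._ids.get(text)
        if sid is None:
            sid = self._next
            self._next += 1
            self._ids[text] = sid
            self._strings[sid] = text
        return sid

    def lookup(self, sid):
        """Role of the lookup routine: convert an ID back to its text."""
        return self._strings[sid]


# Predefined assignments shared by a set of applications (made-up values).
table = StringIDTable(predefined={"library": 1, "book": 2})

names = ["library", "book", "title", "book", "title"]
ids = [table.handler(name) for name in names]

# Each name is represented by a 4-byte value instead of its text.
encoded = b"".join(struct.pack(">I", sid) for sid in ids)
```

With predefined assignments an application can compare against known ID values directly, without calling the lookup routine for names it already knows.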
Note: Use of predefined StringID values can provide substantial processing efficiency for
some parsing processes. It can be used to avoid StringID to Text lookup calls and facilitate
navigation of the XML parsed data stream and sparse parsing.
Sparse parsing is the act of quickly isolating only those portions of the XML parsed data
stream that are germane to the business process. This is in contrast to navigating the
entire parsed data stream.
After illustrating basic two-step and one-step processes, we will discuss three more advanced
models that use additional features of the z/OS XML System Services parser.
Two-step process
The two-step process separates the creation of the OSR from the use of the OSR. To create
an OSR, the application must initialize the OSR generator environment, load the XSD files
and generate the OSR. The resulting OSR can be stored and used as needed.
Other steps can then load the OSR, initialize the parse environment, make the XML string
available, request a parse, and process the returned XML parsed data stream. When loaded,
the OSR can be used over and over again.
[Figure: two-step process. Step 1 generates the OSR, including its StringID table, and writes
it. Step 2 reads and loads the OSR for use by z/OS XML System Services.]
Note: Document parsing performance is improved for strings within the XML document
which appear within the OSR.
You do not need to provide a User String Handler to the OSR generation step to obtain the
benefits of using a String ID table within the OSR. The OSR generator provides a default
String Handler that builds the appropriate String ID table. The use and benefits of providing
your own User String Handler within OSR generation will be discussed in “Validating parsing
with predefined StringIDs returning StringIDs” on page 98.
One-step process
The one-step process combines all the steps from the two-step process and provides
identical results and benefits. See Figure 8-5 on page 97.
[Figure: one-step process. The OSR, including its StringID table, is generated and used by
z/OS XML System Services.]
Document parsing performance is improved for strings within the XML document that appear
within the OSR.
[Figure: OSR generation from XSD files, calling a User String Handler that supplies predefined
StringID assignments. The complete OSR, including its StringID table, is then loaded by z/OS
XML System Services.]
Figure 8-7 Validating parse with predefined String ID assignments returning String IDs
Use of a User String Handler during parsing provides a smaller XML parsed data stream. It
includes String IDs for the XML string and names appearing within the XSD schema. Use of a
predefined StringID assignment within the User String Handler allows consistent assignment
of specific text strings to specific String ID values.
When using a String Handler for parsing we suggest you write your own and use it for both
parsing and OSR generation. Two different executable modules are required. OSR generation
expects the String Handler routine to use OS linkage conventions and establish its own
dynamic storage. The parser requires the String Handler routine to use special prolog and
epilog code for establishing dynamic storage. Language Environment conforming routines
cannot be used for a parser-driven String Handler routine because Language Environment
routines cannot execute in enclave-SRB mode. It is possible to create both executable
modules from a single source module. For information regarding creating these executables,
see 9.1.6, “Get a StringID exit working” on page 122.
Figure 8-8 Validating parse returning predefined String IDs and the OSR’s String ID table
Figure 8-9 shows the main program flow for a SAX2 parser application and the corresponding
code.
// Create an instance of the parser
SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
// Declare and register handlers
SAX2CountHandlers handler;
parser->setContentHandler(&handler);
parser->setErrorHandler(&handler);
The document handler is the main interface in a SAX application. Figure 8-10 shows how the
SAX parser transfers information to the application by calling methods in the document
handler.
Example 8-1 describes the different interfaces to the document handler. The SAX parser
application must be able to use the most important of these interfaces to process the XML
documents.
virtual void characters (const XMLCh *const chars, const XMLSize_t length)=0
Receive notification of character data.
virtual void processingInstruction (const XMLCh *const target, const XMLCh *const
data)=0
Receive notification of a processing instruction.
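To show the callback style in a form that can be run anywhere, the following sketch uses Python's standard xml.sax module instead of the XML Toolkit C++ classes. The handler class and the sample document are our own illustration; the method names startElement, characters, and processingInstruction are those of the Python ContentHandler interface, which mirrors the SAX interfaces listed above.

```python
import xml.sax


class CountHandler(xml.sax.ContentHandler):
    """Records the events that the parser reports to the application."""

    def __init__(self):
        super().__init__()
        self.elements = 0
        self.text = []
        self.instructions = []

    def startElement(self, name, attrs):
        self.elements += 1

    def characters(self, content):
        # The parser may deliver one run of text in several calls.
        self.text.append(content)

    def processingInstruction(self, target, data):
        self.instructions.append((target, data))


document = (
    b'<?xml version="1.0"?><?app hint?>'
    b"<library><book>XML on z/OS</book></library>"
)

handler = CountHandler()
xml.sax.parseString(document, handler)
```

The application never sees the document as a whole. It only receives the events, in document order, and keeps whatever state it needs in the handler object.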
The DOM parsing has the same main flow as the SAX parsing regarding initialization and
termination. Figure 8-11 shows the main program flow for a DOM parser application and the
corresponding C++ code.
// Create an instance of the parser
XercesDOMParser* parser = new XercesDOMParser();
// Declare and create handlers
DOMTreeErrorReporter* errReporter = new DOMTreeErrorReporter();
DOMWriter* theSerializer =
    ((DOMImplementationLS*)impl)->createDOMWriter();
The DOM parser application must navigate through the DOM tree. Example 8-2 on page 103
illustrates this task. Example 8-2 on page 103 shows the XML document used.
The parsing statement in the application built the DOM tree in memory. The relationship
between the elements is shown in Figure 8-12. The entire document tree is not detailed here.
[Figure 8-12: DOM tree of the sample document. The root element <library> is the parent of
child elements, which are siblings of each other. Elements such as <author> carry a "country"
attribute (for example US or DE) and CharacterData values such as an author name, a title, a
year, and a number.]
The figure details the relation from element to element and the relation between an element
and its corresponding value and any attributes. To navigate through the DOM tree, the
application must use the different interfaces to DOMNode to ensure all needed information in
the XML document is handled.
virtual bool isSupported (const XMLCh *feature, const XMLCh *version) const =0
Tests whether the DOM implementation implements a specific feature and that feature is
supported by this node.
virtual void * setUserData (const XMLCh *key, void *data, DOMUserDataHandler *handler)=0
Associate an object to a key on this node.
virtual void * getFeature (const XMLCh *feature, const XMLCh *version) const =0
This method makes available a DOMNode's specialized interface.
Non-standard Extension
A DOM parser application is more complex and consumes more memory (to hold the entire
document in its internal tree representation) than a SAX parser application. But it has the
advantage that the application can search the DOM tree without reading all elements
or attributes, and can revisit specific elements.
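Tree navigation of this kind can be tried out with Python's standard xml.dom.minidom module, which offers the same parent, child, and sibling relations as the C++ DOMNode interfaces. The document below is a small sample we made up for this sketch, loosely modeled on a library of books; it is not the document from the examples in this book.

```python
from xml.dom import minidom

# Sample document (an assumption for this sketch), written without
# whitespace between elements so that every child node is an element.
document = (
    "<library>"
    "<book><title>XML processing on z/OS</title>"
    '<author country="US">Mike Ebbers</author>'
    '<author country="DE">Hans-Dieter Mertiens</author>'
    "<year>2009</year></book>"
    "</library>"
)

dom = minidom.parseString(document)
root = dom.documentElement            # <library>, the root element
book = root.firstChild                # child of the root: <book>
title = book.firstChild               # first child of <book>: <title>
first_author = title.nextSibling      # sibling of <title>: first <author>

# Go directly to the elements of interest and read attribute and text.
authors = [
    (node.getAttribute("country"), node.firstChild.data)
    for node in book.getElementsByTagName("author")
]
```

Because the whole tree is in memory, the application can return to the book node at any time and query it again, which a SAX application cannot do without reparsing.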
8.3 COBOL
The z/OS XML System Services parser will be used for programs using XML PARSE and
compiled with the XMLPARSE(XMLSS) compile option. For an example of XML PARSE, see
Enterprise COBOL for z/OS Version 4.2 Programming Guide, SC23-8529.
The z/OS XML System Services parser can also be called directly from a COBOL application.
For an example of COBOL calling z/OS XML System Services directly, see B.4, “Enterprise
COBOL program to query XML document declaration” on page 180 and B.5, “C program to
query XML document declaration” on page 183.
8.4 PL/I
The z/OS XML System Services parser will be used for programs using PLISAXC.
The z/OS XML System Services parser can also be called directly from a PL/I application. For
an example of PL/I calling z/OS XML System Services directly, see 7.1.2, “Enterprise PL/I for
z/OS programs and XML System Services on z/OS” on page 87.
[Figure: CICS pipeline passing a request through a chain of handlers to the business logic]
CICS Web Services automatically starts the CICS transaction related to the XML message.
The application program receives the data in the corresponding channel and reads in the data
container by container.
The business application replies to the requester by placing the reply data in containers in the
same channel in which the request was received. Data is then transformed to XML by the pipeline
according to the information in the WSBIND file.
The tool contains two programs (see Figure 8-14 on page 108). Both programs also create the
WSBIND file.
The WSBIND file contains information to let CICS create main storage blocks to map data
between XML and language structures. Example 8-4 shows a sample WSBIND job.
Conversion modules are available to prepare for the TRANSFORM statement with the same
tools as used for CICS Web Services. See 8.5.1, “CICS Web Services Assistant” on
page 107.
If your conversion module is created from an XML schema and not a language structure,
there might be more than one transformation between XML and language structure. To allow
you to control the assignment of the correct language structure, two extra parameters,
ELEMNAME and ELEMNAMELEN, on the TRANSFORM statement inform you of the root
element name. Knowing the root element name, it is possible to assign the correct data
structure to the containers received. See Example 8-6.
If your application can receive XML messages that are derived from different XML schemas,
you need to inspect the XML string to determine the relevant conversion. To handle this, first
call the TRANSFORM API, get the element name, and decide from the element name which
XMLTRANSFORM is needed to parse the entire string. Then call the TRANSFORM API again
with the relevant XMLTRANSFORM option, as shown in Example 8-7.
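The inspect-then-dispatch pattern can be sketched outside CICS as follows. The sketch uses Python's standard ElementTree parser to obtain the root element name; it does not use the TRANSFORM API. The mapping table and the resource names ORDERXFM and INVXFM are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping from root element name to the transform resource.
TRANSFORMS = {"order": "ORDERXFM", "invoice": "INVXFM"}


def root_element_name(xml_bytes):
    """Return the name of the root element, stopping at its start tag."""
    for _event, element in ET.iterparse(io.BytesIO(xml_bytes), events=("start",)):
        # The first "start" event belongs to the root element.
        # A namespaced root would be reported as "{uri}name".
        return element.tag
    raise ValueError("no root element found")


def choose_transform(xml_bytes):
    """Pick the transform resource that matches the message's root element."""
    name = root_element_name(xml_bytes)
    try:
        return TRANSFORMS[name]
    except KeyError:
        raise ValueError(f"no transform registered for <{name}>") from None


message = b"<invoice><total>42</total></invoice>"
selected = choose_transform(message)
```

The first pass only needs the root element name, so it can stop early. The second pass then processes the entire message with the transform that was selected.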
The result from the validation control is not returned to the application, but only
communicated through log messages. Check the system log to ascertain whether the XML
transformation is valid:
Message DFHML0508 indicates that the XML was successfully validated.
Message DFHML0507 indicates that the validation failed.
XMLPARSE will STRIP or PRESERVE whitespace when specified. XMLPARSE will also
accept XML strings from columns and SQL expressions. XMLPARSE only accepts
well-formed XML documents as defined in XML 1.0.
For additional information, see DB2 Version 9.1 for z/OS Command Reference, SC18-9844.
Sub-schemas are only required when the primary schema document references other
schema documents. The ENABLE DECOMPOSITION subcommand allows the schema to be
referenced by the DECOMPOSE command processor command, which is used to shred
XML.
The registered schema is not tied to one specific table or column. It can be used to validate
any document that complies with the schema. Example 8-10 shows several (of many) variations
for the insert with validation syntax.
INSERT INTO table (column list) VALUES (col-1-val, col-2-val, XMLPARSE (DOCUMENT
SYSFUN.DSN_XMLVALIDATE(:xml-host, 'SYSXSR', :host-schema-name)))
INSERT INTO table (column list) VALUES (col-1-val, col-2-val, XMLPARSE (DOCUMENT
SYSFUN.DSN_XMLVALIDATE(CAST(:char-host AS CLOB), :host-sysxsr-schema-name)))
INSERT INTO table (column list) VALUES (col-1-val, col-2-val, XMLPARSE (DOCUMENT
SYSFUN.DSN_XMLVALIDATE( xml-file-reference-variable ,'SYSXSR.schema')))
The IBM Information Management Software for z/OS Solutions Information Center includes
additional examples and the most up-to-date information. The IBM Information Management
Software for z/OS Solutions Information Center can be found at the following Web page:
http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp
For additional information regarding XML parsing within DB2, refer to the following resources:
DB2 Version 9.1 for z/OS SQL Reference, SC18-9854
DB2 Version 9.1 for z/OS Application Programming and SQL Guide, SC18-9841
DB2 9: pureXML Overview and Fast Start, SG24-7298
DB2 9 pureXML Guide, SG24-7315
There might be requirements to use this tool outside the z/OS UNIX shell, for example, when
the XSD/OSR combination is part of a change management system. In this case, it might be
more convenient to run this tool in a pure batch environment. One way is to use the batch
interface routine from z/OS UNIX, BPXBATCH. Another way to address this requirement is to
make xsdosrg an executable that can be loaded from a partitioned data set. This allows
xsdosrg to be within the standard JOBLIB/STEPLIB libraries. Specifically, the output can be
easily put into archiving tools. We discuss the necessary steps to achieve this.
1. Copy the xsdosrg tool to a partitioned data set extended (PDSE). This can be done
from the z/OS UNIX shell using the following copy (cp) command:
cp /bin/xsdosrg "//'HDM.SG7810.LOADE(XSDOSRG)'"
After that we have an executable version of xsdosrg in the PDSE. In our example it is
HDM.SG7810.LOADE.
Note: You need to keep track of the service level of xsdosrg to avoid running an
outdated version. Alternatively, you can copy the xsdosrg program to a temporary
PDSE before executing it. Executable modules can be copied using the program
management binder, IKJEFT01 TSO command processor, or BPXBATCH UNIX batch
command processor. Example 9-1 shows use of the program management binder to
copy the xsdosrg executable before executing it.
The next step is to create the JCL to execute xsdosrg in batch. Example 9-1 shows the job
we used in our project.
The slash after the PARM= separates those options that are specific to Language
Environment from those which are passed to the program, here xsdosrg. With the -v we
asked for details of the run, the -o test.osr identifies the output file, and with the -l &L
we pass the name of a list file to xsdosrg.
The more important settings here are those which help to establish the necessary
environment for xsdosrg to run. These are directed by the DD name of CEEOPTS to
Language Environment. The important settings are the reference to the file with the
environment variable settings (ENVAR) and the POSIX(ON) setting.
The settings to establish the environment are shown in Example 9-2. These are the LIBPATH
and the CLASSPATH environment variables we used. These options are stored in the
/u/hdm/sg7810/xsd2osr.env file. The necessary definitions can be found in the XML
System Services User’s Guide and Reference, SA23-1350.
Your working environment on z/OS UNIX might also need these path settings. A good
place to set the definitions is the .profile script.
2. When this batch job is started, a z/OS UNIX environment is created based on various
sources. One is the OMVS segment that is assigned to the user ID under which the job
runs. In our case this is HDM. When the job starts, it is positioned with a current
working directory of /u/hdm. Because we stored our XSD files in the directory
/u/hdm/sg7810, we also need to define the source directory of the XSD files. Otherwise they
would be assumed to be in HDM’s home directory. Example 9-3 shows the content of our
file xsd.lst.
3. As a final check, look at the directory /u/hdm/sg7810 to see that the OSR has been stored
correctly. See Example 9-5.
Example 9-6 on page 117 shows the necessary calls in an assembler program to do a
validating parse using the XML System Services. In this extract, we do not show the read
routines to do a get of the OSR and the XML string into storage. You see that there is only a
small difference between a simple parse and a validating parse.
In a COBOL or PL/I program, the sequence would be the same if using the XML System
Services parser through APIs.
For an example of a validating parse in COBOL using the XML PARSE statement, see
Enterprise COBOL for z/OS Version 4.2 Programming Guide, SC23-8529.
Example 9-7 Program flow for navigating XML parsed data stream
current_record = buffer_addr;
do while( current_record < buffer_current_addr );
bufptr = current_record;
current_record += bufptr->gxl_reclen;
end;
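The loop in Example 9-7 moves from record to record by adding each record's length to the current position. The following Python sketch illustrates the same idea on a plain byte buffer. The 8-byte header used here (2-byte record type, 2 bytes of flags, 4-byte big-endian total record length) and the type codes are assumptions made for illustration; the real record mapping is defined by the z/OS XML System Services header files.

```python
import struct

# Hypothetical record header: type (2 bytes), flags (2 bytes),
# total record length including the header (4 bytes), all big-endian.
HEADER = struct.Struct(">HHI")


def make_record(rec_type, payload):
    """Build one record: header followed by the payload bytes."""
    return HEADER.pack(rec_type, 0, HEADER.size + len(payload)) + payload


def walk_records(buffer, used_length):
    """Yield (type, payload) for each record in buffer[:used_length]."""
    offset = 0
    while offset < used_length:
        rec_type, _flags, rec_len = HEADER.unpack_from(buffer, offset)
        if rec_len < HEADER.size or offset + rec_len > used_length:
            raise ValueError(f"corrupt record length at offset {offset}")
        yield rec_type, buffer[offset + HEADER.size:offset + rec_len]
        offset += rec_len  # same step as current_record += gxl_reclen


# Made-up type codes standing in for start-element and character-data records.
START_ELEM, CHAR_DATA = 1, 2
buf = make_record(START_ELEM, b"first") + make_record(CHAR_DATA, b"value")
records = list(walk_records(buf, len(buf)))
```

The length check guards against running past the used part of the buffer, which is worth doing in any language when a record length comes from the data itself.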
With the selections you get a first frame for your program using the z/OS XML System
Services parser.
For a complete listing of the PL/I program, see Example B-2 on page 167.
From the z/OS XML System Services User’s Guide and Reference we get the following explanations:
1301, XRSN_BUFFER_INBUF_END: The end of the input buffer has been reached.
1303, XRSN_BUFFER_OUTBUF_END: The end of the output buffer has been reached.
1304, XRSN_BUFFER_INOUTBUF_END: The end of both buffers has been reached.
Example 9-8 shows how we have handled these three reason codes. The excerpt is taken
from the full example shown in Example B-2 on page 167. The only place where new input is
provided to the input buffer is with the read in the line marked 1.
For the most part, this excerpt corresponds to the flow chart in Figure 6-3 on page 72.
Example 9-8 Error checking for navigating XML parsed data stream
if return_code = 4 then
select( reason_code );
when( gxl_rsn_buffer_inbuf_end
,gxl_rsn_buffer_inoutbuf_end )
do; /* 1301, 1304 */
call read_xml( xmldocument ); 1
rc = gxl1trm( pima_addr,
return_code, reason_code );
In our small program we also counted the number of occurrences of each of these reason
codes. This might also be helpful in your environment, because it indicates how well the
sizes of the input and output buffers are chosen. The relation between the two sizes depends
on the individual XML document processed, so there is no general rule for sizing
them. A good starting point is 4 KB for the input buffer and 8 KB for the output buffer.
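To make the counting idea concrete, here is a minimal Python sketch of a driver loop that refills the input buffer, drains the output buffer, and tallies the three reason codes. The function fake_parse is a stand-in that only copies bytes; it is not the z/OS parser, and its return tuple is an assumption of this sketch. Unlike the real parser, it also reports 1301 on the final chunk instead of signalling a completed parse.

```python
from collections import Counter

INBUF_END, OUTBUF_END, INOUTBUF_END = 1301, 1303, 1304


def fake_parse(inbuf, out_free):
    """Stand-in parser: 'parses' by copying as many bytes as fit.

    Returns (return_code, reason_code, consumed, produced).
    """
    n = min(len(inbuf), out_free)
    in_done = n == len(inbuf)
    out_done = n == out_free
    if in_done and out_done:
        reason = INOUTBUF_END
    elif in_done:
        reason = INBUF_END
    else:
        reason = OUTBUF_END
    return 4, reason, n, n


def drive(document, in_size, out_size):
    """Feed document through fake_parse and count reason codes."""
    counts = Counter()
    result = bytearray()
    pos, inbuf, out_free = 0, b"", out_size
    while pos < len(document) or inbuf:
        if not inbuf:  # the only place where new input is provided
            inbuf = document[pos:pos + in_size]
            pos += len(inbuf)
        _rc, reason, consumed, produced = fake_parse(inbuf, out_free)
        result += inbuf[:produced]
        inbuf = inbuf[consumed:]
        out_free -= produced
        counts[reason] += 1
        if reason in (OUTBUF_END, INOUTBUF_END):
            out_free = out_size  # application has processed the output
    return bytes(result), counts


output, counts = drive(b"0123456789", in_size=4, out_size=8)
```

Comparing the counters for a few buffer-size combinations against representative documents gives a rough indication of which buffer is the bottleneck.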
The XML string is coded with codepage IBM-037. We pass this information as a parameter to
the parser. The buffer returned by the parser is shown in Example 9-10 on page 121.
At 2 we start with the listing of the elements contained in our string. Each starts with the
indicator _START_ELEM. Because several consecutive elements are nested
(<racfunload><user><name><first>), we find the first returned data at 3. It belongs to the
last tag started, which is <first> in our case.
It is the responsibility of your application to navigate through these records and move the
values into the right variable.
When you parse the XML document you might provide a StringIDTable exit. It is suggested to
use the same exit which you provided when the OSR was generated. This suggestion is
mainly due to the fact that the exits are then based on the same source. Further, you have
control over the assignment of a number (function) to a string, even across multiple schemas
(XSD files).
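The role of a string ID table can be sketched in a few lines of Python. This is only a conceptual model, not the exit interface: the class name, the method name, and the limits are hypothetical. It shows why sharing one table between OSR generation and parsing keeps the number assigned to a string stable, even when several schemas reuse the same names.

```python
class StringIdTable:
    """Assigns a stable numeric ID to each distinct string (hypothetical model)."""

    def __init__(self, first_id=1, max_ids=500):
        self._ids = {}
        self._next_id = first_id
        self._max_ids = max_ids

    def get_id(self, text):
        """Return the existing ID for text, or assign the next free one."""
        if text in self._ids:
            return self._ids[text]
        if len(self._ids) >= self._max_ids:
            raise OverflowError("string ID table is full")
        self._ids[text] = self._next_id
        self._next_id += 1
        return self._ids[text]


table = StringIdTable()
# Names seen while processing a first (made-up) schema ...
ids_a = [table.get_id(name) for name in ("user", "name", "first")]
# ... and a second schema that reuses one of them.
ids_b = [table.get_id(name) for name in ("name", "last")]
```

Because "name" was already known, it keeps ID 2 in the second run, and only "last" receives a new number.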
The parser does not provide an environment in which to run a program that uses Language
Environment services. However, C, C++, PL/I, and COBOL code on z/OS usually depends on
these services. One way to write a program without Language Environment services is to use Assembler.
Another way to achieve this is to use the Metal C version of a C program. The Metal C
(-qmetal) option has been provided with the XLC Compiler since z/OS V1R9. The compiler
generates code that does not have Language Environment run-time dependencies. So, with
one source written in C, it is possible to support both instances: the generation of the OSR
and the parsing.
The XLC C compiler also provides means to add Assembler instructions within the C code.
These instructions conform with the different linkage conventions at the time the parse takes
place. The linkage statements are placed between prolog and epilog statements, as shown
in Example 9-11 on page 124. Based on this it should be easy to create these two exits based
on the same source code. If necessary you can add more code in Assembler, but this is the
basic requirement.
The process to handle both types of sources is shown in Figure 9-1 on page 123. For our
purposes we copied the StringIDHandler exit example provided in
SYS1.SAMPLIB(GXLESTRI) into our z/OS UNIX environment. We renamed it to StrIDx.c. On
the left side of Figure 9-1 on page 123, the C code is compiled the usual way with the C
compiler. We do it from the z/OS UNIX command line. Also from a z/OS UNIX shell, we
executed an xlc command that generates a ‘bare metal C’ version of our StringID handler
exit. This happens on the right side of Figure 9-1 on page 123.
The process might also be executed in a batch environment using usual jobs.
Figure 9-1 Creating a string ID handler exit routine (input: C source StrIDx.c)
* Note: for display purposes only, the xlc command is split across lines
Note:
You need to use the XLC compiler interface. Otherwise you cannot get bare Metal C
output. In addition, with the c89 command you will get an error near line 300.
It is absolutely necessary to force XLC to include the header for the Metal C conversion
from /usr/include/metal.
We then copied the Assembler version of it (stridx.s) into a partitioned data set for the
following steps.
When this StringIDHandler exit routine is being used as an exit to the XML System Services
parser:
A prolog and an epilog are required. This is required to take care of the different linkage
conventions at parse time. Also needed in this example is the setup of a Dynamic Stack
Area (DSA). More details are below.
In the workarea, and immediately following the DSA/Stack space, is the storage that will
be mapped to the string ID table (XSI) as the exit structures it.
The DSA is used because this exit uses local variables. It would also be necessary if you
need to use other services from the C runtime library. For a complete list of services that
might be used in a bare metal C environment, see the z/OS Metal C Programming Guide and
Reference, SA23-2225.
At first we show the necessary elements that have to be added to the C program to make it
usable in both environments. Because the housekeeping (that is, the linkage conventions) is
different from that established by the C compiler, we need to add the necessary assembler
instructions. Example 9-11 shows this detail.
Here we show how to establish addressability of the small DSA we need. It is conditionally
implemented using the define __IBM_METAL__ 1. See Example 9-12. For the Metal C
environment the code following is included, otherwise just the XSI is addressed from the
system services parameter area.
Note: The system services parameter is used as work area for the StringIDHandler exit.
For more information about the use of Metal C, see z/OS Metal C Programming Guide and
Reference, SA23-2225.
Counter
StringIDHandler
In our PL/I example we used the fetch 1 built-in function to load the service from steplib and
to get the entry point address at the same time. The StringIDHandler exit is contained in load
module STRIDX. See Example 9-13.
The address of this vector is passed to the initialization routine gxlpinit(), GXL1INI, or
GXL4INI depending on the language or addressing mode you use. See Example 9-14.
Example 9-14 Passing system services parameter to the z/OS XML parser
rc = gxl1ini( pima_addr, pima_len,
ccsid, features,
/* sysnull(), sysnull(), */
xsv_p, addrdata(xsi_p) ,
return_code, reason_code );
Note: When you create a separate load module for your exit, such as to fetch this from a
library, you need to provide the correct entry point.
Note: An exit written in Assembler does not need a DSA. But in case it needs work
space, you can do it the same way as described here. This would keep your exit free of
further storage handling.
Space for the work area. If you have prepared a StringIDTable when the OSR was
generated you should provide the data from that table to the exit. Remember to take care
of location dependent data. In case you have not generated a StringIDTable earlier, the
exit always will create a fresh one.
With these preparations we were able to get this StringIDHandler exit working. Example 9-16
shows the output for the same XML document we used in Example 9-10 on page 121.
With this StringIDHandler exit running, the output buffer from the parser now contains a
number instead of the name of the element. According to the list in Example 9-17 the string
ID 15 (1) is returned for the element name of uid.
The system service vector is set up as described in “How to populate the System Service
Vector” on page 125.
As this exit builds up a tree structure more data areas are needed. Unfortunately, the example
does not apply the necessary structure by itself. It must be provided by the application. As in
the previous example, this area is then passed as the system services parameter.
Header information
ID list area
Dynamic Area
Free Area
In Example 9-18 we show the implementation using a PL/I structure. The data area needs to
be allocated on a fullword boundary. At 0 we set some constants:
Maximum symbol size
Initial string ID number
Maximum IDs expected
In Example 9-18, total space, and free space 1 are set to the size of this structure. The
necessary pointers are stored as shown at 2.
Example 9-18 PL/I structure used as system services parameter for GXLE1IDI
dcl 1 xsi aligned ,
2 xsi_eye fixed bin (63) /* just 8 bytes 4 the eye */
/* will be set to XSIEYECA*/
init(1) , /* 00 */
/* 00 */
2 xsi_sym_max_size fixed bin (31) init(64) , 0 /* 08 */
2 xsi_diag_code fixed bin (31) , /* 0C */
2 xsi_next_id fixed bin (31) init(1) , 0 /* 10 */
2 xsi_max_id fixed bin (31) init(500) , 0 /* 14 */
2 xsi_total_size fixed bin (31) 1 , /* 18 */
2 xsi_free_space fixed bin (31) 1 , /* 1C */
2 xsi_curr_null fixed bin (31) init(0) , /* 20 */
2 xsi_curr_free pointer , /* 24 */
2 xsi_tree_null fixed bin (31) init(0) , /* 28 */
2 xsi_tree_head fixed bin (31) INIT(0) , /* 2C */
2 xsi_dyn_null fixed bin (31) init(0) , /* 30 */
2 xsi_dyn_area31 pointer 2 , /* 34 */
2 xsi_list_null fixed bin(31) ,
2 xsi_list_ptr pointer 2 ,
2 xsi_id_list (16*1024) fixed bin (31) , /* 00 */
2 xsi_dyn_area(16*1024) fixed bin (31) , /* 00 */
We provided the entry address of our exit through a fetch instruction as we did in the previous
example.
Running our little parser application with this exit gives the same output as with the previous
example. We show a small excerpt from our output listing in Example 9-19. At 1 the element
first is returned as an identifier with a value of 9.
1 3 ***xml
2 36 ***http://www.w3.org/XML/1998/namespace
3 5 ***xmlns
4 29 ***http://www.w3.org/2000/xmlns/
5 5 ***space
6 10 ***racfunload
7 4 ***user
8 4 ***name
9 5 ***first 1
10 4 ***last
11 3 ***tso
Note: Because it is a good idea to use the same exit at the time when the OSR is
generated and when the parse takes place, we need to point out that this example does
not provide an easy means to save the table and reload it at parse time. This is due to the
fact that much of the control information kept in the system services parameter area is not
relocatable. Nevertheless, the example provides for an efficient search and insert strategy.
rc=00000000 rsn=00000000 1
strIDTbl_l =00003d8a 2
strIDTbl_p at 19f7e03c 3
XSTR_TBL_NUMSTR 0000018f 4
XSTR_TBL_STRBUFFOFFSET 00002588 5
stepp at 19F7E04C 6
XSTR_TBLENTRY_STRID 00000001 _STRLEN 00000005 _OFFSET 00000000
string xmlns
stepp at 19F7E064 6
XSTR_TBLENTRY_STRID 00000002 _STRLEN 00000003 _OFFSET 00000006
string xml
stepp at 19F7E07C 6
XSTR_TBLENTRY_STRID 00000003 _STRLEN 0000001D _OFFSET 0000000A
Both kinds of interface routines provide 64-bit support. That is, if your program can run in
a 64-bit environment or is designed to run with 64-bit addressing, the functions provided
by XML System Services can be used. Currently this is only possible from C/C++ programs or
from Assembler programs.
The service routines are an intrinsic part of z/OS. Like many other services, they are loaded
at IPL time and can be addressed through the Communication Vector Table (CVT). You can
address each of these routines with quite simple declarations.
Example 9-21 shows how to call the XML System Services from a PL/I program.
Example 9-21 Call gxl1ini from a PL/I program going through the CVT
dcl cvt pointer based( ptrvalue(16) );
dcl cvtcsrt pointer based( ptradd(cvt,544) );
dcl csrt(19) pointer based( cvtcsrt );
dcl
gxl1ini entry(
pointer byvalue, /* parse instance memory area (pima) */
fixed bin(31) byaddr, /* pima length */
fixed bin(31) byaddr, /* ccsid of document */
fixed bin(31) byaddr, /* parse feature flags*/
pointer byaddr, /* vector of system service routines */
pointer byaddr, /* system service routine parameter */
fixed bin(31) byaddr, /* return code */
fixed bin(31) byaddr /* reason code */
)
limited
based( ptradd(csrt(19),16) )
returns( fixed bin(31) byvalue);
For each of these services there are also stubs available in SYS1.CSSLIB. Accessing the
XML System Services through these stubs might better fit the rules in your shop.
You then need to include SYS1.CSSLIB in the SYSLIB concatenation of the step in which
your program is bound. The Call 2 statement triggers an
indication for the Binder to include this service from SYS1.CSSLIB. See Example 9-22.
For PL/I the ’retcode’ 1 in the options() part of the declaration is the indicator to later have the
return code available. According to the linkage conventions, it is returned in register 15.
We used the compile, bind, and go procedure IBMZCBG as it is provided by IBM. See
Example 9-23, in which SYS1.CSSLIB is added using a JCL override 1.
Example 9-23 PL/I compile, bind and go with modification to include SYS1.CSSLIB
//GXLCLG4 JOB (999,POK),NOTIFY=HDM,REGION=0M,
// CLASS=A,MSGCLASS=T,MSGLEVEL=(1,1)
/*JOBPARM SYSAFF=SC80 << keep it on our test system
/* procedure to compile, bind and execute a pl/i program
//*PROC JCLLIB ORDER=(IBMZ.SIBMZPRC)
//G EXEC IBMZCBG,LNGPRFX='IBMZ',
// PARM.PLI='SOURCE,OPTIONS,OBJECT,NEST,XREF',
// PARM.BIND='XREF,COMPAT=MIN,LIST=ALL,MAP'
//PLI.SYSIN DD DISP=SHR,DSN=HDM.SG7810.SOURCE(GXL4)
//BIND.SYSLIB DD
// DD DISP=SHR,DSN=SYS1.CSSLIB 1
//GO.XMLIN DD DISP=SHR,DSN=HDM.SG7810.XML0100K
For examples of COBOL and C programs invoking z/OS XML System Services, see B.4,
“Enterprise COBOL program to query XML document declaration” on page 180 and B.5, “C program
to query XML document declaration” on page 183.
All header files can be found in /usr/include in the z/OS UNIX file system. From there
they can also be included in a batch job that compiles a program using XML System
Services. The headers are also stored in SYS1.SIEAHDRV.H and can be accessed from
there.
C/C++ programs using XML System Services need to run with the Language Environment
option XPLINK turned on. This can be achieved by exporting the environment variable
_CEE_RUNOPTS="XPLINK(ON)". Another way to turn this option on is to add a #pragma
option to the C/C++ program. See Example 9-24 on page 134.
#include <gxlhosrg.h>
#include <gxlhxec.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
A C/C++ program needs to be compiled with the dll option turned on. Further, to link with
routines provided by XML System Services you need to include these with the DLL’s definition
side deck files (.x) from /usr/lib. See Example 9-25.
Example 9-25 Command to compile and link a C/C++ program from the z/OS UNIX shell
HDM § SC80:/u/hdm/sg7810>cat mk
c89 -o gstridtab -Wc,dll gstridtab.c /usr/lib/gxlxxml1.x /usr/lib/gxlxosr1.x
HDM § SC80:/u/hdm/sg7810>
From /usr/lib they are also accessible for batch jobs to compile and bind a C/C++ program.
If you prefer to use a partitioned data set, the side deck files are stored in SYS1.SIEASID.
The DLLs can be found in /usr/lib. They are also contained in SYS1.SIEALNKE.
10.2 XPLINK
Extra Performance Linkage (XPLINK) is a z/OS feature that provides high performance
subroutine call and return mechanisms. This results in short and highly optimized execution
path lengths.
Object oriented programming is built upon the concept of sending messages to objects that
result in the object performing some actions. The message sending activity is implemented as
a subroutine invocation. Subroutines, known as member functions in C++ terminology, are
normally small pieces of code. The characteristic execution flow of a typical C++ program
consists of many subroutine invocations to small pieces of code. Programs of this nature
benefit from the XPLINK optimization technology.
MVS has a standard subroutine calling convention which can be traced back to the early days
of System/360. This convention was optimized for an environment in which subroutines were
more complex, there were relatively few of them, and they were invoked relatively infrequently.
Object oriented programming conventions have changed this. Subroutines have become
simpler but they are numerous, and the frequency of subroutine invocations has increased
by orders of magnitude. This change in the size, numbers, and usage pattern of subroutines
made it desirable that the system overhead involved be optimized. XPLINK is the result of this
optimization.
To avoid performance penalties from swapping between XPLINK and non-XPLINK
environments, you must be careful to bind your application to the correct environment.
If you call z/OS XML System Services directly and not through the toolkit, the number of
environment swaps will be relatively small and the effect on the overall performance should
be minimal.
10.3.1 zAAP
The IBM System z Application Assist Processor (zAAP) is available on all IBM System z10™,
IBM System z9®, IBM eServer™ zSeries® 990 (z990), and IBM eServer zSeries 890 (z890)
systems. The zAAP specialty engine provides an attractively priced execution environment for
new Web-based applications and SOA-based technologies, such as:
Java
For customers who desire the powerful integration advantages and traditional qualities of
service of the IBM mainframe platform.
XML
For customers who desire cost effective XML parsing services on z/OS, z/OS XML
System Services, when running in task (TCB) mode, can exploit the zAAP for eligible XML
workloads.
In addition, the IBM XML Toolkit for z/OS V1.9 was enhanced so eligible workloads can use
z/OS XML System Services non-validating parsing. This means eligible XML Toolkit
processing (for non-validating parse requests) can exploit the zAAP and also obtain improved
performance. This function is available on the XML Toolkit for z/OS V1.9 with PTFs UA40707
and UA40708.
With XML Toolkit for z/OS V1.10 you can perform validating parsing using z/OS XML
System Services as the underlying parser and gain the zAAP offload advantages. Validating
parsing using z/OS XML System Services is different from validating parsing using the XML
Toolkit for z/OS alone.
IBM Enterprise PL/I V3.8 was enhanced with a new XML parse subroutine, PLISAXC, which
allows the optional use of z/OS XML System Services and the zAAP, when present.
10.3.2 zIIP
The IBM System z Integrated Information Processor (zIIP) is available on all System z10 and
System z9 servers. It is designed to help free-up general computing capacity and lower
overall total cost of computing for select data and transaction processing workloads for
business intelligence (BI), ERP and CRM, and select network encryption workloads on the
mainframe. When executed in SRB mode, XML System Services for both validating and
non-validating parsing is offloaded to the zIIP, when present.
Figure 10-1 summarizes the circumstances under which a zIIP or a zAAP can process an
XML workload.
Workload | Execution mode | Specialty engine | Eligible portion | Prerequisites
Any software using z/OS XML System Services non-validating parsing | SRB | zIIP | 100% of z/OS XML System Services parsing eligible for zIIP | z/OS 1.7; DB2 V9 New Function Mode
Any software using z/OS XML System Services validating parsing | TCB | zAAP | 100% of z/OS XML System Services parsing eligible for zAAP | z/OS 1.9; XML Toolkit for z/OS V1.10; Enterprise COBOL V4.2; DB2 V9 New Function Mode
Any software using z/OS XML System Services validating parsing | SRB | zIIP | 100% of z/OS XML System Services parsing eligible for zIIP | z/OS 1.9; DB2 V9 New Function Mode
Applications using Java-based XML parser in IBM SDK | TCB | zAAP | 100% of Java-based XML parsing eligible for zAAP | Any z/OS; System z processor with zAAP support
Any software performing XML parsing/processing in Java
The next topics give you an overview of how to optimize the infrastructure to get maximum
benefit from your System z installation by the use of the zAAP specialty engine.
(Figure labels: COBOL Application, XML Toolkit)
If you use the built-in parser in COBOL to process XML documents, you must migrate to
COBOL compiler version 4.1 to take advantage of the zAAP offload engines.
Starting with CICS TS 4.1, CICS Web Services uses XML System Services as the parsing
mechanism for some of its processing. This parsing is offloaded to a zAAP if one is available.
CICS TS 3.2 and earlier versions do not have this enhancement.
If you call XML Toolkit for z/OS from a COBOL application you have two options:
Using z/OS specific classes or standard classes
XPLINK or non-XPLINK
If you have access to an offload engine, choose the z/OS-specific classes to gain the lower
CPU cost on the zAAP processor.
Use the non-XPLINK versions of the parser library to eliminate the overhead from
environment swapping.
10.5.2 PL/I
As in COBOL, you have a number of choices when handling XML documents in a PL/I
application. Figure 10-3 on page 141 outlines the main possibilities.
The more current your IBM software inventory, the more you can take advantage of the zAAP
offload engines.
Enterprise PL/I for z/OS V3R8 provides the option to use z/OS XML System Services as the
underlying parsing technology and thereby enables you to offload the parsing process to the
zAAP specialty engine.
CICS TS 4.1 is required to offload CICS Web Services XML parsing work to a zAAP
processor.
The options when calling the Toolkit from a PL/I application are similar to COBOL:
Using z/OS-specific classes or standard classes
XPLINK or non-XPLINK
If you have access to an offload engine, choose the z/OS-specific classes to gain the lower
CPU cost on the zAAP processor.
Use the non-XPLINK versions of the parser library to eliminate the overhead from
environment swapping.
Services directly. C or C++ programs running in TCB mode and directly invoking XML System
Services are eligible for offload to the zAAP processor. Such programs running in SRB mode
are eligible for offload to the zIIP processor.
(Figure labels: C/C++ Application, XML Toolkit)
10.5.4 Assembler
The interface to the XML Toolkit is C++, so an Assembler application (see Figure 10-5) must
use Language Environment bindings to call the C++ API. Assembler programs running in
TCB mode and directly invoking XML System Services are eligible for offload to the zAAP
processor. Such programs running in SRB mode are eligible for offload to the zIIP processor.
(Figure 10-5 labels: Assembler Application, XML Toolkit)
(Figure labels: Java Application, Native XML, Java classes, General Purpose CPU, zAAP CPU)
11.1.1 Definitions
We start with some basic definitions.
Character
A character is an “atomic” symbol used in a language.
Character set
A character set is a collection of all characters that comprise a given language. The language
might be a conventional human language or an unconventional language such as musical
symbols, Morse code, Braille, mathematical symbols, and so forth.
Character encoding
Character encoding is a scheme that maps a character to bytes on a computer. For example,
the ISO 8859 series of standards (ISO 8859-1, ISO 8859-2, and so forth; see footnote 1) uses
8 bits for encoding characters and is popular because it is usually sufficient for English and
other Latin-based languages.
ASCII
ASCII stands for American Standard Code for Information Interchange. It is a 7-bit character
encoding scheme based on the ordering of the English alphabet. It was the most widely used
encoding scheme in computers until it was surpassed by UTF-8.
1 ISO/IEC 8859 is a joint ISO and IEC series of standards for 8-bit character encoding. The series of standards
consists of numbered parts, such as ISO/IEC 8859-1, ISO/IEC 8859-2, and so forth. There are 15 parts, excluding
the abandoned ISO/IEC 8859-12. The ISO working group maintaining this series of standards has been
disbanded.
ISO/IEC 8859 parts 1, 2, 3, and 4 were originally Ecma International standard ECMA-94.
In June 2004, the ISO/IEC working group responsible for maintaining eight-bit coded character sets disbanded and
ceased all maintenance of the ISO/IEC 8859 series. In the area of character encoding, ISO now concentrates on
the Universal Character Set (ISO/IEC 10646).
Unicode
Unicode is a universal computing industry standard that codifies characters of most
languages used with computers. The various encodings of the character sets in Unicode are
referred to as Unicode Transformation Formats (UTF), for example, UTF-16, UTF-8, and
UTF-EBCDIC.
Whitespace
The term whitespace refers to characters that do not have a visual mark when displayed but
still occupy space or memory. The most common example is the space character (Unicode
U+0020, EBCDIC x’40’, and so forth). Other examples are carriage return (CR), line feed
(LF), and horizontal tab. The XML 1.0 specification allows usage of the following
whitespace characters:
HORIZONTAL TAB (U+0009)
LINE FEED (U+000A)
CARRIAGE RETURN (U+000D)
SPACE (U+0020)
Note:
The most common end of line character on z/OS is the newline (NEL) character. It is
x’15’ in EBCDIC and x’85’ in Unicode. For example, on z/OS, the \n string in C
converts to NEL and is often inserted in byte-oriented file systems such as
Hierarchical File System (HFS).
The XML 1.1 specification supports NEL and the z/OS XML System Services parser
is compliant.
The XML Parser, C++ Edition accepts documents with NEL as line termination
characters. They are normalized to LF by the parser.
When using z/OS XML System Services, the NEL will be normalized to EBCDIC NL
characters.
NEL is not allowed in the XML 1.0 recommendation, but is nevertheless supported
in the toolkit and XMLSS.
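The whitespace and NEL rules above can be checked with a short Python sketch. The helper names are hypothetical, and the NEL-to-LF replacement only mimics the normalization described for the XML Parser, C++ Edition; it is not how either parser is implemented. The sketch relies on Python's cp037 codec, which maps EBCDIC x'15' to U+0085 and x'40' to SPACE.

```python
# The four whitespace characters permitted by XML 1.0.
XML10_WHITESPACE = frozenset("\t\n\r ")
NEL = "\u0085"  # newline character, x'15' in EBCDIC code page IBM-037


def is_xml10_whitespace(ch):
    """True if ch is one of TAB, LF, CR, or SPACE."""
    return ch in XML10_WHITESPACE


def normalize_nel(text):
    """Replace every NEL with LINE FEED (illustrative normalization)."""
    return text.replace(NEL, "\n")


# A space followed by a z/OS end-of-line character, as EBCDIC bytes.
ebcdic_bytes = b"\x40\x15"
decoded = ebcdic_bytes.decode("cp037")
normalized = normalize_nel(decoded)
```

After decoding, the second character is U+0085, which is not XML 1.0 whitespace until it has been normalized to LF.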
BOM
BOM (Byte Order Mark) has the Unicode code point U+FEFF and is also referred to as
Zero-Width No-Break Space. This character is used to denote the endianness of text
encoded in either UTF-16 or UTF-32. The BOM is placed as the first character to indicate
the endianness of the file or character stream. In UTF-16, it appears as x’FEFF’ to indicate
big endianness and as x’FFFE’ to indicate little endianness.
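A BOM check amounts to comparing the first bytes of the stream with a few known signatures. The following Python sketch uses the BOM constants from the standard codecs module; the function name detect_bom is hypothetical, and the UTF-8 signature is included only for completeness. A UTF-16LE document whose first character after the BOM is U+0000 would be misread as UTF-32LE by this simple check.

```python
import codecs

# Longer signatures first: the UTF-32LE BOM (FF FE 00 00) begins with
# the same two bytes as the UTF-16LE BOM (FF FE).
_BOMS = (
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
)


def detect_bom(data):
    """Return (encoding name, BOM length) or (None, 0) if no BOM is present."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


big_endian = detect_bom(b"\xfe\xff\x00<\x00a")
little_endian = detect_bom(b"\xff\xfe<\x00a\x00")
```

The returned length tells the caller how many bytes to skip before handing the remaining data to a decoder.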
An XML document can be encoded in any scheme. What is required is that all parties creating
and consuming the XML document agree upon the names used in the encoding=
declaration of the document. To ensure that a common set of encoding names is used, consider
using the character sets described by IANA (Internet Assigned Numbers Authority; see footnote 2).
UTF-8
UTF-8 (8-bit UCS/Unicode Transformation Format) is a multi-byte encoding scheme for
Unicode that can represent any character in the Unicode character set. It provides backward
compatibility with ASCII. The CCSID for UTF-8 is 1208.
UTF-8 is also a variable length character encoding scheme. Some characters require only
one byte, some two, some three or more. UTF-8 is backward compatible with ASCII in that the
first 128 characters (code points 0 through 127) are mapped identically to ASCII.
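The variable length of UTF-8 is easy to observe by encoding a few characters and counting the resulting bytes. This Python sketch uses arbitrarily chosen sample characters; the helper name utf8_length is hypothetical.

```python
def utf8_length(ch):
    """Number of bytes UTF-8 needs for a single character."""
    return len(ch.encode("utf-8"))


# Sample characters from different Unicode ranges:
# LATIN CAPITAL LETTER A, LATIN SMALL LETTER E WITH ACUTE,
# EURO SIGN, MUSICAL SYMBOL G CLEF.
samples = ["A", "\u00e9", "\u20ac", "\U0001d11e"]
lengths = [utf8_length(ch) for ch in samples]

# Pure ASCII text produces identical bytes in ASCII and UTF-8.
declaration = '<?xml version="1.0"?>'
ascii_compatible = declaration.encode("utf-8") == declaration.encode("ascii")
```

A consequence for buffer sizing is that the byte length of a UTF-8 document can be up to four times its character count.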
UTF-16BE
UTF-16 (16-bit UCS/Unicode Transformation Format) is a variable length encoding scheme
for Unicode that maps each character to one or two 16-bit words. The CCSID for UTF-16 is
1200.
UTF-16BE (Big Endian) is one of two UTF-16 encoding schemes. UTF-16BE and UTF-16LE
(little endian) differ only in the order of the bytes for a character. UTF-16BE is the preferred
encoding for System z. UTF-16LE can be converted to UTF-16BE when required.
Important: After doing this conversion, be sure to update the encoding= statement in
the new document to reflect the new encoding.
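The conversion and the declaration update can be done in one pass. The following Python sketch is one possible approach, not a z/OS utility: the function name is hypothetical, the regular expression handles only a simple XML declaration, and writing "UTF-16BE" as the new encoding name is an assumption that should be checked against what the consuming parser accepts.

```python
import re

# Matches the encoding pseudo-attribute inside the XML declaration.
_ENCODING_DECL = re.compile(r"""(<\?xml[^>]*?\bencoding\s*=\s*)(["'])[^"']*\2""")


def utf16le_to_utf16be(data, new_name="UTF-16BE"):
    """Re-encode UTF-16LE bytes as UTF-16BE and update encoding=."""
    text = data.decode("utf-16-le")
    text = text.lstrip("\ufeff")  # drop a leading BOM, if any

    def fix_declaration(match):
        prefix, quote = match.group(1), match.group(2)
        return f"{prefix}{quote}{new_name}{quote}"

    text = _ENCODING_DECL.sub(fix_declaration, text, count=1)
    return text.encode("utf-16-be")


source = '<?xml version="1.0" encoding="UTF-16LE"?><a/>'.encode("utf-16-le")
converted = utf16le_to_utf16be(source)
```

Documents without an encoding declaration pass through unchanged apart from the byte order.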
The XML Toolkit will always convert XML documents into UTF-16BE before beginning the
parsing.
The XML Toolkit will return parsed data in UTF-16BE regardless of the encoding of the
input data stream.
z/OS XML System Services expects the XML document to be in one of the supported
CCSIDs listed in Appendix A, “Supported character encoding” on page 163. The output
parsed data stream is in the same CCSID as the one passed. To determine the CCSID of the
XML document, the assembler API GXL1QXD (31-bit) or GXL4QXD (64-bit) can be called. For
more information see z/OS XML System Services User’s Guide and Reference,
SA23-1350.
With the XMLPARSE(XMLSS) option in effect, the encoding might also be determined with
the optional ENCODING phrase in the XML PARSE statement. When XMLPARSE(COMPAT)
is in effect, the encoding might also be determined by inspecting the first few bytes of the
document or the encoding declaration in the XML document.
Encoding overrides
When the XMLPARSE(XMLSS) compiler option is in effect, some items might not be in effect
or might be superseded by something else. These are listed:
Any encoding declaration in the XML document is ignored.
When the XML document is in a national data item, then the encoding is determined as
UTF-16BE (CCSID 1200). Therefore, the optional ENCODING phrase in the XML PARSE
statement must be omitted or be specified with the value 1200. Also, the CODEPAGE
compiler option is ignored in this case.
If the XML document is in an alphanumeric data item, then the CCSID value mentioned in
the optional ENCODING phrase of the XML PARSE statement takes precedence over the
CODEPAGE compiler option.
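The override rules listed above can be summarized as a small decision function. This Python sketch is only a model of the rules as stated in this section; the function name, the string values used for the data item type, and the use of an exception for an invalid ENCODING phrase are all assumptions of the sketch, not compiler behavior.

```python
UTF16_CCSID = 1200


def effective_ccsid(item_type, encoding_phrase, codepage_option):
    """Model of the CCSID used for XML PARSE under XMLPARSE(XMLSS).

    item_type       -- "national" or "alphanumeric" (hypothetical labels)
    encoding_phrase -- CCSID from the ENCODING phrase, or None if omitted
    codepage_option -- CCSID from the CODEPAGE compiler option
    """
    if item_type == "national":
        if encoding_phrase not in (None, UTF16_CCSID):
            raise ValueError("ENCODING must be omitted or 1200 for national items")
        return UTF16_CCSID  # CODEPAGE is ignored in this case
    if item_type == "alphanumeric":
        # The ENCODING phrase, when present, overrides CODEPAGE.
        return encoding_phrase if encoding_phrase is not None else codepage_option
    raise ValueError(f"unknown data item type: {item_type}")


national = effective_ccsid("national", None, 37)
overridden = effective_ccsid("alphanumeric", 1208, 37)
defaulted = effective_ccsid("alphanumeric", None, 37)
```

In all three cases any encoding declaration inside the document plays no part, which matches the first rule in the list.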
Generating XML
The option for generation of an XML document is also available in z/OS Enterprise COBOL
V3R3 with the XML GENERATE statement. The encoding considerations when generating an
XML document are as follows:
When the optional ENCODING phrase is omitted, the encoding of the generated XML
document is determined by the data type of the receiving item. If the receiving item is
alphanumeric, then the encoding is as mentioned in CODEPAGE compiler option. If the
receiving item is of national type, then the encoding is UTF-16BE (CCSID 1200).
When the optional ENCODING phrase is specified, then it must be one of the following
choices:
– 1200 (UTF-16BE) if the receiving item is of type national.
– 1208 (UTF-8) if the receiving item is of type alphanumeric.
With the XMLPARSE(COMPAT) compiler option, an exception is raised if the XML
declaration does not begin at the first byte. Thus, a BOM is not accepted for parsing.
For more details on handling XML data in COBOL, refer to Enterprise COBOL for z/OS
Language Reference Version 4 Release 1, SC23-8528, and Enterprise COBOL for z/OS
Programming Guide Version 4 Release 1, SC23-8529.
Determining encoding
Broadly speaking, the routines determine the encoding of the XML document by looking
in three places. First, the first few bytes of the document are inspected to identify the basic
encoding. Second, the routines look for the encoding attribute in the XML declaration, if
specified. Third, the PLISAX call is examined to determine the code page value. If it is
omitted, the default or specified value of the CODEPAGE compiler option is used.
For more information about these routines, see Enterprise PL/I for z/OS Programming Guide,
SC27-1457.
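The first step, inspecting the leading bytes, follows the general approach described in Appendix F of the XML 1.0 recommendation. The following C sketch illustrates that general technique only; it is not the logic of the PL/I routines. The names xml_family and sniff_xml_family are hypothetical, and the sketch assumes that the document starts with an XML declaration and carries no BOM.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
enum xml_family {
    FAMILY_UNKNOWN,
    FAMILY_ASCII,     /* UTF-8, ISO 8859-x, and other ASCII-compatible encodings */
    FAMILY_EBCDIC,
    FAMILY_UTF16BE,
    FAMILY_UTF16LE
};

/*
 * Guess the encoding family from the first four bytes of the document,
 * which spell "<?xm" (or "<?" in UTF-16) when an XML declaration is present.
 */
enum xml_family sniff_xml_family(const unsigned char *doc, size_t len)
{
    static const unsigned char ascii[4]   = { 0x3C, 0x3F, 0x78, 0x6D };
    static const unsigned char ebcdic[4]  = { 0x4C, 0x6F, 0xA7, 0x94 };
    static const unsigned char utf16be[4] = { 0x00, 0x3C, 0x00, 0x3F };
    static const unsigned char utf16le[4] = { 0x3C, 0x00, 0x3F, 0x00 };

    if (doc == NULL || len < 4)
        return FAMILY_UNKNOWN;
    if (memcmp(doc, ascii, 4) == 0)
        return FAMILY_ASCII;
    if (memcmp(doc, ebcdic, 4) == 0)
        return FAMILY_EBCDIC;
    if (memcmp(doc, utf16be, 4) == 0)
        return FAMILY_UTF16BE;
    if (memcmp(doc, utf16le, 4) == 0)
        return FAMILY_UTF16LE;
    return FAMILY_UNKNOWN;
}
```

The result identifies only the family. The exact code page within the family still has to come from the encoding attribute, the PLISAX call, or the CODEPAGE compiler option, as described above.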
Encoding overrides
If the code page value is omitted from the PLISAX call, the value specified (or
obtained as a default) in the CODEPAGE compiler option is in effect.
Generating XML
These routines do not support generation of XML. To generate XML, the built-in function
XMLCHAR must be used.
BOM
The BOM is neither inserted by the XMLCHAR built-in function nor is it honored by the
routines for parsing.
Before the sample applications are described, the XML files being used are shown
throughout this section. These XML files have been saved in a z/OS UNIX file system for
convenience. These files could have been saved in traditional MVS datasets as well.
In z/OS XML System Services, the GXL1QXD (GXL4QXD) service will report the XML
declaration in the document. This report will be based on the structure of GXLYQXD. B.5, “C
program to query XML document declaration” on page 183 shows the COBOL source for
calling this service. Each of the samples in this section shows an XML document and the report
created by calling the service. The report has been split into two parts: the first part does
not report the XML flags, whereas the second part reports them in hexadecimal display. The
first part also shows the length of the document read, which is not a part of the output structure
but is emitted by the COBOL program. Refer to z/OS XML System Services User’s Guide and
Reference, SA23-1350, to learn more about the service and the output structure.
If this XML document were to be received onto a PC using FTP (with or without the binary
transfer option) and viewed in an editor that shows all characters (such as a hex editor),
the line termination characters would be seen as x’0A’ with translation and x’15’
without translation.
When this file is passed as input to the COBOL program calling the GXL1QXD service,
Example 11-3 and Example 11-4 show the results obtained.
Example 11-3 Result of GXL1QXD service for IBM01141 encoded XML document
Buffer Length : 000000226
QXD Version : 1
XML Autodet Value : 8
XML Autodet CCSID : 37
XML Version : 1
XML Release : 0
XML Spec CCSID : 1141
XML Reserved : 0
XML Decl Length : 58
Example 11-4 Result (XML flags) of GXL1QXD service for IBM01141 encoded XML document
XML Flags 1 : Ø
EDD4C988A4F44444447484
743063172010000000A000
----------------------
XML Flags 2 : Ö
EDD4C988A4F444444474E4
743063172020000000A000
Because UTF-8 is available in the Windows and Linux environments, you can use simple editors on
these platforms. In Windows Notepad, this can be done by selecting UTF-8 as the Encoding
option in the “Save As” dialog box as shown in Figure 11-1.
A file encoded in UTF-8 can also be created on System z using one of the following
options:
Creating a file in EBCDIC and using iconv() for the conversion.
Calling z/OS Unicode Services for conversion. See z/OS Support for Unicode: Using Unicode
Services, SA22-7649.
Using a programming language such as COBOL, PL/I, or C/C++ and performing the translation
with the services provided by that language.
Saving a file in HFS with the FILETAG attribute set and automatic conversion turned on (see
11.4.6, “Saving ASCII files on System z” on page 161).
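The first option, conversion with iconv(), can be sketched in C as follows. This is a minimal sketch rather than a complete utility: convert_buffer is a hypothetical helper name, the whole input is assumed to fit in memory, and the codeset names accepted by iconv_open() vary by platform (names of the form IBM-1047 on z/OS are an assumption to verify against the local system).

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Convert in[0..in_len) from codeset `from` to codeset `to`, writing the
 * result into out. Returns the number of bytes written, or (size_t)-1 if
 * the conversion is not available, the input is invalid, or out is too small.
 */
size_t convert_buffer(const char *from, const char *to,
                      const char *in, size_t in_len,
                      char *out, size_t out_cap)
{
    /* Note the argument order: target codeset first, source second. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *src = (char *)in;   /* iconv() takes char **, but only reads the input */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_cap;

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    if (rc != (size_t)-1) {
        /* Flush any pending shift sequence for stateful encodings. */
        rc = iconv(cd, NULL, NULL, &dst, &dst_left);
    }
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    return out_cap - dst_left;
}
```

If the document carries an encoding attribute in its XML declaration, that attribute has to be updated separately; the conversion itself only changes the bytes.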
Figure 11-1 Windows Notepad dialog box with encoding option highlighted
This file should then be transferred to z/OS UNIX (for instance, $HOME/xml/sample-utf8.xml)
with the transfer type set to binary. When this file is passed as input to the COBOL program
calling the GXL1QXD service, Example 11-6 and Example 11-7 show the results obtained.
Example 11-6 Result of GXL1QXD service for UTF-8 encoded XML document
Buffer Length : 000000254
QXD Version : 1
XML Autodet Value : 7
XML Autodet CCSID : 1208
XML Version : 1
XML Release : 0
XML Spec CCSID : 1208
XML Reserved : 0
XML Decl Length : 58
Example 11-7 Result (XML flags) of GXL1QXD service for UTF-8 encoded XML document
XML Flags 1 : ä
EDD4C988A4F444444474C
743063172010000000A00
---------------------
XML Flags 2 : Ö
EDD4C988A4F444444474E
743063172020000000A00
3 For a UTF-8 file, a BOM is not necessary. The Unicode standard allows for the BOM but does not recommend it. See
page 36, Chapter 2, in version 5.0.0 of the Unicode Standard:
http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf
Example 11-9 Result of GXL1QXD service for UTF-16BE encoded XML document
Buffer Length : 000000504
QXD Version : 1
XML Autodet Value : 5
XML Autodet CCSID : 1200
XML Version : 1
XML Release : 0
XML Spec CCSID : 1200
XML Reserved : 0
XML Decl Length : 118
Note the length of the buffer above. The XML document used in this case is the same as in
Example 11-5 except for the value of the encoding attribute. For the document with UTF-8
encoding, we had 251 bytes for the characters and three bytes for the BOM. The 251 bytes
include 8 bytes for line feed characters and 3 bytes more for special characters which
occupied two bytes each. (Recall that UTF-8 is a variable-length encoding scheme.) Thus,
multiplying 251 bytes by 2 for the UTF-16 representation and adding two more bytes for the BOM, we
have a total of 504 bytes.
Example 11-10 Result (XML flags) of GXL1QXD service for UTF-16BE encoded XML document
XML Flags 1 : ä
EDD4C988A4F444444474C
743063172010000000A00
---------------------
XML Flags 2 : Ö
EDD4C988A4F444444474E
743063172020000000A00
We touch upon various transport protocols supported on System z to communicate with
external systems. Then we describe the encoding considerations with regard to XML
documents. However, if the XML document were to travel around various points in an
enterprise (or outside), it is safest to have the encoding attribute specified in the document
itself rather than rely on any other mechanisms or assumptions. Should the XML document
undergo conversion, the value of the encoding attribute should be changed accordingly.
11.4.1 HTTP
The encoding information can be provided in the HTTP header itself. If character conversion
is likely, having the encoding declared in the HTTP header is better because a server or a
client might assign a higher precedence to this declaration than to an in-document
declaration. If XHTML pages are being served as XML, the meta charset declaration (the
in-document stand-in for the HTTP header) should not be used. It is intended for HTML, or for
XHTML served as HTML.
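A receiver that wants to honor the HTTP declaration first has to extract the charset parameter from the Content-Type header value. The following C sketch shows one way to do that. The function name content_type_charset is hypothetical, and the parsing is deliberately simplified (for instance, escaped characters inside quoted values are not handled).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive test that s begins with prefix. */
static int starts_with_ci(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/*
 * Copy the charset parameter of a Content-Type value such as
 * "text/xml; charset=utf-8" into out. Returns 1 if a charset was found,
 * 0 otherwise.
 */
int content_type_charset(const char *value, char *out, size_t out_cap)
{
    const char *p = value;

    if (value == NULL || out == NULL || out_cap == 0)
        return 0;

    /* Parameters follow the media type, each introduced by ';'. */
    while ((p = strchr(p, ';')) != NULL) {
        p++;
        while (*p == ' ' || *p == '\t')
            p++;
        if (!starts_with_ci(p, "charset="))
            continue;

        p += strlen("charset=");
        int quoted = (*p == '"');
        if (quoted)
            p++;

        size_t n = 0;
        while (*p != '\0' && n + 1 < out_cap) {
            if (quoted ? (*p == '"') : (*p == ';' || *p == ' ' || *p == '\t'))
                break;
            out[n++] = *p++;
        }
        out[n] = '\0';
        return n > 0;
    }
    return 0;
}
```

When the function returns 0, the receiver falls back to the in-document declaration or to its own default.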
11.4.2 WebSphere MQ
WebSphere MQ applications communicate with MQ servers using IBM-defined data
structures. All MQ applications contain data structures for an object description, message
description, get options, or put options. WebSphere MQ applications running on z/OS must
use the same character set as their local MQ server to view or update these data structures.
Therefore, if the z/OS MQ servers are set to use ccsid 500, each z/OS MQ application should
be written in and compiled as ccsid 500. Alternatively, the application must explicitly convert
character fields to and from the local MQ server’s ccsid and its own ccsid.
Users can define their own user message types and provide corresponding conversion
routines for them. For user-defined message types, message receive requests that specify
the MQGMO_CONVERT option will invoke the user-provided conversion routines as each
message is received.
MQFMT_STRING is the format senders should designate for messages that consist solely of
displayable characters. Messages consisting entirely of XML should be sent using the
MQFMT_STRING format. When message retrieval specifies MQGMO_CONVERT for
messages sent as MQFMT_STRING, the entire message body is converted, if necessary, to
the receiver’s requested character set. For example, an XML message sent from an EBCDIC
ccsid 500 application that specified MQFMT_STRING is automatically converted to EBCDIC
ccsid 1047 for an application that receives messages using MQGMO_CONVERT and
requests ccsid 1047.
After each GET message request, the CODEDCHARSETID field is set to the ccsid specified
by the sender for unconverted messages, or to the receiver’s requested ccsid for converted
messages.
The default (and most common) MQ message format for sending applications is
MQFMT_NONE. MQGMO_CONVERT will not convert messages of type MQFMT_NONE.
The unconverted message will be returned to the receiver with a warning condition. However,
the receiver can use the updated CODEDCHARSETID field to convert the message body
from the sender’s ccsid to the receiver’s ccsid.
Note that character set conversion with or without MQGMO_CONVERT might corrupt the
byte-order-marks (BOM) of XML Unicode documents.
When exchanging XML documents with System z through FTP, there are several choices.
Receive the document with translation enforced and treat the document as normal
EBCDIC data. However, ensure that the encoding attribute in the XML
declaration (if present) does not conflict with the result of the conversion.
Receive the document with translation not enforced and treat the document as ASCII
(ISO8859-x) or Unicode data. As above, the encoding attribute in the XML declaration (if
present) should not conflict with the encoding of the document.
If the document is being received into a Hierarchical File System (HFS) with file tagging
and automatic conversion enabled, extra care should be taken. The goal should be to
ensure that no unnecessary translations happen. Should the document
undergo translation, the result should not conflict with the encoding attribute in the XML
declaration, if present. See 11.4.6, “Saving ASCII files on System z” on page 161 for more
details.
The following examples show when character conversion could happen when exchanging
data (in general and XML documents in particular) between different servers on possibly
different platforms.
The SQL statement may be converted to UTF-8 for parsing when the text is passed in a
PREPARE statement in an ASCII application.
Changing the value of the special register CURRENT APPLICATION ENCODING SCHEME
to a value different from the encoding scheme of the data to be retrieved.
The value of the ENCODING option in the BIND statement is different from the encoding scheme of the
target server.
When using DB2 as a Web service provider or consumer, the SOAP message that is
exchanged should be saved in an XML column of a table. Instead, if the response from a Web
service is received into a CLOB, DBCLOB, GRAPHIC, or VARGRAPHIC column, then
See more in DB2 Version 9.1 for z/OS Internationalization Guide, SC19-1161.
11.4.5 CICS
With CICS Transaction Server for z/OS Version 3.1, the concept of containers and channels
was introduced. With the advent of containers, the 32 KB limit of a COMMAREA was
overcome to allow for the exchange of larger amounts of data. An application program can send or
receive any number of containers.
This feature of containers and channels is well suited to the exchange of large XML
documents. CICS Transaction Server provides APIs to put data into (PUT CONTAINER) or get
data from (GET CONTAINER) containers, with or without conversion. The code page for data
conversion is provided as a numeric CCSID value or as an IANA-registered charset
name for the code page. If the code page is not specified, the default value is taken from
the LOCALCCSID system initialization parameter. See the following Web page:
https://publib.boulder.ibm.com/infocenter/cicsts/v4r1/topic/com.ibm.cics.ts.doc/lp
aths/channels_lp_overview.html
If an ASCII file was sent in without translation and saved in the z/OS UNIX file system, the
program reading (or writing) to that file should take into account the file tag and the
enablement of automatic conversion. Another environment variable to consider when
automatic conversion is in effect is _BPXK_CCSIDS. This environment variable is a pair of
EBCDIC/ASCII CCSIDs that are used during automatic conversion.
Therefore, when a program is reading (or writing) a file from an HFS area, the data
transferred to (or written from) the program is already in the required encoding. A conversion
before a read or after a write operation is not necessary. In fact, such a conversion may lead to
incorrect results. The implication for XML documents saved in HFS is that the encoding of the XML
document should match the target of the conversion that happens when automatic conversion
is in effect. This should also not conflict with the encoding attribute in the XML declaration,
if present. Neither should it conflict with the CCSID that is passed for PIMA creation.
1140, 37 Latin-1 / Open Systems. 1140 has support for euro. EBCDIC
1145, 284 Spain, Latin America. 1145 has support for euro.
Example B-1 also contains some code to show the conversion from one code page into
another. Here we use the PL/I built-in function MEMCONVERT; the code follows the comment marked 1.
/* 1 */
first = 'Hans-Dieter'; Last = 'Mertiens' ;
next = addr(buffer);
left = stg(buffer);
written = xmlchar( Personnel, next, left );
next += written;
left -= written;
Put skip edit ( buffer ) (A(written)) ;
next = addr(buffer);
left = stg(buffer);
written = xmlchar( Personnel, next, left );
Put skip edit ( buffer ) (A(written)) ;
Put skip list ( written) ;
end; /* main */
dcl
gxl1qxd entry (
fixed bin (63) byaddr , /* work area for qxd */
fixed bin(31) byaddr , /* work area length */
pointer byaddr, /* input buffer */
fixed bin(31) byaddr , /* input buffer length */
pointer byaddr , /* pointer to return_data */
fixed bin (31) byaddr , /* return code */
fixed bin (31) byaddr /* reason code */
)
options (asm , inter, retcode ) /*
returns ( fixed bin(31) byvalue) */ ;
dcl
gxl1ini entry(
pointer byvalue, /* parse instance memory area (pima) */
fixed bin(31) byaddr, /* pima length */
fixed bin(31) byaddr, /* ccsid of document */
fixed bin(31) byaddr, /* parse feature flags*/
pointer byaddr, /* vector of system service routines */
pointer byaddr, /* system service routine parameter */
fixed bin(31) byaddr, /* return code */
fixed bin(31) byaddr /* reason code */
)
limited
dcl
gxl1prs entry(
pointer byvalue, /* parse instance memory area (pima) */
fixed bin(31) byaddr, /* options flags */
pointer byaddr, /* xml document address */
fixed bin(31) byaddr, /* xml document length left (bytes) */
pointer byaddr, /* output buffer address */
fixed bin(31) byaddr, /* output buffer length left (bytes) */
fixed bin(31) byaddr, /* return code */
fixed bin(31) byaddr /* reason code */
)
limited
based( ptradd(csrt(19),20) )
returns( fixed bin(31) byvalue );
dcl
gxl1trm entry(
pointer byvalue, /* parse instance memory area (pima) */
fixed bin(31) byaddr, /* return code */
fixed bin(31) byaddr /* reason code */
)
limited
based( ptradd(csrt(19),24) )
returns( fixed bin(31) byvalue );
dcl
gxl_feat_strip_comments fixed bin(31)
value('80000000'xn),
gxl_feat_tokenize_whitespace fixed bin(31)
value('40000000'xn),
gxl_feat_cdata_as_chardata fixed bin(31)
value('20000000'xn);
dcl
gxl_rsn_mask fixed bin(31)
value('0000ffff'xn);
dcl
/* per xmlss, 128k minimum */
gxl_min_pima_size fixed bin(31) value(128*1024),
/* per xmlss, 128 byte minimum */
gxl_min_output_buffer_size fixed bin(31) value(128),
/* expansion factor input->output */
gxl_io_factor fixed bin(31) value(150),
/* minimum size for qxd */
gxl_min_qxdwork_size fixed bin (31) value('8000'xn)
;
dcl
1 gxl_bufrec based,
2 gxl_rechdr,
3 gxl_rectyp ordinal gxl_record_type,
3 gxl_recflg fixed bin(8) unsigned,
3 gxl_recrsd fixed bin(8) unsigned,
3 gxl_reclen fixed bin(31),
2 gexl_recfms union,
3 gxl_bufinf,
4 gxl_dsopts fixed bin(32) unsigned,
4 gxl_prstat fixed bin(16) unsigned,
4 gxl_resrvd fixed bin(16) unsigned,
4 gxl_bufusd fixed bin(64) unsigned,
4 gxl_bufrof fixed bin(64) unsigned,
3 gxl_errrec,
4 gxl_retcod fixed bin(32) unsigned,
4 gxl_rsncod fixed bin(32) unsigned,
3 gxl_lvpairs char(0);
dcl
1 gxl_lvdata based,
2 gxl_vallen fixed bin(31),
2 gxl_valchs char( 1 refer(gxl_vallen) );
pima_addr = addr(pima);
pima_len = stg(pima);
ccsid = 37;
if return_code ^= 0
| reason_code ^= 0 then
do;
put skip list( 'init failed!!!' );
put skip list( rc );
put skip list( return_code );
put skip list( reason_code );
call pliretc(16);
end;
else
do;
obsize
= max( gxl_min_output_buffer_size,
mxmm_doc_len * gxl_io_factor / 100 );
buffer_addr = alloc(obsize);
buffer_len = obsize;
do loop;
return_code = 0;
reason_code = 0;
/*
*/
put skip list( '<<<<before invoking the parser' );
put skip list( 'doclrem=' || document_len_remaining );
put list( 'docacur= ' || hex(document_current_addr) );
put skip list( 'buflrem=' || buffer_len_remaining );
put list( 'bufaddr= ' || hex(buffer_current_addr) );
current_record = buffer_addr;
do while( current_record < buffer_current_addr );
bufptr = current_record;
/*
put skip list( bufptr->gxl_reclen );
*/
select( bufptr->gxl_rectyp );
when( gxl_rt_buffer_info )
do;
put skip list( hex(bufptr->gxl_bufusd) );
put list( hex(bufptr->gxl_bufrof) );
end;
when( gxl_rt_xml_decl )
do;
lvaddr = addr(bufptr->gxl_lvpairs);
put skip list( lvaddr->gxl_vallen );
put list( lvaddr->gxl_valchs );
lvaddr += stg(gxl_vallen) + lvaddr->gxl_vallen;
put skip list( lvaddr->gxl_vallen );
put list( lvaddr->gxl_valchs );
lvaddr += stg(gxl_vallen) + lvaddr->gxl_vallen;
put skip list( lvaddr->gxl_vallen );
put list( lvaddr->gxl_valchs );
end;
when( gxl_rt_start_elem )
do;
lvaddr = addr(bufptr->gxl_lvpairs);
put list( lvaddr->gxl_vallen );
current_record += bufptr->gxl_reclen;
end;
if return_code = 4 then
select( reason_code );
when( gxl_rsn_buffer_inbuf_end
,gxl_rsn_buffer_inoutbuf_end )
do; /* 1301, 1304 */
call read_xml( xmldocument );
rc = gxl1trm( pima_addr,
return_code, reason_code );
if (reads = 0) then do ;
open file (xmlin) input;
put skip list ('xmlin opened');
end ;
if ( reads = 1 ) then do ;
call gxl1qxd ( qxdw(1) ,
gxl_min_qxdwork_size,
addrdata(xmldocument) ,
length(xmldocument),
p_qxdanswer,
return_code,
reason_code ) ;
put skip list ('pliretv() returns ', pliretv());
rc = pliretv() ;
if (return_code = 0) then do ;
reason_code = iand( reason_code, gxl_rsn_mask );
put skip list( 'rc= ' || rc );
put list( ' return_code= ' || return_code );
put list( ' reason_code= ' || hex(reason_code) );
put skip list( heximage(p_qxdanswer,32,' ') );
put skip list( heximage(p_qxdanswer+32,32,' ') );
end;
else do ;
put skip list ('gxl1qxd returns rc', return_code,
' rsn ', reason_code ) ;
end;
#include <gxlhosrg.h>
#include <gxlhxec.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <iconv.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>
void * oima_p;
unsigned long oima_l, n_strings, str_offset, strid, strid_l, strid_o;
char handler_parms[128];
char * osr_buf, * stepc, * strid_id ;
char b [256] ;
char * d , * c ;
int osrbuf_l, i;
GXLHXSTR * strIDTbl_p;
GXLHXSTR_TBLENTRY * stepp ;
int strIDTbl_l;
int osr_size, num;
int rc, rsn;
iconv_t cd ;
size_t ij, ik ;
if (argc < 2) {
printf("\n we need at least 1 parameter the osr\n");
exit(-1);
} /* check arg counter */
if (stat(argv[1], &info) != 0) {
perror("stat() error");
exit (-1) ;
} /* could not stat() */
else
{ puts("stat() returned the following information about ");
printf("\n%s\n",argv[1]);
printf("created: %s\n", ctime(&info.st_createtime));
printf(" uid: %d\n", (int) info.st_uid);
printf(" gid: %d\n", (int) info.st_gid);
osr_size = info.st_size ;
printf(" size: %d\n", (int) info.st_size);
} /* check for availability & size */
fprintf(stderr,"\nsizeof(GXLHXSTR) %d, sizeof(GXLHXSTR_TBLENTRY) %d",
(int) sizeof(GXLHXSTR), (int) sizeof(GXLHXSTR_TBLENTRY) ); /* sizeof yields size_t */
osrbuf_l = osr_size ;
gxluLoadOSR (oima_p, (void *)osr_buf, osrbuf_l, &rc, &rsn);
if (rc == GXLHXRC_SUCCESS) { /* OSR load succeeded */
strIDTbl_l = gxluGenStrIDTable
(oima_p, &strIDTbl_p, &rc, &rsn);
} /* generator initialized */
} /* end of main() */
For the sake of simplicity, this program was written for a small XML document, so the entire
document was read into a working storage variable.
End-Read .
Read-Exit .
Exit .
*
String-XML-data .
Add XML-Doc-Len to QXD-Input-Buffer-Length
String Read-Buffer (1:XML-Doc-Len) delimited by size
into QXD-Input-Buffer
with pointer Length-pointer .
Perform Read-XML-Doc
thru Read-Exit .
String-Exit .
Exit .
zample-utf8.xml
created: Mon Aug 31 09:21:47 2009
uid: 58
gid: 0
size: 254
QXD_Version 1x( 1)
QXD_XML_Autodet_value 7x( 7)
QXD_XML_Autodet_CCSID 4B8x( 1208)
QXD_XML_Specified_CCSID 4B8x( 1208)
QXD_XML_Decl_Len 3Ax( 58)
QXD_XML_Version 1x( 1)
QXD_XML_Release 0x( 0)
QXD_XML_Flag1 C0x( 192)
QXD_XML_Flag2 E0x( 224)
HDM @ SC80:/u/hdm/sg7810>
/* #pragma runopts(RPTOPTS(ON)) */
#pragma runopts(XPLINK(ON))
#include <gxlhosrg.h>
#include <gxlhxec.h>
#include <gxlhqxd.h>
#include <stdlib.h>
#include <stdio.h>
#include <iconv.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>
void * oima_p;
unsigned long oima_l, n_strings, str_offset, strid, strid_l, strid_o;
char handler_parms[128];
char * xml_buf, * stepc, * strid_id ;
int xmlbuf_l, i;
long work_area_length = GXLHXEC_MIN_QXDWORK_SIZE;
long work_area[((GXLHXEC_MIN_QXDWORK_SIZE)+7)/4];
GXLHQXD * rt_d ;
int xml_size, num;
int rc, rsn;
size_t ij, ik ;
FILE * xmlin ;
if (argc < 2) {
printf("\n we need at least 1 parameter the XML doc name\n");
exit(-1);
} /* check arg counter */
if (stat(argv[1], &info) != 0) {
perror("stat() error");
exit (-1) ;
} /* could not stat() */
else
{ puts("stat() returned the following information about ");
printf("\n%s\n",argv[1]);
printf("created: %s\n", ctime(&info.st_createtime));
printf(" uid: %d\n", (int) info.st_uid);
printf(" gid: %d\n", (int) info.st_gid);
printf(" size: %d\n", (int) info.st_size);
} /* check for availability & size */
gxlpQuery ( work_area,
work_area_length ,
(char *)(xml_buf) ,
xml_size ,
&rt_d,
&rc,
&rsn ) ;
if (rc > 0 ) {
fprintf(stderr,
"\ngxlpQuery returned rc=%6d(%8X) rsn=%6d(%8X)\n",
rc, rc, rsn, rsn); /* each value is printed in decimal and in hex */
}
/*
int QXD_Version;
unsigned int QXD_XML_Autodet_value;
unsigned int QXD_XML_Autodet_CCSID;
unsigned short QXD_XML_Version;
unsigned short QXD_XML_Release;
unsigned int QXD_XML_Specified_CCSID;
/*****************************************************************/
/* QXD Flag1 */
/* - standalone bit is on if standalone = yes */
/* - bom is on if byte order mark detected in the doc */
/* - encoding undetected bit is on if encoding not auto-detected */
/*****************************************************************/
/* unsigned char QXD_XML_Flag1 */
else {
fprintf(stderr,
"\nQXD_Version %8Xx(%8d) ",
rt_d->QXD_Version , rt_d->QXD_Version );
fprintf(stderr,
} /* end of main() */
For documents with encodings other than EBCDIC, the COBOL functions
NATIONAL-OF and DISPLAY-OF have been used to output data in EBCDIC format. This
program accepts the encoding of the document through the job step PARM parameter. It is a
four-digit value, padded with leading zeroes, for the CCSID of the document. If this parameter
is omitted, the program passes 1208 (UTF-8) as the default CCSID. This program was tested
with the XML document to be parsed passed as an HFS file and referred to in this program with
the DD name TSTXML. The output of the program is also an HFS file and is referred to in this
program with the DD name OUTDATA.
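The PARM handling described above (a four-digit CCSID with leading zeroes, defaulting to 1208) can be expressed compactly. The following C sketch mirrors that convention for illustration only; ccsid_from_parm is a hypothetical name, and the COBOL program itself does this with the Parm-Length and Parm-CCSID fields.

```c
#include <ctype.h>
#include <stddef.h>

#define DEFAULT_CCSID 1208   /* UTF-8, used when no PARM is supplied */

/*
 * Interpret a job step PARM as a four-digit CCSID such as "0037" or "1200".
 * Returns the CCSID, DEFAULT_CCSID when the PARM is empty, or -1 when the
 * PARM is not exactly four decimal digits.
 */
int ccsid_from_parm(const char *parm, size_t parm_len)
{
    int ccsid = 0;

    if (parm == NULL || parm_len == 0)
        return DEFAULT_CCSID;
    if (parm_len != 4)
        return -1;

    for (size_t i = 0; i < 4; i++) {
        if (!isdigit((unsigned char)parm[i]))
            return -1;
        ccsid = ccsid * 10 + (parm[i] - '0');
    }
    return ccsid;
}
```

With this convention, PARM='0037' selects CCSID 37 and an omitted PARM selects UTF-8.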
Example B-7 COBOL program calling z/OS XML System Services on z/OS for parsing
Identification Division .
Program-ID XMLPRS1 .
*
*This is a sample program to invoke z/OS XML System Services
*to parse an XML document without validation.
*
*This program will read from file as many times as required to
*fill the Input-Buffer to be sent to the z/OS XML parser. The
*size of this buffer has been set in this program to be 4KB.
*
*The output buffer has been set to 4KB for the sample XML document
*taken for this program. This size of the buffer ensured that, we
*do not hit x'1302' but we do encounter x'1303' or x'1304' reason
*codes.
*
*In practice though, an estimate of output buffer being twice the
*size of input buffer is good.
*
*
* Return Code Reason Code Short Description
* ----------- ----------- -----------------
* 4 1301 Input Buffer ended
* 8 1302 Output Buffer small
* 4 1303 Output Buffer ended
* 4 1304 Input and Output Buffer ended
*
*Handles encoding : UTF-16BE (1200), UTF-8 (1208) and
* EBCDIC encodings. Default : 1208
*
*See SYS1.MACLIB(GXL*) for data structures and constant declar-
*ations.
*
*aanemali.
*
Environment Division .
*
Input-Output Section .
File-Control .
Select TstXML assign to TSTXML
Organization is line sequential .
*
1100-Read-Doc-Into-Buffer .
Perform 9000-Read-File thru 9000-Exit
If End-Of-File = "Y"
Go to 1100-Exit
1200-Init-XMLSS .
Move spaces to GXL1INI-PIMA
Compute GXL1INI-PIMA-Length = XEC-NVParse-Min-PIMA
If Parm-Length = 0
Move XEC-ENC-UTF-8 to GXL1INI-CCSID
Else
Move Parm-CCSID-L to Parm-CCSID
Move Parm-CCSID-Num to GXL1INI-CCSID
End-If
Compute GXL1INI-Flags = XEC-Feat-Strip-Cmnts
*Uncomment below line if running on v10
* + XEC-Feat-Full-End
Move zero to GXL1INI-Return-Code
GXL1INI-Reason-Code
.
Call XMLSS-GXL1INI
using GXL1INI-PIMA
GXL1INI-PIMA-Length
GXL1INI-CCSID
GXL1INI-Flags
Omitted
Omitted
GXL1INI-Return-Code
GXL1INI-Reason-Code
.
If GXL1INI-Return-Code > 0
2000-Do-Parse .
Initialize GXL1PRS-Structure
Set GXL1PRS-Input-Buff-Address to
address of Input-Buffer
Set GXL1PRS-Output-Buff-Address to
address of Output-Buffer
Move Input-Buffer-Length to GXL1PRS-Input-Buff-Bytes
Compute GXL1PRS-Output-Buff-Bytes = 5 * 1KB
.
Call XMLSS-GXL1PRS using
GXL1INI-PIMA
GXL1PRS-Options
GXL1PRS-Input-Buff-Address
GXL1PRS-Input-Buff-Bytes
GXL1PRS-Output-Buff-Address
GXL1PRS-Output-Buff-Bytes
GXL1PRS-Return-Code
GXL1PRS-Reason-Code
returning Call-Return-Code
.
Move spaces to Message-String
Move GXL1PRS-Return-Code to Svc-Return-Code
Move GXL1PRS-Reason-Code to Svc-Reason-Code-FW
Move Svc-Reason-Code-B to Svc-Reason-Code
Move Input-Buffer-Length to FW-To-Edited
String "Length Input Buffer : " delimited by size
FW-To-Edited delimited by size
" GXL1PRS Return Code : " delimited by size
Svc-Return-Code delimited by size
" Reason Code : " delimited by size
Svc-Reason-Code delimited by size
into Message-String
Display Message-String
.
Move GXL1PRS-Return-Code to GXL-Return-Codes
Evaluate true
When XRC-Success
Perform 2100-Process-Output thru 2100-Exit
2100-Process-Output .
Move 1 to OP-Buffer-Start
Move 8 to OP-Buffer-Length
Move Output-Buffer (OP-Buffer-Start:OP-Buffer-Length)
to Output-Buffer-Common-Header
Compute OP-Buffer-Start = OP-Buffer-Length + 1
Compute OP-Buffer-Length = OP-Buff-Rec-Length - 8
Move Output-Buffer (OP-Buffer-Start:OP-Buffer-Length)
to Buffer-Info-Record
.
If BI-Error-Offset > 0
Move spaces to Message-String
Perform 2110-Get-Error-Record thru 2110-Exit
Go to 2100-Exit
End-If
.
Compute OP-Buffer-Start = OP-Buffer-Start +
OP-Buffer-Length
Compute OP-Buffer-Length = 0
.
Move space to Scan-End
Perform 2120-Scan-Output-Buffer thru 2120-Exit
until Scan-End = "Y"
.
2100-Exit .
Exit .
2110-Get-Error-Record .
Move Output-Buffer(BI-Error-Offset + 9 : 4)
to Parse-Return-Code-X
Move Output-Buffer(BI-Error-Offset + 13 : 4)
to Parse-Reason-Code-X
Move Output-Buffer(BI-Error-Offset + 17 : 8)
to Error-Offset-X
Move Error-Offset-B to FW-To-Edited
String "Error offset in doc : " delimited by size
FW-To-Edited delimited by size
into Message-String
Display Message-String
.
2110-Exit .
Exit .
2120-Scan-Output-Buffer .
2121-XML-Declaration .
Move "XML Version : " to Leader-String
Perform 4000-Move-Pointer thru 4000-Exit
Perform 6000-Build-Output thru 6000-Exit
Perform 7000-Move-Ahead thru 7000-Exit
Perform 9100-Write-File thru 9100-Exit
.
Move "XML Encoding : " to Leader-String
Perform 5000-Move-Pointer-Next thru 5000-Exit
Perform 6000-Build-Output thru 6000-Exit
Perform 7000-Move-Ahead thru 7000-Exit
Perform 9100-Write-File thru 9100-Exit
.
Move "XML Standalone : " to Leader-String
Perform 5000-Move-Pointer-Next thru 5000-Exit
Perform 6000-Build-Output thru 6000-Exit
Perform 7000-Move-Ahead thru 7000-Exit
Perform 9100-Write-File thru 9100-Exit
2122-XML-Start-Element .
Move "Start Element name : " to Leader-String
Perform 4000-Move-Pointer thru 4000-Exit
Perform 6000-Build-Output thru 6000-Exit
Perform 7000-Move-Ahead thru 7000-Exit
Perform 9100-Write-File thru 9100-Exit
.
Move "Start Element namespace : " to Leader-String
Perform 5000-Move-Pointer-Next thru 5000-Exit
Perform 6000-Build-Output thru 6000-Exit
Perform 7000-Move-Ahead thru 7000-Exit
Perform 9100-Write-File thru 9100-Exit
.
Move "Start Element namespace prefix : " to Leader-String
Perform 5000-Move-Pointer-Next thru 5000-Exit
Perform 6000-Build-Output thru 6000-Exit
Perform 7000-Move-Ahead thru 7000-Exit
Perform 9100-Write-File thru 9100-Exit
.
2122-Exit .
Exit .
2123-XML-End-Element.
*Uncomment all the commented lines below if XEC-Feat-Full-End
*feature flag is in effect.
* Move "End Element name : " to Leader-String
Compute OP-Buffer-Start = OP-Buffer-Start + 8
* Perform 4000-Move-Pointer thru 4000-Exit
* Perform 6000-Build-Output thru 6000-Exit
* Perform 7000-Move-Ahead thru 7000-Exit
* Perform 9100-Write-File thru 9100-Exit
* .
* Move "End Element namespace : " to Leader-String
* Perform 5000-Move-Pointer-Next thru 5000-Exit
* Perform 6000-Build-Output thru 6000-Exit
* Perform 7000-Move-Ahead thru 7000-Exit
* Perform 9100-Write-File thru 9100-Exit
* .
* Move "End Element namespace prefix : " to Leader-String
* Perform 5000-Move-Pointer-Next thru 5000-Exit
* Perform 6000-Build-Output thru 6000-Exit
* Perform 7000-Move-Ahead thru 7000-Exit
* Perform 9100-Write-File thru 9100-Exit
.
2123-Exit .
Exit .
2124-XML-Attrib-Name.
Move "Attribute name : " to Leader-String
Perform 4000-Move-Pointer thru 4000-Exit
Perform 6000-Build-Output thru 6000-Exit
2125-XML-Attrib-Value.
Move "Attribute value : " to Leader-String
Perform 4000-Move-Pointer thru 4000-Exit
Perform 6000-Build-Output thru 6000-Exit
Perform 7000-Move-Ahead thru 7000-Exit
Perform 9100-Write-File thru 9100-Exit
.
2125-Exit.
Exit .
2126-XML-Char-Data.
*Uncomment the below commented part, if the sufficient space has
*been allocated to the output dataset in proportion to the size
*of XML document.
* Move "Char data : " to Leader-String
Perform 4000-Move-Pointer thru 4000-Exit
Perform 6000-Build-Output thru 6000-Exit
Perform 7000-Move-Ahead thru 7000-Exit
* Perform 9100-Write-File thru 9100-Exit
.
2126-Exit.
Exit .
2127-XML-NS-Decl.
Move "Namespace prefix : " to Leader-String
Perform 4000-Move-Pointer thru 4000-Exit
Perform 6000-Build-Output thru 6000-Exit
Perform 7000-Move-Ahead thru 7000-Exit
Perform 9100-Write-File thru 9100-Exit
.
Move "Namespace URI : " to Leader-String
Perform 5000-Move-Pointer-Next thru 5000-Exit
Perform 6000-Build-Output thru 6000-Exit
Perform 7000-Move-Ahead thru 7000-Exit
Perform 9100-Write-File thru 9100-Exit
.
2127-Exit.
7000-Move-Ahead .
Compute OP-Buffer-Start = OP-Buffer-Start + Length-Binary.
7000-Exit .
Exit .
9000-Read-File .
Read TstXML into Read-Buffer
at end
Move "Y" to End-of-File
End-Read .
9000-Exit .
Exit .
9100-Write-File .
Write OutData-Rec from Write-Buffer .
9100-Exit .
Exit .
8000-Done .
Close TstXML OutData .
If GXL1INI-Called = "Y"
Call XMLSS-GXL1TRM using GXL1INI-PIMA
GXL1TRM-Return-Code GXL1TRM-Reason-Code
End-If
.
If GXL1PRS-Return-Code > GXL1TRM-Return-Code
Move GXL1PRS-Return-Code to Return-Code
else
Move GXL1TRM-Return-Code to Return-Code
End-If
.
8000-Exit .
Exit .
Example B-8 Enterprise COBOL program for parsing (without validation) and buffer management.
Identification Division .
Program-ID XMLPRS2 .
*
*This is a sample program to invoke z/OS XML System Services
*to parse an XML document without validation.
*
*This program will read from file as many times as required to
*fill the Input-Buffer to be sent to the z/OS XML parser. The
*size of this buffer has been set in this program to be 4KB.
*
*The output buffer has been set to 4KB for the sample XML document
*used with this program. With this buffer size the program does
*not hit reason code x'1302', but it does encounter reason codes
*x'1303' or x'1304'.
*
*In practice, an output buffer about twice the size of the input
*buffer is a reasonable starting estimate.
*
*
* Return Code   Reason Code   Short Description
* -----------   -----------   -----------------
*      4           1301       Input Buffer ended
*      8           1302       Output Buffer small
*      4           1303       Output Buffer ended
*      4           1304       Input and Output Buffer ended
*
*Handles encoding : UTF-16BE (1200), UTF-8 (1208) and
* EBCDIC encodings.
*
*See SYS1.MACLIB(GXL*) for data structures and constant
*declarations.
*
*
Environment Division .
*
Input-Output Section .
File-Control .
Select TstXML assign to TSTXML
Organization is line sequential .
*
Select OutData assign to OUTDATA
Organization is line sequential .
*
Data Division .
File Section .
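The return and reason codes tabulated in the XMLPRS2 program header determine what the caller does after each parse call. The following is a minimal sketch of that decision in C, not an excerpt from the sample program. The names `parse_action`, `next_action`, and the `RSN_*` and `ACT_*` constants are hypothetical; only the numeric codes come from the table, and treating return code 0 as completion is an assumption.

```c
/* Reason codes from the table in the XMLPRS2 header (hexadecimal). */
#define RSN_INPUT_ENDED      0x1301
#define RSN_OUTPUT_TOO_SMALL 0x1302
#define RSN_OUTPUT_ENDED     0x1303
#define RSN_BOTH_ENDED       0x1304

/* Hypothetical actions for a caller's buffer-management loop. */
typedef enum {
    ACT_DONE,              /* assumed: return code 0 means parse complete */
    ACT_REFILL_INPUT,      /* read more of the document, parse again      */
    ACT_DRAIN_OUTPUT,      /* process the output buffer, parse again      */
    ACT_REFILL_AND_DRAIN,  /* do both before the next parse call          */
    ACT_ENLARGE_OUTPUT,    /* output buffer too small to make progress    */
    ACT_FAIL               /* any other combination                       */
} parse_action;

/* Map a return code / reason code pair to the next step. */
parse_action next_action(int return_code, int reason_code)
{
    if (return_code == 0)
        return ACT_DONE;
    if (return_code == 4) {
        switch (reason_code) {
        case RSN_INPUT_ENDED:  return ACT_REFILL_INPUT;
        case RSN_OUTPUT_ENDED: return ACT_DRAIN_OUTPUT;
        case RSN_BOTH_ENDED:   return ACT_REFILL_AND_DRAIN;
        default:               return ACT_FAIL;
        }
    }
    if (return_code == 8 && reason_code == RSN_OUTPUT_TOO_SMALL)
        return ACT_ENLARGE_OUTPUT;
    return ACT_FAIL;
}
```

A caller would invoke `next_action` after each parse call and loop until it returns `ACT_DONE` or `ACT_FAIL`. Keeping the mapping in one function makes the warning cases (return code 4) easy to tell apart from the one error case (return code 8).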
Example B-9 shows code that can traverse the records generated by the z/OS XML parser:
// Local prototypes
ssize_t ReadFile(char *, char *, size_t);
void PrintBufInfo(void *);
void PrintXMLDecl(void *);
void PrintEAName(void *);
void PrintNSName(void *);
void PrintPI(void *);
void PrintDTD(void *);
void PrintAuxInfo(void *);
void PrintError(void *);
void PrintLenVal(GXLHXEH_VALUE *, int);
// Local constants
#define REC_STREAM_BUF_LEN 1048576
#define LINE_WIDTH 80
#define SUCCESS 0
#define OFF 0
#define ON 1
// -------------------------------------------------------------
// Macros for traversing the z/OS XML parse record stream
// -------------------------------------------------------------
// Navigate to the next record in the binary stream.
#define NEXTREC(p) ((GXLHXEH_RECORD *)((char *)(p) + (p)->XEH_RecLen))
// -------------------------------------------------------------
// Mainline code
// -------------------------------------------------------------
int main(int argc,
char **argv)
{ // start of main
char *pZOSXmlFileName;
char ZOSXml[REC_STREAM_BUF_LEN];
int lZOSXml;
int i = 0;
int j = 0;
int rc = 0;
// Get the name of the file with the z/OS XML datastream.
if (argc > 1)
pZOSXmlFileName = argv[1];
else
{ // no file name specified
printf("Error - no file name specified.\n");
rc = FAILURE;
goto done;
} // no file name specified
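// Read the z/OS XML record stream from the file into the buffer.
lZOSXml = (int)ReadFile(pZOSXmlFileName, ZOSXml, REC_STREAM_BUF_LEN);
if (lZOSXml <= 0)
{ // reading the z/OS XML failed
rc = FAILURE;
goto done;
} // reading the z/OS XML failed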
// -------------------------------------------------------------
// This main loop traverses the record stream, and prints out
// what it sees.
// -------------------------------------------------------------
printf("\n----------------------------------------------------------\n");
// First, output the token type. Note that these token types
// are ordered in the switch statement below based roughly on the
// frequency with which they are likely to appear in a given
// document.
printf("+%04X ",i);
switch(RECTOK(pRec))
{ // token type switch
case GXLHXEC_TOK_START_ELEM:
printf("START_ELEM - "); break;
case GXLHXEC_TOK_END_ELEM:
printf("END_ELEM - "); break;
case GXLHXEC_TOK_ATTR_NAME:
printf("ATTR_NAME - "); break;
case GXLHXEC_TOK_ATTR_VALUE:
printf("ATTR_VALUE - "); break;
case GXLHXEC_TOK_NS_DECL:
printf("NS_DECL - "); break;
case GXLHXEC_TOK_CHAR_DATA:
printf("CHAR_DATA - "); break;
case GXLHXEC_TOK_COMMENT:
printf("COMMENT - "); break;
case GXLHXEC_TOK_BUFFER_INFO:
printf("BUFFER_INFO - "); break;
case GXLHXEC_TOK_XML_DECL:
printf("XML_DECL - "); break;
case GXLHXEC_TOK_START_CDATA:
printf("START_CDATA - "); break;
case GXLHXEC_TOK_END_CDATA:
printf("END_CDATA - "); break;
case GXLHXEC_TOK_WHITESPACE:
printf("WHITESPACE - "); break;
case GXLHXEC_TOK_PI:
printf("PI - "); break;
case GXLHXEC_TOK_DTD_DATA:
printf("DTD_DATA - "); break;
case GXLHXEC_TOK_UNRESOLVED_REF:
printf("UNRES_REF - "); break;
case GXLHXEC_TOK_AUX_INFO:
printf("AUX_INFO - "); break;
case GXLHXEC_TOK_ERROR:
printf("ERROR - "); break;
default:
printf("UNRECOGNIZED - ");
break; // default
} // token type switch
switch(RECTOK(pRec))
{ // token type switch
case GXLHXEC_TOK_START_ELEM:
case GXLHXEC_TOK_END_ELEM:
case GXLHXEC_TOK_ATTR_NAME:
PrintEAName(pVals);
break;
case GXLHXEC_TOK_NS_DECL:
PrintNSName(pVals);
break;
case GXLHXEC_TOK_ATTR_VALUE:
case GXLHXEC_TOK_CHAR_DATA:
case GXLHXEC_TOK_COMMENT:
case GXLHXEC_TOK_UNRESOLVED_REF:
PrintLenVal((GXLHXEH_VALUE *)pVals,4);
printf("\n");
break;
case GXLHXEC_TOK_BUFFER_INFO:
PrintBufInfo(pVals);
break;
case GXLHXEC_TOK_XML_DECL:
PrintXMLDecl(pVals);
break;
case GXLHXEC_TOK_PI:
PrintPI(pVals);
break;
case GXLHXEC_TOK_DTD_DATA:
PrintDTD(pVals);
break;
case GXLHXEC_TOK_AUX_INFO:
PrintAuxInfo(pVals);
break;
case GXLHXEC_TOK_ERROR:
PrintError(pVals);
break;
default:
break; // default
} // token type switch
printf("\n----------------------------------------------------------\n");
// -------------------------------------------------------------
// Local subroutines
// -------------------------------------------------------------
// *************************************************************
// ReadFile - read a file and return all of its
// contents.
// *************************************************************
ssize_t ReadFile(char *pFileName,
char *pFileBuf,
size_t lFileBuf)
{ // start of ReadFile
ssize_t nBytesRead = 0; // number of bytes read
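int fd; // descriptor for file to read
fd = open(pFileName, O_RDONLY);
if (fd >= 0)
{ // file is open
nBytesRead = read(fd,pFileBuf,lFileBuf);
if (nBytesRead <= 0)
{ // read failed or file is empty
printf("Error - read of %s failed\n", pFileName);
nBytesRead = 0;
} // read failed or file is empty
close(fd);
} // file is open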
else
{ // file open failed
printf("Error - open %s failed\n", pFileName);
nBytesRead = 0;
} // file open failed
return(nBytesRead);
} // end of ReadFile
// *************************************************************
// PrintBufInfo - print record-specific info for buffer info records.
// *************************************************************
void PrintBufInfo(void *pVal)
{ // start of PrintBufInfo
GXLHXEH_BUFINFODATA *pBIVal;
pBIVal = (GXLHXEH_BUFINFODATA *)pVal;
#ifdef _LP64
printf(" BuflenUsed: %016x ErrOffset: %016x\n",
pBIVal->XEH_BuflenUsed,pBIVal->XEH_ErrOffset);
#else
printf(" BufLenUsed: %08x ErrOffset: %08x\n",
pBIVal->XEH_BufLenUsed,pBIVal->XEH_ErrOffset);
#endif
return;
} // end of PrintBufInfo
// *************************************************************
// PrintXMLDecl - print record-specific info for XML declarations.
// *************************************************************
void PrintXMLDecl(void *pVal)
{ // start of PrintXMLDecl
GXLHXEH_VALUE *pXDVal; // xml decl values
int fVal = OFF;
pXDVal = (GXLHXEH_VALUE *)pVal; // start at the first value in the record
pXDVal = NEXTVAL(pXDVal);
if (VALLEN(pXDVal))
{ // encoding
printf(" Encoding: ");
PrintLenVal(pXDVal,0);
fVal = ON;
} // encoding
pXDVal = NEXTVAL(pXDVal);
if (VALLEN(pXDVal))
{ // standalone
printf(" Standalone: ");
PrintLenVal(pXDVal,0);
fVal = ON;
} // standalone
// End the line if any of the XML decl fields were present.
if (fVal)
printf("\n");
return;
} // end of PrintXMLDecl
// *************************************************************
// PrintEAName - print the name values for elements and attributes.
// *************************************************************
void PrintEAName(void *pVal)
{ // start of PrintEAName
GXLHXEH_VALUE *pEANVal; // element name values
pEANVal = (GXLHXEH_VALUE *)pVal; // start at the first value in the record
pEANVal = NEXTVAL(pEANVal);
if (VALLEN(pEANVal))
{ // NS URI
printf(" NS URI: ");
PrintLenVal(pEANVal,0);
printf("\n");
} // NS URI
pEANVal = NEXTVAL(pEANVal);
if (VALLEN(pEANVal))
{ // NS prefix
printf(" NS prefix: ");
PrintLenVal(pEANVal,0);
printf("\n");
} // NS prefix
return;
} // end of PrintEAName
// *************************************************************
// PrintNSName - print the name values for a namespace.
// *************************************************************
void PrintNSName(void *pVal)
{ // start of PrintNSName
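GXLHXEH_VALUE *pNSNVal; // namespace name values
pNSNVal = (GXLHXEH_VALUE *)pVal;
if (VALLEN(pNSNVal))
{ // NS prefix
printf(" NS prefix: ");
PrintLenVal(pNSNVal,0);
printf("\n");
} // NS prefix
pNSNVal = NEXTVAL(pNSNVal);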
if (VALLEN(pNSNVal))
{ // NS URI
printf(" NS URI: ");
PrintLenVal(pNSNVal,0);
printf("\n");
} // NS URI
return;
} // end of PrintNSName
// *************************************************************
// PrintPI - print the target name and value for a processing
// instruction.
// *************************************************************
void PrintPI(void *pVal)
{ // start of PrintPI
GXLHXEH_VALUE *pPIVal; // processing instruction values
pPIVal = (GXLHXEH_VALUE *)pVal;
if (VALLEN(pPIVal))
{ // PI target
printf(" Target: ");
PrintLenVal(pPIVal,0);
printf("\n");
} // PI target
pPIVal = NEXTVAL(pPIVal);
if (VALLEN(pPIVal))
{ // PI value
printf(" Value: ");
PrintLenVal(pPIVal,0);
printf("\n");
} // PI value
return;
} // end of PrintPI
// *************************************************************
// PrintDTD - print the values associated with a DTD.
// *************************************************************
void PrintDTD(void *pVal)
{ // start of PrintDTD
GXLHXEH_VALUE *pXDVal; // DTD values
pXDVal = (GXLHXEH_VALUE *)pVal; // start at the first value in the record
pXDVal = NEXTVAL(pXDVal);
if (VALLEN(pXDVal))
{ // public identifier
printf(" PUBID: ");
PrintLenVal(pXDVal,0);
printf("\n");
} // public identifier
pXDVal = NEXTVAL(pXDVal);
if (VALLEN(pXDVal))
{ // system identifier
printf(" SYSID: ");
PrintLenVal(pXDVal,0);
printf("\n");
} // system identifier
return;
} // end of PrintDTD
// *************************************************************
// PrintAuxInfo - print the values associated with an auxiliary
// information record.
// *************************************************************
void PrintAuxInfo(void *pVal)
{ // start of PrintAuxInfo
GXLHXEH_AUX_VALUE *pAIVal; // aux info values
else
return;
} // end of PrintAuxInfo
// *************************************************************
// PrintError - print the values associated with an error record
// *************************************************************
void PrintError(void *pVal)
{ // start of PrintError
GXLHXEH_ERRINFODATA *pErrVal; // error values
pErrVal = (GXLHXEH_ERRINFODATA *)pVal;
#ifdef _LP64
printf(" Error offset: %016x\n",ERROFFSET(pErrVal));
#else
#ifdef _LONG_LONG
printf(" Error offset: %016x\n",ERROFFSET(pErrVal));
#else
printf(" Error offset: %08x\n",ERROFFSET(pErrVal));
#endif
#endif
return;
} // end of PrintError
// *************************************************************
// PrintLenVal - print the string from a length/value pair in 1
// or more chunks. The number of chunks is
// determined by the width of the output line we
// are using (LINE_WIDTH).
// *************************************************************
void PrintLenVal(GXLHXEH_VALUE *pLV,
int indent)
{ // start of PrintLenVal
char *p = &(VALTEXT(pLV));
int n = VALLEN(pLV);
char blanks[LINE_WIDTH];
if (p[i] == NEWLINE)
{ // found a newline
printf("%s%s\n",blanks,p);
i++; // skip past the newline
} // found a newline
else
{ // full chunk encountered
// The strings we are dealing with are not null-terminated.
// We use address and length when dealing with strings.
// This complicates printing, because we have to null-terminate
// each chunk as we print it. Save the character after the end
// of each chunk, replace it with a NULL, and print the chunk.
ch = p[LINE_WIDTH];
p[LINE_WIDTH] = NULLCHAR;
printf("%s%s\n",blanks,p);
p[LINE_WIDTH] = ch;
} // full chunk encountered
memcpy(strChunk,p,n);
strChunk[n] = NULLCHAR;
if (indent)
printf("%s%s",blanks,strChunk);
else
printf("%s",strChunk);
} // print a partial chunk
} // end of PrintLenVal
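PrintLenVal has to print strings that are described by an address and a length rather than by a terminating null. The following sketch shows one way to print such a string in fixed-width chunks. It is an alternative technique, not the listing's method: it uses the `%.*s` precision specifier so that the buffer is never modified, whereas the listing temporarily overwrites a byte with a null. The function name `print_chunks` and its parameters are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Print `len` bytes of `text` (not null-terminated) to `out` as lines of
 * at most `width` characters, each preceded by `indent` blanks. A newline
 * embedded in the text ends the current line early and is not printed
 * twice. Returns the number of lines written. */
int print_chunks(FILE *out, const char *text, size_t len,
                 size_t width, int indent)
{
    int lines = 0;

    if (width == 0)
        return 0;                      /* nothing sensible to print */

    while (len > 0) {
        size_t n = len < width ? len : width;
        /* Look for a newline inside the candidate chunk. */
        const char *nl = memchr(text, '\n', n);
        size_t take = nl ? (size_t)(nl - text) : n;
        size_t used = nl ? take + 1 : take;   /* skip past the newline */

        /* "%*s" with an empty string emits the indentation;
         * "%.*s" prints exactly `take` bytes without needing a null. */
        fprintf(out, "%*s%.*s\n", indent, "", (int)take, text);

        text += used;
        len -= used;
        lines++;
    }
    return lines;
}
```

For example, `print_chunks(stdout, "abcdefghij", 10, 4, 2)` writes three indented lines (`abcd`, `efgh`, `ij`). Because the input is declared `const`, the same routine can be used on a read-only parse buffer.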
<person id="JQP">
<name>
<family>John</family>
<given>Public</given>
<middle>Q</middle>
</name>
<phone type="mobile">5555555</phone>
<email>JQP@acmepaperclip.com</email>
</person>
Example B-11 shows the output produced by this example code for the sample XML
document.
----------------------------------------------------------
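The code in Example B-9 produces this kind of output by stepping through the parse record stream one record at a time, with NEXTREC advancing by the length stored in each record. The sketch below shows the same traversal pattern with bounds checking added. The `rec_header` layout, the `walk_records` function, and the `rec_visitor` callback are hypothetical stand-ins; the real record mapping is GXLHXEH_RECORD in the z/OS XML header files, which this sketch does not reproduce. It also assumes that the stored length covers the whole record, header included.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header, used for illustration only. */
typedef struct {
    uint16_t token;    /* record type                           */
    uint16_t flags;    /* not used by this sketch               */
    uint32_t rec_len;  /* total record length, header included  */
} rec_header;

/* Callback invoked once per record. */
typedef void (*rec_visitor)(uint16_t token, const unsigned char *payload,
                            size_t payload_len, void *ctx);

/* Visit every record in buf[0..len). Returns the number of records
 * visited, or -1 if a header or record runs past the end of the buffer. */
int walk_records(const unsigned char *buf, size_t len,
                 rec_visitor visit, void *ctx)
{
    size_t off = 0;
    int count = 0;

    while (off < len) {
        rec_header h;

        if (len - off < sizeof h)
            return -1;                      /* truncated header */
        memcpy(&h, buf + off, sizeof h);    /* avoids unaligned access */
        if (h.rec_len < sizeof h || h.rec_len > len - off)
            return -1;                      /* inconsistent length */

        if (visit)
            visit(h.token, buf + off + sizeof h,
                  h.rec_len - sizeof h, ctx);

        off += h.rec_len;                   /* the step NEXTREC performs */
        count++;
    }
    return count;
}
```

Checking each length against the bytes that remain keeps a damaged or partially filled buffer from sending the traversal past the end of the data, which matters when the output buffer ends in the middle of a document.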
Select the Additional materials and open the directory that corresponds with the IBM
Redbooks form number, SG247810.
DTD Document Type Definition
ESDS VSAM Entry Sequenced Data Set
HFS z/OS UNIX Hierarchical File System
HTML Hyper Text Markup Language
IBM International Business Machines Corporation
ITSO International Technical Support Organization
KSDS VSAM Key Sequenced Data Set
OSR Optimized Schema Representation
RRDS VSAM Relative Record Data Set
SOAP Simple Object Access Protocol
The publications listed in this section are considered particularly suitable for a more detailed
discussion of the topics covered in this book.
IBM Redbooks
For information about ordering these publications, see “How to get Redbooks” on page 246.
Note that some of the documents referenced here might be available in softcopy only.
DB2 9 pureXML Guide, SG24-7315
DB2 9: pureXML Overview and Fast Start, SG24-7298
Other publications
These publications are also relevant as further information sources:
XML
z/OS XML System Services User’s Guide and Reference, SA23-1350
XML Toolkit for z/OS User’s Guide, SA22-7932
DB2
DB2 Version 9.1 for z/OS Application Programming and SQL Guide, SC18-9841
DB2 Version 9.1 for z/OS SQL Reference, SC18-9854
DB2 Version 9.1 for z/OS Internationalization Guide, SC19-1161
COBOL
XML System Service parser in Enterprise COBOL for z/OS Version 4.2 Compiler and Runtime Migration Guide, GC23-8527
Enterprise COBOL for z/OS, Language Reference Version 4 Release 1, SC23-8528
Enterprise COBOL for z/OS Programming Guide V4R1, SC23-8529
PL/I
Enterprise PL/I for z/OS Programming Guide, SC27-1457
WebSphere MQ
WebSphere MQ Intercommunication, Version 6.0, SC34-6587
WebSphere MQ Application Programming Reference Version 6.0, SC34-6596
z/OS topics
z/OS Support for Unicode: Using Unicode Services, SA22-7649
z/OS Metal C Programming Guide and Reference, SA23-2225
z/OS MVS Programming: Assembler Services Guide, SA22-7605
z/OS V1R11.0 UNIX System Services Planning, GA22-7800
z/OS V1R11.0 Language Environment Programming Reference, SA22-7562
z/OS V1R11.0 UNIX System Services Command Reference, SA22-7802
SG24-7810-00 0738433780