XML Transformation

Serializing objects and data into XML documents is a great way of sharing information between applications within an organization. However, communicating that data between companies can be difficult. For example, you can serialize an NSArray containing InventoryItem objects into an XML document and send that document to your business partners over the Internet. But, unless your business partners are also running WebObjects (in fact, they would have to be running the same version of WebObjects that you are running), they will find it difficult to make use of the document. Of course, they can create an XSLT stylesheet that transforms your XML document into a format that they can use, but you can make their job easier by doing the transformation yourself.

Because your application generates XML documents, you're in an excellent position for converting serialized-data documents into a standard format that the recipients of your documents can use. If you're comfortable with XSL Transformations (XSLT), you can create an XSLT file that WebObjects can use to transform the output of XML serialization into other formats.

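While this document does not teach XSLT, this chapter gives you an overview of the transformation process. It covers the structure of serialized data in WebObjects, XSL Transformations, XML parsers and XSLT processors, and serialization and transformation performance.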

Structure of Serialized Data in WebObjects

The structure of the XML documents created by the WebObjects XML serialization process is described by the woxml.xsd and woxml.dtd files, which are listed in The woxml.dtd file. Figure 4-1 illustrates the structure that the files define, while Listing 4-1 shows an example of a target document.

Figure 4-1  Diagram of the schema for WebObjects XML serialization

Listing 4-1  Example of a target document

<?xml version="1.0" encoding="UTF-8"?>
<content xmlns="http://www.apple.com/webobjects/XMLSerialization"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.apple.com/webobjects/XMLSerialization
         http://www.apple.com/webobjects/5.2/schemas/woxml.xsd">
    <int>5</int>
    <boolean>true</boolean>
    <ch>u</ch>
    <double>3.14</double>
</content>
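
To make the structure concrete, the following minimal Java sketch reads a document shaped like Listing 4-1 with a namespace-aware JAXP parser and lists the child elements of content. The class name ReadSerializedContent and the describe method are hypothetical names chosen for illustration; they are not part of WebObjects. The sample string omits the xsi attributes for brevity.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not a WebObjects class.
class ReadSerializedContent {
    // Namespace used by the content element in Listing 4-1.
    static final String WOXML_NS = "http://www.apple.com/webobjects/XMLSerialization";

    // Same elements as Listing 4-1, without the schema-location attributes.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<content xmlns=\"" + WOXML_NS + "\">"
        + "<int>5</int><boolean>true</boolean><ch>u</ch><double>3.14</double>"
        + "</content>";

    // Returns one "name=value" entry per child element of the root element.
    static List<String> describe(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to see namespace URIs and local names
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<String> entries = new ArrayList<>();
            for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element && WOXML_NS.equals(n.getNamespaceURI())) {
                    entries.add(n.getLocalName() + "=" + n.getTextContent());
                }
            }
            return entries;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(SAMPLE)); // prints [int=5, boolean=true, ch=u, double=3.14]
    }
}
```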

XSL Transformations

This document does not teach you XSL Transformations (XSLT). There are several books available on the subject that explain the specification and different implementations of it in detail. However, this section explains some segments of the SimpleTransformation.xsl script used in this document's transformation-example project. You can find the entire listing of the transformation script in Listing B-3.

XSLT is a declarative language. This means that the transformation of an XML document is expressed as a set of rules or templates that are applied to elements of the source document to create elements of the target document. For example, you can specify a rule that changes every date element in a document to an invoice_date element.
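
As a rough illustration of such a rule, the sketch below applies a small stylesheet through the JAXP transformation API. The stylesheet is a hypothetical one written for this example (it is not part of SimpleTransformation.xsl): an identity rule copies everything unchanged, and a second rule renames date elements to invoice_date. The class name RenameDateRule and the order and total element names are also assumptions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class, not part of the transformation-example project.
class RenameDateRule {
    // Two rules: copy every node as-is, except date elements, which are renamed.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match=\"date\">"
        + "<invoice_date><xsl:apply-templates select=\"@*|node()\"/></invoice_date>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies STYLESHEET to the given XML text and returns the result as a string.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // The date element comes out as invoice_date; total is copied unchanged.
        System.out.println(transform("<order><date>2003-01-15</date><total>10</total></order>"));
    }
}
```

The rule for date wins over the identity rule because a template that matches an element by name has a higher default priority than one that matches node().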

Listing 4-2 shows the segment of SimpleTransformation.xsl that processes woxml:object elements.

Listing 4-2  Section of SimpleTransformation.xsl that processes woxml:object elements

<!-- Processes woxml:object elements. -->
<xsl:template name="process_object" match="woxml:object">
    <!-- extract class name -->
    <xsl:variable name="className">
        <xsl:value-of select="woxml:class/@name" />
    </xsl:variable>
 
    <!-- get base class name -->
    <xsl:variable name="class">
        <xsl:call-template name="basename">
            <xsl:with-param name="path" select="$className"/>
        </xsl:call-template>
    </xsl:variable>
 
    <!-- determine the element name -->
    <xsl:variable name="tag">
        <xsl:choose>
            <xsl:when test="$class='NSDictionary' or
                     $class='NSMutableDictionary'">
                <xsl:value-of select="'dictionary'" />
            </xsl:when>
            <xsl:when test="$class='NSArray' or $class='NSMutableArray'">
                <xsl:value-of select="'array'" />
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="$class" />
            </xsl:otherwise>
        </xsl:choose>
    </xsl:variable>
 
    <!-- create the element -->
    <xsl:element name="{$tag}">
        <xsl:choose>
            <xsl:when test="$class='NSDictionary' or
                     $class='NSMutableDictionary'">
                <xsl:call-template name="process_dictionary" />
            </xsl:when>
            <xsl:otherwise>
                <xsl:call-template name="process_object_content" />
            </xsl:otherwise>
        </xsl:choose>
    </xsl:element>
</xsl:template>

Here's an explanation of the four commented steps inside the template, in order:

  1. Gets the class name that the object represents and stores it in a variable called className. The rule gets the class name from the name attribute of the woxml:class element of woxml:object.

  2. Calls a utility template that extracts the base class name from the fully qualified class name. This base class name is stored in the class variable.

  3. Determines the name of the element in the target document that corresponds to the woxml:object element of the source document. The element name is dictionary (when class is 'NSDictionary' or 'NSMutableDictionary'), array (when class is 'NSArray' or 'NSMutableArray'), or the name of the base class that the woxml:object element contains.

  4. Creates the element and its contents by invoking one of two templates: process_dictionary or process_object_content. The process_dictionary template creates a dictionary element in the target document using either two arrays (one for the keys and another for the values) or a set of item elements, each containing a key and a value element.
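
The naming logic in steps 2 and 3 can be sketched in Java as follows. This is only an illustration of the decision the template makes, not code from the example project. It assumes that the basename template (not shown in Listing 4-2) keeps the text after the last period of the fully qualified class name; the class name TargetTagName and the method names are hypothetical.

```java
// Hypothetical illustration of steps 2 and 3; not part of WebObjects.
class TargetTagName {
    // Assumption: the base name is whatever follows the last '.' in the class name.
    static String basename(String qualifiedName) {
        return qualifiedName.substring(qualifiedName.lastIndexOf('.') + 1);
    }

    // Maps a serialized class name to the element name used in the target document.
    static String tagFor(String className) {
        String base = basename(className);
        switch (base) {
            case "NSDictionary":
            case "NSMutableDictionary":
                return "dictionary";
            case "NSArray":
            case "NSMutableArray":
                return "array";
            default:
                return base; // any other class keeps its base name
        }
    }

    public static void main(String[] args) {
        System.out.println(tagFor("com.webobjects.foundation.NSMutableArray")); // prints array
        System.out.println(tagFor("com.webobjects.foundation.NSDictionary"));   // prints dictionary
        System.out.println(tagFor("InventoryItem"));                            // prints InventoryItem
    }
}
```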

For more details on transforming XML documents using SimpleTransformation.xsl, see Transforming XML Documents. To learn XSLT, check out XSLT (published by O'Reilly) or XSLT Programmer's Reference (published by Wrox Press).

XML Parsers and XSLT Processors

An XML parser is software that allows you to read and write XML documents. An XSLT processor (also known as a transformer) converts an XML document into another document, whose format can be XML, HTML, PDF, or any other format supported by the transformer. Some parsers, such as Microsoft's MSXML3, can also transform XML documents.

One of a parser's duties is to validate the input document, to make sure that it's well formed and that its contents conform to the document's XML Schema file or DTD file. The WebObjects XML Schema files are listed in The woxml.dtd file. In WebObjects the source document is not validated by default; however, you can turn validation on to debug an application.
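
The difference between a well-formed and a malformed document can be seen with a plain JAXP parser. The sketch below is a minimal example that checks well-formedness only, with validation left off; it does not show how WebObjects configures its own parser. The class name WellFormedCheck is hypothetical. In JAXP, calling setValidating(true) on the factory is what requests DTD validation.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not a WebObjects class.
class WellFormedCheck {
    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(false); // well-formedness only; no DTD validation
            factory.newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // the parser reported a fatal error, such as a mismatched tag
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser could not be set up or read its input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<content><int>5</int></content>")); // prints true
        System.out.println(isWellFormed("<content><int>5</content>"));       // prints false
    }
}
```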

WebObjects uses the Java API for XML Processing (JAXP), implemented in the javax.xml.parsers and javax.xml.transform packages (including javax.xml.transform.sax, javax.xml.transform.dom, and javax.xml.transform.stream) to instantiate and communicate with the XML parser and XSLT transformer. This allows you to install your preferred parser and transformer for use by your applications. See the API documentation of those packages for additional details. You can also consult Sun's JAXP tutorial, located at https://jaxp.dev.java.net/.

A standard WebObjects installation includes the Xerces XML parser and the Xalan XSLT processor. However, thanks to JAXP, you can use other parsers and processors if you wish. Just install the pertinent JAR files on your computer, make sure that they are in the Java classpath, and set the javax.xml.parsers.SAXParserFactory system property to the name of the class that implements the parser factory. For example, if the JAR file for the Crimson parser is in the classpath, you would add the following line to the Properties file of the application project (which you can find under the Resources group) or to the command line to set the property's value:

-D"javax.xml.parsers.SAXParserFactory=org.apache.crimson.jaxp.SAXParserFactoryImpl"

Keep in mind that if you have two parser-factory classes in your classpath, the parser that your application actually uses may not be the one you want. The parser that is loaded last is the one that the application uses. The same applies to the system properties javax.xml.transform.TransformerFactory and javax.xml.parsers.DocumentBuilderFactory: The application that is loaded last determines the system-wide values of these properties.
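
One way to find out which implementations are actually in effect is to ask JAXP for each factory and print its class name. The sketch below is a small diagnostic along those lines; the class name JaxpFactoryReport is hypothetical, and the names it prints depend on your Java installation and classpath.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic class, not part of WebObjects.
class JaxpFactoryReport {
    static String saxFactoryClass() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    static String documentBuilderFactoryClass() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    static String transformerFactoryClass() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // A null property value means the property is not set and JAXP
        // falls back to its normal lookup procedure.
        System.out.println("javax.xml.parsers.SAXParserFactory = "
            + System.getProperty("javax.xml.parsers.SAXParserFactory"));
        System.out.println("SAX parser factory in use:       " + saxFactoryClass());
        System.out.println("Document builder factory in use: " + documentBuilderFactoryClass());
        System.out.println("Transformer factory in use:      " + transformerFactoryClass());
    }
}
```

Running this once with and once without the -D option shown earlier is a quick way to confirm that the setting took effect.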

Serialization and Transformation Performance

XML serialization is slower than binary serialization because data is converted to XML code while objects are serialized. XML deserialization is slower than binary deserialization because XML documents need to be parsed before their contents can be deserialized. However, the actual speed at which data is serialized and deserialized is highly dependent on disk and network throughput.
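
Because throughput matters, the size of the serialized output is worth measuring alongside processing time. The sketch below compares the number of bytes produced by binary and XML encodings of the same list. It uses the JDK's ObjectOutputStream and java.beans.XMLEncoder as stand-ins, not the WebObjects serialization classes, so the figures only hint at the general trade-off. The class name SerializationSizeComparison is hypothetical.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

// Hypothetical comparison using JDK stand-ins, not WebObjects classes.
class SerializationSizeComparison {
    // Builds a list of the integers 0..count-1.
    static ArrayList<Integer> sampleData(int count) {
        ArrayList<Integer> data = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            data.add(i);
        }
        return data;
    }

    // Number of bytes produced by Java binary serialization.
    static int binarySize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException("Binary serialization failed", e);
        }
        return buffer.size();
    }

    // Number of bytes produced by the JDK's XML bean encoder.
    static int xmlSize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (XMLEncoder out = new XMLEncoder(buffer)) {
            out.writeObject(value);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        ArrayList<Integer> data = sampleData(1000);
        System.out.println("binary bytes: " + binarySize(data));
        System.out.println("XML bytes:    " + xmlSize(data));
    }
}
```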

To maximize the performance of XML serialization and deserialization in WebObjects, leave XML validation turned off (the default). You turn XML validation on or off by setting the NSXMLValidation property on the command line or in the Properties file:

-DNSXMLValidation=<true|false>

XML-parsing technology should improve over time. In addition, as mentioned in XML Parsers and XSLT Processors, WebObjects uses JAXP to ensure that a standard API is used to communicate with the parser. This allows you to install and use parsers as they become available.