XML Serialization Essentials

XML serialization is a great way for applications to maintain state, read and write configuration files, and transfer data between processes, applications, and enterprises over a network, including the Internet. Because XML documents are text-based, you can view and modify serialized data with a text editor.

Java's binary serialization API (whose major classes are ObjectOutputStream and ObjectInputStream) provides an infrastructure that supports data serialization into binary form. Binary data, however, is neither easily read by people nor appropriate for communication across disparate applications or systems.

WebObjects allows you to serialize objects and data into XML documents using the API defined for binary serialization. The classes NSXMLOutputStream and NSXMLInputStream extend ObjectOutputStream and ObjectInputStream, respectively. These classes use the Java API for XML Processing (JAXP) to communicate with the XML parser. See XML Parsers and XSLT Processors for more information.

As in binary serialization, an NSXMLOutputStream object writes enough data to a stream for an NSXMLInputStream object to be able to reconstruct the object graph and data that the stream represents. This includes fully qualified class names, field names, and data types. This level of verbosity is adequate for serialization and deserialization by similar systems, but may not be appropriate for data transmission between companies, for example. Transforming an Array of Movies shows you how to transform the output of NSXMLOutputStream into a simpler XML document suitable for communication among business partners.

Most of this chapter is based on Sun's Java Object Serialization Specification. If you are familiar with that document, you can just skim through the chapter. You should, however, read Application Security, as it contains information on how to set up the security manager to allow the WebObjects serialization classes to work unrestricted.

Serialization Process

To serialize objects and data you perform the following steps:

  1. Open an output stream of type java.io.OutputStream or a subclass of it.

  2. Initialize an NSXMLOutputStream with the output stream.

  3. Invoke the writeObject method to serialize objects or the appropriate write method to serialize primitive-type data (see the API documentation for the java.io.DataOutput interface for a list of primitive-data serialization methods).

  4. Close the NSXMLOutputStream, and then the underlying OutputStream.

Listing 2-1 shows an example of a method that serializes an object and an integer value.

Listing 2-1  Example of a serialization method

/**
 * Serializes an object and an integer.
 */
public void serialize() {
    // Filename of the output file.
    String filename = "/tmp/example.xml";
 
    try {
        // Create a stream to the output file.
        FileOutputStream output_stream = new FileOutputStream(filename);
 
        // Create an XML-output stream.
        NSXMLOutputStream xml_stream = new NSXMLOutputStream(output_stream);
 
        // Write the data.
        xml_stream.writeObject("Hello, World!");
        xml_stream.writeInt(5);
 
        // Close the streams.
        xml_stream.flush(); // not really needed, but doesn't hurt
        xml_stream.close();
        output_stream.close();
    }
 
    catch (IOException e) {
        e.printStackTrace();
    }
}

When an object is serialized, all the objects it refers to are also serialized. But this brings up the issue of cyclic references or multiple references to the same object. The problem is addressed by uniquely identifying each object as it is serialized. As each object is written to the output stream, its id attribute is set to a number that is unique within the XML document being generated. References to previously serialized objects use those objects' identification numbers instead of writing additional copies of them. This method is also used by object instances when referring to their class descriptions. See Serializing Custom Objects to an XML Document for an example.
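
The effect of this bookkeeping can be observed from the outside. The following is a minimal sketch, not WebObjects code: it uses the standard java.io.ObjectOutputStream and ObjectInputStream (the superclasses of the XML streams) with an in-memory buffer so that it runs without WebObjects. The SharedReferenceSketch and Node names are hypothetical. Substituting NSXMLOutputStream and NSXMLInputStream is expected to behave the same way, since they follow the same serialization model, but that is an assumption not verified here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical demo class: round-trips a small cyclic object graph in memory. */
final class SharedReferenceSketch {

    /** Hypothetical node type; 'next' can point back to form a cycle. */
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Node next;

        Node(String name) {
            this.name = name;
        }
    }

    /** Serializes the given graph to a byte array and reads it back. */
    static Object roundTrip(Object graph) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(graph);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    /** Builds the cycle a -> b -> a, then serializes an array that refers to 'a' twice. */
    static Node[] buildAndCopy() {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a; // cyclic reference
        return (Node[]) roundTrip(new Node[] { a, a }); // same object twice
    }

    public static void main(String[] args) {
        Node[] copy = buildAndCopy();
        // Both array slots refer to one restored object, not two copies.
        System.out.println(copy[0] == copy[1]);
        // Following the cycle leads back to the starting node.
        System.out.println(copy[0].next.next == copy[0]);
    }
}
```

Both comparisons use reference equality on purpose: the point is that the restored graph has the same shape as the original, not merely equal values.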

Deserialization Process

To deserialize data from an untransformed XML stream encoded with NSXMLOutputStream, you perform the following steps:

  1. Open an input stream of type java.io.InputStream or a subclass of it.

  2. Initialize an NSXMLInputStream with the input stream.

  3. Invoke the readObject method to deserialize objects or the appropriate read method to deserialize primitive-type data (see the API documentation for the java.io.DataInput interface for a list of primitive-data deserialization methods).

  4. Close the NSXMLInputStream, and then the underlying InputStream.

Listing 2-2 shows an example of a method that deserializes an object and an integer value.

Listing 2-2  Example of a deserialization method

/**
 * Deserializes an object and an integer.
 */
public void deserialize() {
    // Filename of the input file.
    String filename = "/tmp/example.xml";
 
    try {
        // Create a stream from the input file.
        FileInputStream input_stream = new FileInputStream(filename);
 
        // Create an XML-input stream.
        NSXMLInputStream xml_stream = new NSXMLInputStream(input_stream);
 
        // Read the data.
        String theString = (String) xml_stream.readObject(); // readObject returns Object
        int theInt = xml_stream.readInt();
 
        // Close the streams.
        xml_stream.close();
        input_stream.close();
    }
 
    // FileNotFoundException is a subclass of IOException, so it must be caught first.
    catch (FileNotFoundException e) {
        e.printStackTrace();
    }
 
    catch (IOException e) {
        e.printStackTrace();
    }
 
    catch (ClassNotFoundException e) {
        e.printStackTrace();
    }
}

When you deserialize an object, the original object graph is recreated by restoring the values of nontransient and nonstatic fields. Objects referred to in the original object graph are restored recursively. After deserializing an object with transient or static fields, you must set those fields to the appropriate values. See Validation of Deserialized Data and Secure Serialization for more information.
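
One common place to reset a transient field is the class's own readObject method. The sketch below illustrates this with the standard java.io streams and an in-memory buffer, so that it runs without WebObjects; the TransientRestoreSketch and Order names are hypothetical, and the same pattern is assumed, not verified, to apply when the stream is an NSXMLInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical demo class: recomputes a transient field during deserialization. */
final class TransientRestoreSketch {

    /** Hypothetical class: 'total' is derived data, so it is not serialized. */
    static final class Order implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int quantity;
        private final int unitPrice;
        private transient int total; // excluded from the stream

        Order(int quantity, int unitPrice) {
            this.quantity = quantity;
            this.unitPrice = unitPrice;
            this.total = quantity * unitPrice;
        }

        int total() {
            return total;
        }

        /** Restores the nontransient fields, then recomputes the transient one. */
        private void readObject(ObjectInputStream stream)
                throws IOException, ClassNotFoundException {
            stream.defaultReadObject();
            total = quantity * unitPrice;
        }
    }

    /** Serializes an order to a byte array and reads it back. */
    static Order roundTrip(Order order) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(order);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Order) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Order(3, 5)).total());
    }
}
```

Without the readObject method, the restored object's total field would hold the default value for its type (0) instead of the recomputed product.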

You may want to have the parser validate source documents before deserializing objects; this is helpful in debugging and when transferring data across a network, such as an intranet or the Internet. However, you incur a performance penalty when the parser validates the documents it processes. To turn on parser validation, set the NSXMLValidation system property to true. As a general rule, you should turn on validation during application development and turn it off in deployed applications.

Secure Serialization

When you deserialize an object, its private state is restored. To protect sensitive data you may have to remove certain fields, or entire classes, from the serialization and deserialization processes. The simplest way to exclude a field is to declare it transient, so that it is never written to the stream. Listing 2-3 shows an example of a class with a transient field.

To prevent serialization of a class altogether, the class must not implement the java.io.Serializable or java.io.Externalizable interfaces. In subclasses of classes that implement those interfaces, the writeObject and readObject methods can throw a NotSerializableException instead.

Listing 2-3  Example of a secure class

/**
 * Encapsulates secret data.
 */
public class Secret extends Object implements Serializable {
    private transient String details;    // do not serialize
    private int id;
 
    /**
     * Creates a Secret object.
     *
     * @param id            identification
     * @param details       sensitive information
     */
    Secret(int id, String details) {
        super();
 
        this.id = id;
        this.details = details;
    }
 
    /**
     * Gets this secret's id.
     *
     * @return secret id.
     */
    public int id() {
        return this.id;
    }
 
    /**
     * Gets this secret's details.
     *
     * @return secret details.
     */
    public String details() {
        return this.details;
    }
}

Listing 2-4 shows a class that extends a serializable class, but inhibits instances from being serialized or deserialized.

Listing 2-4  Example of a class that disallows serialization and deserialization by throwing NotSerializableException

/**
 * This class must inhibit serialization and deserialization
 * of its instances.
 */
public class SuperSecret extends GeneralInfo {
    ...
 
    /**
     * Prevents deserialization.
     */
    private void readObject(ObjectInputStream stream) throws IOException,
                    ClassNotFoundException {
        throw new java.io.NotSerializableException("SuperSecret");
    }
 
    /**
     * Prevents serialization.
     */
    private void writeObject(ObjectOutputStream stream) throws IOException {
        throw new java.io.NotSerializableException("SuperSecret");
    }
}
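
To see the technique of Listing 2-4 in action, the following sketch attempts to serialize an instance of such a subclass and reports whether the stream rejected it. It is a sketch under stated assumptions: GeneralInfo here is a hypothetical stand-in with a single field, InhibitSerializationSketch and isRejected are illustrative names, and the standard java.io.ObjectOutputStream is used so that the code runs without WebObjects.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical demo class: a subclass that refuses serialization. */
final class InhibitSerializationSketch {

    /** Hypothetical serializable superclass (stand-in for GeneralInfo). */
    static class GeneralInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        String title = "public";
    }

    /** Subclass that refuses to be serialized or deserialized. */
    static class SuperSecret extends GeneralInfo {
        private static final long serialVersionUID = 1L;

        private void writeObject(ObjectOutputStream stream) throws IOException {
            throw new NotSerializableException("SuperSecret");
        }

        private void readObject(ObjectInputStream stream)
                throws IOException, ClassNotFoundException {
            throw new NotSerializableException("SuperSecret");
        }
    }

    /** Returns true when the stream rejects the object with NotSerializableException. */
    static boolean isRejected(Object candidate) {
        try (ObjectOutputStream out =
                new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return false;
        } catch (NotSerializableException e) {
            return true;
        } catch (IOException e) {
            throw new IllegalStateException("Unexpected I/O failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isRejected(new GeneralInfo())); // superclass serializes normally
        System.out.println(isRejected(new SuperSecret())); // subclass is rejected
    }
}
```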

Validation of Deserialized Data

Sometimes, especially when deserializing objects with transient or static fields, you may want to validate an object before it is returned to the method that invoked readObject. To do that, you invoke the registerValidation method to tell the ObjectInputStream which object to notify when the deserialized object graph has been restored, but before readObject returns. The callback method is named validateObject. If the object's data is invalid, validateObject throws an InvalidObjectException. For more information, see the API documentation on java.io.ObjectInputStream, java.io.ObjectInputValidation, java.io.InvalidObjectException, and com.webobjects.foundation.xml.NSXMLInputStream.

Listing 2-5 shows an example of a class that validates the data of an object using validateObject. In this case, the validation code is contained in the class of the object being deserialized, but this need not be the case. You may instead choose to have a validation class that contains all XML-document validation logic.

Listing 2-5  Example of a class that validates deserialized data

import java.io.InvalidObjectException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.sql.Timestamp;
 
/**
 * Manages movie information.
 */
public class ValidMovie extends Object implements ObjectInputValidation, Serializable {
    ...
 
    /**
     * Serializes this object.
     *
     * @param stream    object stream to serialize this object to
     */
    private void writeObject(ObjectOutputStream stream) throws IOException {
        ...
    }
 
    /**
     * Deserializes this object.
     *
     * @param stream   object stream from which the serialized data
     *                 is obtained
     */
    private void readObject(ObjectInputStream stream) throws IOException,
                    ClassNotFoundException {
        ...
    }
 
    /**
     * Validates a deserialized ValidMovie object.
     *
     * @throws InvalidObjectException when the deserialized ValidMovie
     *         is not valid.
     */
    public void validateObject() throws InvalidObjectException {
        // Determine validity of this object.
        boolean valid = someValidationMethod();
 
        if (!valid) {
            throw new InvalidObjectException("Deserialized ValidMovie object contains invalid data.");
        }
    }
}
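
Listing 2-5 elides the body of readObject, which is where registerValidation is normally invoked. The sketch below fills in that step for a much smaller, hypothetical class (Rating, inside ValidationSketch). It uses the standard java.io streams with an in-memory buffer so that it runs without WebObjects, and it is an illustration of the callback mechanism, not the ValidMovie implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical demo class: registers a validation callback in readObject. */
final class ValidationSketch {

    /** Hypothetical value class that validates itself after deserialization. */
    static final class Rating implements Serializable, ObjectInputValidation {
        private static final long serialVersionUID = 1L;
        private final int stars;

        Rating(int stars) {
            this.stars = stars; // deliberately unchecked, so bad data can be serialized
        }

        private void readObject(ObjectInputStream stream)
                throws IOException, ClassNotFoundException {
            // Ask the stream to call validateObject() once the whole graph is restored.
            stream.registerValidation(this, 0);
            stream.defaultReadObject();
        }

        @Override
        public void validateObject() throws InvalidObjectException {
            if (stars < 0 || stars > 5) {
                throw new InvalidObjectException("stars out of range: " + stars);
            }
        }
    }

    /** Returns true if the object survives a round trip, false if validation rejects it. */
    static boolean survivesRoundTrip(Rating rating) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(rating);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
                return true;
            }
        } catch (InvalidObjectException e) {
            return false; // thrown by validateObject before readObject returned
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(survivesRoundTrip(new Rating(4)));
        System.out.println(survivesRoundTrip(new Rating(9)));
    }
}
```

The second argument to registerValidation is a priority; callbacks with higher priorities are invoked first, which matters only when several objects in the graph register themselves.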

Multiple Class Version Support

Both binary serialization in Java and XML serialization in WebObjects allow you to support more than one version of the same class for serialization and deserialization.

When dealing with multiple versions of a class, you must keep the class's identity in mind. Classes are identified by their name and API. For versioning to succeed, you must ensure that the changes you make when creating a new version of a class are compatible with the previous version. In other words, the new class's API must be a superset of the API defined in the previous version.

You can address versioning by implementing and maintaining writeObject and readObject in a class. However, binary serialization and, by extension, XML serialization provide facilities for the automatic management of multiple versions of an evolving, serializable class. In particular, binary and XML serialization provide support for bidirectional communication between class versions. This means that a class can read data serialized by an earlier version of itself. It also allows a class to write a stream from which an instance of a previous version can be successfully created.

When a later version of a class adds fields to the class, you need to initialize only the added fields when deserializing data from a stream created with the previous version of the class. However, when the new version changes field usage and you need to map fields of the new version to fields of the old version or perform conversions on existing fields, you can take advantage of the ObjectStreamField class. See "Advanced Object Serialization," located at http://developer.java.sun.com/developer/technicalArticles/ALT/index.html for details.
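
As an illustration of field mapping with ObjectStreamField, the sketch below shows a hypothetical "version 2" class that keeps a single fullName field in memory while continuing to read and write the stream layout of an imagined version 1 (firstName and lastName). The class and field names are invented for this example, and the standard java.io streams are used so that the code runs without WebObjects; how NSXMLOutputStream renders fields written this way is not covered here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

/** Hypothetical demo class: maps new in-memory fields onto an older stream layout. */
final class FieldMappingSketch {

    /** Hypothetical "version 2" class that preserves the version 1 stream layout. */
    static final class Person implements Serializable {
        private static final long serialVersionUID = 1L;

        /** Stream layout carried over from the imagined version 1. */
        private static final ObjectStreamField[] serialPersistentFields = {
            new ObjectStreamField("firstName", String.class),
            new ObjectStreamField("lastName", String.class)
        };

        private String fullName; // in-memory form; not itself part of the stream layout

        Person(String fullName) {
            this.fullName = fullName;
        }

        String fullName() {
            return fullName;
        }

        /** Splits fullName into the two fields the older layout expects. */
        private void writeObject(ObjectOutputStream stream) throws IOException {
            int space = fullName.indexOf(' ');
            ObjectOutputStream.PutField fields = stream.putFields();
            fields.put("firstName", space < 0 ? fullName : fullName.substring(0, space));
            fields.put("lastName", space < 0 ? "" : fullName.substring(space + 1));
            stream.writeFields();
        }

        /** Rebuilds fullName from the two fields found in the stream. */
        private void readObject(ObjectInputStream stream)
                throws IOException, ClassNotFoundException {
            ObjectInputStream.GetField fields = stream.readFields();
            String first = (String) fields.get("firstName", "");
            String last = (String) fields.get("lastName", "");
            fullName = last.isEmpty() ? first : first + " " + last;
        }
    }

    /** Serializes a person to a byte array and reads it back. */
    static Person roundTrip(Person person) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(person);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Person) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }
}
```

Keeping serialVersionUID unchanged between versions is what lets the two versions treat each other's streams as the same class; the second argument to GetField.get supplies a default for fields missing from an older stream.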

Table 2-1 lists compatible and incompatible changes for new class versions. It summarizes the information provided in Sun's Java Object Serialization Specification.

Table 2-1  Compatible and incompatible changes for new class versions

Change | Compatible | Incompatible
Adding fields, or changing a field from transient to nontransient or static to nonstatic. | x |
Adding classes or implementing java.io.Serializable. | x |
Removing classes or removing extends Serializable from a class declaration. | x |
Adding writeObject and readObject methods. | x |
Removing writeObject and readObject methods. | x |
Changing a field's access modifier. | x |
Deleting fields. | | x
Modifying the class hierarchy. | | x
Changing a field from nontransient to transient or nonstatic to static. | | x
Changing the type of a field. | | x
Changing writeObject so that it no longer writes default field data. | | x
Changing readObject so that it reads default field data when the previous version does not write default field data. | | x
Changing a class from Serializable to Externalizable or from Externalizable to Serializable. | | x

Serialization With Keys

WebObjects XML serialization provides a useful feature: the ability to serialize objects and data with keys. To use this feature you add an additional argument to writeObject and write method invocations: the key, which is a String object.

Adding keys to your XML documents can help in performing useful transformations; that is, you can use the keys in the source document to create the elements in the target document. For example, the element <int>32</int> (created by executing writeInt(32)) provides no information about the integer 32. However, if you use writeInt(32, "age") to serialize the value, a transformation script can use the additional information about the datum to create the element <age>32</age>. See Transforming Primitive-Type Values Using Keys for details.
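
The kind of transformation described above can be sketched with JAXP and a small XSLT stylesheet. Note the assumptions: the input element form <int key="age">32</int> (a key attribute on an element with no namespace) is hypothetical and is used only for illustration, since the exact markup that NSXMLOutputStream emits is not shown in this chapter; KeyTransformSketch and its method are likewise invented names. A real stylesheet would have to match the actual element names, attributes, and namespace of the serialized document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical demo class: renames keyed elements after their keys. */
final class KeyTransformSketch {

    // Stylesheet: replace each <int> element that carries a key attribute
    // (assumed attribute name) with an element named after the key's value.
    private static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"int[@key]\">"
      + "<xsl:element name=\"{@key}\"><xsl:value-of select=\".\"/></xsl:element>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML text and returns the result. */
    static String transform(String sourceXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter result = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(sourceXml)),
                                  new StreamResult(result));
            return result.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical keyed input; the target document names the element "age".
        System.out.println(transform("<int key=\"age\">32</int>"));
    }
}
```

Without a key, the stylesheet has nothing to derive an element name from, which is the point the paragraph above makes about <int>32</int>.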

Application Security

Generally, security-minded environments run Sun's security manager to protect their systems from potentially damaging activities by malicious applications. The security manager is disabled by default. You activate the security manager by adding

-Djava.security.manager

to the command line when launching the application manually or to the application project's Properties file, located in the Resources group. For more information on the security manager, see Security in Java 2 SDK 1.2, located at http://java.sun.com/docs/books/tutorial/index.html.

If you use the security manager, you must add the policy shown in Listing 2-6 for Mac OS X systems or Listing 2-7 for Windows systems to the policy file for XML serialization to work correctly in WebObjects applications. Pay special attention to the lines that deal with java.net.SocketPermission, as they are required when the NSXMLValidation property is set to true.

Listing 2-6  Security-manager policies required for XML serialization in WebObjects for Mac OS X

grant codeBase "file:/System/Library/Frameworks/JavaFoundation.framework/Resources/Java/javafoundation.jar"
{
permission java.io.SerializablePermission "enableSubclassImplementation";
permission java.lang.RuntimePermission "XMLSerializationAccess";
permission java.lang.RuntimePermission "accessDeclaredMembers";
permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
 
// General permissions required to read configuration files, system properties, and so on.
permission java.io.FilePermission "<<ALL FILES>>", "read";
permission java.util.PropertyPermission "*", "read, write";
 
// If the NSXMLValidation property is set to true, uncomment the following line.
// permission java.net.SocketPermission "www.w3.org", "connect, resolve";
};
 
grant codeBase "file:/System/Library/Frameworks/JavaXML.framework/Resources/Java/javaxml.jar"
{
// General permissions required to read configuration files, system properties, and so on.
permission java.io.FilePermission "<<ALL FILES>>", "read, write";
 
// Required by Xalan during transformation.
permission java.util.PropertyPermission "user.dir", "read";
 
// If the NSXMLValidation property is set to true, uncomment the following line.
// permission java.net.SocketPermission "www.w3.org", "connect, resolve";
};

Listing 2-7  Security-manager policies required for XML serialization in WebObjects for Windows

grant codeBase "file:/C:/Apple/Library/Frameworks/JavaFoundation.framework/Resources/Java/javafoundation.jar"
{
permission java.io.SerializablePermission "enableSubclassImplementation";
permission java.lang.RuntimePermission "XMLSerializationAccess";
permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
permission java.lang.RuntimePermission "accessDeclaredMembers";
 
// General permissions required to read configuration files, system properties, and so on.
permission java.io.FilePermission "<<ALL FILES>>", "read";
permission java.util.PropertyPermission "*", "read, write";
 
// If the NSXMLValidation property is set to true, uncomment the following line.
// permission java.net.SocketPermission "www.w3.org", "connect, resolve";
};
 
grant codeBase "file:/C:/Apple/Library/Frameworks/JavaXML.framework/Resources/Java/javaxml.jar"
{
// General permissions required to read configuration files, system properties, and so on.
permission java.io.FilePermission "<<ALL FILES>>", "read, write";
 
// Required by Xalan during transformation.
permission java.util.PropertyPermission "user.dir", "read";
 
// If the NSXMLValidation property is set to true, uncomment the following line.
// permission java.net.SocketPermission "www.w3.org", "connect, resolve";
};