Handling XML Elements and Attributes

Generally, when you parse an XML document most of the processing involves elements and things related to elements, such as attributes and textual content. Elements hold most of the information in an XML document. When the NSXMLParser object traverses an element in an XML document, it sends at least three separate messages to its delegate, in the following order:

1. parser:didStartElement:namespaceURI:qualifiedName:attributes:
2. parser:foundCharacters:
3. parser:didEndElement:namespaceURI:qualifiedName:

The parser might send the parser:foundCharacters: message multiple times for one element; however, if the characters consist of nothing but white-space characters (space, new line, tab, and similar characters) the parser sends parser:foundIgnorableWhitespace: instead.
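The order of these messages can be observed with a small logging delegate. The following is a minimal sketch rather than part of the example project: the EventLogger class and the LoggedEvents function are hypothetical names, and the code assumes automatic reference counting (ARC).

```objc
#import <Foundation/Foundation.h>

// EventLogger is a hypothetical delegate that records the order of element callbacks.
@interface EventLogger : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray<NSString *> *events;
@end

@implementation EventLogger

- (instancetype)init {
    self = [super init];
    if (self) {
        _events = [NSMutableArray array];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    [self.events addObject:[NSString stringWithFormat:@"start %@", elementName]];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // May be called more than once for a single element.
    [self.events addObject:@"characters"];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    [self.events addObject:[NSString stringWithFormat:@"end %@", elementName]];
}

@end

// Parses an XML string and returns the recorded callback order.
static NSArray<NSString *> *LoggedEvents(NSString *xml) {
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    EventLogger *logger = [[EventLogger alloc] init];
    parser.delegate = logger;
    [parser parse];
    return logger.events;
}
```

For an input such as <firstName>John</firstName>, the recorded events begin with the start of the element and end with its end tag, with one or more characters entries in between.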

When you are parsing XML elements, an advanced technique you can adopt is to switch processing responsibilities among multiple delegates, each of which knows how to handle a certain type of element. For more information see Using Multiple Delegates.
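As a rough illustration of the idea, the sketch below hands each person element to a child delegate and returns control to the parent when that element ends. It is one possible arrangement under stated assumptions, not the approach taken in the example project: the PersonHandler and AddressesHandler classes and their properties are hypothetical, plain dictionaries stand in for model objects, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// PersonHandler is a hypothetical child delegate that collects the text of
// each leaf element inside one <person> element.
@interface PersonHandler : NSObject <NSXMLParserDelegate>
@property (nonatomic, weak) id<NSXMLParserDelegate> parentDelegate;
@property (nonatomic, strong) NSMutableDictionary<NSString *, NSString *> *fields;
@property (nonatomic, strong) NSMutableString *text;
@end

@implementation PersonHandler

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    self.text = [NSMutableString string];        // start fresh for each element
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.text appendString:string];             // does nothing while text is nil
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"person"]) {
        parser.delegate = self.parentDelegate;   // hand control back to the parent
        return;
    }
    if (self.text) {
        self.fields[elementName] = [self.text copy];
        self.text = nil;                         // container elements store nothing
    }
}

@end

// AddressesHandler is a hypothetical top-level delegate.
@interface AddressesHandler : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray<NSDictionary *> *people;
@property (nonatomic, strong) PersonHandler *personHandler;
@end

@implementation AddressesHandler

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    if (![elementName isEqualToString:@"person"]) return;
    if (!self.people) self.people = [NSMutableArray array];
    // The parser does not retain its delegate, so keep a strong reference here.
    self.personHandler = [[PersonHandler alloc] init];
    self.personHandler.parentDelegate = self;
    self.personHandler.fields = [NSMutableDictionary dictionary];
    [self.people addObject:self.personHandler.fields];
    parser.delegate = self.personHandler;        // switch delegates for this element
}

@end
```

Because the child delegate receives the end tag of the person element, the parent in this sketch never sees it; a design that needs the parent to react to that end tag would have to forward the message explicitly.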

Design Considerations

In an object-oriented environment such as Cocoa, a common strategy for handling elements is to map them—at the higher nesting levels, at least—to objects. Root elements and other top-level elements are frequently equivalent to collections represented in Cocoa by NSDictionary and NSArray objects. Other elements might readily map to one or more of an application’s custom model objects.

However, not all elements are best expressed as objects. Some lower-level elements, particularly “leaf” elements, are more logically viewed as properties of their parent element (if that element maps to an object). And, of course, you would probably make each attribute of an element a property (that is, an instance variable) of the corresponding object.

Notwithstanding these suggestions, there is no ready-made mapping formula, and indeed your application might not have to perform any element-to-object mapping to achieve its ends. These design decisions require some thought as well as some familiarity with the structure of the XML.

Handling an Element: An Example

The example code referred to in the following discussion processes an XML file containing personal-address information and converts that information into Address Book objects (ABPerson and ABMultiValue) that can be added to a specified user’s address database. A portion of the XML looks like the following:

Listing 1  Some of the sample XML

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE addresses SYSTEM "addresses.dtd">
<addresses owner="swilson">
    <person>
        <lastName>Doe</lastName>
        <firstName>John</firstName>
        <phone location="mobile">(201) 345-6789</phone>
        <email>jdoe@foo.com</email>
        <address>
            <street>100 Main Street</street>
            <city>Somewhere</city>
            <state>New Jersey</state>
            <zip>07670</zip>
        </address>
    </person>
 
    <!-- more person elements go here -->
 
</addresses>

Let’s look at how the first three of these elements might be handled. When the parser first encounters these elements, it invokes the delegate’s parser:didStartElement:namespaceURI:qualifiedName:attributes: method. For the first two elements, the delegate creates an equivalent object. For the third element (lastName), the delegate sets an appropriate property of the second object. Listing 2 shows the delegate’s implementation for the start tags of the first three elements.

Listing 2  Implementing parser:didStartElement:namespaceURI:qualifiedName:attributes:

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict {
 
    if ( [elementName isEqualToString:@"addresses"]) {
        // addresses is an NSMutableArray instance variable
        if (!addresses)
            addresses = [[NSMutableArray alloc] init];
        return;
    }
 
    if ( [elementName isEqualToString:@"person"] ) {
        // currentPerson is an ABPerson instance variable
        currentPerson = [[ABPerson alloc] init];
        return;
    }
 
    if ( [elementName isEqualToString:@"lastName"] ) {
        [self setCurrentProperty:kABLastNameProperty];
        return;
    }
    // .... continued for remaining elements ....
}

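The delegate identifies the element passed in (elementName), then processes it accordingly. For the addresses element, it creates the NSMutableArray object that will hold the person objects, if that array does not already exist. For the person element, it creates an ABPerson object and assigns it to the currentPerson instance variable. For the lastName element, it records kABLastNameProperty as the current property, so that later delegate methods know where the element’s text belongs.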

What matters most here is having a way (instance variables, in this case) to track the current element throughout the parser’s traversal of it. One reason is the semantics of parser:foundCharacters:, which is most likely the next delegate method invoked. This method can be invoked multiple times for the same element, so the delegate should append the characters passed in to the characters accumulated so far for the element. The NSMutableString method appendString: is useful for this purpose, as shown in Listing 3.

Listing 3  Implementing parser:foundCharacters:

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    if (!currentStringValue) {
        // currentStringValue is an NSMutableString instance variable
        currentStringValue = [[NSMutableString alloc] initWithCapacity:50];
    }
    [currentStringValue appendString:string];
}

Again the code uses an instance variable (currentStringValue) to track and gather the content for the current element. If the element content consists of nothing but white-space characters, the parser instead sends the message parser:foundIgnorableWhitespace:, which gives the delegate the opportunity to retain those characters (such as tabs or newlines).

Finally, when the parser encounters the end tag of an element, it invokes the delegation method parser:didEndElement:namespaceURI:qualifiedName:. Listing 4 presents the approach taken by the delegate in the example code.

Listing 4  Implementing parser:didEndElement:namespaceURI:qualifiedName:

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    // ignore root and empty elements
    if (( [elementName isEqualToString:@"addresses"]) ||
        ( [elementName isEqualToString:@"address"] )) return;
 
    if ( [elementName isEqualToString:@"person"] ) {
        // addresses and currentPerson are instance variables
        [addresses addObject:currentPerson];
        [currentPerson release];
        currentPerson = nil;  // avoid keeping a pointer to a released object
        return;
    }
    NSString *prop = [self currentProperty];
 
    // ... here ABMultiValue objects are dealt with ...
 
    if (( [prop isEqualToString:kABLastNameProperty] ) ||
        ( [prop isEqualToString:kABFirstNameProperty] )) {
        [currentPerson setValue:(id)currentStringValue forProperty:prop];
    }
    // currentStringValue is an instance variable
    [currentStringValue release];
    currentStringValue = nil;
}

If the delegate determines that the end tag is for the person element, it adds the ABPerson object to the addresses array and releases the ABPerson object. If the end tag is for the lastName element (for example), the delegate uses the ABRecord method setValue:forProperty: to set the appropriate property in the ABPerson object (ABRecord is the superclass of ABPerson). Finally, the instance variable holding the accumulated content for the element (currentStringValue) is released.

Handling an Attribute

The addresses element shown in the example XML in Listing 1 includes an attribute:

<addresses owner="swilson">

In this hypothetical case, the attribute allows the application parsing the XML to store the created Address Book information in a specific user directory on a multi-user system.

The NSXMLParser object presents attributes of an element to the delegate in a dictionary in the final parameter of parser:didStartElement:namespaceURI:qualifiedName:attributes:. Listing 5 shows how the delegate in the example handles the owner attribute.

Listing 5  Handling an attribute of an element

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict {
 
    if ( [elementName isEqualToString:@"addresses"]) {
        // addresses is an NSMutableArray instance variable
        if (!addresses)
            addresses = [[NSMutableArray alloc] init];
        NSString *thisOwner = [attributeDict objectForKey:@"owner"];
        if (thisOwner)
            [self setOwner:thisOwner forAddresses:addresses];
        return;
    }
    // ... continued ...
}

The delegate extracts the user name of the owner from the attributeDict dictionary using the attribute name (owner) as a key. It then invokes a private method that associates the owner with the imported Address Book data.
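The same pattern applies to attributes of other elements, such as the location attribute of the phone element in Listing 1. The sketch below is an illustration under stated assumptions rather than part of the example project: the PhoneCollector class, the PhonesInXML function, and the fallback label "unspecified" are all hypothetical, a plain dictionary stands in for Address Book objects, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// PhoneCollector is a hypothetical delegate that maps each phone element's
// location attribute to the element's text.
@interface PhoneCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableDictionary<NSString *, NSString *> *phones;
@property (nonatomic, copy) NSString *currentLocation;
@property (nonatomic, strong) NSMutableString *currentText;
@end

@implementation PhoneCollector

- (instancetype)init {
    self = [super init];
    if (self) {
        _phones = [NSMutableDictionary dictionary];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    if (![elementName isEqualToString:@"phone"]) return;
    // objectForKey: returns nil when the attribute is absent, so fall back
    // to a label chosen for this sketch.
    NSString *location = [attributeDict objectForKey:@"location"];
    self.currentLocation = location ? location : @"unspecified";
    self.currentText = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // currentText is nil outside a phone element, so this does nothing there.
    [self.currentText appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"phone"] && self.currentText) {
        self.phones[self.currentLocation] = [self.currentText copy];
        self.currentText = nil;
    }
}

@end

// Parses an XML string and returns the collected location-to-number mapping.
static NSDictionary<NSString *, NSString *> *PhonesInXML(NSString *xml) {
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    PhoneCollector *collector = [[PhoneCollector alloc] init];
    parser.delegate = collector;
    [parser parse];
    return collector.phones;
}
```

The attribute value has to be captured in the start-tag method because the attribute dictionary is not passed to parser:foundCharacters: or to the end-tag method; storing it in a property keeps it available until the element’s text is complete.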