
The delivery of web content is being revolutionized by a new technique known as syndication. The most common format for syndication is RSS, or Really Simple Syndication, an XML (eXtensible Markup Language) format for coordinating the
delivery of time-based content streams, or "feeds." This means that RSS can be
used to deliver content that changes over time. RSS provides for the inclusion
of additional data, similar to email attachments, using the <enclosure>
tag.
In Mac OS X Tiger, Apple has added and expanded support for RSS content in
several applications, including the Safari browser, iTunes, and iWeb. Podcasts,
as used in iTunes and iWeb, are a form of RSS enclosure, and Safari has arguably
the best browser support for RSS content. Through autodiscovery, an RSS client
can notify a user of an available feed (for example, the RSS badge in
Safari's address bar), and also subscribe on the user's behalf. In addition, a
number of tools are available for developers and publishers that simplify the
process of generating, validating, and reading RSS feeds. We will explore each of these
topics in this article.
Note that there are several different interpretations of the
RSS acronym. "Really Simple Syndication" is one such interpretation. Others
include "RDF Site Summary" and "Rich Site Summary". RDF stands for "Resource
Description Framework."
The basic structure of an RSS feed is illustrated in Listing 1. The outer container is an <rss> tag that encloses a <channel>. The channel contains elements, or tags, that collectively define the feed properties, much the way that a header contains metadata about an email message. Following the channel elements are items that define the actual feed content. Each item defines a unique entry, and a feed can contain multiple items. An <item> might define a headline, image, or audio file. The closing </channel> and </rss> tags follow the items (all tags must be closed for the XML to be well-formed). Not all of the tags included in this example are required. You can find more RSS syntax examples in the references listed at the end of this article.
Listing 1: RSS Feed Structure
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<!-- channel metadata -->
<title>ADC Feed</title>
<link>http://developer.apple.com/feed.rss</link>
<description>ADC headlines, new sample code, and docs.</description>
<!-- channel content -->
<item>
<!-- item content -->
<title>Intro To RSS</title>
<description>This article provides an overview of RSS. The enclosure references the podcast.</description>
<enclosure url="http://developer.apple.com/rss.mp3" length="123456" type="audio/mpeg" />
</item>
</channel>
</rss>
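To see how a client might read the structure in Listing 1, here is a minimal sketch in Python using the standard library's ElementTree parser. The function name summarize_feed and the sample data are illustrative assumptions, not part of any Apple tool; a production reader would also need to cope with feeds that are not well-formed.

```python
import xml.etree.ElementTree as ET

# Sample feed modeled on Listing 1 (contents are illustrative).
FEED_XML = b"""<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title>ADC Feed</title>
<link>http://developer.apple.com/feed.rss</link>
<description>ADC headlines, new sample code, and docs.</description>
<item>
<title>Intro To RSS</title>
<description>This article provides an overview of RSS.</description>
<enclosure url="http://developer.apple.com/rss.mp3" length="123456" type="audio/mpeg" />
</item>
</channel>
</rss>"""


def summarize_feed(xml_bytes):
    """Return the channel metadata and a list of items from an RSS 2.0 document."""
    root = ET.fromstring(xml_bytes)
    channel = root.find("channel")
    if root.tag != "rss" or channel is None:
        raise ValueError("not an RSS 2.0 document")

    items = []
    for item in channel.findall("item"):
        enclosure = item.find("enclosure")
        items.append({
            "title": item.findtext("title", default=""),
            # An item may have no enclosure at all.
            "enclosure_url": enclosure.get("url") if enclosure is not None else None,
        })

    return {
        "title": channel.findtext("title", default=""),
        "link": channel.findtext("link", default=""),
        "items": items,
    }


feed = summarize_feed(FEED_XML)
print(feed["title"], "has", len(feed["items"]), "item(s)")
```

The channel metadata and the items come back as plain dictionaries, which keeps the sketch independent of any particular reader application.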
Thinking in Streams
Whether you are a web publisher or an application developer, time-based
streams represent the current trend in providing on-demand content. What is a
"time-based stream"? It is information that:
- Remains valid for a certain period of time; and
- Is consumed as it is delivered. See the Wikipedia definition of Streaming media and other topics.
How does RSS relate to time-based streams? Blogging techniques provide a good example. In a static model, which does not use RSS, you might post your blog updates to a web server on a daily or weekly basis. Interested users will navigate to your site using a browser or blog reader software and read your enlightening observations on the state of the world, best programming practices, and so on. But manually checking for updates takes time: each user has to navigate back to your site just to find out whether you have posted anything new.
A more effective approach is to publish your blog updates via an RSS feed,
and allow subscribers (most commonly software acting on behalf of your users) to
check for, download, and display the updates. These activities can be performed
by many browsers, such as Safari, specialized feed readers, and web-based
aggregators such as Bloglines or My Yahoo! It is RSS that makes the feed
publish-and-subscribe model possible.
Enclosures
Like the content it delivers, the RSS specification is not static, and it is
redefining the market for content delivery. Consider "enclosures," which are
part of the RSS specification. RSS enclosures carry additional content with the
stream, similar to the way attachments carry additional information in an email
message. <enclosure> is a sub-element of <item>, and
describes a stream of data by its URL, length in bytes, and MIME type (see
Listing 2). Podcasts are an example of enclosures: the enclosure references the
audio or video file, which the client then downloads from that URL. Enclosures
can also reference documents and photos.
Listing 2: The Enclosure Sub-Element
<enclosure url="http://developer.apple.com/rss.mp3" length="123456" type="audio/mpeg" />
Apple recently released RSS extensions in the <itunes:> namespace. (XML namespaces allow elements with the same name but different purposes to co-exist without interfering with each other.) These extensions allow podcast authors to improve the user experience in iTunes and other client applications that understand the <itunes:> namespace. For example, the <itunes:category> element can improve the way content is categorized, while <itunes:keywords> allows users to search on text keywords. The document Podcasting and iTunes: Technical Specification has more information on the iTunes tags.
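As a rough illustration of how a client might read an enclosure together with the iTunes extensions, the following Python sketch parses a small sample feed. The sample values, the function name read_podcast_item, and the returned dictionary layout are assumptions made for this example; the namespace URI is the one declared by feeds that use the iTunes tags.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI mapping used when searching for namespaced elements.
ITUNES_NS = {"itunes": "http://www.itunes.com/dtds/podcast-1.0.dtd"}

# Illustrative sample; the category and keywords are made up for this sketch.
PODCAST_XML = b"""<rss version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
<channel>
<title>ADC Feed</title>
<itunes:category text="Technology" />
<item>
<title>Intro To RSS</title>
<itunes:keywords>rss, syndication, podcast</itunes:keywords>
<enclosure url="http://developer.apple.com/rss.mp3" length="123456" type="audio/mpeg" />
</item>
</channel>
</rss>"""


def read_podcast_item(xml_bytes):
    """Extract the first item's enclosure attributes and iTunes metadata."""
    channel = ET.fromstring(xml_bytes).find("channel")
    item = channel.find("item")
    enclosure = item.find("enclosure")
    category = channel.find("itunes:category", ITUNES_NS)
    keywords = item.findtext("itunes:keywords", default="", namespaces=ITUNES_NS)
    return {
        "url": enclosure.get("url"),
        "length": int(enclosure.get("length")),  # size in bytes
        "type": enclosure.get("type"),           # MIME type
        "category": category.get("text") if category is not None else None,
        "keywords": [word.strip() for word in keywords.split(",") if word.strip()],
    }


info = read_podcast_item(PODCAST_XML)
print(info["url"], info["type"], info["keywords"])
```

Because the iTunes elements live in their own namespace, a reader that does not understand them can simply ignore them and still process the plain RSS elements.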
RSS and Atom
There are a number of different feed specifications at this time, though most
attention remains focused on RSS 2.0 and Atom 1.0. Web publishers and developers
need to remain aware of changes to the landscape as specifications gain or fall
out of favor. The good news for publishers and developers on Mac OS X is that
Safari handles all the current specifications. Most popular RSS readers also
support both RSS and Atom.
Tools for Publishers and Developers
Feed publishers can use a type of application called a "generator" to create
the RSS markup that makes your feed available. FeedForAll, for example, has a wizard that
walks you through the creation of RSS feeds, whether you want to publish a plain
RSS feed, a podcast feed, or a podcast with iTunes support.
You should also test your feed by using an RSS reader application such as NetNewsWire or an RSS-capable browser. Make sure the reader displays the feed without error. You should also check the feed against a validator, such as FEED Validator, to ensure that it is well-formed. Several other validators are available online; the MacTech article referenced at the end of this article contains a list of these plus other tools for generating and reading feeds.
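Before submitting a feed to an online validator, a quick local check can catch the most basic problems. The sketch below, in Python, tests only two things: that the document is well-formed XML, and that the channel contains the <title>, <link>, and <description> elements that RSS 2.0 requires. The function name check_feed is hypothetical, and this is in no way a substitute for a full validator.

```python
import xml.etree.ElementTree as ET

# Channel elements that RSS 2.0 requires.
REQUIRED_CHANNEL_ELEMENTS = ("title", "link", "description")


def check_feed(xml_text):
    """Return a list of problems found; an empty list means the basic checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    channel = root.find("channel") if root.tag == "rss" else None
    if channel is None:
        return ["missing <rss><channel> structure"]

    problems = []
    for name in REQUIRED_CHANNEL_ELEMENTS:
        # findtext returns None for a missing element and "" for an empty one.
        if not channel.findtext(name):
            problems.append(f"channel is missing <{name}>")
    return problems


good = ('<rss version="2.0"><channel><title>ADC Feed</title>'
        '<link>http://developer.apple.com/feed.rss</link>'
        '<description>ADC headlines.</description></channel></rss>')
print(check_feed(good))
```

A check like this can run as part of a publishing script so that a broken feed never reaches the server.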
Handling Dates Properly for Time-based Content
Feed dates are specified in the <pubDate> and <lastBuildDate> channel elements. Listing 3 adds both of these elements to our first example. <pubDate> specifies the date on or after which the feed will be published. <lastBuildDate> specifies the date and time of the last feed update. A reader compares the timestamp of one of these tags with the last time the feed was cached locally, and if the time specified in the tag is newer, then the feed has been updated. The format of each of these tags should follow the convention specified in RFC 2822. The RFC is a little dry; Wikipedia has a summary. The time must be in 24-hour format (no AM or PM) and must include the time zone offset. Podcasting and iTunes: Technical Specification has more information.
Listing 3: Date Channel Elements
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<!-- channel metadata -->
<title>ADC Feed</title>
<link>http://developer.apple.com/feed.rss</link>
<description>ADC headlines, new sample code, and docs.</description>
<pubDate>Mon, 3 Apr 2006 15:00:00 -0800</pubDate>
<lastBuildDate>Mon, 3 Apr 2006 09:00:00 -0800</lastBuildDate>
<!-- snip -->
Important: <pubDate> also applies to items. Use it to specify the publication date of individual items within the channel.
Note: Incorrect date/time formatting is a common problem. Make sure your dates adhere to RFC 2822.
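One way to avoid hand-formatting mistakes is to let a library produce and parse the date strings. The Python sketch below uses the standard email.utils module, which implements the RFC 2822 date format. The helper names to_rfc2822 and is_newer are assumptions for this example, and the comparison mirrors the reader behavior described above in simplified form.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def to_rfc2822(moment):
    """Format a timezone-aware datetime for use in <pubDate> or <lastBuildDate>."""
    if moment.tzinfo is None:
        raise ValueError("an RSS date needs a time zone offset")
    return format_datetime(moment)


def is_newer(feed_date, cached_at):
    """Return True if the feed's date string is later than the local cache time."""
    return parsedate_to_datetime(feed_date) > cached_at


pacific = timezone(timedelta(hours=-8))
published = datetime(2006, 4, 3, 15, 0, 0, tzinfo=pacific)
print(to_rfc2822(published))  # Mon, 03 Apr 2006 15:00:00 -0800

cached_at = datetime(2006, 4, 3, 20, 0, 0, tzinfo=timezone.utc)
print(is_newer("Mon, 3 Apr 2006 15:00:00 -0800", cached_at))  # True
```

Comparing timezone-aware values matters here: 15:00 at -0800 is 23:00 UTC, which is later than a cache time of 20:00 UTC even though the clock reading looks earlier.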
There are a couple of other tags that assist with date handling: <skipDays> and <skipHours>. These tags list the days of the week and hours of the day, respectively, during which the feed will not be updated. This additional schedule information can be used by a reader or aggregator to determine when not to check a feed for updates: if the intervening days or hours since the last update consist solely of values found in these tags for the feed, then the reader can assume that the feed has not been updated. Note, however, that some browsers or readers may ignore these tags.
In addition, the <ttl> (time-to-live) tag specifies the number of minutes during which a feed will not be updated; this is the length of time that a reader or aggregator should cache the feed. While optional, it is strongly recommended in order to minimize the load on the server caused by too-frequent checks for updates.
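The following Python sketch shows one simplified way a reader might combine <ttl>, <skipDays>, and <skipHours> when deciding whether to poll a feed. It only looks at the current moment rather than at every intervening day or hour, so it is more conservative than the rule described above. The function name should_check and its parameters are assumptions; times are expected in GMT, which is how the RSS 2.0 specification defines <skipHours>.

```python
from datetime import datetime, timedelta, timezone

# English day names as they appear in <skipDays>, indexed by datetime.weekday().
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


def should_check(now, last_checked, ttl_minutes=None, skip_days=(), skip_hours=()):
    """Decide whether a reader should poll the feed at the moment `now` (GMT)."""
    # Within the time-to-live window the cached copy is still considered fresh.
    if ttl_minutes is not None and now - last_checked < timedelta(minutes=ttl_minutes):
        return False
    # The publisher has said the feed is not updated on these days or hours.
    if DAY_NAMES[now.weekday()] in skip_days:
        return False
    if now.hour in skip_hours:
        return False
    return True


last = datetime(2006, 4, 3, 9, 0, tzinfo=timezone.utc)       # a Monday
print(should_check(last + timedelta(minutes=30), last, ttl_minutes=60))  # False
print(should_check(last + timedelta(hours=2), last, ttl_minutes=60))     # True
```

Since some readers ignore these tags entirely, a publisher should treat them as hints rather than as a guarantee of reduced traffic.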
As a feed publisher, you cannot depend on all browsers or readers
supporting all of these date and time tags. What should you do? Your best bet
for reducing load on your servers is to publish your feed on a server that
properly supports ETags, reports accurate modification dates (the Last-Modified
header), or both, and therefore handles HTTP conditional GET requests correctly.
The HTTP/1.1 Header Field Definitions provide some additional detail on ETags.
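From the client side, a conditional GET looks roughly like the Python sketch below, which uses the standard urllib module. The client sends back the ETag and Last-Modified values it saved from the previous response; a server that supports conditional requests answers 304 Not Modified with no body when nothing has changed. The function names and the returned dictionary layout are assumptions for this example.

```python
import urllib.error
import urllib.request


def conditional_request(url, etag=None, last_modified=None):
    """Build a GET request that asks for the feed only if it has changed."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return urllib.request.Request(url, headers=headers)


def fetch_if_changed(url, etag=None, last_modified=None):
    """Fetch the feed, returning changed=False when the server answers 304."""
    request = conditional_request(url, etag, last_modified)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return {
                "changed": True,
                "body": response.read(),
                # Save these values and send them back on the next request.
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
            }
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: keep using the cached copy
            return {"changed": False, "body": None,
                    "etag": etag, "last_modified": last_modified}
        raise


request = conditional_request(
    "http://developer.apple.com/feed.rss",
    etag='"abc123"',
    last_modified="Mon, 03 Apr 2006 23:00:00 GMT",
)
print(request.header_items())
```

A 304 response carries only headers, so a well-behaved reader polling an unchanged feed costs the server very little bandwidth.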
Autodiscovery
Getting your feed noticed can be tough sometimes, but there is a convenient way to allow readers to discover your feed. You simply add a <link> element within the <head> element of your web page to specify that an RSS feed is available, and the URL at which to find it. Some web browsers use <link> elements to indicate the availability of a feed for a website: Safari displays an RSS button in its Address Bar.
Listing 4 shows the <link> syntax for both RSS and Atom. You can read more about the <link> type and title attributes on Mark Pilgrim's blog.
Listing 4: Support for Autodiscovery Using the Link Element
<link rel="alternate" type="application/rss+xml" title="RSS"   href="http://developer.apple.com/rss/adcheadlines.rss">
<!-- or -->
<link rel="alternate" type="application/atom+xml" title="Atom" href="http://developer.apple.com/rss/adcheadlines.rss">
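To illustrate the client side of autodiscovery, here is a minimal Python sketch that scans a page for <link> elements like those in Listing 4, using the standard html.parser module. The class name FeedLinkFinder, the function discover_feeds, and the sample page are assumptions for this example; how Safari or any other browser actually implements discovery is not described here.

```python
from html.parser import HTMLParser

# MIME types that identify RSS and Atom feeds in a <link> element.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect (type, title, href) for every feed advertised by a page."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        mime_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel_values and mime_type in FEED_TYPES and href:
            self.found.append((mime_type, attributes.get("title"), href))


def discover_feeds(html_text):
    """Return the feeds advertised in an HTML document."""
    finder = FeedLinkFinder()
    finder.feed(html_text)  # HTMLParser.feed() supplies text to the parser
    return finder.found


PAGE = """<html><head>
<title>ADC</title>
<link rel="stylesheet" type="text/css" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="RSS"
      href="http://developer.apple.com/rss/adcheadlines.rss">
</head><body></body></html>"""

print(discover_feeds(PAGE))
```

The stylesheet link is ignored because its rel and type values do not match, which is the same filtering that lets a browser show a feed badge only on pages that advertise a feed.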
Ping Servers
Another way to help people both find your feed and know that it has been updated is to send an XML-RPC message to a ping server. Generally speaking, you should notify appropriate ping servers whenever you update your feed, as it helps both search engines and aggregators know your feed has been updated without having to explicitly poll your website. Some of the most popular servers are Technorati and blo.gs, though there are many others available. Which and how many ping servers you should contact depends on the type and scale of the audience you wish to reach, as well as the amount of work you want your computer to do every time you update your feed.
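As a sketch of what such a ping might look like, the Python example below builds and sends the widely used weblogUpdates.ping XML-RPC call with the standard xmlrpc.client module. The endpoint URL in the usage note is a placeholder, and the site name and URL are illustrative; check the documentation of each ping server for the address and methods it actually accepts.

```python
import xmlrpc.client


def build_ping(site_name, site_url):
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps((site_name, site_url),
                               methodname="weblogUpdates.ping")


def send_ping(endpoint, site_name, site_url):
    """Send the ping to one server and return its response.

    `endpoint` is the XML-RPC URL published by the ping server.
    """
    server = xmlrpc.client.ServerProxy(endpoint)
    return server.weblogUpdates.ping(site_name, site_url)


# Building the payload does not touch the network, so it is safe to inspect.
payload = build_ping("ADC Feed", "http://developer.apple.com/")
print(payload)
```

To notify a server, a publishing script might call send_ping("http://ping.example.com/RPC2", "ADC Feed", "http://developer.apple.com/") once per configured endpoint after each feed update, where the endpoint shown is hypothetical.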
For More Information
In addition to the documents discussed in this article, the documents, books, and articles below will help you find more specific information about RSS and Atom.
- The Podcasting and iTunes: Technical Specification has information and examples regarding the podcast submission and feedback processes, mistakes to avoid, iTunes RSS tags, and categories under which to publish.
- The ADC SampleRSS.wdgt sample code shows how to extract the content of the Apple RSS Hot News feed using JavaScript. It uses an XMLHttpRequest object to obtain the feed, then locates and extracts specific XML elements. Note that this is primarily of interest to Dashboard developers, as real-world feeds typically do not contain well-formed XML.
- WWDC 2005 Conference Session 135, "Safari for Web Designers," discusses RSS support in Safari. You will need to log in to your online ADC account, and then navigate to the section labelled WWDC Sessions to view the session movie and download the slides.
- The Atom Feed Autodiscovery spec. The RSS autodiscovery syntax is similar, as outlined above.
- O'Reilly publishes the book Developing Feeds with RSS and Atom, as well as several articles, including Making Your RSS Feed Look Pretty in a Browser, and Everything You Wanted to Know About Safari RSS.
- MacTech Magazine published the article "Deconstructing RSS 2.0" in the December 2005 issue. This article provides summaries and examples of the RSS 2.0 tags. It also has a lengthy list of tools for publishers and developers.
Updated: 2008-03-03