RSS Feeds Introduction
RDF Site Summary (RSS) files, based on XML, provide an open method of syndicating and aggregating Web content. Using RSS files, you can create a data feed that supplies headlines, links, and article summaries from your Web site. These files describe a channel of information that can include a logo, a site link, an input box, and multiple "news items." Other sites can incorporate your information into their pages automatically. You can also use RSS feeds from other sites to provide your site with current news headlines. These techniques let you draw more visitors to your site and also provide them with up-to-date information.
What are metadata?
RSS files are a type of metadata. Metadata are:
• Units of information about information.
• Commonly used to provide descriptive information about the content, context, and characteristics of data.
HTML keywords and description metatags are examples of metadata, and are used to provide information about Web pages.
The RSS format originated with the sites My Netscape and My UserLand, both of which aggregate content derived from XML news feeds. Because it's one of the simplest XML applications, RSS found favor with many developers who need to perform similar tasks. Users include Moreover, Meerkat, UserLand, and XML Tree. This article looks at the RSS format and examines some open source Perl modules that will allow you to work with RSS files easily.
What exactly are these RSS files?
RSS files are metadata (see the sidebar "What are metadata?"). Until you've used them or seen an example, it may not be easy to understand what RSS files are, but they are easy to create. An RSS file commonly contains four main types of elements: channel, image, item, and text input. These elements are easy to identify and code, as the example in Listing 1 demonstrates. Listing 1, an example of an item within an RSS 0.91 file, contains three easily identifiable parts: a title, a link, and a description.
In headline collections published as results of RSS file aggregations, HTML normally renders the specified title as a headline. The title usually also serves as a link, using the URL listed in the link element. Finally, the description is normally displayed as a summary of the article underneath the headline.
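As a rough illustration of that rendering, the Python sketch below parses one item and emits a linked headline with a summary underneath. The item content, the example.com URL, and the `render_item` helper are all made up for illustration; real aggregators will format headlines differently.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical RSS 0.91 item; the title, URL, and summary are placeholders.
ITEM_XML = """
<item>
  <title>Example headline</title>
  <link>http://www.example.com/news/1.html</link>
  <description>A one-sentence summary of the article.</description>
</item>
"""

def render_item(item_xml: str) -> str:
    """Render one item as a linked headline followed by its summary."""
    item = ET.fromstring(item_xml.strip())
    title = html.escape(item.findtext("title", default=""))
    link = html.escape(item.findtext("link", default=""), quote=True)
    summary = html.escape(item.findtext("description", default=""))
    return f'<h3><a href="{link}">{title}</a></h3>\n<p>{summary}</p>'

rendered = render_item(ITEM_XML)
print(rendered)
```

Escaping the three values before placing them in HTML matters because feed content comes from another site and should not be trusted as markup.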
Creating RSS files
You can build RSS files to either the proposed RSS 1.0 specification or the currently more popular RSS 0.91 spec. For production applications, use RSS 0.91, because the 1.0 proposal is still under consideration. The Resources section, at bottom, includes links to both the 1.0 and 0.91 specs, which provide a detailed review of all elements. This discussion focuses on the most commonly used elements, and all the examples in this article are in the 0.91 format.
The 1.0 proposal differs from the 0.91 format in one main way: It incorporates Resource Description Framework (RDF) elements that allow greater flexibility at the expense of some increased complexity. This proposed specification is more extensible, creating a W3C standard for RSS files that will meet current needs, will be as backwards-compatible as possible, and will be adaptable to future requirements.
Both versions of the specification share the characteristic of being a lightweight format that developers can use for many purposes.
RSS is an XML application. Because of this, all RSS documents begin with the XML 1.0 declaration followed by the RSS document type declaration, as shown in Listing 2.
The first line declares the document to be an XML document. The second line, the DTD declaration, specifies that this XML file is based on the RSS 0.91 document type definition (DTD) hosted by Netscape. Finally, the root element marks the beginning of the RSS file content, all of which goes between the root element's opening and closing tags.
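A minimal sketch of such a prolog, held in a Python string and parsed to confirm it is well-formed, might look like the following. The DOCTYPE identifiers are the ones commonly cited for Netscape's RSS 0.91 DTD and the `rss` root element with a `version` attribute follows the 0.91 format; treat both as assumptions and check them against the spec before relying on them.

```python
import xml.etree.ElementTree as ET

# Prolog of an RSS 0.91 file: XML declaration, DOCTYPE, then the root element.
# The public and system identifiers are assumed from the RSS 0.91 spec.
RSS_SKELETON = """<?xml version="1.0"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
  "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91">
  <!-- channel content goes here -->
</rss>
"""

# The parser checks well-formedness only; it does not fetch or apply the DTD.
root = ET.fromstring(RSS_SKELETON)
print(root.tag, root.get("version"))
```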
The four main sections of an RSS file
After the root element come the four main sections of the RSS file. These are the channel, image, item, and text input sections. In practical use, the channel and item elements are requirements for any useful RSS file, while the image and text input are optional.
The channel section
The channel element contains metadata that describe the channel itself, telling what the channel is and who created it. The channel is a required element that includes the name of the channel, its description, its language, and a URL. The URL is normally used to point to the channel's source of information.
Listing 3 shows the beginning of the channel element. This part of the channel element defines the channel and begins the channel information.
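For readers who generate feeds from a script, the opening of a channel could be built along these lines. This is only a sketch: `build_channel` is a hypothetical helper, and the channel name, URL, description, and language code are placeholders.

```python
import xml.etree.ElementTree as ET

def build_channel(title: str, link: str, description: str, language: str) -> ET.Element:
    """Return an <rss> root holding one <channel> with its identifying fields."""
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    fields = (
        ("title", title),              # name of the channel
        ("link", link),                # URL of the channel's source
        ("description", description),  # what the channel is about
        ("language", language),        # language code for the content
    )
    for tag, text in fields:
        ET.SubElement(channel, tag).text = text
    return rss

# All values below are placeholders.
rss = build_channel(
    "Example News",
    "http://www.example.com/",
    "Headlines from a hypothetical site.",
    "en-us",
)
channel_xml = ET.tostring(rss, encoding="unicode")
print(channel_xml)
```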
The image section
The image element is an optional element that is usually used to include the logo of the channel provider. The default size for the image is 88 pixels wide by 31 pixels high, but you can make your logo as large as 144 pixels wide by 400 pixels high. Here is a sample image element:
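One way to sketch an image element in Python, while enforcing the size limits, is shown below. The `build_image` helper and the logo URL are hypothetical, and the limits (default 88 by 31, maximum 144 wide by 400 high) are taken from a reading of the RSS 0.91 spec, so verify them there.

```python
import xml.etree.ElementTree as ET

DEFAULT_SIZE = (88, 31)   # default width, height in pixels
MAX_SIZE = (144, 400)     # maximum width, height in pixels (assumed from the spec)

def build_image(title, url, link, width=None, height=None):
    """Build an <image> element, falling back to the default logo size."""
    w = DEFAULT_SIZE[0] if width is None else width
    h = DEFAULT_SIZE[1] if height is None else height
    if not (0 < w <= MAX_SIZE[0] and 0 < h <= MAX_SIZE[1]):
        raise ValueError(f"image size {w}x{h} is outside the allowed range")
    image = ET.Element("image")
    fields = (("title", title), ("url", url), ("link", link),
              ("width", w), ("height", h))
    for tag, value in fields:
        ET.SubElement(image, tag).text = str(value)
    return image

# Placeholder logo for a hypothetical site.
image = build_image("Example News",
                    "http://www.example.com/logo.gif",
                    "http://www.example.com/")
print(ET.tostring(image, encoding="unicode"))
```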
The items
Items, the most important elements in a channel, usually form the dynamic part of an RSS file. While channel, image, and text input elements create the channel's identity and typically stay the same over long periods of time, channel items are rendered as news headlines, and the channel's value depends on their changing fairly frequently. Here is an example of a channel item:
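Because items change while the rest of the channel stays put, a feed generator typically swaps out only the item elements on each update. The sketch below shows one way to do that; `replace_items` is a hypothetical helper and the headlines are invented.

```python
import xml.etree.ElementTree as ET

def replace_items(channel, headlines):
    """Swap out the channel's <item> elements, leaving everything else alone."""
    for old in channel.findall("item"):
        channel.remove(old)
    for title, link, description in headlines:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
        ET.SubElement(item, "description").text = description

# A hypothetical channel that currently carries one stale headline.
channel = ET.fromstring(
    "<channel><title>Example News</title>"
    "<item><title>Old story</title>"
    "<link>http://www.example.com/old.html</link>"
    "<description>Yesterday's news.</description></item>"
    "</channel>"
)

replace_items(channel, [
    ("First new story", "http://www.example.com/1.html", "Summary one."),
    ("Second new story", "http://www.example.com/2.html", "Summary two."),
])
titles = [item.findtext("title") for item in channel.findall("item")]
print(titles)
```

The channel title survives the update untouched, which mirrors the split between the channel's stable identity and its changing headlines.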
The text input
The text input area is an optional element, with only one allowed per channel. Usually rendered as an HTML form, text input lets the user respond to the channel. You might use this feature to enable your users to subscribe to your newsletter or search your site. Here is an example of a text input element:
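As a hedged sketch, the Python below takes a text input element and renders it as a simple HTML search form. It assumes the RSS 0.91 element name `textinput` with `title`, `description`, `name`, and `link` children; the search URL, the field name `q`, and the `render_textinput` helper are placeholders.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical text input element for a site search box.
TEXTINPUT_XML = """
<textinput>
  <title>Search</title>
  <description>Search the example site</description>
  <name>q</name>
  <link>http://www.example.com/search</link>
</textinput>
"""

def render_textinput(xml_text: str) -> str:
    """Render a <textinput> element as a one-field HTML form."""
    ti = ET.fromstring(xml_text.strip())
    action = html.escape(ti.findtext("link", default=""), quote=True)
    name = html.escape(ti.findtext("name", default=""), quote=True)
    label = html.escape(ti.findtext("description", default=""))
    button = html.escape(ti.findtext("title", default=""), quote=True)
    return (
        f'<form method="get" action="{action}">\n'
        f'  <label>{label} <input type="text" name="{name}"></label>\n'
        f'  <input type="submit" value="{button}">\n'
        f'</form>'
    )

form_html = render_textinput(TEXTINPUT_XML)
print(form_html)
```

Here the link becomes the form's action, the name becomes the field name sent to the server, and the title labels the submit button.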
Author: Webshree
email: webshree@gmail.com