RSCML - Really Simple Catalogue Markup Language

A proposal for an XML-based standard language for publishing metadata about a product catalogue.

Bookmark: http://www.voidstar.com/module.php?id=130
Plain html: http://www.voidstar.com/module.php?id=130&mod=book&op=feed

What is it?

It's a proposal for a simple XML based standard that allows suppliers (and others) to publish a subset of the product catalogue data that they produce.

The intention is to turn product search and discovery into a two-stage process. First, a local index is built by collecting and combining the RSCML feeds from many suppliers, which produces a shortlist of product-supplier pairs. Then each shortlisted entry is researched manually by viewing the full product catalogue entry on the supplier's site.
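
As a rough illustration of the first stage, here is a minimal Python sketch that merges two feeds into a flat local index and searches it for a shortlist. The sample feeds, supplier names, example URLs and the helper names `build_index` and `shortlist` are all hypothetical. The element names are taken from the proposed format described later in this document.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feeds; real ones would be fetched from supplier URLs.
FEED_A = """<rscml version="0.1"><channel>
<title>Acme Fasteners</title>
<product><ID>B-100</ID><title>Steel bolt M8</title>
<link>http://acme.example/b100</link></product>
<product><ID>N-200</ID><title>Steel nut M8</title></product>
</channel></rscml>"""

FEED_B = """<rscml version="0.1"><channel>
<title>Bolt Bros</title>
<product><ID>77</ID><title>Brass bolt M6</title>
<link>http://boltbros.example/77</link></product>
</channel></rscml>"""


def build_index(feeds):
    """Stage one: merge the products of many feeds into one flat list."""
    index = []
    for text in feeds:
        root = ET.fromstring(text)
        for channel in root.iter("channel"):
            supplier = channel.findtext("title", default="")
            for product in channel.findall("product"):
                index.append({
                    "supplier": supplier,
                    "id": product.findtext("ID", default=""),
                    "title": product.findtext("title", default=""),
                    "link": product.findtext("link", default=""),
                })
    return index


def shortlist(index, term):
    """Return the entries whose product title contains the search term."""
    term = term.lower()
    return [entry for entry in index if term in entry["title"].lower()]


index = build_index([FEED_A, FEED_B])
for hit in shortlist(index, "bolt"):
    # Stage two is manual: follow hit["link"] to the supplier's own page.
    print(hit["supplier"], hit["id"], hit["title"], hit["link"])
```

The second stage stays manual by design: the index only needs enough data to decide which supplier pages are worth visiting.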

It's designed to be trivially easy for the data creator (the supplier) to create this subset, and for the data user (the buyer or aggregator) to combine and collate data from multiple suppliers.

It attempts to use the lessons learned from RSS headline syndication and apply them to the catalogue problem.

Why is it needed?

Catalogue build and maintenance has become almost the number one problem holding back e-commerce. Suppliers are expected to package their data and pass it to their big customers. Each customer wants the data in a different format, and each industry is creating its own formats. The buyers have taken on the task of building and maintaining the data despite the fact that they are not its creators.

So now the buyers have a task they shouldn't want and the suppliers can't cope with the demands or get updates accepted when they change their product lines.

Several e-procurement vendors have attempted to deal with this through "Punch Out" or "Round Trip" approaches. RSCML is an attempt to formalise that kind of approach.

The various attempts to index businesses, such as UDDI, are shying away from indexing products because the domain space is too big to cope with.

How about some Use Cases?

  • A supplier has a direct link between their accounting system and a published RSCML feed on their website. Whenever a new product is added or an old product is changed or deleted, the feed is updated.

  • A distributor builds a system that collects the RSCML feeds from all its suppliers. They use this to build a searchable index of products that they can source. This is combined and linked to their pricing information and stock control systems. They re-publish their own RSCML feed, covering all the products they can source, straight out of their own index system.

  • A buyer collects RSCML feeds from their preferred suppliers. This is used as a source of information to update their e-procurement system. When the purchasing agent is looking for a product they do a first search through this data. This gives them a shortlist. They then click through the entries to the supplier's website to view the full product specs. They then place the order.

  • A group of third party websites begin collecting RSCML globally and replicating between them the list of feeds they've found. Over time they build up a global index of products or a complete list of products in a niche market. They use this as a paid for search tool.

    Why should people produce RSCML?

    As a supplier, it's a cheap and easy way to publish a list of your products. With a small amount of programming done once, you never need to go through the pain of re-formatting your data for each buyer or trying to get a buyer to accept updates. Just point them at your RSCML feed.
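
To give a feel for how small that one-off programming job might be, here is a minimal Python sketch that turns a list of product records into a feed. The record layout, the `build_feed` helper and the example.com-style addresses are assumptions made for illustration. The element names come from the proposed format described later in this document, and only elements whose names are legal XML as written are used.

```python
import xml.etree.ElementTree as ET

# Hypothetical channel details and product records, e.g. exported
# from an accounting or stock system.
CHANNEL = {
    "title": "Acme Fasteners",
    "link": "http://acme.example/",
    "url": "http://acme.example/rscml.xml",
    "webmaster": "webmaster@acme.example",
}
PRODUCTS = [
    {"id": "B-100", "title": "Nuts & bolts kit", "attributes": {"thread": "M8"}},
    {"id": "N-200", "title": "Steel nut M8", "deleted": True},
]


def build_feed(channel_info, products):
    """Build an RSCML document as a string from plain Python records."""
    root = ET.Element("rscml", version="0.1")
    channel = ET.SubElement(root, "channel")
    for name in ("title", "link", "url", "webmaster"):
        ET.SubElement(channel, name).text = channel_info[name]
    for record in products:
        product = ET.SubElement(channel, "product")
        if record.get("deleted"):
            ET.SubElement(product, "deleted")
        ET.SubElement(product, "ID").text = record["id"]
        ET.SubElement(product, "title").text = record["title"]
        for name, value in record.get("attributes", {}).items():
            attribute = ET.SubElement(product, "attribute")
            ET.SubElement(attribute, "name").text = name
            ET.SubElement(attribute, "value").text = value
    # ElementTree entity-encodes characters such as & and < in text.
    return ET.tostring(root, encoding="unicode")


feed_xml = build_feed(CHANNEL, PRODUCTS)
print(feed_xml)
```

Regenerating the whole file whenever the product list changes is probably the simplest way to keep the published feed current.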

    Why should people read RSCML?

    If you're trying to build a combined catalogue, you'll have to go through a lot of pain as you try and persuade your suppliers to provide clean data. You'll have to go through the pain of trying to keep the data clean and up to date. It's a never ending task. Now you can just ask your supplier for a single URL. Type it into your system and you're done. Forever.
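
On the reading side, keeping a combined catalogue up to date could be as simple as re-applying each supplier's feed to a local store. The sketch below is one possible approach under stated assumptions: the catalogue is a plain dictionary keyed by feed URL and product ID, `apply_feed` is a hypothetical helper, and the feeds are passed in as strings rather than downloaded. It honours the `<deleted />` marker from the proposed format.

```python
import xml.etree.ElementTree as ET

FEED_URL = "http://acme.example/rscml.xml"  # hypothetical supplier feed URL

FIRST_FETCH = """<rscml version="0.1"><channel>
<product><ID>B-100</ID><title>Steel bolt M8</title></product>
<product><ID>N-200</ID><title>Steel nut M8</title></product>
</channel></rscml>"""

SECOND_FETCH = """<rscml version="0.1"><channel>
<product><deleted /><ID>B-100</ID><title>Steel bolt M8</title></product>
<product><ID>N-200</ID><title>Steel nut M8 (zinc)</title></product>
</channel></rscml>"""


def apply_feed(catalogue, feed_url, feed_text):
    """Update the catalogue dict in place from one collected feed."""
    root = ET.fromstring(feed_text)
    for product in root.iter("product"):
        product_id = (product.findtext("ID") or "").strip()
        if not product_id:
            continue  # ID is required; skip entries without one
        key = (feed_url, product_id)
        if product.find("deleted") is not None:
            catalogue.pop(key, None)  # supplier no longer offers it
        else:
            catalogue[key] = (product.findtext("title") or "").strip()
    return catalogue


catalogue = {}
apply_feed(catalogue, FEED_URL, FIRST_FETCH)
apply_feed(catalogue, FEED_URL, SECOND_FETCH)
print(catalogue)
```

Keying on the feed URL as well as the product ID keeps identical IDs from different suppliers apart.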

    The proposed XML format

    I've written this out as a plain XML standard that does not use RDF. It's trivial XML that should be easy to parse. Arguably RDF would be a better choice but there's a big trade off here between power and ease of adoption. If RDF is used and the standard is extended dramatically, but each extension is not widely used, we're right back where we started from.

    It's important that, once implementations begin to appear, old elements remain supported as the versions progress.
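
One way a reader could stay compatible across versions is to pick out only the elements it knows and ignore everything else. The following Python sketch shows that idea. The `read_products` helper, the "0.2" version number and the `<colour>` element are hypothetical and are not part of the proposal. The known element names come from the format below.

```python
import xml.etree.ElementTree as ET

# Product elements this reader understands, taken from version 0.1.
KNOWN_PRODUCT_FIELDS = ("ID", "title", "abstract", "manufacturer", "link")

# Hypothetical future feed containing an element this reader has never seen.
SAMPLE = """<rscml version="0.2"><channel>
<product><ID>X1</ID><title>Widget</title><colour>red</colour></product>
</channel></rscml>"""


def read_products(feed_text):
    """Return (version, products), silently skipping unknown elements."""
    root = ET.fromstring(feed_text)
    version = root.get("version", "0.1")
    products = []
    for product in root.iter("product"):
        fields = {}
        for child in product:
            if child.tag in KNOWN_PRODUCT_FIELDS:
                fields[child.tag] = (child.text or "").strip()
        products.append(fields)
    return version, products


version, products = read_products(SAMPLE)
print(version, products)
```

A reader written this way keeps working when later versions add elements, provided the old ones are never removed or redefined.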


    <?xml version="1.0" encoding="ISO-8859-1" ?> //standard XML header
    <rscml version="0.1"> //defines the version level of this feed
    <channel>
    //A single feed. Multiple <channels> could be concatenated into one feed

    <title></title>
    //required. A one line name for this channel

    <link></link>
    //required. The URL of the owning website. Ideally the web page that describes the RSCML feed.

    <url></url>
    //required. The recommended URL of this RSCML feed.

    <webmaster></webmaster>
    //required. The email address of a contact at the owning website.

    <description></description>
    //optional. A short description of this channel.

    <rec collection frequency></rec collection frequency>
    //optional. The likely frequency of updates in seconds. Gives a guide to how often readers should check for updates.

    <last modified date></last modified date>
    //optional. The date in ISO standard format

    <uddi id></uddi id>
    //optional. A UDDI ID for this organization

    <category url=""></category>
    //optional. A category in which to place this feed, with the URL of the category standard that it is taken from, e.g. UNSPSC.

    <image>
    <link></link>
    <width></width>
    <height></height>
    <alt></alt>
    </image>
    //optional. A link to a small image file that can be used by the reader when the RSCML data is displayed.

    <product>
    //Multiple. One <product></product> entry for each unique product by ID and Manufacturer

    <deleted />
    //Optional. If present marks the <product> as no longer available.

    <ID></ID>
    //Required. A unique product identifier. Probably the manufacturer's id number

    <title></title>
    //required. A one line name for this product.

    <abstract></abstract>
    //optional. A short description of this product.

    <manufacturer></manufacturer>
    //optional. A one line name for the manufacturer.

    <link></link>
    //optional. The URL for the web page that describes this product.

    <category url=""></category>
    //optional. A category in which to place this product, with the URL of the category standard that it is taken from, e.g. UNSPSC.

    <attribute>
    <name></name>
    <value></value>
    </attribute>
    //Optional, Multiple. Name value pairs of whatever attributes may be needed to describe the product.

    </product>
    </channel>
    </rscml>
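
As a starting point for checking feeds against the required/optional markings above, here is a minimal Python sketch. It only tests that the elements marked as required are present and non-empty. The `check_feed` helper, the message wording and the sample feed are hypothetical, and this is far from a full validator.

```python
import xml.etree.ElementTree as ET

REQUIRED_CHANNEL = ("title", "link", "url", "webmaster")
REQUIRED_PRODUCT = ("ID", "title")

# Hypothetical feed that lacks <webmaster> and one product <title>.
SAMPLE = """<rscml version="0.1"><channel>
<title>Acme Fasteners</title>
<link>http://acme.example/</link>
<url>http://acme.example/rscml.xml</url>
<product><ID>B-100</ID></product>
</channel></rscml>"""


def check_feed(feed_text):
    """Return a list of problems found; an empty list means none."""
    problems = []
    root = ET.fromstring(feed_text)
    if root.tag != "rscml":
        problems.append("root element is not <rscml>")
    for c, channel in enumerate(root.iter("channel"), start=1):
        for name in REQUIRED_CHANNEL:
            if not (channel.findtext(name) or "").strip():
                problems.append(f"channel {c}: missing <{name}>")
        for p, product in enumerate(channel.findall("product"), start=1):
            for name in REQUIRED_PRODUCT:
                if not (product.findtext(name) or "").strip():
                    problems.append(
                        f"channel {c}, product {p}: missing <{name}>")
    return problems


for problem in check_feed(SAMPLE):
    print(problem)
```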


    Note that all strings are assumed to be CDATA and should be character encoded.
    Generally, strings should be allowed to contain HTML as long as it is also entity encoded so that the feed remains valid XML.
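
To make the entity-encoding point concrete, here is a short Python sketch of the round trip for an abstract that contains HTML. The sample text and product are made up. `escape` is the standard library helper from `xml.sax.saxutils`.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical product description containing HTML markup.
html_abstract = "<b>Heavy duty</b> bolt & nut set"

# Entity-encode &, < and > so the markup is carried as plain text.
encoded = escape(html_abstract)

feed = (
    '<rscml version="0.1"><channel><product>'
    "<ID>B-100</ID><title>Bolt set</title>"
    f"<abstract>{encoded}</abstract>"
    "</product></channel></rscml>"
)

# An XML parser decodes the entities again, returning the original HTML.
decoded = ET.fromstring(feed).find("channel/product").findtext("abstract")
print(encoded)
print(decoded)
```

Without the encoding step, the bare `&` in the abstract would make the feed fail to parse at all.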

    The Meta Feed format

    There is a need for aggregators of RSCML to exchange lists of the feeds they have discovered. This helps to promote the standard and encourages the development of more complete indexes.

    This does not need as much detail as the full RSCML but many of the elements are shared.


    <?xml version="1.0" encoding="ISO-8859-1" ?> //standard XML header
    <rscml version="0.1"> //defines the version level of this feed
    <channel>
    //A single feed. Multiple <channels> could be concatenated into one feed

    <title></title>
    //required. A one line name for this channel

    <link></link>
    //required. The URL of the owning website. Ideally the web page that describes the RSCML feed.

    <url></url>
    //required. The recommended URL of this RSCML feed.

    <webmaster></webmaster>
    //required. The email address of a contact at the owning website.

    <description></description>
    //optional. A short description of this channel.

    <rec collection frequency></rec collection frequency>
    //optional. The likely frequency of updates in seconds. Gives a guide to how often readers should check for updates.

    <last modified date></last modified date>
    //optional. The date in ISO standard format

    <uddi id></uddi id>
    //optional. A UDDI ID for this organization

    <category url=""></category>
    //optional. A category in which to place this feed, with the URL of the category standard that it is taken from, e.g. UNSPSC.

    <image>
    <link></link>
    <width></width>
    <height></height>
    <alt></alt>
    </image>
    //optional. A link to a small image file that can be used by the reader when the RSCML data is displayed.

    <rscmlversion></rscmlversion>
    //optional. The RSCML version of the referenced feed.

    </channel>
    </rscml>


    Note that all strings are assumed to be CDATA and should be character encoded.
    Generally, strings should be allowed to contain HTML as long as it is also entity encoded so that the feed remains valid XML.
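
As an illustration of how aggregators might exchange what they have discovered, here is a minimal Python sketch that reads two meta feeds and merges them into one list keyed by feed URL. The helper names `known_feeds` and `merge_feed_lists`, the sample feeds and the example URLs are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical meta feeds published by two different aggregators.
META_A = """<rscml version="0.1">
<channel><title>Acme Fasteners</title>
<url>http://acme.example/rscml.xml</url>
<rscmlversion>0.1</rscmlversion></channel>
<channel><title>Bolt Bros</title>
<url>http://boltbros.example/feed.xml</url></channel>
</rscml>"""

META_B = """<rscml version="0.1">
<channel><title>Bolt Bros Ltd</title>
<url>http://boltbros.example/feed.xml</url></channel>
<channel><title>Nutty Supplies</title>
<url>http://nutty.example/rscml.xml</url></channel>
</rscml>"""


def known_feeds(meta_feed_text):
    """Map each feed URL in a meta feed to its title and RSCML version."""
    feeds = {}
    for channel in ET.fromstring(meta_feed_text).iter("channel"):
        url = (channel.findtext("url") or "").strip()
        if url:
            feeds[url] = {
                "title": (channel.findtext("title") or "").strip(),
                "rscmlversion": (channel.findtext("rscmlversion") or "").strip(),
            }
    return feeds


def merge_feed_lists(*feed_maps):
    """Combine several feed lists; the first entry seen for a URL wins."""
    merged = {}
    for feed_map in feed_maps:
        for url, info in feed_map.items():
            merged.setdefault(url, info)
    return merged


merged = merge_feed_lists(known_feeds(META_A), known_feeds(META_B))
for url, info in merged.items():
    print(url, info["title"])
```

Using the feed URL as the key is a design choice in this sketch: titles may differ between aggregators, while the recommended URL of a feed should not.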

    Issues

  • Turning the proposal into a real standard. Currently this is a first attempt and clearly needs lots of work to turn it into a full blown standard.
  • Relationship with price. Product catalogues and price are intimately connected but this introduces a whole can of worms.
  • RSCML evangelism. How do we get widespread adoption of the standard?
  • The Relationship between RSCML and SOAP-UDDI. There are other standards that this one would have to interact with.

    Next Steps

    If this idea has enough merit to justify taking it further, there's a series of activities that could move it forward, such as:
  • Creating and building an rscml-dev mailing list
  • Marketing and announcement
  • Building a website, e.g. rscml.org
  • Development of Reference implementations
  • Creation of a glyph to publicise location
  • Development of a validator

    Possible interested parties

  • Catalogue companies and ISVs
  • e-Procurement ISVs
  • Shopping cart systems
  • Auction Houses
  • Accounting Software companies
  • E-Commerce ISVs
  • Distributors
  • Suppliers

    Copyright and Acknowledgements

    Acknowledgements
    The ideas in this owe a huge debt to the Syndication community and especially those involved in RSS headline syndication.

    Copyright
    Like everything on this site, the RSCML proposal is Kopyleft, All Rights Reversed. Feel free to copy, re-use, distribute it, or whatever. Of course, references are always welcome. And you could always offer me a job!