Can I put html in the title or description?
Last updated by Anon on Tuesday, 08/07/2001 - 20:31

Morbus Iff wrote:
> Should HTML in RSS *always* be encoded to its entity? (&lt;, etc.)?

No, not always.

I looked up HTML support in the different specs [1-3], since I was curious about this FAQ. It's, of course, different for the different RSS versions:

RSS 0.9 : no
RSS 0.91: no (by spec)
RSS 0.92: yes, entity-escaped
RSS 1.0 : no; maybe, with content module

All the "no"s are assumed: none of those specs mention HTML. Since 0.92 claims entity-escaped HTML as a new feature, 0.9 and 0.91 must allow no HTML (however, 0.92 also claims to be a description of then-current use of 0.91, so there are/were feeds with entity-escaped HTML claiming to be 0.91). 1.0 is presumably derived closely enough from 0.9 to allow no HTML, and its examples contain no HTML (though in the version I looked at, the examples had some unescaped 'es).
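To make "entity-escaped" concrete, here is a minimal sketch in Python using only the standard library. The sample item and its text are made up for illustration; they are not taken from any of the specs. The point is that the HTML travels as ordinary character data, so an XML parser hands it back as one string rather than as child elements.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical HTML fragment a feed author wants in a description.
html_fragment = "<b>Hello</b> & welcome"

# Entity-escape it so it is plain character data inside the XML.
escaped = escape(html_fragment)

# Illustrative item, not copied from any spec.
item_xml = (
    "<item><title>Greeting</title>"
    f"<description>{escaped}</description></item>"
)

item = ET.fromstring(item_xml)
description = item.find("description")

# The parser undoes the escaping: the text is the original HTML string,
# and the description element has no child elements at all.
recovered = description.text
child_count = len(list(description))
```

What happens to the recovered string next (show it literally, or hand it to an HTML renderer) is exactly the question the different RSS versions answer differently.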

If an RSS 1.0 document makes use of the content module [4], it will have a <content:format> that may specify XHTML, and may have a <content:encoding>. If the format is XHTML and the encoding is not given, character encoding (entity-escaping, as in 0.92) is assumed. The other encoding option the spec mentions by name is well-formed XML, which is the only case in all of the RSS specs in which HTML isn't encoded as character data.

So for 0.9, vanilla 1.0, and 0.91, all character data is for display to the user. In 0.92, the character data is for interpretation by an HTML-aware user-agent. Some 0.92 files may claim to be 0.91. In 1.0 with the content module, <content:format> and <content:encoding> tell what to do.
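The summary above could be sketched as a small lookup. This is only an illustration: the `Treatment` enum and the `description_treatment` function are hypothetical names, not part of any RSS library, and the mapping follows the reading of the specs given in this post rather than any normative rule.

```python
from enum import Enum


class Treatment(Enum):
    PLAIN_TEXT = "display character data as-is"
    ESCAPED_HTML = "hand character data to an HTML-aware renderer"
    CONTENT_MODULE = "consult the content module's format and encoding"


def description_treatment(version: str, uses_content_module: bool = False) -> Treatment:
    """Map a declared RSS version to how its character data is meant to be read."""
    if version == "1.0" and uses_content_module:
        return Treatment.CONTENT_MODULE
    if version == "0.92":
        return Treatment.ESCAPED_HTML
    if version in ("0.9", "0.91", "1.0"):
        return Treatment.PLAIN_TEXT
    raise ValueError(f"version not covered by this sketch: {version!r}")
```

Since some 0.92-style feeds claim to be 0.91, a real reader probably cannot trust the declared version alone and may want to treat a 0.91 description as possibly containing escaped HTML.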

Anyone who knows better (such as anyone involved in RSS 1.0 development, on RSS 1.0) should feel free to correct me. Anyone compiling a FAQ should feel free to swipe from this post.

[1] http://backend.userland.com/rss091
[2] http://backend.userland.com/rss092
[3] http://purl.org/rss/1.0/spec
[4] http://purl.org/rss/1.0/modules/content/

Mark Paschal
markpasc@mindspring.com



But then this from Karl Ove Hufthammer:

IMO, it should *never* be encoded this way. I see it as abuse of the SGML/XML[1] language(s). The best solution would be to use namespaces[2].

> Or, is it only recommended?

[1] Yes, I know says it's OK to do this.

[2] This isn't possible in the current version of RSS 1.0, as 'description' can only contain PCDATA.



And this from Aaron Swartz:

Well, it's doable (since transmitting non-XML HTML is going to be a big requirement of the folks we hoped would use the module) but namespaces are certainly recommended.


My Take?

1. No HTML in the <title>.
2. Character-encoded (entity-escaped) HTML is allowed in the description and is in fact common. Anyone building a reader or aggregator should expect it. And if you are writing 0.91 or 0.92 RSS, feel free to go on doing this.
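For anyone building a reader that can only show plain text, one way to "expect it" is to strip the markup from a description after the XML parser has already undone the entity escaping. The following is a rough sketch using Python's standard `html.parser`; the names `_TextExtractor` and `description_to_text` are made up for this example, and a real aggregator would likely want more care (block-level spacing, links, and so on).

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def description_to_text(description: str) -> str:
    """Reduce a description that may contain HTML to plain text."""
    parser = _TextExtractor()
    parser.feed(description)
    parser.close()  # flush any text still buffered by the parser
    # Collapse runs of whitespace left behind by removed tags.
    return " ".join("".join(parser.parts).split())
```

A description with no markup passes through unchanged apart from whitespace normalization, so the same function can be applied to 0.91 and 0.92 feeds alike.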





