Bug: Smilies break RSS feed
Test case
- Another test is to load this URL into FeedReader, which also complains.
Environment
| TWiki version: | TWiki.org |
| TWiki plugins: | various, including SmiliesPlugin |
| Server OS: | |
| Web server: | |
| Perl version: | |
| Client OS: | Win2000 |
| Web Browser: | IE 5.5 or FeedReader 1.65b |
--
RichardDonkin - 15 Feb 2002
Fix record
A workaround is to avoid putting smilies near the top of pages (this page is OK, though).
The solution is to ensure that no HTML elements get passed through to the RSS feed, even plugin-generated ones. Perhaps plugin processing could simply be disabled on search results, as an option to FormattedSearch. Interestingly, the http:// link near the top of this page (i.e. SmiliesBreakRssFeed) is handled OK: it just comes through as plain text rather than as an HTML element.
I am not sure of the best way of doing this, and it may require real code rather than FormattedSearch.
--
RichardDonkin - 15 Feb 2002
The real problem is that plugins change the text in an unpredictable way. WebRssBase has a %SEARCH{}% that contains a $summary. This summary is formatted as plain text (by TWiki::makeTopicSummary()); that is, it contains neither HTML code nor links. This breaks if a plugin adds links or image tags.
The correct fix is to add a new plugin hook to makeTopicSummary so that each plugin can defuse its own later rendering. For example, the SmiliesPlugin could change :-) to :<nop>-) so that the later smilies rendering does not expand the smilies.
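A minimal sketch of the defusing idea, assuming a hard-coded smiley list and a hypothetical function name (defuse_smilies). It is not the real SmiliesPlugin code or an existing TWiki hook; it only shows the text transformation such a hook might perform:

```perl
use strict;
use warnings;

# Hypothetical smiley list for illustration only.
my @smilies = ( ':-)', ':-(', ';-)' );

# Insert <nop> after the first character of each smiley so that a later
# rendering pass no longer recognises the pattern.
sub defuse_smilies {
    my ($text) = @_;
    for my $smiley (@smilies) {
        my $defused = substr( $smiley, 0, 1 ) . '<nop>' . substr( $smiley, 1 );
        $text =~ s/\Q$smiley\E/$defused/g;
    }
    return $text;
}

print defuse_smilies("RSS works now :-)"), "\n";    # prints: RSS works now :<nop>-)
```

The <nop> would still have to survive until the smilies rendering runs, which is exactly the part a real hook would need to guarantee.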
--
PeterThoeny - 16 Feb 2002
Another option would be a 'no plugin rendering' option in FormattedSearch (i.e. just pass the text through without calling plugins). This would be fine for WebRss, since the contents of the RSS elements are meant to be plain text anyway, in contrast to normal searches where the summary info may need plugin processing. It would also avoid having to change plugins to support this.
I just got another instance of this problem on the Codev RSS feed (XML version), this time due to an HTML entity (&nbsp;) rather than a plugin. FeedReader coughed and Mozilla complained as follows:
XML Parsing Error: undefined entity
Location: http://twiki.org/cgi-bin/view/Codev/WebRss?skin=rss&contenttype=text/xml
Line Number 223, Column 111: <description> Collaboration Site Taking Wiki to the Nth Degree This is the result of my mind wandering over&nbsp; what's common between most Collaborative Tools Fundamental Features ...</description>
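The parser rejects &nbsp; because XML predefines only five named entities (amp, lt, gt, quot, apos). One possible way to make summary text safe, sketched here with hypothetical function names and a deliberately tiny entity table, is to rewrite known HTML entities as numeric character references and escape anything unknown:

```perl
use strict;
use warnings;

# The five entities every XML parser knows without a DTD.
my %xml_predefined = map { $_ => 1 } qw(amp lt gt quot apos);

# Small illustrative subset of HTML named entities and their code points.
my %html_codepoint = ( nbsp => 160, copy => 169, reg => 174 );

# Return an XML-safe replacement for the named entity &$name;
sub entity_for_xml {
    my ($name) = @_;
    return "&$name;" if $xml_predefined{$name};
    return "&#$html_codepoint{$name};" if exists $html_codepoint{$name};
    return "&amp;$name;";    # unknown name: escape the ampersand
}

sub fix_entities {
    my ($text) = @_;
    $text =~ s/&([A-Za-z][A-Za-z0-9]*);/entity_for_xml($1)/ge;
    return $text;
}

print fix_entities("wandering over&nbsp; what's common"), "\n";
# prints: wandering over&#160; what's common
```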
I should say that this RSS feed is very useful, thanks for adding it. TWiki now has
InstantNotification as long as you install something like
FeedReader on your desktop!
One side-effect of this - if people are going to notice changes more quickly, it would be good to reduce the lock timeout on TWiki.org, to 10 or 20 minutes perhaps?
--
RichardDonkin - 16 Feb 2002
I tried putting a <nop> into the UserProblems smilie, but this didn't fix it. Presumably the <nop> is stripped at some point and then the plugin attacks again.
It turns out that the SmiliesPlugin doesn't generate XHTML, so the <img src> tag does not end in />. When the RSS feed is generated, this results in an invalid (not well-formed) XML document (see XhtmlConsideredHarmful - in this case non-XHTML is harmful!). The best fix is to make this and other plugins generate XHTML. However, a shorter-term fix is to avoid plugin processing as mentioned above. Perhaps this could be configurable, so that a site that only uses XHTML-conformant plugins (and manually typed HTML in topics) can turn it back on.
--
RichardDonkin - 18 Feb 2002
Fixed Smilies code at TWiki.org to generate
XHTML:
$_[0] =~ s/(\s|^)$p(\s|$)/"$1<img alt=\"$emotion\" src=\"$TWiki::urlHost$TWiki::pubUrlPath\/$installWeb\/SmiliesPlugin\/$url\" \/>$2"/ge;
Can you test if this works now?
--
PeterThoeny - 18 Feb 2002
Thanks, that's fixed it - I put a smilie in
Sandbox.RichardsRssPage, and now opening
http://twiki.org/cgi-bin/view/Test/WebRss?skin=rss&contenttype=text/xml
in IE5.5 and
FeedReader produces no errors.
--
RichardDonkin - 18 Feb 2002
I need to reopen this bug: image tags generated by smilies do not pass the RSS feed test even though they are valid XHTML. This is with the latest TWikiAlphaRelease, which is also running here at TWiki.org. To validate, use
http://feeds.archive.org/validator/check
with URL /cgi-bin/view/Sandbox/WebRss?contenttype=text/xml&skin=rss
--
PeterThoeny - 11 Dec 2002
Unless the smilies plugin is somehow generating 8-bit characters, I don't think anything I've done would have affected this, but you never know... I think it's just that the smilies' use of the IMG tag is not allowed by the stricter validation at feeds.archive.org. The real fix is to generate RSS feeds that conform to the RSS 1.0 definition, rather than feeds that are just valid
XML - presumably this validator actually uses the DTD/Schema (as in
RichSiteSummary) to validate the RSS feed.
One option is to just turn off plugin processing when the new
$TWiki::pageMode is 'rss' - this would prevent this and any future plugin problems with RSS. Since RSS has a strict format there is no point running normal skins, for example, and any plugin processing done on the topic summary is probably not essential.
Alternatively, we could allow plugins to register so that they are run in 'rss' $pageMode - I think the default should be not to run them, since in most cases they are not relevant or actually harmful to the RSS feed.
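The opt-in idea could look roughly like the sketch below. The registry, both function names, and HypotheticalEntityPlugin are made up for illustration and are not part of the TWiki plugin API; only the 'rss' page mode value comes from the discussion above:

```perl
use strict;
use warnings;

# Plugins that have declared themselves safe for RSS output (hypothetical registry).
my %runs_in_rss;

sub register_for_rss {
    my ($plugin) = @_;
    $runs_in_rss{$plugin} = 1;
}

# In 'rss' page mode only registered plugins run; in any other mode all of them do.
sub plugins_to_run {
    my ( $page_mode, @plugins ) = @_;
    return @plugins unless $page_mode eq 'rss';
    return grep { $runs_in_rss{$_} } @plugins;
}

register_for_rss('HypotheticalEntityPlugin');
my @installed = ( 'SmiliesPlugin', 'HypotheticalEntityPlugin' );
print join( ', ', plugins_to_run( 'rss', @installed ) ), "\n";
# prints: HypotheticalEntityPlugin
```

Defaulting to "not run" means a newly installed plugin cannot break the feed until someone has checked its output.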
--
RichardDonkin - 11 Dec 2002
The IMG tag, like any non-RDF tag, is not allowed by the RSS definitions. If you need to put this or any other non-RDF-spec tag into the field, the right way is via the RSS 2.0 definition: you define a namespace that allows the feed to correctly use IMG or any other valid XHTML tag.
--
TomKagan - 11 Dec 2002
Good to have confirmation of this. Turning off plugin processing is one way to remove IMG etc., but the same problem could occur with other tags that are inserted by hand at the top of the topic (e.g. <font>).
What's really needed is a 'strip HTML markup' statement that is applied in makeTopicSummary when $pageMode is 'rss', i.e. when building an RSS feed. Time to break out the Jeffrey Friedl book again :-)
(Oops, another smiley, but not in the first two lines fortunately...)
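A rough sketch of such a 'strip HTML markup' step, assuming a hypothetical stand-alone function that is handed the summary text and the page mode. The regex is deliberately naive: it would be confused by a > inside an attribute value, for example.

```perl
use strict;
use warnings;

# Remove anything that looks like a tag, but only when building an RSS feed.
sub strip_markup_for_rss {
    my ( $text, $page_mode ) = @_;
    return $text unless $page_mode eq 'rss';
    $text =~ s/<[^>]*>//g;    # drop tags such as <font ...> or <img ... />
    $text =~ s/\s+/ /g;       # collapse the whitespace left behind
    $text =~ s/^ | $//g;      # trim leading and trailing space
    return $text;
}

print strip_markup_for_rss(
    '<font color="red">Smilies</font> break <img src="smile.gif" /> RSS', 'rss'
), "\n";
# prints: Smilies break RSS
```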
--
RichardDonkin - 11 Dec 2002
Normal HTML tags in the topic text are not a problem because they are stripped by the topic summary function. The img tag is added after that by the plugin. For now I filter out img tags when the skin name contains 'rss'. That solves the RSS feed problem with smilies; see Sandbox.TestSmiliesForWebRss
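For readers who want to see the shape of such a filter, here is a guess at what it might look like. This is not the code actually applied at TWiki.org; the function name and the way the skin name is passed in are assumptions.

```perl
use strict;
use warnings;

# Drop <img ...> tags from rendered text, but only for skins whose name contains 'rss'.
sub filter_img_for_rss {
    my ( $text, $skin ) = @_;
    return $text unless defined $skin && $skin =~ /rss/;
    $text =~ s/<img\b[^>]*>//gi;
    return $text;
}

print filter_img_for_rss( 'Fixed <img alt="smile" src="smile.gif" /> at last', 'rss' ), "\n";
```

Unlike a general markup stripper, this leaves every other tag alone, so it only addresses the smilies case.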
--
PeterThoeny - 02 Jan 2003