Jason Fried points out an interesting idea for an RSS reader: make it group items that link to the same story. This sounded like a simple task for FeedDemon’s XSLT-based newspaper styling – I figured, just use the Muenchian Method to group items that have the same <link>. Problem solved!
Then I remembered that the RSS <link> element is the URL of the item itself rather than the URL of the story being discussed. Oops.
One possible solution is to group items by title, since items that talk about the same thing often have the same title. However, because this isn’t always true, grouping items by title isn’t entirely reliable. (If you’re viewing this in FeedDemon, you can see what I mean by clicking here to apply a newspaper style that groups by title).
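To make the weakness of title-based grouping concrete, here is a minimal Python sketch (not FeedDemon's XSLT style). The item dictionaries and the `group_by_title` helper are hypothetical, invented only for illustration; titles are compared after lower-casing and collapsing whitespace.

```python
from collections import defaultdict


def group_by_title(items):
    """Group feed items whose titles match after simple normalization."""
    groups = defaultdict(list)
    for item in items:
        # Lower-case and collapse runs of whitespace so trivial
        # differences in spacing or capitalization do not split a group.
        key = " ".join(item["title"].lower().split())
        groups[key].append(item["feed"])
    return dict(groups)


# Hypothetical sample data: three posts about the same story.
items = [
    {"feed": "Blog A", "title": "Grouping RSS items"},
    {"feed": "Blog B", "title": "grouping  RSS items"},
    {"feed": "Blog C", "title": "A neat idea for feed readers"},
]

groups = group_by_title(items)
```

Blog A and Blog B end up together, but Blog C discusses the same story under a different title and lands in its own group, which is exactly why titles alone are not a reliable key.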
What’s really needed is a way to group by the links within each item’s description, but my limited XSLT experience has left me scratching my head over how to do this. So…are there any XSLT gurus out there who want to take a crack at this?
You could just use an XPath expression like item/description/a/@href, but it’s likely to be unreliable since a description may include a lot of links… I’ll think some more about it.
Wouldn’t this be difficult or even impossible for escaped descriptions? After all, the description element isn’t actually XML; it’s a big chunk of CDATA that just happens to be HTML when extracted. However, feeds that use XHTML in a namespace should be doable.
Given that FeedDemon seems to store the channels ‘escaped’ rather than ‘namespaced’ this might not actually be possible?
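The point about escaped descriptions can be sketched in Python with only the standard library. The sample item and the `LinkCollector` and `links_in_description` names are assumptions made up for this illustration. An XML parser sees the escaped description as plain text with no child elements, so an XPath step into `a` finds nothing; the text has to be parsed a second time as HTML to reach the links.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_in_description(description_text):
    """Parse the (already unescaped) description text as HTML."""
    collector = LinkCollector()
    collector.feed(description_text)
    collector.close()
    return collector.links


# Hypothetical item with an escaped HTML description.
ITEM_XML = """<item>
  <title>Example</title>
  <description>&lt;p&gt;See &lt;a href="http://www.example.com/story.html"&gt;this story&lt;/a&gt;.&lt;/p&gt;</description>
</item>"""

description = ET.fromstring(ITEM_XML).find("description")
child_elements = list(description)  # no <a> element exists at the XML level
links = links_in_description(description.text)
```

Here `child_elements` is empty while `links` holds the one URL, which is why the grouping key has to come from a second parsing step (or a string search) rather than from XPath alone.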
SharpReader has been doing this for about a year now, I believe.
My XSLT is a bit rusty, but could you use the contains() function to look through the descriptions?
Something like:
<xsl:for-each select="item[contains(description, 'http://www.somesite.com/link/url.html')]">
…
</xsl:for-each>
Then again, that would mean you’d have to know a link beforehand to try and match against. I’ll post this anyway in case it sparks another solution…
I have to agree with Dan. It seems to me like a job for a regular expression: find a URL and then check the other descriptions for it. You need XSLT 2.0 or possibly a JavaScript extension.
Question is, is it necessary to do this process inside XSLT? It could be easier to do that using some programming language.
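As a rough idea of what the regular-expression approach might look like outside XSLT, here is a small Python sketch. The URL pattern is deliberately crude, and the item dictionaries and the `group_by_shared_url` name are hypothetical; this is not how FeedDemon or any other reader actually does it.

```python
import re
from collections import defaultdict

# Crude pattern: an http(s) URL runs until whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"""https?://[^\s"'<>]+""")


def group_by_shared_url(items):
    """Map each URL to the titles of the items whose description mentions it,
    keeping only URLs mentioned by more than one item."""
    by_url = defaultdict(list)
    for item in items:
        # set() so an item that repeats a link is only counted once for it.
        for url in set(URL_PATTERN.findall(item["description"])):
            by_url[url].append(item["title"])
    return {url: titles for url, titles in by_url.items() if len(titles) > 1}


# Hypothetical sample data.
items = [
    {"title": "Post A",
     "description": '<a href="http://www.example.com/story.html">story</a>'},
    {"title": "Post B",
     "description": 'Also see <a href="http://www.example.com/story.html">this</a> '
                    'and <a href="http://www.example.org/other">that</a>'},
    {"title": "Post C", "description": "No links here."},
]

related = group_by_shared_url(items)
```

Post A and Post B are grouped under the shared story URL, the link that only Post B mentions is dropped, and Post C, which has no links at all, never joins a group.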
Yes, this could certainly be done within FeedDemon itself. I was just hopeful it could be done in XSLT :)
Try this http://www.write.cz/rssgroup/index.xml
It goes through every description and looks for other descriptions containing the same URL (if any). Maybe it helps. Works under Internet Explorer only.
Here are the XML and XSL files for download: http://www.write.cz/rssgroup/rssgroup.zip
Nick,
Do you know what the link is in the first place? For example, will there be an XSLT parameter or XML node that specifies the link that’s being matched against? Your real problem is keeping track of what links have previously been matched against [I think. There may be a solution to that problem as well].
Jan –
Very cool. Although while testing, http://www.example.com/ and http://www.example.com/index.htm are picked up as different URLs.
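One possible hack for this is to normalize each URL before comparing. The Python sketch below is only that, a sketch: the `normalize_url` name and the list of default document names are assumptions, and treating `index.htm` as equivalent to the bare directory is a guess about how a server is configured, not something a URL guarantees.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these file names are served as the directory's default document.
DEFAULT_DOCUMENTS = ("index.htm", "index.html")


def normalize_url(url):
    """Return a comparison key for a URL: lower-case scheme and host,
    empty path treated as "/", default document and fragment removed."""
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    if last_segment.lower() in DEFAULT_DOCUMENTS:
        # Drop the file name but keep the trailing slash of its directory.
        path = path[: len(path) - len(last_segment)]
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


same = normalize_url("http://www.example.com/") == normalize_url(
    "http://www.example.com/index.htm"
)
```

With this key the two URLs from the test compare as equal, at the cost of occasionally merging pages that a differently configured server would treat as distinct.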
David, you’re surely right. I just wanted to show a possible way to do it; it’s going to need some hacks.
BTW, I’ve cleaned that code a bit and made a non-javascript version (index2.xls). Download at http://www.write.cz/rssgroup/rssgroup.zip
Jan, this is great – thanks! I’ve taken your example and made it into a FeedDemon style, which can be downloaded from http://www.bradsoft.com/feeddemon/getstyles/1.0/Related.fdxsl
Nick, looks nice, just one thing: at line 62, the select should include a $url != '' condition; otherwise it treats all of the news items without links as related.
Thanks for the correction, Jan – I’ve uploaded the changed FDXSL to the same location.