March 11, 2004

RSS 2 Advogato

My initial plan for this site was to use Rawdog to suck my Advogato diary across, but I've since found that Advogato rather limits what I can do. The big problem is that I like to clearly separate the different topics I'm talking about. The sanest ways to do that on Advogato seem to be:

  • Only talk about one thing per day
  • Fake a heading with <p><strong> My Heading </strong></p>

Both are ugly at best. It wouldn't be so bad if Advogato allowed real heading tags, but that isn't an option. So the new plan is to work in reverse.

I now have Movable Type installed on this server and the game plan is:

  1. Write nice templates for it
  2. Write suitable RSS feeds for it including a full entry feed
  3. Write software that will:
    1. download the RSS feed
    2. splurge together all the entries from each day
      • convert the entry titles to the nasty strong paragraph code above
      • discard, or convert to other markup, anything that Advogato doesn't allow (like <code>)
    3. talk to Advogato via XML-RPC and get my diary from there
    4. compare the combined RSS entries to the diary
      1. Update any changed entries
      2. Add any new entries
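The combining and comparing steps of that plan can be sketched roughly as follows. This is only a minimal sketch in Python, not the finished tool: the function names, the `ALLOWED_TAGS` whitelist, and the idea that diary entry N lines up with day N of the feed are all assumptions made for illustration, and downloading the feed and talking to Advogato are left out entirely.

```python
import re
from html import escape

# Assumption: a made-up whitelist. The real set of tags Advogato accepts
# would need checking; <code> is deliberately absent.
ALLOWED_TAGS = {"p", "a", "b", "i", "em", "strong", "ul", "ol", "li",
                "blockquote", "pre", "tt"}

TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)[^>]*>")


def strip_disallowed(html):
    """Remove tags that are not whitelisted, keeping the text inside them."""
    def keep_or_drop(match):
        return match.group(0) if match.group(1).lower() in ALLOWED_TAGS else ""
    return TAG_RE.sub(keep_or_drop, html)


def combine_by_day(entries):
    """Merge (date, title, body_html) tuples into one HTML blob per day.

    Each title becomes the strong-paragraph pseudo heading; entries keep
    the order in which they were given.
    """
    days = {}
    for date, title, body in entries:
        chunk = "<p><strong> %s </strong></p>\n%s" % (
            escape(title), strip_disallowed(body))
        days.setdefault(date, []).append(chunk)
    return {date: "\n".join(chunks) for date, chunks in days.items()}


def plan_sync(combined, diary):
    """Compare per-day HTML (oldest first) with the existing diary entries.

    Assumes entry i of the diary corresponds to day i of the feed.
    Returns (updates, additions): updates is a list of (index, new_html)
    for entries that differ, additions is the list of days not yet posted.
    """
    updates = [(i, new) for i, (new, old) in enumerate(zip(combined, diary))
               if new != old]
    additions = combined[len(diary):]
    return updates, additions
```

Feeding `list(combine_by_day(entries).values())` and the diary fetched from Advogato into `plan_sync` would then give the list of entries to update and the list to add.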

My first task is to decide what language I'm going to write this thing in. I've been experimenting with Python lately, and it seems quite good at XML. On the other hand, I am rather more experienced with Perl. Java is a no-no: while I love JavaDoc and 'everything is an object', it's a real pain to work with when dealing with variable-length arrays.

Python would allow me to use FeedParser, which would make life easy. The major problem I have with FeedParser is that it's liberal. I really dislike liberal XML parsers: tools that support bad feeds let authors get away with bad feeds, which forces new tools to support bad feeds too, which raises the barrier to entry for anyone wishing to write such a tool. Eventually you get locked into a cycle of liberalness and you get everything that is bad with the web today.
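To make the strict side of that argument concrete, here is a small sketch of what strict parsing looks like in Python. It uses the standard library's `xml.etree.ElementTree` rather than FeedParser, and `parse_feed_strictly` is a hypothetical helper name; the point is only that a feed which isn't well-formed gets rejected outright instead of being quietly patched up.

```python
import xml.etree.ElementTree as ET


def parse_feed_strictly(xml_text):
    """Return (title, description) pairs for every <item> in the feed.

    ET.fromstring raises ET.ParseError if the document is not well-formed,
    so a broken feed never gets as far as producing entries.
    """
    root = ET.fromstring(xml_text)
    return [(item.findtext("title", default=""),
             item.findtext("description", default=""))
            for item in root.iter("item")]
```

A caller would wrap this in `try`/`except ET.ParseError` and refuse to post anything when the feed is broken.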

Since I am going to be working with data I control, I know that I don't need a liberal parser, so I can pick and choose any old XML engine. I'm now rather tempted by Perl, since then I get to use LWP for getting the feeds in the first place (and I have a nice little book on the subject) which should make it really easy.

Posted at March 11, 2004 10:12 AM

Comments