Information overload: blog filtering

04 Jan 2005

I recently stopped reading blogs with SharpReader. It’s a great product, but I was monitoring over a hundred feeds, and that’s just too much information coming in. There is no way to get through all that and still get your job done. I have now started over from scratch with BlogLines and am trying to think twice before adding a new feed (currently at 10).

I remember the IT Conversations/BloggerCon podcast where Robert Scoble talks about the hours he spends each day reading 915(!) feeds. OK, there are people who only post a couple of articles a week, but there are also blogs like the excellent MetaFilter that give you between 10 and 20 new stories every day. How much time per day does it take a person to make 1000 decisions like “Am I going to read this?”, “Should I click on this link?” or “Should I put this in my favourites/blog list/furl list/…?”

One way to go is to let someone else do the filtering. You could stick to reading 10 blogs that compile the juicy bits. I’ve seen 3 kinds of these aggregated blogs:

Repost: Scoble has an aggregator blog where he re-posts (copy/paste) interesting articles he came across while wading through oceans of information. Insofar as Robert’s interests match yours, that is an option.
Excerpt: BoingBoing and the above-mentioned MetaFilter have a group of people posting interesting links embedded in short articles (5-10 lines) that summarise the content or give some background info, leaving it to you to decide whether you want to click through.
Link blog: Jeremy Zawodny has a link blog where he just posts links, with hardly any explanation. You get a dozen posts per day with a title like “An Article Buried in Junk” and you then have to decide whether you want to click on it and read it. No background, no personal opinion like “Good point”/“Hilarious rant”/“Despicable proposition”/…, nada. As a filtering tool it does not work for me, but I can imagine that for Jeremy himself it works like a recording of his mouse wanderings, an outboard brain. “What was that eBay story again that I saw a couple of weeks ago? Oh, I put it on my linkblog! So I can Google it!”

I guess we’ll have to wait for someone to combine Bayesian or collaborative filtering with a feed aggregator, as Wesner Moise suggests. How could this work?

client-based: If there were a tool like Popfile for blogs (this would be an HTTP proxy that can interpret RSS/Atom feeds and mark the interesting posts by changing the title), it could ‘learn’ which posts you find interesting and use its Bayesian filtering to find similar articles. Popfile was made for spam detection, so maybe we have to look at spam-oriented collaborative filtering efforts like Razor to get this on our desktop.
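To make the client-based idea a bit more concrete, here is a minimal sketch of the ‘learning’ part: a tiny naive Bayes classifier that scores a post and marks its title. This is not how Popfile is actually implemented; the names `PostFilter` and `mark_title`, the two labels and the `[*]` marker are all my own assumptions, and the HTTP proxy and RSS/Atom parsing are left out entirely.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class PostFilter:
    """Hypothetical naive Bayes filter with two labels: 'interesting' and 'boring'."""

    def __init__(self):
        self.word_counts = {"interesting": Counter(), "boring": Counter()}
        self.post_counts = {"interesting": 0, "boring": 0}

    def train(self, text, label):
        """Record the words of one post the reader has labelled."""
        self.word_counts[label].update(tokenize(text))
        self.post_counts[label] += 1

    def score(self, text):
        """Estimate the probability (0..1) that a post is interesting."""
        vocab = set(self.word_counts["interesting"]) | set(self.word_counts["boring"])
        total_posts = sum(self.post_counts.values())
        log_prob = {}
        for label, counts in self.word_counts.items():
            # Smoothed prior, so an untrained label never produces log(0).
            log_p = math.log((self.post_counts[label] + 1) / (total_posts + 2))
            total_words = sum(counts.values())
            for word in tokenize(text):
                # Add-one smoothing; the extra +1 reserves mass for unseen words.
                log_p += math.log((counts[word] + 1) / (total_words + len(vocab) + 1))
            log_prob[label] = log_p
        # Turn the log-odds into a probability, clamped to avoid overflow.
        diff = log_prob["boring"] - log_prob["interesting"]
        diff = max(min(diff, 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


def mark_title(post_filter, title, body, threshold=0.5):
    """Prefix the title with a marker when the post looks interesting."""
    if post_filter.score(title + " " + body) >= threshold:
        return "[*] " + title
    return title
```

You would call `train()` every time you tell the tool “I liked this” or “I did not”, and the proxy would run `mark_title()` on every incoming item before handing the feed to your reader. With only a handful of training posts the scores are very rough; a real tool would need a lot more data and probably a smarter tokenizer.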

server-based: If you read your feeds through a web interface, and the actual content is on a central server, it is much easier to enable filtering. I think BlogLines would be an excellent candidate for this. They already suggest new feeds to add, based on the ones you’re already subscribed to. They just need to dive a level deeper and work on the article level. And the user should have something like a [– ++] rating toolbar under each article. Mark Fletcher, I’m counting on you!
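As a rough illustration of what article-level collaborative filtering on the server could look like, here is a small sketch. It assumes each click on the toolbar is stored as a number from -2 to +2 per reader and per article, and it predicts a rating for an unread article from the ratings of readers with similar taste. The data layout and the function names are hypothetical; I have no idea how BlogLines stores anything.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two readers' ratings (article id -> rating)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[article] * b[article] for article in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(ratings, user, article):
    """Similarity-weighted average of other readers' ratings for one article.

    Returns 0.0 when nobody comparable has rated the article yet.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or article not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        # A reader with opposite taste (negative similarity) flips the vote.
        weighted_sum += sim * other_ratings[article]
        weight_total += abs(sim)
    return weighted_sum / weight_total if weight_total else 0.0


# Made-up ratings on a -2 (the "--" button) to +2 (the "++" button) scale.
ratings = {
    "ann": {"a1": 2, "a2": -2, "a3": 1},
    "bob": {"a1": 2, "a2": -1, "a4": 2},
    "cat": {"a1": -2, "a2": 2, "a4": -2},
}

# ann has not read a4; bob (similar taste) liked it, cat (opposite taste) did not.
suggestion = predict(ratings, "ann", "a4")
```

Here `suggestion` comes out positive, so the server could move article a4 to the top of ann’s list. The nice thing about doing this centrally is that the ratings of all readers sit in one place; a desktop reader would only ever see its own.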

Let’s see how long it takes before I can say “Told you so”!

Peter Forret
