Edgeio: edge aggregator

Michael Arrington has just launched his new baby: Edgeio, a classifieds aggregator. Edgeio will spider and index anyone’s feed and aggregate the posts tagged with “listing”. It then clusters the other tags in order to attach the post to the right classifieds category. The revolutionary thing here is that Edgeio does not require you to post your offer on their own site: they go and take it from yours. Edgeio clearly states that classifieds are only a starting example, a proof of concept for an idea that is much broader than that (which sounds like an echo of the Google Base launch).
As I understand it from their specs, they use the standard <category> element from the RSS spec, as in <category>listing</category>, and no microformats (see further).
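
To make the mechanism concrete, here is a minimal sketch of how a spider might pick the “listing” posts out of an RSS 2.0 feed and keep the remaining tags for categorisation. It is only my reading of the idea, not Edgeio's actual code: the sample feed, the example.com URLs and the function name find_listings are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the posts and URLs are invented for this sketch.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>Road bike for sale</title>
      <link>http://example.com/posts/1</link>
      <category>listing</category>
      <category>bicycle</category>
      <category>for sale</category>
    </item>
    <item>
      <title>Holiday photos</title>
      <link>http://example.com/posts/2</link>
      <category>travel</category>
    </item>
  </channel>
</rss>"""


def find_listings(feed_xml, marker="listing"):
    """Return (title, link, other_tags) for every item tagged with `marker`."""
    root = ET.fromstring(feed_xml)
    listings = []
    for item in root.iter("item"):
        # Normalise the <category> values so "Listing" and "listing" match.
        tags = [(c.text or "").strip().lower() for c in item.findall("category")]
        if marker in tags:
            # The remaining tags are what a classifier could cluster on.
            other_tags = [t for t in tags if t != marker]
            listings.append((item.findtext("title"), item.findtext("link"), other_tags))
    return listings


if __name__ == "__main__":
    for title, link, tags in find_listings(SAMPLE_FEED):
        print(title, link, tags)
```

The interesting (and hard) part, mapping the leftover tags to a classifieds category, is deliberately left out here.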

Pioneer?

The pioneers of this type of aggregation are Technorati (specifically Tantek Çelik): they have been using the rel=tag microformat <a href="http://technorati.com/tag/music" rel="tag">music</a> (instructions for laymen and experts) since the beginning of 2005.
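
For comparison, a rough sketch of how rel=tag links could be pulled out of a page's HTML. It assumes the usual rel=tag convention that the tag is the last path segment of the link's href; the class and function names are my own and have nothing to do with Technorati's real crawler.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class RelTagParser(HTMLParser):
    """Collects the tags from <a rel="tag" href="..."> links."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel can hold several space-separated values, e.g. rel="tag nofollow".
        rel_values = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "tag" in rel_values and href:
            # Assumption: the tag is the last path segment of the href.
            path = urlparse(href).path.rstrip("/")
            self.tags.append(unquote(path.rsplit("/", 1)[-1]))


def extract_tags(html):
    """Return the rel=tag values found in an HTML fragment, in document order."""
    parser = RelTagParser()
    parser.feed(html)
    return parser.tags


if __name__ == "__main__":
    sample = '<a href="http://technorati.com/tag/music" rel="tag">music</a>'
    print(extract_tags(sample))
```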
The people at Technorati have also created “Most Popular” pages based on the same principle:

  • Popular News: “The news stories people are talking about right now, ordered by new links to news sites in the last 48 hours.”
  • Popular Movies: “The movies people are talking about right now, ordered by new links to the Internet Movie Database in the last 48 hours.”
  • Popular Books: “The books people are talking about right now, ordered by new links to Amazon in the last 48 hours.”
  • Popular Blogs: “The biggest blogs in the blogosphere, as measured by unique links in the last six months.”

The basic concept is: find a link type that identifies a topic/resource (e.g. http://www.imdb.com/title/tt0388795/ for the movie “Brokeback Mountain”) and aggregate all blog posts that contain such a link. For books it makes sense to use Amazon as the principal link source, but it is more difficult to find a good one for music CDs (Amazon? iTunes? CDNow? CDBaby?), TV shows, political candidates, theatre plays, …
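
A minimal sketch of that principle, using IMDb title links as the identifying link type. The post structure (a dict with an "html" field), the regular expression and the function name are assumptions of mine, and I leave out the 48-hour window to keep it short.

```python
import re
from collections import Counter

# Assumption: an IMDb title link looks like http://www.imdb.com/title/tt<digits>/
IMDB_TITLE = re.compile(r"https?://(?:www\.)?imdb\.com/title/(tt\d+)")


def popular_resources(posts, pattern=IMDB_TITLE):
    """Rank resources by the number of posts that link to them.

    `posts` is a list of dicts with an "html" key (a made-up structure).
    """
    counts = Counter()
    for post in posts:
        # set() so that a post linking twice to the same movie counts once.
        for resource_id in set(pattern.findall(post["html"])):
            counts[resource_id] += 1
    return counts.most_common()


if __name__ == "__main__":
    posts = [
        {"html": '<a href="http://www.imdb.com/title/tt0388795/">Brokeback Mountain</a>'},
        {"html": "Saw http://www.imdb.com/title/tt0388795/ again last night."},
        {"html": "Nothing about movies in this one."},
    ]
    print(popular_resources(posts))
```

Swapping in another pattern (Amazon product pages, say) gives a different chart from the same pile of posts, which is exactly why the choice of link source matters.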

Competition?

  • I think Technorati, if it wanted to, could create an Edgeio-like spin-off in under a week. One small difference is that Technorati spiders sites (all HTML), not just feeds (post contents + metadata).
  • Google Blog Search certainly has the horsepower, but little experience with microformats. Then again, they have the world’s biggest war chest, so they could buy talent and resources. And the PageRank reputation ranking could come in handy.
  • The ‘smaller’ blog search players: Feedster, Ask/Bloglines, IceRocket, … already have the content, but would still have to develop the service.
  • Feedburner could move in that direction too, but then only for their own burned feeds. Or they could sell a service to companies like Edgeio to bulk download changed feeds from all Feedburner feeds in one go.

Edge aggregator

The direction in which Edgeio and/or Technorati could evolve is that of a “generic edge aggregator”. In the end we don’t want 5000 different services all scraping our blog feeds for each little niche application. The ideal would be a handful of aggregators that provide APIs to data and aggregation services, either paid or monetized through contextual advertising. Imagine a hypothetical ‘edge aggregation’ provider “GoogRatiO”.

  • GoogRatiO spiders and indexes ALL feeds of all blogs. Oh heck, it even keeps a cached copy of each post.
  • GoogRatiO allows anyone to set up a new project on a URL myproject.googratio.com. In the project settings, you can specify which URLs should be tracked and how much weight recency, frequency and reputation should get, and it would automatically show a hit parade of the top 10/50/100. E.g. the dance music site Juno could set up juno.googratio.com that tracks all http://www.juno.co.uk/artists/…/ links in blog posts of the last month and shows an hourly updated Buzz chart of the top 20. GoogRatiO places contextual advertising on each page.
  • GoogRatiO has an API that allows a third party to use its database. It includes functions like Get_all_posts_for_URL_base("http://juno.co.uk/artists") and Get_aggregated_buzz_for_URL_base("..."). Below 1000 requests/day GoogRatiO is free. Above that: subscriber fee.
  • GoogRatiO also calculates a ‘reputation’ for each blog feed. This is needed to deal with splogs and other scam artists, so not every link carries the same weight. Compare it to Technorati’s “blog authority” or Google’s “PageRank”. For a company like Juno, a link from the Rolling Stone blog is worth more than one from a Blogspot site a 14-year-old fan just set up.
  • GoogRatiO will link blogs to actual sales (with money being paid and all). So it could come up with some inventive ways of redistributing affiliate fees.
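
Since GoogRatiO does not exist, the following is purely a thought experiment: an in-memory sketch of what the two API functions and the reputation weighting could look like. The post structure, the reputation table, the default weight of 0.0 for unknown blogs and the shop.example URLs are all invented for illustration.

```python
from collections import defaultdict


def get_all_posts_for_url_base(posts, url_base):
    """Return the posts that contain at least one link starting with url_base."""
    return [p for p in posts if any(link.startswith(url_base) for link in p["links"])]


def get_aggregated_buzz_for_url_base(posts, url_base, reputation, top_n=20):
    """Return the top_n tracked URLs, scored by the reputation of the linking blogs.

    `reputation` maps a blog name to a weight; blogs that are not in the table
    get 0.0 here (an arbitrary choice for this sketch).
    """
    scores = defaultdict(float)
    for post in posts:
        weight = reputation.get(post["blog"], 0.0)
        # set() so that a post repeating a link does not count twice.
        for link in set(post["links"]):
            if link.startswith(url_base):
                scores[link] += weight
    # Highest score first; ties are broken alphabetically by URL.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    posts = [
        {"blog": "big-magazine", "links": ["http://shop.example/artists/a/"]},
        {"blog": "new-fan-blog", "links": ["http://shop.example/artists/b/"]},
    ]
    reputation = {"big-magazine": 10.0, "new-fan-blog": 1.0}
    print(get_aggregated_buzz_for_url_base(posts, "http://shop.example/artists/", reputation))
```

Recency and frequency would enter as extra factors in the weight; I kept only reputation because that is the one that deals with splogs.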

Imagine the ease with which applications like “Most popular YouTube video”, “Most popular De Standaard newspaper article”, … could be created. The Long Tail at work!

Edgeio Buzz Timeline

As a professional reporter on Web 2.0 projects, Mike knows exactly how to plan the buzz for his new project:

  • 2005-10-07: Teaser: “Edgeio will give you the ability to do new and (we think) really exciting things with your blog” – Techcrunch
  • 2006-02-02: SDForum announcement “All Your Classifieds Belong To Us” – Jeff Clavier
  • 2006-02-09: “Teare spilled a lot of beans tonight at an SDForum online-classifieds event at the GooglePlex” – BusinessWeek, Dave Winer, Scobleizer
  • 2006-02-11: “We will be focusing on classified listings of any type to start” – 1st post on Edgeio blog
  • 2006-02-12: “Mike Arrington called me today and gave me a demo” – Mashable
  • 2006-02-18: “I was given a personal tour” @ TechCrunch NakedConversations Party – Dan Farber @ ZDNet
  • 2006-02-20: Buzz acceleration – Buzzmachine, SiliconBeat, A VC
  • 2006-02-27: Official launch: Techcrunch, helped by Om Malik, Read/Write Web, WeBreakStuff

(There are obviously clear advantages in finding seed investors/business consultants/software developers/media buddies that are also A-list bloggers.)

This is certainly a project to follow!
PS: thanks to Bart and Francois for bringing Edgeio to my attention.


3 thoughts on “Edgeio: edge aggregator”

  1. On the concept of an Edge Aggregator:

    The Technorati people should be inspired by your ideas to extend the Technorati API: http://developers.technorati.com/wiki/TechnoratiApi !

    On a related idea that came to mind:

    http://rssping.com/ is an effort for enriching the info sent to pinging services, so that blog search engines do not need to visit the blogs anymore. As far as I can see, it could be used to send the semantically significant parts of a posting (or the entire posting for that matter).
    However, as with so many lofty efforts, it doesn’t work. Since blog search engines compete with each other on features, none of them will limit itself to the minimum agreed upon in the standards of rssping.

    On EdgeIO itself:

    As nice as the concept of distributed databases may be, in this particular example it is just not going to work. There is no way an eBay-type service can thrive on the listings submitted by those few tagging and microformat freaks around the planet. And the reliance on that “listing” tag makes it ridiculously vulnerable to spam or sabotage (even Google Base doesn’t seem to be able to cope with spam: http://mashable.com/?p=188 ). (And as you said, if it DID work, competing services like Technorati or IceRocket could set up a competitor in a few days.)
