Monthly Archive for November, 2005

Professional Communicator

Just received a sweet little email from self-proclaimed “professional communicator” Jody Robb, about a free RGB-2-CMYK converter tool of mine:

From: Jody Robb [mailto:jr@***]
Subject: Poxy program

I wasted ten minutes on your poxy program. If you’re gonna put stuff up on the net, make sure it works, arsehole.

People like you who waste other people’s time and resources to show off are the slime in the gutter of the internet.

Jody Robb
Digital Media Producer/Consultant
(…) Australia

followed by an email 5 minutes later: “It does work but you’re still a fuckwit”.

You make us non-professional communicators blush, Jody!


Google Toppers: pick your title carefully

Everyone with a bit of SEO (Search Engine Optimization) experience knows that the title of your HTML pages is crucial. But just how strong is that tiny part of your HTML? When I noticed I had become #1 for the Google query “media technology belgium” (3 words I put in my blog title), I started investigating a bit further. I was first out of a total of 18 million pages (according to the “of about” part in the top right of each Google results page). Could I do better than that? I could (as can be seen further down), but there are much stronger examples from other blogs.

Some remarks up front:

  • Keep in mind that Google has indexed something like 15 billion pages, so the theoretical Valhalla of search results would be: #1 of about 15,000,000,000 pages. Impossible to reach of course (except if Google allowed searching for the character “.” in a URL, for example), but let’s see how close we get.
  • Search term combinations that include e.g. the word “blog” will always return fewer than 512,000,000 results, since that is the number of pages in Google for “blog”.
  • Sometimes a query with 2 words returns fewer results than a query with the same 2 words plus 1 extra word. That does not make sense, but these results might come from 2 different servers/stat versions/indexes.
  • All results below are subject to change. You might do the same query now and get another #1 or another total. It’s normal, Google is a huge beast that is always on the move.

Google Toppers Game

Try to guess the blogs that can claim these word combinations in Google:

EASY

MEDIUM

HARD

Do you know of any other impressive search queries that can be claimed by a blog? Is there a #1 of about 1,000,000,000? Add them to the comments!


Filling a terabyte iPod

Munster said that within five years, Apple could release an iPod with one terabyte of storage, that’s almost 17 times the maximum amount of iPod storage Apple currently offers.
Munster envisions a one terabyte iPod as a portable, “coffee table” media center that would allow users to store hundreds of movies and thousands of photos and songs.
cnn.com

A 1000 GB iPod, that is

  • 200 movies or 370 hours of full quality DVD
  • up to 2000 hours (almost 3 months non-stop) at DivX/Xvid/MPEG-4 quality
  • using the H.264 video compression: 120 days or 4 months of video!
  • 1500 music albums of full quality CD (which means, no Sony XCP)
  • 15,000 albums if you rip/compress them to MP3 first, maybe 20,000 if you use WMA/AAC (that is over 2 years of audio to listen to!)
  • 2500 episodes or 100 seasons of TV series like Lost, L-Word, Desperate Housewives, Sopranos, … in compressed format (hey, it’s a 2.5″ screen, who cares about HD?)
  • If your terabyte iPod breaks down and you buy a new one, it will take you between 3 hours (FireWire, 800 Mbps) and 2 days (Wi-Fi 802.11g) to fill it up again (from the backup you had of course put on your snug little home 10TB RAID-5 storage cluster thingy).
    If by then all portable devices have 10-Gbit Ethernet built in, 15 minutes will be enough to fill ‘er up.
  • Our then-standard 48 megapixel camera would create 72MB RAW images, of which the iPod could store 14,000, or, if you compressed them to 5MB JPEG, 200,000 pictures.
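
For anyone who wants to redo the math: the fill-up times and picture counts above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch. It assumes decimal units (1 GB = 10^9 bytes), the nominal link rates (800 Mbps for FireWire, 54 Mbps for 802.11g) and no protocol overhead, so real transfers would be slower. The function names are made up for illustration.

```python
# Back-of-the-envelope check of the fill-up times and picture counts.
# Assumptions (not from the post): decimal units, nominal link rates,
# no protocol overhead and no disk speed limits.

CAPACITY_BYTES = 1000 * 10**9  # the 1000 GB iPod

# Nominal link rates in bits per second
LINKS = {
    "FireWire 800": 800 * 10**6,
    "Wifi 802.11g": 54 * 10**6,
    "10-Gbit Ethernet": 10 * 10**9,
}


def fill_time_seconds(capacity_bytes, bits_per_second):
    """Seconds needed to move capacity_bytes over a link at its nominal rate."""
    return capacity_bytes * 8 / bits_per_second


def items_that_fit(capacity_bytes, item_bytes):
    """Whole number of items of item_bytes each that fit in capacity_bytes."""
    return capacity_bytes // item_bytes


for name, rate in LINKS.items():
    hours = fill_time_seconds(CAPACITY_BYTES, rate) / 3600
    print(f"{name}: {hours:.1f} hours")

print(items_that_fit(CAPACITY_BYTES, 72 * 10**6), "RAW images of 72 MB")
print(items_that_fit(CAPACITY_BYTES, 5 * 10**6), "JPEG images of 5 MB")
```

Under these assumptions FireWire comes out just under 3 hours, 802.11g at roughly 41 hours and 10-Gbit Ethernet at about 13 minutes, which matches the rounded figures in the list.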

Other predictions: the iPhone (or Apple as mobile virtual network operator) and the iTIVO, a media-center/time-shifting/TV/video/DVD hub, all in the next 12-24 months. Let’s hope this inspires some people to seriously revamp their design/user interface teams (Nokia, Microsoft, I’m looking at you!).


Google: turning cash into cache

Google and dark fiber

Interesting theory on why Google is buying up miles and miles of ‘dark fiber’ hi-speed optical network (they have been doing this since at least 2002).

The probable answer lies in one of Google’s underground parking garages in Mountain View. There, in a secret area off-limits even to regular GoogleFolk, is a shipping container. But it isn’t just any shipping container. This shipping container is a prototype data center. Google hired a pair of very bright industrial designers to figure out how to cram the greatest number of CPUs, the most storage, memory and power support into a 20- or 40-foot box. We’re talking about 5000 Opteron processors and 3.5 petabytes of disk storage that can be dropped-off overnight by a tractor-trailer rig. The idea is to plant one of these puppies anywhere Google owns access to fiber, basically turning the entire Internet into a giant processing and storage grid.
While Google could put these containers anywhere, it makes the most sense to place them at Internet peering points, of which there are about 300 worldwide.

Two years ago Google had one data center. Today they are reported to have 64. Two years from now, they will have 300-plus. The advantage to having so many data centers goes beyond simple redundancy and fault tolerance. They get Google closer to users, reducing latency. They offer inter-datacenter communication and load-balancing using that no-longer-dark fiber Google owns. But most especially, they offer super-high bandwidth connections at all peering ISPs at little or no incremental cost to Google.
from “Google-Mart” (Cringely)

King of cache

Having a datacenter close to every customer is of course the best way to do swift and reliable caching. Presumably not for its primary activity, web search, since caching search responses does not make much sense (see “Pareto doesn’t do search”). That kind of local storage/network capacity can nevertheless be used to cache other stuff:

  • web pages for Google Accelerator
  • email messages for Gmail
  • images for Google Images
  • video for Google Video
  • audio for Google Audio (doesn’t exist yet, but is bound to be created at some point)
  • voicemail for Google Phone (not yet, but …)
  • peer-2-peer caching: BitTorrent, P2P telephony…
  • And of course also hosting of users’ content. Google makes money by indexing it, making it searchable and putting contextual ads next to it, so they can host people’s audio/video/images for free.

And all that processing power (with some simple extrapolation: 17 Teraflops per datacenter block) could be used to index, filter and convert data, or just be rented out for distributed computing.

Orders of magnitude

How big is a petabyte? A thousand terabytes, but that doesn’t say much. “If digitized with full formatting, the seventeen million books in the Library of Congress contain about 136 terabytes of information.” (Berkeley 2003) So a datacenter of 3.5 petabytes is about 25 Libraries of Congress. At the fastest network speed we know, 10 Gbit Ethernet or 108 TB/day, it would take 32 days to fill one such datacenter. And it demands somewhere between 5 and 10 megawatts of power.
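
These numbers are easy to re-derive. Here is a small Python sketch, assuming decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and a 10 Gbit link running flat out at its nominal rate for 24 hours a day. The variable and function names are just illustrative.

```python
# Rough check of the "orders of magnitude" figures, in decimal units.
# Assumption (not from the post): the link runs at its nominal rate
# around the clock, with no overhead.

TB = 10**12
PB = 10**15

DATACENTER_BYTES = 35 * PB // 10      # 3.5 PB per container
LIBRARY_OF_CONGRESS_BYTES = 136 * TB  # Berkeley 2003 estimate
LINK_BITS_PER_SECOND = 10 * 10**9     # 10 Gbit Ethernet
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_day(bits_per_second):
    """Bytes moved in one day over a link at the given nominal bit rate."""
    return bits_per_second // 8 * SECONDS_PER_DAY


libraries = DATACENTER_BYTES / LIBRARY_OF_CONGRESS_BYTES
daily_bytes = bytes_per_day(LINK_BITS_PER_SECOND)
days_to_fill = DATACENTER_BYTES / daily_bytes

print(f"Libraries of Congress per container: {libraries:.1f}")
print(f"10 Gbit Ethernet moves {daily_bytes / TB:.0f} TB/day")
print(f"Days to fill one container: {days_to_fill:.1f}")
```

The link does indeed work out to 108 TB per day, and the other two results land a bit above 25 libraries and a bit above 32 days, so the rounded figures hold up.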
