Idea: hosted classification service

Yesterday evening I was watching “How to replace yourself with a very small shell script” by Hilary Mason.

[youtube width="500" height="360"]http://www.youtube.com/watch?v=IoQ4tka1zNk[/youtube]

In short: she uses some scripts to process incoming mail and send outgoing reminders. The part that really interested me is the one where she uses classification, probably naive Bayes, to extract topics from the tweets of her friends.

That made me think about Paul Graham’s famous spam essay (2002), which boosted the development of Bayesian spam filters for email. A Bayesian spam filter will, in very broad terms, analyze the words in a message, compare them to words typically used in a ‘spam’ or ‘ham’ collection, and come back with either a binary classification (spam/ham) or a spamminess score. The first time I read that article must have been back in 2003 or 2004. I recall installing one of the early versions of POPFile, a spam filter written in Perl. It worked as a POP3 proxy and did a pretty good job. POP3 made sense, because at that time, the only spam we had was email spam. Now there’s blog spam, comment spam, trackback spam, Twitter spam …
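To make the idea concrete, here is a minimal sketch of such a classifier in Python. It is a textbook multinomial naive Bayes with add-one smoothing, not necessarily the exact method from Paul Graham's essay, POPFile, or Hilary Mason's scripts. The class name, the tokenizer, and the tiny training texts are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Very simple tokenizer (an assumption): lowercase words and numbers."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents

    def train(self, label, text):
        self.word_counts.setdefault(label, Counter()).update(tokenize(text))
        self.doc_counts[label] += 1

    def scores(self, text):
        """Return a probability per label; the values sum to 1."""
        vocabulary = set()
        for counts in self.word_counts.values():
            vocabulary.update(counts)
        total_docs = sum(self.doc_counts.values())
        words = tokenize(text)

        log_scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Prior: how common is this label in the training corpus?
            log_p = math.log(self.doc_counts[label] / total_docs)
            # Likelihood: how typical is each word for this label?
            for word in words:
                log_p += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
            log_scores[label] = log_p

        # Turn log scores into normalized probabilities.
        highest = max(log_scores.values())
        unnormalized = {l: math.exp(s - highest) for l, s in log_scores.items()}
        total = sum(unnormalized.values())
        return {l: v / total for l, v in unnormalized.items()}

    def classify(self, text):
        """Return the single most likely label."""
        result = self.scores(text)
        return max(result, key=result.get)


classifier = NaiveBayes()
classifier.train("spam", "buy cheap pills now")
classifier.train("spam", "cheap pills cheap offer")
classifier.train("ham", "meeting tomorrow at noon")
classifier.train("ham", "lunch tomorrow with the team")

print(classifier.classify("cheap pills"))          # binary classification
print(classifier.scores("cheap pills")["spam"])    # spamminess score
```

The same object gives both kinds of answer mentioned above: `classify` returns a label, `scores` returns a number per label.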

But these are the cloud days, right? If you think about it, Akismet (WordPress) and Mollom (Drupal) offer cloud-based spam filtering. Before them, Postini (now part of Google) offered hosted spam filtering services for email. But would it be possible to offer a very generic document classification web service? Imagine the service classifier.com:

  • you register and get your own subdomain at myapp.classifier.com
  • you choose whether your service will return one of a number of classes (ham/spam or urgent/normal/ignore …) or a numerical score
  • you choose a tokenizer, which defines which words will be extracted from your input: e.g. you can ignore, include or reformat email headers, you can ignore or transform HTML code, …
  • you create a corpus per category; the service will tell you whether you have enough input
  • you call the service with an HTTP POST with an API key and the new document content to be classified, and you get back (in JSON/XML) the result
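
From the client side, the last step could look roughly like the sketch below. The service is imaginary, so everything specific here is an assumption: the `/classify` path, the `X-Api-Key` header, and the `label` and `score` fields in the JSON reply are all invented for illustration. The code only builds the request and parses a reply; it does not contact any server.

```python
import json
import urllib.request


def build_classify_request(subdomain, api_key, document):
    """Build (but do not send) the HTTP POST for the imagined service."""
    # Hypothetical endpoint: the URL layout is an assumption.
    url = f"https://{subdomain}.classifier.com/classify"
    body = json.dumps({"document": document}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Api-Key": api_key,  # hypothetical header name
        },
    )


def parse_classify_response(raw_json):
    """Extract the class and the score from a hypothetical JSON reply."""
    result = json.loads(raw_json)
    return result["label"], result["score"]


request = build_classify_request("myapp", "my-secret-key", "Buy cheap pills now!")
print(request.get_method(), request.full_url)

# A reply such a service might send back (made up for this example):
print(parse_classify_response('{"label": "spam", "score": 0.97}'))
```

If the service existed, passing the request to `urllib.request.urlopen` would send it.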

Sounds like something Google would offer? Well, they do, in some way: http://code.google.com/apis/predict/ Now if someone would develop a nice and easy interface around it …

Not happy with the Canon 500D

Couleur Cafe 2006
In June 2006 I bought my first reflex camera: a Canon 350D. About the same time I started taking pictures of tango (above: my first tango picture, at Couleur Cafe 2006). And it was the start of an exciting journey. Concerts, milongas, tango festivals, portraits: I discovered the joy of creating – or recording – beauty. It has become a passion, and a privilege to do. I love the concentration, the play with light, the search for the right frame, the waiting for the perfect moment and then, every now and then, the joy of seeing that you’ve created an image that actually IS worth a thousand words.

Along the way, my 350D was my trusted accomplice. I took it everywhere, first in a simple black camera bag and later, when I started buying more lenses, in a Lowepro backpack. Even though the screen on the back of the camera was small, it gave me enough feedback to know whether I was taking pictures the right way, and allowed me to fine-tune ISO, white balance and shutter speed. It sometimes felt like the extension of my hand, of my eye. I just loved that camera.

But then, at the end of 2009, it started breaking down. First random power issues, then just dead. I sent it to Canon, and they said: completely oxidized, we have to replace the whole interior. So I needed a new camera. I hesitated a lot: should I take the 500D, its successor, or the 5D Mk II? In car terms: should I stay in the BMW 3 series, or move up to the 5? After some weeks of hesitation (“that 5D is a lot of money”), I finally settled for the 500D. Boy, have I regretted that.


Focal length for the common man: “portrait distance”

I remember that before I started photography on a serious level, I had some understanding of shutter speed, but none of aperture and focal length. Even when I read what they meant, I still couldn’t ‘picture’ it; I had no feeling for the numbers. Let’s leave ‘aperture’ for another time and just concentrate for now on the concept of “focal length”.

First of all, the focal length of a lens is not the same as the actual physical length of the lens. Yes, 200mm and 300mm lenses (telephoto lenses) tend to be longer, but they’re not exactly 200mm and 300mm long. For instance, the Sigma 55-200mm F4-5.6 DC HSM is 85mm (3.3″) long, while the 70-200mm F2.8 II EX DG lens is 184mm (7.2″). Same maximum focal length, but more than twice as long.

So what is focal length? I could explain that it is “the distance from the center of the lens to the principal foci (or focal points) of the lens”, but that wouldn’t make it more comprehensible, would it? Well, I read through the theory, with the tangent of the viewing angle and stuff, and I think I understand it (I’m an engineer, I actually like trigonometry). A 200mm lens gives a viewing angle of 12° on the diagonal. Still not clear? That’s when I thought: let’s invent something more tangible: the “portrait distance”. Say you need a surface of about 72cm x 48cm (28″ x 18″) to make a portrait of a person (not just a headshot, but with some torso on it too). See some examples below:

Vriendschap foto's voor Erfgoeddag Sandy @ Chaff Brussels Tango Festival - Day 1 ¿Que? Fado & Tango - Dirk

Well, the distance between the camera and the person you’re photographing will be roughly 20 times the focal length.
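
For those who like to check the numbers, here is a small Python sketch of both figures: the 12° viewing angle and the factor of 20. It assumes a full-frame 36mm x 24mm sensor (72cm x 48cm is exactly 20 times that size) and uses the simple similar-triangles approximation, ignoring the small correction a real lens formula would add. The function names are made up for this example, and the 22.2mm x 14.8mm size for an APS-C sensor is an approximate figure.

```python
import math

FULL_FRAME_MM = (36.0, 24.0)   # assumed sensor size (width, height)
PORTRAIT_MM = (720.0, 480.0)   # the 72cm x 48cm portrait surface


def diagonal_angle_deg(focal_mm, sensor_mm=FULL_FRAME_MM):
    """Diagonal viewing angle in degrees, for a lens focused far away."""
    diagonal = math.hypot(*sensor_mm)
    return math.degrees(2 * math.atan(diagonal / (2 * focal_mm)))


def portrait_distance_m(focal_mm, sensor_mm=FULL_FRAME_MM, subject_mm=PORTRAIT_MM):
    """Approximate camera-to-subject distance in metres.

    Similar triangles: distance / focal length = subject width / sensor width.
    """
    magnification = subject_mm[0] / sensor_mm[0]
    return focal_mm * magnification / 1000.0


print(round(diagonal_angle_deg(200)))   # 12 (degrees)
print(portrait_distance_m(50))          # 1.0 (metre), i.e. 20 x 50mm
print(portrait_distance_m(200))         # 4.0 (metres), i.e. 20 x 200mm

# On a smaller APS-C sensor the same lens needs more distance:
print(portrait_distance_m(50, sensor_mm=(22.2, 14.8)))
```

The last line shows why the factor depends on the camera: with a smaller sensor, the same framing needs a larger multiple of the focal length.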


“I will you in the night” – Idool 2003

At the Pixagogo reunion dinner the other evening, I was reminded by one of my ex-colleagues, Steven (‘Beukie’), that back in 2003 I was having some fun with remixes/mashups. More specifically, I took some vocals from the Belgian “Idool 2003” preselections and added music to them. To make the exercise more fun, I took samples from the candidates who were really musically challenged.

So I went back in my archives and here are the three that I found:

  • “I will you in the night”
    Marnik had translated a Flemish song into his own ‘impoverisation’, as he proudly announces. Unfortunately, the Dutch “Ik wil je” (I want you) does not normally translate into the English “I will you”.

    I also found the original clip on YouTube (via partybrigade):
  • “But if I let you go”
    This ‘Pieter’ was officially called the worst candidate by the jury, and that decision is not impossible to understand. He had no tone, no rhythm and bad English. “There snow one like you!” He needed a lot of input from Madonna to make it bearable.
  • “Killing me softly”
    She was not that bad a singer, but her timing was awful. I remember having to cut and trim a lot to align her words to a steady beat. I made it a slow jazzy version with a lot of echo.

Out of that edition of Idool came Hadise, Brahim and Natalia, so it wasn’t all that bad. Still, there was also the girl with the wobbly hands: