Monthly Archive for October, 2004

More is better: the quintuple-neck guitar


I remember being really impressed when I saw my first double-neck guitar. That seemed like a huge thing to have hanging from your neck. They became really popular in the era of Led Zeppelin, Yes and The Who. The idea is to have two guitars handy: a 6-string and a 12-string, a bass and a guitar, a fretted and a fretless bass, or two guitars in different tunings. But obviously it’s also one of those macho ‘size-matters’ things. The guitarist with the double-neck here is Jimmy Page from Led Zeppelin.


There’s more, of course: e.g. Steve Vai has played the triple-neck heart-shaped monster you see on the side.

Today, triple neck guitars are still rare, because they’re huge, heavy, expensive and utterly pointless. They’re an obscene symbol of self-indulgence, like Missy Elliot’s Lamborghini bed, or Snoop’s jewel-encrusted crunk cup.
(from: Engadget)

King of the hill: the quintuple-neck guitar, used by Cheap Trick’s Rick Nielsen. That is: 36 strings (four 6-string necks and one 12-string). Respect!


How to Podcast with Blogger and SmartCast

This is a step-by-step manual on how to create your own Podcast feed with Blogger and Feedburner SmartCast.
PRECONDITIONS:
A. You have already created the MP3 files (see podcastingnews.com for software to do this).
B. You already have hosting for the MP3 files, i.e. each MP3 file already has a URL that anyone can access (see podcast hosting for hosting providers).

  1. Create a new blog in Blogger: [your_podcast].blogspot.com
  2. Go to the Settings/Formatting page of your new blog
  3. Set ‘Show Link Field’ to Yes
  4. Create a new Post, and put the URL of the MP3 file you want to podcast in the ‘Link’ field.
  5. Publish the post, and check the blog (the title of your post will be clickable and point to the MP3 file)
  6. At www.feedburner.com, burn the http://[your_podcast].blogspot.com/atom.xml feed
  7. Enable the ‘SmartCast’ option for this feed.
  8. Verify your feed on feedvalidator.org (just to be sure – the result should be “This is a valid RSS feed”).
  9. Enter the Feedburner RSS feed into your Podcast aggregator (here: Doppler)
  10. Let the Podcast aggregator update (in Doppler: push “Retrieve Now”): your MP3 file should download
  11. Configure your aggregator to add the downloaded files to iTunes, Windows Media Player or whatever application you use

Good luck!

Delicious Library: no, we don’t like Microsoft

Delicious Library asks everyone to blog about them, but they are slightly more picky about who can actually access their site: they are unreachable for the tiny minority amongst us running Internet Explorer on Windows.

This website currently requires one of the following modern browsers:
(…)
Windows
Firefox
Netscape
Mozilla

What are they about? Pray you have an ‘acceptable’ browser and check it out!

(via BizStone)

Podcasting and Windows Media Agony (WMA)

Thanks to Feedburner’s new SmartCast (see Podcasting with Blogger), I can now create a Podcast feed with Blogger. It works fine with MP3 files. Before I buy my portable MP3 player (iPod or Zen), I wanted to check whether it’s easy to podcast Windows Media files. The iPod does not support .WMA files, but if it turns out that they don’t integrate easily into podcasts anyway, that’s less of an issue.

Scenario 1: Windows Media Services aka Cougar
I’m a sucker for sampling trivia, and there now is a great program on Studio Brussels on ‘modern music’: De Sample Minds. Lots of fun music (including a weekly dose of the Beach Boys), ample background info on the technical and legal aspects of sampling from DJ Bobby Ewing and above all, they don’t take themselves too seriously. The only problem: it plays on Sunday afternoon and I’m never near a radio at that moment. However, they publish the program archive on-line in ASX/WMA format. So I could make a feed for that, right? I downloaded the .asx files (basically XML-based playlists) and retrieved the .wma references in them. The latter are hosted on wm.streampower.be, which is a Cougar/9.00.00.3372 server. ‘Cougar’, that’s the Windows Media Services (not the most compatible of servers, as will be revealed).

  • 1st try: Sample Minds in Webjay. The Cougar server does not support HTTP HEAD, which Webjay uses to check whether a URL actually exists. So the links all look ‘dead’ and are not included in the auto-generated playlist. The ‘enclosures’ in the Podcast feed cannot be downloaded by iPodder or Doppler either, presumably because the Cougar server does extra checks such as verifying the User-Agent, the Referer and the consumed bandwidth. When it detects an unusual downloader, it just returns an audio/x-ms-wax file that references Ref1=http://(server)/(path)/(file).wma?MSWMExt=.asf.
  • 2nd try: Sample Minds in Blogger. I convert the Atom feed with the new FeedBurner SmartCast, but because HTTP HEAD does not work, the links look dead to FeedBurner and are not included in the RSS feed. Even if they were included, no files would be downloaded (judging by the experience with Webjay).
  • 3rd try, and this one works, but it is not a Podcast: Sample Minds as an ASX playlist. It only works for streaming, not for downloading.
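
The ‘dead link’ effect is easy to reproduce from the command line. Below is a minimal sketch of the kind of check a link checker might do; it is not Webjay’s or FeedBurner’s actual code, and the function names are made up for this example. It sends a HEAD request with curl and treats anything other than a 2xx or 3xx answer as dead.

```shell
# Classify an HTTP status code the way a simple link checker might
# (assumption: 2xx and 3xx count as alive, everything else as dead).
classify_status() {
    case "$1" in
        2??|3??) echo alive ;;
        *)       echo dead ;;
    esac
}

# Send a HEAD request (-I) and print only the status code.
# curl prints 000 when it gets no HTTP answer at all.
head_status() {
    curl -s -o /dev/null -I -w '%{http_code}' "$1"
}

# Print "alive <url>" or "dead <url>".
check_link() {
    printf '%s %s\n' "$(classify_status "$(head_status "$1")")" "$1"
}
```

Calling check_link with the URL of a .wma file on a server that refuses HEAD should print ‘dead’, even though a normal GET from a media player may still work.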

Scenario 2: ‘Normal’ webserver (like IIS)
Let’s see what happens with .WMA files on my own webserver. I’ve included some in the SmoothPod blog. Again Feedburner SmartCast converts the Atom feed to an RSS Podcast feed and, lo and behold, it works! The WMA files are detected, converted into a perfect
<enclosure url="http://www.smoothouse.org/smoothouse/media/hardwork.wma" length="4190342" type="audio/x-ms-wma" />
which is picked up by Doppler without a problem.
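
To check for yourself which files a Podcast feed offers, you can pull the url attribute out of every enclosure element. This is only a rough sketch with grep and sed (not a real XML parser), and the function name list_enclosures is my own invention; it assumes each enclosure tag sits on a single line with a double-quoted url, as in the example above.

```shell
# Read an RSS feed on stdin and print the URL of every <enclosure> element.
list_enclosures() {
    # grep -o isolates each enclosure tag, sed keeps only the url="..." value.
    grep -o '<enclosure[^>]*>' | sed -n 's/.*url="\([^"]*\)".*/\1/p'
}
```

You would feed it the feed itself, e.g. curl -s "$FEED_URL" | list_enclosures, where FEED_URL stands for the address of your own Feedburner feed.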

Conclusion:
WMA files delivered from a ‘normal’ webserver should cause no problem. Podcasting with WMA files from a Windows Media Services server will probably not work.

Probe average cpu utilisation (MRTG)

There are two main tools to keep track of your CPU usage: top and vmstat.

  • top is an interactive tool: it shows you the CPU usage of each process, as well as overall statistics, updated every 5 seconds. It’s good for hands-on checking.

    # top
    17:18:34 up 2 days, 8:14, 3 users, load average: 0.00, 0.00, 0.00
    47 processes: 46 sleeping, 1 running, 0 zombie, 0 stopped
    CPU states: 0.1% user 0.1% system 0.0% nice 0.0% iowait 99.6% idle
    Mem: 1030872k av, 1022256k used, 8616k free, 0k shrd, 104844k buff
         777088k actv, 12k in_d, 22296k in_c
    Swap: 2048276k av, 8120k used, 2040156k free, 640080k cached
    PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
    30776 root 19 0 1140 1140 852 R 0.9 0.1 0:00 0 top
    1 root 15 0 504 464 436 S 0.0 0.0 0:03 0 init
    (...)

    But say you want to get just one number (percentage) back, so you can use it for logging.
  • vmstat will give you the following output:

    #vmstat
    procs memory swap io system cpu
    r b w swpd free buff cache si so bi bo in cs us sy id
    0 0 0 7964 8804 104712 640224 0 0 2 16 129 27 0 0 100

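    You can run vmstat 1 5 to get 5 consecutive measurements (1 second apart). The number we want is the average CPU usage, or (100% – idle), where idle is the last column (id, field 16 in the output above). The following command will do the job. Note the single quotes: inside double quotes the shell would expand $16 itself before gawk ever sees it. The pattern only matches lines that start with a number, which skips the two header lines:
    # vmstat 1 5 | gawk '$1 ~ /^[0-9]+$/ {tot=tot+1; id=id+$16} END {print 100 - id/tot}'
    gives
    0.4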
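
The position of the id column is not the same in every vmstat version, so hard-coding field 16 is fragile. Here is a slightly more robust sketch that looks up the column from the header line instead. The function name avg_cpu_busy is made up for this example, and it assumes the header contains a column literally called id.

```shell
# Read vmstat output on stdin and print the average CPU usage (100 - idle).
avg_cpu_busy() {
    awk '
        # Until the "id" column is found, scan each line for it and skip the line.
        idcol == 0 { for (i = 1; i <= NF; i++) if ($i == "id") idcol = i; next }
        # Data lines start with a number; repeated header lines do not.
        $1 ~ /^[0-9]+$/ { idle += $idcol; n++ }
        END { if (n > 0) printf "%.1f\n", 100 - idle / n }
    '
}
```

Use it as vmstat 1 5 | avg_cpu_busy. Keep in mind that the first line vmstat prints is an average since boot, not a 1-second sample.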

Angelina ‘The Lips’ Jolie


The readers of Esquire magazine have just voted Angelina Jolie the sexiest woman alive. I beg to differ. She might be alive all right, but the sexiest woman? Those lips, my God! Collagen alert!

The full list in Esquire was:

  1. Angelina Jolie
  2. Halle Berry
  3. Britney Spears
  4. Jessica Simpson
  5. Beyonce
  6. Charlize Theron
  7. Jennifer Aniston

Errm … Britney? 3rd most sexy woman? My take on this: put Charlize on 1, Beyonce on 2 and send the rest shopping!

[Listening to: "Woman to Woman" - Joe Cocker - Sampled Vol 4 (CD 1/2)]

Estimate # of lines in a log file

Let’s say you need an (approximate) count of the number of lines in a huge file. The most obvious way of calculating this would be using wc, but this actually can be quite slow:
# time wc -l /var/log/squid/access.log
2812824 /var/log/squid/access.log
real 0m43.988s

(that is about 64,000 lines/sec)

Running wc without the -l (only count lines) would be even slower because it would also count the words, instead of just the LF (linefeed) characters. But using wc -c is very fast! This is because the filesystem keeps track of each file’s size (= number of characters/bytes), so the file does not even have to be read to give this number. Can we estimate the # of lines from the # of bytes?

For the type of file we are talking about here (a Squid log file) there actually is a way. The file is more or less ‘square’, meaning that every line is about the same length (it contains date, status, URL, …).
If we take the beginning of the file (the first 10000 lines):
# head -10000 /var/log/squid/access.log | wc
10000 100000 1775257

we see that every line is about 177 chars long.

The end of the file (the last 10000 lines):
# tail -10000 /var/log/squid/access.log | wc
10000 100000 2047887

gives us a number of 204 chars/line.

Let’s take some more data and combine both:
# ( head -50000 /var/log/squid/access.log ; tail -50000 /var/log/squid/access.log ) | wc
100000 1000000 19488905

which gives us an average of 195 chars/line.

A file size of 533,229,920 bytes (533 MB) would lead us to estimate the # of lines at 2,734,512 (533,229,920 / 195), where the actual # of lines is 2,818,184 (3% difference). That is: we lose 3% accuracy but the calculation takes almost no CPU time, instead of about 45 seconds. This might be a trade-off you are willing to accept!
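
The whole estimate fits in a small shell function. This is a sketch of the approach described above; the name estimate_lines and the default sample size are my own choices. It takes the file size from wc -c, samples the first and last N lines, and divides the size by the average line length of the sample.

```shell
# Estimate the number of lines in a file from its size and a sample of lines.
# $1 = file, $2 = lines to sample from each end (assumed default: 50000).
estimate_lines() {
    file=$1
    sample=${2:-50000}
    # Total size in bytes.
    size=$(wc -c < "$file")
    # wc -lc prints "lines bytes" for the sample; lines = size * lines / bytes.
    { head -n "$sample" "$file"; tail -n "$sample" "$file"; } | wc -lc |
        awk -v size="$size" '$1 > 0 { printf "%d\n", size * $1 / $2 }'
}
```

For example: estimate_lines /var/log/squid/access.log. The estimate is only as good as the assumption that the sampled lines are representative of the whole file.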

Calculate hit rate from a log file

You have a huge file that contains one line per request/transaction. Some of the lines are of one type (e.g. ‘HIT’), some of another (e.g. MISS). Let’s say you want to calculate the hitrate, but as fast as possible.
We take a Squid log file of about 140MB. How long does it take to count how many lines it has?
# time wc -l /var/log/squid/access.log
845212 /var/log/squid/access.log
real 0m6.523s
(about 21.4 MB/s or 130,000 lines/s)

And now let’s just filter out the lines containing ‘HIT’ and count those:
# time sh -c "grep -i HIT /var/log/squid/access.log | wc -l"
Wow! This takes ages (I stopped it after 15 minutes) and the grep takes 100% CPU all the time. So let’s look for another solution.

Maybe gawk? First let’s see if it is much slower than wc -l for counting lines:
# time gawk "END {print NR}" /var/log/squid/access.log
845907
real 0m26.129s
(5.3 MB/s or 32,000 lines/s – 4 times slower)
And now let it count the hits too:
# time gawk "BEGIN {hit=0} /HIT/ {hit = hit+1} END {print hit/NR*100}" '/var/log/squid/access.log'
84.5023
real 0m32.836s
(4 MB/s or 25,000 lines/s – slow but acceptable)

Do we actually need a count on the whole file? What if we just took the last (i.e. most recent) 100,000 lines? The result would be a better indication of what the current hit rate is, and the speed of calculation would be more predictable.
# time sh -c "tail -100000 /var/log/squid/access.log | gawk 'BEGIN {hit=0} /HIT/ {hit = hit+1} END {print hit/NR*100}'"
92.305
real 0m3.332s
(30,000 lines/s)

It is actually a bit slower the first time you run it, probably because the end of the file is not yet in the disk cache. So if you want your hit rate calculation to take less than 2 seconds, you could take the last 50,000 lines. Done!
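
Wrapped up as a reusable function, it could look like the sketch below. The name hit_rate is made up. Unlike the one-liner above, it only looks at the 4th field, on the assumption that the log is in Squid’s native format, where that field holds the result code (e.g. TCP_HIT/200). That way a URL that happens to contain ‘HIT’ is not counted, so the result can differ slightly from the numbers above. It also avoids a division by zero on empty input.

```shell
# Read Squid access log lines on stdin and print the hit rate in percent.
# Assumption: native log format, result code (e.g. TCP_HIT/200) in field 4.
hit_rate() {
    awk '$4 ~ /HIT/ { hit++ }
         END { if (NR > 0) printf "%.1f\n", hit / NR * 100 }'
}
```

Use it on the most recent lines only: tail -n 100000 /var/log/squid/access.log | hit_rate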