<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>blog.forret.com &#187; Linux</title>
	<atom:link href="http://blog.forret.com/categories/linux/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.forret.com</link>
	<description>Tango, photography and whatever&#039;s bleeding edge</description>
	<lastBuildDate>Mon, 01 Feb 2010 11:55:40 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Redirecting with Apache&#8217;s .htaccess</title>
		<link>http://blog.forret.com/2007/07/redirecting-with-apaches-htaccess/</link>
		<comments>http://blog.forret.com/2007/07/redirecting-with-apaches-htaccess/#comments</comments>
		<pubDate>Wed, 04 Jul 2007 13:09:27 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://blog.forret.com/2007/07/redirecting-with-apaches-htaccess/</guid>
		<description><![CDATA[When you migrate web sites from one place to another, and the URLS change, you don&#8217;t want to lose visitors that still use the old links. If your &#8216;old&#8217; website ran on Apache, you can use its mod_alias/mod_rewrite functionality to automatically redirect to the new URL. This involves adding redirect rules to the .htaccess file [...]


]]></description>
			<content:encoded><![CDATA[<p>When you migrate a website from one place to another and the URLs change, you don&#8217;t want to lose visitors who still use the old links. If your &#8216;old&#8217; website ran on Apache, you can use its mod_alias/mod_rewrite functionality to redirect automatically to the new URL. This involves adding redirect rules to the <code>.htaccess</code> file in the base folder of the redirects. Some examples:</p>
<p><b>Generic structure of the .htaccess redirects</b><br />
<code><strong><a href="http://httpd.apache.org/docs/1.3/mod/mod_alias.html#redirect">Redirect</a></strong> permanent /(old url) (new url)<br />
Redirect ... (add all your one-to-one redirects here)<br />
RedirectMatch permanent ^/old_stuff/.*html$ http://www.example.com/<br />
RedirectMatch ... (add your catch-all redirects here)<br />
<br />
<strong>RewriteEngine</strong> on<br />
<strong>RewriteBase</strong> /blog/<br />
<strong><a href="http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html#RewriteRule">RewriteRule</a></strong> ^([regex])$ http://blog.example.com/$1   [R,L]<br />
RewriteRule ... (add all your variable redirects here)</code></p>
<p><b>EXAMPLE: old Blogger site (on your own server) to new WordPress site</b><br />
I&#8217;ve done <a href="http://blog.forret.com/2005/12/migrating-from-blogspot-to-a-real-blog/">a migration from a blog published by Blogger (via FTP) onto my own webspace</a> to a blog run by WordPress. I used the following redirect and rewrite rules to handle the redirections.<br />
* HOMEPAGE:<br />
redirect /index.html and / to your new blog URL<br />
<code>Redirect permanent / http://blog.example.com/<br />
Redirect permanent /index.html http://blog.example.com/</code></p>
<p>* FEED:<br />
redirect e.g. /atom.xml to your Feedburner feed<br />
<code>Redirect permanent /atom.xml http://feeds.feedburner.com/(exampleblog)</code></p>
<p>* ARCHIVES:<br />
redirect e.g. /archive/2005_03_posts.html to the new Wordpress archives<br />
<code>RedirectMatch permanent /archive/([0-9][0-9][0-9][0-9])_([0-9][0-9])_.*$ http://blog.example.com/$1/$2/</code></p>
<p>* POST PAGES:<br />
This is tricky, because Blogger and WordPress do not use exactly the same rules for constructing the text-like URL (the &#8216;post slug&#8217;). E.g. a post called <em>how-to-podcast-with-blogger-and.html</em> on my old Blogger site became <em>how-to-podcast-with-blogger-and-smartcast/</em> on the new WordPress one. So what I did consisted of two types of rules:<br />
a) redirecting individual pages<br />
<code>Redirect permanent /2004/10/how-to-podcast-with-blogger-and.html http://blog.example.com/2004/10/how-to-podcast-with-blogger-and-smartcast/</code><br />
b) a generic rule for the others (this uses RewriteRule instead of RedirectMatch!): each page is redirected to a search on the WordPress blog, within the correct month, for the first two words of the title:<br />
<code>RewriteRule ^([0-9][0-9][0-9][0-9])/([0-9][0-9])/([a-z0-9]*)-([a-z0-9]*).*$ http://blog.example.com/$1/$2/?s=$3+$4  [R,L]</code><br />
This method is far from perfect, but it will bring visitors a lot closer to the right page. If you use pretty distinctive words for titles (e.g. &#8220;<a href="http://blog.forret.com/2006/10/myspace-bulletin-and-other-spam/">Myspace: bulletin and other spam</a>&#8221;), chances are the right page shows up first. If you start all your posts with &#8220;The ten best ways to &#8230;&#8221; then you will need a more sophisticated rule, e.g. one using the 6th and 7th words:<br />
<code>RewriteRule ^([0-9][0-9][0-9][0-9])/([0-9][0-9])/[a-z0-9]*-[a-z0-9]*-[a-z0-9]*-[a-z0-9]*-[a-z0-9]*-([a-z0-9]*)-([a-z0-9]*).*$ http://blog.example.com/$1/$2/?s=$3+$4  [R,L]</code></p>
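<p>To see what the generic two-word rule does with a given old URL before putting it on a live server, the same regular expression can be tried out in a few lines of Python. This is only a sketch: the function name <code>redirect_target</code> and its <code>new_base</code> parameter are made up for the illustration, and Apache itself is not involved.</p>

```python
import re

# Same pattern as the generic RewriteRule: year/month/first-second-...
POST_RULE = re.compile(r"^([0-9]{4})/([0-9]{2})/([a-z0-9]*)-([a-z0-9]*).*$")


def redirect_target(path, new_base="http://blog.example.com"):
    """Return the search URL the rule would redirect to, or None if it does not match."""
    match = POST_RULE.match(path)
    if match is None:
        return None
    year, month, word1, word2 = match.groups()
    return f"{new_base}/{year}/{month}/?s={word1}+{word2}"


print(redirect_target("2004/10/how-to-podcast-with-blogger-and.html"))
```

<p>For the example post this prints <code>http://blog.example.com/2004/10/?s=how+to</code>, which shows why titles that start with common words need the more selective rule.</p>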
<p><b>Not losing the querystring</b><br />
Redirect and RedirectMatch cannot redirect to a URL with a querystring (e.g. to <code>newpage.php?param1=val1&#038;param2=val2</code>). For that you will need to use RewriteRule. An example: redirect all links like test.asp?param=value on the old domain to the new domain while keeping all querystring parameters:<br />
<code>RewriteRule ^tools/test\.asp$  http://web.example.com/tools/test.asp [L,QSA]</code><br />
The pattern only has to match the path, because RewriteRule never sees the querystring. QSA (query string append) keeps the existing querystring, and L (last rule) stops looking further for rule matches.</p>


]]></content:encoded>
			<wfw:commentRss>http://blog.forret.com/2007/07/redirecting-with-apaches-htaccess/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Convert Bind DNS zone into PTR records</title>
		<link>http://blog.forret.com/2005/06/convert-bind-dns-zone-into-ptr-records/</link>
		<comments>http://blog.forret.com/2005/06/convert-bind-dns-zone-into-ptr-records/#comments</comments>
		<pubDate>Wed, 15 Jun 2005 13:32:00 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://blog.forret.com/2005/06/convert-bind-dns-zone-into-ptr-records/</guid>
		<description><![CDATA[The following script I made in order to convert the forward DNS records in a /var/named/db.[domain] file into the correct format for a reverse DNS db.[subnet prefix] file.

#!/bin/sh
(...)
DNSROOT=/var/named
PREFIX=$1
DOMAIN=$2
shift 2
DNSPRE=$DNSROOT/db.$PREFIX
DNSDOM=$DNSROOT/db.$DOMAIN
echo "; save this in $DNSPRE"
(
if [ -f $DNSDOM ] ; then
cat $DNSDOM
&#124; grep $PREFIX
&#124; grep -w "A"
&#124; sed "s/$PREFIX.*//g"
&#124; gawk "BEGIN {OFS = "t" ;} {print [...]


Related posts:<ol><li><a href='http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/' rel='bookmark' title='Permanent Link: Calculate hit rate from a log file'>Calculate hit rate from a log file</a> <small>You have a huge file that contains one line per...</small></li>
<li><a href='http://blog.forret.com/2004/11/probe-disk-performance-mrtg/' rel='bookmark' title='Permanent Link: Probe disk performance (MRTG)'>Probe disk performance (MRTG)</a> <small>The hdparam can be used to monitor the throughput speed...</small></li>
<li><a href='http://blog.forret.com/2004/11/date-formatting-in-gawk-boot-time/' rel='bookmark' title='Permanent Link: Date formatting in GAWK: boot time'>Date formatting in GAWK: boot time</a> <small>I have one server with apparently an exceptional stability: #...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>I made the following script to convert the forward DNS records in a /var/named/db.[domain] file into the correct format for a reverse DNS db.[subnet prefix] file.<br />
<code><br />
#!/bin/sh<br />
(...)<br />
DNSROOT=/var/named<br />
PREFIX=$1<br />
DOMAIN=$2<br />
shift 2<br />
DNSPRE=$DNSROOT/db.$PREFIX<br />
DNSDOM=$DNSROOT/db.$DOMAIN<br />
echo "; save this in $DNSPRE"<br />
(<br />
# forward zone: keep the A records of this subnet, strip the prefix, print PTR lines<br />
if [ -f "$DNSDOM" ] ; then<br />
cat "$DNSDOM" \<br />
| grep "$PREFIX" \<br />
| grep -w "A" \<br />
| sed "s/$PREFIX\.//g" \<br />
| gawk "BEGIN {OFS = \"\t\" ;} {print \$4,\"IN\",\"PTR\",\$1 \".$DOMAIN.\",\";; FROM `basename $DNSDOM`\" }"<br />
fi<br />
<br />
# existing reverse zone: keep the PTR records that are already there<br />
if [ -f "$DNSPRE" ] ; then<br />
cat "$DNSPRE" \<br />
| grep -w "PTR" \<br />
| gawk "BEGIN {OFS = \"\t\" ;} {print \$1,\$2,\$3,\$4,\";; FROM `basename $DNSPRE` \"; }"<br />
fi ) \<br />
| sort -n \<br />
| uniq --check-chars=3<br />
</code></p>
<p>You would call it as follows:<br />
<code>revdns.sh 192.168.110 internal.example.com &gt; new.db.192.168.110</code> and then replace the records of the original db.192.168.110 with the records of the new file. The script still requires manual intervention (you cannot pipe the result straight into a live Bind config file) but saves a lot of typing!</p>
<p>Example of the output:<br />
<code><br />
201     IN      PTR     james.internal.example.com.  ;; FROM db.internal.example.com<br />
202     IN      PTR     wilbur.internal.example.com. ;; FROM db.internal.example.com<br />
216     IN      PTR     appprd1.internal.example.com.   ;; FROM db.192.168.110<br />
217     IN      PTR     appprd2.internal.example.com.   ;; FROM db.192.168.110<br />
218     IN      PTR     appprd3.internal.example.com.   ;; FROM db.192.168.110<br />
219     IN      PTR     appprd4.internal.example.com.   ;; FROM db.192.168.110<br />
220     IN      PTR     appprd5.internal.example.com.   ;; FROM db.192.168.110<br />
221     IN      PTR     appprd6.internal.example.com.   ;; FROM db.192.168.110<br />
</code></p>
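<p>The core of the conversion (take the last octet of each A record and turn it into a PTR line) can also be sketched in Python. This is not the script above, just an illustration of the same idea; the function name <code>a_to_ptr</code> is made up, and it assumes simple four-field records of the form <code>name IN A address</code>.</p>

```python
def a_to_ptr(zone_lines, prefix, domain):
    """Turn 'name IN A address' lines into 'octet IN PTR name.domain.' lines."""
    records = []
    for line in zone_lines:
        fields = line.split()
        # assumption: exactly four fields, e.g. "james IN A 192.168.110.201"
        if len(fields) != 4 or fields[2] != "A":
            continue
        name, address = fields[0], fields[3]
        if not address.startswith(prefix + "."):
            continue  # record belongs to another subnet
        last_octet = int(address[len(prefix) + 1:])
        records.append((last_octet, f"{last_octet}\tIN\tPTR\t{name}.{domain}."))
    # numeric sort on the last octet, like "sort -n" in the shell version
    return [text for _, text in sorted(records)]


zone = [
    "wilbur IN A 192.168.110.202",
    "james IN A 192.168.110.201",
    "www IN CNAME james",
]
for ptr in a_to_ptr(zone, "192.168.110", "internal.example.com"):
    print(ptr)
```

<p>Merging with the PTR records that already exist in the reverse zone is left out here; the shell script handles that with its second block and <code>uniq</code>.</p>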


<p>Related posts:<ol><li><a href='http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/' rel='bookmark' title='Permanent Link: Calculate hit rate from a log file'>Calculate hit rate from a log file</a> <small>You have a huge file that contains one line per...</small></li>
<li><a href='http://blog.forret.com/2004/11/probe-disk-performance-mrtg/' rel='bookmark' title='Permanent Link: Probe disk performance (MRTG)'>Probe disk performance (MRTG)</a> <small>The hdparam can be used to monitor the throughput speed...</small></li>
<li><a href='http://blog.forret.com/2004/11/date-formatting-in-gawk-boot-time/' rel='bookmark' title='Permanent Link: Date formatting in GAWK: boot time'>Date formatting in GAWK: boot time</a> <small>I have one server with apparently an exceptional stability: #...</small></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://blog.forret.com/2005/06/convert-bind-dns-zone-into-ptr-records/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Installing NTP (time synchronisation)</title>
		<link>http://blog.forret.com/2005/05/installing-ntp-time-synchronisation/</link>
		<comments>http://blog.forret.com/2005/05/installing-ntp-time-synchronisation/#comments</comments>
		<pubDate>Thu, 19 May 2005 14:09:00 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[ntp]]></category>
		<category><![CDATA[synchronisation]]></category>
		<category><![CDATA[time]]></category>

		<guid isPermaLink="false">http://blog.forret.com/2005/05/installing-ntp-time-synchronisation/</guid>
		<description><![CDATA[
Set timezone (optional) 
create symbolical link from /usr/share/zoneinfo/... to /etc/localtime:
ln -sf /usr/share/zoneinfo/Europe/Brussels /etc/localtime 
Set UTC mode (optional) 
if your hardware clock runs in UTC (Universal Coordinated Time) mode, add
UTC=true
to the /etc/sysconfig/clock file
Make sure ntpd is not running  
Use service ntpd stop to stop it. 
Choose the NTP server you will get your time from [...]


]]></description>
			<content:encoded><![CDATA[<dl>
<dt><strong>Set timezone (optional)</strong> </dt>
<dd>create a symbolic link at <code>/etc/localtime</code> that points to the right file under <code>/usr/share/zoneinfo/...</code>:<br />
<code>ln -sf /usr/share/zoneinfo/Europe/Brussels /etc/localtime</code> </dd>
<dt><strong>Set UTC mode (optional)</strong> </dt>
<dd>if your hardware clock runs in <a href="http://www.worldtimeserver.com/current_time_in_UTC.aspx">UTC (Coordinated Universal Time)</a> mode, add<br />
<code>UTC=true</code><br />
to the <code>/etc/sysconfig/clock</code> file</dd>
<dt><strong>Make sure <code>ntpd</code> is not running</strong>  </dt>
<dd>Use<code> service ntpd stop </code>to stop it. </dd>
<dt>Choose the <strong>NTP server</strong> you will get your time from  </dt>
<dd>it can be an internal server that has the NTP service open for clients, or a <a href="http://ntp.isc.org/bin/view/Servers/StratumTwoTimeServers">public NTP server</a>. To be sure, use 2 servers. To check if you can access a server, run <code>ntpdate timeserver.ntp.ch</code> </dd>
<dt><strong>Edit the <code>/etc/ntp.conf</code> file</strong>  </dt>
<dd>Rename the current file to <code>ntp.bak.conf</code> and make a small new one:<br />
<code>restrict default ignore<br />
server   timeserver.ntp.ch  # Swiss time<br />
server ntp.ucsd.edu       # Univ of California, San Diego<br />
restrict timeserver.ntp.ch mask 255.255.255.255 nomodify notrap noquery<br />
restrict ntp.ucsd.edu      mask 255.255.255.255 nomodify notrap noquery<br />
server  127.127.1.0     # local clock<br />
fudge   127.127.1.0 stratum 10 #so it only takes over if the rest fails<br />
restrict 127.0.0.1<br />
driftfile /etc/ntp/drift<br />
broadcastdelay  0.008<br />
authenticate no</code> </dd>
<dt><strong>Set your system clock right</strong>  </dt>
<dd>Run the following command a couple of times:<br />
<code>ntpdate -u timeserver.ntp.ch # or whatever server you want to use</code><br />
You will see the initial difference in time go away after the 2nd or 3rd time. </dd>
<dt><strong>Set hardware clock</strong>  </dt>
<dd> <code>/sbin/hwclock --systohc</code> </dd>
<dt><strong>Run the <code>ntpd</code> daemon</strong>  </dt>
<dd> <code>service ntpd start</code> </dd>
<dt><strong>Add <code>ntpd</code> to the services started at boot time</strong> </dt>
<dd><code>chkconfig ntpd on</code></dd>
<dt><strong>Check the NTP results</strong> </dt>
<dd> <code>ntpq -p</code><br />
will show you what the difference is between your clock and that of the servers you added. You are looking for lines like<br />
<code><br />
remote           refid      st t when poll reach   delay   offset  jitter<br />
==========================================================================<br />
LOCAL            LOCAL      10 l   30   64  377    0.000    0.000   0.004 *<br />
192.168.246.107 192.168.246.88   3 u  41  128  177 0.313    5.598   0.345</code><br />
and not lines like<br />
<code><br />
remote           refid      st t when poll reach   delay   offset  jitter<br />
==========================================================================<br />
192.168.246.126 LOCAL        11 u   37  128  375    0.204  6082.02 6069.84</code><br />
Jitter is too high! </dd>
</dl>
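<p>If you want to check the peer table automatically instead of by eye, a small Python sketch like the following can flag the bad lines. The function name <code>noisy_peers</code> and the 100 ms limit are assumptions made for the example, not values recommended by the NTP documentation; it only relies on jitter being the 10th column, as in the listings above.</p>

```python
def noisy_peers(peer_table, max_jitter_ms=100.0):
    """Return the names of peers whose jitter exceeds max_jitter_ms."""
    noisy = []
    for line in peer_table.splitlines():
        fields = line.split()
        try:
            jitter = float(fields[9])  # 10th column is the jitter
        except (IndexError, ValueError):
            continue  # header, separator or incomplete line
        if jitter > max_jitter_ms:
            # drop a leading tally mark such as "*" or "+"
            noisy.append(fields[0].lstrip("*+#-"))
    return noisy


table = """remote           refid      st t when poll reach   delay   offset  jitter
==========================================================================
192.168.246.107 192.168.246.88   3 u  41  128  177 0.313    5.598   0.345
192.168.246.126 LOCAL        11 u   37  128  375    0.204  6082.02 6069.84"""
print(noisy_peers(table))
```

<p>With the two sample peers from above, only <code>192.168.246.126</code> is reported.</p>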


]]></content:encoded>
			<wfw:commentRss>http://blog.forret.com/2005/05/installing-ntp-time-synchronisation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Perl HTML scraping part #1</title>
		<link>http://blog.forret.com/2005/01/perl-html-scraping-part-1/</link>
		<comments>http://blog.forret.com/2005/01/perl-html-scraping-part-1/#comments</comments>
		<pubDate>Fri, 21 Jan 2005 17:05:07 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://blog.forret.com/2005/01/perl-html-scraping-part-1/</guid>
		<description><![CDATA[Here we are, back at the scene of the crime. Yes, I know it&#8217;s been a while. And the task of the day is:


GOAL:
make an HTML scraper, i.e. a script that grabs another URL and outputs the results to the screen 
TOOL:
let&#8217;s say &#8230; Perl (in my case: Perl 5.8 on RedHat) 
INPUT:
a URL 
OUTPUT:
the [...]


]]></description>
			<content:encoded><![CDATA[<p>Here we are, back at the scene of the crime. Yes, I know it&#8217;s been a while. And the task of the day is:</p>
<dl>
<dt>GOAL:</dt>
<dd>make an HTML scraper, i.e. a script that grabs another URL and outputs the results to the screen </dd>
<dt>TOOL:</dt>
<dd>let&#8217;s say &#8230; Perl (in my case: Perl 5.8 on RedHat) </dd>
<dt>INPUT:</dt>
<dd>a URL </dd>
<dt>OUTPUT:</dt>
<dd>the HTML code of that URL</dd>
</dl>
<p>The actual HTML retrieval is easy: you need <code>get()</code> from the LWP::Simple module:<br />
<code>use LWP::Simple;<br />
my $page = get($url);</code></p>
<p>Some remarks:</p>
<ul>
<li>Since you are generating a web page, you need the CGI module (to take care of the HTTP headers and stuff).</li>
<li>The URL input parameter will be given as an HTTP querystring: <code>?url=http://www.example.com/path/page.htm</code>. When no url parameter is given, we will generate a form where it can be filled in.</li>
<li>We calculate the time it takes to retrieve the original page.</li>
</ul>
<p><code>#!/usr/bin/perl -w<br />
use strict;<br />
use CGI qw(:standard);<br />
use LWP::Simple qw(!head);   # CGI already exports a head(), so skip the one from LWP::Simple<br />
<br />
my $query = new CGI;<br />
my $url = $query-&gt;param('url');<br />
my $debug = 0;<br />
<br />
print header();<br />
# no url parameter at all: show the form instead of warning about undef<br />
if (defined($url) and length($url) &gt; 0) {<br />
print getpage($url);<br />
} else {<br />
showform();<br />
}<br />
<br />
sub getpage{<br />
my $url = shift;<br />
my $time1 = time();<br />
debuginfo("Scraping &lt;a target=_blank href='" . $url . "'&gt;link&lt;/a&gt; ...");<br />
my $page = get($url);   # undef if the download fails<br />
my $time2 = time();<br />
debuginfo("Time taken was &lt;b&gt;" . ($time2 - $time1) . "&lt;/b&gt; seconds");<br />
debuginfo("Total bytes scraped: &lt;b&gt;". length($page)/1000 . "KB&lt;/b&gt;" );<br />
return $page;<br />
}<br />
<br />
sub debuginfo{<br />
if ($debug &gt; 0) {<br />
my $text = shift;<br />
print "&lt;small&gt;" , $text , "&lt;/small&gt;&lt;br /&gt;\n";<br />
}<br />
}<br />
<br />
sub showform{<br />
print("&lt;html&gt;&lt;head&gt;");<br />
print("&lt;title&gt;SCRAPER&lt;/title&gt;");<br />
print("&lt;link rel=stylesheet type=text/css href=http://www.forret.com/blog/style.css&gt;");<br />
print("&lt;/head&gt;&lt;body&gt;&lt;center&gt;\n");<br />
print("&lt;form method=GET action='scrape.pl'&gt;");<br />
print("URL: &lt;input name=url type=text size=60 value=http://www.forret.com&gt;");<br />
print("&lt;input type=submit&gt;&lt;/form&gt;\n");<br />
print("&lt;/center&gt;&lt;/body&gt;&lt;/html&gt;\n");<br />
}</code></p>
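<p>For comparison, the fetch-and-time part of <code>getpage</code> looks roughly like this in Python. It is a sketch, not a port of the whole CGI script: the names <code>fetch</code> and <code>timed_get</code> are invented here, and the <code>fetcher</code> parameter only exists so the timing logic can be tried without a network connection.</p>

```python
import time
from urllib.request import urlopen


def fetch(url):
    """Download a URL and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


def timed_get(url, fetcher=fetch):
    """Return (page, seconds taken, size in KB), like the debug output of getpage."""
    start = time.time()
    page = fetcher(url)
    elapsed = time.time() - start
    return page, elapsed, len(page) / 1000


# offline demonstration with a stand-in fetcher that returns 2500 bytes
page, seconds, kilobytes = timed_get("http://www.example.com/", fetcher=lambda u: b"x" * 2500)
print(kilobytes)
```

<p>Calling <code>timed_get(url)</code> without the <code>fetcher</code> argument performs a real download.</p>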
<p>Next step: making sure image <code>src=</code> and hyperlink <code>href</code> keep on working (so convert relative links to absolute links!).</p>


]]></content:encoded>
			<wfw:commentRss>http://blog.forret.com/2005/01/perl-html-scraping-part-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Squid cachemgr.cgi UI hack</title>
		<link>http://blog.forret.com/2004/11/squid-cachemgrcgi-ui-hack/</link>
		<comments>http://blog.forret.com/2004/11/squid-cachemgrcgi-ui-hack/#comments</comments>
		<pubDate>Mon, 08 Nov 2004 15:01:00 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://blog.forret.com/2004/11/squid-cachemgrcgi-ui-hack/</guid>
		<description><![CDATA[Squid has a little system statistics viewer built-in:
The cache manager (cachemgr.cgi) is a CGI utility for displaying statistics about the squid process as it runs. The cache manager is a convenient way to manage the cache and view statistics without logging into the server.
(from Squid FAQ)
The only thing is &#8230; it&#8217;s so ugly! It uses [...]


Related posts:<ol><li><a href='http://blog.forret.com/2004/10/squid-list-top-x-referers/' rel='bookmark' title='Permanent Link: Squid: list top X referers'>Squid: list top X referers</a> <small>If your Squid server logs the referers of its request...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>Squid has a little system statistics viewer built-in:</p>
<blockquote><p>The cache manager (cachemgr.cgi) is a CGI utility for displaying statistics about the squid process as it runs. The cache manager is a convenient way to manage the cache and view statistics without logging into the server.<br />
(from <a href="http://www.squid-cache.org/Doc/FAQ/FAQ-9.html">Squid FAQ</a>)</p></blockquote>
<p>The only thing is &#8230; it&#8217;s so ugly! It uses plain HTML and cannot be customized, the FAQ says. However, there is a way to do it:</p>
<ol>
<li>copy <code>cachemgr.cgi</code> to <code>cachemgr2.cgi</code> so if you do something wrong, the original is not lost.</li>
<li>open the CGI file in a text editor. I used <code>vi</code>, but if you&#8217;re not used to working with it, use something else (emacs?).</li>
<li>in the binary file, look for some text portions that look like HTML code</li>
<li>while keeping in mind that the # of characters should remain the same, change the &lt;title&gt; and &lt;style&gt; to something that suits you. You will have to do this at 2 locations in the file: one for the homepage template and one for the other pages&#8217; template.</li>
<li>suggestion: just let the CGI use a <code>style.css</code> file that you drop into the same folder.<br />
<code>&lt;link rel="stylesheet" type="text/css" href="style.css" /&gt;</code> and fill up with spaces to keep the same # of characters</li>
<li>verify that <code>cachemgr.cgi</code> and <code>cachemgr2.cgi</code> have the same # of bytes</li>
<li>now use <code>cachemgr2.cgi</code> to display your statistics.</li>
</ol>
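<p>The &#8220;same number of characters&#8221; rule is easy to get wrong by hand, so here is a Python sketch of the padding step. The helper name <code>patch_same_length</code> is made up for this example, and it works on bytes you have already read from your copy of the CGI; nothing here is specific to Squid.</p>

```python
def patch_same_length(data, old, new, pad=b" "):
    """Replace old with new inside data, padding new so the size stays the same."""
    if len(new) > len(old):
        raise ValueError("replacement is longer than the text it replaces")
    if old not in data:
        raise ValueError("text to replace was not found")
    padded = new + pad * (len(old) - len(new))
    return data.replace(old, padded)


before = b"...Cache Manager Interface..."
after = patch_same_length(before, b"Cache Manager Interface", b"My cache stats")
print(len(before) == len(after))
```

<p>The replacement is refused when the new text is longer than the old one, since that would shift everything after it in the binary.</p>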
<p>I did something a bit different (I wanted to use the CSS of my own website), so I&#8217;ll show you the difference between the two versions.<br />
To get the following comparison, I ran <code>strings cachemgr.cgi &gt; cachemgr.txt</code> (and the same for <code>cachemgr2.cgi</code>) to extract only the text parts, and then <code><b>diff</b> cachemgr.txt cachemgr2.txt</code> to compare both files. A plain <code>diff</code> of the 2 binary files would not give readable output.<br />
<code><br />
<em>173,174c173,174</em><br />
&lt; &lt;HTML&gt;&lt;HEAD&gt;&lt;TITLE&gt;Cache Manager Interface&lt;/TITLE&gt;<br />
&lt; &lt;STYLE type="text/css"&gt;&lt;!-- BODY{background-color:#ffffff;font-family:verdana,sans-serif} --&gt;&lt;/STYLE&gt;&lt;/HEAD&gt;<br />
---<br />
&gt; &lt;HTML&gt;&lt;HEAD&gt;&lt;TITLE&gt;Cache Manager (pforret)&lt;/TITLE&gt;<br />
&gt; &lt;link rel="stylesheet" type="text/css" href="http://www.forret.com/forret/forret.css" /&gt; &lt;/HEAD&gt;<br />
<em>199c199</em><br />
&lt; &lt;STYLE type="text/css"&gt;&lt;!-- BODY{background-color:#ffffff;font-family:verdana,sans-serif} TABLE{background-color:#333333;border:0pt;padding:0pt}TH,TD{background-color:#ffffff}--&gt;&lt;/STYLE&gt;<br />
---<br />
&gt; &lt;link rel="stylesheet" type=text/css href="http://www.forret.com/forret/forret.css"&gt;&lt;!-- TABLE{background-color:#333333;border:0pt;padding:0pt} TH,TD{background-color:#ffffff}--&gt;&lt;/STYLE&gt;<br />
</code></p>


<p>Related posts:<ol><li><a href='http://blog.forret.com/2004/10/squid-list-top-x-referers/' rel='bookmark' title='Permanent Link: Squid: list top X referers'>Squid: list top X referers</a> <small>If your Squid server logs the referers of its request...</small></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://blog.forret.com/2004/11/squid-cachemgrcgi-ui-hack/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Probe disk performance (MRTG)</title>
		<link>http://blog.forret.com/2004/11/probe-disk-performance-mrtg/</link>
		<comments>http://blog.forret.com/2004/11/probe-disk-performance-mrtg/#comments</comments>
		<pubDate>Wed, 03 Nov 2004 13:41:28 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://blog.forret.com/2004/11/probe-disk-performance-mrtg/</guid>
		<description><![CDATA[The hdparm command can be used to monitor the throughput speed of a hard disk:
# hdparm -tT /dev/hda
/dev/hda:
Timing buffer-cache reads:   888 MB in  2.00 seconds = 444.00 MB/sec
Timing buffered disk reads:   20 MB in  3.30 seconds =   6.06 MB/sec
This would be an interesting performance metric to see plotted [...]


Related posts:<ol><li><a href='http://blog.forret.com/2004/10/probe-average-cpu-utilisation-mrtg/' rel='bookmark' title='Permanent Link: Probe average cpu utilisation (MRTG)'>Probe average cpu utilisation (MRTG)</a> <small>There are two main tools to keep track of your...</small></li>
<li><a href='http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/' rel='bookmark' title='Permanent Link: Calculate hit rate from a log file'>Calculate hit rate from a log file</a> <small>You have a huge file that contains one line per...</small></li>
<li><a href='http://blog.forret.com/2004/11/date-formatting-in-gawk-boot-time/' rel='bookmark' title='Permanent Link: Date formatting in GAWK: boot time'>Date formatting in GAWK: boot time</a> <small>I have one server with apparently an exceptional stability: #...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>The <code>hdparm</code> command can be used to monitor the throughput speed of a hard disk:<br />
<code># <strong>hdparm -tT /dev/hda</strong></code><br />
<code>/dev/hda:<br />
Timing buffer-cache reads:   888 MB in  2.00 seconds = 444.00 MB/sec<br />
Timing buffered disk reads:   20 MB in  3.30 seconds =   6.06 MB/sec</code></p>
<p>This would be an interesting performance metric to see plotted against time. So let&#8217;s convert it to a format ready for MRTG.</p>
<ul>
<li>The only numbers we need are the last ones: resulting speed. This can be parsed from the output as follows:<br />
<code>#hdparm -tT /dev/hda | gawk -F = '/seconds/ { print $2}'</code>
<pre>440.00 MB/sec   3.30 MB/sec</pre>
</li>
<li>if we could suppose that the results will always be in &#8220;MB/sec&#8221;, we could parse out the numbers with<br />
<code>(...) | gawk '{print $1}'</code><br />
and then add a line to our MRTG config files to adjust the units:<br />
<code>kMG[_]: M,G,T,P,X</code><br />
But let&#8217;s say that KB/sec or GB/sec speeds are possible.</li>
<li>One <code>gawk</code> can do the conversion trick:<br />
<code>#(...) | gawk '/GB/ {print $1*1000000000} /MB/ {print $1*1000000} /KB/ {print $1*1000}'</code>
<pre>440000000 3300000</pre>
</li>
<li>To have a complete MRTG-ready output, we also add the boot time on line 3 and the name of the MRTG output on line 4</li>
<li>Q: Do we need 2 <code>gawk</code>s one after the other? Can&#8217;t one do it?<br />
A: You could do it in 1, I guess, but the parsing would be more complex. I use 2 because the FS (field separator) changes: the first gawk uses the &#8216;=&#8217; character, the second uses the normal whitespace.</li>
</ul>
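<p>The two <code>gawk</code> steps above (cut out the part after the &#8216;=&#8217;, then scale by the unit) can be sketched as one Python function. The name <code>speeds_in_bytes</code> is invented for this example, and it assumes the output format shown at the top of this post.</p>

```python
# multipliers for the units the post considers possible
UNITS = {"KB": 1000, "MB": 1000000, "GB": 1000000000}


def speeds_in_bytes(hdparm_output):
    """Return the speed of every 'Timing ...' line in bytes per second."""
    speeds = []
    for line in hdparm_output.splitlines():
        if "seconds" not in line:
            continue  # skip the "/dev/hda:" line and blank lines
        value, unit = line.split("=")[1].split()  # e.g. "444.00", "MB/sec"
        factor = UNITS[unit.split("/")[0]]
        speeds.append(round(float(value) * factor))
    return speeds


sample = """/dev/hda:
Timing buffer-cache reads:   888 MB in  2.00 seconds = 444.00 MB/sec
Timing buffered disk reads:   20 MB in  3.30 seconds =   6.06 MB/sec"""
for speed in speeds_in_bytes(sample):
    print(speed)
```

<p>For the sample output this prints 444000000 and 6060000, one value per line, which is the shape MRTG expects for its first two lines.</p>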


<p>Related posts:<ol><li><a href='http://blog.forret.com/2004/10/probe-average-cpu-utilisation-mrtg/' rel='bookmark' title='Permanent Link: Probe average cpu utilisation (MRTG)'>Probe average cpu utilisation (MRTG)</a> <small>There are two main tools to keep track of your...</small></li>
<li><a href='http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/' rel='bookmark' title='Permanent Link: Calculate hit rate from a log file'>Calculate hit rate from a log file</a> <small>You have a huge file that contains one line per...</small></li>
<li><a href='http://blog.forret.com/2004/11/date-formatting-in-gawk-boot-time/' rel='bookmark' title='Permanent Link: Date formatting in GAWK: boot time'>Date formatting in GAWK: boot time</a> <small>I have one server with apparently an exceptional stability: #...</small></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://blog.forret.com/2004/11/probe-disk-performance-mrtg/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Date formatting in GAWK: boot time</title>
		<link>http://blog.forret.com/2004/11/date-formatting-in-gawk-boot-time/</link>
		<comments>http://blog.forret.com/2004/11/date-formatting-in-gawk-boot-time/#comments</comments>
		<pubDate>Tue, 02 Nov 2004 14:17:54 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://blog.forret.com/2004/11/date-formatting-in-gawk-boot-time/</guid>
		<description><![CDATA[I have one server with apparently an exceptional stability:
# uptime
3:45pm  up 524 days,  1:22,  1 user,  load average: 0.44, 0.16, 0.13
Unfortunately I know this is not correct (I remember rebooting it some weeks ago). So what are other ways to get the date/time of the last boot?
Looking at the RedHat manuals, [...]


Related posts:<ol><li><a href='http://blog.forret.com/2004/11/probe-disk-performance-mrtg/' rel='bookmark' title='Permanent Link: Probe disk performance (MRTG)'>Probe disk performance (MRTG)</a> <small>The hdparam can be used to monitor the throughput speed...</small></li>
<li><a href='http://blog.forret.com/2004/10/redhat-versions-what-am-i-running/' rel='bookmark' title='Permanent Link: Redhat versions: what am I running?'>Redhat versions: what am I running?</a> <small>If you manage multiple RedHat servers, or if you just...</small></li>
<li><a href='http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/' rel='bookmark' title='Permanent Link: Calculate hit rate from a log file'>Calculate hit rate from a log file</a> <small>You have a huge file that contains one line per...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>I have one server with apparently exceptional stability:<br />
<code># uptime</code></p>
<pre>3:45pm  up 524 days,  1:22,  1 user,  load average: 0.44, 0.16, 0.13</pre>
<p>Unfortunately I know this is not correct (I remember rebooting it some weeks ago). So what are other ways to get the date/time of the last boot?</p>
<p>Looking at the <a href="http://www.redhat.com/docs/manuals/linux/RHL-7.3-Manual/ref-guide/s1-proc-topfiles.html">RedHat manuals</a>, the following thing should work too:<br />
<code># <strong>cat /proc/stat</strong><br />
cpu 33813143 210619911 30093342 59435750<br />
cpu0 33813143 210619911 30093342 59435749<br />
(...)<br />
btime 1096157569<br />
(...)</code></p>
<p>The <code>btime</code> gives us the last boot time in seconds since 1 Jan 1970. I can find and convert it with <code>gawk</code>:<br />
<code># <strong>gawk "/btime/{ print (`date +%s` - \$2) / (3600 * 24.0) ,\"days -\",strftime(\"%a %b %d %H:%M:%S %Z %Y\",\$2)}" /proc/stat</strong><br />
38.6473 days - Sun Sep 26 02:12:49 CEST 2004</code><br />
Which gives us an uptime of 38.6 days &#8211; that looks more like it!</p>
<p>Another way of calculating the uptime is to add up the <code>cpu</code> counters, which count in 1/100ths of a second:<br />
<code># <strong>gawk "/cpu/ {print \$1,(\$2 + \$3 + \$4 + \$5)/(3600 * 24 * 100)}" /proc/stat</strong><br />
cpu 38.6515<br />
cpu0 38.6515</code><br />
Confirmation of the previous measurement!</p>
<p><code># <strong>cat /proc/uptime</strong><br />
45282758.17 663091.26</code><br />
The first number is the # of seconds since last boot. The other one (idle time) we don&#8217;t need. What is that in days?<br />
<code># <strong>gawk "{print \$1/(3600 * 24.0)}" /proc/uptime</strong><br />
524.106</code></p>
<p>This is where the wrong data is coming from: <code>uptime</code> reports the value from <code>/proc/uptime</code>. So I&#8217;ll ignore this file on this server.</p>
<p>Remark: This server is one of my oldest ones and is still running <em>Redhat 7.2 (Enigma)</em>. Looks like this bug was fixed in later versions of RedHat, since none of my other servers have it.</p>


<p>Related posts:<ol><li><a href='http://blog.forret.com/2004/11/probe-disk-performance-mrtg/' rel='bookmark' title='Permanent Link: Probe disk performance (MRTG)'>Probe disk performance (MRTG)</a> <small>The hdparm can be used to monitor the throughput speed...</small></li>
<li><a href='http://blog.forret.com/2004/10/redhat-versions-what-am-i-running/' rel='bookmark' title='Permanent Link: Redhat versions: what am I running?'>Redhat versions: what am I running?</a> <small>If you manage multiple RedHat servers, or if you just...</small></li>
<li><a href='http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/' rel='bookmark' title='Permanent Link: Calculate hit rate from a log file'>Calculate hit rate from a log file</a> <small>You have a huge file that contains one line per...</small></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://blog.forret.com/2004/11/date-formatting-in-gawk-boot-time/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Probe average cpu utilisation (MRTG)</title>
		<link>http://blog.forret.com/2004/10/probe-average-cpu-utilisation-mrtg/</link>
		<comments>http://blog.forret.com/2004/10/probe-average-cpu-utilisation-mrtg/#comments</comments>
		<pubDate>Thu, 21 Oct 2004 22:44:27 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://blog.forret.com/2004/10/probe-average-cpu-utilisation-mrtg/</guid>
		<description><![CDATA[There are two main tools to keep track of your CPU usage: top and vmstat.


top is an interactive tool: it shows you the CPU usage of each process, as well as overall statistics, updated every 5 seconds. It&#8217;s good for hands-on checking.

#top  17:18:34  up 2 days,  8:14,  3 users,  load [...]


Related posts:<ol><li><a href='http://blog.forret.com/2004/11/probe-disk-performance-mrtg/' rel='bookmark' title='Permanent Link: Probe disk performance (MRTG)'>Probe disk performance (MRTG)</a> <small>The hdparm can be used to monitor the throughput speed...</small></li>
<li><a href='http://blog.forret.com/2004/11/date-formatting-in-gawk-boot-time/' rel='bookmark' title='Permanent Link: Date formatting in GAWK: boot time'>Date formatting in GAWK: boot time</a> <small>I have one server with apparently an exceptional stability: #...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>There are two main tools to keep track of your CPU usage: <code>top</code> and <code>vmstat</code>.</p>
<ul>
<li>
<code>top</code> is an interactive tool: it shows you the CPU usage of each process, as well as overall statistics, updated every 5 seconds. It&#8217;s good for hands-on checking.<br />
<code><br />
#top  17:18:34  up 2 days,  8:14,  3 users,  load average: 0.00, 0.00, 0.00<br />
47 processes: 46 sleeping, 1 running, 0 zombie, 0 stopped<br />
CPU states:   0.1% user   0.1% system   0.0% nice   0.0% iowait  99.6% idle<br />
Mem:  1030872k av, 1022256k used,    8616k free,<br />
                         0k shrd,  104844k buff<br />
     777088k actv,      12k in_d,   22296k in_c<br />
Swap: 2048276k av,    8120k used, 2040156k free<br />
                                 640080k cached<br />
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND<br />
30776 root      19   0  1140 1140   852 R     0.9  0.1   0:00   0 top<br />
    1 root      15   0   504  464   436 S     0.0  0.0   0:03   0 init       (...)</code><br />
But say you want to get just one number (percentage) back, so you can use it for logging.
</li>
<li>
<code>vmstat</code> will give you the following output:<br />
<code><br />
#vmstat<br />
procs                      memory      swap          io     system      cpu<br />
r  b  w   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id<br />
0  0  0   7964   8804 104712 640224    0    0     2    16  129    27  0  0 100<br />
</code><br />
You can run <code>vmstat 1 5</code> to get 5 consecutive measurements (1 second apart). The number we want is the average CPU usage, or (100% &#8211; idle), where idle is the 16th column (<code>id</code>) in this version of <code>vmstat</code>. The following command skips the two header lines and will do the job:<br />
<code>#vmstat 1 5 | gawk 'NR &gt; 2 {tot=tot+1; id=id+$16} END {print 100 - id/tot}'</code><br />
gives<br />
<code>0.4</code>
</li>
</ul>
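<p>The position of the <code>id</code> column differs between <code>vmstat</code> versions, so a fixed column number may pick the wrong field on another machine. A hedged variant looks the column up by its header name instead. The function name <code>cpu_busy_pct</code> is an assumption for this sketch, and it reads <code>vmstat</code>-style output from standard input.</p>

```shell
# Sketch: average CPU busy % from vmstat-style output on stdin.
# cpu_busy_pct is a hypothetical helper name, not an existing tool.
cpu_busy_pct() {
    awk '
        # Header row (starts with "r"): find the column labelled "id".
        $1 == "r" { for (i = NF; i > 0; i--) if ($i == "id") col = i; next }
        # Data rows start with a number; the banner line does not.
        col > 0 { if ($1 ~ /^[0-9]+$/) { idle += $col; n++ } }
        END { if (n > 0) print 100 - idle / n }
    '
}

# Usage (assumes vmstat is installed): vmstat 1 5 | cpu_busy_pct
```

<p>Like the one-liner, this averages all reported rows, including the first one, which <code>vmstat</code> computes as an average since boot.</p>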


<p>Related posts:<ol><li><a href='http://blog.forret.com/2004/11/probe-disk-performance-mrtg/' rel='bookmark' title='Permanent Link: Probe disk performance (MRTG)'>Probe disk performance (MRTG)</a> <small>The hdparm can be used to monitor the throughput speed...</small></li>
<li><a href='http://blog.forret.com/2004/11/date-formatting-in-gawk-boot-time/' rel='bookmark' title='Permanent Link: Date formatting in GAWK: boot time'>Date formatting in GAWK: boot time</a> <small>I have one server with apparently an exceptional stability: #...</small></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://blog.forret.com/2004/10/probe-average-cpu-utilisation-mrtg/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Estimate # of lines in a log file</title>
		<link>http://blog.forret.com/2004/10/estimate-of-lines-in-a-log-file/</link>
		<comments>http://blog.forret.com/2004/10/estimate-of-lines-in-a-log-file/#comments</comments>
		<pubDate>Thu, 21 Oct 2004 12:30:27 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://blog.forret.com/2004/10/estimate-of-lines-in-a-log-file/</guid>
		<description><![CDATA[Let&#8217;s say you need an (approximate) count of the number of lines in a huge file. The most obvious way of calculating this would be using wc, but this actually can be quite slow:
# time wc -l /var/log/squid/access.log
2812824 /var/log/squid/access.log
real    0m43.988s
(counting is done at 64.000 lines/sec)
Running wc without the -l (only count lines) [...]


Related posts:<ol><li><a href='http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/' rel='bookmark' title='Permanent Link: Calculate hit rate from a log file'>Calculate hit rate from a log file</a> <small>You have a huge file that contains one line per...</small></li>
<li><a href='http://blog.forret.com/2004/10/squid-list-top-x-referers/' rel='bookmark' title='Permanent Link: Squid: list top X referers'>Squid: list top X referers</a> <small>If your Squid server logs the referers of its request...</small></li>
<li><a href='http://blog.forret.com/2004/11/squid-cachemgrcgi-ui-hack/' rel='bookmark' title='Permanent Link: Squid cachemgr.cgi UI hack'>Squid cachemgr.cgi UI hack</a> <small>Squid has a little system statistics viewer built-in: The cache...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>Let&#8217;s say you need an (approximate) count of the number of lines in a huge file. The most obvious way of calculating this would be using <code>wc</code>, but this actually can be quite slow:<br />
<code># time wc -l /var/log/squid/access.log<br />
2812824 /var/log/squid/access.log<br />
real    0m43.988s</code><br />
(counting is done at 64.000 lines/sec)</p>
<p>Running <code>wc</code> without the <code>-l</code> (only count lines) would be even slower because it would also count the words, instead of just the LF (linefeed) characters. But using <code>wc -c</code> is very fast! This is because the filesystem keeps track of each file&#8217;s filesize (= number of characters/bytes), so the file does not even have to be read to give this number. Can we estimate the # of lines from the # of bytes?</p>
<p>For the type of file we are talking about here (a Squid log file) there actually is a way. The file is more or less &#8216;square&#8217;, meaning that every line is about the same length (it contains date, status, URL, &#8230;).<br />
If we take the beginning of the file (the first 10000 lines):<br />
<code># head -10000 /var/log/squid/access.log | wc<br />
  10000  100000 1775257</code><br />
we see that every line is about 177 chars long.</p>
<p>The end of the file (the last 10000 lines):<br />
<code># tail -10000 /var/log/squid/access.log | wc<br />
  10000  100000 2047887</code><br />
gives us a number of 204 chars/line.</p>
<p>Let&#8217;s take some more data and combine both:<br />
<code># ( head -50000 /var/log/squid/access.log ; tail -50000 /var/log/squid/access.log ) | wc<br />
 100000 1000000 19488905</code><br />
which gives us an average of 195 chars/line.</p>
<p>A file size of 533.229.920 bytes (533MB) would lead us to estimate the # of lines to 2.734.512, where the actual # of lines is 2.818.184 (3% difference). That is: we lose 3% accuracy but the calculation takes almost no CPU time, instead of 45 seconds. This might be a trade-off you are willing to accept!</p>
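<p>Put together, the estimate is: file size divided by the average line length of a sample. A minimal sketch of that arithmetic follows; the function name <code>estimate_lines</code> and its optional sample-size argument are assumptions for illustration, not part of any existing tool.</p>

```shell
# Sketch: estimate the number of lines in a file from its size and the
# average line length of a sample (the first and last N lines).
# estimate_lines is a hypothetical helper name.
estimate_lines() (
    file=$1
    sample=${2:-50000}
    # Total size in bytes; cheap, because wc -c need not read the data.
    bytes=$(wc -c "$file" | awk '{ print $1 }')
    # wc -l -c prints "lines bytes" for the sample; scale up from there.
    { head -n "$sample" "$file"; tail -n "$sample" "$file"; } | wc -l -c |
        awk -v bytes="$bytes" '$2 > 0 { printf "%d\n", bytes * $1 / $2 }'
)

# Usage: estimate_lines /var/log/squid/access.log
```

<p>The estimate is only as good as the assumption that the sampled lines are representative of the whole file.</p>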


<p>Related posts:<ol><li><a href='http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/' rel='bookmark' title='Permanent Link: Calculate hit rate from a log file'>Calculate hit rate from a log file</a> <small>You have a huge file that contains one line per...</small></li>
<li><a href='http://blog.forret.com/2004/10/squid-list-top-x-referers/' rel='bookmark' title='Permanent Link: Squid: list top X referers'>Squid: list top X referers</a> <small>If your Squid server logs the referers of its request...</small></li>
<li><a href='http://blog.forret.com/2004/11/squid-cachemgrcgi-ui-hack/' rel='bookmark' title='Permanent Link: Squid cachemgr.cgi UI hack'>Squid cachemgr.cgi UI hack</a> <small>Squid has a little system statistics viewer built-in: The cache...</small></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://blog.forret.com/2004/10/estimate-of-lines-in-a-log-file/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Calculate hit rate from a log file</title>
		<link>http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/</link>
		<comments>http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/#comments</comments>
		<pubDate>Thu, 21 Oct 2004 09:30:13 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/</guid>
		<description><![CDATA[You have a huge file that contains one line per request/transaction. Some of the lines are of one type (e.g. &#8216;HIT&#8217;), some of another (e.g. MISS). Let&#8217;s say you want to calculate the hitrate, but as fast as possible.
We take a Squid log file of about 140MB. How long does it take to count how [...]


Related posts:<ol><li><a href='http://blog.forret.com/2004/10/estimate-of-lines-in-a-log-file/' rel='bookmark' title='Permanent Link: Estimate # of lines in a log file'>Estimate # of lines in a log file</a> <small>Let&#8217;s say you need an (approximate) count of the number...</small></li>
<li><a href='http://blog.forret.com/2005/06/convert-bind-dns-zone-into-ptr-records/' rel='bookmark' title='Permanent Link: Convert Bind DNS zone into PTR records'>Convert Bind DNS zone into PTR records</a> <small>The following script I made in order to convert the...</small></li>
<li><a href='http://blog.forret.com/2004/11/probe-disk-performance-mrtg/' rel='bookmark' title='Permanent Link: Probe disk performance (MRTG)'>Probe disk performance (MRTG)</a> <small>The hdparm can be used to monitor the throughput speed...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>You have a huge file that contains one line per request/transaction. Some of the lines are of one type (e.g. &#8216;HIT&#8217;), some of another (e.g. MISS). Let&#8217;s say you want to calculate the hitrate, but as fast as possible.<br />
We take a Squid log file of about 140MB. How long does it take to count how many lines it has?<br />
<code># time wc -l /var/log/squid/access.log<br />
845212 /var/log/squid/access.log<br />
real 0m6.523s</code> (about 21.4 MB/s or 130.000 lines/s)</p>
<p>And now let&#8217;s just filter out the lines containing &#8216;HIT&#8217; and count those:<br />
<code>#time sh -c "grep -i HIT /var/log/squid/access.log | wc -l"</code><br />
Wow! This takes ages (I stopped it after 15 minutes) and the <code>grep</code> takes 100% CPU all the time. So let&#8217;s look for another solution.</p>
<p>Maybe <code>gawk</code>? First let&#8217;s see if it is much slower than <code>wc -l</code> for counting lines:<br />
<code># time gawk "END {print NR}" /var/log/squid/access.log<br />
845907<br />
real 0m26.129s</code> (5.3 MB/s or 32.000 lines/s &#8211; 4 times slower)<br />
And now let it count the hits too:<br />
<code># time gawk "BEGIN {hit=0} /HIT/ {hit = hit+1} END {print hit/NR*100}" '/var/log/squid/access.log'<br />
84.5023<br />
real 0m32.836s</code> (4MB/s or 25.000 lines/s &#8211; slow but acceptable)</p>
<p>Do we actually need a count on the whole file? What if we just took the last (i.e. most recent) 100.000 lines? The result would be a better indication of what the current hit rate is, and the speed of calculation would be more predictable.<br />
<code># time sh -c "tail -100000 /var/log/squid/access.log | gawk 'BEGIN {hit=0} /HIT/ {hit = hit+1} END {print hit/NR*100}'"<br />
92.305<br />
real 0m3.332s</code> (30.000 lines/s)</p>
<p>It is actually a bit slower the first time you run it, probably due to disk or filesystem caching. So if you want your hit rate calculation to take less than 2 seconds, you could take the last 50.000 lines. Done!</p>
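<p>As for the slow <code>grep</code>: one possible explanation, not verified on this server, is that older GNU <code>grep</code> versions were very slow at case-insensitive matching (<code>-i</code>) in multibyte locales. A hedged sketch that avoids both, using a case-sensitive <code>grep -c</code> in the C locale on the tail of the log, could look like this. The function name <code>hit_rate_grep</code> is made up for illustration.</p>

```shell
# Sketch: hit rate (%) over the last N lines of a log, counted with
# grep -c in the C locale. hit_rate_grep is a hypothetical helper name.
hit_rate_grep() (
    file=$1
    lines=${2:-100000}
    # Count the sampled lines, then the ones containing HIT.
    total=$(tail -n "$lines" "$file" | wc -l)
    hits=$(tail -n "$lines" "$file" | LC_ALL=C grep -c HIT)
    awk -v h="$hits" -v t="$total" 'BEGIN { if (t > 0) print h / t * 100 }'
)

# Usage: hit_rate_grep /var/log/squid/access.log 50000
```

<p>Because the file is read twice, the two counts can differ slightly on a log that is still being written.</p>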


<p>Related posts:<ol><li><a href='http://blog.forret.com/2004/10/estimate-of-lines-in-a-log-file/' rel='bookmark' title='Permanent Link: Estimate # of lines in a log file'>Estimate # of lines in a log file</a> <small>Let&#8217;s say you need an (approximate) count of the number...</small></li>
<li><a href='http://blog.forret.com/2005/06/convert-bind-dns-zone-into-ptr-records/' rel='bookmark' title='Permanent Link: Convert Bind DNS zone into PTR records'>Convert Bind DNS zone into PTR records</a> <small>The following script I made in order to convert the...</small></li>
<li><a href='http://blog.forret.com/2004/11/probe-disk-performance-mrtg/' rel='bookmark' title='Permanent Link: Probe disk performance (MRTG)'>Probe disk performance (MRTG)</a> <small>The hdparm can be used to monitor the throughput speed...</small></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Squid: list top X referers</title>
		<link>http://blog.forret.com/2004/10/squid-list-top-x-referers/</link>
		<comments>http://blog.forret.com/2004/10/squid-list-top-x-referers/#comments</comments>
		<pubDate>Tue, 19 Oct 2004 16:55:54 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://blog.forret.com/2004/10/squid-list-top-x-referers/</guid>
		<description><![CDATA[If your Squid server logs the referers of its request (i.e.
1. you&#8217;ve configured squid-cache with --enable-referer-log before compiling and
2. you&#8217;ve included a referer_log /var/log/squid/referer.log in your squid.conf file),
you can easily show top 50 of most popular referers with a simple Bourne shell:
#!/bin/bash

this script is &#8216;top_referers.sh&#8217;
(c) 2004 Peter Forret &#8211; Open Source
REFERERS=/var/log/squid/referer.log
OUTPUT=/var/www/html/stats/referer.txt
MAXLINES=50(
echo REPORT MADE AT `date`
echo [...]


Related posts:<ol><li><a href='http://blog.forret.com/2004/11/squid-cachemgrcgi-ui-hack/' rel='bookmark' title='Permanent Link: Squid cachemgr.cgi UI hack'>Squid cachemgr.cgi UI hack</a> <small>Squid has a little system statistics viewer built-in: The cache...</small></li>
<li><a href='http://blog.forret.com/2004/10/estimate-of-lines-in-a-log-file/' rel='bookmark' title='Permanent Link: Estimate # of lines in a log file'>Estimate # of lines in a log file</a> <small>Let&#8217;s say you need an (approximate) count of the number...</small></li>
<li><a href='http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/' rel='bookmark' title='Permanent Link: Calculate hit rate from a log file'>Calculate hit rate from a log file</a> <small>You have a huge file that contains one line per...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>If your Squid server logs the referers of its requests (i.e.<br />
1. you&#8217;ve configured <a href="http://www.squid-cache.org">squid-cache</a> with <code>--enable-referer-log</code> before compiling and<br />
2. you&#8217;ve included a <code>referer_log /var/log/squid/referer.log</code> in your <code>squid.conf</code> file),<br />
you can easily show the top 50 most popular referers with a simple shell script:<br />
<code>#!/bin/bash<br />
# this script is 'top_referers.sh'<br />
# (c) 2004 Peter Forret - Open Source<br />
REFERERS=/var/log/squid/referer.log<br />
OUTPUT=/var/www/html/stats/referer.txt<br />
MAXLINES=50<br />
(<br />
echo REPORT MADE AT `date`<br />
echo =============================<br />
(...)<br />
) &gt; $OUTPUT</code></p>
<p>Then add it to your crontab:<br />
<code>10 * * * * /(path)/top_referers.sh</code><br />
and you have an hourly updated stat!<br />
Add a little HTML formatting if you&#8217;re aesthetically demanding!</p>
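<p>The counting step of the script is not shown above. As a rough sketch only, and not the original code, a top-N list can be built with the classic <code>sort | uniq -c | sort -rn</code> pipeline. The function name <code>top_referers</code> is hypothetical, and the sketch assumes that the referer URL is the third whitespace-separated field of <code>referer.log</code>; adjust the field number if your log differs.</p>

```shell
# Sketch: print the N most frequent values of one whitespace-separated
# field of a log file, most popular first.
# top_referers is a hypothetical helper name.
top_referers() (
    file=$1
    maxlines=${2:-50}
    field=${3:-3}   # assumption: the referer URL is the 3rd field
    awk -v f="$field" '{ print $f }' "$file" |
        sort | uniq -c | sort -rn | head -n "$maxlines"
)

# Usage: top_referers /var/log/squid/referer.log 50
```

<p>Each output line holds a count followed by the referer, so it can be written to the report file as is.</p>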


<p>Related posts:<ol><li><a href='http://blog.forret.com/2004/11/squid-cachemgrcgi-ui-hack/' rel='bookmark' title='Permanent Link: Squid cachemgr.cgi UI hack'>Squid cachemgr.cgi UI hack</a> <small>Squid has a little system statistics viewer built-in: The cache...</small></li>
<li><a href='http://blog.forret.com/2004/10/estimate-of-lines-in-a-log-file/' rel='bookmark' title='Permanent Link: Estimate # of lines in a log file'>Estimate # of lines in a log file</a> <small>Let&#8217;s say you need an (approximate) count of the number...</small></li>
<li><a href='http://blog.forret.com/2004/10/calculate-hit-rate-from-a-log-file/' rel='bookmark' title='Permanent Link: Calculate hit rate from a log file'>Calculate hit rate from a log file</a> <small>You have a huge file that contains one line per...</small></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://blog.forret.com/2004/10/squid-list-top-x-referers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Redhat versions: what am I running?</title>
		<link>http://blog.forret.com/2004/10/redhat-versions-what-am-i-running/</link>
		<comments>http://blog.forret.com/2004/10/redhat-versions-what-am-i-running/#comments</comments>
		<pubDate>Tue, 19 Oct 2004 15:32:32 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://blog.forret.com/2004/10/redhat-versions-what-am-i-running/</guid>
		<description><![CDATA[If you manage multiple RedHat servers, or if you just stumble on a Linux server, and you have no idea what kind of machine it is, nor what the version of the OS is, try the following commands:

# more /proc/version
Linux version 2.4.20-24.9 (bhcompile@porky.devel.redhat.com)
(gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)) #1
Mon Dec 1 11:35:51 EST [...]


Related posts:<ol><li><a href='http://blog.forret.com/2004/11/date-formatting-in-gawk-boot-time/' rel='bookmark' title='Permanent Link: Date formatting in GAWK: boot time'>Date formatting in GAWK: boot time</a> <small>I have one server with apparently an exceptional stability: #...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>If you manage multiple RedHat servers, or if you just stumble on a Linux server, and you have no idea what kind of machine it is, nor what the version of the OS is, try the following commands:</p>
<blockquote><p>
# <strong>more /proc/version</strong><br />
Linux version 2.4.20-24.9 (bhcompile@porky.devel.redhat.com)<br />
(gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)) #1<br />
Mon Dec 1 11:35:51 EST 2003<br />
# <strong>more /proc/cpuinfo</strong><br />
vendor_id : GenuineIntel<br />
model name : Intel(R) Pentium(R) 4 CPU 2.00GHz<br />
cpu MHz : 1992.653<br />
cache size : 512 KB<br />
(&#8230;)<br />
<a href="http://www.tldp.org/HOWTO/BogoMips/">bogomips</a> : 3971.48<br />
# <strong>more /proc/meminfo</strong><br />
MemTotal: 1030872 kB<br />
(&#8230;)<br />
# <strong>cat /etc/redhat-release</strong> (only for RedHat distributions)<br />
Red Hat Linux release 9 (Shrike)
</p></blockquote>
<p>So now you know: a 2GHz Pentium 4 with 1GB of memory, running RedHat 9 &#8216;Shrike&#8217;.<br />
For more info on RedHat versions: <a href="http://www.unixgods.org/~tilo/redhat_versions.html">Taroon, Shrike, Enigma, &#8230;</a></p>


<p>Related posts:<ol><li><a href='http://blog.forret.com/2004/11/date-formatting-in-gawk-boot-time/' rel='bookmark' title='Permanent Link: Date formatting in GAWK: boot time'>Date formatting in GAWK: boot time</a> <small>I have one server with apparently an exceptional stability: #...</small></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://blog.forret.com/2004/10/redhat-versions-what-am-i-running/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

