Binary confusion: kilobytes and kibibytes

When I created my Bandwidth Calculator, easily the most popular web tool I ever made, I came across the following problem: in computer technology there is a habit of using kilobyte (KB) for 1024 bytes and megabyte (MB) for 1024*1024 (1.048.576) bytes. Most of you might think this is correct, but it’s not. The International System of Units (SI), which defines the kilo, mega, giga, … and milli, micro, nano prefixes, uses only base-10 values. A kilo is always 1000, even for bytes. To fix the IT ‘contamination’ of using kilo for 2^10 instead of 10^3, the IEC introduced new prefixes in 1998:

In 1999, the International Electrotechnical Commission (IEC) published Amendment 2 to “IEC 60027-2: Letter symbols to be used in electrical technology – Part 2: Telecommunications and electronics”. This standard, which had been approved in 1998, introduced the prefixes kibi-, mebi-, gibi-, tebi-, pebi-, exbi-, to be used in specifying binary multiples of a quantity. The names come from the first two letters of the original SI prefixes followed by bi which is short for “binary”. It also clarifies that, from the point of view of the IEC, the SI prefixes only have their base-10 meaning and never have a base-2 meaning.
(from en.wikipedia.org)

So this is the correct usage for file, disk, memory size:

Decimal (SI)     Bytes          Binary (IEC)      Bytes
Kilobyte (KB)    1.000          Kibibyte (KiB)    1024
Megabyte (MB)    1.000 ^ 2      Mebibyte (MiB)    1024 ^ 2
Gigabyte (GB)    1.000 ^ 3      Gibibyte (GiB)    1024 ^ 3
Terabyte (TB)    1.000 ^ 4      Tebibyte (TiB)    1024 ^ 4
Petabyte (PB)    1.000 ^ 5      Pebibyte (PiB)    1024 ^ 5
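
To make the two columns concrete, here is a minimal Python sketch that formats a byte count either way. The function name format_size and its parameters are my own invention for illustration, not part of any standard library, and it uses “KB” as in the table above (strict SI would write “kB”).

```python
# Unit labels for the decimal (SI) and binary (IEC) columns of the table.
SI_UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]


def format_size(num_bytes, binary=False, digits=1):
    """Format a byte count with SI (base 1000) or IEC (base 1024) prefixes."""
    base = 1024 if binary else 1000
    units = IEC_UNITS if binary else SI_UNITS
    value = float(num_bytes)
    index = 0
    # Divide by the base until the value fits under the next prefix.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.{digits}f} {units[index]}"


print(format_size(28_735_078_400, digits=2))               # decimal prefixes
print(format_size(28_735_078_400, binary=True, digits=2))  # binary prefixes
```

The same number of bytes gets two different figures, and only the unit label tells you which one you are looking at.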

The problem is: the industry has not adopted these standards. If Windows shows the size of a disk, it converts 28.735.078.400 bytes to “26.7 GB”. It should be either 28.7 GB, or 26.7 GiB. Remember the 1.44MB floppy? It actually never existed: it is either 1.40MiB or 1.47MB.
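
The floppy figures are easy to check. The sketch below assumes the usual capacity of a 3.5" high-density disk, 2880 sectors of 512 bytes (that geometry is not stated in this post), and shows where the “1.44” comes from: it mixes one factor of 1000 with one factor of 1024. The variable names are only illustrative.

```python
# Assumed capacity of a 3.5" high-density floppy: 2880 sectors of 512 bytes.
FLOPPY_BYTES = 2880 * 512  # 1,474,560 bytes

size_mb = FLOPPY_BYTES / 1000**2            # decimal megabytes
size_mib = FLOPPY_BYTES / 1024**2           # binary mebibytes
size_mixed = FLOPPY_BYTES / (1000 * 1024)   # the marketing "MB": 1000 * 1024 bytes

print(f"{size_mb:.5f} MB")      # decimal
print(f"{size_mib:.5f} MiB")    # binary
print(f"{size_mixed:.2f} 'MB' (1000 * 1024 bytes)")
```

So 1.44 is neither megabytes nor mebibytes: it is 1440 KiB with the “Ki” quietly replaced by a decimal point. The exact binary value is 1.40625 MiB, which I wrote above as 1.40.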

On September 18 2003 Reuters has reported that Apple, Dell, Gateway, Hewlett-Packard, IBM, Sharp, Sony and Toshiba have been sued in a class-action suit in Los Angeles Superior Court for “deceiving” the true capacity of their hard drives. This of course was due to ambiguity of “GB” when used by software and hardware vendors. This precedent might prompt Apple to adapt binary prefixes in its Mac OS, as well as other companies to put pressure on Microsoft to adapt them in its Windows operating systems.
from members.optus.net

One could argue: people have always used the MB = 1024*1024 for disk drives, why change now? Well, clarity is a good reason, and unambiguity. NASA lost the Mars Climate Orbiter because one team’s software reported thruster impulse in English units (pound-force seconds) while the navigation software expected metric units (newton-seconds). Don’t even get me started on miles per gallon.

So: a disk of 160 GB should have 160.000.000.000 bytes. And that is about 149 GiB. Get over it.
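
For anyone who wants to check the arithmetic, here is a small Python sketch. It also shows that the gap between the decimal and binary meaning grows with every prefix, which is why the confusion gets worse as disks get bigger. The helper name binary_excess_percent is hypothetical, chosen only for this example.

```python
def binary_excess_percent(power):
    """How much larger 1024**power is than 1000**power, in percent."""
    return (1024**power / 1000**power - 1) * 100


# A disk sold as 160 GB, expressed in binary gibibytes.
disk_bytes = 160 * 1000**3
disk_gib = disk_bytes / 1024**3
print(f"160 GB = {disk_gib:.2f} GiB")

# Gap between the two interpretations for each prefix pair.
for power, name in enumerate(["kilo/kibi", "mega/mebi", "giga/gibi",
                              "tera/tebi", "peta/pebi"], start=1):
    print(f"{name}: {binary_excess_percent(power):.1f}% difference")
```

At kilo the two meanings differ by 2.4%, at giga by about 7.4%.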

9 Responses to Binary confusion: kilobytes and kibibytes

  1. You wrote “Well, clarity is a good reason, and unambiguity.”

    The problem is that the IEC’s proposal INTRODUCES MORE AMBIGUITY because it attempts to change EXISTING definitions. In other words, the documentation of the last 25 or so years becomes unclear as the reader now has to ask himself/herself “Is this use of ‘MB’ in this text the IEC definition, or the original definition?” One can only assume that they redefined KB, MB, and GB because they wanted the prefixes to be consistent with the other (standardized) uses, but the better solution would have been, instead, to create new terms for the less common usage of base-10 for KB/MB/GB/etc. (i.e. “kibi-” meaning 1,000 instead of “kibi-” meaning 1,024). See lyberty.com/encyc/articles/kb_kilobytes.html

  2. I think you should use k for the kilo prefix instead of K.

  3. To above:
    Conventionally, the capital letter is always used in the positive powers, and lowercase in negative powers.

  4. Actually, capital-K for the positive power is only used in IT, as in the SI definition “K” is for Kelvin. The other positive powers do use uppercase :)

    As for kilo/kibi, I agree with the above poster that the proposal is actually backwards! For bytes, the measure has been and always will be 2^10, 2^20, etc. because of one simple thing: computers use binary, not decimal. Also, the term sounds corny, I actually thought someone had made a joke on the article where the definition was.

    The only renegade dudes on this issue were the HD manufacturers, but that was because they were taking advantage of the “ambiguity”. The only place where kilo=1000 in the SI sense is when talking about data transfer (kilobits, megabits) and even then it isn’t always used: my ISP gives me 1Mbps, and my DSL router shows 1024Kbps as my download link speed.

    Me thinking about “Kirby-bytes” when I see that mock prefix…

  5. @Lyberty: the proposal does not introduce more ambiguity, its goal is to remedy the existing confusion. In the past, manufacturers used kilo for 1024 and Pluto was a planet. Both historical errors have been corrected.
    @Danix: you defend the ‘old’ system and by citing the DSL example prove exactly why it is flawed. Confusion about units is what made NASA lose a Mars Orbiter. So just accept it: ‘Kilo’ is always 1000, ‘Mega’ is 1000000.

  6. Can someone please, PLEASE! provide, or tell me where to find a simple DRAWING illustrating megabytes to kilobytes? Is that too much to ask? I’ll appreciate anything I can get! Anything! I’m at the end of my rope. I ain’t real smart. Thanx. Herb

  7. Ben Schaffhausen

    Kilo has meant 1000 since long before “byte” was a term. The people who started using “kilo” to mean 1024 out of convenience were the ones who were wrong- not people trying to set things straight today.

    The problem – “kilo” (and mega, giga, tera, etc) has adopted two usages, one as 1000 (base ten), and one as 1024 (base 2).

    The obvious solution is to adopt a new set of prefixes for the _new_ use, the base-2 usage. This has been done, let’s accept it.

    It’s either that or anytime anyone publishes anything using the kilo, mega, giga, tera or larger prefixes in the “small” computer- software/hardware/communications industry they need to put a definition footnote with every usage designating which interpretation they are using.

    Until then every time I see the term “MB” I need to ask myself which definition they are using (let alone if they really mean “bytes” or “bits”)

  8. Herb, you can do it yourself! Draw a blue line with a length of 1000 mm. Now draw a parallel red line with a length of 1024 mm. The blue line represents a kilobyte and the red line represents a kibibyte.

  9. I have mixed feelings. There are points to be considered on both sides of the argument. One Kilo, Mega, etc are base 10 prefixes in the “metric” system. Since binary number systems do not have equivalent naming, metric names were “borrowed” to approximate their originally intended values. 50 years ago, a bit (BInary digIT), was the smallest routable/storable unit of data. Nibbles are 4 bits, bytes (characters) were 8 bits. A double byte (word) was 16 bits. I suppose double word is 32 bits. Who knows what 64 or 128 bit chunks are called. BEFORE 1993, it was common knowledge among computer enthusiasts, that KB meant an approximation of KiloBytes equaling 1024 times 8 bits rather than 1000.
    Kilo was not a literal metric value, it is/was the closest base two approximation of that value. The only organizations/manufacturers that did not follow this “standard” were the storage/hard disk companies.
    Most telco, isp, or data networks “used to” measure bandwidth load or throughput in kilobits per second. Everything computerized acknowledged kilo to mean 1024. Mega meaning 1048576. By nature, humans are lazy and like to drop or round off the details.

    A Kbyte is an exact value, 1024. Its name however can be considered inaccurate (get over it, it has been that way since the first computer). An Mbyte is 1048576. After 50 years, not 25, you cannot recall text books, documentation, or “common” knowledge. A Kilo is 1000 times something, true. Like every rule, there is an exception. When bit or byte is suffixed to the prefix, the nearest base 2 value applies. The exception to this rule is hard disk storage. They may have won or got the suit dismissed, but that does not make them right. They just had better lawyers than the class action lawyers.