The md5sum of a "burnt" CD can differ from the md5sum of the associated ISO file without indicating an error.
The most widely recognized cause of the problem is the addition of padding bytes by some CD writing software.
I don't know whether there are other possible causes, such as the recording mode (for example, disk-at-once versus some other mode).
One suggestion, which I haven't tried, is to burn in disk-at-once mode (Windows or Linux) to avoid the problem. Another is to use the -nopad option of cdrecord (Linux); since I don't (so far) burn CDs in Linux, I haven't tried that either.
This page describes ways to check the validity of a burnt CD, via either md5sum or cmp (compare); why the md5sum can sometimes be wrong even though the CD is OK; ways to avoid that problem; and the best ways to check the md5sum of a burnt CD without that uncertainty.
When I wrote the original version of this page, I was very uncertain about what I thought I was seeing, so I collected data about md5sums under various circumstances. That data is still available in older versions of this page.
See AboutThesePages.
Checking a CD using cmp
From BadISOMd5sumNotAlways by Pierre Fortin:
$ cmp /dev/cdrom /ISO/Mandrake/MandrakeLinux-9.2beta1-CD1.i586.iso
cmp: EOF on /ISO/Mandrake/MandrakeLinux-9.2beta1-CD1.i586.iso
The EOF on the ISO image means that the CD contains an EXACT copy of the image PLUS padding. An EOF on /dev/cdrom would have meant the CD was incomplete.
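The comparison above can also be limited to the length of the ISO image, so that padding never enters into it. This is only a minimal sketch, assuming GNU cmp (for the -n byte limit) and GNU stat; the function name cd_matches_iso is made up for illustration, and any readable file can stand in for the device.

```shell
# Compare only the first <size of ISO> bytes of the device against the ISO.
# Assumes GNU cmp (-n LIMIT) and GNU stat (--format=%s).
# cd_matches_iso is a hypothetical helper name, not part of any package.
cd_matches_iso() {
    device=$1
    iso=$2
    size=$(stat --format=%s "$iso") || return 2
    # If the device is shorter than the ISO (an incomplete CD), cmp hits
    # EOF on the device and returns non-zero.
    cmp -n "$size" "$device" "$iso"
}
```

For example, `cd_matches_iso /dev/cdrom image.iso && echo "CD matches"` should stay silent about trailing padding and report only real differences.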
Checking the md5sum of a CD-Rom
In Linux
The Wrong Way
Do not use either of these two commands. They give valid results only some of the time: if padding bytes have been added, the results are not valid.
- md5sum <device>
- dd if=<device> | md5sum
The trick is to compute the md5sum only for the original length of the ISO image.
Note: <device> represents the path to the CD-ROM device, like /dev/cdrom (Mandrake 7.2, IIRC) or /dev/cdroms/cdrom2 (Mandrake 8.1). Check your system, as the device names seem to vary, although /dev/cdrom is often a link to the correct device.
Note also that high speed CD readers, with or without anti-vibration correction, may also cause invalid results; see "Trouble with High Speed or Anti-Vibration CD-Rom Drives", below.
The Right Ways
Volker Kuhlmann provided the information that the trick is to check the md5sum only for the original length of the ISO file (ignore the padding), and he referred me to his script that does this. See .
Before I integrated that information into this page, I also found BadISOMd5sumNotAlways by Pierre Fortin, which suggests the same thing but provides a more visible way to accomplish it:
Just to prove the CD was really OK, I issued this piped string of commands:
$ dd if=/dev/cdrom | head -c 682575872 | md5sum
1333162+0 records in
1333161+0 records out
7a0479dc917d35bd822cecb558c8d432 -
Got the correct md5sum... :^) BTW, 682575872 is simply the length of that particular ISO image.
So, more generically, we can use this pipe stream:
dd if=/dev/cdrom | head -c `stat --format=%s ISO_image` | md5sum
i.e.,
$ dd if=/dev/cdrom | head -c `stat --format=%s /ISO/Mandrake/MandrakeLinux-9.2beta1-CD1.i586.iso` | md5sum
1333162+0 records in
1333161+0 records out
7a0479dc917d35bd822cecb558c8d432 -
-- PierreFortin - 27 Jul 2003
Note that the trick is to specify the length of the original ISO image, and the `stat --format=%s ISO_image` gets that original length.
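The same idea can be wrapped in a small function. This is a sketch rather than Pierre's or Volker's code, and it assumes GNU stat and GNU head; the name md5_of_iso_length is hypothetical. It reads only as many bytes from the device as the reference ISO contains and prints the bare digest.

```shell
# Print the md5sum of the first N bytes of a device, where N is the size
# of the reference ISO file. Assumes GNU stat and GNU head.
# md5_of_iso_length is a hypothetical helper name.
md5_of_iso_length() {
    device=$1
    iso=$2
    size=$(stat --format=%s "$iso") || return 2
    # head stops after $size bytes, so any padding after the image is ignored.
    head -c "$size" "$device" | md5sum | cut -d' ' -f1
}
```

The result can then be compared with the digest of the ISO itself, for example with `[ "$(md5_of_iso_length /dev/cdrom image.iso)" = "$(md5sum < image.iso | cut -d' ' -f1)" ] && echo OK`.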
OK, it's a couple of years later and I'm trying to remember where /ISO/Mandrake/MandrakeLinux-9.2beta1-CD1.i586.iso came from. My best guess is that I had downloaded that image (or whatever image I was testing against). Today I'm trying to test a CD given to me for which I don't have a "good" disk image, so I don't see a way to (easily) get the original length of the ISO image. -- Main.RandyKramer - 09 Feb 2005
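One possible way around a missing image, untested on the CD in question, is to ask the filesystem on the disc how big it is. The sketch below assumes that `isoinfo -d` prints lines of the form "Logical block size is: <n>" and "Volume size is: <n>" (the latter counted in logical blocks); the function name iso_bytes_from_isoinfo is hypothetical. It also assumes the original ISO file was exactly as long as the filesystem it contains, which may not hold for every image.

```shell
# Read `isoinfo -d` output on stdin and print the filesystem size in bytes.
# Assumption: the report contains "Logical block size is: <n>" and
# "Volume size is: <n>", with the volume size counted in logical blocks.
# iso_bytes_from_isoinfo is a hypothetical helper name.
iso_bytes_from_isoinfo() {
    awk -F': *' '
        /^Logical block size is/ { bs = $2 }
        /^Volume size is/        { blocks = $2 }
        END {
            # Fail if either line was missing from the report.
            if (bs == "" || blocks == "") exit 1
            printf "%.0f\n", bs * blocks
        }'
}
```

It would be used as `isoinfo -d -i /dev/cdrom | iso_bytes_from_isoinfo`, and the number printed could be given to head -c in place of the stat result. As a check on the arithmetic, 333289 blocks of 2048 bytes come to 682575872 bytes, the length used in Pierre's example.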
In Windows
So far, I don't know of a way.
Trouble with High Speed or Anti-Vibration CD-Rom Drives
I did find that reading a CD in one of the high speed anti-vibration CD drives can be a problem -- in trying to do the md5sum after burning on such a CD drive using "md5sum /dev/cdroms/cdrom1" I got inconsistent results. (In fact in six trials, I didn't get the same results twice. Perhaps it is a bad drive, but I've used it for all kinds of things including successful installs of Mandrake 7.2.)
On another machine, the drive is a standard 16x and the results were consistent -- identical on the three trials I made, one of which was done using "dd if=/dev/cdroms/cdrom1 | md5sum".
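The repeated trials described above can be automated. This is a minimal sketch, not the procedure I actually followed; the function name check_read_consistency and the default of three trials are my own choices for illustration.

```shell
# Read the same source several times and report whether every read
# produced the same md5sum. Prints "consistent" or "inconsistent".
# check_read_consistency is a hypothetical helper name.
check_read_consistency() {
    source_path=$1
    trials=${2:-3}
    [ -r "$source_path" ] || return 2
    first=""
    i=0
    while [ "$i" -lt "$trials" ]; do
        sum=$(md5sum < "$source_path" | cut -d' ' -f1)
        if [ -z "$first" ]; then
            first=$sum            # remember the digest of the first read
        elif [ "$sum" != "$first" ]; then
            echo inconsistent
            return 1
        fi
        i=$((i + 1))
    done
    echo consistent
}
```

Something like `check_read_consistency /dev/cdroms/cdrom1 6` would repeat the six trials mentioned above. A drive that reads reliably should print "consistent"; note that this only tests repeatability, not whether the digest matches the ISO.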
Amplifying: While reading from a high speed drive, I hear the drive sort of cycle — not sure whether it runs a bit then stops, then restarts, stops, etc., or whether it has two speeds and alternates between them. In any event, I think it's a case of the software and drive not "synching" properly — the drive zigs when the software expects a zag (or something similar).
My fix has been to buy older slower (used) drives (which are also cheaper).
As a possible alternative (in Linux), you can use hdparm to set the drive to a slower speed (I haven't tried this). That won't help, IMHO, when you are booting from a CD. Maybe you could boot the machine, change the appropriate parameters with hdparm, and try a warm boot, in the hope that the settings survive the reboot, but I suspect they would not.
Some Results
When I first wrote this page, I was surprised and doubted what I seemed to see. Since then, I've received enough confirmation from others to believe the results. Hence, I've removed the md5sum data I was saving on various burnt CDs — it is still available in revision 1.21 of this page. (Click on "R1.21" at the bottom of this page to view it.)
Other Notes
I ran md5sum on some small files with the same content but with different file names and / or different dates. It became apparent that md5sum does not check the date or filename as the md5sums were all the same. (This is not clearly specified on the man or info page, IIRC.)
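That experiment is easy to repeat. The sketch below is a reconstruction, not the exact commands I ran: it creates two files with the same content but different names and modification times and compares their digests. The helper name same_md5 and the file names are made up for illustration.

```shell
# Succeed if two files have the same md5 digest. Only the digest field of
# the md5sum output is compared, so the file names printed after it are ignored.
# same_md5 is a hypothetical helper name.
same_md5() {
    [ "$(md5sum "$1" | cut -d' ' -f1)" = "$(md5sum "$2" | cut -d' ' -f1)" ]
}

demo_dir=$(mktemp -d)
printf 'same content\n' > "$demo_dir/first.txt"
printf 'same content\n' > "$demo_dir/second-name.txt"
# Give the second file a different modification time (14 Jan 2002, 00:00).
touch -t 200201140000 "$demo_dir/second-name.txt"
if same_md5 "$demo_dir/first.txt" "$demo_dir/second-name.txt"; then
    echo "md5sums are identical"
fi
rm -rf "$demo_dir"
```

This is what one would expect, since md5sum hashes only the bytes of the file's content.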
Resources
Threads on newbie@linux-mandrakePLEASENOSPAM.com (example post) and earlier on the Canterbury Linux Users Group confirm that problems like this exist. The thread on the Canterbury Linux Users Group includes this post from Volker Kuhlmann, partially quoted below:
To: linux-users@HIDDEN
Subject: cdrom md5 sums was: dd help PLEASE
From: Volker Kuhlmann <list0570@HIDDEN>
Date: Tue, 11 Mar 2003 10:54:43 +1300
The implementation of the isofs in Linux is quite bad (e.g. the method of making inodes will prevent hardlinked files from ever being stored properly on an isofs). The kernel also has the habit (ever since the first version) of reading too much data from the device, i.e. it reads past end of file on the disk.
Needless to say this can cause I/O errors (oh what a surprise). For this reason only cdrecord has a -pad option, which simply writes additional zeros past the end of the filesystem onto the disk. Of course, this also stuffs your md5 sums. Another bug in the kernel is that it can't properly detect end-of-file on CD media. These additional zeros will screw your md5.
For the record, all these are 100% identical:
- cat /dev/cdrom | md5sum
- md5sum < /dev/cdrom
- dd if=/dev/cdrom bs=2k | md5sum
- dd </dev/cdrom bs=2k | md5sum
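The equivalence of those four commands can be checked on any readable file, without a CD drive. This is a small sketch of such a check; the function name all_reads_agree is hypothetical and is not part of Volker's post.

```shell
# Run the four read methods from the list above on the same source and
# succeed only if all four md5sum outputs agree.
# all_reads_agree is a hypothetical helper name.
all_reads_agree() {
    src=$1
    a=$(cat "$src" | md5sum)
    b=$(md5sum < "$src")
    c=$(dd if="$src" bs=2k 2>/dev/null | md5sum)
    d=$(dd < "$src" bs=2k 2>/dev/null | md5sum)
    [ "$a" = "$b" ] && [ "$b" = "$c" ] && [ "$c" = "$d" ]
}
```

All four read the source from the first byte to wherever the kernel reports end of file, which is why none of them gets around the padding problem.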
Further Info from Volker Kuhlmann
In some private correspondence with Volker Kuhlmann, he sent me this (to be "integrated").
A CD has a certain number of blocks you can burn. If you create a filesystem, it'll be taking up a certain number of blocks too. When you write a CD, the blocks of the filesystem simply get burnt onto the CD, starting with block 1 and ending with N. Obviously N has to be smaller than the number of blocks on the CD which can be burnt. The filesystem finishes at N, all data and all files are included in those 1..N blocks. For some reason, the kernel wants to read more than N blocks (this is a bug), even though it will never use the data read from blocks after N. Obviously the recording on the CD finishes with block N, trying to read block N+1 results in a read error (there's simply nothing there). To keep the kernel happy, the trick is to write a few more blocks filled with zeros starting with block N+1 and finishing with N+X. It's irrelevant what they contain - they will never be used, but they have to be readable.
An iso file contains an iso9660 filesystem, and you know how many blocks it takes up by looking at the file size (you could also look at the filesystem inside the file, with isoinfo). If you create an MD5 checksum of the filesystem, you simply create a checksum of the whole file.
When you want to check a burnt CD, you must make sure you create the MD5 checksum of it from block 1 to block N, and not from 1 to N-2, or 1 to N+X. If the kernel wasn't buggy, a cat /dev/cdrom should read blocks 1 to N+X, your checksum will be useless and your test will fail. This doesn't mean the CD actually has an error on block 1 to N. My script writecd --blockread /dev/cdrom will read the correct number of blocks 1 to N (that's why I wrote it). My script md5 will create MD5 sums of iso files and check CDs against it, also with the correct number of blocks.
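The block arithmetic Volker describes can be sketched as follows. These are not his writecd or md5 scripts, which I am not reproducing here; the function names are hypothetical, the 2048-byte block size is an assumption (the usual ISO 9660 logical block size), and the iflag=fullblock option assumes GNU dd.

```shell
# Print the number of 2048-byte blocks in an ISO file (N in the text above).
# Assumes the file size is a whole multiple of 2048; fails otherwise.
# iso_block_count and md5_of_blocks are hypothetical helper names.
iso_block_count() {
    size=$(stat --format=%s "$1") || return 2
    [ $((size % 2048)) -eq 0 ] || return 1
    echo $((size / 2048))
}

# Print the md5sum of exactly the first N blocks of a device.
# iflag=fullblock (GNU dd) keeps short reads from shrinking the byte count.
md5_of_blocks() {
    dd if="$1" bs=2048 count="$2" iflag=fullblock 2>/dev/null | md5sum | cut -d' ' -f1
}
```

For example, `md5_of_blocks /dev/cdrom "$(iso_block_count image.iso)"` should print the same digest as `md5sum < image.iso` when the first N blocks of the CD are good, whatever padding follows them.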
Other Resources
(Not necessarily directly relevant to this issue.)
Contributors
- RandyKramer - 14 Jan 2002
- Mario Michael da Costa
- Volker Kuhlmann - quoted posts dated 11 Mar 2003, and other correspondence sent to me in July, 2003
- Pierre Fortin - information from BadISOMd5sumNotAlways, almost all of it quoted on this page (at this time)
Edit Record
Major refactoring on July 31, 2003. More should be done, but I need to let it sit a while.