Yesterday, I ran into a very strange situation that I would like to share, because it shows the importance of “integrity” in information security. Wikipedia defines data integrity as follows:
“Data Integrity in its broadest meaning refers to the trustworthiness of system resources over their entire life cycle.”
The “entire life cycle” is very important in this case. I had to upgrade the firmware of an appliance manufactured by “A”. I visited (over HTTPS) the support website of “A”, went to the download section and grabbed the files needed to perform the maintenance. The website provided MD5 hashes for all the files. Good practice! Once the files were transferred to my laptop, md5sum reported the same hashes. My files were ready!
Just a small reminder for those who don’t know what a hashing algorithm is. From an input of any size, a hashing algorithm computes a fixed-size message digest. Well-known algorithms are MD5, SHA-1 and the SHA-2 family (HMAC, often listed alongside them, is not a hash itself but a keyed construction built on top of one). In practice, the generated message digest identifies the original data: two different files are extremely unlikely to produce the same digest by accident. For example, almost all operating systems have tools to compute the MD5 or SHA-1 digest of files:
    $ md5sum /tmp/file.txt
    451024bdf01d5d4f64567bea70c402be  /tmp/file.txt
    $ sha1sum /tmp/file.txt
    93c6c7c22e0846ca1944f76ceb6981a2f49ce70e  /tmp/file.txt
This is a common way to check the integrity of files distributed online. Hashes are published on the original website. You perform the same operation on your local files and, if the message digests match, the files are identical!
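For those who prefer to script this check, here is a minimal sketch in Python. The function names `file_digest` and `matches_published` are my own, not part of any tool mentioned here, and the sketch assumes the published hash is a plain hexadecimal string copied from the vendor's page.

```python
import hashlib
import hmac


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks.

    Chunked reads keep memory use flat even for a large firmware archive.
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path, published_hex, algorithm="md5"):
    """Compare the local digest with the one published by the vendor.

    The published value is normalized (whitespace, letter case) first,
    since it is usually copy-pasted from a web page.
    """
    expected = published_hex.strip().lower()
    return hmac.compare_digest(file_digest(path, algorithm), expected)
```

Calling `matches_published("/tmp/file.txt", "451024bdf01d5d4f64567bea70c402be")` should return `True` only if the local file has exactly the same content as the published one. Passing `algorithm="sha256"` works the same way when the vendor publishes a stronger hash.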
Once at the customer's premises, I met another good security practice: I was not authorized to connect my laptop to their management network. I simply copied the files to a clean (read: safe, scanned) USB stick to transfer them to the management workstation. Finally, I uploaded the files to the appliance and launched the upgrade procedure. After many coffees, the device was still decompressing the firmware (a 670 MB archive). Strange. I decided to investigate…
I checked the USB stick: the firmware file looked OK, I could read it, and even the file size was the same as the original. I computed the MD5 hash of the file directly from the USB stick and… it was not the same! The file had been corrupted during the transfer from my laptop to the USB stick!? No error message was displayed during the copy operation, the stick was properly unmounted, and no USB/SCSI errors were reported by my laptop's kernel. I’m still wondering what happened!
Fortunately, the second attempt to upgrade the appliance was successful. What are the lessons learned from this story?
- Integrity is a key element in information security (That’s the “I” in the CIA triad)
- MD5/SHA1 hashes are a common way to verify the integrity of files downloaded from public resources. Integrity must be checked not only when receiving the data from the source but during the complete data life cycle: transfer, storage and retrieval (which I omitted to do in this story, shame on me!).
- Data integrity can be compromised by multiple factors:
  - Security threats (e.g. a virus)
  - Human errors
  - Physical factors (e.g. a bad sector on a disk)
  - Software bugs
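Checking the digest at every hop, as the second lesson suggests, can be sketched in Python as follows. The helper names `sha256_of` and `copy_and_verify` are hypothetical, and SHA-256 is my choice here, not something the vendor provided. One caveat: reading a file back immediately after writing it may be served from the operating system's cache rather than from the stick itself, so for removable media it is safer to run the check again after unmounting and re-inserting the device.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, then re-hash dst and compare it with src.

    Raises OSError if the digests differ, so a silent corruption
    becomes a loud failure. Returns the digest on success.
    """
    expected = sha256_of(src)
    shutil.copyfile(src, dst)
    actual = sha256_of(dst)
    if actual != expected:
        raise OSError(f"digest mismatch after copy: {src} -> {dst}")
    return actual
```

Keeping the returned digest (on paper or in a text file) also lets you re-run `sha256_of` on the management workstation before uploading the file to the appliance.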
I failed to check the integrity of the files from A to Z (and we learn from our mistakes), but the vendor “A” also failed somewhere:
- The process that decompresses the firmware image did not report a problem with the file and crashed silently, leaving the web console with a time counter running.
- Some vendors still fail to implement integrity checks on the firmware they have to process. Distributed files are simply not signed, which means that they can be altered and injected into the device (MitM attack). Solutions exist to validate the integrity of a file from a consistency point of view (using CRC or “Cyclic Redundancy Checks”); they catch accidental corruption like mine, although only a signature protects against deliberate tampering.
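To illustrate the difference between a consistency check and an authenticity check, here is a small Python sketch. It is not how vendor “A” or any specific appliance works. Real firmware signing normally relies on asymmetric signatures; since the Python standard library offers none, a keyed HMAC-SHA256 tag is used below as a stand-in, and the key and function names are hypothetical.

```python
import hashlib
import hmac
import zlib


def crc32_of(data: bytes) -> int:
    """CRC-32 checksum: catches accidental corruption, not tampering."""
    return zlib.crc32(data) & 0xFFFFFFFF


def keyed_tag(key: bytes, data: bytes) -> str:
    """HMAC-SHA256 tag: cannot be recomputed without the secret key."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def accept_with_crc(data: bytes, claimed_crc: int) -> bool:
    return crc32_of(data) == claimed_crc


def accept_with_tag(key: bytes, data: bytes, claimed_tag: str) -> bool:
    return hmac.compare_digest(keyed_tag(key, data), claimed_tag)


if __name__ == "__main__":
    key = b"hypothetical-vendor-secret"  # assumption: known to vendor and device only
    firmware = b"firmware image " * 1000

    # Accidental corruption: a single flipped bit changes the CRC.
    damaged = bytearray(firmware)
    damaged[1234] ^= 0x01
    print(accept_with_crc(bytes(damaged), crc32_of(firmware)))  # False

    # Tampering: an attacker simply ships a fresh CRC with the new image.
    tampered = b"malicious image " * 1000
    print(accept_with_crc(tampered, crc32_of(tampered)))  # True

    # Without the key, the attacker cannot produce a valid tag.
    print(accept_with_tag(key, tampered, keyed_tag(key, firmware)))  # False
```

A CRC would have been enough to stop my corrupted archive before the decompression started; only the keyed or signed variant would also stop an image altered on purpose.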
Keep this in mind and stay safe!
Note: MD5 has been considered broken for a while. It has been proven that MD5 is vulnerable to collision attacks. But it remains widely used to check the integrity of downloaded files.