The gzip Recovery Toolkit

So you thought you had your files backed up - until it came time to restore. Then you found out that you had bad sectors and you've lost almost everything because gzip craps out 10% of the way through your archive. The gzip Recovery Toolkit has a program - gzrecover - that attempts to skip over bad data in a gzip archive. This saved me from exactly the above situation. Hopefully it will help you as well.

I'm very eager for feedback on this program. If you download and try it, I'd appreciate and email letting me know what your results were. My email is arenn@urbanophile.com. Thanks.

ATTENTION

99% of "corrupted" gzip archives are caused by transferring the file via FTP in ASCII mode instead of binary mode. Please re-transfer the file in the correct mode first before attempting to recover from a file you believe is corrupted.

Disclaimer and Warning

This program is provided AS IS with absolutely NO WARRANTY. It is not guaranteed to recover anything from your file, nor is what it does recover guaranteed to be good data. The bigger your file, the more likely that something will be extracted from it. Also keep in mind that this program gets faked out and is likely to "recover" some bad data. Everything should be manually verified.

Downloading and Installing

You need the following packages:

First, build and install zlib if necessary. Next, unpack the gzrt sources. Then cd to the gzrt directory and build the gzrecover program by typing make. Install manually by copying to the directory of your choice.

Usage

Run gzrecover on a corrupted .gz file. Anything that can be read from the file will be written to a file with the same name, but with a .recovered appended (any .gz is stripped). You can override this with the -o option. To get a verbose readout of exactly where gzrecover is finding bad bytes, use the -v option to enable verbose mode. This will probably overflow your screen with text so best to redirect output to a file. Once gzrecover has finished, you will need to manually verify any data recovered as it is quite likely that our output file is corrupt and has some garbage data in it. Note that gzrecover will take longer than regular gunzip. The more corrupt your data the longer it takes. If your archive is a tarball, read on.

For tarballs, the tar program will choke because GNU tar cannot handle errors in the file format. Fortunately, GNU cpio (tested at version 2.6 or higher) handles corrupted files out of the box.

Here's an example:

$ ls *.gz
my-corrupted-backup.tar.gz
$ gzrecover my-corrupted-backup.tar.gz
$ ls *.recovered
my-corrupted-backup.tar.recovered
$ cpio -F my-corrupted-backup.tar.recovered -i -v

Note that newer versions of cpio can spew voluminous error messages to your terminal. Also, cpio might take quite a long while to run.

If you have a previous release, please note that the patches to GNU tar have been discontinued. They were only marginally successful at best and GNU cpio does what is needed out of the box and does it far better.

Copyright

The gzip Recovery Toolkit v0.6
Copyright (c) 2002-2012 Aaron M. Renn (arenn@urbanophile.com)

The gzrecover program is licensed under the GNU General Public License.


Copyright © 2003-2012 Aaron M. Renn (arenn@urbanophile.com) All Rights Reserved