The gzip Recovery Toolkit

So you thought you had your files backed up onto that Jaz cartridge - until it came time to restore. Then you found out that you had bad sectors and you've lost almost everything because gzip craps out 10% of the way through your archive. The gzip Recovery Toolkit has a program - gzrecover - that attempts to skip over bad data in a gzip archive. This saved me from exactly the above situation. Hopefully it will help you as well.

I'm very eager for feedback on this program. If you download and try it, I'd appreciate an email letting me know what your results were. My email is arenn@urbanophile.com. Thanks.

ATTENTION

99% of "corrupted" gzip archives are caused by transferring the file via FTP in ASCII mode instead of binary mode. Please re-transfer the file in the correct mode first before attempting to recover from a file you believe is corrupted.
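A quick way to tell whether a re-transferred copy is intact is to ask gzip itself, using its -t (test) option. The sketch below is only an illustration: check_gz is a hypothetical helper name, and the throwaway files stand in for your real archive.

```shell
# check_gz is a hypothetical helper name, not part of gzrt.
# gzip -t decompresses the file internally and reports whether it is intact.
check_gz() {
    if gzip -t "$1" 2>/dev/null; then echo ok; else echo damaged; fi
}

# Demo on throwaway files: a good archive and a deliberately truncated copy.
tmp=$(mktemp -d)
seq 1 20000 | gzip -c > "$tmp/good.gz"
head -c 1000 "$tmp/good.gz" > "$tmp/bad.gz"

good_status=$(check_gz "$tmp/good.gz")
bad_status=$(check_gz "$tmp/bad.gz")
echo "good.gz: $good_status"
echo "bad.gz: $bad_status"
```

If the re-transferred file passes this check, there is nothing to recover and plain gzip will decompress it.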

Disclaimer and Warning

This program is provided AS IS with absolutely NO WARRANTY. It is not guaranteed to recover anything from your file, nor is what it does recover guaranteed to be good data. The bigger your file, the more likely that something will be extracted from it. Also keep in mind that this program gets faked out and is likely to "recover" some bad data. Everything should be manually verified.

Downloading and Installing

You need the following packages: zlib and the gzrt sources.

First, build and install zlib if necessary. Next, unpack the gzrt sources. Then cd to the gzrt directory and build the gzrecover program by typing make. Install manually by copying to the directory of your choice.

Release 0.5 uses buffered reads in gzrecover instead of mmap(2), which should allow it to work on files of any size or on systems without mmap(2) support. A beta tester reported using it successfully on a 168GB file. However, if this doesn't work for you for some reason, the previous version is still available for download via HTTP or FTP.

Usage

Run gzrecover on a corrupted .gz file. Anything that can be read from the file will be written to a file with the same name, but with .recovered appended (any .gz is stripped). You can override this with the -o option. To see exactly where gzrecover is finding bad bytes, use the -v option to enable verbose mode. This will probably flood your screen with text, so it is best to redirect the output to a file. Once gzrecover has finished, you will need to manually verify any data recovered, as it is quite likely that your output file is corrupt and has some garbage data in it. If your archive is a tarball, read on.
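The naming rule and options above can be sketched in shell as follows. This is an illustration under assumptions: default_recovered_name is a hypothetical helper (not part of gzrt), restored.tar and gzrecover.log are made-up file names, and -o is assumed to take the output file name as its argument.

```shell
# default_recovered_name is a hypothetical helper, not part of gzrt. It
# mirrors the naming rule described above: strip a trailing .gz, then
# append .recovered.
default_recovered_name() {
    printf '%s.recovered\n' "${1%.gz}"
}

archive="my-corrupted-backup.tar.gz"
predicted=$(default_recovered_name "$archive")
echo "default output file: $predicted"

# Only attempt a real run if gzrecover is installed and the archive exists.
# Assumption: -o takes the output file name as its argument. Both output
# streams are redirected because it is not stated which one -v writes to.
if command -v gzrecover >/dev/null 2>&1 && [ -f "$archive" ]; then
    gzrecover -v -o restored.tar "$archive" > gzrecover.log 2>&1
fi
```

Sending the verbose readout to a log file keeps the terminal usable and leaves a record of where bad bytes were found.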

For tarballs, the tar program will choke because GNU tar cannot handle errors in the file format. Fortunately, GNU cpio (tested at version 2.5 or higher) handles corrupted files out of the box.

Here's an example:

$ ls *.gz
my-corrupted-backup.tar.gz
$ gzrecover my-corrupted-backup.tar.gz
$ ls *.recovered
my-corrupted-backup.tar.recovered
$ cpio -F my-corrupted-backup.tar.recovered -i -v

If you have a previous release, please note that the patches to GNU tar have been discontinued. They were only marginally successful at best and GNU cpio does what is needed out of the box and does it far better.

Notifications of Updates

If you want to receive notifications of new releases - which are very infrequent - please subscribe to notifications on the gzrecover Freshmeat page.

Copyright

The gzip Recovery Toolkit v0.5
Copyright (c) 2002-2006 Aaron M. Renn (arenn@urbanophile.com)

The gzrecover program and tar patches are licensed under the GNU General Public License.


Copyright © 2003-2006 Aaron M. Renn (arenn@urbanophile.com) All Rights Reserved