gzrecover - Recover data from a corrupted gzip file

gzrecover attempts to extract any readable data from a corrupted gzip
file.

*****************************************************************************
ATTENTION!!!!  99% of "corrupted" gzip archives are caused by transferring
the file via FTP in ASCII mode instead of binary mode.  Please re-transfer
the file in the correct mode first before attempting to recover from a file
you believe is corrupted.
*****************************************************************************
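
Before attempting recovery, it can help to confirm that the file really is
damaged.  The sketch below uses the -t (test) option of the standard gzip
program, which is not part of gzrecover.  The check_gzip helper name is made
up for illustration.

```shell
# Hypothetical helper (not part of gzrecover): report whether a file
# passes gzip's own integrity test.  "gzip -t" decompresses the file
# without writing any output and exits non-zero on any error.
check_gzip() {
    if gzip -t "$1" 2>/dev/null; then
        echo "intact"
    else
        echo "damaged or unreadable"
    fi
}
```

For example, "check_gzip foo.tar.gz" reports whether foo.tar.gz decompresses
cleanly.  A missing file is also reported as damaged or unreadable.  If a
freshly re-transferred copy is reported intact, no recovery is needed.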

It is highly likely that not all data in the file will be successfully
retrieved.  In the event that the compressed file was a tar archive, the
standard tar program will probably not be able to extract all of the 
files in the recovered file.  I have submitted patches to the tar maintainer
that will add a `--recover' option to tar to do this for you.  See below
for more info.

INSTALLATION:

To build gzrecover, type "make" at the command line.  This will build the
gzrecover executable.

gzrecover relies on the zlib compression library, which is not included in
the distribution.  If you need it, you can download it from
http://www.gzip.org/zlib/.  zlib must be installed before you build gzrecover.

gzrecover relies on memory-mapped files, so you must compile and run it
on a system that supports mmap().

To install the executable, copy it into the directory of your choice.  For
example:

cp gzrecover /usr/local/bin

USAGE:

gzrecover [ -hsv ] [-o <filename>] <filename> 

By default, gzrecover writes its output to <filename>.recovered.  If the
original filename ended in .gz, that extension is removed.  Options include:

-o <name> - Sets the output file name
-s        - Splits each recovered segment into its own file,
            with numeric suffixes (.1, .2, etc) (UNTESTED)
-h        - Print the help message
-v        - Verbose logging on
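
The default naming rule described above can be sketched in shell as follows.
This is only an illustration of the rule, not code taken from gzrecover, and
the default_output_name helper is a made-up name.

```shell
# Hypothetical helper mirroring the default output naming rule:
# strip a trailing ".gz" if present, then append ".recovered".
default_output_name() {
    printf '%s.recovered\n' "${1%.gz}"
}

default_output_name foo.tar.gz    # prints foo.tar.recovered
default_output_name backup.dat    # prints backup.dat.recovered
```

Use the -o option if you want a different output name.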

Running gzrecover on an uncorrupted gzip file should simply uncompress it.
However, substituting gzrecover for gzip on a regular basis is not
recommended.

Any recovered data should be manually verified for validity. 

RECOVERING TAR FILES:

If your .gz file is a tar archive, it is likely the recovered file cannot
be processed by tar out of the box.  I have written a patch to the GNU
tar program that will enable it to recover anything that might be
in the file.  As with gzrecover, there are no guarantees, false files will
likely be generated, and anything extracted should be manually verified
for correctness.

First, download the latest development version of tar, 1.13.25, from
ftp://alpha.gnu.org/gnu/tar/tar-1.13.25.tar.gz.  Extract the file, which
will unpack into a directory called tar-1.13.25.  Then:

cd tar-1.13.25/src
patch < <full path to tar-recovery.patch>

Then just build and install GNU tar as the package directs.  I recommend
installing it somewhere private and not overwriting the tar on your system.

To extract files, use the --recover option:

tar --recover -xvf <filename from gzrecover output> 

Note that this extraction will be extremely slow.  I recommend redirecting
the output to a file.
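
One way to capture that output is a small wrapper like the sketch below.
The run_logged name and the extract.log file name are assumptions chosen for
illustration, and the commented example assumes that "tar" refers to your
patched build.

```shell
# Hypothetical wrapper: run a command with both stdout and stderr
# redirected to a log file.  The first argument is the log file; the
# rest is the command to run.  The command's exit status is returned.
run_logged() {
    log=$1
    shift
    "$@" > "$log" 2>&1
}

# Example (requires the patched tar, so it is left commented out):
# run_logged extract.log tar --recover -xvf foo.tar.recovered
```

You can then review extract.log after the extraction finishes.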

PUTTING IT ALL TOGETHER:

Suppose your file foo.tar.gz is on a tape with bad data.  To recover it,
copy the file from the tape to foo.tar.gz and run:

gzrecover foo.tar.gz
tar --recover -xvf foo.tar.recovered

No guarantees, but I hope this helps you as much as it helped me!

COPYRIGHT NOTICE (gzrecover.c):

gzrecover was written by Aaron M. Renn (arenn@urbanophile.com)
Copyright (c) 2002-2003 Aaron M. Renn. 

  This software is provided 'as-is', without any express or implied
  warranty.  In no event will the author be held liable for any damages
  arising from the use of this software.

  Permission is granted to anyone to use this software for any purpose,
  including commercial applications, and to alter it and redistribute it
  freely, subject to the following restrictions:

  1. The origin of this software must not be misrepresented; you must not
     claim that you wrote the original software. If you use this software
     in a product, an acknowledgment in the product documentation would be
     appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be
     misrepresented as being the original software.
  3. This notice may not be removed or altered from any source distribution.

  Aaron M. Renn
  arenn@urbanophile.com

If you use the zlib library in a product, I would appreciate *not*
receiving lengthy legal documents to sign. The sources are provided
for free but without warranty of any kind.

If you redistribute modified sources, I would appreciate that you include
in the file ChangeLog history information documenting your changes.

COPYRIGHT NOTICE (tar patches):

The patches to the GNU tar program were written by Aaron M. Renn.
Copyright (c) 2002-2003 Aaron M. Renn (arenn@urbanophile.com)

This code is licensed under the same GNU General Public License v2
(or at your option, any later version) as GNU tar.  See
http://www.gnu.org/licenses/gpl.html

