rzip

rzip is a data compression program designed to exploit long-distance redundancy in large files. It uses bzip2 as its final compression stage.

Compression Algorithm

rzip operates in two stages. The first stage finds and encodes large chunks of duplicated data over potentially very long distances (up to roughly 900 MB) in the input file. The second stage compresses the output of the first stage with a standard compression algorithm (bzip2).
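The two stages can be illustrated with a small sketch. This is not rzip's actual code or file format: the token layout, the `BLOCK` size, and the function names `find_repeats`, `serialize`, `compress`, and `decompress` are all assumptions made for illustration, and a plain dictionary lookup stands in for rzip's hashing. Only the overall shape (replace long repeats with references, then hand the result to bzip2) follows the description above.

```python
import bz2
import struct

BLOCK = 32  # hypothetical minimum match length, for illustration only


def find_repeats(data: bytes) -> list:
    """Stage one: turn repeated chunks into ("ref", offset, length) tokens."""
    index = {}            # BLOCK-byte chunk -> first offset where it appeared
    tokens = []
    literal = bytearray()
    i = 0
    while i < len(data):
        chunk = data[i:i + BLOCK]
        src = index.get(chunk) if len(chunk) == BLOCK else None
        if src is None:
            # No earlier copy known: remember this chunk, emit one literal byte.
            if len(chunk) == BLOCK:
                index[chunk] = i
            literal.append(data[i])
            i += 1
            continue
        # Extend the match for as long as the earlier data keeps agreeing.
        n = BLOCK
        while i + n < len(data) and data[src + n] == data[i + n]:
            n += 1
        if literal:
            tokens.append(("lit", bytes(literal)))
            literal.clear()
        tokens.append(("ref", src, n))
        i += n
    if literal:
        tokens.append(("lit", bytes(literal)))
    return tokens


def serialize(tokens) -> bytes:
    """Flatten tokens into a made-up byte format (not rzip's)."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out += b"L" + struct.pack("<I", len(tok[1])) + tok[1]
        else:
            out += b"R" + struct.pack("<II", tok[1], tok[2])
    return bytes(out)


def compress(data: bytes) -> bytes:
    """Stage two: compress the stage-one output with bzip2."""
    return bz2.compress(serialize(find_repeats(data)))


def decompress(blob: bytes) -> bytes:
    raw = bz2.decompress(blob)
    out = bytearray()
    pos = 0
    while pos < len(raw):
        if raw[pos:pos + 1] == b"L":
            (n,) = struct.unpack_from("<I", raw, pos + 1)
            out += raw[pos + 5:pos + 5 + n]
            pos += 5 + n
        else:
            src, n = struct.unpack_from("<II", raw, pos + 1)
            # Copy byte by byte so a reference may overlap its own output.
            for k in range(n):
                out.append(out[src + k])
            pos += 9
    return bytes(out)
```

Calling `decompress(compress(data))` should return the original `data`. Because the index in this sketch keeps every chunk it has seen, its memory use grows with the input, which hints at why a real implementation with a very large history needs a lot of RAM.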

It is quite common to need to compress files that contain long-distance redundancies. For example, when compressing a set of home directories, several users might have copies of the same file, or of quite similar files. It is also common for a single file to contain large duplicated chunks over long distances, such as a PDF file containing repeated copies of the same image. Most compression programs cannot take advantage of this redundancy, and so may achieve a much lower compression ratio than rzip.

The first stage is based on the rolling-checksum matching technique used by rsync.
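rsync finds matching data with a weak checksum that can be updated in constant time as a fixed-size window slides forward by one byte, so every offset in a file can be checked cheaply. The sketch below shows that rolling update in the style of rsync's weak checksum. The function names `weak_checksum` and `roll` and the 16-bit modulus are choices made for this illustration, and rzip's own hash may differ in its details.

```python
M = 1 << 16  # both sums are kept modulo 2**16


def weak_checksum(window: bytes):
    """Compute the two-part checksum of a window from scratch."""
    n = len(window)
    a = sum(window) % M
    # Earlier bytes get larger weights: n for the first byte, 1 for the last.
    b = sum((n - i) * x for i, x in enumerate(window)) % M
    return a, b


def roll(a: int, b: int, outgoing: int, incoming: int, n: int):
    """Slide an n-byte window one byte forward without rescanning it."""
    a = (a - outgoing + incoming) % M
    b = (b - n * outgoing + a) % M
    return a, b
```

Starting from `weak_checksum(data[:n])`, each call `roll(a, b, data[k - 1], data[k + n - 1], n)` should give the same pair as recomputing `weak_checksum(data[k:k + n])`. A matcher would look the rolled value up in a table of previously seen blocks and confirm any hit by comparing the bytes themselves.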

Advantages

The key difference between rzip and other well-known compression algorithms is its ability to take advantage of very long distance redundancy. The deflate algorithm used in gzip has a maximum history buffer of 32 KB. The BWT block-sorting algorithm used in bzip2 is limited to 900 KB of history. The history buffer in rzip can be up to 900 MB long, several orders of magnitude larger than that of gzip or bzip2.
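The effect of these history limits can be observed with Python's standard `zlib` (deflate) and `bz2` modules. The sketch below compresses a block of random bytes followed by an exact copy of itself, so the only redundancy available is the copy. The helper name `doubled_size` and the block sizes are assumptions chosen for illustration.

```python
import bz2
import os
import zlib


def doubled_size(compress, block_len: int) -> float:
    """Compress one random block followed by an exact copy of itself.

    Returns the compressed size divided by block_len: a value near 1.0
    means the copy was stored almost for free, a value near 2.0 means
    the compressor did not notice the repeat at all.
    """
    block = os.urandom(block_len)
    return len(compress(block + block)) / block_len


if __name__ == "__main__":
    cases = [
        ("deflate, copy 8 KB back", zlib.compress, 8 * 1024),
        ("deflate, copy 64 KB back", zlib.compress, 64 * 1024),
        ("bzip2, copy 1 MB back", bz2.compress, 1_000_000),
    ]
    for label, compress, block_len in cases:
        print(f"{label}: {doubled_size(compress, block_len):.2f}")
```

When the copy lies within deflate's 32 KB window the ratio should come out close to 1; once the copy is further back than the window (64 KB for deflate, 1 MB for bzip2's 900 KB blocks) it should come out close to 2. That second case is the kind of redundancy rzip's first stage is designed to remove.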

Disadvantages

rzip is not for everyone. Its two biggest disadvantages are that it cannot be used in a pipeline (it cannot read from standard input or write to standard output) and that it uses a lot of memory. A typical compression run on a large file might use a couple of hundred MB of RAM. If you have RAM to spare and want the best possible compression ratio, rzip is probably a good choice; otherwise stick with bzip2 or gzip.

History

rzip was originally written by Andrew Tridgell as part of his PhD research.

Content is distributed under the GNU Free Documentation License.