Chapter 8: Archiving & Compression

    The following were removed from Chapter 8: “Archiving & Compression”.

    Get the Best Compression Possible with gzip

    gzip -[1-9]

    Just as with zip, it’s possible to adjust the level of compression that gzip uses when it does its job. The gzip command uses a scale from 1 to 9, in which 1 means “do the job quickly, but don’t bother compressing very much,” and 9 means “compress the heck out of the files, and I’ll wait as long as I need to.” Unlike zip, GNU gzip has no 0 (“no compression at all”) level. The default is 6, but modern computers are fast enough that it’s probably just fine to use 9 all the time.

    $ ls -l
    -rw-r--r-- scott scott 1236574 moby-dick.txt
    $ gzip -c -1 moby-dick.txt > moby-dick.txt.gz
    $ ls -l
    -rw-r--r-- scott scott 1236574 moby-dick.txt
    -rw-r--r-- scott scott  571005 moby-dick.txt.gz
    $ gzip -c -9 moby-dick.txt > moby-dick.txt.gz
    $ ls -l
    -rw-r--r-- scott scott 1236574 moby-dick.txt
    -rw-r--r-- scott scott  487585 moby-dick.txt.gz

    Remember to use the -c option and redirect the output into the actual .gz file, because gzip otherwise replaces the original file with the compressed one, as discussed in “Archive and Compress Files Using gzip.”

    Note

    If you want to be clever, define an alias in your .bash_aliases file that looks like this:

    alias gzip='gzip -9'

    That way, you’ll always use -9 and won’t have to think about it.

    Get the Best Compression Possible with bzip2

    bzip2 -[1-9]

    Just as with zip and gzip, it’s possible to adjust the level of compression that bzip2 uses when it does its job. The bzip2 command uses a scale from 1 to 9, in which each number sets the size of the blocks bzip2 works on, from 100KB at 1 to 900KB at 9. So 1 means “use less memory, but don’t bother compressing very much,” and 9 means “compress the heck out of the files.” There is no 0, and the default is already 9, so you need a lower number only when memory is tight.

    $ ls -l
    -rw-r--r-- scott scott 1236574 moby-dick.txt
    $ bzip2 -k -1 moby-dick.txt
    $ ls -l
    -rw-r--r-- scott scott 1236574 moby-dick.txt
    -rw-r--r-- scott scott  424084 moby-dick.txt.bz2
    $ bzip2 -k -f -9 moby-dick.txt
    $ ls -l
    -rw-r--r-- scott scott 1236574 moby-dick.txt
    -rw-r--r-- scott scott  367248 moby-dick.txt.bz2

    From 424KB with -1 to 367KB with -9: that’s quite a difference! Also notice the difference in ultimate file size between gzip and bzip2. At -9, gzip compressed moby-dick.txt down to 488KB, while bzip2 mashed it even further to 367KB. The bzip2 command is noticeably slower than the gzip command, but on a fast machine that difference amounts to two or three seconds, which frankly isn’t much to worry about.
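
    To repeat the comparison on your own machine, something like the following sketch could serve as a starting point. It assumes gzip and bzip2 both accept -c and a numeric level (they do in the versions I know of), uses generated sample data in place of moby-dick.txt, and measures time only to the nearest second with date +%s, so treat the timings as a rough indication. packed_size is a made-up helper name.

```shell
# packed_size TOOL LEVEL FILE: print the compressed size of FILE in bytes.
# (packed_size is a hypothetical helper name.)
packed_size() {
    "$1" -c -"$2" "$3" | wc -c | tr -d '[:space:]'
}

# Throwaway sample data of roughly 1.3MB; substitute your own file.
sample=$(mktemp)
seq 1 200000 > "$sample"

for tool in gzip bzip2; do
    # Skip a tool that is not installed rather than failing.
    command -v "$tool" >/dev/null 2>&1 || continue
    for level in 1 9; do
        start=$(date +%s)
        size=$(packed_size "$tool" "$level" "$sample")
        end=$(date +%s)
        echo "$tool -$level: $size bytes, about $((end - start))s"
    done
done

rm -f "$sample"
```

    On a small file the elapsed time will often print as 0s for both tools; a larger input makes the speed gap easier to see.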

    Note

    Because bzip2 already uses -9 by default, you don’t strictly need an alias, but if you want to make the choice explicit, define one in your .bash_aliases file that looks like this:

    alias bzip2='bzip2 -9'

    That way, you’ll always use -9 and won’t have to think about it.