Gzip

Why does compressing a file on standard input produce smaller output than compressing the same file given as an argument?

  • April 4, 2017

When I run:

# gzip -c foo > foo1.gz 
# gzip < foo > foo2.gz

Why does foo2.gz end up smaller than foo1.gz?

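Because gzip saves the file name and time stamp so that it can try to restore both when you decompress the file later. In your second example, foo is handed to gzip via <stdin>, so gzip has no file name and time stamp information to store.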
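One way to see this is to look at the first bytes of each archive. The sketch below assumes GNU gzip plus the usual od, dd, seq and mktemp tools, and uses a throwaway sample file named foo. Per the gzip file format (RFC 1952), the fourth header byte holds the flags, and bit 0x08 means a zero-terminated original file name follows the 10-byte fixed header.

```shell
# Work in a scratch directory (assumption: mktemp is available).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical sample input; any regular file would do.
seq 1 200 > foo

gzip -c foo > foo1.gz   # file given as an argument
gzip < foo > foo2.gz    # file given on standard input

# The flags byte sits at offset 3 of the header.
flags1=$(od -An -tx1 -j3 -N1 foo1.gz | tr -d '[:space:]')
flags2=$(od -An -tx1 -j3 -N1 foo2.gz | tr -d '[:space:]')

# When a name is stored, it starts right after the 10-byte fixed header.
stored_name=$(dd if=foo1.gz bs=1 skip=10 count=3 2>/dev/null)

echo "foo1.gz flags: $flags1, stored name: $stored_name"
echo "foo2.gz flags: $flags2"
```

With GNU gzip, foo1.gz should report the name flag set and the name foo, while foo2.gz should report no flags at all.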

From the man page:

  -n --no-name
         When compressing, do not save the original file name and time stamp by default. (The original name is always saved if the name had
         to  be truncated.) When decompressing, do not restore the original file name if present (remove only the gzip suffix from the com-
         pressed file name) and do not restore the original time stamp if present (copy it from the compressed file). This  option  is  the
         default when decompressing.

  -N --name
         When compressing, always save the original file name and time stamp; this is the default. When decompressing, restore the original
         file name and time stamp if present. This option is useful on systems which have a limit on file name  length  or  when  the  time
         stamp has been lost after a file transfer.
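The effect of the stored name on decompression can be tried out with a small experiment. This is only a sketch, assuming GNU gzip; the file names original.txt, renamed.gz and copy.gz are made up for illustration.

```shell
# Work in a scratch directory (assumption: mktemp is available).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical file names chosen for illustration.
seq 1 200 > original.txt
gzip -c original.txt > renamed.gz   # name and time stamp saved by default
rm original.txt

cp renamed.gz copy.gz

gzip -d renamed.gz      # default: only strips ".gz", giving "renamed"
gzip -d -N copy.gz      # -N: restores the stored name "original.txt"

ls
```

Both decompressed files hold the same data; only the name chosen for the output differs.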

I reproduced the issue here:

[root@xxx601 ~]# cat /etc/fstab > file.txt
[root@xxx601 ~]# gzip < file.txt > file.txt.gz
[root@xxx601 ~]# gzip -c file.txt > file2.txt.gz
[root@xxx601 ~]# ll -h file*
-rw-r--r--. 1 root root  465 May 17 19:35 file2.txt.gz
-rw-r--r--. 1 root root 1.2K May 17 19:34 file.txt
-rw-r--r--. 1 root root  456 May 17 19:34 file.txt.gz

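In my example, file.txt.gz is the equivalent of your foo2.gz. Using the -n option disables this behavior even when gzip would otherwise have access to the information: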

[root@xxx601 ~]# gzip -nc file.txt > file3.txt.gz
[root@xxx601 ~]# ll -h file*
-rw-r--r--. 1 root root  465 May 17 19:35 file2.txt.gz
-rw-r--r--. 1 root root  456 May 17 19:43 file3.txt.gz
-rw-r--r--. 1 root root 1.2K May 17 19:34 file.txt
-rw-r--r--. 1 root root  456 May 17 19:34 file.txt.gz

As you can see above, the sizes of file.txt.gz and file3.txt.gz match, because both now omit the name and date.
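The numbers in the listing also add up: 465 - 456 = 9 bytes, which is the 8 characters of "file.txt" plus one terminating NUL byte. The gzip header (RFC 1952) keeps a fixed four-byte modification-time field whether or not it is filled in, so the time stamp does not change the size. A minimal sketch to repeat the measurement, assuming GNU gzip and using generated sample content rather than /etc/fstab:

```shell
# Work in a scratch directory (assumption: mktemp is available).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

name=file.txt
seq 1 300 > "$name"     # hypothetical content; the answer used /etc/fstab

# Compressed sizes in bytes for the three invocations.
with_name=$(( $(gzip -c "$name" | wc -c) ))
from_stdin=$(( $(gzip < "$name" | wc -c) ))
no_name=$(( $(gzip -nc "$name" | wc -c) ))

# Expected overhead: length of the name plus one NUL terminator.
expected=$(( ${#name} + 1 ))
actual=$(( with_name - from_stdin ))

echo "with name: $with_name, stdin: $from_stdin, -n: $no_name"
echo "overhead: $actual bytes (expected $expected)"
```

A longer file name would make the gap proportionally larger, while the stdin and -n sizes should stay equal to each other.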

Quoted from: https://unix.stackexchange.com/questions/203977