Linux

busybox tar or split a big file into multiple smaller files

  • April 27, 2015

I am running Linux version 3.4.8 on an at91sam9g20.

I want to take a large recording and split it into multiple files. I have tried several options, but none of them seem to work, for example:

tar -c -M --tape-length=102400 --file=disk1.tar mytest.tar.gz
tar: invalid option -- M
BusyBox v1.20.2 (2012-09-24 16:21:25 CEST) multi-call binary.

Usage: tar -[cxthvO] [-X FILE] [-T FILE] [-f TARFILE] [-C DIR] [FILE]...

Create, extract, or list files from a tar file

Operation:
       c       Create
       x       Extract
       t       List
       f       Name of TARFILE ('-' for stdin/out)
       C       Change to DIR before operation
       v       Verbose
       O       Extract to stdout
       h       Follow symlinks
       exclude File to exclude
       X       File with names to exclude
       T       File with names to include

It seems that busybox ships a stripped-down version of tar that does not support some of these options.

When I try split, I get the following:

/:# split
-sh: split: not found

Is there a way to split a big file into multiple files using only the busybox command set?

Currently defined functions:
       [, [[, addgroup, adduser, ar, arping, ash, awk, basename, blkid,
       bunzip2, bzcat, cat, catv, chattr, chgrp, chmod, chown, chroot, chrt,
       chvt, cksum, clear, cmp, cp, cpio, crond, crontab, cut, date, dc, dd,
       deallocvt, delgroup, deluser, devmem, df, diff, dirname, dmesg, dnsd,
       dnsdomainname, dos2unix, du, dumpkmap, echo, egrep, eject, env,
       ether-wake, expr, false, fdflush, fdformat, fgrep, find, fold, free,
       freeramdisk, fsck, fuser, getopt, getty, grep, gunzip, gzip, halt,
       hdparm, head, hexdump, hostid, hostname, hwclock, id, ifconfig, ifdown,
       ifup, inetd, init, insmod, install, ip, ipaddr, ipcrm, ipcs, iplink,
       iproute, iprule, iptunnel, kill, killall, killall5, klogd, last, less,
       linux32, linux64, linuxrc, ln, loadfont, loadkmap, logger, login,
       logname, losetup, ls, lsattr, lsmod, lsof, lspci, lsusb, lzcat, lzma,
       makedevs, md5sum, mdev, mesg, microcom, mkdir, mkfifo, mknod, mkswap,
       mktemp, modprobe, more, mount, mountpoint, mt, mv, nameif, netstat,
       nice, nohup, nslookup, od, openvt, passwd, patch, pidof, ping,
       pipe_progress, pivot_root, poweroff, printenv, printf, ps, pwd, rdate,
       readlink, readprofile, realpath, reboot, renice, reset, resize, rm,
       rmdir, rmmod, route, run-parts, runlevel, sed, seq, setarch,
       setconsole, setkeycodes, setlogcons, setserial, setsid, sh, sha1sum,
       sha256sum, sha512sum, sleep, sort, start-stop-daemon, strings, stty,
       su, sulogin, swapoff, swapon, switch_root, sync, sysctl, syslogd, tail,
       tar, tee, telnet, test, tftp, time, top, touch, tr, traceroute, true,
       tty, udhcpc, umount, uname, uniq, unix2dos, unlzma, unxz, unzip,
       uptime, usleep, uudecode, uuencode, vconfig, vi, vlock, watch,
       watchdog, wc, wget, which, who, whoami, xargs, xz, xzcat, yes, zcat

You can use the dd busybox applet with the bs, count and skip parameters to split a big file into chunks.

The dd section from the busybox man page:

dd [if=FILE] [of=FILE] [ibs=N] [obs=N] [bs=N] [count=N] [skip=N]
       [seek=N] [conv=notrunc|noerror|sync|fsync]

       Copy a file with converting and formatting

               if=FILE         Read from FILE instead of stdin
               of=FILE         Write to FILE instead of stdout
               bs=N            Read and write N bytes at a time
               ibs=N           Read N bytes at a time
               obs=N           Write N bytes at a time
               count=N         Copy only N input blocks
               skip=N          Skip N input blocks
               seek=N          Skip N output blocks
               conv=notrunc    Don't truncate output file
               conv=noerror    Continue after read errors
               conv=sync       Pad blocks with zeros
               conv=fsync      Physically write data out before finishing

So basically you would do something like this:

$ dd if=bigfile of=part.0 bs=1024 count=1024 skip=0
$ dd if=bigfile of=part.1 bs=1024 count=1024 skip=1024
$ dd if=bigfile of=part.2 bs=1024 count=1024 skip=2048

For each part.X file, dd writes count * bs bytes, skipping the first skip * bs bytes of the input file.
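
To convince yourself that the offsets line up, you can checksum a slice of the original and compare it against the corresponding part. This is only a sketch using the md5sum applet from the list above; bigfile and part.1 are the names from the example commands:

$ dd if=bigfile bs=1024 skip=1024 count=1024 2>/dev/null | md5sum
$ md5sum part.1

The two checksums should be identical.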

A very basic one-liner (combining the seq, xargs and dd applets from busybox) could look like this:

seq 0 19 | xargs -n1 sh -c 'dd if=bigfile of=part.$0 bs=1024 count=1024 skip=$(expr $0 \* 1024)'

This generates up to 20 part.X files with a size of 1048576 bytes each.

Example of splitting bigfile:

$ ls -l
total 2940
-rw-rw-r-- 1 user user 3000000 Apr 27 13:21 bigfile

$ seq 0 19 | xargs -n1 sh -c 'dd if=bigfile of=part.$0 bs=1024 count=1024 skip=$(expr $0 \* 1024)'
1024+0 records in
1024+0 records out
1024+0 records in
1024+0 records out
881+1 records in
881+1 records out
0+0 records in
0+0 records out
[...]

$ ls -l
total 5968
-rw-rw-r-- 1 user user 3000000 Apr 27 13:21 bigfile
-rw-rw-r-- 1 user user 1048576 Apr 27 13:43 part.0
-rw-rw-r-- 1 user user 1048576 Apr 27 13:43 part.1
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.10
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.11
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.12
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.13
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.14
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.15
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.16
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.17
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.18
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.19
-rw-rw-r-- 1 user user  902848 Apr 27 13:43 part.2
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.3
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.4
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.5
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.6
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.7
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.8
-rw-rw-r-- 1 user user       0 Apr 27 13:43 part.9

Reassembling the file can easily be done with cat (or again with dd, using the seek parameter). The 0-byte files can simply be skipped:

$ cat part.0 part.1 part.2 > bigfile.res
$ diff bigfile bigfile.res
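
For reference, the dd variant with seek mentioned above could look like this. It is a sketch under the same assumptions as the example (1024-byte blocks and the part.X names); conv=notrunc keeps the already written parts from being truncated:

$ dd if=part.0 of=bigfile.res bs=1024 seek=0 conv=notrunc
$ dd if=part.1 of=bigfile.res bs=1024 seek=1024 conv=notrunc
$ dd if=part.2 of=bigfile.res bs=1024 seek=2048 conv=notrunc

The seek offsets mirror the skip offsets used when splitting.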

Depending on your needs, you would probably not use seq and calculate with the specific size of your big file, but rather do all of this in a shell script, as sketched below.
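
A minimal sketch of such a script, assuming 1 MiB parts and using only applets from the applet list above (sh, dd, wc, expr). The part.$i naming mirrors the examples; the block size and part size are arbitrary choices, not something prescribed by busybox:

#!/bin/sh
# Sketch: split the file given as $1 into 1 MiB pieces part.0, part.1, ...
FILE=$1
BS=1024                  # bytes per block
BLOCKS=1024              # blocks per part -> 1048576 bytes per part
SIZE=$(wc -c < "$FILE")  # total size of the input file in bytes
i=0
while [ $(expr $i \* $BS \* $BLOCKS) -lt $SIZE ]; do
    dd if="$FILE" of="part.$i" bs=$BS count=$BLOCKS skip=$(expr $i \* $BLOCKS)
    i=$(expr $i + 1)
done

Because the loop stops as soon as the next offset is past the end of the input, it does not create the trailing 0-byte files that the seq example produces.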

Quoted from: https://unix.stackexchange.com/questions/198832