Is “xargs -I s printf s” more compatible than “xargs -n 1 printf”?

March 14, 2021

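Summary
Is “xargs -I s printf s” more compatible than “xargs -n 1 printf”?
Background
I am processing binary data that may contain 0x00. I know how to convert binary data to text, as follows: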
# make sure that you have done this: export LC_ALL=C
od -A n -t x1 -v | # or -t o1 or -t u1 or whatever
tr ABCDEF abcdef | # because POSIX doesn't specify in which case
tr -d ' \t\n' | # because POSIX says they are delimiters
fold -w 2 |
grep . # to make sure to terminate final line with LF
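As a quick illustration of what this pipeline emits, here is a minimal sketch that wraps it in a function and feeds it three bytes. The name bin2hex is only a hypothetical label for this example, not something from the original post.

```shell
# Hypothetical wrapper around the pipeline above: one lowercase
# two-digit hex value per line, with the last line LF-terminated.
bin2hex() {
  od -A n -t x1 -v |
  tr ABCDEF abcdef |
  tr -d ' \t\n' |
  fold -w 2 |
  grep .
}

# Three input bytes: "A", NUL, LF
printf 'A\0\n' | bin2hex
```

For this input the result should be the three lines 41, 00 and 0a.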
…and here is how to convert back to binary:
# input: for each line, /^[0-9a-f]\{2\}$/
# also make sure export LC_ALL=C before
awk -v _maxlen="$(getconf ARG_MAX 2>/dev/null)" '
 BEGIN{
   # (1) make a table
   # assume that every non-null byte can be converted easily
   # actually not portable in Termux; LC_ALL=C does not work and
   # awk is gawk by default, which depends on locale.
   # to deal with it, here is alternative:
   # for(i=0;i<256;i++){
   #   xc[sprintf("%02x",i)]=sprintf("\\\\%03o",i);
   #   xl[sprintf("%02x",i)]=5;
   # }
   # # and skip to (2)
   # but why not just env -i awk to force one true awk, if so.
   # also is not it pretty rare that C locale is not available?
   for(i=1;i<256;i++){
     xc[sprintf("%02x",i)]=sprintf("%c",i);
     xl[sprintf("%02x",i)]=1;
   }
   
   # now for chars that requires special converting.
   
   # numbers; for previous char is \\ooo.
   for(i=48;i<58;i++){
     xc[sprintf("%02x",i)]=sprintf("\\\\%03o",i);
     xl[sprintf("%02x",i)]=5;
   }
   
   # and what cannot be easily passed to xargs -n 1 printf
   
   # null
   xc["00"]="\\\\000"; xl["00"]=5;
   
   # <space>
   xc["09"]="\\\\t";   xl["09"]=3;
   xc["0a"]="\\\\n";   xl["0a"]=3;
   xc["0b"]="\\\\v";   xl["0b"]=3;
   xc["0c"]="\\\\f";   xl["0c"]=3;
   xc["0d"]="\\\\r";   xl["0d"]=3;
   xc["20"]="\\\\040"; xl["20"]=5;
   
   # meta chars for printf
   xc["25"]="%%";      xl["25"]=2;
   xc["5c"]="\\\\\\\\";xl["5c"]=4;
   
   # hyphen; to prevent to be treated as if it were an option
   xc["2d"]="\\\\055"; xl["2d"]=5;
   
   # chars for quotation
   xc["22"]="\\\"";    xl["22"]=2;
   xc["27"]="\\'\''";  xl["27"]=2;
   
   # (2) preparation
   
   # reason why 4096: _POSIX_ARG_MAX
   # reason why length("printf "): because of ARG_MAX
   # reason why 4096/2 and _maxlen/2: because some xargs such as GNU specifies buffer length less than ARG_MAX
   if(_maxlen==""){
     maxlen=(4096/2)-length("printf ");
   }else{
     maxlen=int(_maxlen/2)-length("printf ");
   }
   
   ORS=""; LF=sprintf("\n");
   arglen=0;
 }
 {
   # (3) actual conversion here.
   
   # XXX. not sure why arglen+4>maxlen.
   # but I think maximum value for xl[$0] is 5.
   # and maybe final LF is 1.
   if(arglen+4>maxlen){
     print LF;
     arglen=0;
   }
   print xc[$0];
   arglen+=xl[$0];
 }
 END{
   # for some xargs who hates input w/o LF termination
   if(NR>0)print LF;
 }
' |
xargs -n 1 printf
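The script escapes every byte twice because each output line is unquoted once by xargs and then interpreted as a printf format. A minimal sketch of those two layers, using one hand-written line of the kind the awk table would produce for the bytes 41 20 25; decode_line is a hypothetical name used only here:

```shell
# Hypothetical helper: the last stage of the pipeline above.
decode_line() {
  xargs -n 1 printf
}

# What the awk table yields for the bytes 41 20 25 ("A", space, "%"):
#   A      is passed through unchanged
#   space  becomes \\040 (xargs removes one backslash, printf decodes \040)
#   %      becomes %%    (printf prints a literal percent sign)
printf '%s\n' 'A\\040%%' | decode_line
```

With GNU xargs this should print A, a space and a percent sign, with no trailing newline.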
I found a problem with empty input: on GNU/Linux, it fails as follows:
$ xargs -n 1 printf </dev/null
printf: missing operand
Try 'printf --help' for more information.
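I then found three alternatives: “xargs -n 1 printf 2>/dev/null || :”, adding if(NR==0)printf"\"\"\n"; to the END block, and “xargs -I s printf s”. I have only seen the first one actually used, in a program by ShellShoccar-jpn, but I think it is a bit heavy-handed. The second is not as clean as the last. Does the third work as a replacement not only on GNU/Linux but also in all (or most) other environments? Since I only have GNU/Linux, I do not know how to verify my idea in other environments. The simplest way would be to get the sources of the other implementations and read them, or to consult their manuals. If it cannot be verified at all, I will just have to give up.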
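A minimal sketch of the third alternative, for trying it on whatever system is at hand. It assumes that with -I the command runs once per input line, so an empty input leads to no printf call at all; that is what GNU xargs does, and other implementations would need to be checked the same way. The name run_formats is hypothetical.

```shell
# Hypothetical helper: run printf once per input line.
# -I replaces every occurrence of the replacement string (here "s") in the
# arguments; "printf" itself contains no "s", so only the format is replaced.
run_formats() {
  xargs -I s printf s
}

run_formats </dev/null              # no input line, so printf is not run
echo "exit status: $?"

printf '%s\n' 'abc' | run_formats   # one input line, one printf call
```

One caveat to keep in mind: POSIX says that arguments constructed with -I cannot grow larger than 255 bytes, so the long lines produced by the awk script may be a problem on implementations that enforce this.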
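What I know
As POSIX says, printf seems to require at least one argument.
Some xargs implementations ignore input that is not LF-terminated; “grep ^ | xargs something here” is more portable than “xargs something here” for input that may lack a final LF.
xargs is not portable for input with no non-blank lines; “printf ' \n\n' | xargs echo foo” outputs nothing on FreeBSD and foo on GNU/Linux. In that case you have to make the command run by xargs safe for such input, or let the command ignore errors.
FreeBSD's xargs receives its arguments as if they were $@, whereas GNU/Linux's receives them as if they were "$@".
Escaping with a backslash works for xargs; for example, the following prints \':
printf '\\\\\\'"'" | sed "$(printf 's/[\047\042\\]/\\\\&/g')" | xargs printf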
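The point about LF termination can be checked with a small sketch. It relies on grep writing every selected line with a trailing newline, which is the behaviour the note above depends on; ensure_final_lf is a hypothetical name.

```shell
# Hypothetical helper: pass every line through, adding a final LF if missing.
# Note that grep exits with status 1 when the input is empty.
ensure_final_lf() {
  grep ^
}

printf 'a\nb' | wc -l                     # the unterminated "b" is not counted
printf 'a\nb' | ensure_final_lf | wc -l   # both lines are now LF-terminated
```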
P.S.
I found that “xargs -E ''” is more compatible than using no option, because some xargs implementations default to -E _.
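A short sketch of why this matters for the data above: a line consisting only of an underscore would be taken as the logical end of input by an xargs that defaults to -E _. The wrapper name xargs_no_eof is hypothetical.

```shell
# Hypothetical wrapper: disable the logical end-of-file string explicitly.
xargs_no_eof() {
  xargs -E '' "$@"
}

# The first line is a lone "_"; with -E '' it is ordinary data.
printf '_\nfoo\n' | xargs_no_eof echo
```

With GNU xargs this prints both items on one line; on an implementation that defaults to -E _, plain xargs without the option would be expected to stop at the first line.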

xargs is probably the worst POSIX utility in terms of portability (and interface design). I would stay away from it. How about:
<file.hex awk -v q="'" -v ORS= '
 BEGIN{
   for (i=0; i<256; i++) c[sprintf("%02x", i)] = sprintf("\\%o", i)
 }
 NR % 50 == 1 {print sep"printf "q; sep = q"\n"}
 {print c[$0]}
 END {if (sep) print q"\n"}
' | sh
instead, for example?
The awk part outputs something like:
printf '\61\12\62\12\63\12\64\12\65\12\66\12\67\12\70\12\71\12\61\60\12\61\61\12\61\62\12\61\63\12\61\64\12\61\65\12\61\66\12\61\67\12\61\70\12\61\71\12\62\60'
printf '\12\62\61\12\62\62\12\62\63\12\62\64\12\62\65\12\62\66\12\62\67\12\62\70\12\62\71\12\63\60\12\63\61\12\63\62\12\63\63\12\63\64\12\63\65\12\63\66\12\63'
printf '\67\12\63\70\12\63\71\12\64\60\12\64\61\12\64\62\12\64\63\12\64\64\12\64\65\12\64\66\12\64\67\12\64\70\12\64\71\12\65\60\12\65\61\12\65\62\12\65\63\12'
printf '\65\64\12\65\65\12\65\66\12\65\67\12\65\70\12\65\71\12\66\60\12\66\61\12\66\62\12\66\63\12\66\64\12\66\65\12\66\66\12\66\67\12\66\70\12\66\71\12\67\60'
printf '\12\67\61\12\67\62\12\67\63\12\67\64\12\67\65\12\67\66\12\67\67\12\67\70\12\67\71\12\70\60\12\70\61\12\70\62\12\70\63\12\70\64\12\70\65\12\70\66\12\70'
printf '\67\12\70\70\12\70\71\12\71\60\12\71\61\12\71\62\12\71\63\12\71\64\12\71\65\12\71\66\12\71\67\12\71\70\12\71\71\12\61\60\60\12'
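for sh to interpret. In sh implementations where printf is a builtin, this does not fork extra processes. In those where it is not, the lines should be short enough to stay under the ARG_MAX limit, while still running no more than one printf per 50 bytes.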
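A round-trip check is one way to gain some confidence in this approach. The sketch below wraps the hex-dump pipeline from the question and the awk-to-sh decoder in two functions; the names bin2hex and hex2bin are hypothetical, and the decoder reads standard input rather than file.hex.

```shell
# Hypothetical wrapper: binary to one hex byte per line (from the question).
bin2hex() {
  od -A n -t x1 -v | tr ABCDEF abcdef | tr -d ' \t\n' | fold -w 2 | grep .
}

# Hypothetical wrapper: hex lines back to binary, 50 bytes per printf call.
hex2bin() {
  awk -v q="'" -v ORS= '
    BEGIN {
      for (i = 0; i < 256; i++) c[sprintf("%02x", i)] = sprintf("\\%o", i)
    }
    NR % 50 == 1 { print sep "printf " q; sep = q "\n" }
    { print c[$0] }
    END { if (sep) print q "\n" }
  ' | sh
}

# Awkward bytes: space, NUL, percent, backslash, LF, hyphen.
printf 'a b\0%%\\\n-' | bin2hex | hex2bin | od -A n -t x1 -v
```

Empty input needs no special case here: awk prints nothing, so sh has nothing to run.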
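Note that you cannot really determine the maximum length of a command line from the value of ARG_MAX alone. How that limit is reached and handled depends greatly on the system and its version. On many systems, ARG_MAX is a limit on the cumulative size, which includes the argv[] and envp[] lists of pointers (typically 8 bytes per argument or environment variable on 64-bit systems) plus the size in bytes of each argument and environment string (including the NUL delimiter). Linux also has an independent limit on the size of a single argument.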
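To get a feel for that accounting, here is a rough sketch that estimates how much of the limit the current environment already uses. It assumes 8-byte pointers and counts the newline that env prints after each entry in place of the NUL terminator; values that contain newlines make the count inexact, and the real rules differ between systems. The name estimate_env_overhead is hypothetical.

```shell
# Hypothetical helper: approximate bytes taken by the environment,
# as string bytes plus one assumed 8-byte pointer per variable.
estimate_env_overhead() {
  env_bytes=$(env | wc -c)
  env_count=$(env | wc -l)
  echo $(( $env_bytes + 8 * $env_count ))
}

echo "ARG_MAX:            $(getconf ARG_MAX)"
echo "environment (est.): $(estimate_env_overhead)"
```

The difference between the two numbers is only an upper bound on what is left for arguments, which is one reason to keep generated command lines well below ARG_MAX.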
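Also note that replacing \12 with \n, for example, would only be valid on ASCII-based systems. POSIX does not specify the encoding of characters (other than NUL). There are still POSIX systems that use some variant of EBCDIC instead of ASCII.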

Source: https://unix.stackexchange.com/questions/636673
