Text-Processing
Erase a 2-line pattern with sed/grep/whatever
I have a huge cvs log file, already cleaned of useless information, that reads like:
    Working file: unmodifiedfile1.c
    ================
    Working file: modifiedfile1.h
    ----------------------------------
    revision 1.3
    Fixed some bug
    ================
    Working file: unmodifiedfile2.h
    ================
    Working file: modifiedfile2.h
    ----------------------------------
    revision 1.1
    Added some feature
    ================
    Working file: unmodifiedfile3.h
I want to clean out the lines that relate to unmodified files:
    Working file: modifiedfile1.h
    ----------------------------------
    revision 1.3
    Fixed some bug
    ================
    Working file: modifiedfile2.h
    ----------------------------------
    revision 1.1
    Added some feature
    ================
The pattern to match is

    Working file: FILENAME
    ================

What I have been able to do so far is the following:
    sed '/Working file:/ N ; s/\n/PLACEHOLDER/' changelog.txt | grep -v 'PLACEHOLDER===' | sed 's/PLACEHOLDER/\n/'
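I am sure there is a cleaner solution, but my ignorance of sed keeps me from finding it. (As a bonus, it would also be nice to be able to delete the last line, if necessary.)

P.S.
An output ending with:

    ================
    Working file: unmodifiedfile3.h

is also acceptable.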
    sed '/Working file:/ N ; s/\n/PLACEHOLDER/' changelog.txt | grep -v 'PLACEHOLDER===' | sed 's/PLACEHOLDER/\n/'
can indeed be shortened to:
    $ sed '/Working file:/{N;/===/d}' changelog.txt
    Working file: modifiedfile1.h
    ----------------------------------
    revision 1.3
    Fixed some bug
    ================
    Working file: modifiedfile2.h
    ----------------------------------
    revision 1.1
    Added some feature
    ================
    Working file: unmodifiedfile3.h
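- /Working file:/{N;/===/d} deletes every line containing "Working file:" together with the line after it, if that following line contains ===
- ${/Working file:/d} deletes the last line if it contains "Working file:"
- Thanks to @ilkkachu for the suggestion: use ^Working file: if the pattern needs to match at the start of the line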
    $ cat ip.txt
    Working file: 123
    ================
    Working file: f1
    ----------------------------------
    revision 1.3
    Fixed some bug
    ================
    Working file: abc
    ================
    Working file: file
    ----------------------------------
    revision 1.1
    Added some feature
    ================
    Working file: xyz

    $ sed '/Working file:/{N;/===/d}' ip.txt | sed '${/Working file:/d}'
    Working file: f1
    ----------------------------------
    revision 1.3
    Fixed some bug
    ================
    Working file: file
    ----------------------------------
    revision 1.1
    Added some feature
    ================
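The two sed passes above can probably be folded into one. What follows is only a minimal sketch, assuming separator lines consist solely of = characters and that a trailing "Working file:" line is always unwanted; the clean_log function and the sample.log file name are made up for illustration. Testing $ before N also avoids relying on what N does on the last line: GNU sed prints the pattern space there, while other implementations may exit without printing it.

```shell
# Sketch: drop each "Working file:" line that is followed by a separator,
# and drop a "Working file:" line that is the very last line of the input.
clean_log() {
  sed '/^Working file:/{
    $d
    N
    /\n=\{3,\}$/d
  }' "$1"
}

# Small sample in the same shape as ip.txt (hypothetical file name).
cat > sample.log <<'EOF'
Working file: 123
================
Working file: f1
----------------------------------
revision 1.3
Fixed some bug
================
Working file: xyz
EOF

clean_log sample.log
```

The pattern is anchored with ^ and the separator test is written as /\n=\{3,\}$/ so that a commit message that merely contains === somewhere is not mistaken for a separator.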
This should be close to what you are after:
<cvslog sed -n '/Working file/ { N; /\n=\+$/b; :a; N; /\n=\+$/!ba; p; }'
Output:
    Working file: modifiedfile1.h
    ----------------------------------
    revision 1.3
    Fixed some bug
    ================
    Working file: modifiedfile2.h
    ----------------------------------
    revision 1.1
    Added some feature
    ================
Explanation
sed
Here is the same script with comments:

    /Working file/ {
      # append next line to pattern space
      N
      # is it a file separator -> next file
      /\n=\+$/b
      :a
      # append next line to pattern space
      N
      # isn't it a file separator -> read next line
      /\n=\+$/!ba
      # otherwise print accumulated text
      p
    }
awk
If you tell awk to use the file separator line as the record separator (RS), defining a sensible selection criterion becomes fairly simple:

    <cvslog awk 'NF>2' RS='\n=+\n' FS='\n' ORS='\n\n'
Output:
    Working file: modifiedfile1.h
    ----------------------------------
    revision 1.3
    Fixed some bug

    Working file: modifiedfile2.h
    ----------------------------------
    revision 1.1
    Added some feature
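A regular-expression RS is, as far as I know, an extension (gawk and mawk support it) and not something POSIX awk guarantees. If that matters, a line-by-line variant along these lines might do. It is only a sketch: the clean_log_awk function, the held variable and the sample_awk.log file name are all made up for illustration, and unlike the RS version it keeps the separator lines in the output.

```shell
# Sketch: hold each "Working file:" line until the next line is known.
# If the next line is a separator, drop both; otherwise print the held line.
clean_log_awk() {
  awk '
    held != "" {
      if ($0 ~ /^=+$/) { held = ""; next }   # unmodified file: drop both lines
      print held                             # modified file: release the header
      held = ""
    }
    /^Working file:/ { held = $0; next }     # postpone the decision by one line
    { print }
  ' "$1"
}

# Small sample in the same shape as the log above (hypothetical file name).
cat > sample_awk.log <<'EOF'
Working file: unmodifiedfile1.c
================
Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug
================
Working file: unmodifiedfile3.h
EOF

clean_log_awk sample_awk.log
```

Because there is no END rule, a "Working file:" line still held at end of input is never printed, which also takes care of the trailing unmodified entry.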
bash and coreutils
Just for fun:
    csplit cvslog '/=\{16\}/1' '{*}'
    wc -l xx* | head -n-1 | while read n f; do
      if (( n > 2 )); then
        cat "$f"
      fi
    done
Output:
    Working file: modifiedfile1.h
    ----------------------------------
    revision 1.3
    Fixed some bug
    ================
    Working file: modifiedfile2.h
    ----------------------------------
    revision 1.1
    Added some feature
    ================
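The csplit approach leaves its xx* pieces in the current directory and, without -s, also prints the size of each piece. One possible way to tidy that up is sketched below. It assumes GNU csplit (for the '{*}' repeat count) and a separator of exactly 16 = characters as in the log above; the clean_log_csplit function and the sample_csplit.log file name are made up for illustration.

```shell
# Sketch: split in a scratch directory, print only pieces longer than
# two lines, then remove the scratch directory again.
clean_log_csplit() {
  # csplit runs inside the scratch directory, so it needs an absolute path.
  log=$(cd "$(dirname "$1")" && pwd)/$(basename "$1")
  tmp=$(mktemp -d) || return 1
  (
    cd "$tmp" || exit 1
    csplit -s "$log" '/=\{16\}/1' '{*}'     # -s: do not print piece sizes
    for f in xx*; do
      if [ "$(wc -l < "$f")" -gt 2 ]; then  # more than header + separator
        cat "$f"
      fi
    done
  )
  rm -rf "$tmp"
}

# Small sample in the same shape as the log above (hypothetical file name).
cat > sample_csplit.log <<'EOF'
Working file: unmodifiedfile1.c
================
Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug
================
Working file: unmodifiedfile3.h
EOF

clean_log_csplit sample_csplit.log
```

Looping over the pieces directly, instead of parsing wc output, means the "total" line no longer has to be cut off with head.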