Text-Processing
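Selectively deleting unwanted lines from multiple files
I need to remove unnecessary data from multiple output files. Part of one file looks like this: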
# BLASTN 2.3.0+
# Query: M03117:99:000000000-ALL7G:1:1101:18697:4431_2:N:0:196
# Database: /home/alex/blast/db/tryps_ITS/tryps_ITS_db
# 0 hits found
# BLASTN 2.3.0+
# Query: M03117:99:000000000-ALL7G:1:1101:26276:5181_1:N:0:196
# Database: /home/alex/blast/db/tryps_ITS/tryps_ITS_db
# Fields: subject title, query acc., subject acc., evalue, q. start, q. end, s. start, s. end
# 1 hits found
FJ712717_(modified) Trypanosoma brucei brucei from mouse 18S ribosomal RNA gene, partial sequence; internal transcribed spacer 1, complete sequence; and 5.8S ribosomal RNA gene, partial sequence M03117:99:000000000-ALL7G:1:1101:26276:5181_1:N:0:196 FJ712717_(modified) 1.42e-137 1 271 53 323
# BLASTN 2.3.0+
# Query: M03117:99:000000000-ALL7G:1:1101:26276:5181_2:N:0:196
# Database: /home/alex/blast/db/tryps_ITS/tryps_ITS_db
# Fields: subject title, query acc., subject acc., evalue, q. start, q. end, s. start, s. end
# 1 hits found
FJ712717_(modified) Trypanosoma brucei brucei from mouse 18S ribosomal RNA gene, partial sequence; internal transcribed spacer 1, complete sequence; and 5.8S ribosomal RNA gene, partial sequence M03117:99:000000000-ALL7G:1:1101:26276:5181_2:N:0:196 FJ712717_(modified) 1.06e-87 1 197 436 236
# BLASTN 2.3.0+
# Query: M03117:99:000000000-ALL7G:1:1101:10339:5290_1:N:0:196
# Database: /home/alex/blast/db/tryps_ITS/tryps_ITS_db
# 0 hits found
# BLASTN 2.3.0+
# Query: M03117:99:000000000-ALL7G:1:1101:10339:5290_2:N:0:196
# Database: /home/alex/blast/db/tryps_ITS/tryps_ITS_db
# 0 hits found
The first 4 lines represent one output result:
# BLASTN 2.3.0+
# Query: M03117:99:000000000-ALL7G:1:1101:7647:16266_2:N:0:215
# Database: /home/alex/blast/db/tryps_ITS/tryps_ITS_db
# 0 hits found
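I need to delete every output result with 0 hits, that is, all 4 of its lines (as shown above).
I noticed that a result with 1 hit found has 2 extra lines, and that its 6th line does not begin with a "#" symbol. How can I do this with the grep -B command? My expected output is a file containing only the "1 hits found" results, as follows: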
# BLASTN 2.3.0+
# Query: M03117:99:000000000-ALL7G:1:1101:26276:5181_1:N:0:196
# Database: /home/alex/blast/db/tryps_ITS/tryps_ITS_db
# Fields: subject title, query acc., subject acc., evalue, q. start, q. end, s. start, s. end
# 1 hits found
FJ712717_(modified) Trypanosoma brucei brucei from mouse 18S ribosomal RNA gene, partial sequence; internal transcribed spacer 1, complete sequence; and 5.8S ribosomal RNA gene, partial sequence M03117:99:000000000-ALL7G:1:1101:26276:5181_1:N:0:196 FJ712717_(modified) 1.42e-137 1 271 53 323
# BLASTN 2.3.0+
# Query: M03117:99:000000000-ALL7G:1:1101:26276:5181_2:N:0:196
# Database: /home/alex/blast/db/tryps_ITS/tryps_ITS_db
# Fields: subject title, query acc., subject acc., evalue, q. start, q. end, s. start, s. end
# 1 hits found
FJ712717_(modified) Trypanosoma brucei brucei from mouse 18S ribosomal RNA gene, partial sequence; internal transcribed spacer 1, complete sequence; and 5.8S ribosomal RNA gene, partial sequence M03117:99:000000000-ALL7G:1:1101:26276:5181_2:N:0:196 FJ712717_(modified) 1.06e-87 1 197 436 236
# BLASTN 2.3.0+
# Query: M03117:99:000000000-ALL7G:1:1101:11481:5777_1:N:0:196
# Database: /home/alex/blast/db/tryps_ITS/tryps_ITS_db
# Fields: subject title, query acc., subject acc., evalue, q. start, q. end, s. start, s. end
# 1 hits found
JN673389_(modified) Trypanosoma congolense isolate TS07210 18S ribosomal RNA gene, partial sequence; internal transcribed spacer 1, 5.8S ribosomal RNA gene, and internal transcribed spacer 2, complete sequence; and 28S ribosomal RNA gene, partial sequence M03117:99:000000000-ALL7G:1:1101:11481:5777_1:N:0:196 JN673389_(modified) 2.04e-105 1 231 23 253
# BLASTN 2.3.0+
# Query: M03117:99:000000000-ALL7G:1:1101:11481:5777_2:N:0:196
# Database: /home/alex/blast/db/tryps_ITS/tryps_ITS_db
# Fields: subject title, query acc., subject acc., evalue, q. start, q. end, s. start, s. end
# 1 hits found
TCU22315_(modified) Trypanosoma congolense IL1180 18S, 5.8S, 28S-LS1, srRNA1, complete sequence, and 28S-LS2 ribosomal RNA, partial sequence M03117:99:000000000-ALL7G:1:1101:11481:5777_2:N:0:196 TCU22315_(modified) 1.40e-75 1 156 1176 1021
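You can use tac to reverse the lines of the file, then use sed to delete each line that matches the pattern together with the 3 lines that precede it in the original file (once the file is reversed, they follow the match), and finally reverse the result again with a second tac:
tac filename | sed '/^# 0 hits found/I,+3d' | tac
Anchoring the pattern as ^# 0 hits found keeps it from also matching a line such as "# 10 hits found". Both the I flag and the ,+3 address range are GNU sed extensions.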
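Before running the tac and sed approach on real data, it may help to try it on a tiny sample. The following is only a sketch: the helper name drop_zero_hits and the sample records (read_A, read_B, db, subject_X) are made up for illustration, and GNU tac and GNU sed are assumed.

```shell
# Hypothetical helper: print FILE to stdout without its "0 hits found" records.
# Assumes GNU sed (the I flag and the addr,+N range are GNU extensions).
drop_zero_hits() {
    tac -- "$1" | sed '/^# 0 hits found/I,+3d' | tac
}

# Made-up sample: one 0-hit record followed by one 1-hit record.
sample=$(mktemp)
printf '%s\n' \
    '# BLASTN 2.3.0+' \
    '# Query: read_A' \
    '# Database: db' \
    '# 0 hits found' \
    '# BLASTN 2.3.0+' \
    '# Query: read_B' \
    '# Database: db' \
    '# Fields: subject title, query acc.' \
    '# 1 hits found' \
    'subject_X read_B' > "$sample"

# Only the six lines of the read_B record should be printed.
drop_zero_hits "$sample"
```

This relies on every 0-hit record being exactly 4 lines long with "# 0 hits found" as its last line, which matches the output shown in the question.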
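If you want to edit the file in place, note that sed's -i option cannot simply be added to this pipeline. When sed is given a file name it ignores its standard input and edits the un-reversed file, so the matching line and the 3 lines after it (the start of the next result) would be deleted instead. Write the output to a temporary file and then move it over the original, for example (filename.tmp is just a placeholder name):
tac filename | sed '/^# 0 hits found/I,+3d' | tac > filename.tmp && mv filename.tmp filename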
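Since the question mentions multiple files, the same idea can be wrapped in a loop. This is a minimal sketch, assuming GNU tac and GNU sed; the function name strip_zero_hits_in_place is hypothetical, and the result is copied back with cat rather than mv so that each file keeps its original permissions.

```shell
# Hypothetical helper: remove the 0-hit records from each FILE given as an argument.
strip_zero_hits_in_place() {
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        # Overwrite the original only if the final tac exited successfully.
        if tac -- "$f" | sed '/^# 0 hits found/I,+3d' | tac > "$tmp"; then
            cat -- "$tmp" > "$f"
            rm -f -- "$tmp"
        else
            rm -f -- "$tmp"
            return 1
        fi
    done
}
```

It could then be called with a list of files or a glob, for example strip_zero_hits_in_place *.txt, where the .txt extension is only an assumption about how the output files are named. Keeping a backup copy of the files before the first run is a sensible precaution.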