Pattern matching and deleting whole lines

November 25, 2018

I want to delete every line of File 1 whose column 1 exactly matches column 1 of File 2.

File 1:

r001:21:10    21    AAAAAATTTGC    *     =    XM:21
r002:21:10    21    YAAAATTTGC     *     =    nM:21
r001:21:10    21    TTAAAATTTGC    *     =    XM:21
r0012:21:10   21    LLAAAATTTGC    *     +    XM:21
r001:21:10    21    AAAAAATTTGC    *     =    GM:21

File 2:

r001:21:10
r001:21:20
r002:41:36
r002:41:99
r002:41:87
r0012:21:1

Expected output:

r002:21:10    21    YAAAATTTGC     *     =    nM:21
r0012:21:10   21    LLAAAATTTGC    *     +    XM:21

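You can use this awk command (here f2 and f1 stand for File 2 and File 1):
$ awk 'FNR==NR {a[$1]; next}; !($1 in a)' f2 f1
r002:21:10    21    YAAAATTTGC     *     =    nM:21
r0012:21:10   21    LLAAAATTTGC    *     +    XM:21
Explanation
FNR==NR {a[$1]; next} runs only while the first file (f2) is being read and stores the first field of each line as a key in the array a.
!($1 in a) runs while the second file (f1) is being read: it checks whether the first field is a key in a and prints the line only if it is not.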
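
To try the awk approach end to end, the sample data from the question can be recreated in a scratch directory. This is only a sketch: the temporary directory, the variable name result, and the array name seen are assumptions made for illustration, not part of the original answer.

```shell
#!/bin/sh
# Recreate the sample files from the question in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/f1" <<'EOF'
r001:21:10    21    AAAAAATTTGC    *     =    XM:21
r002:21:10    21    YAAAATTTGC     *     =    nM:21
r001:21:10    21    TTAAAATTTGC    *     =    XM:21
r0012:21:10   21    LLAAAATTTGC    *     +    XM:21
r001:21:10    21    AAAAAATTTGC    *     =    GM:21
EOF
cat > "$tmpdir/f2" <<'EOF'
r001:21:10
r001:21:20
r002:41:36
r002:41:99
r002:41:87
r0012:21:1
EOF

# First pass (f2): remember each first field as an array key.
# Second pass (f1): print only lines whose first field was not remembered.
result=$(awk 'FNR==NR {seen[$1]; next} !($1 in seen)' "$tmpdir/f2" "$tmpdir/f1")
printf '%s\n' "$result"

# Clean up the scratch files.
rm -rf "$tmpdir"
```

Because the comparison is an exact lookup of the first field, r0012:21:1 in File 2 does not remove the r0012:21:10 line of File 1.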

You can also use grep:

$ grep -wvFf file2 file1
r002:21:10    21    YAAAATTTGC     *     =    nM:21
r0012:21:10   21    LLAAAATTTGC    *     +    XM:21

From man grep:

  -F, --fixed-strings
         Interpret PATTERN as a  list  of  fixed  strings,  separated  by
         newlines,  any  of  which is to be matched. 
  -f FILE, --file=FILE
         Obtain  patterns  from  FILE,  one  per  line.  
  -v, --invert-match
         Invert the sense of matching, to select non-matching lines. 
  -w, --word-regexp
         Select  only  those  lines  containing  matches  that form whole
         words.  The test is that the matching substring must  either  be
         at  the  beginning  of  the  line,  or  preceded  by  a non-word
         constituent character.

Note: however, this matches against the entire content of each line of file1, not just the first column.
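
The difference only shows up when a pattern from file2 appears somewhere other than column 1. The following sketch uses made-up data (the r777 line is hypothetical and does not come from the question) to compare the two commands; the variable names grep_out and awk_out are likewise just illustrative.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
# Hypothetical data: the second line has r001:21:10 in column 2, not column 1.
printf '%s\n' \
  'r001:21:10 21 AAAA' \
  'r777:21:10 r001:21:10 CCCC' \
  'r002:21:10 21 GGGG' > "$tmpdir/file1"
printf '%s\n' 'r001:21:10' > "$tmpdir/file2"

# grep drops any line that contains the pattern as a whole word, in any column.
grep_out=$(grep -wvFf "$tmpdir/file2" "$tmpdir/file1")

# awk compares only the first field, so the r777 line is kept.
awk_out=$(awk 'FNR==NR {seen[$1]; next} !($1 in seen)' "$tmpdir/file2" "$tmpdir/file1")

printf 'grep:\n%s\n\nawk:\n%s\n' "$grep_out" "$awk_out"
rm -rf "$tmpdir"
```

On the sample data in the question both commands give the same result, since the identifiers only occur in the first column; the awk version is the safer choice when that cannot be guaranteed.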

Source: https://unix.stackexchange.com/questions/117456
