Grep for lines that start with 1, but not 10, 11, 100, etc.

July 31, 2018

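I have a genomic data file with tag counts, and I want to know how many tags are represented exactly once:
$ grep "^1" file | wc -l
This matches every line that starts with 1, so it also counts tags represented 10 times, 11 times, 100 times, 1245 times, and so on. How can I do this?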
Current format
79      TGCAG.....
1       TGCAG.....
1257    TGCAG.....
1       TGCAG......
I only want the following lines:
1       TGCAG.....
So it must not include the line that starts with 1257. **Note:** the file above is tab-delimited.

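Question in the body
====================
Select lines that start with a 1 followed by a whitespace character:
grep -c '^1\s'          file
grep -c '^1[[:space:]]' file
This also gives the line count (no need to call wc).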
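
As a quick check, here is a minimal sketch that rebuilds the four sample lines from the question in a temporary file and compares the original pattern with the anchored one. The file name `tags.tsv` and the variable names are made up for illustration.

```shell
# Recreate the question's sample as a tab-separated file (hypothetical name).
tmp=$(mktemp -d)
printf '79\tTGCAG\n1\tTGCAG\n1257\tTGCAG\n1\tTGCAG\n' > "$tmp/tags.tsv"

# Original attempt: counts every line whose first character is 1.
loose=$(grep -c '^1' "$tmp/tags.tsv")

# Anchored version: the 1 must be followed by a whitespace character.
strict=$(grep -c '^1[[:space:]]' "$tmp/tags.tsv")

echo "loose=$loose strict=$strict"
rm -r "$tmp"
```

With this sample it prints `loose=3 strict=2`: the 1257 line is counted by the loose pattern only.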
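
Question in the title
=====================
A 1 that is not followed by another digit (or is followed by nothing at all):
grep -cE '^1([^0-9]|$)' file
Both solutions above, however, have some interesting issues; read on.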
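
To see where the two readings differ, the sketch below uses a small made-up file (`edge.txt`, not from the question) containing a tab-delimited line, a bare `1`, a colon-delimited line, and two longer numbers.

```shell
# Hypothetical edge cases: tab, bare "1", colon delimiter, two longer numbers.
tmp=$(mktemp -d)
printf '1\tAAA\n1\n1:BBB\n10\tCCC\n1257\tDDD\n' > "$tmp/edge.txt"

# Title reading: a 1 followed by a non-digit or by the end of the line.
title_count=$(grep -cE '^1([^0-9]|$)' "$tmp/edge.txt")

# Body reading: a 1 followed by a whitespace character.
body_count=$(grep -c '^1[[:space:]]' "$tmp/edge.txt")

echo "title=$title_count body=$body_count"
rm -r "$tmp"
```

This prints `title=3 body=1`: the bare `1` and the `1:BBB` line satisfy only the title reading.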
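In the body of the question, the user states that the file is "tab delimited".

Delimiter
=========

Tab
---
A line that starts with a 1 followed by a tab (an actual tab character in the command). This fails if the delimiter is a space (or anything else, or nothing at all):
grep '^1	' file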
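
A literal tab is easy to lose when a command is copied around, so one possible workaround is to keep the tab in a shell variable. This is only a sketch; the file `mixed.txt` and the variable `tab` are illustrative names.

```shell
tmp=$(mktemp -d)
printf '1\tAAA\n1 BBB\n12\tCCC\n' > "$tmp/mixed.txt"

# Store a real tab in a variable; command substitution keeps it
# because only trailing newlines are stripped.
tab=$(printf '\t')

# Double quotes let the shell expand $tab inside the pattern.
tab_count=$(grep -c "^1$tab" "$tmp/mixed.txt")

echo "tab_count=$tab_count"
rm -r "$tmp"
```

Only the `1<tab>AAA` line matches here, so the count is 1; the space-delimited line is skipped, as described above.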
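
Space
-----
A line that starts with a 1 followed by a space (an actual space in the command). This fails if the delimiter is anything else, or nothing at all:
grep '^1 ' file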

Tab or space
------------
grep -E '^1(	| )' file
grep '^1[[:blank:]]' file
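
A small sketch comparing the two forms on a made-up file (`delims.txt`). Note that in basic regular expressions a bare `(` or `|` is a literal character, so the alternation form needs `-E`; the `tab` variable is just a convenient way to get a real tab into the pattern.

```shell
tmp=$(mktemp -d)
printf '1\tAAA\n1 BBB\n1:CCC\n11\tDDD\n' > "$tmp/delims.txt"
tab=$(printf '\t')

# Character class: a tab or a space after the 1.
blank_count=$(grep -c '^1[[:blank:]]' "$tmp/delims.txt")

# Same idea with alternation; -E makes ( | ) act as operators.
alt_count=$(grep -cE "^1($tab| )" "$tmp/delims.txt")

echo "blank=$blank_count alt=$alt_count"
rm -r "$tmp"
```

Both counts come out as 2: the tab and space lines match, the colon line and the `11` line do not.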

Whitespace
----------
A more flexible option is to allow several whitespace characters (horizontal and vertical). The `[:space:]` character class consists of (space), `\t` (horizontal tab), `\r` (carriage return), `\n` (newline), `\v` (vertical tab) and `\f` (form feed). But grep cannot match a newline (an internal limitation that can only be avoided with the `-z` option). The class can still be used to describe the delimiter. It is also possible, and shorter, to use the GNU shorthand `\s`:

```
grep -c '^1[[:space:]]' file
grep -c '^1\s'          file
```

But this option will fail if the delimiter is something like a colon `:` or any other punctuation character (or any letter).
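
The difference between `[[:space:]]` and `[[:blank:]]` shows up, for example, with a stray carriage return. The sketch below uses a made-up file (`ws.txt`) with a tab-delimited line, a bare `1` that ends in a Windows-style CR, and a colon-delimited line.

```shell
tmp=$(mktemp -d)
# Lines: tab delimiter, bare 1 followed by a carriage return, colon delimiter.
printf '1\tAAA\n1\r\n1:CCC\n' > "$tmp/ws.txt"

# [[:space:]] covers tab and carriage return (among others).
space_count=$(grep -c '^1[[:space:]]' "$tmp/ws.txt")

# [[:blank:]] covers only space and tab.
blank_count=$(grep -c '^1[[:blank:]]' "$tmp/ws.txt")

echo "space=$space_count blank=$blank_count"
rm -r "$tmp"
```

This prints `space=2 blank=1`. The colon-delimited line is missed by both classes.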


Boundary
========


Or we can use the boundary where a digit is followed by a "non-digit", or more precisely by a character not in `[_[:alnum:]]` (`_a-zA-Z0-9`):



```
grep -c  '^1\b' file       # portable but not POSIX.
grep -c  '^1\>' file       # portable but not POSIX.
grep -wc '^1'   file       # portable but not POSIX.
grep -c  '^1\W' file       # portable but not POSIX (not match only a `1`) (not underscore in BSD).

```

This also accepts lines that start with a 1 followed by a punctuation character.
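
A short sketch of how the boundary forms behave, assuming GNU grep (where `_` counts as a word character). The file `bound.txt` and its contents are invented for illustration.

```shell
tmp=$(mktemp -d)
# Lines: tab, bare 1, colon, underscore, longer number, letter.
printf '1\tAAA\n1\n1:CCC\n1_DDD\n12\tEEE\n1a\n' > "$tmp/bound.txt"

# Word boundary after the 1: also true at the end of the line.
b_count=$(grep -c '^1\b' "$tmp/bound.txt")

# \W needs an actual non-word character, so a bare 1 is not matched.
W_count=$(grep -c '^1\W' "$tmp/bound.txt")

echo "b=$b_count W=$W_count"
rm -r "$tmp"
```

With GNU grep this prints `b=3 W=2`: `\b` accepts the tab, bare, and colon lines, while `\W` drops the bare `1`. The underscore, `12`, and `1a` lines are rejected by both.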

Source: https://unix.stackexchange.com/questions/458954
