Grep

Grep does not match carriage return

  • July 9, 2015

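I am trying to find lines that contain a carriage return, but I am not getting the results I expect. I have reduced it to this proof of concept: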

$ uname -a
CYGWIN_NT-6.1 Aodh 2.0.4(0.287/5/3) 2015-06-09 12:22 x86_64 Cygwin

$ grep --version
grep (GNU grep) 2.21
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others, see <http://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.

$ od -c cr_poc.txt
0000000   h   e   l   l   o       w   o   r   l   d   ;  \r  \n  \r  \n
0000020

$ od -x cr_poc.txt
0000000 6568 6c6c 206f 6f77 6c72 3b64 0a0d 0a0d
0000020

$ grep $'\r' cr_poc.txt; echo $?
1
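In case anyone wants to reproduce this, the test file can probably be recreated as below. This is only a sketch: it assumes a POSIX printf that understands \r and \n, and the byte layout is read off the od dumps above (16 bytes, two CR/LF pairs).

```shell
# Recreate the proof-of-concept file: "hello world;" followed by two CR/LF pairs.
# The format string is single-quoted so the shell passes the backslashes to printf.
printf 'hello world;\r\n\r\n' > cr_poc.txt

# Dump it character by character to compare against the od -c output above.
od -c cr_poc.txt
```

The offset on the last line of the od output should again be 0000020 (octal), that is, 16 bytes.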

I have tried various other ways of searching for the \r character, but none of them worked.

Note that this is on Cygwin, which may of course be part of the problem.

After playing around with various inputs, I get the impression that grep applies some magic of its own to line endings:

$ printf "foo\rbar\n" | grep -oz $'\r' | od -c
0000000  \r  \n
0000002
$ printf "foo\rbar\r\n" | grep -oz $'\r' | od -c
0000000
$ printf "foo\rbar\r" | grep -oz $'\r' | od -c
0000000  \r  \n  \r  \n
0000004

(The -z was my lame attempt to get grep to match against the whole input.) So I searched the man page for LF, which led me to:

-U, --binary
     Treat the file(s) as binary.  By default, under MS-DOS  and  MS-
     Windows,  grep  guesses the file type by looking at the contents
     of the first 32KB read from the file.  If grep decides the  file
     is  a  text  file, it strips the CR characters from the original
     file contents (to make regular expressions with  ^  and  $  work
     correctly).  Specifying -U overrules this guesswork, causing all
     files to be read and passed to the matching mechanism  verbatim;
     if  the  file is a text file with CR/LF pairs at the end of each
     line, this will cause some regular expressions  to  fail.   This
     option  has  no  effect  on  platforms other than MS-DOS and MS-
     Windows.
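Going by that excerpt, passing -U should stop grep from stripping the CRs before matching. The following is a minimal sketch of that idea, not verified output from the Cygwin setup above. The variable names cr and count are just illustrative, and the file is recreated first so the snippet stands on its own.

```shell
# Recreate the proof-of-concept file (same bytes as in the od dump above).
printf 'hello world;\r\n\r\n' > cr_poc.txt

# Put a literal CR into a variable; in bash, $'\r' would do the same thing.
# Command substitution only strips trailing newlines, so the CR survives.
cr=$(printf '\r')

# -U (--binary) asks grep to pass the file contents to the matcher verbatim.
# -c prints the number of matching lines rather than the lines themselves.
count=$(grep -U -c "$cr" cr_poc.txt)
echo "lines containing CR: $count"
```

Per the man page, -U only changes behaviour on MS-DOS and MS-Windows builds; elsewhere it is accepted but should make no difference.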

Source: https://unix.stackexchange.com/questions/214770