Grep

Making grep understand byte escapes

  • March 12, 2018

I'm trying to match some UTF-8 characters. The problem is that grep doesn't interpret \x byte escapes, so this fails:

echo -e '\xd8\xaa' | grep -P '\xd8\xaa'

While this succeeds:

echo -e '\xd8\xaa' | grep -P $(printf '\xd8\xaa')

Can grep understand byte escapes directly, without using printf? If so, how?

This fails (no output):

$ echo -e '\xd8\xaa' | grep -P '\xd8\xaa' | hexdump

This succeeds:

$ echo -e '\xd8\xaa' | grep -P $'\xd8\xaa' | hexdump
0000000 aad8 000a                              
0000003
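
As a minimal sketch of the same idea inside a script, assuming bash (for the $'...' quoting): the pattern is stored in a variable so that bash, not grep, decodes the escapes. The variable names pattern and match_result are only illustrative, and -F is swapped in for -P here because the decoded pattern is a literal byte string rather than a regular expression.

```shell
#!/usr/bin/env bash
# Sketch only: bash decodes the \x escapes before grep ever sees the pattern.
# Assumes bash; $'...' is not available in every POSIX sh.

pattern=$'\xd8\xaa'   # two raw bytes, 0xD8 0xAA

# printf is used instead of echo -e because its escape handling is more predictable.
# -q suppresses output, -F matches the pattern as a fixed string,
# and -- guards against patterns that start with a dash.
if printf '\xd8\xaa\n' | grep -qF -- "$pattern"; then
    match_result="matched"
else
    match_result="no match"
fi

echo "$match_result"
```

Quoting "$pattern" keeps the shell from word-splitting or globbing the decoded bytes before they reach grep.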

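Documentation

From man bash:

Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded as follows: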

          \a     alert (bell)
          \b     backspace
          \e
          \E     an escape character
          \f     form feed
          \n     new line
          \r     carriage return
          \t     horizontal tab
          \v     vertical tab
          \\     backslash
          \'     single quote
          \"     double quote
          \?     question mark
          \nnn   the eight-bit character whose value is the octal value nnn (one to three digits)
          \xHH   the eight-bit character whose value is the hexadecimal value HH (one or two hex digits)
          \uHHHH the Unicode (ISO/IEC 10646) character whose value is the hexadecimal value HHHH (one to four hex digits)
          \UHHHHHHHH
                 the Unicode (ISO/IEC 10646) character whose value is the hexadecimal value HHHHHHHH (one to eight hex digits)
          \cx    a control-x character

The expanded result is single-quoted, as if the dollar sign had not been present.
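
To see the difference this quoting makes, here is a small sketch, again assuming bash, that compares the same text in plain single quotes and in $'...'. The variable names plain and ansi are made up for the illustration.

```shell
#!/usr/bin/env bash
# Sketch only: the same characters, with and without ANSI-C quoting.

plain='\x41\n'    # six literal characters: \ x 4 1 \ n
ansi=$'\x41\n'    # two characters: "A" followed by a newline

# Print the length of each string as bash sees it.
echo "plain length: ${#plain}"
echo "ansi length:  ${#ansi}"

# Dump the bytes so the decoding is visible.
printf '%s' "$plain" | od -An -c
printf '%s' "$ansi"  | od -An -c
```

With plain single quotes the backslashes reach the command untouched, which is why grep in the question received the text \xd8\xaa rather than the two bytes.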

Source: https://unix.stackexchange.com/questions/429622