Bash

Run sed operations only on lines that start with a specific string

  • December 3, 2021

I have a file in the following format:

Received from +11231231234 at 2021-10-10T19:56:50-07:00:
This is a message that contains words like from, at, etc.

Sent to +11231231234 at 2021-10-11T06:50:57+00:00:
This is another message that contains words like to, at, etc.

I want to clean up the "Received" and "Sent" lines, and the following sed pipeline does that:

cat file |  sed 's/from//g' | sed 's/to/    /g' | sed 's/+\w\+//' | sed 's/at//g' | \
sed 's/T/ /g' | sed 's/[[:digit:].]*\:$//' | sed 's/[[:digit:].]*\:$//' | sed 's/-$//' |  \
sed 's/-$//' | sed 's/+$//'

The result is:

Received    2021-10-10 19:56:50
This is a message that contains words like  ,  , etc.

Sent        2021-10-11 06:50:57
This is another message that contains words like  ,  , etc.

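As you can see, it cleans up the "Received" and "Sent" lines nicely. But it also mangles the message lines! How can I apply these operations only to lines that start with "Received" or "Sent"?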

This is what addresses in sed are for:

sed -E '/^(Received|Sent) (from|to) \+[0-9]+ at/ s/ .*([0-9]{4}-[0-9]{2}-[0-9]{2})T([0-9:]{8}).*/        \1 \2/'
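  • The address means that the substitution applies only to lines that start with Received or Sent, followed by from or to, then a + followed by digits, and then at.
  • The substitution starts matching at a space. It captures the date ([0-9]{4} means a digit repeated four times, and so on), matches the T, and then captures the time. Whatever follows the time is matched but not captured. The whole matched part is then replaced with several spaces followed by the captured date and time.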
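
As a rough sketch of the same idea, an address can also guard a { ... } block, so that several smaller substitutions run only on the header lines. The sample input, the variable names input and result, and the single-space output format are assumptions made for illustration; this is a variant, not the command given above.

```shell
# Sample input in the format shown in the question (illustrative data).
input='Received from +11231231234 at 2021-10-10T19:56:50-07:00:
This is a message that contains words like from, at, etc.

Sent to +11231231234 at 2021-10-11T06:50:57+00:00:
This is another message that contains words like to, at, etc.'

# The address selects only the header lines; the commands inside { ... }
# are skipped for every other line, so message text is left untouched.
result=$(printf '%s\n' "$input" | sed -E '/^(Received|Sent) (from|to) \+[0-9]+ at /{
  # Drop "from/to +<number> at", leaving a single space.
  s/ (from|to) \+[0-9]+ at / /
  # Replace the T between the date and the time with a space.
  s/([0-9]{2})T([0-9]{2})/\1 \2/
  # Remove the trailing UTC offset and colon, e.g. "-07:00:".
  s/[-+][0-9]{2}:[0-9]{2}:$//
}')

printf '%s\n' "$result"
```

Splitting the work into small substitutions keeps each pattern easy to read, at the cost of repeating part of the address in the first substitution. The single-substitution form is more compact when only the date and time need to survive.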

Source: https://unix.stackexchange.com/questions/679965