Shell-Script

Extract the string that follows a specific word/symbol

  • January 24, 2019

My input file input.txt contains the two lines shown below. I need to extract claimStartDate from the first line and claimEndDate from the second line.

<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180409120000102" claimEndDate="2018-04-02" claimStartDate="2018-04-02" sourceSystemId="abcd" claimActionCode="00">

<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180430120000281" claimEndDate="2018-04-17" claimStartDate="2018-04-17" sourceSystemId="abcd" claimActionCode="00">

rm input.txt
awk '/<ProfessionalClaim/' test.xml | head -1 > input.txt
awk '/<ProfessionalClaim/' test.xml | tail -1 >> input.txt
awk '{match($0, "claimStartDate=\"([^\"]+)\"", start); print start[1]} \
    {match($0, "claimEndDate=\"([^\"]+)\"", end); print end[1]}' input.txt
$ awk '/F_LINE/ {match($0, "claimStartDate=\"([^\"]+)\"", start); print start[1]}
      /L_LINE/ {match($0, "claimEndDate=\"([^\"]+)\"", end); print end[1]}' input.txt
2018-04-02
2018-04-17

Edit, following your new information:

$ awk 'NR==1 {match($0, "claimStartDate=\"([^\"]+)\"", start); print start[1]}
      NR==2 {match($0, "claimEndDate=\"([^\"]+)\"", end); print end[1]}' input.txt
2018-04-02
2018-04-17
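
The three-argument form of match() used above is a GNU awk (gawk) extension. Where only a POSIX awk is available, the same extraction can be sketched with the two-argument match(), which sets RSTART and RLENGTH. This is only a minimal sketch: the attr() helper and the scratch file are illustrative names that do not appear in the original answer, and the input is assumed to hold exactly the two lines from the question.

```shell
# Recreate the two sample lines from the question in a scratch file (illustrative only).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180409120000102" claimEndDate="2018-04-02" claimStartDate="2018-04-02" sourceSystemId="abcd" claimActionCode="00">
<ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180430120000281" claimEndDate="2018-04-17" claimStartDate="2018-04-17" sourceSystemId="abcd" claimActionCode="00">
EOF

# attr() is a hypothetical helper: it returns the value of name="..." found in line,
# using only the POSIX two-argument match() together with RSTART and RLENGTH.
dates=$(awk '
function attr(line, name,    prefix) {
    prefix = name "=\""
    if (match(line, prefix "[^\"]*\""))
        return substr(line, RSTART + length(prefix), RLENGTH - length(prefix) - 1)
    return ""
}
NR == 1 { print attr($0, "claimStartDate") }   # start date from the first line
NR == 2 { print attr($0, "claimEndDate") }     # end date from the second line
' "$tmp")

printf '%s\n' "$dates"
rm -f "$tmp"
```

With the two sample lines this should print the same two dates as the gawk version.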

You can also do everything in one go:

$ grep "<ProfessionalClaim" test.xml \
| sed -n '1p;$p' \
| awk 'NR==1 {match($0, "claimStartDate=\"([^\"]+)\"", start); print start[1]}
       NR==2 {match($0, "claimEndDate=\"([^\"]+)\"", end); print end[1]}'

  • grep finds every line in test.xml that matches <ProfessionalClaim
  • sed trims those lines down to only the first and the last
  • awk prints claimStartDate from the first line and claimEndDate from the second
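
The three steps above could also be collapsed into a single awk pass that remembers the first and the last matching line and extracts the dates at the end. The following is a rough sketch rather than part of the original answer: the scratch file stands in for test.xml with made-up, shortened contents, the variable names first and last are illustrative, and every matching line is assumed to carry both attributes.

```shell
# Scratch file standing in for test.xml (contents are illustrative).
xml=$(mktemp)
cat > "$xml" <<'EOF'
<Claims>
<ProfessionalClaim paymentIndicator="P" claimEndDate="2018-04-02" claimStartDate="2018-04-02" sourceSystemId="abcd">
</ProfessionalClaim>
<ProfessionalClaim paymentIndicator="P" claimEndDate="2018-04-10" claimStartDate="2018-04-09" sourceSystemId="abcd">
</ProfessionalClaim>
<ProfessionalClaim paymentIndicator="P" claimEndDate="2018-04-17" claimStartDate="2018-04-17" sourceSystemId="abcd">
</ProfessionalClaim>
</Claims>
EOF

dates=$(awk '
/<ProfessionalClaim/ {
    if (first == "") first = $0   # keep the first matching line
    last = $0                     # overwritten until the last matching line
}
END {
    if (first == "") exit         # no matching line: print nothing
    # Strip everything up to the opening quote, then everything from the closing quote.
    sub(/.*claimStartDate="/, "", first); sub(/".*/, "", first)
    sub(/.*claimEndDate="/, "", last);    sub(/".*/, "", last)
    print first
    print last
}
' "$xml")

printf '%s\n' "$dates"
rm -f "$xml"
```

This variant uses only sub(), so it does not depend on gawk, and it avoids the intermediate input.txt file.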

Source: https://unix.stackexchange.com/questions/496392