Awk

Trying to convert a simple C program into an AWK program

  • November 18, 2015

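Hi everyone. A few weeks ago I wrote a C program that prompts the user for the name of a text file and then for a word. The program prints the text file with a line number to the left of each line, and reports how many times the word appears in the file. It also prints the numbers of the lines on which the word matches.

Here is a sample run:

Enter the name of a text file: bond.txt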

Enter the pattern to search for: Bond

File contents:

1) Secret agent Bond had been warned not to tangle with Goldfinger.

2) But the super-criminal's latest obsession was too strong, too dangerous.

3) He had to be stopped.

4) Goldfinger was determined to take possession of half the supply of

5) mined gold in the world--to rob Fort Knox!

6) For this incredible venture he had enlisted the aid of the top

7) criminals in the U.S.A, including a bevy of beautiful thieves from the

8) Bronx. And it would take all of Bond's unique talents to make it fail--

9) as fail it must.

There is a match on line 1

There is a match on line 8

'Bond' appeared 2 times in the file bond.txt.

Currently, I am trying to practice awk programming by redoing in awk what I did in the C program.

Here is what I have been able to put together so far:

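BEGIN {
   printf("Enter filename : ")
   getline file < "-"
   print "File Contents:"              # print the header once, not once per line
   # getline < file does not update NR, so the lines are counted by hand;
   # "> 0" ends the loop on end-of-file (0) and on a read error (-1)
   while ((getline line < file) > 0) {
       printf("%5d) %s\n", ++n, line)
   }
}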

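What is the best and most efficient way to go through the text file line by line and search for the word the user entered? Any tips or tricks? Thank you.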

$ awk '/Bond/{c++; print "There is a match on line " NR} END{print "\"Bond\" appeared " c " times in the file " FILENAME}' bond.txt
There is a match on line 1
There is a match on line 8
"Bond" appeared 2 times in the file bond.txt

How it works

awk implicitly loops over all input lines.

  • /Bond/{c++; print "There is a match on line " NR}

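For each line that matches the regex Bond, the counter c is incremented and a message is printed showing the line on which the match occurred. In awk, the number of lines read so far is NR.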

  • END{print "\"Bond\" appeared " c " times in the file " FILENAME}

After the last line has been read, a message is printed showing the total number of matching lines.
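
Since the question asks for a word supplied by the user, the hard-coded /Bond/ can be swapped for a dynamic regex. The following is a minimal sketch, not part of the original answer: it assumes the word is held in a shell variable (called word here purely for illustration) and passed to awk with -v as pat. The sample file is made up so the snippet runs on its own.

```shell
# Sample data so the sketch is self-contained (temporary file, made-up text).
tmp=$(mktemp)
printf '%s\n' 'Secret agent Bond had been warned.' 'He had to be stopped.' "All of Bond's unique talents." > "$tmp"

word='Bond'   # hypothetical: in a real script this could come from "read -r word"

# "$0 ~ pat" treats the contents of pat as a regular expression.
# (c+0) forces a numeric 0 when nothing matched.
out=$(awk -v pat="$word" '
$0 ~ pat { c++; print "There is a match on line " NR }
END      { print "\"" pat "\" appeared " (c+0) " times" }
' "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```

Because pat is used as a regex, characters such as . or * in the user's word keep their special meaning; index($0, pat) > 0 would be one way to get a literal substring test instead.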

Multi-line version

For those who prefer their code spread over multiple lines:

awk '

/Bond/{
   c++
   print "There is a match on line " NR
}

END{
   print "\"Bond\" appeared " c " times in the file " FILENAME
}
' bond.txt
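
One caveat: the programs above count matching lines, so a line that contains the word twice still adds only 1 to c. If the number of occurrences is wanted instead, one possible variation (a sketch on made-up sample data, not part of the original answer) uses the return value of gsub, which is the number of substitutions it made:

```shell
# Made-up sample data; the first line contains the word twice.
tmp=$(mktemp)
printf '%s\n' 'Bond. James Bond.' 'He had to be stopped.' 'Bond again.' > "$tmp"

out=$(awk '
{
    line = $0                     # work on a copy so $0 stays intact
    n = gsub(/Bond/, "&", line)   # gsub returns how many matches it replaced
    if (n > 0) {
        c += n
        print "There is a match on line " NR
    }
}
END { print "\"Bond\" appeared " (c+0) " times" }
' "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```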

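Displaying the file contents before the summary

This approach reads the file twice. The first time, it prints a version of the file formatted with line numbers. The second time, it prints the summary output: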

$ awk 'FNR==NR{printf("%5d) %s\n", NR,$0);next} /Bond/{c++; print "There is a match on line " FNR} END{print "\"Bond\" appeared " c " times in the file " FILENAME}' bond.txt{,}
   1) Secret agent Bond had been warned not to tangle with Goldfinger.
   2) But the super-criminal's latest obsession was too strong, too dangerous.
   3) He had to be stopped.
   4) Goldfinger was determined to take possession of half the supply of
   5) mined gold in the world--to rob Fort Knox!
   6) For this incredible venture he had enlisted the aid of the top
   7) criminals in the U.S.A, including a bevy of beautiful thieves from the
   8) Bronx. And it would take all of Bond's unique talents to make it fail--
   9) as fail it must.
There is a match on line 1
There is a match on line 8
"Bond" appeared 2 times in the file bond.txt

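The above differs from the first version in two ways. First, the file is given twice on the command line, either as bond.txt bond.txt or, using bash's brace expansion trick, as bond.txt{,}.

Second, we added the command: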

FNR==NR{printf("%5d) %s\n", NR,$0);next}

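This command is executed only when FNR==NR. NR is the total number of lines read so far, and FNR is the number of lines read from the current file. So, when FNR==NR, we are reading the file for the first time. We then print the formatted line with printf and move on to the next input line with next, skipping the rest of the commands in the script.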
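
To see how the two counters behave, here is a small illustrative sketch (a made-up two-line file, named twice on the command line) that prints NR and FNR for every line read:

```shell
# Made-up two-line sample file.
tmp=$(mktemp)
printf '%s\n' 'alpha' 'beta' > "$tmp"

# The same file is named twice: during the first pass FNR equals NR,
# during the second pass FNR restarts at 1 while NR keeps counting.
out=$(awk '{ print "NR=" NR, "FNR=" FNR, (FNR == NR ? "first pass" : "second pass") }' "$tmp" "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```

This prints:

```
NR=1 FNR=1 first pass
NR=2 FNR=2 first pass
NR=3 FNR=1 second pass
NR=4 FNR=2 second pass
```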

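Alternative

In this version, we read the file only once, printing the formatted version as we go while saving the summary information to be printed at the end: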

$ awk '{printf("%5d) %s\n", NR,$0)} /Bond/{c++; s=s ORS "There is a match on line " FNR} END{print s; print "\"Bond\" appeared " c " times in the file " FILENAME}' bond.txt
   1) Secret agent Bond had been warned not to tangle with Goldfinger.
   2) But the super-criminal's latest obsession was too strong, too dangerous.
   3) He had to be stopped.
   4) Goldfinger was determined to take possession of half the supply of
   5) mined gold in the world--to rob Fort Knox!
   6) For this incredible venture he had enlisted the aid of the top
   7) criminals in the U.S.A, including a bevy of beautiful thieves from the
   8) Bronx. And it would take all of Bond's unique talents to make it fail--
   9) as fail it must.

There is a match on line 1
There is a match on line 8
"Bond" appeared 2 times in the file bond.txt

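To get closer to the interactive behaviour of the original C program, the prompts from the question's BEGIN block can be combined with the single-pass idea. The following is only a sketch under several assumptions: the variable names file and pat are illustrative, "-" is assumed to be accepted by the awk in use as a name for standard input (as in the question's own code), and "/dev/stderr" is assumed to be available for the prompts. The two answers are piped in so that the snippet runs unattended.

```shell
# Made-up sample data in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'Secret agent Bond had been warned.' 'He had to be stopped.' > "$tmp"

# The two answers are piped in here; interactively they would be typed.
out=$(printf '%s\n' "$tmp" 'Bond' | awk '
BEGIN {
    printf "Enter the name of a text file: " > "/dev/stderr"
    getline file < "-"
    printf "Enter the pattern to search for: " > "/dev/stderr"
    getline pat < "-"
    ARGV[1] = file; ARGC = 2     # make awk read that file as its input
    print "File contents:"
}
{ printf("%5d) %s\n", NR, $0) }
$0 ~ pat { c++; s = s "There is a match on line " NR ORS }
END {
    printf "%s", s
    print "\"" pat "\" appeared " (c+0) " times in the file " FILENAME
}
')

printf '%s\n' "$out"
rm -f "$tmp"
```

Setting ARGV[1] and ARGC in BEGIN lets awk's normal implicit loop handle the file, so NR is maintained automatically and no explicit getline loop over the file is needed.
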
Source: https://unix.stackexchange.com/questions/243905