Awk

AWK: A quick way to insert a target term after a source term

  • February 1, 2022

I am not familiar with awk. To insert a single target term after a source term on 198058 random lines, I have this code:

awk -i inplace '(NR==FNR){a[$1];next}
   (FNR in a) && gsub(/\<Source Term\>/,"& Target Term")
    1
   ' <(shuf -n 198058 -i 1-$(wc -l < file)) file

file contains lines of sentences like these:

David has to eat his vegetables .
This weather is very cold .
Can you please stop this music ? This is terrible music .
The teddy bear is very plushy .
I must be going !

For example, if I wanted to insert the word "Wetter" after "weather", the affected line would look like this:

This weather Wetter is very cold .

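How can I rewrite the code so that I only need to supply two separate files containing the lists of source terms and target terms?

Suppose the source term file is called sourceterms and the target term file is called targetterms.

If sourceterms contains this list of terms: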

vegetables
weather
terrible
plushy
going

and targetterms contains these terms:

Gemüse
Wetter
schreckliche
flauschig
gehen

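I want my code to check whether each line of file contains a source term and insert the corresponding target term after it, so that my file would look like this: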

David has to eat his vegetables Gemüse .
This weather Wetter is very cold .
Can you please stop this music ? This is terrible schreckliche music .
The teddy bear is very plushy flauschig .
I must be going gehen !

Is it possible to rewrite the code above?

Using GNU awk (which the OP is already using) for ARGIND and word boundaries:

$ cat tst.awk
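# First file (sourceterms): store each term as a word-bounded regexp, keyed by line number
ARGIND == 1 { olds[FNR] = "\\<" $1 "\\>"; next }
# Second file (targetterms): map that regexp to "matched text, a space, the target term"
ARGIND == 2 { map[olds[FNR]] = "& " $1; next }
# Remaining input (file): apply every substitution to the line, then print it
{
   for ( old in map ) {
       new = map[old]
       gsub(old,new)
   }
   print
}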
$ awk -f tst.awk sourceterms targetterms file
David has to eat his vegetables Gemüse .
This weather Wetter is very cold .
Can you please stop this music ? This is terrible schreckliche music .
The teddy bear is very plushy flauschig .
I must be going gehen !

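The above assumes that your source terms do not contain any regexp metacharacters and that your replacement text does not contain the & backreference metacharacter. It also assumes that, if the same word appears in both the source and the target lists, you do not care about the order in which the replacements occur.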
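If those assumptions are a concern, one possible workaround is to skip regexps altogether and compare whitespace-separated tokens literally. The following is only a sketch of that different technique, not the script above. It assumes every term is a single word, that punctuation in file is already separated by spaces (as in the sample input), and that both term files are non-empty. The function name insert_terms is hypothetical. Each token is visited once, so an inserted target term is never matched again as a source term. Note that rebuilding the line collapses runs of whitespace into single spaces.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer):
#   insert_terms SOURCETERMS TARGETTERMS FILE
# Prints FILE with each target term inserted after its source term.
insert_terms() {
    awk '
        FNR == 1    { fileno++ }                  # a new input file starts (assumes no empty files)
        fileno == 1 { src[FNR] = $1; next }       # remember the source term on this line number
        fileno == 2 { map[src[FNR]] = $1; next }  # pair it with the target term on the same line number
        {
            for (i = 1; i <= NF; i++)
                if ($i in map)                    # literal string lookup, no regexp involved
                    $i = $i " " map[$i]           # append the target term after the token
            print
        }
    ' "$1" "$2" "$3"
}
```

It would be called as insert_terms sourceterms targetterms file. Because no regexp or gsub() replacement string is involved, characters such as . or & in the terms are treated as ordinary text. It also does not rely on ARGIND, so it should not be tied to GNU awk.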

Source: https://unix.stackexchange.com/questions/688449