Bash

Using grep with array values and making it faster

  • November 29, 2018

Array

${array[1]} is a string extracted from a 30k-line CSV. Example:

samsung black 2014

I need to match these lines against one of the values contained in an array (arrayItems).

arrayItems contains 221 values, for example:

apple
sony
samsung

The actual script:

while IFS=$';' read -r -a array
do
   mapfile -t arrayItems < $itemsFile
   ## now loop through the above array
   for itemToFind in "${arrayItems[@]}"
   do
      itemFound=""
      itemFound="$(echo ${array[1]} | grep -o '^$itemToFind')"
      if [ -n "$itemFound" ] 
      then 
         echo $itemFound 
         # so end to search in case the item is found
         break
      fi
   done
  # here I do something with ${array[2]}, ${array[4]} line by line and so on, 
  # so I can't match the whole file $file_in at once but only line by line.
done < $file_in

The problem is that grep does not match.

But it does work if I hard-code $itemToFind like this:

itemFound="$(echo ${array[1]} | grep -o '^samsung')"

The other question is: how can I do this faster, given that $file_in is a 30k-line CSV?

You can use grep with its pattern-file option (-f), which reads one pattern per line from a file.

Example:

$ echo -e "apple\nsony\nsamsung" > file_pattern
$ grep -f file_pattern your.csv
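To make the effect of -f concrete, here is a small self-contained sketch. The temporary directory, the file names and the sample rows are assumptions made up for illustration; only the three brand names and the "samsung black 2014" row come from the question. Each pattern is prefixed with ^ so that a brand only matches at the start of a line, which mirrors the '^samsung' test above.

```bash
#!/usr/bin/env bash
# Hypothetical sample data; the file names are placeholders.
workdir=$(mktemp -d)
printf '%s\n' apple sony samsung > "$workdir/file_pattern"
printf '%s\n' 'samsung black 2014' 'nokia blue 2012' 'sony white 2016' \
    > "$workdir/strings.txt"

# Prefix every pattern with ^ so it only matches at the start of a line.
sed 's/^/^/' "$workdir/file_pattern" > "$workdir/anchored_patterns"

# -f reads one pattern per line; -o prints only the part that matched.
matches=$(grep -o -f "$workdir/anchored_patterns" "$workdir/strings.txt")
printf '%s\n' "$matches"

rm -rf "$workdir"
```

With this sample input the nokia row is skipped, and only the brand at the start of the other two rows is printed. A single grep invocation checks all patterns, so the inner for loop over arrayItems is no longer needed.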

Edit: for your new constraint:

sed 's/^/^/' "$itemsFile" > /tmp/pattern_file
while IFS=';' read -r -a array
do
   # -q prints nothing; the exit status alone says whether a pattern matched
   if printf '%s\n' "${array[1]}" | grep -q -f /tmp/pattern_file; then
       # here I do something with ${array[2]}, ${array[4]} line by line and so on,
       # so I can't match the whole file $file_in at once but only line by line.
       :   # placeholder so the if body is not empty
   fi
done < "$file_in"
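Two further points may help. First, the original test most likely fails because the single quotes in '^$itemToFind' stop the shell from expanding the variable, so grep receives the literal text ^$itemToFind. Writing "^$itemToFind" in double quotes lets the value through. Second, starting one grep process per CSV line is slow for 30k lines. The sketch below is one possible alternative, not part of the answer above: it joins the items into a single anchored alternation and tests it with bash's built-in [[ =~ ]], so no external process is started inside the loop. The inline sample rows and the found array are hypothetical, and the sketch assumes the items contain no regex metacharacters.

```bash
#!/usr/bin/env bash
# Sketch only: the sample data and the "found" array are assumptions.
arrayItems=(apple sony samsung)   # normally: mapfile -t arrayItems < "$itemsFile"

# Join the items into one anchored alternation: ^(apple|sony|samsung)
# Assumes the items contain no regex metacharacters.
pattern="^($(IFS='|'; printf '%s' "${arrayItems[*]}"))"

found=()
while IFS=';' read -r -a array; do
    # [[ =~ ]] is evaluated by bash itself, so no grep is spawned per line.
    if [[ ${array[1]} =~ $pattern ]]; then
        # BASH_REMATCH[1] holds the item that matched inside the parentheses.
        found+=("${BASH_REMATCH[1]}")
        # ... work with "${array[2]}", "${array[4]}" here ...
    fi
done <<'EOF'
1;samsung black 2014;a;b;c
2;nokia blue 2012;a;b;c
3;sony white 2016;a;b;c
EOF

printf '%s\n' "${found[@]}"
```

In a real script the here-document would be replaced by `done < "$file_in"`, and the items would be loaded once with mapfile before the loop rather than on every iteration.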

Source: https://unix.stackexchange.com/questions/484894