Bash
Using array values with grep and making it faster
array
${array[1]} is a string extracted from a 30k-line CSV, for example:
samsung black 2014
I need to match each of these strings against one of the values contained in an array (arrayItems).
arrayItems contains 221 values, for example:
apple sony samsung
The actual script:
while IFS=$';' read -r -a array
do
    mapfile -t arrayItems < $itemsFile
    ## now loop through the above array
    for itemToFind in "${arrayItems[@]}"
    do
        itemFound=""
        itemFound="$(echo ${array[1]} | grep -o '^$itemToFind')"
        if [ -n "$itemFound" ]
        then
            echo $itemFound
            # stop searching once the item is found
            break
        fi
    done
    # here I do something with ${array[2]}, ${array[4]} line by line and so on,
    # so I can't match the whole file $file_in at once, only line by line.
done < $file_in
The problem is that grep does not match.
But it does match if I hard-code $itemToFind like this:
itemFound="$(echo ${array[1]} | grep -o '^samsung')"
The other thing is: how can I do this faster, given that $file_in is a 30k-line CSV?
You can use grep with the pattern-file option (-f), which reads one pattern per line from a file.
Example:
$ echo -e "apple\nsony\nsamsung" > file_pattern
$ grep -f file_pattern your.csv
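As a small self-contained sketch of the -f option: the temporary directory, the sample rows and the `anchored_pattern` file below are made up for illustration and are not part of the original question. Each pattern is prefixed with `^` so that it only matches at the start of a line, which is roughly what the `'^...'` pattern in the question was aiming for.

```shell
#!/usr/bin/env bash
# Hypothetical demo data; all file names here are placeholders.
workdir=$(mktemp -d)
printf '%s\n' apple sony samsung > "$workdir/file_pattern"
printf '%s\n' 'samsung black 2014' 'nokia blue 2012' 'my sony red 2013' > "$workdir/your.csv"

# Prefix every pattern with ^ so it can only match at the start of a line.
sed 's/^/^/' "$workdir/file_pattern" > "$workdir/anchored_pattern"

# One grep call checks every line against all patterns.
matches=$(grep -f "$workdir/anchored_pattern" "$workdir/your.csv")
echo "$matches"

rm -rf "$workdir"
```

Only the first row is printed: the third row contains "sony", but not at the start of the line.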
Edit: for your new constraint:
# Prefix every item with ^ so it only matches at the start of the field
sed 's/^/^/' "$itemsFile" > /tmp/pattern_file

while IFS=';' read -r -a array
do
    # -q: no output, only the exit status tells whether a pattern matched
    if printf '%s\n' "${array[1]}" | grep -q -f /tmp/pattern_file
    then
        # here you do something with ${array[2]}, ${array[4]} line by line and so on
        :  # placeholder so the otherwise empty branch is valid syntax
    fi
done < "$file_in"
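Two further notes. The loop in the question fails because `'^$itemToFind'` is in single quotes, so the shell passes the literal text `$itemToFind` to grep instead of the variable's value. Writing `"^$itemToFind"` in double quotes lets it expand. On speed, starting a grep process for each of 30k lines is costly. A minimal sketch that avoids this uses bash's built-in `[[ =~ ]]` with a single alternation. It assumes the items contain no regex metacharacters, and the sample data and the names `items`, `rows`, `regex` and `found` are purely illustrative.

```shell
#!/usr/bin/env bash
# Illustrative data; in the real script these would come from $itemsFile and $file_in.
items=(apple sony samsung)
rows='1;samsung black 2014;x
2;nokia blue 2012;y
3;sony red 2013;z'

# Join the items with | to build the regex ^(apple|sony|samsung)
regex="^($(IFS='|'; echo "${items[*]}"))"

found=()
while IFS=';' read -r -a array
do
    # [[ =~ ]] is evaluated inside bash, so no process is started per line
    if [[ ${array[1]} =~ $regex ]]
    then
        # BASH_REMATCH[1] holds the item that matched
        found+=("${BASH_REMATCH[1]}")
        # ${array[2]}, ${array[4]} and so on can still be processed here
    fi
done <<< "$rows"

printf '%s\n' "${found[@]}"
```

In the real script, `mapfile -t items < "$itemsFile"` placed once before the loop, rather than on every iteration, would fill the array.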