Comparing lines with awk vs. while read LINE

April 24, 2013

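I have two files, one with 17k lines and another with 4k lines. I want to compare positions 115 through 125 of each line in the first file against every line in the second file and, if there is a match, write the entire line from the first file to a new file. I came up with a solution that reads the file with 'cat $filename | while read LINE', but it takes about 8 minutes to complete. Is there another way, such as using 'awk', to reduce the processing time?
My code: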
cat $filename | while read LINE
do
 #read 115 to 125 and then remove trailing spaces and leading zeroes
 vid=`echo "$LINE" | cut -c 115-125 | sed 's,^ *,,; s, *$,,' | sed 's/^[0]*//'`
 exist=0
 #match vid with entire line in id.txt
 exist=`grep -x "$vid" $file_dir/id.txt | wc -l`
 if [[ $exist -gt 0 ]]; then
   echo "$LINE" >> $dest_dir/id.txt
 fi
done

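The loop is slow because it starts several processes (cut, sed twice, grep, wc) for every one of the 17k lines, and grep rescans id.txt each time. The following awk script reads each file only once and should work; it has been updated to strip whitespace: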

#!/usr/bin/awk -f
# NR is the current line number (doesn't reset between files)
# FNR is the line number within the current file
# So NR == FNR  takes only the first file
NR == FNR {
   # Mark the current line as existing, via an associative array.
   found[$0]=1

   # Skip to the next line, so we don't go through the next block
   next
}
{
   # Take the columns we're looking for
   cols = substr($0,115,11)

   # Strip whitespace (space and tab) from the beginning (^) and end ($) 
   gsub(/^[ \t]+/,"", cols)
   gsub(/[ \t]+$/,"", cols)

   # Check the associative array to see if this was in the first file
   # If so, print the full line
   if(found[cols]) print;
}

Put this in a file and call it in one of the following ways (the second form requires the file to be executable):

awk -f script.awk patterns.txt full.txt
./script.awk patterns.txt full.txt
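
The loop in the question also removes leading zeroes from the extracted field, which the script above does not do. Below is a minimal sketch of one way to add that step, assuming the IDs in patterns.txt are stored without leading zeroes. The sample data, the temporary directory and the matched.txt output name are made up for this demonstration and are not part of the original question or answer.

```shell
#!/bin/sh
# Hypothetical sample data, created only for this demonstration.
workdir=$(mktemp -d)

# patterns.txt: one ID per line, playing the role of id.txt in the question.
printf '%s\n' 42 777 > "$workdir/patterns.txt"

# full.txt: fixed-width records whose ID occupies columns 115-125.
# Each line is 114 filler characters, an 11-character ID field, then a tail.
filler=$(printf '%114s' '' | tr ' ' 'x')
{
    printf '%s%s%s\n' "$filler" '00000000042' ' first'    # zero-padded 42
    printf '%s%s%s\n' "$filler" '        777' ' second'   # space-padded 777
    printf '%s%s%s\n' "$filler" '00000000099' ' third'    # 99 is not a pattern
} > "$workdir/full.txt"

awk '
# First file: remember every ID in an associative array.
NR == FNR { found[$0] = 1; next }
{
    # Columns 115-125 of the current record.
    cols = substr($0, 115, 11)

    # Trim surrounding whitespace, then drop leading zeroes,
    # in the same order as the sed steps in the question.
    gsub(/^[ \t]+/, "", cols)
    gsub(/[ \t]+$/, "", cols)
    sub(/^0+/, "", cols)

    # "in" tests membership without creating a new array key.
    if (cols in found) print
}
' "$workdir/patterns.txt" "$workdir/full.txt" > "$workdir/matched.txt"

# Count the matching lines (tr removes padding that some wc versions add).
matched_count=$(wc -l < "$workdir/matched.txt" | tr -d ' ')
echo "matched lines: $matched_count"
```

With this sample data, the records for 42 and 777 are written to matched.txt and the record for 99 is skipped. Redirecting awk's output to a file replaces the per-line `>>` append used in the original loop.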

Source: https://unix.stackexchange.com/questions/73555
