Text-Processing
Comparing lines with awk vs. while read line
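I have two files, one with 17k lines and the other with 4k lines. I want to compare positions 115 through 125 of each line in the first file against every line in the second file and, if there is a match, write the entire line from the first file to a new file. I came up with a solution using 'cat $filename | while read LINE', but it takes about 8 minutes to complete. Is there another way, such as using 'awk', to reduce the processing time?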
My code:
cat $filename | while read LINE
do
    # read 115 to 125 and then remove trailing spaces and leading zeroes
    vid=`echo "$LINE" | cut -c 115-125 | sed 's,^ *,,; s, *$,,' | sed 's/^[0]*//'`
    exist=0
    # match vid with entire line in id.txt
    exist=`grep -x "$vid" $file_dir/id.txt | wc -l`
    if [[ $exist -gt 0 ]]; then
        echo "$LINE" >> $dest_dir/id.txt
    fi
done
The following should work (updated to strip whitespace):
#!/usr/bin/awk -f

# NR is the current line number (doesn't reset between files)
# FNR is the line number within the current file
# So NR == FNR takes only the first file
NR == FNR {
    # Mark the current line as existing, via an associative array.
    found[$0]=1

    # Skip to the next line, so we don't go through the next block
    next
}

{
    # Take the columns we're looking for
    cols = substr($0,115,11)

    # Strip whitespace (space and tab) from the beginning (^) and end ($)
    gsub(/^[ \t]+/,"", cols)
    gsub(/[ \t]+$/,"", cols)

    # Check the associative array to see if this was in the first file
    # If so, print the full line
    if(found[cols]) print;
}
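Put it in a file and invoke it in one of the following ways, where patterns.txt is the file of values to match and full.txt is the file whose columns 115-125 are checked (the second form requires the script to be executable, e.g. after chmod +x script.awk):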
awk -f script.awk patterns.txt full.txt
./script.awk patterns.txt full.txt
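
The script above trims whitespace but does not remove leading zeroes, which the loop in the question did with sed 's/^[0]*//'. If that step matters for your data, a minimal sketch of a variant that does both could look like the following. The function name filter_by_id and its two-argument interface are hypothetical, chosen only for illustration, and the sketch assumes the ID file is not empty (otherwise NR == FNR would also hold for the second file).

```shell
# filter_by_id IDS_FILE DATA_FILE   (hypothetical helper, not from the original answer)
# Prints every line of DATA_FILE whose columns 115-125, after trimming blanks
# and leading zeroes, are exactly equal to some whole line of IDS_FILE.
filter_by_id() {
    awk '
        # First file only: remember each ID line as an array key.
        NR == FNR { ids[$0] = 1; next }

        {
            # Columns 115-125 are 11 characters starting at position 115.
            vid = substr($0, 115, 11)

            # Trim leading and trailing spaces/tabs.
            sub(/^[ \t]+/, "", vid)
            sub(/[ \t]+$/, "", vid)

            # Drop leading zeroes, like sed "s/^[0]*//" in the question.
            sub(/^0+/, "", vid)

            # "in" tests membership without creating a new array entry.
            if (vid in ids) print
        }
    ' "$1" "$2"
}
```

With the variables from the question, it might be called as filter_by_id "$file_dir/id.txt" "$filename" >> "$dest_dir/id.txt". Note that this compares the value as a literal string, whereas grep -x "$vid" treats it as a regular expression, so results can differ if the values contain regex metacharacters.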