How can I compare two files line by line?

April 16, 2019

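I have two files, A and B, that are almost identical: some lines differ and some are in a different order. Because both are SystemVerilog files, the lines also contain special characters such as ; , = + and so on.
I want to loop over each line of fileA and check whether fileB contains a matching line. The comparison should follow these rules:
1. Leading and trailing whitespace on a line can be ignored.
2. Multiple spaces/tabs between words can be treated as a single space.
3. Empty lines can be ignored.
The result should show the lines that are present in file A but not in file B.
I tried tkdiff, but because some of the lines are out of order it reports a lot of differences.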

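I can't speak to its portability, but I have tried to cover all the bases. I did my best to replicate the two files for testing, based on the information you gave. If you run into special-character problems with sed, you can escape them in the second line of the cleanLine function.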
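To make the escaping remark concrete, here is a minimal sketch of how a SystemVerilog-style line could be escaped before it is used as a sed pattern. The helper name `escape_bre` and the sample line are assumptions made up for illustration; they are not part of the script that follows.

```shell
#!/bin/bash
# Hypothetical helper (not from the original answer): escape a string so
# that sed treats it literally inside a /.../ basic-regex address.
escape_bre() {
  # Characters escaped: ] \ / $ * . ^ [
  # '|' is the s-command delimiter here, so '/' needs no special handling.
  printf '%s\n' "$1" | sed 's|[]\/$*.^[]|\\&|g'
}

# Sample line, invented for this example.
line='assign a[3:0] = b.c * d;'
pattern=$(escape_bre "$line")

# Only lines that are literally identical to $line should be printed.
printf '%s\n' "$line" 'assign a3 = bxc d;' | sed -n "/^${pattern}\$/p"
```

Without the escaping step, `[3:0]` would be read as a bracket expression and `.` as "any character", so the unescaped pattern could match lines it should not.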

#!/bin/bash

# compare two files and return lines in
# first file that are missing in second file

ProgName=${0##*/}
Pid=$$
CHK_FILE="$1"
REF_FILE="$2"
D_BUG="$3"
TMP_FILE="/tmp/REF_${Pid}.tmp"
declare -a MISSING=()
m=0

scriptUsage() {
cat <<ENDUSE

   $ProgName  <file_to_check> <reference_file> [-d|--debug]

   Lines in 'file_to_check' not present in 'reference_file'
     are printed to standard output.

   file_to_check:     File being checked
   reference_file:    File to be checked against
   -d|--debug:        Run script in debug mode (Optional)
   -h|--help:         Print this help message

ENDUSE
}

# delete temp file on any exit
trap 'rm -f "$TMP_FILE"' EXIT


#-- check args
 [[ $CHK_FILE == "-h" || $CHK_FILE == "--help" ]] && { scriptUsage; exit 0; }
 [[ -n $CHK_FILE && -n $REF_FILE ]] || { >&2 echo "Not enough arguments!"; scriptUsage; exit 1; }
 [[ $D_BUG == "-d" || $D_BUG == "--debug" ]] && set -x
 [[ -s $CHK_FILE ]] || { >&2 echo "File $CHK_FILE not found or empty"; exit 1; }
 [[ -s $REF_FILE ]] || { >&2 echo "File $REF_FILE not found or empty"; exit 1; }
#--


#== edit temp file to match the 3 comparison rules
 # copy ref file to temp for editing
 cp "$REF_FILE" "$TMP_FILE" || { >&2 echo "Unable to create temporary file"; exit 1; }
 # rule 3 - ignore empty lines
 sed -i '/^\s*$/d' $TMP_FILE
 # rule 1 - ignore begin/end of line spaces
 sed -i 's/^[[:space:]][[:space:]]*//;s/[[:space:]][[:space:]]*$//' $TMP_FILE
 # rule 2 - multi space/tab as single space
 sed -i 's/[[:space:]][[:space:]]*/ /g' $TMP_FILE
#==


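# function to clean LINE to match the 3 rules
# & escape '/' and the regex metacharacters ] \ $ * . ^ [ for the later sed command
cleanLine() {
 # printf with a quoted argument avoids word splitting and glob expansion (e.g. of '*')
 var=$(printf '%s\n' "$1" | sed 's/^[[:space:]][[:space:]]*//;s/[[:space:]][[:space:]]*$//;s/[[:space:]][[:space:]]*/ /g')
 printf '%s\n' "$var" | sed 's|[]\/$*.^[]|\\&|g'
}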


### parse check file
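while IFS='' read -r LINE || [[ -n $LINE ]]
 do
   # rule 3 - skip empty and whitespace-only lines
   if [[ -z ${LINE//[[:space:]]/} ]]
     then
       continue
     else
       CLN_LINE=$(cleanLine "$LINE")
       # anchor with ^...$ so that only a whole-line match counts
       FOUND=$(sed -n "/^${CLN_LINE}\$/{p;q}" "$TMP_FILE")
       [[ -z $FOUND ]] && MISSING[$m]="$LINE" && ((m++))
       FOUND=""
   fi
done < "$CHK_FILE"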
###


#++ print missing line(s) (if any)
 if (( m > 0 ))
   then
     printf "\n  Missing line(s) found:\n"
     #*SEE BELOW ON THIS
     for (( p=0; p<m; p++ ))
       do
         printf "    %s\n" "${MISSING[$p]}"
     done
     echo
   else
     printf "\n  **No missing lines found**\n\n"
 fi
#* using an unquoted 'for p in ${MISSING[@]}' causes:
#* "SPACED LINES" to become:
#* "SPACED"
#* "LINES" when printed to stdout!
#* (a quoted "${MISSING[@]}" keeps each line intact)
#++
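The closing comment in the script warns that `for p in ${MISSING[@]}` breaks "SPACED LINES" into two words. A small sketch of the quoted form, which keeps each element whole, is shown here. The function name `print_missing` and the sample array contents are assumptions for illustration only.

```shell
#!/bin/bash
# Hypothetical helper: print every argument on its own indented line.
print_missing() {
  local line
  for line in "$@"; do
    printf '    %s\n' "$line"
  done
}

# Sample data, invented for this example.
MISSING=("SPACED LINES" "assign  y = a + b;")

# Quoted expansion: each array element is passed as exactly one argument,
# so spaces inside an element survive.
print_missing "${MISSING[@]}"
```

The index-based loop in the script gives the same result; the quoted expansion is simply the shorter way to write it.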

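For comparison, the same three rules can also be sketched more compactly by normalizing both files and letting grep do fixed-string, whole-line matching. This is not the method used in the script above; it is an alternative under the assumption that bash (for process substitution) and a grep supporting `-F`, `-x`, `-v`, and `-f` are available. The function names `normalize` and `missing_lines` are made up for this example.

```shell
#!/bin/bash
# Apply the three rules to stdin: trim both ends, collapse runs of
# whitespace to one space, and drop lines that end up empty.
normalize() {
  sed 's/^[[:space:]]*//; s/[[:space:]]*$//; s/[[:space:]][[:space:]]*/ /g; /^$/d'
}

# Print the normalized lines of file $1 that have no normalized
# counterpart in file $2.
missing_lines() {
  # -F  treat patterns as fixed strings (no regex escaping needed)
  # -x  match whole lines only
  # -v  invert: print lines that match none of the patterns
  # -f  read the patterns from the normalized reference file
  normalize < "$1" | grep -Fxv -f <(normalize < "$2")
}
```

It could be called as `missing_lines fileA.sv fileB.sv`, where the file names are placeholders. Note that this sketch prints the normalized form of each missing line rather than the original text, and grep returns a non-zero exit status when nothing is missing.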
Source: https://unix.stackexchange.com/questions/394811
