Text-Processing

Compare 2 columns in 2 different files

  • November 9, 2015

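I need to subtract the second and third columns between 2 files, matching rows on the first column, and to hide every row where diff_column_2 (DIFF-IO) = diff_column_3 (DIFF-SELECT) = 0. Note that the rows can be in any order, except for the last line (Total), which exists in both files and must stay at the END of the output.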

ref_file:

   testing 20 10
   jobs 15 20
   issues 0 1
   work 15 25
   Total 50 56

head_file:

   testing 20 10
   jobs 15 30
   work 12 25
   games 1 2
   Total 48 67

Expected output:

TABLE,REF-IO,HEAD-IO,DIFF-IO,REF-SELECT,HEAD-SELECT,DIFF-SELECT
jobs,15,15,0,20,30,-10
work,15,12,3,25,25,0
games,0,1,-1,0,2,-2
issues,0,0,0,1,0,1
Total,50,48,2,56,67,-11
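
awk '
   BEGIN {
       print "TABLE,REF-IO,HEAD-IO,DIFF-IO,REF-SELECT,HEAD-SELECT,DIFF-SELECT"
       OFS = ","    # no space after the comma, to match the expected output
   }
   # First file (head_file): remember its IO and SELECT columns.
   FNR==NR {
       A[$1]=$2
       B[$1]=$3
       next
   }
   # Second file (ref_file): compute ref minus head for each row.
   {
       if (!($1 in A)) {
           A[$1] = B[$1] = 0    # row missing from head_file counts as 0
       }
       diff_io = $2 - A[$1]
       diff_sel= $3 - B[$1]
       C[$1] = 1                # mark the row as seen in ref_file
   }
   # Print differing rows one step late, so the last one (Total) is held for END.
   diff_io || diff_sel {
       if (first) {
           print line
       }
       first = 1
       line = $1 OFS $2 OFS A[$1] OFS diff_io OFS $3 OFS B[$1] OFS diff_sel
   }
   END {
       # Rows that exist only in head_file.
       for (name in A) {
           if (!(name in C)) {
               print name, 0, A[name], -A[name], 0, B[name], -B[name]
           }
       }
       # Skip the held row when nothing differed at all.
       if (first) {
           print line
       }
   }
   ' head_file ref_file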
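
The `FNR==NR` two-file idiom used above can be tried in isolation. The following is only a minimal sketch: it recreates the sample files from the question in a scratch directory and subtracts just the IO column. The names `tmp`, `out`, and `io` are illustrative and not part of the original answer.

```shell
#!/bin/sh
# Recreate the sample data from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'testing 20 10' 'jobs 15 20' 'issues 0 1' 'work 15 25' 'Total 50 56' > "$tmp/ref_file"
printf '%s\n' 'testing 20 10' 'jobs 15 30' 'work 12 25' 'games 1 2' 'Total 48 67' > "$tmp/head_file"

# FNR restarts at 1 for every input file while NR keeps counting,
# so FNR == NR is true only while the first file is being read.
out=$(awk '
    FNR == NR { io[$1] = $2; next }   # first file (head_file): store the IO column
    { print $1, $2 - io[$1] }         # second file (ref_file): ref IO minus head IO
' "$tmp/head_file" "$tmp/ref_file")

printf '%s\n' "$out"
rm -rf "$tmp"
```

A key that was never stored (here `issues`) evaluates to 0 in arithmetic, which is why rows missing from the first file need no special handling in this reduced form.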

In short, you can accomplish the task with join:

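join -a1 -a2 -e0 -o 0,1.2,2.2,0,1.3,2.3 <(sort ref_file) <(sort head_file) |
awk '
   BEGIN {
       print "TABLE,REF-IO,HEAD-IO,DIFF-IO,REF-SELECT,HEAD-SELECT,DIFF-SELECT"
       OFS = ","
   }
   {
       $4=$2-$3    # DIFF-IO: ref minus head (overwrites the placeholder field)
       $7=$5-$6    # DIFF-SELECT: ref minus head
   }
   $1 == "Total" {
       end=$0      # hold the Total row back for END
       next
   }
   $4!=0 || $7!=0;
   END {
       print end
   }'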
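
To see what the awk stage receives, it may help to look at the output of `join` on its own. The sketch below is an illustration under a few assumptions: `ref_file` is passed as the first input so that field 2 holds the REF value, the sorted copies are written to temporary files instead of using process substitution (so it also runs under plain `sh`), and `LC_ALL=C` is set for both `sort` and `join` so that they agree on collation. The names `tmp` and `joined` are illustrative.

```shell
#!/bin/sh
# Recreate the sample data from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'testing 20 10' 'jobs 15 20' 'issues 0 1' 'work 15 25' 'Total 50 56' > "$tmp/ref_file"
printf '%s\n' 'testing 20 10' 'jobs 15 30' 'work 12 25' 'games 1 2' 'Total 48 67' > "$tmp/head_file"

# join expects both inputs sorted on the join field, in the collation it uses itself.
LC_ALL=C sort "$tmp/ref_file"  > "$tmp/ref.sorted"
LC_ALL=C sort "$tmp/head_file" > "$tmp/head.sorted"

# -a1 -a2 : also emit rows that appear in only one of the files
# -e0     : fill fields missing from the other file with 0
# -o ...  : key, ref IO, head IO, key again (placeholder), ref SELECT, head SELECT
joined=$(LC_ALL=C join -a1 -a2 -e0 -o 0,1.2,2.2,0,1.3,2.3 "$tmp/ref.sorted" "$tmp/head.sorted")

printf '%s\n' "$joined"
rm -rf "$tmp"
```

A row present in only one file, such as `games`, comes out as `games 0 1 games 0 2`. The repeated key in the fourth position is just a placeholder that the awk stage can overwrite with the IO difference.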

Source: https://unix.stackexchange.com/questions/241399