Awk

How to merge and modify the columns of two files

  • February 24, 2022

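I need to merge two files on their first and second columns. For columns 3 and 4 I need to add the difference between the two files, and I need a TOTAL row for each type found in column 2. Here are the two input files:

File 1: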

VALIDATION_DATA_DBF           DELETE  226  6.4
TXT_DBF                       DELETE  80   0.15
DEFAULT_PROFILE               SELECT  45   1.2
TRAINING_DBF                  SELECT  130  5.25
TESTING_DBF                   SELECT  5    0.001
WARNING_DBF                   UPDATE  8    0.055
WARNING_DBF                   INSERT  5    2.4

File 2:

VALIDATION_DATA_DBF           DELETE  200  7.4
TXT_DBF                       DELETE  70   1.15
DEFAULT_PROFILE               SELECT  40   0.2
TRAINING_DBF                  SELECT  135  7.25
TESTING_DBF                   SELECT  7    0.009
PERF_DBF                      SELECT  10   0.004
WARNING_DBF                   UPDATE  2    1.055

The merged output file should look like this:

TABLE                TYPE    COUNT1  COUNT2  DIFF_COUNT  TIME1  TIME2  DIFF_TIME
VALIDATION_DATA_DBF  DELETE  226     200     26          6.4    7.4    -1
TXT_DBF              DELETE  80      70      10          0.15   1.15   -1
TOTAL                DELETE  306     270     36          6.55   8.55   -2
DEFAULT_PROFILE      SELECT  45      40      5           1.2    0.2    1
TRAINING_DBF         SELECT  130     135     -5          5.25   7.25   -2
TESTING_DBF          SELECT  5       7       -2          0.001  0.009  -0.008
PERF_DBF             SELECT  0       10      -10         0      0.004  -0.004
TOTAL                SELECT  180     192     -12         6.451  7.463  -1.012
WARNING_DBF          UPDATE  8       2       6           0.055  1.055  -1
TOTAL                UPDATE  8       2       6           0.055  1.055  -1
WARNING_DBF          INSERT  5       0       5           2.4    0      2.4
TOTAL                INSERT  5       0       5           2.4    0      2.4

Using GNU awk for arrays of arrays and ARGIND:

awk '
   # Runs for every line of both files. ARGIND (gawk only) is 1 while
   # reading file1 and 2 while reading file2.
   {
       counts[$2][$1][ARGIND] = $3
       times[$2][$1][ARGIND] = $4
   }
   END {
       print "TABLE", "TYPE", \
           "COUNT1", "COUNT2", "DIFF_COUNT", \
           "TIME1", "TIME2", "DIFF_TIME"

       # Loop order of "in" is unspecified, so groups may not follow input order.
       for ( type in counts ) {
           # Reset the per-type totals before summing this group.
           delete totCounts
           delete totTimes
           for ( table in counts[type] ) {
               # Adding 0 prints a missing entry (table absent from a file) as 0.
               print table, type,                                   \
                   counts[type][table][1]+0,                        \
                   counts[type][table][2]+0,                        \
                   counts[type][table][1] - counts[type][table][2], \
                   times[type][table][1]+0,                         \
                   times[type][table][2]+0,                         \
                   times[type][table][1] - times[type][table][2]

               totCounts[1] += counts[type][table][1]
               totCounts[2] += counts[type][table][2]
               totTimes[1]  += times[type][table][1]
               totTimes[2]  += times[type][table][2]
           }
           # One TOTAL row per type, after all of its tables.
           print "TOTAL", type, \
               totCounts[1], totCounts[2], totCounts[1] - totCounts[2], \
               totTimes[1],  totTimes[2],  totTimes[1]  - totTimes[2]
       }
   }
' file1 file2 | column -t

Output:

TABLE                TYPE    COUNT1  COUNT2  DIFF_COUNT  TIME1  TIME2  DIFF_TIME
VALIDATION_DATA_DBF  DELETE  226     200     26          6.4    7.4    -1
TXT_DBF              DELETE  80      70      10          0.15   1.15   -1
TOTAL                DELETE  306     270     36          6.55   8.55   -2
WARNING_DBF          UPDATE  8       2       6           0.055  1.055  -1
TOTAL                UPDATE  8       2       6           0.055  1.055  -1
WARNING_DBF          INSERT  5       0       5           2.4    0      2.4
TOTAL                INSERT  5       0       5           2.4    0      2.4
PERF_DBF             SELECT  0       10      -10         0      0.004  -0.004
TESTING_DBF          SELECT  5       7       -2          0.001  0.009  -0.008
TRAINING_DBF         SELECT  130     135     -5          5.25   7.25   -2
DEFAULT_PROFILE      SELECT  45      40      5           1.2    0.2    1
TOTAL                SELECT  180     192     -12         6.451  7.463  -1.012
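
The rows above come out in a different order from the requested output because awk visits keys in a `for (key in array)` loop in an unspecified order. If input order matters, or if gawk is not available, something along the following lines might work. It is only a sketch under stated assumptions: `merge_counts` is a hypothetical function name, `fileno` is a hand-rolled stand-in for ARGIND that assumes the first file is not empty, and composite keys joined with SUBSEP replace the arrays of arrays.

```shell
#!/bin/sh
# merge_counts FILE1 FILE2  (hypothetical helper, not part of the original answer)
# POSIX awk only: no arrays of arrays, no ARGIND.
merge_counts() {
    awk '
    # Count files by watching FNR reset to 1.
    # Assumption: FILE1 is not empty, otherwise FILE2 would be counted as file 1.
    FNR == 1 { fileno++ }
    {
        key = $2 SUBSEP $1                  # (type, table) pair
        if (!(key in seen)) {               # remember first-seen order
            seen[key] = 1
            if (!($2 in ntab)) types[++ntypes] = $2
            tables[$2, ++ntab[$2]] = $1
        }
        counts[key, fileno] = $3
        times[key, fileno]  = $4
    }
    END {
        print "TABLE", "TYPE", "COUNT1", "COUNT2", "DIFF_COUNT", "TIME1", "TIME2", "DIFF_TIME"
        for (i = 1; i <= ntypes; i++) {
            type = types[i]
            c1 = c2 = t1 = t2 = 0           # reset the per-type totals
            for (j = 1; j <= ntab[type]; j++) {
                table = tables[type, j]
                key = type SUBSEP table
                # Adding 0 turns a missing entry into 0.
                a = counts[key, 1] + 0; b = counts[key, 2] + 0
                x = times[key, 1] + 0;  y = times[key, 2] + 0
                print table, type, a, b, a - b, x, y, x - y
                c1 += a; c2 += b; t1 += x; t2 += y
            }
            print "TOTAL", type, c1, c2, c1 - c2, t1, t2, t1 - t2
        }
    }' "$1" "$2"
}
```

Called as `merge_counts file1 file2 | column -t`, this should list types and tables in the order they are first seen across the two files, which for the sample data matches the order in the requested output.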

Source: https://unix.stackexchange.com/questions/691916