Compare multiple columns of two files with awk and print the output

  • June 7, 2022

File 1:

cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0701_DS005 2022-06-02 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 284 F30109_THK_T0_CFPUR0701_DS005 2022-06-03 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 285 F30109_THK_T0_CFPUR0701_DS005 2022-06-04 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 286 F30109_THK_T0_CFPUR0701_DS005 2022-06-05 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 287 F30109_THK_T0_CFPUR0701_DS005 2022-06-06 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 288 F30109_THK_T0_CFPUR0701_DS005 2022-06-07 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS006 hostgroups-1d1w 282 F30109_THK_T0_CFPUR0701_DS006 2022-06-01 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0702_DS006 2022-06-02 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 284 F30109_THK_T0_CFPUR0702_DS006 2022-06-03 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 285 F30109_THK_T0_CFPUR0702_DS006 2022-06-04 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 286 F30109_THK_T0_CFPUR0702_DS006 2022-06-05 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 287 F30109_THK_T0_CFPUR0702_DS006 2022-06-06 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS006 hostgroups-1d1w 288 F30109_THK_T0_CFPUR0703_DS006 2022-06-07 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 282 F30109_THK_T0_CFPUR0703_DS007 2022-06-01 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0703_DS007 2022-06-02 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 284 F30109_THK_T0_CFPUR0703_DS007 2022-06-03 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 285 F30109_THK_T0_CFPUR0703_DS007 2022-06-04 00:00:00

File 2:

cfpur0701 hostgroups-1d1w 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0701 hostgroups-1d1w 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0701 hostgroups-1d1w 2022-06-04 00:00:00 2022-06-11 00:00:00
cfpur0701 hostgroups-1d1w 2022-06-05 00:00:00 2022-06-12 00:00:00
cfpur0701 hostgroups-1d1w 2022-06-06 00:00:00 2022-06-13 00:00:00
cfpur0701 hostgroups-1d1w 2022-06-07 00:00:00 2022-06-14 00:00:00
cfpur0701 hostgroups-1d1w 2022-06-01 00:00:00 2022-06-08 00:00:00
cfpur0702 hostgroups-1d1w 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0702 hostgroups-1d1w 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0702 hostgroups-1d1w 2022-06-04 00:00:00 2022-06-11 00:00:00
cfpur0702 hostgroups-1d1w 2022-06-05 00:00:00 2022-06-12 00:00:00
cfpur0702 hostgroups-1d1w 2022-06-06 00:00:00 2022-06-13 00:00:00
cfpur0702 hostgroups-1d1w 2022-06-07 00:00:00 2022-06-14 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-01 00:00:00 2022-06-08 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-04 00:00:00 2022-06-11 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-05 00:00:00 2022-06-12 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-06 00:00:00 2022-06-13 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-07 00:00:00 2022-06-14 00:00:00
cfpur0801 hostgroups-1d1w 2022-06-01 00:00:00 2022-06-08 00:00:00
cfpur0801 hostgroups-1d1w 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0801 hostgroups-1d1w 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0801 hostgroups-1d1w 2022-06-04 00:00:00 2022-06-11 00:00:00

Expected output:

cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0701_DS005 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 284 F30109_THK_T0_CFPUR0701_DS005 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 285 F30109_THK_T0_CFPUR0701_DS005 2022-06-04 00:00:00 2022-06-11 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 286 F30109_THK_T0_CFPUR0701_DS005 2022-06-05 00:00:00 2022-06-12 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 287 F30109_THK_T0_CFPUR0701_DS005 2022-06-06 00:00:00 2022-06-13 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 288 F30109_THK_T0_CFPUR0701_DS005 2022-06-07 00:00:00 2022-06-14 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS006 hostgroups-1d1w 282 F30109_THK_T0_CFPUR0701_DS006 2022-06-01 00:00:00 2022-06-08 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0702_DS006 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 284 F30109_THK_T0_CFPUR0702_DS006 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 285 F30109_THK_T0_CFPUR0702_DS006 2022-06-04 00:00:00 2022-06-11 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 286 F30109_THK_T0_CFPUR0702_DS006 2022-06-05 00:00:00 2022-06-12 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 287 F30109_THK_T0_CFPUR0702_DS006 2022-06-06 00:00:00 2022-06-13 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS006 hostgroups-1d1w 288 F30109_THK_T0_CFPUR0703_DS006 2022-06-07 00:00:00 2022-06-14 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 282 F30109_THK_T0_CFPUR0703_DS007 2022-06-01 00:00:00 2022-06-08 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0703_DS007 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 284 F30109_THK_T0_CFPUR0703_DS007 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 285 F30109_THK_T0_CFPUR0703_DS007 2022-06-04 00:00:00 2022-06-11 00:00:00

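The number of lines in file1 is always greater than in file2. I need to do the following:

Compare column 1 of file1 with column 1 of file2. If they match, compare column 6 of file1 with column 2 of file2, and finally compare column 9 of file1 with column 3 of file2. When all three conditions are met, take the timestamp (columns 5 and 6) from file2 and append it to the matching line of file1 to produce the output file.

I tried a few versions of awk with NR==FNR and did not make much progress.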

awk 'FNR==NR{map[$1,$2,$3,$4] = $5 FS $6; next}
    map[$1,$6,$9,$10] != "" {print $0,map[$1,$6,$9,$10]}' file2 file1

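I used both the date and the time fields in the map key because I assume you need that (they are matched against the last two columns of file1).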
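If only the three columns named in the question should form the key (host, group, and date, without the time), the key can be shortened accordingly. The following is a minimal sketch under that assumption. The temporary file names, the variable by_date, and the second file1 row with its 12:30:00 time are made up for illustration and are not part of the question's data.

```shell
tmp=$(mktemp -d)
# One row copied from file2 in the question.
cat > "$tmp/file2" <<'EOF'
cfpur0701 hostgroups-1d1w 2022-06-02 00:00:00 2022-06-09 00:00:00
EOF
# Row 1 is copied from file1 in the question; row 2 is a made-up variant
# that differs only in its time field.
cat > "$tmp/file1" <<'EOF'
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0701_DS005 2022-06-02 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0701_DS005 2022-06-02 12:30:00
EOF

# Key on host, group and date only; the time field is no longer compared.
by_date=$(awk 'FNR==NR { map[$1,$2,$3] = $5 FS $6; next }
               map[$1,$6,$9] != "" { print $0, map[$1,$6,$9] }' "$tmp/file2" "$tmp/file1")
printf '%s\n' "$by_date"
rm -rf "$tmp"
```

With the shorter key both file1 rows get the timestamp appended, whereas the four-field key would match only the first row.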

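During the pass over the first file (FNR==NR), we store the fields to be appended (the last two) in an ordinary awk array, using the first four fields as the key. The next statement is needed so that the rest of the code is not executed for any line of the first file.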
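The way FNR==NR and next separate the two passes can be seen with a small trace. This is only an illustrative sketch: the files first.txt and second.txt and the variable pass_trace are hypothetical and unrelated to the question's data.

```shell
tmp=$(mktemp -d)
# Hypothetical two-line sample files, created only for this demonstration.
printf 'a 1\nb 2\n' > "$tmp/first.txt"
printf 'a x\nc y\n' > "$tmp/second.txt"

# NR counts records across all input files; FNR restarts at 1 for each file.
# FNR==NR is therefore true only while the first file is being read.
# next skips the remaining rules, so the second rule sees only the second file.
pass_trace=$(awk 'FNR==NR { print "first:", $0; next }
                  { print "second:", $0 }' "$tmp/first.txt" "$tmp/second.txt")
printf '%s\n' "$pass_trace"
rm -rf "$tmp"
```

One caveat of this idiom: if the first file is empty, FNR==NR is also true for the second file, so its lines would be treated as first-pass input.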

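For the second file, we print the line with the stored array value appended, but only when the selected fields form a key that already exists in the array.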
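The existence check can also be written with awk's in operator, which tests for a key without creating an empty array element as a side effect of the lookup. The sketch below is a variant, not the command as originally posted. It uses a few rows taken from the question plus one made-up file1 row (host cfpur9999) that has no partner in file2; the temporary file names and the variable joined are likewise assumptions.

```shell
tmp=$(mktemp -d)
# Two rows copied from file2 in the question.
cat > "$tmp/file2" <<'EOF'
cfpur0701 hostgroups-1d1w 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0801 hostgroups-1d1w 2022-06-01 00:00:00 2022-06-08 00:00:00
EOF
# Row 1 is copied from file1 in the question; row 2 is a made-up host
# that does not appear in file2 and must therefore be dropped.
cat > "$tmp/file1" <<'EOF'
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0701_DS005 2022-06-02 00:00:00
cfpur9999 Pgroup F30109 cf3010922 F30109_FPUR9999_PRD03E_DS005 hostgroups-1d1w 283 F30109_THK_T0_CFPUR9999_DS005 2022-06-02 00:00:00
EOF

# (a,b,c,d) in map checks membership of the composite key without
# inserting it, unlike a plain map[a,b,c,d] reference.
joined=$(awk 'FNR==NR { map[$1,$2,$3,$4] = $5 FS $6; next }
              ($1,$6,$9,$10) in map { print $0, map[$1,$6,$9,$10] }' "$tmp/file2" "$tmp/file1")
printf '%s\n' "$joined"
rm -rf "$tmp"
```

Only the cfpur0701 row is printed, with 2022-06-09 00:00:00 appended. For large inputs the in form also avoids growing the array with one empty entry per unmatched line.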

Source: https://unix.stackexchange.com/questions/705322