Text-Processing
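Duplicate values in a .csv file
Hi all, I have this script, which I have used without problems until now. It builds a .csv from two files produced by another script; the files contain information about devices connected to the network. This is what the files look like when the error occurs.
file1.dat: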
SN: FCQ1632Y0UQ Estadio_Admon ip_address: 148.000.000.123
file2.dat:
Device ID: ESTADIO_19
IP address: 148.000.000.119
Interface: FastEthernet0/3
Port ID (outgoing port): GigabitEthernet0

Device ID: ESTADIO_18
IP address: 148.000.000.118
Interface: FastEthernet0/4
Port ID (outgoing port): GigabitEthernet0

Device ID: ESTADIO_16
IP address: 148.000.000.116
Interface: FastEthernet0/6
Port ID (outgoing port): GigabitEthernet0

Device ID: ESTADIO_PALCOS
IP address: 148.000.000.66
Interface: GigabitEthernet0/2
Port ID (outgoing port): GigabitEthernet0/1
SN: FCQ1632Y0US

Device ID: ESTADIO_22
IP address: 148.000.000.122
Interface: FastEthernet0/8
Port ID (outgoing port): GigabitEthernet0

Device ID: SIPCCEF485DE89A
IP address: 148.000.000.92
Interface: FastEthernet0/16
Port ID (outgoing port): Port 1

Device ID: SIPCCEF485DE87B
IP address: 148.000.000.72
Interface: FastEthernet0/13
Port ID (outgoing port): Port 1

Device ID: SIPCCEF485E5719
IP address: 148.000.000.76
Interface: FastEthernet0/17
Port ID (outgoing port): Port 1

Device ID: SIPCCEF485DE894
IP address: 148.000.000.84
Interface: FastEthernet0/14
Port ID (outgoing port): Port 1

Device ID: ESTADIO_TAQUILLAS
IP address: 148.000.000.125
Interface: GigabitEthernet0/1
Port ID (outgoing port): GigabitEthernet1/0/27
SN: FOC1616Y091
Script:
awk -v orig=$(awk '$1=="SN:" {print $2}' file1.dat) '
BEGIN {
    RS = "\n\n"
    FS = "\n"
    OFS = ","
    print "Device_SN_O,Device_SN_D,Interface,Port_ID"
}
{
    for(i=1; i<=NF; i++) {
        split($i, a, ": ");
        k[a[1]] = a[2]
    }
    print orig, k["SN"], k["Interface"], k["Port ID (outgoing port)"]
}' file2.dat>final.csv
Expected output:
Device_SN_O,Device_SN_D,Interface,Port_ID
FCQ1632Y0UQ,,FastEthernet0/3,GigabitEthernet0
FCQ1632Y0UQ,,FastEthernet0/4,GigabitEthernet0
FCQ1632Y0UQ,,FastEthernet0/6,GigabitEthernet0
FCQ1632Y0UQ,FCQ1632Y0US,GigabitEthernet0/2,GigabitEthernet0/1
FCQ1632Y0UQ,,FastEthernet0/8,GigabitEthernet0
FCQ1632Y0UQ,,FastEthernet0/16,Port 1
FCQ1632Y0UQ,,FastEthernet0/13,Port 1
FCQ1632Y0UQ,,FastEthernet0/17,Port 1
FCQ1632Y0UQ,,FastEthernet0/14,Port 1
FCQ1632Y0UQ,FOC1616Y091,GigabitEthernet0/1,GigabitEthernet1/0/27
Output I get:
Device_SN_O,Device_SN_D,Interface,Port_ID
FCQ1632Y0UQ,,FastEthernet0/3,GigabitEthernet0
FCQ1632Y0UQ,,FastEthernet0/4,GigabitEthernet0
FCQ1632Y0UQ,,FastEthernet0/6,GigabitEthernet0
FCQ1632Y0UQ,FCQ1632Y0US,GigabitEthernet0/2,GigabitEthernet0/1
FCQ1632Y0UQ,FCQ1632Y0US,FastEthernet0/8,GigabitEthernet0
FCQ1632Y0UQ,FCQ1632Y0US,FastEthernet0/16,Port 1
FCQ1632Y0UQ,FCQ1632Y0US,FastEthernet0/13,Port 1
FCQ1632Y0UQ,FCQ1632Y0US,FastEthernet0/17,Port 1
FCQ1632Y0UQ,FCQ1632Y0US,FastEthernet0/14,Port 1
FCQ1632Y0UQ,FOC1616Y091,GigabitEthernet0/1,GigabitEthernet1/0/27
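As you can see, the Device_SN_D value keeps repeating until a different one is found. I have used the same script on other data sets, and this is the first one that gives me this error.
I hope you can help me solve this.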
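When a record in your data has no "SN: ..." line, nothing assigns an empty value to k["SN"], so the value from the previous record that had one is still there. You just need to add this before the next record is processed:
delete k
(see https://unix.stackexchange.com/a/147958/27616), so that the next record is processed with a "fresh" k array. For example: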
awk -v orig=$(awk '$1=="SN:" {print $2}' file1.dat) '
BEGIN {
    RS = "\n\n"
    FS = "\n"
    OFS = ","
    print "Device_SN_O,Device_SN_D,Interface,Port_ID"
}
{
    for(i=1; i<=NF; i++) {
        split($i, a, ": ");
        k[a[1]] = a[2]
    }
    print orig, k["SN"], k["Interface"], k["Port ID (outgoing port)"]
    delete k    # so that the next record is processed with an emptied k array
}' file2.dat>final.csv
With the data you provided, this gives the following in final.csv:
Device_SN_O,Device_SN_D,Interface,Port_ID
FCQ1632Y0UQ,,FastEthernet0/3,GigabitEthernet0
FCQ1632Y0UQ,,FastEthernet0/4,GigabitEthernet0
FCQ1632Y0UQ,,FastEthernet0/6,GigabitEthernet0
FCQ1632Y0UQ,FCQ1632Y0US,GigabitEthernet0/2,GigabitEthernet0/1
FCQ1632Y0UQ,,FastEthernet0/8,GigabitEthernet0
FCQ1632Y0UQ,,FastEthernet0/16,Port 1
FCQ1632Y0UQ,,FastEthernet0/13,Port 1
FCQ1632Y0UQ,,FastEthernet0/17,Port 1
FCQ1632Y0UQ,,FastEthernet0/14,Port 1
FCQ1632Y0UQ,FOC1616Y091,GigabitEthernet0/1,GigabitEthernet1/0/27
as expected.
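To see the stale-value effect in isolation, here is a minimal sketch on made-up data. The three-record input and the names `input`, `stale`, and `fresh` are hypothetical and not part of the original files. Two things are swapped in as assumptions: it uses paragraph mode (`RS = ""`) instead of `RS = "\n\n"`, since a multi-character RS is not guaranteed to work in every awk, and it clears the array with `split("", k)`, which should behave like `delete k` in awks that lack the whole-array form.

```shell
# Hypothetical three-record input: only the second record has an SN line.
input='Interface: Fa0/1

SN: ABC123
Interface: Gi0/2

Interface: Fa0/3'

# Without a reset, k["SN"] keeps the value left by the previous record.
stale=$(printf '%s\n' "$input" | awk '
BEGIN { RS = ""; FS = "\n"; OFS = "," }
{
    for (i = 1; i <= NF; i++) { split($i, a, ": "); k[a[1]] = a[2] }
    print k["SN"], k["Interface"]
}')

# Clearing k at the start of each record gives every record an empty array.
fresh=$(printf '%s\n' "$input" | awk '
BEGIN { RS = ""; FS = "\n"; OFS = "," }
{
    split("", k)    # portable way to empty the array
    for (i = 1; i <= NF; i++) { split($i, a, ": "); k[a[1]] = a[2] }
    print k["SN"], k["Interface"]
}')

printf 'without reset:\n%s\n\nwith reset:\n%s\n' "$stale" "$fresh"
```

Without the reset, the third record should print as ABC123,Fa0/3 (the SN is carried over from the second record); with the reset it should print as ,Fa0/3. Clearing at the top of the record block rather than after the print is only a placement choice; either position empties the array once per record.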