Ubuntu
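Create a new column in a CSV file based on the values of 2 other columns
I have a CSV with 23 columns containing data from a network scan. I need to create a new column based on the data in the last two columns (22 and 23). The output I want is as follows:
New column header = Flagged
If column 22 = Malicious and column 23 = C&C-FileDownload, then new column 24 = 1
Can someone help me achieve this on Ubuntu? I have been researching this and can see that awk is the tool to use, but I am very new to it.
So far I have tried: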
awk 'NR==1{$24="merge";print;next} \ $22 == "Malicious" || $23 == "C&C-FileDownload" {$24=1}1' Malware-44-1.csv > test1.csv
But it did not add the new column with "1". It did add "merge" as a column, but did not separate it with a comma. I am now using the following command:
awk -F, -v OFS=',' 'NR==1{ $24="merge"; print; next } { $24=($22 == "Malicious" && $23 == "C&C-FileDownload") }1 ' master.csv > output1.csv
The output now looks like this:
ts,uid,id.orig_h,id.orig_p,id.resp_h,id.resp_p,proto,service,duration,orig_bytes,resp_bytes,conn_state,local_orig,local_resp,missed_bytes,history,orig_pkts,orig_ip_bytes,resp_pkts,resp_ip_bytes,tunnel_parents,label,detailed-label ,merge
1547150789.067208,CzsY0D4B96NTr8m7ld,192.168.1.199,59222,46.101.251.172,80,tcp,http,1.686784,149,171750,SF,-,-,11584,ShADadttfF,122,7741,122,178102,-,Malicious,C&C-FileDownload ,0
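The `0` on a row that looks like it should match, together with the space before `,merge` and `,0`, hints that column 23 may end in trailing whitespace or a Windows carriage return, which would make `$23 == "C&C-FileDownload"` false. That is only a guess from the output above, not something confirmed for this file. A minimal sketch, run on a made-up row with a trailing `\r` (the `x` filler fields and the variable names are placeholders), that strips such characters before comparing:

```shell
# Hypothetical row: 21 filler fields plus the two label columns, ending in CR.
filler=$(printf 'x,%.0s' $(seq 1 21))
row=$(printf '%sMalicious,C&C-FileDownload\r' "$filler")

flagged=$(printf '%s\n' "$row" | awk -F, -v OFS=, '
    { sub(/[ \t\r]+$/, "") }    # drop trailing blanks and CR from the record
    { $24 = ($22 == "Malicious" && $23 == "C&C-FileDownload") }
    1')

printf '%s\n' "$flagged"
```

If the real file does have Windows line endings, the same `sub()` rule could be placed in front of the rules in the full command.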
This now almost works, but only the last condition in the list takes effect:
awk -F, -v OFS=, 'NR==1{ $24="label1"; print; next } { $24=($22 == "Malicious" && $23 == "C&C")?0:"" } { $24=($22 == "Malicious" && $23 == "C&C-FileDownload")?1:"" } { $24=($22 == "Malicious" && $23 == "C&C-HeartBeat")?2:"" } { $24=($22 == "Malicious" && $23 == "C&C-HeartBeat-Attack")?3:"" } { $24=($22 == "Malicious" && $23 == "C&C-HeartBeat-FileDownload")?4:"" } { $24=($22 == "Malicious" && $23 == "C&C-Mirai")?5:"" } { $24=($22 == "Malicious" && $23 == "C&C-Torii")?6:"" } { $24=($22 == "Malicious" && $23 == "DDoS")?7:"" } { $24=($22 == "Malicious" && $23 == "FileDownload")?8:"" } { $24=($22 == "Malicious" && $23 == "Okiru")?9:"" } { $24=($22 == "Malicious" && $23 == "Okiru-Attack")?10:"" } { $24=($22 == "Malicious" && $23 == "PartOfAHorizontalPortScan")?11:"" } { $24=($22 == "Malicious" && $23 == "PartOfAHorizontalPortScan-Attack")?12:"" } { $24=($22 == "Malicious" && $23 == "C&C-PartOfAHorizontalPortScan")?13:"" } { $24=($22 == "Malicious" && $23 == "Attack")?14:"" } { $24=($22 == "Benign" && $23 == "-")?15:"" } 1' master.csv > masteroutput1.csv
Because I was getting a syntax error, I removed the parenthesis after the "".
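You need to tell awk what the input field separator is. With -F, we tell it that the separator is a comma character. You also need to tell it what the output field separator is; with -v OFS=, we specify that it should be a comma character as well.
awk -F, -v OFS=, 'NR==1{ $24="merge"; print; next } { $24=($22 == "Malicious" && $23 == "C&C-FileDownload") }1 ' Malware-44-1.csv > output.csv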
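To see what the two options change, here is a small sketch on a made-up 23-column row (the `x` filler fields and the variable names are placeholders, not taken from the real file). Without `-F,` awk treats the whole line as a single field, so assigning `$24` pads the record with space-separated empty fields. With `-F, -v OFS=,` the value is appended as a proper 24th comma-separated column.

```shell
# Hypothetical sample row: 21 filler fields, then the two label columns.
filler=$(printf 'x,%.0s' $(seq 1 21))
row="${filler}Malicious,C&C-FileDownload"

# No -F: the line is one field, so $24 is joined with spaces, not commas.
no_sep=$(printf '%s\n' "$row" | awk '{ $24 = 1 } 1')

# With -F, and OFS=, the record is split and re-joined on commas.
with_sep=$(printf '%s\n' "$row" | awk -F, -v OFS=, '{ $24 = 1 } 1')

printf '%s\n%s\n' "$no_sep" "$with_sep"
```

The first result matches the symptom described in the question: the new value is there, but it is not separated by a comma.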
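I also updated the command so that column #24 is 0 if the condition is not met and 1 otherwise, so all records have the same number of columns.
If you want to leave that column empty instead of filling it with 0, then: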
awk -F, -v OFS=, 'NR==1{ $24="merge"; print; next } { $24=(($22 == "Malicious" && $23 == "C&C-FileDownload")?1:"") }1 ' Malware-44-1.csv > output.csv
To define multiple rules, do the following:
awk -F, -v OFS=, 'NR==1{ $24="merge"; print; next }
{ $24=(($22 == "Malicious" && $23 == "C&C-FileDownload")?1:"") }
{ $24=( .... ) }
{ $24=( .... ) }
{ # and some more ...
}
1' Malware-44-1.csv > output.csv
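Note that when every rule assigns `$24` unconditionally, each later rule overwrites the result of the earlier ones, which is why only the last condition appears to take effect in the question. One possible way around this, sketched here as an assumption rather than as the only approach, is to keep the label pairs in an awk array and look each row up once. The codes are the ones listed in the question (only a few are shown), while the sample rows and the `filler`, `labelled` and `code` names are made up for illustration:

```shell
# Hypothetical 23-column rows: 21 filler fields plus the two label columns.
filler=$(printf 'x,%.0s' $(seq 1 21))
labelled=$(
    printf '%s\n' \
        "${filler}label,detailed-label" \
        "${filler}Malicious,C&C" \
        "${filler}Malicious,C&C-FileDownload" \
        "${filler}Benign,-" \
        "${filler}Malicious,SomethingElse" |
    awk -F, -v OFS=, '
        BEGIN {
            # Codes taken from the question; add the remaining pairs the same way.
            code["Malicious", "C&C"]              = 0
            code["Malicious", "C&C-FileDownload"] = 1
            code["Malicious", "C&C-HeartBeat"]    = 2
            code["Benign", "-"]                   = 15
        }
        NR == 1 { $24 = "label1"; print; next }
        {
            # Look the pair up once; unknown pairs get an empty 24th column.
            $24 = (($22, $23) in code) ? code[$22, $23] : ""
            print
        }'
)
printf '%s\n' "$labelled"
```

Because each row is assigned exactly once, the order of the entries no longer matters, and rows with an unlisted pair still end up with 24 columns.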
Or you can append the value separately when printing the current record:
awk -F, 'NR==1{ print $0 ",merge" } NR>1{ print $0 "," (($22 == "Malicious" && $23 == "C&C-FileDownload")?1:"") } ' Malware-44-1.csv > output.csv
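One thing to watch with this form: in awk, string concatenation binds more tightly than `?:`, so the ternary needs its own pair of parentheses, otherwise the concatenated string becomes the condition. A small sketch on a made-up three-field record (the data and the variable names are purely illustrative):

```shell
rec='a,Malicious,C&C-FileDownload'

# Unparenthesized: parsed as ($0 "," (cond)) ? 1 : "", so the record is lost.
loose=$(printf '%s\n' "$rec" | awk -F, '{ print $0 "," ($2 == "Malicious")?1:"" }')

# Parenthesized ternary: the record is kept and the flag is appended.
tight=$(printf '%s\n' "$rec" | awk -F, '{ print $0 "," (($2 == "Malicious")?1:"") }')

printf '%s\n%s\n' "$loose" "$tight"
```

This form also needs `-F,` so that `$22` and `$23` refer to the comma-separated columns, but no `OFS`, because the record is printed unmodified and the comma is added by hand.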