Print duplicate lines based only on fields 1,2 in a csv file

August 22, 2017

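With the following command we can print the duplicate lines in a file (uniq compares adjacent lines only, so the input should be sorted first):
uniq -d file.txt
But how can we do the same on a csv file?
We need to print only the lines that are duplicated on fields 1,2 of the csv file, ignoring field 3.
The field separator (FS) is ","
For example: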
spark2-thrift-sparkconf,spark.history.fs.logDirectory,{{spark_history_dir}}
spark2-thrift-sparkconf,spark.history.fs.logDirectory,true
spark2-thrift-sparkconf,spark.history.Log.logDirectory,true
spark2-thrift-sparkconf,spark.history.DF.logDirectory,true
Expected output:
spark2-thrift-sparkconf,spark.history.fs.logDirectory,{{spark_history_dir}}
spark2-thrift-sparkconf,spark.history.fs.logDirectory,true
Second:
How do we exclude the duplicate lines from the csv file (I mean delete only the lines that are duplicated on fields 1,2)?
Expected output:
spark2-thrift-sparkconf,spark.history.Log.logDirectory,true
spark2-thrift-sparkconf,spark.history.DF.logDirectory,true

$ awk -F, 'NR==FNR{a[$1,$2]++; next} a[$1,$2]>1' file.txt file.txt
spark2-thrift-sparkconf,spark.history.fs.logDirectory,{{spark_history_dir}}
spark2-thrift-sparkconf,spark.history.fs.logDirectory,true
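The same input file is passed twice, so awk processes it in two passes:
NR==FNR{a[$1,$2]++; next} runs only during the first pass and stores the number of occurrences of each key built from the first two fields
a[$1,$2]>1 is checked during the second pass and prints a line only if the count for its key is greater than 1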
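The two-pass logic described above can be sketched as a small self-contained script. The file name sample.csv, the temporary directory and the function name print_dups are assumptions made for illustration; the awk program is the same as in the answer, only with a longer array name and comments.

```shell
#!/bin/sh
# Build the sample input from the question in a temporary directory
# (sample.csv is a hypothetical file name).
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

cat > "$tmpdir/sample.csv" <<'EOF'
spark2-thrift-sparkconf,spark.history.fs.logDirectory,{{spark_history_dir}}
spark2-thrift-sparkconf,spark.history.fs.logDirectory,true
spark2-thrift-sparkconf,spark.history.Log.logDirectory,true
spark2-thrift-sparkconf,spark.history.DF.logDirectory,true
EOF

# print_dups FILE: print the lines whose (field 1, field 2) pair
# occurs more than once in FILE. The file is read twice.
print_dups() {
    awk -F, '
        NR == FNR { count[$1,$2]++; next }  # first pass: count each key
        count[$1,$2] > 1                    # second pass: print repeated keys
    ' "$1" "$1"
}

print_dups "$tmpdir/sample.csv"
```

NR==FNR is true only while awk reads the first file argument, because FNR restarts at 1 for every file while NR keeps counting across files.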
For the opposite case, it is simply a matter of changing the condition check:
$ awk -F, 'NR==FNR{a[$1,$2]++; next} a[$1,$2]==1' file.txt file.txt 
spark2-thrift-sparkconf,spark.history.Log.logDirectory,true
spark2-thrift-sparkconf,spark.history.DF.logDirectory,true
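Both commands read file.txt twice, which does not work when the data arrives on a pipe. A possible single-pass variant, not part of the original answer, buffers the lines in memory and decides in the END block. This is only a sketch that assumes the whole input fits in memory; the function name split_by_key and its mode argument are hypothetical.

```shell
#!/bin/sh
# split_by_key MODE: read CSV lines on standard input.
#   MODE=dup  prints lines whose (field 1, field 2) pair repeats
#   MODE=uniq prints lines whose pair occurs exactly once
split_by_key() {
    awk -F, -v mode="$1" '
        {
            line[NR] = $0              # remember every line in input order
            key[NR]  = $1 SUBSEP $2    # same key that a[$1,$2] would use
            count[key[NR]]++
        }
        END {
            for (i = 1; i <= NR; i++) {
                if (mode == "dup" && count[key[i]] > 1)
                    print line[i]
                else if (mode == "uniq" && count[key[i]] == 1)
                    print line[i]
            }
        }
    '
}

# Example: feed the lines through a pipe instead of naming the file twice.
printf '%s\n' \
    'spark2-thrift-sparkconf,spark.history.fs.logDirectory,true' \
    'spark2-thrift-sparkconf,spark.history.fs.logDirectory,false' \
    'spark2-thrift-sparkconf,spark.history.DF.logDirectory,true' |
    split_by_key uniq
```

The trade-off is memory: the two-pass version keeps only one counter per key, while this one keeps every input line until the end.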

Source: https://unix.stackexchange.com/questions/387590
