Text-Processing

Formatting a CSV file correctly so that data can be extracted from it properly

  • May 6, 2021

I have a CSV file that looks like this:

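I want to delete the "INITIAL OFFER" blocks from this file and keep only the "FINAL OFFER" blocks. I also want to remove the comma (,) from the first field and strip the extra whitespace from the last column, so that these columns are easier to search.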

Input

500076592,      INITIAL OFFER
500076592,|11|1|1|100 MB|2 Minutes|1.0 SAR
500076592,|11|2|3|300 MB|5 Minutes|3.0 SAR
500076592,|1|1|1|100 MB|NA|0.5 SAR
500076592,|1|2|3|300 MB|NA|1.5 SAR
500076592,|1|4|7|1000 MB|NA|5.0 SAR
500076592,|2|1|1|4096 MB|NA|1.5 SAR
500076592,|2|2|3|6144 MB|NA|2.0 SAR
500076592,|2|4|7|10240 MB|NA|4.0 SAR
500076592,|5|1|1|4096 MB|NA|2.0 SAR
500076592,|5|2|3|6144 MB|NA|2.5 SAR
500076592,|5|4|7|10240 MB|NA|5.0 SAR
500076592,|6|1|1|NA|2 Minutes|0.5 SAR
500076592,|6|2|3|NA|5 Minutes|1.5 SAR
500076592,|6|4|7|NA|10 Minutes|3.0 SAR
500076592,
500076592,|FINAL OFFER
500076592,|2|1|1|4096 MB|NA|1.5 SAR
500076592,|2|2|3|6144 MB|NA|2.0 SAR
500076592,|2|4|7|10240 MB|NA|4.0 SAR
500076592,|5|1|1|4096 MB|NA|2.0 SAR
500076592,|5|2|3|6144 MB|NA|2.5 SAR
500076592,|5|4|7|10240 MB|NA|5.0 SAR
500076592,|1|1|1|100 MB|NA|0.5 SAR
500076592,|1|2|3|300 MB|NA|1.5 SAR
500076592,|1|4|7|1000 MB|NA|5.0 SAR
500076592,|11|1|1|100 MB|2 Minutes|1.0 SAR
500076592,|11|2|3|300 MB|5 Minutes|3.0 SAR
500076592,|6|1|1|NA|2 Minutes|0.5 SAR
500076592,|6|2|3|NA|5 Minutes|1.5 SAR
500076592,|6|4|7|NA|10 Minutes|3.0 SAR
500076592,
500028952,      INITIAL OFFER
500028952,|11|1|1|250 MB|2 Minutes|3.0 SAR
500028952,|11|2|3|650 MB|10 Minutes|8.0 SAR
500028952,|11|4|7|1550 MB|30 Minutes|18.5 SAR
500028952,|1|1|1|250 MB|NA|2.5 SAR
500028952,|1|2|3|650 MB|NA|6.5 SAR
500028952,|1|4|7|1550 MB|NA|15.5 SAR
500028952,|2|1|1|4096 MB|NA|1.5 SAR
500028952,|2|2|3|6144 MB|NA|2.0 SAR
500028952,|2|4|7|10240 MB|NA|4.0 SAR
500028952,|5|1|1|4096 MB|NA|2.0 SAR
500028952,|5|2|3|6144 MB|NA|2.5 SAR
500028952,|5|4|7|10240 MB|NA|5.0 SAR
500028952,|6|1|1|NA|2 Minutes|0.5 SAR
500028952,|6|2|3|NA|10 Minutes|1.5 SAR
500028952,|6|4|7|NA|30 Minutes|3.0 SAR
500028952,
500028952,|FINAL OFFER
500028952,|2|1|1|4096 MB|NA|1.5 SAR
500028952,|2|2|3|6144 MB|NA|2.0 SAR
500028952,|2|4|7|10240 MB|NA|4.0 SAR
500028952,|1|1|1|250 MB|NA|2.5 SAR
500028952,|1|2|3|650 MB|NA|6.5 SAR
500028952,|1|4|7|1550 MB|NA|15.5 SAR
500028952,|11|1|1|250 MB|2 Minutes|3.0 SAR
500028952,|11|2|3|650 MB|10 Minutes|8.0 SAR
500028952,|11|4|7|1550 MB|30 Minutes|18.5 SAR
500028952,|5|1|1|4096 MB|NA|2.0 SAR
500028952,|5|2|3|6144 MB|NA|2.5 SAR
500028952,|5|4|7|10240 MB|NA|5.0 SAR
500028952,|6|1|1|NA|2 Minutes|0.5 SAR
500028952,|6|2|3|NA|10 Minutes|1.5 SAR
500028952,|6|4|7|NA|30 Minutes|3.0 SAR
500028952,

Output

500076592,|FINAL OFFER
500076592,|2|1|1|4096 MB|NA|1.5 SAR
500076592,|2|2|3|6144 MB|NA|2.0 SAR
500076592,|2|4|7|10240 MB|NA|4.0 SAR
500076592,|5|1|1|4096 MB|NA|2.0 SAR
500076592,|5|2|3|6144 MB|NA|2.5 SAR
500076592,|5|4|7|10240 MB|NA|5.0 SAR
500076592,|1|1|1|100 MB|NA|0.5 SAR
500076592,|1|2|3|300 MB|NA|1.5 SAR
500076592,|1|4|7|1000 MB|NA|5.0 SAR
500076592,|11|1|1|100 MB|2 Minutes|1.0 SAR
500076592,|11|2|3|300 MB|5 Minutes|3.0 SAR
500076592,|6|1|1|NA|2 Minutes|0.5 SAR
500076592,|6|2|3|NA|5 Minutes|1.5 SAR
500076592,|6|4|7|NA|10 Minutes|3.0 SAR
500028952,|FINAL OFFER
500028952,|2|1|1|4096 MB|NA|1.5 SAR
500028952,|2|2|3|6144 MB|NA|2.0 SAR
500028952,|2|4|7|10240 MB|NA|4.0 SAR
500028952,|1|1|1|250 MB|NA|2.5 SAR
500028952,|1|2|3|650 MB|NA|6.5 SAR
500028952,|1|4|7|1550 MB|NA|15.5 SAR
500028952,|11|1|1|250 MB|2 Minutes|3.0 SAR
500028952,|11|2|3|650 MB|10 Minutes|8.0 SAR
500028952,|11|4|7|1550 MB|30 Minutes|18.5 SAR
500028952,|5|1|1|4096 MB|NA|2.0 SAR
500028952,|5|2|3|6144 MB|NA|2.5 SAR
500028952,|5|4|7|10240 MB|NA|5.0 SAR
500028952,|6|1|1|NA|2 Minutes|0.5 SAR
500028952,|6|2|3|NA|10 Minutes|1.5 SAR
500028952,|6|4|7|NA|30 Minutes|3.0 SAR
500028952,

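If you use the pipe as the field separator, awk can filter the data by field count: the INITIAL OFFER header lines and the bare ID lines contain no pipe and so have one field, while the FINAL OFFER header lines have two. For example: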

awk -F'|' 'NF==2 { f=1 } NF==1 { f=0 } f' infile

Golfed:

awk -F\| 'NF==1{f=0}NF==2{f=1}f'

Alternatively, using sed:

sed -e '/FINAL OFFER/p;/INITIAL OFFER/,/FINAL OFFER/ d' input.csv  > output.csv

The p command prints the FINAL OFFER line explicitly, because that line is about to be deleted as the end of the /INITIAL OFFER/,/FINAL OFFER/ range.
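
The commands above select the FINAL OFFER blocks but leave the comma in the first field and any trailing whitespace in the last column, both of which the question also asks to remove. The following is a minimal sketch of one way to cover that, building on the field-count filter. The function name clean_offers and the flag variable keep are hypothetical names chosen for illustration. The sketch assumes the comma is always the last character of the first field and that "extra whitespace" means trailing spaces, tabs, or carriage returns.

```shell
#!/bin/sh
# Sketch only: clean_offers is a hypothetical helper, not part of the original answers.
# It keeps the FINAL OFFER blocks and tidies the first and last fields of each kept line.
clean_offers() {
    awk -F'|' -v OFS='|' '
        NF == 1 { keep = 0 }            # "ID," or "ID,   INITIAL OFFER": no pipe, stop keeping
        NF == 2 { keep = 1 }            # "ID,|FINAL OFFER": start of a block to keep
        keep {
            sub(/,$/, "", $1)           # assumption: the comma ends the first field
            sub(/[ \t\r]+$/, "", $NF)   # trim trailing blanks from the last field
            print                       # fields are re-joined with OFS, i.e. "|"
        }
    ' "$@"
}
```

It could be called as clean_offers input.csv > output.csv, or fed from a pipe. With the comma removed, a kept line would come out as 500076592|2|1|1|4096 MB|NA|1.5 SAR, so every column is separated by a single pipe.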

Source: https://unix.stackexchange.com/questions/648393