Bash
Ignore delimiters inside quotes
I have a .csv file like this:

    "ID0054XX","PT. SUMUT","18 JL.BONJOL","SUMATERA UTARA, NORTH","MEDAN","","ID9856","PDSUIDSAXXX","","","","Y"
    "ID00037687","PAN INDONESIA, PT.","JALAN JENDERAL, SUDIRMAN, SENAYAN","","INDIA","","ID566543","PINBIDJAXXX","","0601","","Y"

I have a script that assigns each comma-separated value to its own variable, using , as the delimiter. Part of the script is as follows:
    IFS=,
    [ ! -f $INPUT ] && { echo "$INPUT file not found"; exit 99; }
    while read Key Name Address1 Address2 City State Country SwiftCode Nid Chips Aba IsSwitching
    do
        echo "-------------------------------------------------------------------"
        echo "From Key : $Key"
        echo "-------------------------------------------------------------------"
        echo "-------------------------------------------------------------------"
        echo "From Name : $Name"
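What it does is split values that contain commas inside the quotes, whereas the output I want has each value separated into its own variable.

I tried replacing the comma with IFS=[","] but had no luck. Any suggestions or help would be greatly appreciated.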
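You are doing a couple of things wrong here:

1. You are using the shell to parse text. While that is possible, it is very inefficient. It is slow, hard to write, hard to read, and very hard to get right. The shell is not designed for this sort of thing.
2. You are trying to parse a CSV file without a CSV parser. CSV is not a simple format. Fields can contain the delimiter, as yours do here, and fields can also span multiple lines. Parsing arbitrary CSV data with simple pattern matching is very complex and very hard to get right.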
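To make the second point concrete, here is a minimal Python sketch comparing a naive split on commas with Python's standard csv module. The sample record is an assumption made up for illustration (loosely modeled on the data in the question), and the line break inside its third field was added to show the multi-line case.

```python
import csv
import io

# Hypothetical sample: one record whose second field contains a comma
# and whose third field spans two physical lines.
data = '"ID00037687","PAN INDONESIA, PT.","JALAN JENDERAL,\nSUDIRMAN","Y"\n'

# Naive approach: treat every physical line as a record and split on every comma.
naive = [line.split(",") for line in data.splitlines()]

# CSV-aware approach: the parser honors the quotes.
parsed = list(csv.reader(io.StringIO(data, newline="")))

print(len(naive), [len(fields) for fields in naive])    # 2 [5, 2]
print(len(parsed), [len(fields) for fields in parsed])  # 1 [4]
print(parsed[0][1])                                     # PAN INDONESIA, PT.
```

The naive split sees two records with five and two fields, while the parser sees the single four-field record that was actually written.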
The bad, hacky solution is to do something like this:
    $ sed 's/","/"|"/g' file.csv |
        while IFS='|' read -r Key Name Address1 Address2 City \
            State Country SwiftCode Nid Chips Aba IsSwitching; do
            echo "From Key : $Key"
            echo "From Name : $Name"
        done
    From Key : "ID0054XX"
    From Name : "PT. SUMUT"
    From Key : "ID00037687"
    From Name : "PAN INDONESIA, PT."
This replaces every "," with "|" and then uses | as the delimiter. Of course, this will fail if any of your fields can contain |.
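A minimal Python sketch of why this hack is fragile. The function hacky_split is a hypothetical helper that mimics the sed substitution followed by the split on |, and the second record is made up, with a | placed inside a field for illustration.

```python
def hacky_split(line):
    """Mimic the sed + IFS='|' hack: turn "," into "|" and split on |."""
    return line.replace('","', '"|"').split("|")

# A shortened record from the question: the hack yields the expected 3 fields.
ok = '"ID00037687","PAN INDONESIA, PT.","INDIA"'
# Hypothetical record whose second field contains a literal |.
bad = '"ID00037687","PAN INDONESIA | PT.","INDIA"'

print(len(hacky_split(ok)))   # 3
print(len(hacky_split(bad)))  # 4, the second field was cut in two
```

As with the shell version, the quotes also remain part of each value, so they would still need to be stripped afterwards.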
The good, clean approach is to use a proper scripting language rather than the shell, together with a CSV parser. For example, in Perl [1]:
    $ cat file.csv | perl -MText::CSV -le '
        $csv = Text::CSV->new({binary=>1});
        while ($row = $csv->getline(STDIN)){
            my ($Key, $Name, $Address1, $Address2, $City, $State, $Country,
                $SwiftCode, $Nid, $Chips, $Aba, $IsSwitching) = @$row;
            print "From Key: $Key\nFrom Name: $Name";
        }'
    From Key: ID0054XX
    From Name: PT. SUMUT
    From Key: ID00037687
    From Name: PAN INDONESIA, PT.
Or, as a script:
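    #!/usr/bin/perl -l
    use strict;
    use warnings;
    use Text::CSV;

    # Three-argument open, with an error check in case the file is missing
    open(my $fh, '<', 'file.csv') or die "Cannot open file.csv: $!";
    my $csv = Text::CSV->new({binary=>1});
    # getline returns an array reference holding the fields of one record
    while (my $row = $csv->getline($fh)){
        my (
            $Key, $Name, $Address1, $Address2, $City, $State,
            $Country, $SwiftCode, $Nid, $Chips, $Aba, $IsSwitching
        ) = @$row;
        print "From Key: $Key\nFrom Name: $Name";
    }
    close($fh);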
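Note that you will have to install the Text::CSV module first (cpanm Text::CSV), and you will probably want to install cpanm for that (the cpanminus package on most distributions).

Or, in Python 3: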
    #!/usr/bin/env python3
    import csv

    with open('file.csv', newline='') as csvfile:
        linereader = csv.reader(csvfile, delimiter=',', quotechar='"')
        for row in linereader:
            print("From Key: %s\nFrom Name: %s" % (row[0], row[1]))
Saving the Python code above as a script and running it on the file will print:
    $ foo.py
    From Key: ID0054XX
    From Name: PT. SUMUT
    From Key: ID00037687
    From Name: PAN INDONESIA, PT.
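Since the script in the question assigns each field to a named variable, one possible variation on the Python approach is to let csv.DictReader attach the names. This is only a sketch under some assumptions: the field names are copied from the read command in the question, read_records is a hypothetical helper, and the two sample records are embedded in a string (wrapped in io.StringIO) instead of being read from file.csv, so that the example is self-contained.

```python
import csv
import io

# Column names, in the order used by the read command in the question.
FIELDS = ["Key", "Name", "Address1", "Address2", "City", "State",
          "Country", "SwiftCode", "Nid", "Chips", "Aba", "IsSwitching"]

# The two records from the question, embedded here for illustration.
SAMPLE = (
    '"ID0054XX","PT. SUMUT","18 JL.BONJOL","SUMATERA UTARA, NORTH",'
    '"MEDAN","","ID9856","PDSUIDSAXXX","","","","Y"\n'
    '"ID00037687","PAN INDONESIA, PT.","JALAN JENDERAL, SUDIRMAN, SENAYAN",'
    '"","INDIA","","ID566543","PINBIDJAXXX","","0601","","Y"\n'
)

def read_records(stream):
    """Return one dict per CSV record, keyed by the names in FIELDS."""
    return list(csv.DictReader(stream, fieldnames=FIELDS))

records = read_records(io.StringIO(SAMPLE, newline=""))
for rec in records:
    print("From Key: %s\nFrom Name: %s" % (rec["Key"], rec["Name"]))
```

To run this against the real file, the same helper could be given a file object opened with open('file.csv', newline='').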
[1] Yes, I know that is a UUoC (useless use of cat), but it is simpler to write it that way as a one-liner.