Text-Processing
Find duplicate first fields and join their values on a single line
I have a file whose entries are in key: value format, like this:

    cat data.txt
    name: 'tom'
    tom_age: '31'
    status_tom_mar: 'yes'
    school: 'anne'
    fd_year_anne: '1987'
    name: 'hmz'
    hmz_age: '21'
    status_hmz_mar: 'no'
    school: 'svp'
    fd_year_svp: '1982'
    name: 'toli'
    toli_age: '41'

and so on…
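I only need to find the key: value entries whose keys are duplicated and print each of them as a single entry. The code below gets me the duplicated keys: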

    cat data.txt | awk '{ print $1 }' | sort | uniq -d
    name:
    school:

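However, I want output in which the values of each duplicated key are joined on one line.

Expected output:

    name: ['tom', 'hmz', 'toli']
    school: ['anne', 'svp']
    tom_age: '31'
    status_tom_mar: 'yes'
    fd_year_anne: '1987'
    hmz_age: '21'
    status_hmz_mar: 'no'
    fd_year_svp: '1982'
    toli_age: '41'

Could you suggest a way to do this?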
In awk:

    $ awk -F': ' '
        {
            count[$1]++
            data[$1] = ($1 in data) ? data[$1]", "$2 : $2
        }
        END {
            for (id in count) {
                printf "%s: ", id
                print (count[id]>1 ? "[ "data[id]" ]" : data[id])
            }
        }' data.txt
    hmz_age: '21'
    tom_age: '31'
    fd_year_anne: '1987'
    school: [ 'anne', 'svp' ]
    name: [ 'tom', 'hmz', 'toli' ]
    toli_age: '41'
    fd_year_svp: '1982'
    status_hmz_mar: 'no'
    status_tom_mar: 'yes'

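Note that `for (id in count)` visits the keys in an unspecified order, which is why the output above is shuffled relative to the input. If the order matters, a minimal sketch along the following lines records each key the first time it appears and prints the repeated keys first, as in the expected output from the question. The function name `group_keys` and the helper variables `order` and `n` are illustrative names of my own, and the sketch assumes each line holds exactly one `key: value` pair with no further `: ` inside the value.

```shell
# Hypothetical helper (not from the original answer): group_keys FILE
# Prints repeated keys first with their values in brackets, then the
# single-valued keys, each group in first-seen order.
group_keys() {
  awk -F': ' '
    NF < 2 { next }                  # skip blank or malformed lines
    {
      if (!($1 in data)) {           # first occurrence of this key
        order[++n] = $1              # remember where the key first appeared
        data[$1] = $2
      } else {                       # repeated key: append the new value
        data[$1] = data[$1] ", " $2
      }
      count[$1]++
    }
    END {
      # Pass 1: keys seen more than once, as a bracketed list.
      for (i = 1; i <= n; i++)
        if (count[order[i]] > 1)
          print order[i] ": [" data[order[i]] "]"
      # Pass 2: keys seen exactly once, unchanged.
      for (i = 1; i <= n; i++)
        if (count[order[i]] == 1)
          print order[i] ": " data[order[i]]
    }
  ' "$1"
}
```

Called as `group_keys data.txt`, this should list `name:` and `school:` first with their bracketed values, followed by the single-valued keys in file order.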
A Perl approach:

    $ perl -F: -lane '
        push @{$k{$F[0]}}, $F[1];
        END {
            for $key (keys(%k)) {
                $data = "";
                if (scalar(@{$k{$key}}) > 1) {
                    $data = "[" . join(",", @{$k{$key}}) . "]";
                }
                else {
                    $data = ${$k{$key}}[0];
                }
                print "$key: $data"
            }
        }' data.txt
    status_tom_mar: 'yes'
    fd_year_anne: '1987'
    tom_age: '31'
    toli_age: '41'
    fd_year_svp: '1982'
    hmz_age: '21'
    school: [ 'anne', 'svp']
    name: [ 'tom', 'hmz', 'toli']
    status_hmz_mar: 'no'

Or, perhaps easier to follow:

    perl -F: -lane '
        @fields = @F;
        push @{$key_hash{$fields[0]}}, $fields[1];
        END {
            for $key (keys(%key_hash)) {
                $data = "";
                @key_data = @{$key_hash{$key}};
                if (scalar(@key_data) > 1) {
                    $data = "[" . join(",", @key_data) . "]";
                }
                else {
                    $data = $key_data[0]
                }
                print "$key: $data"
            }
        }' data.txt