Linux
通過修改重新排序管道分隔欄位
我有一個大約 8k 行的 Linux 文本文件,格式如下
|f_name:x|l_name:x|address:x x|city:x|state:x|zip:x|country:x|ordernumber:x|code:x|downloaded:x|exp:09/2017|ip:x.x.x.x|
我想將其排序為以下格式:
ordernumber:x,exp:09/2017,code:x,f_name:x,l_name:x,address:x x,city:x,state:x,zip:x,country:x,ip:x.x.x.x
筆記
文本中的一些數據與欄位存在問題
|address:x x|
它可以像
|address:x x |
這意味著space
在結尾之前有一個|
;我想刪除輸出中的一個或多個空格。並且存在與欄位有關的數據
|exp:09/2017|
;有顯示的數據,|exp:9/2017|
所以我想添加0
,以防月份是一個數字,所以它會09/2017
在輸出中。請注意,年份可能不同。
例子:
|f_name:x|l_name:x|address:x x |city:x|state:x|zip:x|country:x|ordernumber:x|code:x|downloaded:x|exp:9/2017|ip:x.x.x.x|
預期輸出:
ordernumber:x,exp:09/2017,code:x,f_name:x,l_name:x,address:x x,city:x,state:x,zip:x,country:x,ip:x.x.x.x**
GNU**
awk
**解決方案:awk '{ split($12, a, /[/:]/); if (length(a[2]) == 1) $12=sprintf("%s:%02d/%d", a[1], a[2], a[3]); sub(/ *$/, "", $4); print $9, $12, $10, $2, $3, $4, $5, $6, $7, $8, $13 }' FS='|' OFS=',' file
輸出:
ordernumber:x,exp:09/2017,code:x,f_name:x,l_name:x,address:x x,city:x,state:x,zip:x,country:x,ip:x.x.x.x
概括
我寫了一個 Awk 腳本、一個 Python 腳本和一個 Bash 腳本,每一個都可以解決你的問題。它們都產生相同的輸出。
這是範例數據(取自您的問題並放入文件中
data.csv
):|f_name:x|l_name:x|address:x x|city:x|state:x|zip:x|country:x|ordernumber:x|code:x|downloaded:x|exp:09/2017|ip:x.x.x.x|
這是執行腳本的輸出:
ordernumber:x,exp:09/2017,code:x,f_name:x,l_name:x,address:x x,city:x,state:x,zip:x,country:x,ip:x.x.x.x
awk
這是一個
awk
腳本:#!/usr/bin/env awk # transformcsv.awk # Set the input field-separator (FS) and the output field-separator (OFS) BEGIN{ FS="|"; OFS=","; } # Skip empty lines /^\s*$/ {next;} # Print lines with the fields reordered as desired { print $9,$12,$10,$2,$3,$4,$5,$6,$7,$8,$13 }
以下是您將如何執行它:
awk -f transformcsv.awk data.csv
您也可以將其作為單線執行:
awk 'BEGIN{FS="|";OFS=",";}/^\s*$/ {next;}{print $9,$12,$10,$2,$3,$4,$5,$6,$7,$8,$13}' data.csv
Python
這是 Python 腳本:
#!/usr/bin/env python # -*- coding: ascii -*- """transformcsv.py""" import sys import csv # Make a list with the field names in their input order # NOTE: We padding colums because each row begins # and ends with the delimiter `|` fieldnames = ( "padding_1", "f_name", "l_name", "address", "city", "state", "zip", "country", "ordernumber", "code", "downloaded", "exp", "ip", "padding_2" ) # Make a list with the field names in their output order reordered_fieldnames = ( "ordernumber", "exp", "code", "f_name", "l_name", "address", "city", "state", "zip", "country", "ip" ) # Read each input row and print out the reordered row with open(sys.argv[1]) as csvfile: reader = csv.DictReader(csvfile, fieldnames=fieldnames, delimiter='|') for row in reader: print(','.join([row[field] for field in reordered_fieldnames]))
以下是執行腳本的方式:
python transformcsv.py data.csv
重擊
**注意:**這在大文件上可能會很慢。你可能不應該使用它——我只是為了好玩才包含它。
這是 Bash shell 腳本:
#!/usr/bin/env bash # transformcsv.sh while read LINE; do if [[ -n "${LINE}" ]]; then # Extract the field values f_name="$(echo "${LINE}" | cut -d'|' -f2)" l_name="$(echo "${LINE}" | cut -d'|' -f3)" address="$(echo "${LINE}" | cut -d'|' -f4)" city="$(echo "${LINE}" | cut -d'|' -f5)" state="$(echo "${LINE}" | cut -d'|' -f6)" zip="$(echo "${LINE}" | cut -d'|' -f7)" country="$(echo "${LINE}" | cut -d'|' -f8)" ordernumber="$(echo "${LINE}" | cut -d'|' -f9)" code="$(echo "${LINE}" | cut -d'|' -f10)" downloaded="$(echo "${LINE}" | cut -d'|' -f11)" exp="$(echo "${LINE}" | cut -d'|' -f12)" ip="$(echo "${LINE}" | cut -d'|' -f13)" # Output the reordered row printf \ "%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s\n" \ "${ordernumber}" "${exp}" "${code}" "${f_name}" "${l_name}" \ "${address}" "${city}" "${state}" "${zip}" "${country}" "${ip}" fi done < "$1"
以下是您將如何執行它:
bash transformcsv.sh data.csv