Shell
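Shell script to extract data from a list of files and save it as CSV
I'm on CentOS. I have a list of files that I want to read, extract data from, and organize into a CSV file.
The log file text format is: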
... {"name":"test-api","hostname":"ci47","pid":3202,"level":30,"msg":"File: dsiManager, Method: getContract, End { userId: 'AFC5EH5PIHHLO4XS7SG',\n clientId: '5003700557',\n intent: 'YesIntent',\n }","time":"2019-01-21T12:23:10.323Z","v":0} ...
The output format must be:
    clientId;intent;time;userId
    5003700557;YesIntent;2019-01-21T12:23:10.323Z;AFC5EH5PIHHLO4XS7SG
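What is the simplest way to accomplish this? (awk, grep, ...)
To parse JSON-encoded data robustly, you need a JSON codec. That pretty much means Perl or Python (or Ruby ...). Since I'm a Perl person, here is a Perl solution.
First, the one-liner: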
    $ perl -MJSON -ne 'BEGIN { print("clientId;intent;time;userId\n"); } eval { my $obj = from_json($_); my $msg = $obj->{msg}; $msg =~ s/^.*\{\s*|\s*,\s*\}.*$//g; my %m = map { m/^([^:]*):\s*\x27?(.*?)\x27?$/; ($1, $2) } split(/,\s+/, $msg); print("$m{clientId};$m{intent};$obj->{time};$m{userId}\n"); }; warn($@) if ($@);' <x
    clientId;intent;time;userId
    5003700557;YesIntent;2019-01-21T12:23:10.323Z;AFC5EH5PIHHLO4XS7SG
Since that is a bit much, even for Perl, here is a readable script:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use JSON;

    print("clientId;intent;time;userId\n");

    while (<>) {
        # Don't choke on malformed lines
        eval {
            my $obj = from_json($_);
            my $msg = $obj->{msg};
            $msg =~ s/^.*\{\s*       # Trim up to and including the leading '{'
                     | \s*,\s*\}.*$  # Trim trailing ',}'
                     //gx;
            # Split $msg into key-value pairs
            my %m = map {
                m/^([^:]*)    # Stuff that isn't ':'
                  :\s*        # Field separator
                  '?(.*?)'?$  # Value, without its surrounding single quotes
                 /x;
                ($1, $2)
            } split(/,\s+/, $msg);
            print("$m{clientId};$m{intent};$obj->{time};$m{userId}\n");
        };
        warn($@) if ($@);
    }
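For anyone who prefers Python, the same idea can be sketched roughly as follows. This is a minimal sketch, not a drop-in replacement: it assumes every log line is one complete JSON object, and the names extract_row, PAIR_RE and FIELDS are made up for illustration.

```python
import json
import re
import sys

FIELDS = ("clientId", "intent", "time", "userId")

# Matches pairs such as  userId: 'AFC5EH5PIHHLO4XS7SG'  inside the msg text.
PAIR_RE = re.compile(r"(\w+):\s*'([^']*)'")


def extract_row(line):
    """Return the four output fields of one log line, or None to skip it."""
    try:
        obj = json.loads(line)
    except ValueError:
        return None  # malformed line: skip instead of aborting
    if not isinstance(obj, dict):
        return None
    # Collect the key: 'value' pairs embedded in the msg string.
    pairs = dict(PAIR_RE.findall(str(obj.get("msg", ""))))
    if "time" in obj:
        pairs["time"] = str(obj["time"])
    if not all(name in pairs for name in FIELDS):
        return None
    return [pairs[name] for name in FIELDS]


def main(paths):
    print(";".join(FIELDS))
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                row = extract_row(line)
                if row is not None:
                    print(";".join(row))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved under a name of your choice (say extract.py, which is only a placeholder), it would be run as python3 extract.py followed by the log files, with the output redirected to the CSV file. Lines that are not valid JSON or that lack one of the four fields are skipped silently here, whereas the Perl script prints a warning for malformed lines.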
Try this:
    $ awk -F "['\"]" 'NF>=26{print $19","$21","$26","$17}' file.csv
    5003700557,YesIntent,2019-01-21T12:23:10.323Z,AFC5EH5PIHHLO4XS7SG
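-F "['\"]" makes both the single quote and the double quote field separators. NF>=26 simply checks that the line has at least 26 fields. Note that this prints the fields comma-separated and without a header line; replace each "," in the print statement with ";" to get the semicolon-separated format asked for in the question.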
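To see where the field numbers 17, 19, 21 and 26 come from, the quote-based splitting can be reproduced in a few lines of Python. This is only an illustration of the splitting rule, with a made-up helper name (quote_fields); it is not part of the awk answer.

```python
import re


def quote_fields(line):
    """Split on ' or " and number the pieces from 1, like awk's $1, $2, ..."""
    parts = re.split(r"['\"]", line)
    return {index: part for index, part in enumerate(parts, start=1)}


# The sample log line from the question, as it appears in the file
# (the \n sequences are literal backslash-n characters).
sample = r'''{"name":"test-api","hostname":"ci47","pid":3202,"level":30,"msg":"File: dsiManager, Method: getContract, End { userId: 'AFC5EH5PIHHLO4XS7SG',\n clientId: '5003700557',\n intent: 'YesIntent',\n }","time":"2019-01-21T12:23:10.323Z","v":0}'''

fields = quote_fields(sample)
print(len(fields))  # 29, so the NF>=26 test passes
print(";".join(fields[i] for i in (19, 21, 26, 17)))
# 5003700557;YesIntent;2019-01-21T12:23:10.323Z;AFC5EH5PIHHLO4XS7SG
```

Because the field positions depend on the exact number of quotes before each value, this approach only works while every log line has the same layout as the sample.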