Awk
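Matching multiple patterns and printing on a single line
I need to match two patterns in a log file, fetch the line that follows one of the two matches, and finally print the three values on a single line.
Sample log file: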
    2013/09/05 04:26:00 Processing Batch /fbc/dev/cebi/dod/9739867262
    2013/09/05 04:26:02 Batch 9739867262 was successful
    2013/09/05 04:26:02 Total Time = 3.13 Secs
    2013/09/05 04:26:02 Repository API Time = 2.96 Secs
    2013/09/05 04:26:02 File System Io Time = 0.06 Secs
    2013/09/05 04:26:02 Doc Validation Time = 0.03 Secs
    2013/09/05 04:26:02 Ending @ Thu Sep 05 04:26:02 EDT 2013
    2013/09/05 08:18:10 Starting @ Thu Sep 05 08:18:10 EDT 2013
    2013/09/05 08:18:10 Starting @ Thu Sep 05 08:18:10 EDT 2013
    2013/09/05 08:18:10 Processing Batch /fbc/dev/cebi/dod/9844867675
    2013/09/05 08:18:10 Processing Batch /fbc/dev/cebi/dod/9886743777
    2013/09/05 08:18:16 Batch 9844867675 was successful
    2013/09/05 08:18:16 Total Time = 6.00 Secs
    2013/09/05 08:18:16 Repository API Time = 5.63 Secs
    2013/09/05 08:18:16 File System Io Time = 0.05 Secs
    2013/09/05 08:18:16 Doc Validation Time = 0.19 Secs
    2013/09/05 08:18:16 Ending @ Thu Sep 05 08:18:16 EDT 2013
    2013/09/05 08:18:18 Batch 9886743777 was successful
    2013/09/05 08:18:18 Total Time = 8.27 Secs
    2013/09/05 08:18:18 Repository API Time = 8.52 Secs
    2013/09/05 08:18:18 File System Io Time = 0.08 Secs
    2013/09/05 08:18:18 Doc Validation Time = 0.47 Secs
    2013/09/05 08:18:18 Ending @ Thu Sep 05 08:18:18 EDT 2013
I have the numbers on their own in a file named cust_no.txt:
    9739867262
    9844867675
    9886743777
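Taking these numbers as input, I need to match the following two patterns in the log file:
- Processing Batch /fbc/dev/cebi/dod/<number>
- Batch <number> was successful
The required output is:
-> On a match of the first pattern (i.e. Processing Batch /fbc/dev/cebi/dod/<numbers in the cust_no.txt>), I need the second word, i.e. $2.
-> On a match of the second pattern (i.e. Batch <numbers in the cust_no.txt> was successful), I need the second word, i.e. $2.
-> And the 6th word ($6) of the line that follows the second pattern's match (i.e. the Total Time line).
Expected output: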
    9739867262,04:26:00,04:26:02,3.13 Secs
    9844867675,08:18:10,08:18:16,6.00 Secs
    9886743777,08:18:10,08:18:18,8.27 Secs
To get this I tried the following, but it doesn't seem to work:
awk -v cn=$cust_no '{{if ($0 ~ "Processing.*" cn) st=$2 && if ($0 ~ "Customer cn was successful" et=$2; getline; tt=$4} ; print st,et,tt}
How about this:
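    while read number; do
        # time ($2) of the first "Processing Batch" line for this number
        start=$(grep "Processing Batch /fbc/dev/cebi/dod/$number" log_file \
            | head -n 1 | awk '{print $2}')
        # time ($2) and total ($6) from the line right after "was successful"
        end=$(grep -A 1 "Batch $number was successful" log_file \
            | head -n 2 | tail -n 1 | awk -v OFS=',' '{print $2,$6}')
        echo "$number,$start,$end Secs"
    done <cust_no.txt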
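The loop above runs grep over log_file twice for every number. Since the question asked for awk, here is a rough single-pass sketch of the same idea. It is only an illustration under some assumptions: the temporary files stand in for cust_no.txt and the log, the log sample is shortened, and the array names (want, st, et) and the pending variable are made up for this example.

```shell
#!/bin/sh
# Stand-in input files so the sketch runs on its own (shortened sample data).
tmp=$(mktemp -d)
printf '%s\n' 9739867262 9844867675 9886743777 > "$tmp/cust_no.txt"
cat > "$tmp/sample.log" <<'EOF'
2013/09/05 04:26:00 Processing Batch /fbc/dev/cebi/dod/9739867262
2013/09/05 04:26:02 Batch 9739867262 was successful
2013/09/05 04:26:02 Total Time = 3.13 Secs
2013/09/05 04:26:02 Repository API Time = 2.96 Secs
2013/09/05 08:18:10 Processing Batch /fbc/dev/cebi/dod/9844867675
2013/09/05 08:18:10 Processing Batch /fbc/dev/cebi/dod/9886743777
2013/09/05 08:18:16 Batch 9844867675 was successful
2013/09/05 08:18:16 Total Time = 6.00 Secs
2013/09/05 08:18:18 Batch 9886743777 was successful
2013/09/05 08:18:18 Total Time = 8.27 Secs
EOF

out=$(awk -v OFS=',' '
    # First file: remember every number we were asked about.
    NR == FNR { for (i = 1; i <= NF; i++) want[$i]; next }

    # Line right after a "was successful" match: print if it is the Total Time line.
    pending != "" {
        if ($3 == "Total" && $4 == "Time")
            print pending, st[pending], et[pending], $6 " " $7
        pending = ""
    }

    # First pattern: the number is the last component of the path in $5.
    $3 == "Processing" && $4 == "Batch" {
        n = split($5, parts, "/")
        if (parts[n] in want) st[parts[n]] = $2
    }

    # Second pattern: keep the time and flag the next line for inspection.
    $3 == "Batch" && $5 == "was" && $6 == "successful" && ($4 in want) {
        et[$4] = $2
        pending = $4
    }
' "$tmp/cust_no.txt" "$tmp/sample.log")

printf '%s\n' "$out"
rm -rf "$tmp"
```

Results come out in the order the "was successful" lines appear in the log, which happens to match the expected output for the sample data.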
If you don't mind using Perl and grep, here is a solution to your problem. This is the script, called cmd.pl:

    #!/usr/bin/env perl
    use feature 'say';
    #use Data::Dumper;

    @file = `grep -f cust_no.txt -A 1 sample.log`;

    my (%info, $secLineSeen, $time, $custno);
    $secLineSeen = 0;

    foreach my $line (@file) {
        if ($secLineSeen == 1) {
            #2013/09/05 08:18:18 Total Time = 8.27 Secs
            (my $totTime) = ($line =~ m!\S+ \S+\s+Total Time\s+=\s+(\S+ Secs)!);
            $info{$custno}{totTime} = $totTime;
            $secLineSeen = 0;
        }
        elsif ($line =~ m/Processing Batch/) {
            #2013/09/05 08:18:10 Processing Batch /fbc/dev/cebi/dod/9844867675
            ($time, $custno) = ($line =~ m!\S+ (\S+)\s+Processing Batch.*/(\S+)!);
            $info{$custno}{onetwo} = $time;
        }
        elsif ($line =~ m/Batch.*successful/) {
            #2013/09/05 08:18:18 Batch 9886743777 was successful
            ($time, $custno) = ($line =~ m!\S+ (\S+)\s+Batch (\S+) was.*!);
            $info{$custno}{twotwo} = $time;
            $secLineSeen = 1;
        }
    }

    #print Dumper(\%info);

    #9739867262,04:26:00,04:26:02,3.13 Secs
    foreach my $key (sort keys %info) {
        say "$key,$info{$key}{onetwo},$info{$key}{twotwo},$info{$key}{totTime}";
    }
Example

    $ ./cmd.pl
    9739867262,04:26:00,04:26:02,3.13 Secs
    9844867675,08:18:10,08:18:16,6.00 Secs
    9886743777,08:18:10,08:18:18,8.27 Secs
Details
The Perl script first creates an array, @file, that contains the results of this command:

    $ grep -f cust_no.txt -A 1 sample.log

This command takes the log file, sample.log, and selects all the lines that contain a customer number from the file cust_no.txt, like so:

    2013/09/05 04:26:00 Processing Batch /fbc/dev/cebi/dod/9739867262
    2013/09/05 04:26:02 Batch 9739867262 was successful
    2013/09/05 04:26:02 Total Time = 3.13 Secs
    --
    2013/09/05 08:18:10 Processing Batch /fbc/dev/cebi/dod/9844867675
    2013/09/05 08:18:10 Processing Batch /fbc/dev/cebi/dod/9886743777
    2013/09/05 08:18:16 Batch 9844867675 was successful
    2013/09/05 08:18:16 Total Time = 6.00 Secs
    --
    2013/09/05 08:18:18 Batch 9886743777 was successful
    2013/09/05 08:18:18 Total Time = 8.27 Secs
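One thing this grep command does is worth pointing out: it keeps one line of context after every match (-A 1). That is what lets us capture the lines containing "Total Time". With this data extracted, the Perl script uses a multidimensional hash to store the key values from this output, following the requirements stated in the question.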
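The effect of -A 1 can be checked in isolation. The following is a small sketch, assuming a grep that supports -A (GNU and BSD grep do; it is not a POSIX option). The temporary files and the variable names with and without are made up for the illustration.

```shell
#!/bin/sh
# Stand-in files: one customer number and four log lines.
tmp=$(mktemp -d)
printf '%s\n' 9739867262 > "$tmp/cust_no.txt"
cat > "$tmp/sample.log" <<'EOF'
2013/09/05 04:26:00 Processing Batch /fbc/dev/cebi/dod/9739867262
2013/09/05 04:26:02 Batch 9739867262 was successful
2013/09/05 04:26:02 Total Time = 3.13 Secs
2013/09/05 04:26:02 Repository API Time = 2.96 Secs
EOF

# Without context: only the lines that contain the number.
without=$(grep -f "$tmp/cust_no.txt" "$tmp/sample.log")

# With -A 1: each match also brings along the line after it,
# so the Total Time line is included.
with=$(grep -f "$tmp/cust_no.txt" -A 1 "$tmp/sample.log")

printf 'without -A 1:\n%s\n\nwith -A 1:\n%s\n' "$without" "$with"
rm -rf "$tmp"
```

The Total Time line never contains the customer number itself, so without the trailing context line it would not reach the script at all.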
Once we have finished processing @file, the hash looks like this:

    $VAR1 = {
              '9739867262' => {
                                'twotwo' => '04:26:02',
                                'totTime' => '3.13 Secs',
                                'onetwo' => '04:26:00'
                              },
              '9886743777' => {
                                'twotwo' => '08:18:18',
                                'totTime' => '8.27 Secs',
                                'onetwo' => '08:18:10'
                              },
              '9844867675' => {
                                'twotwo' => '08:18:16',
                                'totTime' => '6.00 Secs',
                                'onetwo' => '08:18:10'
                              }
            };
Finally, we loop over this hash, with its keys sorted, and print what we collected in the format specified in the question.