Text-Processing
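How do I filter a date range from a specific column of a CSV file?
Consider the input file:

    1,10/22/2017,Scheduled
    2,10/23/2017,Confimred
    1,10/24/2017,NA
    1,10/29/2017,Scheduled
    3,11/1/2017,Scheduled
    1,11/2/2017,Scheduled

How can I filter the rows whose date in the second column falls within a range, with the date range supplied as input?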
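Use awk and call the shell's date command (GNU date, for its -d option), reading its output through a pipe with getline:

    awk -v start="$start" -v end="$end" -F, '
    BEGIN {
        # Convert both range limits to seconds since the epoch.
        srt = "date -d" start " +%s"; srt | getline start; close(srt)
        ed  = "date -d" end " +%s";   ed  | getline end;   close(ed)
    }
    {
        # Keep the original line, then replace field 2 with its epoch value.
        bkp = $0
        epoch = "date -d" $2 " +%s"; epoch | getline $2; close(epoch)
    }
    ($2 >= start && $2 <= end) { print bkp }' infile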
For the following input:

    1,10/22/2017,Scheduled
    1,10/24/2017,NA
    1,10/24/2017,NA,NA
    1,10/29/2017,Scheduled
    3,11/1/2017,Scheduled
    1,11/2/2017,NA
    5,9/30/2017,Confirmed
    6,10/1/2017,Scheduled

with start='10/24/2017' and end='11/1/2017', the result is:

    1,10/24/2017,NA
    1,10/24/2017,NA,NA
    1,10/29/2017,Scheduled
    3,11/1/2017,Scheduled
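The awk approach above starts one date process per input line, which may be slow on large files. As a possible alternative (not part of the original answer), here is a minimal sketch that stays inside awk by turning each date into a sortable number. It assumes every date in column 2 is written as M/D/YYYY; filter_dates and key are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer).
# Usage: filter_dates START END [FILE...]   (reads stdin when no FILE is given)
# Assumption: field 2 of each comma-separated line is a date in M/D/YYYY form.
filter_dates() {
    start=$1
    end=$2
    shift 2
    awk -F, -v start="$start" -v end="$end" '
    # Turn M/D/YYYY into the number YYYYMMDD so dates compare numerically.
    function key(d,    p) {
        split(d, p, "/")
        return p[3] * 10000 + p[1] * 100 + p[2]
    }
    BEGIN { lo = key(start); hi = key(end) }
    { k = key($2) }
    # Print lines whose date lies inside the inclusive range.
    k >= lo && k <= hi
    ' "$@"
}
```

For example, `filter_dates 10/24/2017 11/1/2017 infile` should select the same rows as the getline version, without depending on GNU date.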
This snippet:

    # Utility functions: print-as-echo, print-line-with-visual-space.
    pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
    pl() { pe;pe "-----" ;pe "$*"; }

    pl " Input data file $FILE:"
    head data1

    # start="10/29/2017" end="11/2/2017"
    START="10/29/2017"
    END="11/2/2017"
    pl " Results, from $START through $END:"
    dateutils.dgrep -i "%m/%d/%Y" ">=$START" '&&' "<=$END" < data1

    pl " Unsorted file, data2:"
    head data2

    pl " Results, from $START through $END, randomly organized file:"
    dateutils.dgrep -i "%m/%d/%Y" ">=$START" '&&' "<=$END" < data2
produces:

    -----
     Input data file :
    1,10/22/2017,Scheduled
    2,10/23/2017,Confimred
    1,10/24/2017,NA
    1,10/29/2017,Scheduled
    3,11/1/2017,Scheduled
    1,11/2/2017,Scheduled

    -----
     Results, from 10/29/2017 through 11/2/2017:
    1,10/29/2017,Scheduled
    3,11/1/2017,Scheduled
    1,11/2/2017,Scheduled

    -----
     Unsorted file, data2:
    1,10/22/2017,Scheduled
    1,10/24/2017,NA
    1,10/29/2017,Scheduled
    1,11/2/2017,Scheduled
    2,10/23/2017,Confimred
    3,11/1/2017,Scheduled

    -----
     Results, from 10/29/2017 through 11/2/2017, randomly organized file:
    1,10/29/2017,Scheduled
    1,11/2/2017,Scheduled
    3,11/1/2017,Scheduled
on a system like this:

    OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
    Distribution        : Debian 8.9 (jessie)
    bash GNU bash 4.3.30
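Because the comparison is done arithmetically on the data as dates, the input can be in any order. If needed, the final result can be sorted; see sort, msort, dsort. The dateutils code is available in many repositories and on OSX (via brew).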
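As a rough illustration of that sorting step, here is a minimal sketch using plain awk, sort and cut instead of msort or dsort. It assumes M/D/YYYY dates in column 2; sort_by_date is a hypothetical helper name, not something from dateutils.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer).
# Usage: sort_by_date [FILE...]   (reads stdin when no FILE is given)
# Assumption: field 2 of each comma-separated line is a date in M/D/YYYY form.
sort_by_date() {
    # Decorate each line with a YYYYMMDD key and a tab, sort numerically
    # on that key, then strip the key again.
    awk -F, '{
        split($2, p, "/")
        printf "%04d%02d%02d\t%s\n", p[3], p[1], p[2], $0
    }' "$@" | sort -n | cut -f2-
}
```

Piping the dgrep output for data2 through `sort_by_date` should put the filtered rows in chronological order.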
Some details for dateutils.dgrep:

    dateutils.dgrep Grep standard input for lines that match EXPRESSION. (man)
    Path    : /usr/bin/dateutils.dgrep
    Package : dateutils
    Home    : http://www.fresse.org/dateutils
    Version : 0.3.1
    Type    : ELF64-bitLSBsharedobject,x86-64,version1(S ...)
    Help    : probably available with -h,--help
    Home    : https://github.com/hroptatyr/dateutils (doc)
Best wishes ... cheers, drl