How to filter a date range from a specific column of a CSV file?

April 9, 2018

Consider the input file:
1,10/22/2017,Scheduled
2,10/23/2017,Confimred
1,10/24/2017,NA
1,10/29/2017,Scheduled
3,11/1/2017,Scheduled
1,11/2/2017,Scheduled
How can I filter the rows whose date in the second column falls within a date range supplied as input?

Using awk and calling the shell date command through a getline pipe:

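awk -v start="$start" -v end="$end" -F, '
# Convert both range endpoints to epoch seconds once (date -d needs GNU date).
BEGIN{srt="date -d"start" +%s"; srt|getline start; close(srt);
     ed="date -d"end" +%s"; ed|getline end; close(ed) }
# Save the original line, then replace field 2 with its epoch value.
{ bkp=$0; epoch="date -d"$2" +%s";epoch |getline $2;close(epoch)};
# Print the saved line when the date lies inside the inclusive range.
   ($2>=start && $2<=end){print bkp}' infile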

For the following input:

1,10/22/2017,Scheduled
1,10/24/2017,NA
1,10/24/2017,NA,NA
1,10/29/2017,Scheduled
3,11/1/2017,Scheduled
1,11/2/2017,NA
5,9/30/2017,Confirmed
6,10/1/2017,Scheduled

with start='10/24/2017' and end='11/1/2017', the result is:

1,10/24/2017,NA
1,10/24/2017,NA,NA
1,10/29/2017,Scheduled
3,11/1/2017,Scheduled
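
The awk command above starts one date process per input line, which can be slow on large files. As a rough alternative sketch (not the original answerer's method), the m/d/yyyy date can be turned into the integer yyyymmdd inside awk and compared directly. The helper name filter_dates is hypothetical, and the sketch assumes field 2 is always in m/d/yyyy form; it does no validation.

```shell
# Hypothetical helper: filter_dates START END FILE
# Assumes field 2 is always m/d/yyyy; malformed dates are not detected.
filter_dates() {
    awk -F, -v start="$1" -v end="$2" '
        # Map m/d/yyyy to the integer yyyymmdd, which orders like the date.
        function key(d,    p) { split(d, p, "/"); return p[3] * 10000 + p[1] * 100 + p[2] }
        BEGIN { lo = key(start); hi = key(end) }
        { k = key($2) }
        k >= lo && k <= hi      # no action given, so matching lines are printed
    ' "$3"
}

# Example run on a temporary copy of the sample input shown above.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
1,10/22/2017,Scheduled
1,10/24/2017,NA
1,10/24/2017,NA,NA
1,10/29/2017,Scheduled
3,11/1/2017,Scheduled
1,11/2/2017,NA
5,9/30/2017,Confirmed
6,10/1/2017,Scheduled
EOF
filter_dates '10/24/2017' '11/1/2017' "$tmp"
rm -f "$tmp"
```

On that sample it prints the same four lines as the result listed above. The trade-off is that it only understands this one date layout, whereas date -d accepts many.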

This snippet, which uses dateutils.dgrep:

# Utility functions: print-as-echo, print-line-with-visual-space.
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }

pl " Input data file $FILE:"
head data1

# start="10/29/2017" end="11/2/2017"
START="10/29/2017"
END="11/2/2017"

pl " Results, from $START through $END:"
dateutils.dgrep -i "%m/%d/%Y" ">=$START" '&&' "<=$END" < data1

pl " Unsorted file, data2:"
head data2

pl " Results, from $START through $END, randomly organized file:"
dateutils.dgrep -i "%m/%d/%Y" ">=$START" '&&' "<=$END" < data2

produces:

-----
Input data file :
1,10/22/2017,Scheduled
2,10/23/2017,Confimred
1,10/24/2017,NA
1,10/29/2017,Scheduled
3,11/1/2017,Scheduled
1,11/2/2017,Scheduled

-----
Results, from 10/29/2017 through 11/2/2017:
1,10/29/2017,Scheduled
3,11/1/2017,Scheduled
1,11/2/2017,Scheduled

-----
Unsorted file, data2:
1,10/22/2017,Scheduled
1,10/24/2017,NA
1,10/29/2017,Scheduled
1,11/2/2017,Scheduled
2,10/23/2017,Confimred
3,11/1/2017,Scheduled

-----
Results, from 10/29/2017 through 11/2/2017, randomly organized file:
1,10/29/2017,Scheduled
1,11/2/2017,Scheduled
3,11/1/2017,Scheduled

on a system like this:

OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.9 (jessie) 
bash GNU bash 4.3.30

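Because the comparison is arithmetic on date-formatted data, the data can be in any order. If needed, the final results can be sorted; see sort, msort, dsort. The dateutils code is available in many repositories and for OSX (via brew).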
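
If the filtered lines should come out in date order, plain sort cannot order m/d/yyyy text directly. One possible sketch is to prefix each line with a yyyymmdd key, sort on that key, and strip it again. The helper name sort_by_date is hypothetical, and the code assumes field 2 is always m/d/yyyy.

```shell
# Hypothetical helper: reads CSV lines on stdin, writes them sorted by field 2.
# Assumes field 2 is always m/d/yyyy.
sort_by_date() {
    awk -F, '
        # Prefix each line with a zero-padded yyyymmdd key and a tab.
        { split($2, p, "/"); printf "%04d%02d%02d\t%s\n", p[3], p[1], p[2], $0 }
    ' | sort -k1,1n | cut -f2-    # sort on the key, then drop it
}

# Example: the unsorted dgrep result shown above.
printf '%s\n' \
    '1,10/29/2017,Scheduled' \
    '1,11/2/2017,Scheduled' \
    '3,11/1/2017,Scheduled' | sort_by_date
```

For that input the order becomes 10/29/2017, 11/1/2017, 11/2/2017. Lines that share the same date are ordered by sort's fallback comparison of the whole line, so their relative order may differ from the input.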

Some details for dateutils.dgrep:

dateutils.dgrep Grep standard input for lines that match EXPRESSION. (man)
Path    : /usr/bin/dateutils.dgrep
Package : dateutils
Home    : http://www.fresse.org/dateutils
Version : 0.3.1
Type    : ELF64-bitLSBsharedobject,x86-64,version1(S ...)
Help    : probably available with -h,--help
Home    : https://github.com/hroptatyr/dateutils (doc)

Best wishes ... cheers, drl

Quoted from: https://unix.stackexchange.com/questions/399710
