Linux

Filtering duplicates with different timestamps using AWK

  • July 19, 2022

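Given a list of files grouped by name and carrying timestamps, as shown below, I want to retrieve the last occurrence of each file (the bottom entry for each file name).

For example: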

archive-daily/document-sell-report-2022-07-12-23-21-02.html
archive-daily/document-sell-report-2022-07-13-23-15-34.html
archive-daily/document-loan-report-2022-07-18-05-12-16.html
archive-daily/document-loan-report-2022-07-18-17-07-26.html
archive-daily/document-deb-report-2022-07-18-13-17-40.html
archive-daily/document-deb-report-2022-07-18-10-04-21.html

should become:

archive-daily/document-sell-report-2022-07-13-23-15-34.html
archive-daily/document-loan-report-2022-07-18-17-07-26.html
archive-daily/document-deb-report-2022-07-18-10-04-21.html

Can I achieve this with awk or any other command? Thanks in advance.

Using sed and tac

$ sed -En 'G;/^(([^-]*-){3}).*\n.*\n\1/d;H;P' <(tac input_file)
archive-daily/document-deb-report-2022-07-18-10-04-21.html
archive-daily/document-loan-report-2022-07-18-17-07-26.html
archive-daily/document-sell-report-2022-07-13-23-15-34.html
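The sed one-liner is dense, so here is the same script spread over several lines with comments. This is only an annotated sketch: the wrapper name keep_last_sed is hypothetical, it reads the list on standard input instead of using process substitution, and it assumes GNU sed and tac. Note that the key is everything up to the third hyphen, so two names that differ only after that point would be treated as the same file.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed answer; reads the file list on stdin.
# Assumes GNU sed and GNU tac.
keep_last_sed() {
    tac | sed -En '
        # Append the hold space (lines kept so far) to the current line.
        G
        # \1 is the text up to the third hyphen. If a kept line starts
        # with the same text, a newer entry was already printed: drop this one.
        /^(([^-]*-){3}).*\n.*\n\1/d
        # Otherwise remember the pattern space in the hold space ...
        H
        # ... and print only its first line, i.e. the current input line.
        P
    '
}
```

It could be called as `keep_last_sed < input_file`.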

Using tac and awk

$ tac file | awk '!seen[substr($0,1,length()-25)]++'
archive-daily/document-deb-report-2022-07-18-10-04-21.html
archive-daily/document-loan-report-2022-07-18-17-07-26.html
archive-daily/document-sell-report-2022-07-13-23-15-34.html
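
If the original top-to-bottom order matters, as in the expected listing in the question, a single awk pass can do the job without tac. The following is a minimal sketch and not part of the answers above: the function name last_per_report is hypothetical, and it reuses the assumption that every name ends in a fixed 25-character suffix (-YYYY-MM-DD-HH-MM-SS.html).

```shell
#!/bin/sh
# Hypothetical helper: print the last line seen for each report name,
# in order of first appearance. Reads the named files, or stdin if none.
last_per_report() {
    awk '
    {
        # Key = line minus the assumed 25-character timestamp suffix.
        key = substr($0, 1, length($0) - 25)
        # Record each key the first time it appears, to keep input order.
        if (!(key in last)) order[++n] = key
        # Later lines overwrite earlier ones, so the last one wins.
        last[key] = $0
    }
    END {
        for (i = 1; i <= n; i++) print last[order[i]]
    }
    ' "$@"
}
```

It could be called as `last_per_report input_file`. On the sample list this prints the sell, loan and deb entries in the same order as the expected output in the question.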

Source: https://unix.stackexchange.com/questions/710346