Text-Processing

Using Bash / awk to fill empty fields in the first column of a csv file

  • September 22, 2021

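I am looking into automating some processes/calculations, but I may first need to reformat a slightly awkward set of CSV files. (I am using bash for this, as requested.)

The set of csv files (roughly) follows this format: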

CODE,Sitting,Jan,Feb,Mar,Apr,May,Jun,Jul,Totals
CLLK_J9,First Sitting,,,2,5,2,,,10
,Second Sitting,,,,,,,,1
RTHM_A8,First Sitting,,,1,,3,,,6
,Second Sitting,,,,,1,,,1
FFBJ_FA9,First Sitting,,,,8,6,,,25
,Second Sitting,,,,,11,,,12
UUYIOR_HJ9,First Sitting,,,1,3,6,,,17
IKRO_Lk8,First Sitting,,,,3,3,,,37
,Second Sitting,,,,6,11,,,34

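I am trying to fill the empty fields in the CODE column with the value from the previous row (these empty fields usually appear next to the "Second Sitting" entries in column 2). So, for the example above, the result should look like this: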

CODE,Sitting,Jan,Feb,Mar,Apr,May,Jun,Jul,Totals
CLLK_J9,First Sitting,,,2,5,2,,,10
CLLK_J9,Second Sitting,,,,,,,,1
etc.

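I have started reading some awk documentation, since it seems to be a fairly powerful utility for this task, but I have not made any progress yet. Any ideas?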

This is very simple with Miller (https://github.com/johnkerl/miller). Run

mlr --csv fill-down -f CODE input.csv >output.csv

and you will have (shown here as a table, with - marking empty cells)

+------------+----------------+-----+-----+-----+-----+-----+-----+-----+--------+
| CODE       | Sitting        | Jan | Feb | Mar | Apr | May | Jun | Jul | Totals |
+------------+----------------+-----+-----+-----+-----+-----+-----+-----+--------+
| CLLK_J9    | First Sitting  | -   | -   | 2   | 5   | 2   | -   | -   | 10     |
| CLLK_J9    | Second Sitting | -   | -   | -   | -   | -   | -   | -   | 1      |
| RTHM_A8    | First Sitting  | -   | -   | 1   | -   | 3   | -   | -   | 6      |
| RTHM_A8    | Second Sitting | -   | -   | -   | -   | 1   | -   | -   | 1      |
| FFBJ_FA9   | First Sitting  | -   | -   | -   | 8   | 6   | -   | -   | 25     |
| FFBJ_FA9   | Second Sitting | -   | -   | -   | -   | 11  | -   | -   | 12     |
| UUYIOR_HJ9 | First Sitting  | -   | -   | 1   | 3   | 6   | -   | -   | 17     |
| IKRO_Lk8   | First Sitting  | -   | -   | -   | 3   | 3   | -   | -   | 37     |
| IKRO_Lk8   | Second Sitting | -   | -   | -   | 6   | 11  | -   | -   | 34     |
+------------+----------------+-----+-----+-----+-----+-----+-----+-----+--------+
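
Since the question asks about awk, the same fill-down can also be sketched in plain awk. This is a minimal sketch, not a general CSV parser: it assumes the first line is a header, that fields are separated by bare commas, and that no field is quoted or contains an embedded comma or newline. The function name fill_down_first_column is made up for illustration.

```shell
# Hypothetical helper: copy the last non-empty CODE value into rows
# whose first field is empty. Assumes simple, unquoted CSV with a header line.
fill_down_first_column() {
    awk '
        BEGIN { FS = OFS = "," }       # read and write comma-separated fields
        NR > 1 {                       # leave the header line untouched
            if ($1 == "") $1 = last    # empty CODE: reuse the previous one
            else last = $1             # remember the latest non-empty CODE
        }
        { print }
    ' "$@"
}
```

It would be called as fill_down_first_column input.csv >output.csv. For files with quoted fields, a CSV-aware tool such as Miller is the safer choice.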

Source: https://unix.stackexchange.com/questions/601138