Text-Processing

Averaging columns and exporting the result to a new column

  • April 20, 2022

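I have a txt file like the one below (shown here in its .csv form). What I would like to do is take the average of the months (from the 7th to the 10th column) and export it to a new column. But my attempt only gives me a single average number.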

OBSERVATORY,Abbreviations,COUNTRY,ALTITUDE(m),LONGITUDE(deg),LATITUDE (deg),January,February,March,April,May,June,July,August,September,October,November,December
Beverly-Begg Observatory Dunedin,,New Zealand,140,170.49,-45.8644,89.93,86.84,85.26,88.22,89.36,89.8,88.52,90.42,88.74,89.06,91.16,91.36
Aorangi Iti Observatory Lake Tekapo,,New Zealand,718,170.473,-44.0082,63.92,60.44,58.63,65.68,75.97,85.7,84.85,83.7,76.34,70.56,70.2,70.4
Mount John Observatory Lake Tekapo,,New Zealand,945,170.465,-43.9874,62.4,60.91,58.98,67.15,79.45,85.24,86.93,84.96,77.1,72.0,70.9,71.85

My code looks like this:

awk '{ sum += $5 + $6 + $7 + $8 + $9 + $10 + $11 + $12 + $13 + $14
+ $15 } END { print sum / (NR * 18) }' observatory_1.txt > observatory_3.txt


output: 0.104394

I want to create a txt file that looks like this:

OBSERVATORY, Abbreviations, COUNTRY, ALTITUDE(m), LONGITUDE(deg), LATITUDE (deg), MEAN
Beverly-Begg Observatory Dunedin, , New Zealand,  140, 170, 490, -45,8644, 89,05583333

Any suggestions would be appreciated.

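Your script sums a bunch of columns from every input line and then, once it has read all of the input, the END block prints a single output line. In other words, it processes the whole file before producing any output, which is why you get only one number.

What you should do instead is process each input line separately.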

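Your column numbers also seem to be off. For example, why are you including altitude, longitude, and latitude in the average? I'm going to assume that you actually want the average of columns 7 to 18 (January to December).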
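The column numbering is easier to reason about once awk is told to split on commas. Without -F, awk splits on runs of blanks, so a comma-separated line with spaces in the observatory name ends up with only a handful of fields. The following is a minimal sketch that counts fields both ways; the variable names (line, nf_default, nf_comma) are illustrative, and the sample line is copied from the question.

```shell
# Sample data line copied from the question (no file needed).
line='Beverly-Begg Observatory Dunedin,,New Zealand,140,170.49,-45.8644,89.93,86.84,85.26,88.22,89.36,89.8,88.52,90.42,88.74,89.06,91.16,91.36'

# Default splitting: fields are separated by runs of blanks,
# so the commas are not treated as separators at all.
nf_default=$(printf '%s\n' "$line" | awk '{ print NF }')

# Comma splitting: one field per CSV column.
nf_comma=$(printf '%s\n' "$line" | awk -F, '{ print NF }')

echo "default: $nf_default fields, comma: $nf_comma fields"
```

With comma splitting the line has 18 fields, so the twelve months sit in fields 7 through 18.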

In any case, you probably want something more like this:

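awk -F, -v OFS=, '
    # Header line: keep the first six columns and add a MEAN column
    NR == 1 { print $1, $2, $3, $4, $5, $6, "MEAN" }
    # Data lines: average the 12 monthly columns (7 through 18)
    NR  > 1 {
      sum = 0
      for (i = 7; i <= 18; i++) { sum += $i }
      print $1, $2, $3, $4, $5, $6, (sum / 12)
    }' observatory_1.txt > observatory_3.txt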

This will produce output like the following:

OBSERVATORY,Abbreviations,COUNTRY,ALTITUDE(m),LONGITUDE(deg),LATITUDE (deg),MEAN
Beverly-Begg Observatory Dunedin,,New Zealand,140,170.49,-45.8644,89.0558
Aorangi Iti Observatory Lake Tekapo,,New Zealand,718,170.473,-44.0082,72.1992
Mount John Observatory Lake Tekapo,,New Zealand,945,170.465,-43.9874,73.1558

This may not be exactly what you want, but it should be a step in the right direction.
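If more decimal places are wanted in the MEAN column, as in the question's desired output, one option is to format the mean with printf instead of relying on awk's default number formatting. The sketch below is one possible variation, not part of the original answer. It assumes every column after the sixth is a month and that each data row has at least one such column. The csv and result variables are illustrative stand-ins for observatory_1.txt and observatory_3.txt.

```shell
# Two lines of the question's data stand in for observatory_1.txt.
csv='OBSERVATORY,Abbreviations,COUNTRY,ALTITUDE(m),LONGITUDE(deg),LATITUDE (deg),January,February,March,April,May,June,July,August,September,October,November,December
Beverly-Begg Observatory Dunedin,,New Zealand,140,170.49,-45.8644,89.93,86.84,85.26,88.22,89.36,89.8,88.52,90.42,88.74,89.06,91.16,91.36'

result=$(printf '%s\n' "$csv" | awk -F, -v OFS=, '
    # Header line: first six columns plus the new MEAN column
    NR == 1 { print $1, $2, $3, $4, $5, $6, "MEAN"; next }
    {
      sum = 0
      # Assumption: every field after the sixth is a monthly value
      for (i = 7; i <= NF; i++) sum += $i
      # %.8f prints the mean with eight decimal places
      printf "%s,%s,%s,%s,%s,%s,%.8f\n", $1, $2, $3, $4, $5, $6, sum / (NF - 6)
    }')

printf '%s\n' "$result"
```

Dividing by NF - 6 rather than a hard-coded 12 keeps the sketch usable if the number of monthly columns changes, at the cost of trusting that nothing else follows column 6.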

Using Raku (formerly known as Perl_6):

raku -e 'put get.split(",")[0..5].join(",") ~ ",MEAN"; \
     for lines() {my @a = .split(","); \
     put (@a[0..5].join(",") ~ "," ~ @a.[6..*].sum / @a.[6..*].elems)};'

Or:

raku -ne 'state $i=0; ++$i; my @a = .split(","); $i == 1 \
     ?? put @a.[0..5].join(",") ~ ",MEAN" \
     !! put (@a[0..5].join(",") ~ "," ~ @a.[6..*].sum / @a.[6..*].elems);'

Sample input:

OBSERVATORY,Abbreviations,COUNTRY,ALTITUDE(m),LONGITUDE(deg),LATITUDE (deg),January,February,March,April,May,June,July,August,September,October,November,December
Beverly-Begg Observatory Dunedin,,New Zealand,140,170.49,-45.8644,89.93,86.84,85.26,88.22,89.36,89.8,88.52,90.42,88.74,89.06,91.16,91.36
Aorangi Iti Observatory Lake Tekapo,,New Zealand,718,170.473,-44.0082,63.92,60.44,58.63,65.68,75.97,85.7,84.85,83.7,76.34,70.56,70.2,70.4
Mount John Observatory Lake Tekapo,,New Zealand,945,170.465,-43.9874,62.4,60.91,58.98,67.15,79.45,85.24,86.93,84.96,77.1,72.0,70.9,71.85

Sample output (for both code solutions above):

OBSERVATORY,Abbreviations,COUNTRY,ALTITUDE(m),LONGITUDE(deg),LATITUDE (deg),MEAN
Beverly-Begg Observatory Dunedin,,New Zealand,140,170.49,-45.8644,89.055833
Aorangi Iti Observatory Lake Tekapo,,New Zealand,718,170.473,-44.0082,72.199167
Mount John Observatory Lake Tekapo,,New Zealand,945,170.465,-43.9874,73.155833

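A brief explanation of the first solution: get reads the header line, split breaks it on commas, and put prints the first 6 columns followed by MEAN, using .split(",")[0..5].join(",") ~ ",MEAN". (ICYMI, the ~ tilde is used to concatenate strings in Raku.)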

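With the read position now at the second line (the first data line), for lines() reads the rest of the input line by line. Each line is split on "," and the elements are stored in the @a array. The first six columns, @a[0..5], are then put together with the mean, which is computed by @a.[6..*].sum / @a.[6..*].elems.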

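Note that you could probably hard-code the column indices with @a.[6..17].sum / @a.[6..17].elems, and if there are always 12 monthly columns you could use @a.[6..17].sum / 12 instead. Finally, if you need to adjust for missing values, use @a.[6..17].map(*.chars > 0).sum as the denominator.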
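The same missing-value adjustment can be sketched in awk by counting only the non-empty monthly fields and dividing by that count. This is an illustrative translation of the idea, not code from either answer. The row below is made up (February and March are left empty), and printing "NA" when every month is missing is an assumption of this sketch.

```shell
# Hypothetical row: fields 8 and 9 (February, March) are empty.
row='Test Observatory,,Nowhere,100,1.0,2.0,10,,,20,30,40,50,60,70,80,90,100'

mean=$(printf '%s\n' "$row" | awk -F, '{
    sum = 0; n = 0
    for (i = 7; i <= 18; i++) {
        # Count and add only the fields that actually hold a value
        if ($i != "") { sum += $i; n++ }
    }
    # Avoid dividing by zero when all twelve months are missing
    if (n > 0) print sum / n; else print "NA"
}')

echo "$mean"
```

Here the ten remaining values add up to 550, so the mean is taken over 10 months rather than 12.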

https://raku.org

Quoted from: https://unix.stackexchange.com/questions/699409