Linux
How do I calculate the number of values (n) in each column and the overall average?
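I'd really appreciate some help with this, as I'm a relative newcomer to Linux. I'm using grep to extract values, but I'd also like the average of all the extracted values (regardless of which column they are in) and the number of values (n) extracted from each file, placed under each column. The command: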
    grep -o "201[1-4].[0-9][ -9]" file1.txt file2.txt file3.txt \
    | awk -F: '
        {
            if (!s[$1]){ s[$1]=$2 }
            else { s[$1]=s[$1]","$2 }
        }
        END {
            for (f in s){ print f","s[f] }
        }' \
    | csvtool transpose -u " " - \
    | column -nt
Current output:
    file1.txt  file3.txt  file2.txt
    2013.17    2012.69    2013.54
    2012.6     2013.44    2013.9
    2013.12    2012.64    2013.66
    2012.76    2013.11    2013.44
    2013.75    2012.6     2013.89
    2013.08    2012.41    2013.62
               2012.41    2013.41
               2013.2
There are always three columns, but the number of rows is unpredictable.
Example of the output I'm after:
    file1.txt  file3.txt  file2.txt
    1          2          3
    2          1          1
                          2
    n=2        n=2        n=3
    Average: 1.714
Sample of the data I'm working with:
File 1:
2896.79 2897.65 2898.82 2012.69 2013.44 2897.4 2896.79 2012.64 2896.71 2217.4 2013.11 2012.6 2012.41 2012.41 2013.2 2897.12 2896.61 2896.35 2896.85 2896.26 2896.72 2913.91 2914.41 2914.27 2329.68 2329.71 2914.15 2914.32 2321.19 2914.02 2329.32 2896.49 2025.55 2328.84 2328.98 2329.1 2913.96 2913.48 2913.36 2913.97 2913.46 2913.71 2925.09 2925.58 2926.69 2401.39 2897.65 2925.77 2925.55 2328.96 2924.86 2897.19 2913.95 2029.61 2896.86 2896.93 2764.59 2925.18 2924.96 2924.68 2925.03 2924.18 2924.72 2933.54 2933.91 3196.19 2897.73 2914.79 3314.12 3016.04 2400.29 3015.62 2914.03 2925.09 2897.07 2913.69 2914.41 2897.38 2934.31 3058.51 3015.26 2934.32 2932.55 2933.38 2952.71 2953.49 3223.9 2914.91 2926.03 3321.3 3066.1 2896.71 3065.91 2925.14 2934.16 2914.04 2924.74 2925.54 2914.6 2952.92 3065.18 3065.74 2953.23 3072.91 2952.84 3016.02 3016.4 3249.51 2925.73 2932.82 3373.04 3073.91 2913.87 3073.65 2953.29 2952.94 2925.78 2952.15 2933.65 2925.67 3015.76 3073.21 3072.94 3065.81 3222.86 3015.45 3065.42 3059.27 3313.84 2953.72 2953.17 3444.15 3081.47 2925.02 3192.84 3015.73 3015.47 2953.12 3015.35 2953.29 2953.84 3073.71 3223.42 3080.34 3073.98 3312.09 3059.03
File 2:
2013.17 2012.6 2013.12 2036.82 2037.29 2036.53 2036.44 2032.6 2032.88 2012.76 2037.57 2037.26 2037.5 2042.89 2038.1 2013.75 2037.73 2038 2037.93 2033.5 2033.26 2013.08 2033.07 2033.03 2032.99 2042.08 2042.15 2042.14 2041.82 2036.84 2036.88 2033.27 2042.66 2042.65 2042.61 2461.68 2042.94 2037.45 2042.99 2042.96 2043.04 2037.29 2037.32 2033.44 2037.25 2037.27 2037.26 2080.15 2080.48 2080.35 2079.99 2042.18 2042.22 2037.31 2461.16 2080.81 2080.72 2465.94 2461.39 2043 2081.13 2081.08 2081.26 2042.62 2042.63 2037.55 2042.56 2042.49 2042.41 2464.77 2465.3 2465.08 2460.36 2053.03 2465.1 2042.58 2465.84 2461.76 2460.66 2473.93 2466.23 2461.58 2461.48 2461.6 2466.06 2053.48 2053.35 2042.68 2053.26 2053.42 2053.79 2480.18 2473.43 2472.84 2464.91 2080.37 2480.47 2058.27 2473.22 2465.78 2465.78 2482.02 2474.04 2466.07 2466.05 2466.01 2474.01 2080.88 2080.75 2053.24 2076.01 2059.33 2058.08 2500.19 2481.17 2480.7 2472.76 2460.1 2529.24 2076.3 2481.38 2473.76 2473.51 2501.38 2482.1 2473.97 2474.02 2473.99 2482.05 2276.73 2276.52 2058.42 2080.83 2075.97 2075.97 2529.14 2529.55 2529.28 2481.04 2465.12 2537.59 2080.44 2489.75 2481.63 2481.37 2525.17 2490.26 2482.1 2481.98 2481.96 2501.93 2465.52 2465.58 2076.22 2250.64 2080.54 2080.49 2537.07 2536.95 2537.65 2487.59 2473 2619.65 2276.27 2496.5 2500.38 2489.45 2530.2 2502.23 2525.03 2490.46 2501.06 2530.44 2500.93 2481.26 2080.85 2276.67 2118.71 2275.92 2635.42 2547.03 2544.73 2503.45 2480.94 2636.12 2465.35 2500.25 2524.95 2524.14 2538.07 2524.69 2530.47 2530.45 2524.9 2538.6 2529.88 2500.92 2276.34
File 3:
2207.2 2003.43 6628.01 2013.54 2013.9 2914.93 2003.72 3315.09 2013.66 2013.44 2147.76 2147.67 2207.45 2147.93 2013.89 2013.62 2008.56 2914.99 6632.04 2252.13 2036.51 2147.79 2036.93 2926.08 2013.41 5833.85 2037.51 2037.41 2206.79 2207.16 2898.47 2207.22 2037.11 2147.77 2037.9 3060 2639.52 2120.66 2206.81 2147.77 3016.02 2036.57 6630.91 2147.94 2147.93 2914.59 2914.66 2915.5 2898.31 2207.46 2206.73 2147.96 3225.13 2829.69 2147.96 2329.47 2207.1 3059.21 2147.81 2207.22 2207.15 3015.96 3058.98 2926.66 2915.11 2898.69 2329.31 2166.65 3314.22 2914.74 2206.87 2897.84 2252.53 3225 2329.91 2329.35 2329.69 3031.21 3224.88 3059.82 2926.17 2915.3 2897.89 2207.42 5833.23 3015.61 2252.38 2914.72 2329.72 3265.74 2897.86 2897.85 2897.81 3058.98 3265.62 3225.63 3059.46 2926.66 2914.67 2253.44 6034.36 3030.72 2329.24 2925.98 2897.89 3305.35 2914.99 2915 2914.72 3077.57 3305.36 3266.57 3225.4 3016.03 2925.65 2330.06 6121.01
A one-step solution to your actual problem:
    $ grep -o '201[1-4].[0-9]\+' file1.txt file2.txt file3.txt \
        | datamash --sort -t: -g1 count 2 mean 2
    file1.txt:8:2012.8125
    file2.txt:6:2013.08
    file3.txt:7:2013.6371428571
grep pulls the values out of the files, and datamash counts the items and computes the mean per file. Each file is now reduced to a single line:
filename:n:average
Easier, right?
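If datamash is not installed, the same per-file count and mean can be sketched with grep and awk alone. This is only a minimal sketch, not the command used above: the function name per_file_stats and the demo values are made up for illustration, the pattern escapes the dot (\.) so that only a literal decimal point matches, and grep -H (available in GNU and BSD grep) forces the filename prefix even for a single file. The mean is printed with four decimals, so it will not match datamash's formatting digit for digit.

```shell
#!/bin/sh
# per_file_stats FILE... : print "filename:n:mean" for each file with matches.
# Hypothetical helper; assumes file names contain no ':' characters.
per_file_stats() {
    # -H forces the "filename:" prefix even when only one file is given.
    grep -H -o '201[1-4]\.[0-9][0-9]*' "$@" |
    awk -F: '
        { n[$1]++; sum[$1] += $2 }      # count and sum per file
        END {
            for (f in n)
                printf "%s:%d:%.4f\n", f, n[f], sum[f] / n[f]
        }' |
    sort    # the order of awk for-in loops is unspecified, so sort by name
}

# Demo on two tiny throwaway files (made-up values, not the question's data).
tmpdir=$(mktemp -d)
printf '2896.79 2012.5 2013.5\n' > "$tmpdir/a.txt"
printf '2011.25 3000.00\n'       > "$tmpdir/b.txt"
( cd "$tmpdir" && per_file_stats a.txt b.txt )
rm -r "$tmpdir"
```

The demo prints a.txt:2:2013.0000 followed by b.txt:1:2011.2500, the same filename:n:average layout as above.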
To get the average over all files, drop the grouping:
    $ grep -o '201[1-4].[0-9]\+' file1.txt file2.txt file3.txt \
        | datamash --sort -t: mean 2
    2013.1638095238
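Dropping the grouping pools all 21 values, so every value counts equally. Because the files contribute different numbers of values (8, 6 and 7), that is not the same as averaging the three per-file means. The following sketch illustrates the difference using the per-file lines from the earlier output; pooled_mean is a hypothetical helper name, and awk is used only to do the arithmetic.

```shell
#!/bin/sh
# pooled_mean: read "filename:n:mean" lines on stdin and print both the
# pooled (count-weighted) mean and the plain mean of the per-file means.
pooled_mean() {
    awk -F: '
        { total += $2 * $3; n += $2; means += $3; files++ }
        END {
            if (n == 0) exit 1          # no input: nothing to average
            printf "pooled: %.4f\n", total / n
            printf "mean of means: %.4f\n", means / files
        }'
}

# Per-file results copied from the grouped datamash output.
printf '%s\n' \
    'file1.txt:8:2012.8125' \
    'file2.txt:6:2013.08' \
    'file3.txt:7:2013.6371428571' | pooled_mean
```

The pooled figure comes out as 2013.1638, matching the ungrouped datamash result, while the mean of means is 2013.1765.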
If you need nicely formatted tabular output, try something like this:
    $ cat mktable.sh
    #!/bin/bash

    myfiles="$@"
    trap "rm ${myfiles//txt/txt.tempfile}" EXIT SIGTERM SIGINT

    declare -A count

    for f in $myfiles ; do
        # write the tempfile AND get the linecount simultaneously
        count[$f]="$(grep -o '201[1-4].[0-9]\+' "$f" | tee ${f}.tempfile | wc -l)"
        sed -i "1i $f" ${f}.tempfile          # write header
        sed -i "2i ---------" ${f}.tempfile   # write header
    done

    (
        paste ${myfiles//txt/txt.tempfile} ;
        for item in $myfiles ; do echo -n '--------- '; done; echo
        for item in $myfiles ; do echo -n "n=${count[$item]} " ; done ; echo ;
        for item in $myfiles ; do echo -n '--------- '; done; echo
    )\
    | column -nt

    echo "Average: $(grep -o '201[1-4].[0-9]\+' $myfiles | datamash -s -t: mean 2)"

    $ ./mktable.sh file*.txt
    file1.txt  file2.txt  file3.txt
    ---------  ---------  ---------
    2012.69    2013.17    2013.54
    2013.44    2012.6     2013.9
    2012.64    2013.12    2013.66
    2013.11    2012.76    2013.44
    2012.6     2013.75    2013.89
    2012.41    2013.08    2013.62
    2012.41               2013.41
    2013.2
    ---------  ---------  ---------
    n=8        n=6        n=7
    ---------  ---------  ---------
    Average: 2013.1638095238
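The script above builds its temporary file names by substituting on "txt" and relies on unquoted word splitting, so it assumes names like file1.txt with no spaces. A hedged variation on the same idea, written for plain sh, keeps the temporary columns in a mktemp directory and pads the columns in awk instead of calling column. The function name make_table, the fixed column width of 12 and the demo values are all assumptions made for illustration, awk stands in for datamash, and the separator lines are left out to keep the sketch short.

```shell
#!/bin/sh
# make_table FILE... : print the matched values side by side (one column per
# file), then the per-file counts and the pooled average.
# Hypothetical helper; the body runs in a subshell so its variables stay local.
make_table() (
    [ "$#" -gt 0 ] || exit 1
    pattern='201[1-4]\.[0-9][0-9]*'   # escaped dot: literal decimal point
    tab=$(printf '\t')
    tmpdir=$(mktemp -d) || exit 1
    i=0; cols=; counts=
    for f in "$@"; do
        i=$((i + 1))
        # One temporary column per file: header line, then the matches.
        { printf '%s\n' "$f"; grep -o "$pattern" "$f"; } > "$tmpdir/$i"
        n=$(( $(wc -l < "$tmpdir/$i") - 1 ))    # subtract the header line
        cols="$cols $i"
        counts="$counts${counts:+$tab}n=$n"
    done
    {
        # paste leaves empty fields where a column has run out of values.
        ( cd "$tmpdir" && paste $cols )
        printf '%s\n' "$counts"
    } | awk -F'\t' '{
        line = ""
        for (j = 1; j <= NF; j++) line = line sprintf("%-12s", $j)
        sub(/ +$/, "", line)                    # trim trailing padding
        print line
    }'
    # Pooled average over every matched value from all files.
    grep -h -o "$pattern" "$@" |
        awk '{ s += $1 } END { if (NR) printf "Average: %.4f\n", s / NR }'
    rm -r "$tmpdir"
)

# Demo on two throwaway files (made-up values, not the question's data).
tmp=$(mktemp -d)
printf '2012.5 2999.1 2013.5\n' > "$tmp/a.txt"
printf '2011.0 2014.0 2012.0\n' > "$tmp/b.txt"
( cd "$tmp" && make_table a.txt b.txt )
rm -r "$tmp"
```

Numbering the temporary columns 1, 2, 3 inside the mktemp directory means the original file names never need to be rewritten, so names without a txt suffix or with spaces are handled as well.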