Text-Processing
Extracting columns from comma-separated text
I have a long comma-separated file with 20K lines. Here is a sample:

    "","id","number1","number2","number3","number4","number5","number6","number7"
    "1","MRTAT_1of3.RTS",17.1464602742708,17.1796255746079,17.1132949739337,0.996138996138996,-0.0055810322632996,1,1
    "2","MRTAT_2of3.RTS",3.88270908946253,6.13558056235995,1.62983761656512,0.265637065637066,-1.91247162787182,0.718084341158075,1
    "3","MRTAT_3of3.RTS",3.87323328936623,1.22711611247199,6.51935046626046,5.31274131274131,2.40945646701554,0.676814519398334,1
I want to print the id, number4, number5 and number6 columns, tab-separated, keeping only the rows where number4 is greater than 4.0. Here is some sample output:

    id              number4           number5           number6
    MRTAT_3of3.RTS  5.31274131274131  2.40945646701554  0.676814519398334
    awk -F , -v OFS='\t' 'NR == 1 || $6 > 4 {gsub(/"/, ""); print $2, $6, $7, $8}' input.txt
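If you are unsure which field numbers correspond to id, number4, number5 and number6, you can list the header names next to their positions before writing the filter. The following is only a sketch: it writes the sample rows from the question to a temporary file (the `sample` variable is just an illustrative name), and stripping the double quotes with `gsub` is an assumption about the desired output, not something the question spells out.

```shell
# Temporary file with the sample rows from the question (illustrative only)
sample=$(mktemp)
cat > "$sample" <<'EOF'
"","id","number1","number2","number3","number4","number5","number6","number7"
"1","MRTAT_1of3.RTS",17.1464602742708,17.1796255746079,17.1132949739337,0.996138996138996,-0.0055810322632996,1,1
"2","MRTAT_2of3.RTS",3.88270908946253,6.13558056235995,1.62983761656512,0.265637065637066,-1.91247162787182,0.718084341158075,1
"3","MRTAT_3of3.RTS",3.87323328936623,1.22711611247199,6.51935046626046,5.31274131274131,2.40945646701554,0.676814519398334,1
EOF

# Print every header name together with its awk field number
fields=$(awk -F , 'NR == 1 { for (i = 1; i <= NF; i++) print i, $i }' "$sample")
printf '%s\n' "$fields"

# Keep the header plus rows whose number4 (field 6) exceeds 4,
# drop the double quotes, and print the four wanted fields tab-separated
result=$(awk -F , -v OFS='\t' 'NR == 1 || $6 > 4 {gsub(/"/, ""); print $2, $6, $7, $8}' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

The first listing shows the unnamed row-number column in field 1, id in field 2, and number4 through number6 in fields 6 to 8. With the sample data, the filter prints the header line followed by the MRTAT_3of3.RTS row only.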
I agree that awk is the best solution. You can also do this in bash with the help of a few other tools:
    cut -d , -f 2,6,7,8 filename | tr -d '"' | {
        IFS= read -r header
        tr , $'\t' <<< "$header"
        while IFS=, read -r id num4 num5 num6; do
            # bash can only do integer arithmetic, so let bc compare the decimals
            if [[ $(bc <<< "$num4 > 4.0") = 1 ]]; then
                printf "%s\t%s\t%s\t%s\n" "$id" "$num4" "$num5" "$num6"
            fi
        done
    }