Awk
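Format data as a table
How can I take the details below and convert them into horizontal (one row per record) form?
Each record ends after Couse. Couse will never be blank or null. Note: these four headers will be used for the data: Name, City, Age, Couse
If you look at the second record, it has no "Name" entry at all ("Name":"" is missing), so null should be output in its place, with the remaining values appended after it, separated by pipes, like this:

    null | Ors | 11 | MB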
I have data like the following in the file demo.txt:

    "Name":"asxadadad ,aaf dsf"
    "City":"Mum"
    "Age":"23"
    "Couse":"BBS"
    "City":"Ors"
    "Age":"11"
    "Couse":"MB"
    "Name":"adad sf"
    "City":"Kol"
    "Age":"21"
    "Couse":"BB"
    "Name":"pqr"
    "Age":"21"
    "Couse":"NN"
Expected output:

    asxadadad ,aaf dsf | Mum | 23 | BBS
    null | Ors | 11 | MB
    adad sf | Kol | 21 | BB
    pqr | null | 21 | NN
I tried the following code, but it does not do what I intended:

    counter=0
    var_0='Couse'
    while read -r line
        echo "$line"
        counter=$(( counter + 1 ))
        var_1=`echo "$line" | grep -oh "Couse"`
        if [ $var_0 == $var_1 ]
        then
            head -$counter demo.txt > temp.txt
            sed -i '1,$counter' demo.txt
            counter = 0
        else
            echo "No thing to do"
        fi
    done < demo.txt
Using any awk in any shell on every Unix box:

    $ cat tst.awk
    BEGIN {
        numTags = split("Name City Age Couse",nums2tags)
        for (tagNr=1; tagNr<=numTags; tagNr++) {
            tag = nums2tags[tagNr]
            tags2nums[tag] = tagNr
            wids[tagNr] = ( length(tag) > length("null") ? length(tag) : length("null") )
        }
        OFS=" | "
    }
    (NR==1) || (prevTag=="Couse") {
        numRecs++
    }
    {
        gsub(/^"|"$/,"")
        tag = val = $0
        sub(/".*/,"",tag)
        sub(/[^"]+":"/,"",val)
        tagNr = tags2nums[tag]
        vals[numRecs,tagNr] = val
        wid = length(val)
        wids[tagNr] = ( wid > wids[tagNr] ? wid : wids[tagNr] )
        prevTag = tag
    }
    END {
        # Uncomment these 3 lines if you'd like a header line printed:
        # for (tagNr=1; tagNr<=numTags; tagNr++) {
        #     printf "%-*s%s", wids[tagNr], nums2tags[tagNr], (tagNr<numTags ? OFS : ORS)
        # }
        for (recNr=1; recNr<=numRecs; recNr++) {
            for (tagNr=1; tagNr<=numTags; tagNr++) {
                val = ( (recNr,tagNr) in vals ? vals[recNr,tagNr] : "null" )
                printf "%-*s%s", wids[tagNr], val, (tagNr<numTags ? OFS : ORS)
            }
        }
    }

    $ awk -f tst.awk file
    asxadadad ,aaf dsf | Mum  | 23   | BBS
    null               | Ors  | 11   | MB
    adad sf            | Kol  | 21   | BB
    pqr                | null | 21   | NN
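To see how the tag and value are pulled out of each input line, here is a minimal sketch that applies the same `gsub()` and two `sub()` calls as the script above to one line at a time. The function name `split_pair` and the `tag=<...> val=<...>` output format are hypothetical, chosen only for this illustration.

```shell
# split_pair: hypothetical helper, not part of the original answer.
# Reads "Tag":"Value" lines on stdin and prints the two parts it extracts.
split_pair() {
    awk '{
        gsub(/^"|"$/,"")          # strip the leading and the trailing double quote
        tag = val = $0            # now looks like:  Name":"some value
        sub(/".*/,"",tag)         # tag: drop everything from the first quote onward
        sub(/[^"]+":"/,"",val)    # val: drop the tag plus the ":" separator after it
        printf "tag=<%s> val=<%s>\n", tag, val
    }'
}

printf '%s\n' '"Name":"asxadadad ,aaf dsf"' | split_pair
```

Because the value pattern only removes the first `tag":"` prefix, commas and spaces inside the value (as in the first Name) pass through untouched.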
Or, if you don't want to use a hard-coded list of tags (field/column names):

    $ cat tst.awk
    BEGIN {
        OFS=" | "
    }
    (NR==1) || (prevTag=="Couse") {
        numRecs++
    }
    {
        gsub(/^"|"$/,"")
        tag = val = $0
        sub(/".*/,"",tag)
        sub(/[^"]+":"/,"",val)
        if ( !(tag in tags2nums) ) {
            tagNr = ++numTags
            tags2nums[tag] = tagNr
            nums2tags[tagNr] = tag
            wids[tagNr] = ( length(tag) > length("null") ? length(tag) : length("null") )
        }
        tagNr = tags2nums[tag]
        vals[numRecs,tagNr] = val
        wid = length(val)
        wids[tagNr] = ( wid > wids[tagNr] ? wid : wids[tagNr] )
        prevTag = tag
    }
    END {
        for (tagNr=1; tagNr<=numTags; tagNr++) {
            printf "%-*s%s", wids[tagNr], nums2tags[tagNr], (tagNr<numTags ? OFS : ORS)
        }
        for (recNr=1; recNr<=numRecs; recNr++) {
            for (tagNr=1; tagNr<=numTags; tagNr++) {
                val = ( (recNr,tagNr) in vals ? vals[recNr,tagNr] : "null" )
                printf "%-*s%s", wids[tagNr], val, (tagNr<numTags ? OFS : ORS)
            }
        }
    }

    $ awk -f tst.awk file
    Name               | City | Age  | Couse
    asxadadad ,aaf dsf | Mum  | 23   | BBS
    null               | Ors  | 11   | MB
    adad sf            | Kol  | 21   | BB
    pqr                | null | 21   | NN
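Note that in the output of the second script the columns appear in the order in which the tags first appear in the input. That is why it needs the header line to identify the values, unless all of the tags are guaranteed to appear in the input in the order you want them output.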
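The effect of input order on column order can be checked with a small sketch that keeps only the tag-discovery part of the second script and prints the resulting header. The function name `tag_order` and the sample input are hypothetical; the input deliberately starts with a record that has no Name.

```shell
# tag_order: hypothetical helper, not part of the original answer.
# Prints the tags in the order they are first seen on stdin.
tag_order() {
    awk '{
        gsub(/^"|"$/,"")              # strip the outer double quotes
        tag = $0
        sub(/".*/,"",tag)             # keep only the tag name
        if ( !(tag in tags2nums) ) {  # first time this tag is seen
            tagNr = ++numTags
            tags2nums[tag] = tagNr
            nums2tags[tagNr] = tag
        }
    }
    END {
        for (tagNr=1; tagNr<=numTags; tagNr++) {
            printf "%s%s", nums2tags[tagNr], (tagNr<numTags ? " | " : "\n")
        }
    }'
}

# The first record lacks Name, so Name is discovered last.
printf '%s\n' '"City":"Ors"' '"Couse":"MB"' '"Name":"pqr"' '"City":"Kol"' '"Couse":"NN"' | tag_order
```

With this input the header comes out as `City | Couse | Name`, so Couse is no longer the last column. If a fixed column order matters, the first script with its hard-coded tag list is the safer choice.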