Linux
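linux + how to extract values from an xml file
I want to extract all the values from this xml file and print them to a file named out1.txt
Note - the values from the xml are the words inside the double quotes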
more input.txt
<app name="UAT/ECC/Global/MES/1206/MRP-S23" ear="UAT/ECC/Global/MES/1206/MRP-S23.ear" xml="UAT/ECC/Glal/ME/120/MRP- S23.xml"/>
<app name="OQ/ediedbn/adSFSF/adSFSF-CL" ear="OQ/ebn/aSF/adSF- CL.ear" xml="OQ/ediedbn/adSFSF/adSSF-CL.xml"/>
<app name="OQ/ediedbn/adaEBS/adOrBS-HR-CL" ear="OQ/ediedbn/adOraS/araEBS- HR-CL.ear" xml="OQ/eddbn/aOraEBS/adOEBS- HR-CL.xml"/>
<app name="UAT/CZ/LIMS/T068_01/LIMS-QA-S03" ear="UAT/CZ/LIS/T068_01/LIS-QA- .ear" xml="UAT/CZ/LIMS/T068_01/LIMS-QA-S03.xml"/>
.
more out1.txt
UAT/ECC/Global/MES/1206/MRP-S23
UAT/ECC/Glal/ME/120/MRP-S23.xml
OQ/ediedbn/adSFSF/adSFSF-CL
OQ/ebn/aSF/adSF- CL.ear
.
.
.
Please advise how to extract the values into out1.txt using an awk / perl one-liner in bash
You can slice the input file with awk like this:
gv@debian:$ cat a.txt
<app name="UAT/ECC/Global/MES/1206/MRP-S23" ear="UAT/ECC/Global/MES/1206/MRP-S23.ear" xml="UAT/ECC/Glal/ME/120/MRP- S23.xml"/>
<app name="OQ/ediedbn/adSFSF/adSFSF-CL" ear="OQ/ebn/aSF/adSF- CL.ear" xml="OQ/ediedbn/adSFSF/adSSF-CL.xml"/>
<app name="OQ/ediedbn/adaEBS/adOrBS-HR-CL" ear="OQ/ediedbn/adOraS/araEBS- HR-CL.ear" xml="OQ/eddbn/aOraEBS/adOEBS- HR-CL.xml"/>
<app name="UAT/CZ/LIMS/T068_01/LIMS-QA-S03" ear="UAT/CZ/LIS/T068_01/LIS-QA- .ear" xml="UAT/CZ/LIMS/T068_01/LIMS-QA-S03.xml"/>
gv@debian:$ cat b.txt
gv@debian:$ awk -F"name=|ear=|xml=|/>" '{print $2} {print $4}' a.txt >b.txt
gv@debian:$ cat b.txt
"UAT/ECC/Global/MES/1206/MRP-S23"
"UAT/ECC/Glal/ME/120/MRP- S23.xml"
"OQ/ediedbn/adSFSF/adSFSF-CL"
"OQ/ediedbn/adSFSF/adSSF-CL.xml"
"OQ/ediedbn/adaEBS/adOrBS-HR-CL"
"OQ/eddbn/aOraEBS/adOEBS- HR-CL.xml"
"UAT/CZ/LIMS/T068_01/LIMS-QA-S03"
"UAT/CZ/LIMS/T068_01/LIMS-QA-S03.xml"
If you don't want to keep the double quotes, you can remove them with sed like this:
gv@debian:$ sed -i 's/\"//g' b.txt
gv@debian:$ cat b.txt
UAT/ECC/Global/MES/1206/MRP-S23
UAT/ECC/Glal/ME/120/MRP- S23.xml
OQ/ediedbn/adSFSF/adSFSF-CL
OQ/ediedbn/adSFSF/adSSF-CL.xml
OQ/ediedbn/adaEBS/adOrBS-HR-CL
OQ/eddbn/aOraEBS/adOEBS- HR-CL.xml
UAT/CZ/LIMS/T068_01/LIMS-QA-S03
UAT/CZ/LIMS/T068_01/LIMS-QA-S03.xml
Or as a one-liner, piping awk into sed:
gv@debian:$ awk -F"name=|ear=|xml=|/>" '{print $2} {print $4}' a.txt |sed 's/\"//g' >b.txt
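As a possible alternative to the awk-plus-sed pipeline, the double quote itself can serve as the field separator, so the values come out unquoted in one step. This is only a sketch: the function name `name_and_xml` is made up for illustration, and it assumes every line has exactly the `name`, `ear`, `xml` attribute order shown in a.txt.

```shell
# name_and_xml is a hypothetical helper name, not part of the original answer.
# With the double quote as the field separator, the quoted values are the
# even-numbered fields: $2 = name, $4 = ear, $6 = xml.
name_and_xml() {
  awk -F'"' '{print $2; print $6}' "$@"
}

# Usage with one line taken from a.txt:
printf '%s\n' '<app name="OQ/ediedbn/adSFSF/adSFSF-CL" ear="OQ/ebn/aSF/adSF- CL.ear" xml="OQ/ediedbn/adSFSF/adSSF-CL.xml"/>' | name_and_xml
```

On the real file this would be called as `name_and_xml a.txt >b.txt`.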
Tip: if you want all the fields from each input line written on a single line of the output file, use
{print $2 $4}
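(put the fields inside the same braces). The key to this awk approach is that awk accepts multi-character delimiters, and several alternative delimiters can be separated by | (meaning "or"). A separator longer than one character is treated as a regular expression.
The awk delimiter is set with the -F option.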
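Because the -F value is a regular expression, the quotes and the blanks around each attribute can be folded into the separator itself. The sketch below is one way that might look, not the answer's own command: `name_xml_per_line` is a hypothetical name, the regex assumes lines shaped like those in a.txt, and a tab is chosen arbitrarily as the output separator.

```shell
# name_xml_per_line is a hypothetical helper name used for illustration.
# The separator matches an optional closing quote, optional blanks, an
# attribute name and its opening quote, or the final "/> of the tag.
# Field numbers stay the same as in the answer: $2 = name, $3 = ear, $4 = xml.
name_xml_per_line() {
  awk -F'"? *(name|ear|xml)="|"/>' -v OFS='\t' '{print $2, $4}' "$@"
}

# Usage with one line taken from a.txt; prints name and xml separated by a tab:
printf '%s\n' '<app name="OQ/ediedbn/adSFSF/adSFSF-CL" ear="OQ/ebn/aSF/adSF- CL.ear" xml="OQ/ediedbn/adSFSF/adSSF-CL.xml"/>' | name_xml_per_line
```

The comma in `print $2, $4` inserts OFS between the two fields, whereas `print $2 $4` simply concatenates them.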
If you need to keep the ear value, replace {print $4} with {print $3}.
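If all three values (name, ear and xml) are wanted rather than a choice between $3 and $4, one option is to pull out every double-quoted string. This is a rough sketch that assumes a grep supporting the `-o` option (GNU, BSD and busybox grep do; it is not required by POSIX); `all_values` is a hypothetical name.

```shell
# all_values is a hypothetical helper name used for illustration.
# grep -o prints each quoted string on its own line; tr then removes the quotes.
all_values() {
  grep -o '"[^"]*"' | tr -d '"'
}

# Usage with one line taken from a.txt (reads standard input):
printf '%s\n' '<app name="UAT/CZ/LIMS/T068_01/LIMS-QA-S03" ear="UAT/CZ/LIS/T068_01/LIS-QA- .ear" xml="UAT/CZ/LIMS/T068_01/LIMS-QA-S03.xml"/>' | all_values
```

On the real file this would be `all_values <a.txt >b.txt`.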
To understand this awk slicing, look at all the fields that awk splits out:
$ awk -F"name=|ear=|xml=|/>" '{print "Field1="$1} {print "Field2="$2} {print "Field3="$3} {print "Field4="$4}' a.txt
Field1=<app
Field2="UAT/ECC/Global/MES/1206/MRP-S23"
Field3="UAT/ECC/Global/MES/1206/MRP-S23.ear"
Field4="UAT/ECC/Glal/ME/120/MRP- S23.xml"
Field1=<app
Field2="OQ/ediedbn/adSFSF/adSFSF-CL"
Field3="OQ/ebn/aSF/adSF- CL.ear"
Field4="OQ/ediedbn/adSFSF/adSSF-CL.xml"
Field1=<app
Field2="OQ/ediedbn/adaEBS/adOrBS-HR-CL"
Field3="OQ/ediedbn/adOraS/araEBS- HR-CL.ear"
Field4="OQ/eddbn/aOraEBS/adOEBS- HR-CL.xml"
Field1=<app
Field2="UAT/CZ/LIMS/T068_01/LIMS-QA-S03"
Field3="UAT/CZ/LIS/T068_01/LIS-QA- .ear"
Field4="UAT/CZ/LIMS/T068_01/LIMS-QA-S03.xml"
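The same inspection can be written with a loop over NF, so that it keeps working if a line yields more or fewer fields than expected. A small sketch using the answer's separator; `show_fields` is a hypothetical name.

```shell
# show_fields is a hypothetical helper name used for illustration.
# It prints every field awk produces for each line, numbered from 1 to NF.
show_fields() {
  awk -F'name=|ear=|xml=|/>' '{
    for (i = 1; i <= NF; i++) printf "Field%d=%s\n", i, $i
  }' "$@"
}

# Usage with one line taken from a.txt:
printf '%s\n' '<app name="OQ/ediedbn/adSFSF/adSFSF-CL" ear="OQ/ebn/aSF/adSF- CL.ear" xml="OQ/ediedbn/adSFSF/adSSF-CL.xml"/>' | show_fields
```

With this separator the loop also reports an empty fifth field, which is whatever follows the closing `/>`.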