Text-Processing

How do you correctly format this log?

  • March 25, 2021

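So I'm listening with netcat on a specified port and saving the output to a log file. Apparently the log file ends up as a single line... how do I format it properly? Bonus points if you incorporate grc or a similar program to add colour and somehow neatly strip the colour codes that are included: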

Input (this was mistakenly processed by a sed script, so please ignore the /\n/)

[{"startTime":"1","categoryName":"2","data":"3","level":"4","context":"5","pid":7520},"2021-03-25T08:01:08.086Z","default",["6"],{"level":20000,"levelStr":"7","colour":"8"},{},"Hello, log4js! \n","INFO","green"]__LOG4JS__[{"startTime":"1","categoryName":"2","data":"3","level":"4","context":"5","pid":7520},"2021-03-25T08:01:08.096Z","default",["6"],{"level":30000,"levelStr":"7","colour":"8"},{},"Test warning! This is not a drill! \n","WARN","yellow"]__LOG4JS__[{"startTime":"1","categoryName":"2","data":"3","level":"4","context":"5","pid":7520},"2021-03-25T08:01:08.229Z","default",["6"],{"level":20000,"levelStr":"7","colour":"8"},{}/\\n/,"Connected to mongo","INFO","green"]

Output

[{"startTime":"1","categoryName":"2","data":"3","level":"4","context":"5","pid":7520},
"2021-03-25T08:01:08.086Z","default",["6"],{"level":20000,"levelStr":"7","colour":"8"},{},"Hello, log4js! \n","INFO","green"]__LOG4JS__[{"startTime":"1","categoryName":"2","data":"3","level":"4","context":"5","pid":7520},
"2021-03-25T08:01:08.096Z","default",["6"],{"level":30000,"levelStr":"7","colour":"8"},{},"Test warning! This is not a drill! \n","WARN","yellow"]__LOG4JS__[{"startTime":"1","categoryName":"2","data":"3","level":"4","context":"5","pid":7520},
"2021-03-25T08:01:08.229Z","default",["6"],{"level":20000,"levelStr":"7","colour":"8"},{}/\\n/,"Connected to mongo","INFO","green"]

I tried this (note that it is incorrect)

awk '{gsub(/\\n/,"__LOG4JS__")}1' a="$(ncat -l -k 10.0.0.1 10000)" log.log

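It's not at all clear what you're trying to do, but if it's just to convert every __LOG4JS__ to a newline, then this does it, using GNU awk for its multi-character RS. (I fixed your input by removing the spurious /\\n/ in the 3rd record, which made that record invalid JSON and which I assume doesn't exist in your real data):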

$ awk -v RS='__LOG4JS__|\r?\n' '1' file
[{"startTime":"1","categoryName":"2","data":"3","level":"4","context":"5","pid":7520},"2021-03-25T08:01:08.086Z","default",["6"],{"level":20000,"levelStr":"7","colour":"8"},{},"Hello, log4js! \n","INFO","green"]
[{"startTime":"1","categoryName":"2","data":"3","level":"4","context":"5","pid":7520},"2021-03-25T08:01:08.096Z","default",["6"],{"level":30000,"levelStr":"7","colour":"8"},{},"Test warning! This is not a drill! \n","WARN","yellow"]
[{"startTime":"1","categoryName":"2","data":"3","level":"4","context":"5","pid":7520},"2021-03-25T08:01:08.229Z","default",["6"],{"level":20000,"levelStr":"7","colour":"8"},{},"Connected to mongo","INFO","green"]
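Since the data comes from a listener rather than a static file, the same RS setting could be wrapped in a small function and placed in the pipeline. This is only a sketch: the function name split_log4js is made up for illustration, the length($0) guard that skips empty records and the fflush() call are my additions, and it assumes an awk that treats a multi-character RS as a regex (GNU awk does; POSIX leaves that unspecified).

```shell
# Split a log4js stream on its __LOG4JS__ delimiter, one record per line.
# Assumption: awk accepts a regex RS (true for GNU awk, not guaranteed by POSIX).
split_log4js() {
    # length($0) skips empty records; fflush() pushes each record out
    # immediately instead of waiting for the output buffer to fill.
    awk -v RS='__LOG4JS__|\r?\n' 'length($0) { print; fflush() }'
}

# Hypothetical usage with the listener from the question:
#   ncat -l -k 10.0.0.1 10000 | split_log4js >> log.log
```

Depending on the awk, a record may only be printed once the following delimiter (or end of input) has been read, so the newest entry can lag slightly on a live stream.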

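You can do the same with any awk, but because the log is a single line this reads the whole file into memory at once (as would the equivalent sed solution):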

awk '{sub(/\r$/,""); gsub(/__LOG4JS__/,ORS)}1' file

If you're trying to do something else, please clarify what that is and update the example in your question to show the expected output.

If you just want to see pretty-printed JSON, pipe the output of the above into jq:

$ awk -v RS='__LOG4JS__|\r?\n' '1' file | jq .
[
 {
   "startTime": "1",
   "categoryName": "2",
   "data": "3",
   "level": "4",
   "context": "5",
   "pid": 7520
 },
 "2021-03-25T08:01:08.086Z",
 "default",
 [
   "6"
 ],
 {
   "level": 20000,
   "levelStr": "7",
   "colour": "8"
 },
 {},
 "Hello, log4js! \n",
 "INFO",
 "green"
]
[
 {
   "startTime": "1",
   "categoryName": "2",
   "data": "3",
   "level": "4",
   "context": "5",
   "pid": 7520
 },
 "2021-03-25T08:01:08.096Z",
 "default",
 [
   "6"
 ],
 {
   "level": 30000,
   "levelStr": "7",
   "colour": "8"
 },
 {},
 "Test warning! This is not a drill! \n",
 "WARN",
 "yellow"
]
[
 {
   "startTime": "1",
   "categoryName": "2",
   "data": "3",
   "level": "4",
   "context": "5",
   "pid": 7520
 },
 "2021-03-25T08:01:08.229Z",
 "default",
 [
   "6"
 ],
 {
   "level": 20000,
   "levelStr": "7",
   "colour": "8"
 },
 {},
 "Connected to mongo",
 "INFO",
 "green"
]
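
If the pretty-printed arrays are more than you need, jq can also condense each record to one readable line. A minimal sketch, assuming the array layout seen in the sample above (timestamp at index 1, message at index 6, level name at index 7); the function name log4js_summary is hypothetical, and real log4js output may lay its fields out differently:

```shell
# Print "timestamp [LEVEL] message" for each log4js record on stdin.
# Assumption: index positions are taken from the sample data, not from
# any log4js specification.
log4js_summary() {
    # First split the stream into one JSON array per line (regex RS, as above),
    # then let jq pick the fields and trim the trailing " \n" from the message.
    awk -v RS='__LOG4JS__|\r?\n' 'length($0)' |
        jq -r '"\(.[1]) [\(.[7])] \(.[6] | rtrimstr("\n") | rtrimstr(" "))"'
}

# Usage: log4js_summary < file
```

Every record has to be valid JSON for this to work, so stray text such as the /\\n/ in the question's sample would make jq stop with a parse error.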

Source: https://unix.stackexchange.com/questions/641013