Json

How do I extract two data fields (1 scalar and 1 array) for each node from a very large (> 100,000 lines) JSON file?

  • October 11, 2020

I have a 139,000-line JSON file whose structure basically looks like this (it is an excerpt from OpenStreetMap):

{
 "type": "FeatureCollection",
 "generator": "overpass-ide",
 "features": [
   {
     "type": "Feature",
     "properties": {
       "@id": "relation/7859",
       "TMC:cid_58:tabcd_1:Class": "Area",
       "TMC:cid_58:tabcd_1:LCLversion": "9.00",
       "TMC:cid_58:tabcd_1:LocationCode": "4934",
       "leisure": "park",
       "name": "Platnersberg",
       "type": "multipolygon",
       "@geometry": "center"
     },
     "geometry": {
       "type": "Point",
       "coordinates": [
         11.128184,
         49.4706035
       ]
     },
     "id": "relation/7859"
   },
   {
     "type": "Feature",
     "properties": {
       "@id": "relation/62370",
       "TMC:cid_58:tabcd_1:Class": "Area",
       "TMC:cid_58:tabcd_1:LCLversion": "8.00",
       "TMC:cid_58:tabcd_1:LocationCode": "1157",
       "admin_level": "6",
       "boundary": "administrative",
       "de:place": "city",
       "name": "Eisenach",
       "type": "boundary",
       "@geometry": "center"
     },
     "geometry": {
       "type": "Point",
       "coordinates": [
         10.2836229,
         50.9916015
       ]
     },
     "id": "relation/62370"
   }
 ]
}

Now, I would like to get the name, TMC location code, and coordinates of each feature in this file, preferably as a CSV file:

location_code,name,latitude,longitude

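I know I could craft a regular expression that weeds out all the superfluous nodes, but it would be a rather complex one. I also have the jq tool installed on my OpenSuSE Leap 15.1 machine, but I am a newbie when it comes to using it.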

Any ideas on how to go about this extraction job?

I'm a newbie myself, but I think something like

$ jq -r '.features[] | select(.type == "Feature") | [.properties."TMC:cid_58:tabcd_1:LocationCode",.properties.name,.geometry.coordinates[]] | @csv' file.json
"4934","Platnersberg",11.128184,49.4706035
"1157","Eisenach",10.2836229,50.9916015

should do it. The select(.type == "Feature") filter may not be necessary; I'm not sure whether any other types are possible.
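One thing to watch: GeoJSON stores a point as [longitude, latitude], so expanding .geometry.coordinates[] yields the longitude first, while the requested header asks for latitude,longitude. Below is a minimal sketch of a variant that picks the two elements by index and also writes a header row. It assumes every feature is a Point, as in the excerpt; sample.json is a hypothetical, trimmed-down stand-in for the real file.

```shell
# Hypothetical two-feature sample with the same shape as the excerpt
# (properties trimmed to the two fields that are extracted).
cat > sample.json <<'EOF'
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"TMC:cid_58:tabcd_1:LocationCode": "4934", "name": "Platnersberg"},
     "geometry": {"type": "Point", "coordinates": [11.128184, 49.4706035]}},
    {"type": "Feature",
     "properties": {"TMC:cid_58:tabcd_1:LocationCode": "1157", "name": "Eisenach"},
     "geometry": {"type": "Point", "coordinates": [10.2836229, 50.9916015]}}
  ]
}
EOF

# The comma binds tighter than the pipe, so both the header array and each
# per-feature array are passed through @csv.
# coordinates[1] is the latitude and coordinates[0] the longitude (GeoJSON order).
out=$(jq -r '
  ["location_code", "name", "latitude", "longitude"],
  (.features[]
   | [ .properties."TMC:cid_58:tabcd_1:LocationCode",
       .properties.name,
       .geometry.coordinates[1],
       .geometry.coordinates[0] ])
  | @csv
' sample.json)

printf '%s\n' "$out"
```

This prints the header line followed by one row per feature, with the latitude column ahead of the longitude column. Note that @csv quotes every string, so the header fields come out quoted as well; redirect the output to a file (for example > out.csv) to save it.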

Source: https://unix.stackexchange.com/questions/613971