Using Bash to traverse nested directories and extract certain fields from YAML files

August 26, 2022

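I am learning bash. What I need is to traverse a directory (which contains other directories) and find all files named example.yaml.
These files contain several key-value pairs (example below):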
name: Andre
age: 13
address: street
weight: 78kgs
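What I need is a bash command that finds all example.yaml files in some directory (nested directories must be included) and then copies only the name and age into a new file. This new file needs to be created, and it should look like this: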
persons:
 - name: Andre
   age: 13
 - name: Joao
   age: 18
 ...
I tried doing something like this to solve it:
printf 'persons:\n' > output.yml
for i in $(find ./ -name "example.yaml");
do
name=$(yq '.name' $i)
age=$(yq '.age' $i)

# append $name and $age to output.yaml
done

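Note: This answer is long because there are at least two major variants of the utility called yq for parsing YAML data, with slightly different capabilities and expression syntax, and I cover both. I also look at simply using filename globbing to find all the files, and at using find (for when there are too many input files). Lastly, I address additional questions raised in the comments.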
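Don't iterate over the output of find. Instead, call your utility from find using -exec. I have a further example of this later in this answer. You are also missing quoting on some expansions.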
See also:
Why is looping over find's output bad practice?
Understanding the -exec option of find
When is double-quoting necessary?
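To illustrate why looping over the output of find is fragile, here is a minimal sketch. The temporary directory and the directory name "my dir" are made up for the demonstration, and a trivial sh -c script stands in for the real yq call.

```shell
#!/usr/bin/env bash
# Hypothetical demo tree: one example.yaml inside a directory whose name has a space.
demo=$(mktemp -d) || exit 1
mkdir -p "$demo/my dir"
touch "$demo/my dir/example.yaml"

# Fragile: the unquoted command substitution is split on whitespace,
# so the single pathname reaches the loop as two separate words.
loop_words=0
for i in $(find "$demo" -name example.yaml -type f); do
    loop_words=$((loop_words + 1))
done

# Robust: find hands the pathname to the command as one argument.
# The inline sh script just reports how many arguments it received.
exec_args=$(find "$demo" -name example.yaml -type f -exec sh -c 'echo "$#"' sh {} +)

printf 'loop saw %d words, -exec passed %d argument(s)\n' "$loop_words" "$exec_args"
rm -rf -- "$demo"
```

The loop counts more words than there are files because the pathname is split at the space, whereas find passes it along intact.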
Given one or more YAML files on the command line, the following yq command creates the summary YAML data file:
yq -y -s '{ persons: map({ name: .name, age: .age }) }' files
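The command reads all input into one big array (thanks to -s or --slurp), which is then passed to map(). The map() function extracts the name and age fields from each element of the array and adds them, as objects, to the persons array.
This uses Andrey Kislyuk's Python-based yq, from https://kislyuk.github.io/yq/, which is a wrapper around the general-purpose JSON processor jq. If you remove the -y option from the command, you get JSON output.
Using Mike Farah's Go-based yq instead: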
yq -N '[{ "name": .name, "age": .age }]' files | yq '{ "persons": . }'
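In the bash shell, you can apply this to all example.yaml files in the current directory or anywhere beneath it, creating the output file output.yaml in the current directory, like so: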
shopt -s globstar failglob

yq -y -s '{ persons: map({ name: .name, age: .age }) }' ./**/example.yaml >output.yaml
Or, using Mike Farah's yq:
shopt -s globstar failglob

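yq -N '[{ "name": .name, "age": .age }]' ./**/example.yaml | yq '{ "persons": . }' >output.yaml
This assumes that there are fewer than a few thousand example.yaml files; otherwise the pattern would expand to a command line that is too long to execute.
We first enable the globstar shell option so that we can use the ** filename globbing pattern, which matches across / in pathnames. We also enable the failglob shell option so that the whole command fails gracefully if no filenames match.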
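The effect of the two shell options can be seen without involving yq at all. The following is a small sketch that assumes bash 4.0 or later (where globstar is available); the temporary directory layout and the name missing.yaml are invented for the demonstration.

```shell
#!/usr/bin/env bash
shopt -s globstar failglob

# Hypothetical demo tree with example.yaml at three different depths.
demo=$(mktemp -d) || exit 1
mkdir -p "$demo/dir1/dir2"
touch "$demo/example.yaml" "$demo/dir1/example.yaml" "$demo/dir1/dir2/example.yaml"

# With globstar set, **/ matches zero or more directory levels,
# so the pattern finds the file at the top and in both subdirectories.
matches=( "$demo"/**/example.yaml )
match_count=${#matches[@]}
printf 'matched: %s\n' "${matches[@]}"

# With failglob set, a pattern that matches nothing is an error and the
# command is not run. The subshell confines the failure to this one command.
( echo "$demo"/**/missing.yaml ) 2>/dev/null
nomatch_status=$?
printf 'status for the non-matching pattern: %d\n' "$nomatch_status"

rm -rf -- "$demo"
```

Without failglob, the non-matching pattern would be passed to the command unexpanded, and yq would then complain about a file literally named with ** in it.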
Testing:
$ tree
.
├── dir1
│   └── example.yaml
├── example.yaml
├── script-andrey
└── script-mike

1 directory, 4 files
$ cat script-andrey
shopt -s globstar failglob
yq -y -s '{ persons: map({ name: .name, age: .age }) }' ./**/example.yaml >output.yaml
$ bash script-andrey
$ cat output.yaml
persons:
 - name: Joao
   age: 18
 - name: Andre
   age: 13
Testing Mike's yq too:
$ cat script-mike
shopt -s globstar failglob
yq -N '[{ "name": .name, "age": .age }]' ./**/example.yaml | yq '{ "persons": . }' >output.yaml
$ bash script-mike
$ cat output.yaml
persons:
 - name: Joao
   age: 18
 - name: Andre
   age: 13
If you have many thousands of these YAML input files, you may want to apply yq a bit more cleverly, using find.
This is with Andrey's yq:
find . -name example.yaml -type f \
   -exec yq -y -s 'map({ name: .name, age: .age })' {} + |
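yq -y '{ persons: . }' >output.yaml
This finds all regular files named example.yaml. They are passed, in batches, to yq, which extracts the name and age fields from each file and creates an array. A final yq command then collects the generated YAML arrays and places the result as the value of the persons key in the final output.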
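To see the batching that -exec ... {} + performs, the following sketch replaces yq with a tiny inline sh script that only reports how many pathnames it was given. The temporary directory tree is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical demo tree with three example.yaml files.
demo=$(mktemp -d) || exit 1
mkdir -p "$demo/dir1" "$demo/dir2"
touch "$demo/example.yaml" "$demo/dir1/example.yaml" "$demo/dir2/example.yaml"

# Each run of the inline script is one batch; "$#" is the number of
# pathnames that find placed in that batch. One line is printed per batch.
batch_sizes=$(find "$demo" -name example.yaml -type f -exec sh -c 'echo "$#"' sh {} +)

total_files=0
batch_count=0
for size in $batch_sizes; do
    total_files=$((total_files + size))
    batch_count=$((batch_count + 1))
done
printf '%d files in %d batch(es)\n' "$total_files" "$batch_count"

rm -rf -- "$demo"
```

With only a handful of files there will normally be a single batch. Several batches appear only when the list of pathnames would exceed the system's limit on command-line length, and that is the case the final yq step in the pipeline is there to handle.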
Likewise, for Mike's yq:
find . -name example.yaml -type f \
   -exec yq -N '[{ "name": .name, "age": .age }]' {} + |
yq '{ "persons": . }' >output.yaml
Testing with the same set of files as above:
$ rm output.yaml
$ find . -name example.yaml -type f -exec yq -y -s 'map({ name: .name, age: .age })' {} + | yq -y '{ persons: . }' >output.yaml
$ cat output.yaml
persons:
 - name: Andre
   age: 13
 - name: Joao
   age: 18
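(Running the command designed for Mike's yq produces identical output.)
Note that the order of the output depends on the order in which find finds the files.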
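If a reproducible order of input files is wanted, one option is to sort the pathnames before they reach the command. This is only a sketch of that idea, not part of the commands above: it assumes find -print0, sort -z and xargs -0 are available (they are common in GNU and BSD tools but not guaranteed everywhere), uses a made-up directory tree, and lets printf stand in for the yq invocation.

```shell
#!/usr/bin/env bash
# Hypothetical demo tree; directory b is created before directory a.
demo=$(mktemp -d) || exit 1
mkdir -p "$demo/b" "$demo/a"
touch "$demo/b/example.yaml" "$demo/a/example.yaml"

# Pathnames travel NUL-delimited, so unusual names survive intact.
# LC_ALL=C makes the sort order independent of the current locale.
# printf is a stand-in for the yq command that would read the files.
ordered=$(find "$demo" -name example.yaml -type f -print0 |
    LC_ALL=C sort -z |
    xargs -0 printf '%s\n')

first=$(printf '%s\n' "$ordered" | head -n 1)
printf 'first file processed: %s\n' "$first"

rm -rf -- "$demo"
```

This orders the input by pathname only; it says nothing about the values inside the files.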
If you want to sort the output file on, for example, the name field, the following sorts the file in place (note that I don't know how to do this with Mike Farah's Go-based yq):
yq -i -y '.persons |= sort_by(.name)' output.yaml
To sort in reverse order (in place):
yq -i -y '.persons |= (sort_by(.name) | reverse)' output.yaml
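In the comments, the user asks whether it is possible to append the data to an existing file. It is.
The commands below assume that the last thing in output.yaml is the end of the persons array (so that the command can add new array entries to it).
Using Andrey's yq: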
shopt -s globstar failglob
yq -y -s 'map({ name: .name, age: .age })' ./**/example.yaml >>output.yaml
Or, with find:
find . -name example.yaml -type f \
   -exec yq -y -s 'map({ name: .name, age: .age })' {} + >>output.yaml
Using Mike's yq:
shopt -s globstar failglob
yq -N '[{ "name": .name, "age": .age }]' ./**/example.yaml >>output.yaml
Or, using find:
find . -name example.yaml -type f \
   -exec yq -N '[{ "name": .name, "age": .age }]' {} + >>output.yaml

Source: https://unix.stackexchange.com/questions/714871
