Bash

Extracting a value from a file keyed by multiple keys

  • January 26, 2019

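Consider a file with key=value pairs, where each key is optionally a concatenation of multiple keys. In other words, many keys can map to one value. The reason behind this is that each key is a relatively short word compared to the length of the value, so the data is "compressed" into fewer lines.

Illustration (i.e. not real values):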

$ cat testfile
AA,BB,CC=a-lengthy-value
A,B,C=a-very-long-value
D,E,F=another-very-long-value
K1,K2,K3=many-many-more
Z=more-long-value

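It is valid to assume that all keys are unique and do not contain the following characters:

  • key delimiter: ,
  • key-value delimiter: =
  • whitespace characters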

While keys may come in any form in the *future* (with the above constraints), they *currently* adhere to the following regex coincidentally: [[:upper:]]{2}[[:upper:]0-9]. Likewise, values will not contain =, so = can be safely used to split each line. There are no multi-line keys or values, so it is also safe to process line-by-line.
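The assumptions above can be checked mechanically before relying on them. The following is only a sketch: the function name check_keys and its messages are made up for illustration, and it treats "whitespace" as spaces and tabs only.

```shell
# Hypothetical helper: check_keys FILE
# Reports lines that break the stated assumptions and returns non-zero if any do.
check_keys() {
   awk -F'=' '
      # Exactly one "=" per line means exactly two fields.
      NF != 2 { printf "line %d: expected exactly one = sign\n", NR; bad = 1; next }
      {
         n = split($1, keys, ",")
         for (i = 1; i <= n; i++) {
            if (keys[i] == "" || keys[i] ~ /[ \t]/) {
               printf "line %d: empty key or whitespace in key\n", NR; bad = 1
            }
            if (keys[i] in seen) {
               printf "line %d: duplicate key %s (first seen on line %d)\n", NR, keys[i], seen[keys[i]]
               bad = 1
            } else {
               seen[keys[i]] = NR
            }
         }
      }
      END { exit bad + 0 }
   ' "$1"
}

# Example run against a temporary copy of the sample data.
sample=$(mktemp)
cat > "$sample" <<'EOF'
AA,BB,CC=a-lengthy-value
A,B,C=a-very-long-value
D,E,F=another-very-long-value
K1,K2,K3=many-many-more
Z=more-long-value
EOF
check_keys "$sample" && echo "all assumptions hold"
```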

In order to facilitate data extraction from this file, a function getval() is defined as such:

getval() {
   sed -n "/^\([^,]*,\)*$1\(,[^=]*\)*=\(.*\)$/{s//\3/p;q}" testfile
}

As such, calling getval A will return the value a-very-long-value, not a-lengthy-value. It should also return nothing for a non-existent key.
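To make the intended behaviour concrete, here is a small session sketch. It recreates the sample file in a scratch directory (the directory is an assumption of this example) and calls the function exactly as defined above.

```shell
# Work in a scratch directory so the sample file does not clobber anything.
cd "$(mktemp -d)" || exit 1
cat > testfile <<'EOF'
AA,BB,CC=a-lengthy-value
A,B,C=a-very-long-value
D,E,F=another-very-long-value
K1,K2,K3=many-many-more
Z=more-long-value
EOF

# Same definition as in the question.
getval() {
   sed -n "/^\([^,]*,\)*$1\(,[^=]*\)*=\(.*\)$/{s//\3/p;q}" testfile
}

getval A      # a-very-long-value (not a-lengthy-value)
getval AA     # a-lengthy-value
getval K2     # many-many-more
getval Z      # more-long-value
getval nope   # prints nothing
```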

Questions:

  • Is the current definition of getval() robust enough?
  • Are there alternative ways of performing the data extraction that are possibly shorter/more expressive/more restrictive?

For what it’s worth, this script will run with Cygwin’s bash and the coreutils that come with it. Portability is therefore not required here (i.e. only brownie points will be given for it). Thanks!

edit:

Corrected function, added clarification about the keys.

edit 2:

Added clarification about the format (no multi-lines) and portability (not a requirement).

You can write it in a much more readable form using awk:

getval() {
   awk -F'=' '$1~/\<'"$1"'\>/{print $2}' testfile
}
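Note that \< and \> are GNU awk extensions, and they treat any non-word character as a boundary, so a future key such as K-1 would also be found when asking for K. A possibly more restrictive variant, offered only as a sketch, splits the key list on commas and compares each key as a plain string. The name getval_exact and the optional second argument (the file to read, defaulting to testfile) are assumptions of this example, not part of the original function.

```shell
# Hypothetical variant: getval_exact KEY [FILE]
getval_exact() {
   # -v passes the key as data, so regex metacharacters in it are not special.
   # (awk still interprets backslash escapes in -v values.)
   awk -F'=' -v key="$1" '
      {
         n = split($1, keys, ",")
         for (i = 1; i <= n; i++)
            # Appending "" forces a string comparison even for numeric-looking keys.
            if ((keys[i] "") == (key "")) { print $2; exit }
      }
   ' "${2:-testfile}"
}

# Example run against a temporary copy of the sample data.
sample=$(mktemp)
cat > "$sample" <<'EOF'
AA,BB,CC=a-lengthy-value
A,B,C=a-very-long-value
D,E,F=another-very-long-value
K1,K2,K3=many-many-more
Z=more-long-value
EOF

getval_exact A "$sample"     # a-very-long-value
getval_exact K2 "$sample"    # many-many-more
getval_exact '.' "$sample"   # prints nothing: the dot is compared literally
```

Because the values never contain =, $2 is always the complete value, and exit stops at the first match much like q does in the sed version.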

Source: https://unix.stackexchange.com/questions/178697