Bash

Using a regular expression in awk to print lines whose first field has only four characters?

  • November 24, 2012
John Goldenrod:(916) 348-4278:250:100:175

Chet Main:(510) 548-5258:50:95:135

Tom Savage:(408) 926-3456:250:168:200

Elizabeth Stachelin:(916) 440-1763:175:75:300

The output should contain the lines whose names have only four characters (John, Chet):

awk '$1 ~ /[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]" "/ {print}' file

This doesn't seem to work for me. Can I do this without using any awk functions?

Fields in awk are separated by whitespace by default (the default `FS` is `" "`). This means `$1` doesn't contain a space, so the correct regex for `$1` is:

```
awk '$1 ~ /^[a-zA-Z0-9]{4}$/ {print}' file
```
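To try this on the sample data from the question, here is a small sketch. The variable names `data` and `result` are mine, and the bracket expression is written out four times instead of using `{4}` because some awk implementations (older mawk, or gawk before 4.0 without `--re-interval`) do not handle interval expressions.

```shell
# Sample data copied from the question.
data='John Goldenrod:(916) 348-4278:250:100:175
Chet Main:(510) 548-5258:50:95:135
Tom Savage:(408) 926-3456:250:168:200
Elizabeth Stachelin:(916) 440-1763:175:75:300'

# $1 is the first whitespace-separated field, so it is anchored on both
# sides: exactly four alphanumeric characters and nothing else.
result=$(printf '%s\n' "$data" |
  awk '$1 ~ /^[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]$/')

printf '%s\n' "$result"
```

With this input, only the `John` and `Chet` lines should be printed.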

If you want to keep your original approach, you can also just match against `$0` instead, i.e.:



```
awk '$0 ~ /^[a-zA-Z0-9]{4}\s/ {print}' file

```

To simplify things you can also use `\w` instead of explicitly listing the word characters (note that `\w` also matches `_`), i.e.:



```
awk '$0 ~ /^\w{4}\s/ {print}' file

```

If you only want to match a space and not something else like `TAB`, you just have to replace `\s` with a literal space.
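`\w` and `\s` are GNU awk extensions, so on other awks a portable spelling may be needed. The sketch below assumes an awk that understands POSIX bracket classes, and it uses a made-up TAB-separated line (`Anna<TAB>Lee`, not part of the question's data) to show how `[[:space:]]` and a literal space differ.

```shell
# One line from the question (space after the name) and one hypothetical
# line where the name is followed by a TAB.
sample=$(printf 'Chet Main:(510) 548-5258:50:95:135\nAnna\tLee:(000) 000-0000:1:2:3\n')

# [[:alnum:]_] stands in for \w and [[:space:]] for \s.
any_space=$(printf '%s\n' "$sample" |
  awk '/^[[:alnum:]_][[:alnum:]_][[:alnum:]_][[:alnum:]_][[:space:]]/')

# A literal space (written as [ ] for visibility) rejects the TAB line.
only_space=$(printf '%s\n' "$sample" |
  awk '/^[[:alnum:]_][[:alnum:]_][[:alnum:]_][[:alnum:]_][ ]/')

printf 'any whitespace:\n%s\n' "$any_space"
printf 'space only:\n%s\n' "$only_space"
```

The first pattern should keep both lines, the second only the `Chet` line.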

Another issue with your original approach is the missing anchors. Since you specified neither `^` nor `$`, your pattern can match anywhere in the string, i.e. it would match `Elizabeth Stachelin` via `beth`.
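The effect of the anchor can be checked with two lines from the question. This is only an illustrative sketch; the variable names are mine, and the pattern is applied to the whole line (`$0`) with a trailing literal space.

```shell
data='John Goldenrod:(916) 348-4278:250:100:175
Elizabeth Stachelin:(916) 440-1763:175:75:300'

# Unanchored: "beth " inside "Elizabeth " satisfies the pattern as well.
unanchored=$(printf '%s\n' "$data" |
  awk '/[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9] /')

# Anchored with ^: the four characters must start the line.
anchored=$(printf '%s\n' "$data" |
  awk '/^[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9] /')

printf 'unanchored:\n%s\n' "$unanchored"
printf 'anchored:\n%s\n' "$anchored"
```

The unanchored version should print both lines, while the anchored one should print only the `John` line.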

Source: https://unix.stackexchange.com/questions/56530