Deleting characters in a specific context (with a shell script)

September 18, 2019

So, I have a file containing a list of names, such as:
Thomas Newbury
Calvin Lewis
E. J. Frederickson
Lamar Wojcik
J.C. Lily
Lillian Thomas
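Eventually I will try to split these into one long list of first names and last names, but before that I want to turn "E. J." into "E.J.", and I can't figure out how to do that with bash.
I know that "[A-Z]+. [A-Z]+." matches "E. J.", but I don't know which command lets me delete the space only in that context, between two dotted letters.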

I think this will do it with GNU sed:
sed -E 's/^([A-Z]+\.)[[:blank:]]([A-Z]+\.)/\1\2/' file
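To see what that command does, here is a small sketch run against a few of the names from the question. The function name `join_initials` is only illustrative, and `-E` assumes a sed with extended regular expression support (GNU sed has it).

```shell
# Hypothetical helper (the name is illustrative, not from the thread).
# Joins two dotted initials at the start of a line when exactly one
# blank separates them. Assumes sed understands -E.
join_initials() {
    sed -E 's/^([A-Z]+\.)[[:blank:]]([A-Z]+\.)/\1\2/'
}

printf '%s\n' 'Thomas Newbury' 'E. J. Frederickson' 'J.C. Lily' | join_initials
# Thomas Newbury
# E.J. Frederickson
# J.C. Lily
```

Note that this only touches the first pair of initials on a line and only when a single blank separates them; lines without leading initials pass through unchanged.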

I think sed is your best option. Here is my version:

sed -r ':a;s/^(.*\.)(\ )+(.\.)(.*)$/\1\3\4/;t a' file

-r -- use extended regular expressions
:a -- label "a" 
^(.*\.) -- 1st group matches any characters (".*") from the beginning of the line up to and including a literal "."
(\ )+ -- 2nd group matches a space (+ means one or more)
(.\.) -- 3rd group matches the next character followed by a literal "."
(.*)$ -- 4th group matches the rest of the line
/\1\3\4/ -- replaces the match with groups 1, 3 and 4, dropping the spaces in group 2
;t a -- if the previous substitution changed something, branch back to label "a"

This handles abbreviations of any length, for example: S. O. V. Sovereign
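As a sketch of how the loop behaves, the variant below puts the label, the substitution and the branch in separate `-e` expressions, which seds other than GNU sed generally need because they read a label up to the end of the line. The function name `collapse_initials` and the three-initial test line are my own, and the space is written without a backslash, so the groups are renumbered to 1, 2 and 3.

```shell
# Hypothetical helper (the name is illustrative, not from the thread).
# Repeatedly removes the spaces between a dot and the next
# "single character + dot" pair until no substitution succeeds.
collapse_initials() {
    sed -E -e ':a' -e 's/^(.*\.) +(.\.)(.*)$/\1\2\3/' -e 'ta'
}

printf '%s\n' 'E. J. Frederickson' 'S. O. V. Sovereign' 'Lamar Wojcik' | collapse_initials
# E.J. Frederickson
# S.O.V. Sovereign
# Lamar Wojcik
```

Each pass removes one gap, working from right to left because `.*` is greedy, so the three-initial line needs two passes before `ta` stops branching. Since the second group accepts any character before the dot, this will also join things that are not capital initials; tighten it to `[A-Z]\.` if that matters for your data.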

Quoted from: https://unix.stackexchange.com/questions/542489
