Grep

pcregrep: find lines with surrounding whitespace

  • February 10, 2019

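I have some Markdown headings starting with #, and I want them to follow these rules:

  • Titles (#) should have exactly two blank lines above them and one below.
  • Subtitles (##, ### and so on) should have exactly one blank line above and one below.
  • Titles take precedence over subtitles. (If two rules conflict, use the title formatting and ignore the subtitle rule.)

**Note:** I am trying to find all headings that do NOT follow these three rules.

Below are some examples of good and bad headings: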

some text 
# Title     | BAD 

## Subtitle | Good (has two blank lines below, needed for the next main title)


# Title     | Good

## Subtitle | Bad
text  

# Title     | Bad

text
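
One way to make the rules concrete is a small line-based checker. The following is only a sketch and not part of the original post: `check_headings` is a hypothetical helper name, whitespace-only lines count as blank, and it assumes that any heading directly followed by a title may have two blank lines below it (the precedence rule). A heading on the first or last line of a file is reported, since it cannot have the required blank lines.

```shell
#!/bin/sh
# check_headings FILE (hypothetical helper):
# print "LINE: heading" for every heading that breaks the spacing rules.
check_headings() {
    awk '
    { line[NR] = $0 }
    END {
        for (i = 1; i <= NR; i++) {
            if (line[i] !~ /^#+ /) continue          # not a heading

            above = 0                                 # blank lines directly above
            for (j = i - 1; j >= 1 && line[j] ~ /^[ \t]*$/; j--) above++

            below = 0                                 # blank lines directly below
            for (k = i + 1; k <= NR && line[k] ~ /^[ \t]*$/; k++) below++

            # Assumption: a heading followed by a title may have two blank
            # lines below it, because the title rule takes precedence.
            want_below = (k <= NR && line[k] ~ /^# /) ? 2 : 1
            want_above = (line[i] ~ /^# /) ? 2 : 1

            if (above != want_above || below != want_below)
                print i ": " line[i]
        }
    }' "$1"
}

# Demo on a throwaway file that mirrors the example (made-up heading names).
sample=$(mktemp)
cat > "$sample" <<'EOF'
some text
# Title A

## Subtitle B


# Title C

## Subtitle D
text

# Title E

text
EOF

bad=$(check_headings "$sample")
printf '%s\n' "$bad"
rm -f "$sample"
```

On this sample it reports lines 2, 9 and 12, which correspond to the three headings marked bad in the example.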

After fiddling with regular expressions, I came up with these:

Main titles:

((?<=\n{4})|(?<=.\n{2})|(?<=.\n))(# .*)|(# .*)(?=(\n.|\n{3}(?!# )|\n{4}))

Subtitles:

'((?<=\n{3})|(?<=.\n))(##+.*)|(##+.*)(?=\n.|\n{3}(?!# )|\n{4}.)'

However, much to my confusion, they do not work with pcregrep. Here is the pcregrep command I tried to run (just for completeness):

$ pcregrep -rniM --include='.*\.md' \
    '((?<=\n{3})|(?<=.\n))(##+.*)|(##+.*)(?=\n.|\n{3}(?!# )|\n{4}.)' \
    ~/Programming/oppgaver/src/web

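It does not work when I search a single file either, and I have several other expressions that work fine.

Is something wrong with my regex, or with the way I am running it?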

This solution fixes all incorrect headings.

sed -r '
   :loop; N; $!b loop

   s/\n+(#[^\n]+)/\n\n\1/g

   s/(#[^\n]+)\n+/\1\n\n/g

   s/\n+(#[^\n#]+)/\n\n\n\1/g
' input.txt;

With comments:

sed -r '
   ### Read the whole file into the pattern space,
   # in other words, join all lines into one string separated by newlines
   :loop; N; $!b loop;

   ### First pass over the pattern space:
   # find every line starting with "#" (titles, subtitles, and so on),
   # take all the newlines above it
   # and collapse them into exactly one empty line
   s/\n+(#[^\n]+)/\n\n\1/g;

   ### Second pass over the pattern space:
   # match from "#" to the end of the line again, take all the newlines below it
   # and collapse them into exactly one empty line
   s/(#[^\n]+)\n+/\1\n\n/g;

   ### Third pass over the pattern space:
   # find lines starting with a single "#" (titles only),
   # take the newlines above (only two at this point,
   # because of the previous substitutions)
   # and turn them into three newlines, that is, two empty lines
   s/\n+(#[^\n#]+)/\n\n\n\1/g
' input.txt

Input

text
# Title
## SubTitle
### SubSubTitle
# Title
## SubTitle
text
### SubSubTitle
# Title
# Title
# Title
## SubTitle
### SubSubTitle

Output

text


# Title

## SubTitle

### SubSubTitle


# Title

## SubTitle

text

### SubSubTitle


# Title


# Title


# Title

## SubTitle

### SubSubTitle
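
Since the question asks to find non-conforming headings rather than rewrite them, the same substitutions can also serve as a check: a file whose normalised text differs from what is on disk contains at least one heading the script would move. The following is a minimal sketch, assuming GNU sed; `fix_headings` and `needs_fix` are hypothetical helper names, and the two demo files are made up for illustration. Note that any line beginning with # (for example a shell comment inside a code block) is treated as a heading.

```shell
#!/bin/sh
# fix_headings FILE: the three substitutions from the script above,
# written to stdout (GNU sed assumed for -r and for \n in the patterns).
fix_headings() {
    sed -r '
        :loop; N; $!b loop
        s/\n+(#[^\n]+)/\n\n\1/g
        s/(#[^\n]+)\n+/\1\n\n/g
        s/\n+(#[^\n#]+)/\n\n\n\1/g
    ' "$1"
}

# needs_fix FILE: succeeds when the normalised text differs from the file.
needs_fix() {
    ! fix_headings "$1" | cmp -s - "$1"
}

# Demo: one badly spaced file and its already normalised copy.
dir=$(mktemp -d)
printf 'text\n# Title\n## SubTitle\nmore\n' > "$dir/bad.md"
fix_headings "$dir/bad.md" > "$dir/good.md"

# Collect the names of the files that would be changed.
report=$(
    for f in "$dir"/*.md; do
        if needs_fix "$f"; then
            basename "$f"
        fi
    done
)
printf '%s\n' "$report"
rm -rf "$dir"
```

Only `bad.md` is listed, because running the substitutions on the already normalised copy leaves it unchanged. This check works per file and does not say which heading is wrong; comparing the two versions with `diff` would show the affected lines.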

Source: https://unix.stackexchange.com/questions/456456