How to use sed to remove letters, digits, spaces, and hyphens from a string on Linux

  • May 4, 2021

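I am trying to write a script that monitors a web page and sends a Telegram notification whenever that page changes. I am using diff for this.

The script seems to work well, but some web pages embed a kind of random ID in the page content. This ID changes every time the page is downloaded, and I need to work around that for diff to be useful.

I need some way to remove or edit this randomly generated ID. In short, I need to edit the string that holds the ID, stripping out nearly all of its letters, spaces, hyphens, digits, and so on, and keep only the data without the ID.

For example, I only need to change the information inside the quotation marks "":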

<path d="M0 0h7v7h-7zM9 0h1v2h-1zM12 0h1v4h-2v-1h-1v-1h1v-1h1zM16 0h1v3h-1v-1h-1v-1h1zM18 0h4v1h-1v1h1v1h-2v-2h-1v1h-1zM23 0h1v1h-1zM26 0h7v7h-7zM1 1v5h5v-5zM22 1h1v1h-1zM27 1v5h5v-5zM2 2h3v3h-3zM8 2h1v1h1v1h1v1h1v1h-1v1h-1v-1h-1v1h1v1h-2zM14 2h1v1h-1zM23 2h1v2h1v3h-1v-2h-4v-1h3zM28 2h3v3h-3zM15 3h1v1h2v2h-1v-1h-2v2h-1v-1h-1v-2h2zM18 3h1v1h-1zM19 5h1v1h-1zM12 6h1v2h-2v-1h1zM16 6h1v2h1v-2h1v1h1v-1h1v1h1v1h-2v1h1v1h1v1h1v1h-3v1h-1v-1h-2v1h-1v-2h-2v1h-1v-4h2v1h-1v1h2zM22 6h1v1h-1zM23 7h1v1h-1zM0 8h1v1h1v-1h5v1h-3v1h3v1h-1v1h-1v-1h-2v-1h-1v1h-1v1h-1zM22 8h1v1h-1zM24 8h1v1h-1zM26 8h5v2h1v2h-2v1h3v1h-1v1h-1v1h-1v-2h-1v-1h-1v-3h1v1h1v-1h-1v-1h-1v1h-2zM9 9h1v1h-1zM23 9h1v1h-1zM32 9h1v1h-1zM8 10h1v1h-1zM18 10v1h2v-1zM10 11h1v1h-1zM25 11h1v1h-1zM3 12h2v1h-1v2h-2v-1h1zM6 12h3v1h-1v1h1v1h-1v1h-1v-1h-1v1h1v1h-1v1h-2v1h-1v-1h-3v-5h2v1h-1v2h1v1h1v-1h2v-2h2v-1h-1zM11 12h1v2h3v-1h1v1h1v1h-1v2h-1v-2h-1v1h-3zM14 12h1v1h-1zM17 13h2v1h-2zM22 13h6v1h-1v2h-1v1h-1v-1h-1v-2h-2zM20 14h2v1h1v1h-2v-1h-1zM9 15h1v1h-1zM28 15h1v2h-1v1h1v1h1v-1h-1v-1h2v1h1v1h1v3h-1v-1h-1v-1h-1v3h-1v-2h-1v-1h-2v-1h1v-1h-1v-1h1v-1h1zM10 16h1v1h-1zM17 16h1v1h-1zM32 16h1v2h-1zM8 17h2v1h-1v1h-1v1h2v3h-1v-1h-1v1h-2v1h2v1h-3v-1h-1v1h-1v-1h-1v-2h1v1h2v-1h2v-1h-1v-1h1v-1h-1v-1h2zM11 17h3v2h1v-1h1v1h1v1h-1v1h1v1h-2v-2h-3v-1h1v-1h-1v1h-1v1h-1v-2h1zM16 17h1v1h-1zM19 17h1v1h-1zM21 17h1v1h-1zM23 17h1v1h-1zM18 18h1v1h-1zM20 18h1v1h1v1h-1v1h-1v1h-1v-1h-1v-1h2zM22 18h1v1h-1zM24 18h2v1h-1v1h-1zM1 19h2v1h2v1h-3v-1h-1zM5 19h1v1h-1zM11 20h1v1h1v1h-1v1h-1zM23 20h1v1h4v2h-2v1h4v1h-1v2h1v-2h1v1h1v1h-1v1h-1v3h1v1h-1v1h-1v-1h-1v-1h1v-1h-1v-1h-4v-1h-1v-2h1v-4h-1v1h-1v-2h1zM0 21h2v1h-1v3h-1zM31 22h1v1h1v1h-3v-1h1zM10 23h1v1h-1zM13 23h1v1h-1zM16 23h1v1h-1zM21 23h1v1h-1zM9 24h1v1h1v-1h2v2h-1v1h-1v1h-1v-1h-1v-1h-1v-1h1zM14 24h1v2h-1zM17 24h1v3h2v-1h-1v-2h1v1h2v1h-1v1h1v1h-1v1h-1v1h-1v1h-3v2h-2v-1h1v-1h-4v-2h5v1h2v-1h1v-1h-2v1h-1v-2h-1v-1h1v-1h1zM22 24h1v1h-1zM25 25v3h3v-3zM32 25h1v1h-1zM0 26h7v7h-7zM26 26h1v1h-1zM1 27v5h5v-5zM8 
27h1v1h1v3h1v2h-1v-1h-1v-1h-1zM12 27h1v1h-1zM2 28h3v3h-3zM31 28h2v2h-2zM21 29h2v1h-2zM20 30h1v1h-1zM23 30h1v2h-1v1h-1v-1h-1v-1h2zM26 30h2v1h-2zM8 32h1v1h-1zM17 32h3v1h-3zM24 32h1v1h-1zM26 32h2v1h-2zM31 32h1v1h-1z"/>

The result I need:

<path d = ""/>

Or anything similar to these examples:

<path d="0"/>
<path d="CLEAN"/>
<path d=""/>
<path d=/>

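I believe sed can solve this, but the string is complex, with many characters, spaces, hyphens, digits, and so on, and I am struggling to find the right command.

An example of the script I am using: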

#!/bin/bash

fileold=/opt/pagename/latest_modifications/latest_modifications_old
filenew=/opt/pagename/latest_modifications/latest_modifications_new
log=/opt/pagename/listing/latest_modifications/latest_modifications.log
logold=/opt/pagename/oldfiles/latest_modifications/latest_modifications.log

# Archive the previous log under a timestamped name, then fetch the page.
mv "$log" "$logold-$(date +%d-%m-%Y_%H:%M:%S)"
wget https://www.pagename.com -O "$filenew"

# Record the differences between the previous and the current download.
diff "$fileold" "$filenew" >> "$log"

# Read the log only after diff has written it, so the message carries the new changes.
page_mofication="$(cat "$log")"
message=$'\n'"$page_mofication"
/etc/scripts/telegram-send.sh "$message"

# Keep the current download as the baseline for the next run.
cp "$filenew" "$fileold"
exit 0

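Any ideas on how to solve this?

Assuming you are using GNU sed, try blanking out the path d data in both the fileold and filenew files before comparing them. The c command below replaces every line containing <path d= with the literal text <path d=/>, and -i edits both files in place. You could do something along these lines: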

sed -i '
/<path d=/c\
<path d=/>
'  -- "$fileold" "$filenew";
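
To see what the c command does before pointing it at real downloads, here is a minimal sketch that runs the same sed program on a throwaway file. The sample markup and the variable names sample and cleaned are made up for illustration. The -i option is dropped so the result is printed instead of written back.

```shell
#!/bin/bash
# Hypothetical sample input: stable markup around one volatile <path> line.
sample=$(mktemp)
printf '%s\n' \
  '<svg viewBox="0 0 33 33">' \
  '<path d="M0 0h7v7h-7zM9 0h1v2h-1z"/>' \
  '</svg>' > "$sample"

# Same sed program as above, minus -i, so the result goes to stdout.
cleaned=$(sed '
/<path d=/c\
<path d=/>
' -- "$sample")

printf '%s\n' "$cleaned"
rm -f -- "$sample"
```

The <path> line should come out as <path d=/>, while the surrounding lines stay untouched.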

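Alternatively, if you need to make sure that the characters between the quotes are only alphanumerics, hyphens, and horizontal whitespace: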

sed -Ei  '
s|(<path d)="[\t a-zA-Z0-9-]+"/>|\1=/>|
' -- "$fileold" "$filenew";
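
If you would rather leave the downloaded files untouched, one option is to compare cleaned copies instead of editing in place. The following is only a sketch under that assumption: strip_path_data and same_page are hypothetical helper names, the sample pages are invented, and the substitution is the one shown above (GNU sed assumed).

```shell
#!/bin/bash
# strip_path_data (hypothetical helper): print FILE with the d="..." data blanked.
strip_path_data() {
    sed -E 's|(<path d)="[\t a-zA-Z0-9-]+"/>|\1=/>|' -- "$1"
}

# same_page (hypothetical helper): succeed when two files are identical
# once the path data has been blanked in cleaned temporary copies.
same_page() {
    a=$(mktemp)
    b=$(mktemp)
    strip_path_data "$1" > "$a"
    strip_path_data "$2" > "$b"
    diff -- "$a" "$b" > /dev/null
    rc=$?
    rm -f -- "$a" "$b"
    return "$rc"
}

# Two invented downloads that differ only in the volatile path data.
old=$(mktemp)
new=$(mktemp)
printf '%s\n' '<h1>Listing</h1>' '<path d="M0 0h7v7h-7z"/>' > "$old"
printf '%s\n' '<h1>Listing</h1>' '<path d="M9 0h1v2h-1z"/>' > "$new"
if same_page "$old" "$new"; then first=unchanged; else first=changed; fi

# Now the heading differs as well, which is a real change.
printf '%s\n' '<h1>Listing v2</h1>' '<path d="M9 0h1v2h-1z"/>' > "$new"
if same_page "$old" "$new"; then second=unchanged; else second=changed; fi

echo "$first $second"
rm -f -- "$old" "$new"
```

In the script from the question, the diff step would then compare the cleaned copies rather than $fileold and $filenew directly, so the stored downloads keep their original content.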

Source: https://unix.stackexchange.com/questions/647949