
How do I display Polish diacritics correctly in groff?

  • January 19, 2020

I'm playing with groff, and I want to generate a PDF from the following `test.ms`:

.TL
Tytuł
.AU
Imię Nazwisko
.NH
Wstęp
.PP
Pierwszy paragraf. Jakieś informacje, żeby były polskie znaki.
.PP
Drugi paragraf. Reszta znaków:

ąęćłńśóżźĄĘĆŁŃŚÓŻŹ
.NH
Bla bla bla
.PP
safsdsdfsasdds

As you can see, it contains Polish diacritics. After compiling it with `groff -ms test.ms -T pdf > test.pdf`, we get this mess: horrible!

My first guess was to compile again with UTF-8 support.

$ groff -Kutf8 -ms test.ms -T pdf > test.pdf
test.ms:4: warning: can't find special character `u0065_0328'
test.ms:8: warning: can't find special character `u0073_0301'
test.ms:8: warning: can't find special character `u00A0'
test.ms:8: warning: can't find special character `u007A_0307'
test.ms:12: warning: can't find special character `u0061_0328'
test.ms:12: warning: can't find special character `u006E_0301'
test.ms:12: warning: can't find special character `u007A_0301'
test.ms:12: warning: can't find special character `u0041_0328'
test.ms:12: warning: can't find special character `u0045_0328'
test.ms:12: warning: can't find special character `u004E_0301'
test.ms:12: warning: can't find special character `u0053_0301'
test.ms:12: warning: can't find special character `u005A_0307'
test.ms:12: warning: can't find special character `u005A_0301'

Groff simply dropped most of the characters, and the PDF looks like this:

Still horrible.

After some googling, I found this:

groff -Kutf8 -Tdvi -mec -ms test.ms > test.dvi
dvipdfm -cz 9 test.dvi

And yes, it still fails (better, though: only one character is skipped):

$ groff -Kutf8 -Tdvi -mec -ms test.ms > test.dvi
test.ms:8: warning: can't find special character `u00A0'

How can I make this work?

**Edit:** this is the output of `locale`:

LANG=pl_PL.UTF-8
LANGUAGE=
LC_CTYPE="pl_PL.UTF-8"
LC_NUMERIC="pl_PL.UTF-8"
LC_TIME="pl_PL.UTF-8"
LC_COLLATE="pl_PL.UTF-8"
LC_MONETARY="pl_PL.UTF-8"
LC_MESSAGES="pl_PL.UTF-8"
LC_PAPER="pl_PL.UTF-8"
LC_NAME="pl_PL.UTF-8"
LC_ADDRESS="pl_PL.UTF-8"
LC_TELEPHONE="pl_PL.UTF-8"
LC_MEASUREMENT="pl_PL.UTF-8"
LC_IDENTIFICATION="pl_PL.UTF-8"
LC_ALL=

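Character A0 is a non-breaking space. It appears to sit between "Jakieś" and "informacje". Use your editor to replace it with a normal space and you should be good to go.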
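If hunting for the character by eye is awkward, a short script can locate it. The following is only a minimal sketch: the helper names `find_nbsp` and `replace_nbsp` are made up for illustration, and the sample string assumes the non-breaking space really does sit between "Jakieś" and "informacje" on line 8 of `test.ms`, as the warning suggests.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE, the `u00A0` named in the groff warning


def find_nbsp(text):
    """Return 1-based (line, column) pairs for every NBSP in text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch == NBSP:
                hits.append((line_no, col_no))
    return hits


def replace_nbsp(text):
    """Replace every NBSP with an ordinary space (U+0020)."""
    return text.replace(NBSP, " ")


# Assumed sample: two lines from test.ms with an NBSP after "Jakieś".
sample = (
    ".PP\n"
    "Pierwszy paragraf. Jakie\u015b\u00a0informacje, "
    "\u017ceby by\u0142y polskie znaki.\n"
)

print(find_nbsp(sample))     # positions of the offending characters
cleaned = replace_nbsp(sample)
print(NBSP in cleaned)       # whether any NBSP survived the replacement
```

To apply this to a real file, read it with `open("test.ms", encoding="utf-8")`, pass the contents through `replace_nbsp`, and write the result back.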

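A suggestion: I have set up my editors (emacs, vim) to highlight non-breaking spaces, because I sometimes type one by accident when I press `space` right after a character that needs `AltGr`, which ends up as `AltGr`+`space`.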

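The warnings after your first guess seem to indicate that some characters (ę, ś, ż…) are encoded as a base letter plus a combining diacritic rather than as a single precomposed character. For example, ę == e (hex 65) + combining ogonek (hex 328) rather than "e with ogonek" (hex 119). How do you edit your source file? You can use the Compose key to produce "standalone" letters with diacritics, e.g. Compose, e, `,` gives "ę".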
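To make the difference between the two encodings concrete, here is a small sketch using Python's standard `unicodedata` module. The helpers `decode_groff_name` and `describe` are hypothetical names introduced for this example; the only input taken from the question is the name `u0065_0328` from the warning. Whether the source file itself contains decomposed sequences is still an assumption that would need to be checked against the actual file.

```python
import unicodedata


def decode_groff_name(name):
    """Turn a name like 'u0065_0328' into the code point sequence it denotes."""
    if name.startswith("u"):
        name = name[1:]
    return "".join(chr(int(part, 16)) for part in name.split("_"))


def describe(text):
    """List each code point in text with its hex value and Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# The sequence named in the warning: base letter + combining mark.
decomposed = decode_groff_name("u0065_0328")

# NFC normalization composes it into a single precomposed code point.
composed = unicodedata.normalize("NFC", decomposed)

print(describe(decomposed))
print(describe(composed))
print(len(decomposed), len(composed))
```

The same `unicodedata.normalize("NFC", text)` call can be run over the whole contents of a file to compare it with the original and see whether any decomposed sequences were present.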

Source: https://unix.stackexchange.com/questions/486616