csplit multiple files into multiple files

January 5, 2020

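Folks,

I'm a bit stumped on this one. I'm trying to write a bash script that uses csplit to take multiple input files and split them according to the same pattern. (For context: I have several TeX files containing questions, each introduced by the \question command. I want to extract every question into its own file.)

The code I have so far: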

#!/bin/bash
# This script uses csplit to run through an input TeX file (or list of TeX files) to separate out all the questions into their own files.
# This line is for the user to input the name of the file they need questions split from.

read -ep "Type the directory and/or name of the file needed to split. If there is more than one file, enter the files separated by a space. " files

read -ep "Type the directory where you would like to save the split files: " save

read -ep "What unit do these questions belong to?" unit

# This is a check for the user to confirm the file list, and proceed if true:

echo "The file(s) being split is/are $files. Please confirm that you wish to split this file, or cancel."
select ynf in "Yes" "No"; do
   case $ynf in 
       No ) exit;;
       Yes ) echo "The split files will be saved to $save. Please confirm that you wish to save the files here."
           select ynd in "Yes" "No"; do
           case $ynd in
               Yes )
#                   This line will create a loop to conduct the script over all the files in the list.
                   for i in ${files[@]}
                   do
#                   Mass re-naming is formatted to give "question###.tex" to enable processing a large number of questions quickly.
#                   csplit is the utility used here; run "man csplit" to learn more of its functionality.
#                   the structure is "csplit [name of file] [output options] [search filter] [separator(s)]".
#                   this script calls csplit, will accept the name of the file in the argument, searches the files for calls of "question", splits the file everywhere it finds a line with "question", and renames it according to the scheme [prefix]#[suffix] (the %03d in the suffix-format is what increments the numbering automatically).
#                   the '\\question' allows searching for \question, which eliminates the split for \end{questions}; eliminating the \begin{questions} split has not yet been understood.
                       csplit $i --prefix=$save'/'$unit'q' --suffix-format='%03d.tex' /'\\question'/ '{*}'
                   done; exit;;
               No ) exit;;
           esac
       done
   esac
done

return

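I can confirm that the loop does run over my input files as I expect. However, the behavior I see is that it splits the first file into "q1.tex q2.tex q3.tex" as expected, and when it moves on to the next file in the list, it splits the questions and overwrites the old files; the third file's pieces overwrite the second file's, and so on. What I want to happen is that if, say, File1 has 3 questions, it outputs: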

q1.tex
q2.tex
q3.tex

Then, if File2 has 4 questions, it continues incrementing to:

q4.tex
q5.tex
q6.tex
q7.tex

Is there a way for csplit to detect the numbering already used earlier in this loop and increment accordingly?

Thanks for any help you can offer!

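The csplit command keeps no saved context between runs (nor should it), so every invocation restarts its numbering from zero. csplit itself offers no way around this, but you could maintain your own count value and interpolate it into the prefix string.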
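A minimal sketch of the "maintain your own count" idea, assuming GNU csplit (for the long options) and mktemp are available. The function name split_numbered, its argument order, and the temporary "piece" prefix are illustrative assumptions, not something from the question. Each file is split into a scratch directory, and the pieces are then moved to their final names using a counter that survives across files. Any text before a file's first \question (a preamble, for instance) still comes out as its own piece and consumes a number.

```shell
#!/bin/bash
# split_numbered SAVE_DIR UNIT FILE...
# Hypothetical helper: the name and argument order are assumptions.
split_numbered() {
    local save=$1 unit=$2
    shift 2
    local count=0 tmpdir file piece
    tmpdir=$(mktemp -d) || return 1
    for file in "$@"; do
        # Split one input into scratch pieces; csplit restarts its own
        # numbering on every call, which is harmless here.
        csplit --quiet --elide-empty-files \
            --prefix="$tmpdir/piece" --suffix-format='%05d' \
            "$file" '/\\question/' '{*}' || { rm -rf "$tmpdir"; return 1; }
        # Move each piece to its final name using the running counter.
        for piece in "$tmpdir"/piece*; do
            [ -e "$piece" ] || continue   # glob matched nothing
            count=$((count + 1))
            mv -- "$piece" "$(printf '%s/%sq%03d.tex' "$save" "$unit" "$count")"
        done
    done
    rmdir "$tmpdir"
}
```

With the variables from the question, this could be called as split_numbered "$save" "$unit" "${files[@]}", provided files has been read as an array.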
Alternatively, try replacing
read -ep "Type the directory and/or name of the file needed to split. If there is more than one file, enter the files separated by a space. " files

...

for i in ${files[@]}
do
   csplit $i --prefix=$save'/'$unit'q' --suffix-format='%03d.tex' /'\\question'/ '{*}'
done
with
read -a files -ep 'Type the directory and/or name of the file needed to split. If there is more than one file, enter the files separated by a space. '

...

cat "${files[@]}" | csplit - --prefix="$save/${unit}q" --suffix-format='%03d.tex' '/\\question/' '{*}'
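This is one of the relatively rare cases where cat {file} | ... really is needed, as csplit accepts only a single file argument (or - for stdin).
I've changed your read operation to use an array variable, since that is what you were (correctly) trying to use in the for ... do csplit ... loop.
Whatever you finally decide to do, I strongly recommend double-quoting all your variables wherever you use them, and especially any further use of the array list, such as "${files[@]}".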
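To make the quoting advice concrete, here is a small sketch comparing the unquoted and quoted array expansions. The helper count_args and the two file names are made up for illustration; the first name deliberately contains a space.

```shell
#!/bin/bash
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

# Illustrative file names; the first one contains a space.
files=("unit 1.tex" "unit2.tex")

unquoted=$(count_args ${files[@]})    # word splitting breaks "unit 1.tex" in two
quoted=$(count_args "${files[@]}")    # each array element stays one argument

echo "unquoted: $unquoted, quoted: $quoted"   # prints "unquoted: 3, quoted: 2"
```

In the unquoted form, csplit or cat would be handed "unit" and "1.tex" as separate names, neither of which exists.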

Source: https://unix.stackexchange.com/questions/560139
