Split audio into several parts based on timestamps from a text file with sox or ffmpeg
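I looked at the following link: Trim audio file using start and stop times

But it doesn't completely answer my question. My problem is this: I have an audio file such as abc.mp3 or abc.wav. I also have a text file containing start and end timestamps:

0.0 1.0 silence
1.0 5.0 music
6.0 8.0 speech

I want to split the audio into three parts using Python and sox/ffmpeg, producing three separate audio files. How can I do this with either sox or ffmpeg? Later I want to use librosa.

I have Python 2.7, ffmpeg, and sox installed on Ubuntu Linux 16.04.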
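I've only had a quick go at this, with very little testing, so take it as something that may help. The code below relies on ffmpeg-python, but it would not be hard to write with subprocess instead. For now, each line of the time input file is treated as a pair of times, start and end, followed by an output name. A missing name is replaced by the line count plus .wav (0.wav for the first line, since counting starts at zero).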
    import ffmpeg
    from sys import argv

    """ split_wav `audio file` `time listing`

        `audio file` is any file known by local FFmpeg
        `time listing` is a file containing multiple lines of format:
            `start time` `end time` output name

        times can be either MM:SS or S*
    """

    _in_file = argv[1]


    def make_time(elem):
        # allow user to enter times on CLI
        t = elem.split(':')
        try:
            # will fail if no ':' in time, otherwise add together for total seconds
            return int(t[0]) * 60 + float(t[1])
        except IndexError:
            return float(t[0])


    def collect_from_file():
        """user can save times in a file, with start and end time on a line"""
        time_pairs = []
        with open(argv[2]) as in_times:
            for l, line in enumerate(in_times):
                tp = line.split()
                tp[0] = make_time(tp[0])
                # store the duration (end minus start) rather than the end time
                tp[1] = make_time(tp[1]) - tp[0]
                # if no name given, append line count
                if len(tp) < 3:
                    tp.append(str(l) + '.wav')
                time_pairs.append(tp)
        return time_pairs


    def main():
        for i, tp in enumerate(collect_from_file()):
            # open a file, from `ss`, for duration `t`
            stream = ffmpeg.input(_in_file, ss=tp[0], t=tp[1])
            # output to named file
            stream = ffmpeg.output(stream, tp[2])
            # this was to make trial and error easier
            stream = ffmpeg.overwrite_output(stream)
            # and actually run
            ffmpeg.run(stream)


    if __name__ == '__main__':
        main()
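Since the answer mentions that a subprocess version would not be hard to write, here is a minimal, untested-against-real-audio sketch of what that might look like. It assumes the label file uses the `start end label` layout from the question, with times in plain seconds. The function names (`parse_labels`, `ffmpeg_cmd`, `sox_cmd`, `split_audio`) and the `NN_label.wav` output naming are made up for illustration and are not part of the original answer. It relies only on the `-y`, `-i`, `-ss` and `-t` options of the ffmpeg command line and on the sox `trim` effect, and it avoids f-strings so it should also run on Python 2.7.

```python
import subprocess


def parse_labels(text):
    """Parse lines of `start end [label]` into (start, end, label) tuples."""
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        start, end = float(parts[0]), float(parts[1])
        # "part" is an assumed fallback label, not something from the question
        label = parts[2] if len(parts) > 2 else "part"
        segments.append((start, end, label))
    return segments


def ffmpeg_cmd(src, start, end, dst):
    """Build an ffmpeg command that writes src[start:end] to dst."""
    # -y overwrites existing output, -ss is the start offset, -t the duration
    return ["ffmpeg", "-y", "-i", src,
            "-ss", str(start), "-t", str(end - start), dst]


def sox_cmd(src, start, end, dst):
    """Build the equivalent sox command using `trim <start> <duration>`."""
    return ["sox", src, dst, "trim", str(start), str(end - start)]


def split_audio(src, label_path, make_cmd=ffmpeg_cmd):
    """Run one external command per labelled segment; return output names."""
    with open(label_path) as handle:
        segments = parse_labels(handle.read())
    outputs = []
    for index, (start, end, label) in enumerate(segments):
        # hypothetical naming scheme: 00_silence.wav, 01_music.wav, ...
        dst = "{:02d}_{}.wav".format(index, label)
        subprocess.check_call(make_cmd(src, start, end, dst))
        outputs.append(dst)
    return outputs
```

With the three lines from the question saved in a file (say `labels.txt`, a placeholder name), `split_audio("abc.wav", "labels.txt")` would invoke ffmpeg three times, and passing `make_cmd=sox_cmd` would use sox instead. Keeping the command builders separate from the code that runs them makes it easy to print and inspect each command before executing anything.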