
So I have written a very simple Python 3 program to take all of the .wav audio files in the working directory and convert them to .flac. Don't worry about any assumptions it makes (like the .wav files being valid, output filenames already existing, etc.). See below:

import os
from multiprocessing import Pool
from pydub import AudioSegment

def worker(filename):
  # decode the .wav file and re-encode it as .flac next to the original
  song = AudioSegment.from_wav(filename)
  song.export(filename.replace(".wav", ".flac"), format="flac")

if __name__ == '__main__':
  converted_count = 0
  convertlist = []
  # collect every .wav file in the working directory
  for filename in os.listdir(os.getcwd()):
    if filename.endswith(".wav"):
      convertlist.append(filename)
      converted_count += 1
  # one process per file, capped at the number of CPU cores
  p = Pool(processes=min(converted_count, os.cpu_count()))
  p.map(worker, convertlist)

I've timed this and, on my system, noticed a significant speedup compared to using no multiprocessing at all. However, this almost seems too simple; I'm not very experienced with multiprocessing (or multithreading), so I'm honestly not sure whether there's a faster, more efficient way to do this.

If you were tasked with writing this simple converter in Python, how would you do it? What would you do to make it more efficient than this?

jippyjoe4

2 Answers


The pydub package that you are using in worker() is already pretty efficient, given that it is backed by a low-level media library that is not written in Python: FFmpeg. That is, under the hood, pydub invokes the ffmpeg command through the subprocess module (see the pydub code) to perform the conversion, resulting in a command that looks like this:

ffmpeg -y -f wav -i input.wav -write_xing 0 -f flac output.flac 
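
For reference, here is a minimal sketch of making that call directly with the standard library's subprocess module, bypassing pydub (assuming ffmpeg is on your PATH; the function name is mine):

import subprocess

def convert_with_ffmpeg(wav_path, flac_path):
  # -y overwrites the output file if it already exists
  subprocess.run(["ffmpeg", "-y", "-f", "wav", "-i", wav_path, "-f", "flac", flac_path],
                 check=True,           # raise if ffmpeg exits with an error
                 capture_output=True)  # silence ffmpeg's console output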

Unfortunately, FFmpeg's FLAC encoder implementation does not seem to be parallelized, so with this specific encoder it is not possible to speed up the encoding of each individual file while keeping the same encoding quality. Assuming that you want to keep using pydub and its FFmpeg FLAC encoder, your approach of processing each file in a separate process sounds reasonable.

However, if you really wanted to improve performance at all costs, an alternative option would be to trade audio quality for conversion speed. For this, you could tune some of the encoding parameters, such as reducing the sampling frequency:

# export the song with a lower sampling frequency
# (song and flac_filename as in the question's worker())
low_quality_song = song.set_frame_rate(11025)
low_quality_song.export(flac_filename, format="flac")
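
If you would rather stay lossless, note that pydub's export() also accepts a parameters list of extra arguments that are passed straight through to FFmpeg. A sketch, assuming FFmpeg's native FLAC encoder, whose -compression_level option ranges from 0 (fastest) to 12 (smallest files):

# lossless but faster: lower the FLAC compression effort instead
song.export(flac_filename, format="flac",
            parameters=["-compression_level", "0"])

This trades larger output files for encoding speed instead of giving up audio quality.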

That being said, given that you are targeting a lossless format, switching to a more efficient (potentially GPU-based) encoder would most likely produce much better results while keeping the same audio quality.
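
As a concrete example of swapping encoders, here is a sketch that hands the work to the reference flac command-line encoder through subprocess (assuming the flac tool is installed and on your PATH; the function name is mine). Every flac preset is lossless; -0 merely trades output size for speed, and -f overwrites existing output:

import subprocess

def convert_with_flac_cli(wav_path, flac_path):
  # -0: fastest (still lossless) preset; -f: overwrite; -o: output path
  subprocess.run(["flac", "-0", "-f", "-o", flac_path, wav_path],
                 check=True, capture_output=True)

Whether this is actually faster than FFmpeg's encoder on your system is worth measuring rather than assuming.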

Pierre
    So if pydub is using multithreading, why would there be any speed up at all from using multiprocessing on top of it? – MichaelSB Aug 12 '18 at 02:38
  • Thanks for pointing this out. I did some additional research on the FLAC encoder used by FFmpeg and updated my answer accordingly. – Pierre Aug 13 '18 at 21:51

If you have lots of small files (e.g. audio samples for speech recognition), such that converting a single file takes a fraction of a second, you can reduce the scheduling overhead of the multiprocessing module by batching multiple files together, so that each process converts more than one file at a time: for example, group the items in convertlist into batches and add a loop over the batch in worker(), as sketched below.
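
A minimal sketch of that idea against the question's code (the batch size is arbitrary and worth tuning):

# worker() now takes a list of filenames and converts each one in turn
def worker(filenames):
  for filename in filenames:
    song = AudioSegment.from_wav(filename)
    song.export(filename.replace(".wav", ".flac"), format="flac")

# split convertlist into batches before handing it to the pool
batch_size = 16  # arbitrary
batches = [convertlist[i:i + batch_size]
           for i in range(0, len(convertlist), batch_size)]
p.map(worker, batches)

Note that Pool.map() also has a built-in chunksize argument (p.map(worker, convertlist, chunksize=16)) that batches task dispatch for you, which may be all the change you need.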

MichaelSB