Fastest way to read/process large wav files (or any large file) in Python

Question

I'm working on a school project where I have to work with large wav files (> 250 MB). When I open such a file in Audacity, it takes about 40 seconds to be read and plotted, but when I read it in Python using scipy.io.wavfile.read, it takes forever.

So my question is: how does Audacity make it that fast, and is there something I can do in Python to get similar speed?

EDIT: I added a new section to my code that computes and plots the envelope of a wav file, but with a large wav file it takes an extremely long time.

Is there any way to read and process large wav files faster?

This is the code I'm using:

import matplotlib.pyplot as plt
import numpy as np
from scipy.io.wavfile import read
from tkinter import filedialog

# Browse for the file, read the signal and extract its information (fs, duration)
filename = filedialog.askopenfilename(
    filetypes=(("Template files", "*.wav"), ("All files", "*")))

fs, data = read(filename, mmap=True)

nsamples = len(data)      # number of samples (must be an int for np.linspace)
T = nsamples / fs         # duration in seconds
time = np.linspace(0, T, nsamples, endpoint=False)


# Compute the envelope of the signal
from scipy.signal import hilbert
analytic_signal = hilbert(data)
amplitude_envelope = np.abs(analytic_signal)
instantaneous_phase = np.unwrap(np.angle(analytic_signal))
instantaneous_frequency = (np.diff(instantaneous_phase) /(2.0*np.pi) * fs)


len_E = len(amplitude_envelope)
t2 = np.linspace(0,T,len_E)

# Plot the signal and its envelope
plt.figure()
plt.subplot(211)
plt.plot(time, data)

plt.subplot(212)
plt.plot(t2,amplitude_envelope)
plt.show()
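
Part of the cost in the script above is that `plt.plot(time, data)` asks matplotlib to draw every single sample. As far as I know, audio editors usually draw a per-block summary instead. The following is only a minimal sketch of that idea, assuming a 1-D (single channel) array such as the one returned by `scipy.io.wavfile.read`. The helper name `minmax_decimate` and the default of 2000 blocks are hypothetical choices for illustration, not something taken from Audacity or SciPy.

```python
import numpy as np

def minmax_decimate(samples, n_blocks=2000):
    """Reduce a 1-D signal to per-block minima and maxima for plotting.

    Hypothetical helper: returns (block_start_indices, mins, maxs).
    """
    samples = np.asarray(samples)
    if samples.ndim != 1:
        raise ValueError("expected a 1-D array; pick one channel first")
    # Number of samples per block (at least one).
    block = max(1, len(samples) // n_blocks)
    # Drop the ragged tail so the signal splits into equal blocks.
    usable = (len(samples) // block) * block
    blocks = samples[:usable].reshape(-1, block)
    starts = np.arange(blocks.shape[0]) * block
    return starts, blocks.min(axis=1), blocks.max(axis=1)
```

The result could then be drawn with something like `plt.fill_between(starts / fs, mins, maxs)`, which plots a few thousand points instead of hundreds of millions. The same reduction could be applied to `amplitude_envelope` before plotting it.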

Pass 1: Read the headers, skip the data chunks. Pass 2: Read necessary data chunks. Pass 3: Read remaining data chunks. See also http://soundfile.sapp.org/doc/WaveFormat/ Show your code for more specific comments. — bishop, Mar 15 '19 at 07:10
Thank you for the information. I added the code, but I have no idea how to implement what you described in your comment. Any help would be much appreciated. — , Mar 16 '19 at 07:09
You can try using one of the [other libraries that read wav files](https://stackoverflow.com/questions/2060628/reading-wav-files-in-python). Personally, I had found `librosa.load()` to be much faster than scipy, but when I just ran a test it was slower. — Boris, May 15 '19 at 23:54
I know nothing about the `wav` format. But if the format allows it, I would **read** the whole file into RAM, then **split** it and hand the parts over to multiple **processes** (I would use `multiprocessing.Pool` for that). Then it would be computed on multiple cores. — buhtz says get vaccinated, Jun 13 '19 at 08:29
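
The "read the headers first, then only the data you need" idea from the first comment can be sketched roughly as follows. This is an illustration only, assuming an uncompressed 16-bit PCM file that the standard-library `wave` module can open. The function names `read_wav_info` and `iter_wav_chunks` and the chunk size are hypothetical, not part of any library.

```python
import wave
import numpy as np

def read_wav_info(path):
    """Pass 1: read only the header fields, not the sample data."""
    with wave.open(path, "rb") as wf:
        return {
            "fs": wf.getframerate(),
            "channels": wf.getnchannels(),
            "sample_width": wf.getsampwidth(),  # bytes per sample
            "frames": wf.getnframes(),
        }

def iter_wav_chunks(path, frames_per_chunk=1_000_000):
    """Pass 2: yield the samples piece by piece.

    Each piece is an int16 array of shape (frames, channels).
    Assumption: 16-bit PCM; other sample widths are rejected.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wf.getnchannels()
        while True:
            raw = wf.readframes(frames_per_chunk)
            if not raw:
                break  # end of the data chunk
            # WAV stores samples little-endian, interleaved by channel.
            yield np.frombuffer(raw, dtype="<i2").reshape(-1, channels)
```

With this, the duration is available immediately as `info["frames"] / info["fs"]`, and each chunk can be processed (or reduced for plotting) without ever holding the whole file in memory. Independent chunks are also the kind of unit that could be handed to worker processes, as the last comment suggests.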
