
The use case: I'm developing a web app to help students learn to read. The student is recorded while reading a text in the web app. The audio is sent to the backend in 200 ms segments and analysed before the student finishes reading, so that feedback can be given live during the reading. The server sends feedback after analysing each segment.

On the web app the code looks like this:

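navigator.mediaDevices.getUserMedia({ audio: true, video: false })
.then(stream => {
    const mediaRecorder = new MediaRecorder(stream)

    // register the handler before starting so no chunk is missed
    mediaRecorder.ondataavailable = event => {
        // event.data is already a Blob, so it can be sent as-is
        socket.emit('my_event', event.data)
    }

    // emit a chunk roughly every 200 ms
    mediaRecorder.start(200)
})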

On Chrome, the media type produced is WebM. I'm wondering how to handle the data on the backend so that I can analyse the audio with numpy before the recording ends.

So far I haven't found a better way than something like this:

import tempfile

from pydub import AudioSegment

def blobToSignal(blob, is_first_sequence):
    # later chunks arrive without the WebM header, so it is prepended to them
    webm_header = b'\x1aE...'
    data = blob if is_first_sequence else webm_header + blob
    with tempfile.NamedTemporaryFile() as fp:
        fp.write(data)
        fp.flush()  # make sure the bytes are on disk before the file is reopened by name
        samples = AudioSegment.from_file(fp.name, 'webm').get_array_of_samples()
    return samples  # an array.array, which is almost a numpy array (analysable)
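
A rough sketch of one alternative to prepending the header by hand: keep every chunk received so far, decode the whole buffer each time, and return only the samples that have not been seen yet. The class name `ChunkedDecoder` and the `decode` callable are hypothetical. `decode` stands for any function that turns container bytes into a sequence of samples (for example, a variant of `blobToSignal` that always treats its input as a complete stream), and whether a given decoder accepts a truncated WebM stream is an assumption that would need checking.

```python
class ChunkedDecoder:
    """Accumulate container chunks and return only newly decoded samples."""

    def __init__(self, decode):
        self._decode = decode       # hypothetical: bytes -> sequence of samples
        self._buffer = bytearray()  # every byte received so far
        self._consumed = 0          # number of samples already returned

    def feed(self, chunk):
        # Append the new chunk, decode the whole stream again,
        # and slice off the part that was returned on earlier calls.
        self._buffer.extend(chunk)
        samples = self._decode(bytes(self._buffer))
        new_samples = samples[self._consumed:]
        self._consumed = len(samples)
        return new_samples


# Stand-in decoder for illustration only: one "sample" per byte.
decoder = ChunkedDecoder(lambda data: list(data))
first = decoder.feed(b'\x01\x02')
second = decoder.feed(b'\x03')
```

The trade-off is that the whole stream is decoded again for every chunk, so the cost grows with the length of the recording; for short readings that may be acceptable.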

I tried changing the front end to send a Float32Array instead of WebM:

navigator.mediaDevices.getUserMedia({ audio: true, video: false })
.then(stream => {
    const audio_context = new AudioContext()
    const audioInput = audio_context.createMediaStreamSource(stream)
    // the buffer size must be a power of two between 256 and 16384
    const recorder = audio_context.createScriptProcessor(8192, 1, 1)
    recorder.onaudioprocess = event => {
        socket.emit(
            'my_event',
            Array.from(event.inputBuffer.getChannelData(0))
        )
    }
    audioInput.connect(recorder)
    recorder.connect(audio_context.destination)
})

That way the backend can use the raw signal, but this method requires too much bandwidth (~1 Mb/s).
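
To put rough numbers on the bandwidth, here is a minimal sketch of the arithmetic. It assumes a 44.1 kHz mono capture (the real rate depends on the device) and compares it with a hypothetical alternative in which the front end would downsample to 16 kHz and send 16-bit integers instead of floats. The helper names `pcm_bitrate` and `int16_bytes_to_floats` are made up for illustration, and the second one shows how such a payload could be decoded on the backend using only the standard library.

```python
import array
import sys


def pcm_bitrate(sample_rate_hz, bytes_per_sample, channels=1):
    """Bitrate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bytes_per_sample * channels * 8


def int16_bytes_to_floats(payload):
    """Decode little-endian 16-bit PCM bytes into floats in [-1.0, 1.0)."""
    samples = array.array('h')  # signed 16-bit integers
    samples.frombytes(payload)
    if sys.byteorder == 'big':
        samples.byteswap()      # payload is assumed to be little-endian
    return [s / 32768.0 for s in samples]


float32_rate = pcm_bitrate(44100, 4)  # Float32 samples at an assumed 44.1 kHz
int16_rate = pcm_bitrate(16000, 2)    # hypothetical 16 kHz, 16-bit alternative
```

With these assumptions, binary Float32 at 44.1 kHz comes to about 1.4 Mb/s, while 16 kHz 16-bit audio would be 256 kb/s. If numpy is available, `numpy.frombuffer(payload, dtype='<i2')` gives the same integer samples directly as an array.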

So my questions are:

  • am I doing something wrong here?
  • is there a Python library to decode WebM coming from a buffer? (or something similar? I'm not very familiar with Python...)
  • how would you handle this use case?

Thanks for your help!

  • Just how "real time" is this? Does it start analyzing before the student has finished recording? How much are you doing client versus server side? – Acccumulation Jan 17 '18 at 22:57
  • Thanks @Acccumulation, I edited the question to be clearer. As for client versus server, I don't quite understand the question. What I can say is that the full analysis can't be done on the front end. However, for now I'm ready to do all the preprocessing on the front end (encoding in another format, etc.). – Nicolas Girault Jan 18 '18 at 09:07
