12

I'm working on an iPhone project that records audio from the device mic using AVAudioRecorder and then manipulates the recording.

To check that I'm reading the samples from the file correctly, I'm using Python's wave module to see whether it returns the same samples.

However, Python's wave module reports "fmt chunk and/or data chunk missing" when trying to open the wav file that is saved by AVAudioRecorder.

These are the settings I am using to record the file:

[audioSettings setObject:[NSNumber numberWithInt:kAudioFormatLinearPCM] forKey:AVFormatIDKey];
[audioSettings setObject:[NSNumber numberWithInt:16] forKey:AVLinearPCMBitDepthKey];
[audioSettings setObject:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsBigEndianKey];
[audioSettings setObject:[NSNumber numberWithFloat:4096] forKey:AVSampleRateKey];
[audioSettings setObject:[NSNumber numberWithInt:1] forKey:AVNumberOfChannelsKey];
[audioSettings setObject:[NSNumber numberWithBool:YES] forKey:AVLinearPCMIsNonInterleaved];
[audioSettings setObject:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsFloatKey]; 

After that, I just call recordForDuration to do the actual recording.

The recording succeeds: I can play the file, and I can read the samples using Audio File Services, but I can't validate them because Python's wave module can't open the file.

This is what the first 128 bytes of the file look like:

1215N:~/Downloads$ od -c --read-bytes 128 testFile.wav
0000000   R   I   F   F   x   H 001  \0   W   A   V   E   f   m   t    
0000020 020  \0  \0  \0 001  \0 001  \0   @ 037  \0  \0 200   >  \0  \0
0000040 002  \0 020  \0   F   L   L   R 314 017  \0  \0  \0  \0  \0  \0
0000060  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0000200

Any idea what I need to do to make sure a correct WAV header is written out by AVAudioRecorder?

ch3rryc0ke

2 Answers

37

Apple software often creates WAVE files with a non-standard (but "spec" conformant) "FLLR" subchunk after the "fmt " subchunk and before the "data" subchunk. I assume "FLLR" stands for "filler", and I assume the purpose of the subchunk is to enable some sort of data alignment optimization. The subchunk is usually about 4000 bytes long, but its actual length can vary depending on the length of the data preceding it.

Adding arbitrary subchunks to WAVE files is generally considered spec-conformant because WAVE is a subset of RIFF, and the common practice in RIFF file processing is to ignore chunks and subchunks which have an unrecognized identifier. The identifier "FLLR" is "non-standard" and so should be ignored by any software which encounters it.

There is a fair amount of software out there that treats the WAVE format much more rigidly than it ought to, and I suspect the library you're using may be one of those pieces of software. For example, I have seen software that assumes that the audio bytes always begin at offset 44 -- this is an incorrect assumption.

The correct way to locate the audio bytes in a WAVE file is to find the location and size of the "data" subchunk within the RIFF.

Reading WAVE files properly must really begin as an exercise in locating and identifying RIFF subchunks. RIFF subchunks have an 8-byte header: 4 bytes for an identifier/name field which is traditionally filled with human-readable ASCII characters (e.g. "fmt "), and a 4-byte little-endian unsigned integer specifying the number of bytes in the subchunk's data payload -- the subchunk's data payload follows immediately after its 8-byte header.
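As a minimal sketch of that 8-byte header layout, the Python snippet below decodes the subchunk headers visible in the question's `od` dump. The helper name `read_chunk_header` is made up for illustration, and the bytes are transcribed by hand from the dump, so treat it as a sanity check rather than a full parser.

```python
import struct

def read_chunk_header(buf, offset):
    """Return (chunk_id, payload_size, payload_offset) for the subchunk at `offset`."""
    # 4-byte identifier followed by a 4-byte little-endian unsigned payload size.
    chunk_id, size = struct.unpack_from("<4sI", buf, offset)
    return chunk_id, size, offset + 8

# First 44 bytes of testFile.wav, transcribed from the od dump in the question.
head = (
    b"RIFF" + bytes([0x78, 0x48, 0x01, 0x00]) + b"WAVE"
    + b"fmt " + bytes([0x10, 0x00, 0x00, 0x00])
    + bytes([0x01, 0x00, 0x01, 0x00, 0x40, 0x1F, 0x00, 0x00,
             0x80, 0x3E, 0x00, 0x00, 0x02, 0x00, 0x10, 0x00])
    + b"FLLR" + bytes([0xCC, 0x0F, 0x00, 0x00])
)

# Subchunks start at byte 12, after "RIFF", the RIFF size, and "WAVE".
fmt_id, fmt_size, fmt_payload = read_chunk_header(head, 12)
print(fmt_id, fmt_size, fmt_payload)      # b'fmt ' 16 20

# The next header begins right after the previous payload.
fllr_id, fllr_size, fllr_payload = read_chunk_header(head, fmt_payload + fmt_size)
print(fllr_id, fllr_size, fllr_payload)   # b'FLLR' 4044 44
```

With a 4044-byte "FLLR" payload starting at byte 44, the next subchunk header in that file would start at byte 4088, which puts its payload at byte 4096.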

The WAVE file format reserves certain subchunk identifiers (or "names") as being meaningful to the WAVE format. There are a minimum of two subchunks that must always appear in every WAVE file:

  1. "fmt " - the subchunk with this identifier has a payload which describes the basic information about the audio's format: sample rate, bit depth, etc.
  2. "data" - the subchunk with this identifier has the actual audio bytes in its payload
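For a rough illustration of what the "fmt " payload holds, here is a Python sketch that unpacks the 16-byte PCM layout from the question's dump. The `FmtInfo` tuple and `parse_fmt_payload` are illustrative names, and the sketch assumes the plain 16-byte PCM form of the payload (other formats append extra fields).

```python
import struct
from collections import namedtuple

# Field names are illustrative; the order follows the 16-byte PCM "fmt " payload.
FmtInfo = namedtuple(
    "FmtInfo",
    "audio_format channels sample_rate byte_rate block_align bits_per_sample",
)

def parse_fmt_payload(payload):
    """Decode the first 16 bytes of a "fmt " payload (all fields little-endian)."""
    return FmtInfo(*struct.unpack_from("<HHIIHH", payload, 0))

# The 16 payload bytes that follow the "fmt " header in the question's od dump.
payload = bytes([0x01, 0x00, 0x01, 0x00, 0x40, 0x1F, 0x00, 0x00,
                 0x80, 0x3E, 0x00, 0x00, 0x02, 0x00, 0x10, 0x00])

info = parse_fmt_payload(payload)
print(info.audio_format, info.channels, info.sample_rate, info.bits_per_sample)
# 1 1 8000 16
```

Decoded this way, the header in the question describes 16-bit mono PCM at 8000 Hz rather than the 4096 Hz that was requested.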

"fact" is the next most common subchunk identifier. It is usually found in WAVE files that use a compressed codec, such as μ-law. See this enthusiast webpage for more information about some of the various subchunk identifiers in use today in the wild, and information about their payload structure.

From a purely RIFF perspective, subchunks need not appear in any particular order in the file, or at any particular fixed offset. In practice however, almost all software expects the "fmt " subchunk to be the first subchunk. This is a concession to practicality: it is convenient to know early in the data stream what format of audio the WAVE contains -- this makes it easier to play a wave file from a network stream, for example. If the WAVE file uses a compressed format, such as μ-law, it is usually assumed that the "fact" subchunk will appear directly after "fmt ".

After the format-specifying chunks are out of the way, assumptions about the location, ordering, and naming of subchunks should be abandoned. At this point, the software should locate expected subchunks by name only (e.g. "data"). If subchunks are encountered that have unrecognized names (e.g. "FLLR"), those subchunks should simply be skipped over and ignored. Skipping a subchunk requires reading its length so that you can skip over the correct number of bytes.
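The skip-and-search approach can be sketched in Python roughly as follows. `iter_chunks`, `find_chunk`, and `make_chunk` are hypothetical helper names, and the file is a synthetic one built in memory to mimic the fmt/FLLR/data layout from the question. The one-byte padding of odd-sized payloads is standard RIFF behavior.

```python
import io
import struct

def iter_chunks(stream):
    """Yield (chunk_id, payload_offset, payload_size) for each subchunk in a WAVE stream."""
    riff, _riff_size, form = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # ran out of subchunks
        chunk_id, size = struct.unpack("<4sI", header)
        payload_offset = stream.tell()
        yield chunk_id, payload_offset, size
        # Skip the payload; RIFF pads odd-sized payloads with one extra byte.
        stream.seek(payload_offset + size + (size & 1))

def find_chunk(stream, wanted):
    """Return (payload_offset, payload_size) of the first subchunk named `wanted`, or None."""
    for chunk_id, offset, size in iter_chunks(stream):
        if chunk_id == wanted:
            return offset, size
    return None

def make_chunk(chunk_id, payload):
    """Build one subchunk: id, little-endian size, payload, optional pad byte."""
    pad = b"\0" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad

# Synthetic file: fmt, a 4044-byte zero-filled FLLR, then four 16-bit samples.
fmt_payload = struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16)
audio = struct.pack("<4h", 0, 1000, -1000, 0)
body = (b"WAVE" + make_chunk(b"fmt ", fmt_payload)
        + make_chunk(b"FLLR", bytes(4044)) + make_chunk(b"data", audio))
wav = b"RIFF" + struct.pack("<I", len(body)) + body

print(find_chunk(io.BytesIO(wav), b"data"))  # (4096, 8)
```

Because the loop only ever compares identifiers and trusts the size field, it does not care whether "FLLR", "JUNK", or anything else sits between "fmt " and "data".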

What Apple has done with the "FLLR" subchunk is slightly unusual, and I'm not surprised that some software is tripped up by it. I suspect that the library you are using is simply unprepared to deal with the presence of the "FLLR" subchunk. I would consider this a defect in the library. The mistake the library authors have made is probably something like:

  1. They may be expecting the "data" subchunk to appear within the first N bytes of the beginning of the file, where N is something less than ~4kB. They may give up looking if they have to scan too far into the file. The Apple "FLLR" subchunk pushes the "data" subchunk to a position >~4kB into the file.

  2. They may be expecting the "data" subchunk to have a specific ordinal subchunk position or byte offset within the RIFF. Perhaps they expect "data" to appear immediately after "fmt ". This is an incorrect way to process a RIFF file, though. The ordinal position and/or offset position of the "data" subchunk should not be assumed.

As long as we're talking about correct WAVE file processing, I might as well remind everyone that the audio bytes (the data subchunk's payload) may not run exactly to the end of the file. It is allowable to insert subchunks after the data payload. Some programs use this to store a textual "comment" field at the end of the file. If you read blindly from the start of the data payload until the EOF, you may pull in some metadata subchunks as audio, which will sound like a "click" at the end of playback. You need to honor the length field of the data subchunk and stop reading audio once you've consumed the entire data payload, rather than stopping when you hit EOF.
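To make the difference concrete, here is a small Python sketch that reads only the declared data payload from a synthetic file with a trailing metadata subchunk. The helper names and the "note" identifier are invented for this example and are not taken from any real program.

```python
import io
import struct

def read_audio_bytes(stream):
    """Return exactly the payload of the "data" subchunk, ignoring anything after it."""
    stream.seek(12)  # skip "RIFF", the RIFF size, and "WAVE"
    while True:
        header = stream.read(8)
        if len(header) < 8:
            raise ValueError("no data subchunk found")
        chunk_id, size = struct.unpack("<4sI", header)
        if chunk_id == b"data":
            return stream.read(size)  # stop at the declared length, not at EOF
        # Not the chunk we want: skip its payload (plus the pad byte if the size is odd).
        stream.seek(size + (size & 1), io.SEEK_CUR)

def make_chunk(chunk_id, payload):
    """Build one subchunk: id, little-endian size, payload, optional pad byte."""
    pad = b"\0" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad

# Synthetic file: fmt, four 16-bit samples, then a made-up trailing "note" subchunk.
fmt_payload = struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16)
audio = struct.pack("<4h", 0, 1000, -1000, 0)
body = (b"WAVE" + make_chunk(b"fmt ", fmt_payload)
        + make_chunk(b"data", audio) + make_chunk(b"note", b"recorded at 8 kHz"))
wav = b"RIFF" + struct.pack("<I", len(body)) + body

correct = read_audio_bytes(io.BytesIO(wav))
# In this synthetic file the data payload happens to start at byte 44.
naive = wav[44:]  # "read until EOF": also swallows the trailing subchunk

print(len(correct), len(naive))  # 8 34
```

The extra 26 bytes in the naive read are the trailing subchunk's header, text, and pad byte, which a player would otherwise interpret as samples.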

Mike Clark
  • 4
    This is one heck of a great answer. Helped me out too, finding a seemingly random FLLR chunk 4044 bytes long filled with `00` bytes in my WAV file. Thanks! – tomsmeding Apr 14 '13 at 08:02
  • 1
    `fact` is a perfectly valid chunk even for `WAVE_FORMAT_PCM` (type 1). It is not required for that format, but most MS software writes it anyway (even W2K soundrec.exe). Choking on it is the same error as with the `FLLR` chunk: not locating chunks correctly and not reading their data size. I suppose `FLLR` is there to align the wave (as a file, not a bare audio-CD PCM stream) to 2048 for CD alignment (think game music, not the 16-bit 44.1 kHz audio-CD format), something that was previously done with the `JUNK` chunk. – GitaarLAB Sep 25 '14 at 07:03
  • 1
    SoundForge, for example, puts a 72-byte chunk after the data chunk at the end of the file. If you don't evaluate the size of the data chunk correctly, you'll hear a click at the end of the audio portion. – mramosch Oct 13 '19 at 00:27
0

What's the name of the file you're recording to on disk? I had a similar problem and just solved it by tacking on .wav to the end of my filename... I guess AVAudioRecorder needs an extension to figure things out.

kevboh
  • I am tacking on the .wav file extension. The files play normally in iTunes, but I can't read them using wavread libraries. – ch3rryc0ke Jun 17 '11 at 20:41
  • Another thing that doesn't make sense is that a 5-second clip recorded as 16-bit, single channel, at a 4096 Hz sampling frequency turns out to be 84 KB. It should be closer to 40 KB (4096*5*2). – ch3rryc0ke Jun 17 '11 at 20:42
  • Upon further inspection, QuickTime Player shows the file as recorded at 8000 Hz, not the 4096 Hz I'm specifying. I guess the iPhone simulator can't record at 4096 Hz? – ch3rryc0ke Jun 17 '11 at 20:49
  • Just tried recording on an iPhone 4 with iOS 4.3.2, and it still records at 8000 Hz. – ch3rryc0ke Jun 21 '11 at 00:32
  • Should I break out the issue of AVAudioRecorder not recording with the correct frequency as a separate question? – ch3rryc0ke Jun 21 '11 at 00:38
  • 1
    Hi, sorry to interrupt in this way. I am creating a WAV file whose audio data file offset is 4096; is there any way to make it 76 using AVFoundation? The question is here: https://stackoverflow.com/questions/51277675/avaudiorecorder-and-audio-data-file-offset-of-wav-audio-file @ch3rryc0ke, please advise. – nickypatson Jul 11 '18 at 07:11