
I am currently trying to implement some code using Android to detect when a number of specific audio frequency ranges are played through the phone's microphone. I have set up the class using the AudioRecord class:

int channel_config = AudioFormat.CHANNEL_IN_MONO; // CHANNEL_CONFIGURATION_MONO is deprecated
int format = AudioFormat.ENCODING_PCM_16BIT;
int sampleSize = 8000;
int bufferSize = AudioRecord.getMinBufferSize(sampleSize, channel_config, format);
AudioRecord audioInput = new AudioRecord(AudioSource.MIC, sampleSize, channel_config, format, bufferSize);

The audio is then read in:

short[] audioBuffer = new short[bufferSize];
audioInput.startRecording();
audioInput.read(audioBuffer, 0, bufferSize);

Performing an FFT is where I become stuck, as I have very little experience in this area. I have been trying to use this class:

FFT in Java and Complex class to go with it

I am then sending the following values:

Complex[] fftTempArray = new Complex[bufferSize];
for (int i=0; i<bufferSize; i++)
{
    fftTempArray[i] = new Complex(audioBuffer[i], 0);
}
Complex[] fftArray = fft(fftTempArray);

This could easily be me misunderstanding how this class is meant to work, but the values returned jump all over the place and aren't representative of a consistent frequency even in silence. Is anyone aware of a way to perform this task, or am I overcomplicating matters to try and grab only a small number of frequency ranges rather than to draw it as a graphical representation?

Amanda S
user723060

1 Answer


First, make sure the samples you read are correctly converted to float/double values. I'm not sure how the short[] version of read() behaves, but the byte[] version returns only the raw bytes. That byte array then has to be assembled into samples and converted to floating point. The conversion should look something like this:

    double[] micBufferData = new double[<insert-proper-size>];
    final int bytesPerSample = 2; // As it is 16bit PCM
    final double amplification = 100.0; // choose a number as you like
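    // Walk the byte buffer one sample (bytesPerSample bytes) at a time.
    for (int index = 0, floatIndex = 0; index < bytesRecorded - bytesPerSample + 1; index += bytesPerSample, floatIndex++) {
        double sample = 0;
        // Assemble the little-endian sample from its bytes.
        for (int b = 0; b < bytesPerSample; b++) {
            int v = bufferData[index + b];
            // Treat the lower bytes (and the only byte of 8-bit audio) as unsigned; the top byte keeps its sign.
            if (b < bytesPerSample - 1 || bytesPerSample == 1) {
                v &= 0xFF;
            }
            sample += v << (b * 8);
        }
        // Normalize by 2^15 (full scale for 16-bit PCM), then apply the gain.
        double sample32 = amplification * (sample / 32768.0);
        micBufferData[floatIndex] = sample32;
    }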

Then you use micBufferData[] to create your input complex array.
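
If you read with the short[] overload instead, as the question does, each array element is already one signed 16-bit sample, so no byte assembly should be needed and the conversion reduces to a scale. A minimal sketch, assuming that behaviour; the class and method names here are hypothetical and not part of any Android API:

```java
public class PcmConversion {
    // Hypothetical helper: scales signed 16-bit PCM samples to doubles in [-1.0, 1.0).
    // samplesRead is assumed to be the count returned by the read() call.
    public static double[] toDoubles(short[] samples, int samplesRead) {
        double[] out = new double[samplesRead];
        for (int i = 0; i < samplesRead; i++) {
            // 32768 = 2^15, the magnitude of the most negative 16-bit value.
            out[i] = samples[i] / 32768.0;
        }
        return out;
    }
}
```

For example, toDoubles(audioBuffer, n) with n taken from the return value of read() would give the array to feed into the complex input.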

Once you get the results, use the magnitudes of the complex numbers in the output. Most of the magnitudes should be close to zero, except at the indices that correspond to frequencies actually present in the signal.
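
As a rough illustration of the magnitude step, here is a sketch that assumes the FFT output has been unpacked into two parallel arrays of real and imaginary parts. That layout and the names are assumptions for the example; with a Complex class you would read the two parts from each element instead:

```java
public class Magnitudes {
    // re[k] and im[k] hold the real and imaginary parts of output bin k (assumed layout).
    public static double[] magnitudes(double[] re, double[] im) {
        // For real-valued input the upper half of the spectrum mirrors the lower half,
        // so only bins 0 .. n/2 carry distinct information.
        int half = re.length / 2 + 1;
        double[] mag = new double[half];
        for (int k = 0; k < half; k++) {
            mag[k] = Math.hypot(re[k], im[k]); // sqrt(re*re + im*im)
        }
        return mag;
    }
}
```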

You need the sampling frequency to convert the array indices of those magnitudes to frequencies:

private double ComputeFrequency(int arrayIndex) {
    return ((1.0 * sampleRate) / (1.0 * fftOutWindowSize)) * arrayIndex;
}
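
To check only a few frequency ranges, as the question asks, one option is to look at just the bins whose frequency falls inside each range. The sketch below is an illustration under assumptions, not the FFT class from the question: it uses a naive DFT evaluated per bin as a stand-in so that it is self-contained, and all names in it are hypothetical.

```java
public class BandDetector {
    // Same formula as ComputeFrequency: bin index -> frequency in Hz.
    public static double binToFrequency(int bin, int sampleRate, int windowSize) {
        return ((double) sampleRate / windowSize) * bin;
    }

    // Naive DFT magnitude for a single bin; a stand-in for one element of the FFT output.
    public static double magnitudeAt(double[] x, int bin) {
        double re = 0.0;
        double im = 0.0;
        for (int n = 0; n < x.length; n++) {
            double angle = 2.0 * Math.PI * bin * n / x.length;
            re += x[n] * Math.cos(angle);
            im -= x[n] * Math.sin(angle);
        }
        return Math.hypot(re, im);
    }

    // Returns the frequency of the strongest bin between lowHz and highHz,
    // or -1 if no bin falls inside that range.
    public static double peakInBand(double[] x, int sampleRate, double lowHz, double highHz) {
        int best = -1;
        double bestMag = -1.0;
        // Only bins up to n/2 are distinct for real-valued input.
        for (int bin = 0; bin <= x.length / 2; bin++) {
            double f = binToFrequency(bin, sampleRate, x.length);
            if (f < lowHz || f > highHz) {
                continue;
            }
            double m = magnitudeAt(x, bin);
            if (m > bestMag) {
                bestMag = m;
                best = bin;
            }
        }
        return best < 0 ? -1 : binToFrequency(best, sampleRate, x.length);
    }
}
```

The per-bin DFT costs O(n) for each bin checked, which is acceptable for a handful of narrow ranges; for a full spectrum you would take the magnitudes from the FFT output instead. In practice you would also compare the peak magnitude against a threshold to separate a real tone from background noise.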
shams
  • Thanks so much for your reply, but I still have a couple of issues. Before running the 'ComputeFrequency' method, should I still be able to extract the values from the returned complex array? Unfortunately the same problem still seems to persist, with sporadic numbers ranging from 10 to around 3000 appearing while the room is silent. – user723060 Apr 25 '11 at 13:49
  • Yes, you should still be able to extract the values from the complex array; you want to use the magnitudes of the complex numbers (i.e. sqrt(re*re + im*im)). Even when the room is completely silent, the mic may introduce background noise that shows up in the FFT. Convert the array indices to frequencies to see exactly which frequencies show up. Those frequencies might help you tell whether they are background noise or not. – shams Apr 25 '11 at 16:33
  • I am curious if I am calling the complex array correctly as regards the imaginary numbers. The way that I have implemented it now is very much the same as how I did it in my original example, but am now cycling through the new micBufferData array and assigning each value to a complex array as the real number with the imaginary number constantly as 0. This may be where I am going wrong, but previous examples I have read seem to indicate this is the correct method. Any idea if there is something else meant to go in there? Thanks again! – user723060 Apr 25 '11 at 16:52
  • Your complex number is fine. You need to set the real part only and set the imaginary part to zero. – shams Apr 26 '11 at 00:21
  • Thank you so much for your help, I have finally managed to get it working after far too long being confused over the issue! I didn't have to make any extra changes from what has already been said in here. – user723060 Apr 26 '11 at 02:29
  • I have a similar issue; please check my question, any help appreciated. http://stackoverflow.com/questions/10908582/android-audiorecord-listening-sound-with-frequency-filter – d-man Jun 06 '12 at 05:33
  • Hey, if you got it to work, could you please tell me whether the code runs faster with this change? I'm using the same FFT and Complex classes to analyse an audio signal, but performing the FFT on a short 5-second clip takes approximately 10 seconds on my HTC Sensation. Any help would be much appreciated. – Ahsan Zaheer Feb 07 '14 at 04:50
  • what is represented by 32768.0 on the second to last line of the outer loop? – brainmurphy1 Jul 03 '16 at 00:09
  • Hey @user723060 can you please post your solution? Thank you. – Rod Lima Feb 21 '19 at 19:14