AudioInputStream stream = AudioSystem.getAudioInputStream(new File("file_a4.wav"));

I am looking for a way to recognise the frequency of a musical note (e.g. A4 = 440 Hz) recorded in a .wav file. I have read a lot about the FFT, but it has been suggested that the frequencies of the musical scale do not line up with the frequencies the FFT reports.

I have also heard about DTFT. What should I use to recognise the frequency from a sound file?

Michaello
  • You can use a [Discrete-time Fourier transform](https://en.wikipedia.org/wiki/Discrete-time_Fourier_transform). If your WAV file consists of one frequency, you'll get one peak. If your WAV file consists of a musical score, you'll get several peaks, one for each note being played. – Gilbert Le Blanc Oct 08 '20 at 10:25
  • In fact, you'll probably get multiple peaks even for one musical note, unless it's an instrument that generates pure sinewaves. You'd probably need to search out the fundamental frequency from the DFT output. – Kevin Boone Oct 08 '20 at 11:46
  • Note that the JavaFX player will provide a spectrum analysis of the audio signal. When I gave up on Java Sound as being unreliable (across platforms & versions) and began to use the JavaFX player, I was glad I could replace the sound wave trace with a spectrum. – Andrew Thompson Oct 09 '20 at 03:52

1 Answer

As I understand your question, you want to recognize the musical note(s) an instrument is playing in a WAV file. If that is the case, there are several algorithms for doing so, and you could also train a neural network for the task.
Some important things to take into account are:

  1. Any instrument (and the same applies to musical sounds produced by the human voice) has its own particular "color" when producing a note. This color is called the timbre (https://en.wikipedia.org/wiki/Timbre), and it is made up of the harmonic and inharmonic frequencies that surround the frequency you psychoacoustically perceive when listening to that note. This is why you cannot simply look for the highest peak of an FFT to detect the musical note (the strongest peak may belong to a harmonic rather than to the perceived pitch), and it is also why a piano sounds different from a guitar when playing the same note.

  2. The analysis of an audio signal is often performed by windowing the signal and calculating the DFT of each windowed part. Each window then produces its own spectrum, and it is from the analysis of each individual spectrum and/or of how the spectra change from one window to the next that you (or your CNN, for example) will obtain your conclusions/results. This process of windowing the signal and calculating the DFTs produces a spectrogram (https://en.wikipedia.org/wiki/Spectrogram).
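To make the windowing step in point 2 concrete, here is a minimal sketch that computes the magnitude spectrum of a single frame. The class name `FrameSpectrum`, the Hann window, the 8000 Hz sample rate and the 1024-sample frame are all assumptions chosen for illustration, and the DFT is the naive O(N²) form rather than an FFT library. The test signal is synthesized, so no WAV decoding is involved.

```java
class FrameSpectrum {

    // Hann window of length n (one common choice; other windows work too).
    static double[] hann(int n) {
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            w[i] = 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / (n - 1)));
        }
        return w;
    }

    // Magnitude of DFT bins 0..N/2 of one Hann-windowed frame (naive O(N^2) DFT).
    static double[] magnitudeSpectrum(double[] frame) {
        int n = frame.length;
        double[] w = hann(n);
        double[] mag = new double[n / 2 + 1];
        for (int k = 0; k <= n / 2; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += w[t] * frame[t] * Math.cos(angle);
                im -= w[t] * frame[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    // Index of the largest magnitude.
    static int peakBin(double[] mag) {
        int best = 0;
        for (int k = 1; k < mag.length; k++) {
            if (mag[k] > mag[best]) best = k;
        }
        return best;
    }

    public static void main(String[] args) {
        double sampleRate = 8000.0; // assumed for this example
        int n = 1024;               // assumed frame length
        double[] frame = new double[n];
        for (int i = 0; i < n; i++) {
            frame[i] = Math.sin(2.0 * Math.PI * 440.0 * i / sampleRate); // pure A4
        }
        int k = peakBin(magnitudeSpectrum(frame));
        System.out.println("bin spacing = " + (sampleRate / n) + " Hz");
        System.out.println("peak bin " + k + " = " + (k * sampleRate / n) + " Hz");
    }
}
```

Bin k of an N-point DFT corresponds to k · sampleRate / N, so with these numbers the bins are 7.8125 Hz apart and the bin nearest to 440 Hz sits at 437.5 Hz. That spacing is the sense in which note frequencies "do not match" the transform's output. A longer frame narrows the spacing, and sliding the frame along the signal and stacking the resulting spectra gives the spectrogram.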

After that short introduction, here are some simple algorithms for identifying single notes in a WAV file. You will be able to find implementations of these algorithms, and of many others, on the internet. Detecting the individual notes of a chord is more complex, but it can be done with other algorithms or with neural networks.

  1. On the use of autocorrelation analysis for pitch detection: https://ieeexplore.ieee.org/document/1162905
  2. YIN algorithm: http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf
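As a rough illustration of the autocorrelation idea, here is a minimal sketch. It is not the method of either paper above and it is not YIN. It only shows the basic principle that a periodic signal correlates strongly with itself when shifted by one period. The class and method names, the 0.9 threshold and the 50 to 1000 Hz search range are assumptions made for this example. Real recordings would also need WAV decoding, framing and probably pre-filtering, none of which is shown.

```java
class AutocorrelationPitch {

    // Estimates the fundamental frequency of x by taking the shortest lag whose
    // autocorrelation is a local peak close to the strongest value in the range.
    static double estimateFrequency(double[] x, double sampleRate, double minHz, double maxHz) {
        int n = x.length;
        int minLag = Math.max(1, (int) Math.floor(sampleRate / maxHz));
        int maxLag = Math.min((int) Math.ceil(sampleRate / minHz), n - 2);
        if (minLag > maxLag) return Double.NaN;

        // Autocorrelation, averaged over the number of overlapping samples.
        double[] r = new double[maxLag + 2];
        for (int lag = minLag - 1; lag <= maxLag + 1; lag++) {
            double sum = 0.0;
            for (int i = 0; i + lag < n; i++) {
                sum += x[i] * x[i + lag];
            }
            r[lag] = sum / (n - lag);
        }

        double globalMax = 0.0;
        for (int lag = minLag; lag <= maxLag; lag++) {
            globalMax = Math.max(globalMax, r[lag]);
        }
        if (globalMax <= 0.0) return Double.NaN;

        // Multiples of the period correlate almost equally well, so prefer the
        // first strong local peak to reduce octave errors (0.9 is an assumed threshold).
        for (int lag = minLag; lag <= maxLag; lag++) {
            boolean localPeak = r[lag] >= r[lag - 1] && r[lag] >= r[lag + 1];
            if (localPeak && r[lag] >= 0.9 * globalMax) {
                return sampleRate / lag;
            }
        }
        return Double.NaN;
    }

    // Nearest equal-tempered note name, using A4 = 440 Hz = MIDI note 69.
    static String noteName(double hz) {
        String[] names = {"C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"};
        int midi = (int) Math.round(69 + 12 * Math.log(hz / 440.0) / Math.log(2.0));
        return names[Math.floorMod(midi, 12)] + (Math.floorDiv(midi, 12) - 1);
    }

    public static void main(String[] args) {
        double sampleRate = 44100.0;
        double[] x = new double[4096];
        for (int i = 0; i < x.length; i++) {
            x[i] = Math.sin(2.0 * Math.PI * 440.0 * i / sampleRate); // synthetic A4
        }
        double hz = estimateFrequency(x, sampleRate, 50.0, 1000.0);
        System.out.println(hz + " Hz -> " + noteName(hz));
    }
}
```

Because the lag is a whole number of samples, the estimate is quantized (the true period of 440 Hz at 44100 Hz is about 100.2 samples). Interpolating around the peak, as YIN does, would refine it. Rounding to the nearest note name hides small errors of this kind.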