I'm trying to make a realtime(ish) monophonic guitar-to-MIDI program. I want a latency of <= 6 ms. To find which note was played, I aim to sample 256 points (which should take approx. 6 ms), run an FFT, and analyze the magnitude plot to determine the pitch of the note played.

When I do this in MATLAB, it gives me very unstable/inaccurate results, with peaks appearing in random places, etc.

The input note is 110 Hz, sampled at 44.1 kHz. I've applied a high-pass filter at 500 Hz with a roll-off of 48 dB/octave, so only the higher harmonics of the signal should remain. The audio lasts for 1 second (filled with zeros after 256 samples).

Code:

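% FFT work
[guitar, Fs] = audioread('C:\Users\Donnacha\Desktop\Astring110hz.wav');  % samples and sample rate
guitar = guitar(1:44100);             % keep the first second of audio
X = fft(guitar);
Xmag = abs(X);
f = (0:numel(X)-1) * Fs / numel(X);   % frequency axis in Hz
plot(f, Xmag);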

Zoomed in FFT plot

I was hoping to see all the harmonics of 110 Hz (the A note on a guitar) starting above 500 Hz.

How would I achieve accurate results from an FFT with so little data?

Irreducible
  • You can't get something for nothing. The frequency resolution of an `N` point FFT with a sample rate `Fs` is `Fs / N`. So for a 44.1 kHz sample rate and a 256 point FFT your resolution is around 172 Hz. You can try interpolating the peaks to get a better resolution, but it will most likely be too inaccurate for your needs. – Paul R Oct 17 '17 at 13:21
  • So your FFT has 128 FFT bins (one sided), with a sampling rate of 44.1kHz you have a resolution of ~172Hz per bin. I am not sure what you are expecting to see? – Irreducible Oct 17 '17 at 13:22
  • See [this similar question](https://stackoverflow.com/q/41783512/253056) for a discussion on getting better frequency estimates. – Paul R Oct 17 '17 at 13:27
  • I'm padding the rest of the data points with zeros though, so my resolution should be 1 Hz? (Fs / 44100 points, i.e. 1 sec of data.) How do guitar-to-MIDI programs go about achieving such low latency? Surely they can only sample incoming data for very short periods to get an almost instant response? Thanks for the responses, btw. – flickyducky101 Oct 17 '17 at 14:16
  • Zero-padding does not really increase informational spectral resolution, just plot resolution of an interpolated (very rounded and thus inaccurate in noise) spectrum. The separation resolution (to tell harmonics apart) is still around 2.5*Fs/N or over 400 Hz in your case. – hotpaw2 Oct 17 '17 at 20:36
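
The figures quoted in the comments above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and the 2.5 factor is the rule of thumb from hotpaw2's comment rather than an exact constant.

```matlab
% Resolution figures for a 256-sample capture at 44.1 kHz.
% All variable names here are illustrative.
Fs   = 44100;   % sample rate in Hz
N    = 256;     % samples actually captured
Npad = 44100;   % FFT length after zero-padding to 1 second

captureMs  = 1000 * N / Fs;   % time needed to collect the samples
binWidth   = Fs / N;          % bin spacing without padding, about 172 Hz
plotStep   = Fs / Npad;       % spacing of plotted points after padding, 1 Hz
separation = 2.5 * Fs / N;    % rule-of-thumb spacing needed to tell peaks apart

fprintf('capture time     : %.2f ms\n', captureMs);
fprintf('bin width        : %.1f Hz\n', binWidth);
fprintf('padded plot step : %.1f Hz\n', plotStep);
fprintf('peak separation  : %.1f Hz\n', separation);
```

The harmonics of a 110 Hz note sit 110 Hz apart, well inside the roughly 430 Hz separation limit, so zero-padding gives a finer plot grid without making neighbouring harmonics distinguishable.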

1 Answer

You can't, at least not reliably for all notes in a guitar's range.

256 samples at 44.1 kHz is less than one period of most low-string guitar notes. One period of vibration of a guitar's open low E string (about 82.4 Hz) takes around 535 samples, depending on the guitar's tuning and intonation.
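
As a rough check of those period lengths, the sketch below tabulates the open strings. It assumes standard tuning with equal-tempered frequencies; the variable names are illustrative.

```matlab
% Period of each open string in samples at 44.1 kHz.
% Standard tuning (equal temperament, A4 = 440 Hz) is an assumption.
Fs = 44100;
N  = 256;
names = {'E2', 'A2', 'D3', 'G3', 'B3', 'E4'};
freqs = [82.41, 110.00, 146.83, 196.00, 246.94, 329.63];  % Hz

periodSamples   = Fs ./ freqs;         % samples per period
periodsInWindow = N ./ periodSamples;  % periods that fit in 256 samples

for k = 1:numel(freqs)
    fprintf('%s: %6.1f samples per period, %.2f periods in %d samples\n', ...
            names{k}, periodSamples(k), periodsInWindow(k), N);
end
```

Under these assumptions the three lowest open strings do not complete even one period in 256 samples, and the open high E completes fewer than two.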

Harmonics often require multiple periods (repetitions) of a guitar note waveform within the FFT window in order to show up reliably in the FFT's spectrum. The more periods within the FFT window, the more reliably and sharply the harmonics show up. Even more periods are required if the data is windowed with a von Hann (or similar) window to avoid "leakage" artifacts. So you have to pick the minimum number of periods based on the lowest note needed, your window type, and your statistical reliability and frequency resolution requirements.
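
One way to see this effect is to compare a short and a long window on a synthetic signal. The sketch below is not the asker's recording: it assumes five equal-amplitude harmonics of 110 Hz above 500 Hz, and the 2048-sample length is an arbitrary choice that covers about five periods.

```matlab
% Synthetic stand-in for the high-passed 110 Hz note (an assumed test signal).
Fs = 44100;
f0 = 110;
harmonics = f0 * (5:9);          % 550, 660, 770, 880, 990 Hz
Nshort = 256;                    % about 0.64 periods of f0
Nlong  = 2048;                   % about 5.1 periods of f0

t = (0:Nlong-1)' / Fs;
x = sum(sin(2*pi*t*harmonics), 2);   % column vector, one sine per harmonic

% Hann window written out by hand so no toolbox is required
hannWin = @(N) 0.5 - 0.5*cos(2*pi*(0:N-1)'/N);

Xshort = abs(fft(x(1:Nshort) .* hannWin(Nshort)));
Xlong  = abs(fft(x .* hannWin(Nlong)));

fShort = (0:Nshort-1)' * Fs / Nshort;   % frequency axes in Hz
fLong  = (0:Nlong-1)'  * Fs / Nlong;

plot(fShort, Xshort / max(Xshort), 'o-', fLong, Xlong / max(Xlong), '-');
xlim([0 2000]); xlabel('Frequency (Hz)'); ylabel('Normalized magnitude');
legend('256 samples', '2048 samples');
```

With the Hann window the main lobe is about 4 bins wide: roughly 86 Hz for 2048 samples, which is narrower than the 110 Hz harmonic spacing, but roughly 689 Hz for 256 samples, which is wider than the whole group of harmonics. The short window should therefore show one broad lump where the long window shows separate peaks.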

An alternative is to concatenate several sets of your 256 samples into a longer window, at least as long as several periods of the lowest pitch you want to reliably plot.
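
A minimal sketch of that idea is a sliding buffer that keeps the most recent blocks and recomputes the FFT each time a new block arrives. The block count of 8 and the random stand-in input are assumptions for illustration only.

```matlab
% Sliding analysis window built from consecutive 256-sample blocks.
% numBlocks and the random input stream are illustrative assumptions.
Fs        = 44100;
blockLen  = 256;                     % samples delivered per block
numBlocks = 8;                       % blocks kept in the analysis window
winLen    = blockLen * numBlocks;    % 2048 samples

buffer = zeros(winLen, 1);           % holds the most recent winLen samples
stream = randn(20 * blockLen, 1);    % stand-in for live audio input

for b = 1:(numel(stream) / blockLen)
    newBlock = stream((b-1)*blockLen + (1:blockLen));
    buffer   = [buffer(blockLen+1:end); newBlock];  % drop oldest block, append newest
    Xmag     = abs(fft(buffer));                    % spectrum refreshed every block
end

updateMs = 1000 * blockLen / Fs;     % a new spectrum about every 5.8 ms
spanMs   = 1000 * winLen / Fs;       % each spectrum covers about 46.4 ms
```

The spectrum is refreshed every block, but each one still describes the last 46 ms or so of audio, so a newly plucked note only dominates once it fills a good part of the buffer.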

hotpaw2
  • There are better methods of low latency pitch tracking than using a zero-padded FFT. But that's a separate question. – hotpaw2 Oct 17 '17 at 20:30
  • Won't that just add to the latency? I was hoping to analyse only the higher harmonics, thereby eliminating the need for more data points. The low E on a guitar is approx. 82 Hz, or 12 ms for a full cycle, but I was hoping that by taking only 6 ms of data (256-ish data points) I would at least capture the higher harmonics. Is this thinking flawed? Would a time-based method (perhaps cross-correlation) be more suitable? – flickyducky101 Oct 17 '17 at 21:07
  • Do you know what causes the higher harmonics to be generated or appear in a spectrum? Other time based methods are a separate question, so I won't answer under the current FFT question. – hotpaw2 Oct 17 '17 at 21:21
  • That's okay. Regarding harmonics, my naive understanding is that if I have more cycles contained in my sampled input data, the FFT will generate a more accurate plot. So within those 256 samples there should be numerous cycles of the higher harmonics? I don't have an in-depth knowledge of how an FFT generates a spectrum, so my thinking may be incorrect. – flickyducky101 Oct 17 '17 at 21:49
  • Yes? It's definitely a composite waveform from the beginning. I am only judging from the time domain here, but there are numerous higher-frequency oscillations. – flickyducky101 Oct 17 '17 at 22:09