
I've researched fast Fourier transforms and have not been able to see a way for them to decode multiple frequencies from one signal. Is there a way to decompose the result of an FFT calculation so that we can see the individual pitches in a chord, or maybe to calculate the most likely chord based on the result of the FFT?

If not, is there another method of pitch detection that can detect multiple pitches in a live setting?

EDIT: I am trying to detect no more than six pitches at a time, as the software I am writing deals with guitars; on the off chance that the user has a seven-string guitar, it would need to be able to pick up seven pitches at most.

That being the case, is an FFT (or some other method) able to handle this from a single microphone signal, or do I have to make a guitar pickup that reads each string individually?

Adam
  • How many tones or sine waves are we talking about in the original signal? If it's just a few (such as the 2 core tones in a DTMF signal), the FFT would likely work. Just search for the peaks. Otherwise, for music, this is generally known to be a hard problem in computer science and signal processing. You could do an Internet search on "automatic music transcription" and you might find some software programs or code that attempt to do this. – selbie Mar 08 '12 at 06:50
  • There's a famous piece of software called Melodyne that can do it for complex sounds. – cmannett85 Mar 08 '12 at 07:45
  • Possible duplicate of this question: http://stackoverflow.com/questions/4337487/chord-detection-algorithms/4339225#4339225 – hotpaw2 Mar 08 '12 at 08:22

3 Answers


There are two well-known statistical techniques for parametric spectral estimation: one is MUSIC and the other is ESPRIT. If you can express your signal as a sum of weighted complex exponentials (i.e. sinusoids), you can apply either of them. Moreover, the eigendecomposition of the correlation matrix will also tell you the number of frequencies in the signal, so you do not even need to know that in advance. ESPRIT is better than MUSIC in that you do not have to search for peaks in the frequency domain; the frequencies are given to you directly as a result. However, MUSIC is known to be more robust.
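
This is not the answerer's code, just a minimal sketch of the MUSIC pseudospectrum idea using the Eigen library; the autocorrelation size M, the assumed number of complex exponentials K, and the number of scan bins are placeholders you would tune (note that each real sinusoid contributes two complex exponentials, so K should be roughly twice the number of tones you expect):

```cpp
// Minimal MUSIC pseudospectrum sketch (illustrative only, not production code).
// Assumes the Eigen library; frame, M and K are placeholders you would tune.
#include <Eigen/Dense>
#include <cmath>
#include <complex>
#include <vector>

std::vector<double> musicPseudospectrum(const std::vector<double>& frame,
                                        int M, int K, int numBins)
{
    const double kPi = 3.14159265358979323846;
    const int N = static_cast<int>(frame.size());

    // Estimate the M x M autocorrelation matrix from overlapping length-M snapshots.
    Eigen::MatrixXd R = Eigen::MatrixXd::Zero(M, M);
    for (int n = 0; n + M <= N; ++n) {
        Eigen::Map<const Eigen::VectorXd> x(frame.data() + n, M);
        R += x * x.transpose();
    }
    R /= static_cast<double>(N - M + 1);

    // Eigendecomposition: eigenvalues come back in ascending order, so the
    // first M-K eigenvectors span the (estimated) noise subspace.
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> es(R);
    Eigen::MatrixXd En = es.eigenvectors().leftCols(M - K);

    // Pseudospectrum P(f) = 1 / || En^H a(f) ||^2, with a(f) the steering vector.
    std::vector<double> P(numBins);
    for (int b = 0; b < numBins; ++b) {
        double f = 0.5 * b / numBins;                 // normalized frequency in [0, 0.5)
        Eigen::VectorXcd a(M);
        for (int m = 0; m < M; ++m)
            a(m) = std::polar(1.0, 2.0 * kPi * f * m);
        double denom = (En.cast<std::complex<double>>().adjoint() * a).squaredNorm();
        P[b] = 1.0 / (denom + 1e-12);                 // peaks mark the sinusoid frequencies
    }
    return P;
}
```

A peak at bin b of the returned pseudospectrum corresponds to a frequency of roughly 0.5 * b / numBins times the sample rate. ESPRIT would instead return the frequencies directly from a second eigendecomposition, avoiding the scan.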

jkt
  • This is pretty much what I'm looking for; is it fast enough to produce live results (i.e. give live feedback to a guitarist)? – Adam Mar 14 '12 at 15:58
  • It depends on the size of your data. I like ESPRIT more, and in that case you need to do two eigendecompositions. For an M by M autocorrelation matrix, this corresponds to roughly M^3 operations. The thing is that M needs to be bigger than K (K is the number of frequencies in your signal). So I do not think it would be a huge problem in your case, as you do not have too many frequencies. – jkt Mar 14 '12 at 19:09
  • 1
    @YBE : MUSIC and ESPRIT are reported to perform poorly when one doesn't know the exact number of exponentials involved, and guitars can produce some large and varying number of overtones for each string (can be dozens), some of them potentially slightly inharmonic. – hotpaw2 Apr 03 '12 at 20:01
  • @hotpaw2, you are right about not knowing the number of exponentials; however, the eigendecomposition of the autocorrelation matrix also tells you something about the number of components involved. There should be a significant drop at some point in the eigenspectrum, since the signal eigenvalues are bigger than the noise eigenvalues. In the case of heavily noisy observations, though, the noise eigenvalues will be comparable to the signal eigenvalues. There is one other technique, called the Annihilating Filter method, that is used with Cadzow's iterative denoising and is claimed to be robust. – jkt Apr 03 '12 at 22:05
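
To make the eigenvalue-drop idea concrete, here is a rough heuristic sketch (an illustration under the assumption that the largest gap separates signal from noise eigenvalues, not code from either commenter):

```cpp
// Rough heuristic sketch: estimate the number of signal components from the
// eigenvalue spectrum by finding the largest relative drop between consecutive
// eigenvalues (sorted in descending order). Illustrative only.
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

int estimateNumComponents(std::vector<double> eigenvalues)
{
    std::sort(eigenvalues.begin(), eigenvalues.end(), std::greater<double>());
    int bestK = 1;
    double bestRatio = 0.0;
    for (std::size_t i = 0; i + 1 < eigenvalues.size(); ++i) {
        double ratio = eigenvalues[i] / (eigenvalues[i + 1] + 1e-12);
        if (ratio > bestRatio) {      // the biggest drop separates signal from noise
            bestRatio = ratio;
            bestK = static_cast<int>(i) + 1;
        }
    }
    return bestK;                     // estimated number of signal eigenvalues
}
```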

A guitar pick-up that isolates each string may be necessary. Otherwise unmixing all the overtones might be a very difficult problem.

hotpaw2

You need to first understand what 'pitch' really is (read the Wikipedia link below). When a single note is played on a guitar or piano, what we hear is not just one frequency of sound vibration, but a composite of multiple vibrations occurring at different, mathematically related frequencies. The elements of this composite are referred to as harmonics or partials. For instance, if we press the Middle C key on the piano, the fundamental frequency of the composite is 261.6 Hz, the 2nd harmonic is about 523 Hz, the 3rd harmonic about 785 Hz, the 4th harmonic about 1046 Hz, and so on. The higher harmonics are integer multiples of the fundamental frequency, 261.6 Hz (e.g. 2 × 261.6 ≈ 523, 3 × 261.6 ≈ 785, 4 × 261.6 ≈ 1046).
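
As a quick illustration of that arithmetic (a trivial standalone snippet, not taken from any of the code discussed here), the harmonics of any fundamental can be listed by multiplying it by successive integers:

```cpp
// Tiny illustration of the harmonic-series arithmetic described above:
// harmonic n of a fundamental f0 sits at n * f0.
#include <cstdio>

int main()
{
    const double f0 = 261.6;                    // Middle C fundamental, in Hz
    for (int n = 1; n <= 4; ++n)
        std::printf("Harmonic %d: %.1f Hz\n", n, n * f0);
    // Prints 261.6, 523.2, 784.8, 1046.4 Hz, matching the values above.
    return 0;
}
```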

Below, at GitHub.com, is the C++ source code for an unusual two-stage algorithm that I devised, which can do real-time pitch detection on polyphonic MP3 files while they are being played on Windows. This free application (PitchScope Player, available on the web) is frequently used to detect the notes of a guitar or saxophone solo on an MP3 recording. You can download the Windows executable to see my algorithm at work on an MP3 file of your choosing. The algorithm is designed to detect the most dominant pitch (a musical note) at any given moment in time within an MP3 or WAV music file. Note onsets are inferred from a change in the most dominant pitch during the recording.

I use a modified DFT logarithmic transform (similar to an FFT) to first detect these possible harmonics by looking for frequencies with peak levels (see the diagram below). Because of the way that I gather data for my modified Log DFT, I do NOT have to apply a windowing function to the signal, nor do I need overlap-add. And I have created the DFT so that its frequency channels are logarithmically spaced in order to directly align with the frequencies where harmonics are created by the notes on a guitar, saxophone, etc.
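
This is not the PitchScope code itself, only a sketch of what logarithmically spaced frequency channels typically look like: each channel sits a fixed fraction of an octave above the previous one, so the channel centers line up with equal-tempered note frequencies. The reference frequency (low E at about 82.41 Hz) and 12 channels per octave are assumptions chosen for illustration:

```cpp
// Sketch of logarithmically spaced frequency channels (illustrative assumptions:
// reference pitch low E at ~82.41 Hz, 12 channels per octave, one per semitone).
#include <cmath>
#include <cstdio>
#include <vector>

std::vector<double> logChannelCenters(double refHz, int channelsPerOctave, int numChannels)
{
    std::vector<double> centers(numChannels);
    for (int c = 0; c < numChannels; ++c)
        centers[c] = refHz * std::pow(2.0, static_cast<double>(c) / channelsPerOctave);
    return centers;
}

int main()
{
    // Channels spanning four octaves above low E on a guitar.
    std::vector<double> centers = logChannelCenters(82.41, 12, 48);
    for (std::size_t c = 0; c < centers.size(); ++c)
        std::printf("channel %2zu: %8.2f Hz\n", c, centers[c]);
    return 0;
}
```

The response of each channel could then be measured with, for example, a Goertzel-style correlation at its center frequency.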

My pitch detection algorithm is actually a two-stage process:

a) First the ScalePitch is detected ('ScalePitch' has 12 possible pitch values: {E, F, F#, G, G#, A, A#, B, C, C#, D, D#}).

b) After the ScalePitch is determined, the Octave is calculated by examining all the harmonics for the 4 possible Octave-Candidate notes.

Again, the algorithm looks for the most dominant pitch (a musical note) at any given moment within a polyphonic MP3 file, which usually corresponds to the notes of an instrumental solo. Those interested in the C++ source code for my two-stage pitch detection algorithm might want to start at the Estimate_ScalePitch() function within the SPitchCalc.cpp file at GitHub.com.
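
The following is not the PitchScope implementation (for that, start at the Estimate_ScalePitch() function in the repository linked below); it is only a generic sketch of the two-stage idea under assumed helper names: stage one folds spectral energy into the 12 pitch classes, and stage two, once a pitch class wins, compares the candidate octaves by summing magnitudes at their harmonic positions. The magnitudeAt() helper and the reference frequency are hypothetical:

```cpp
// Generic two-stage sketch (NOT the PitchScope code): stage 1 picks a pitch class
// by folding spectral energy into 12 bins; stage 2 picks the octave whose
// harmonics carry the most energy. 'magnitudeAt(hz)' is a hypothetical helper
// returning the spectrum magnitude nearest the given frequency.
#include <cmath>
#include <functional>

int bestPitchClass(const std::function<double(double)>& magnitudeAt)
{
    // Stage 1: accumulate energy for each of the 12 pitch classes over several octaves.
    const double baseHz = 82.41;                      // low E, an assumed reference
    int best = 0;
    double bestEnergy = 0.0;
    for (int pc = 0; pc < 12; ++pc) {
        double energy = 0.0;
        for (int octave = 0; octave < 4; ++octave) {
            double hz = baseHz * std::pow(2.0, octave + pc / 12.0);
            energy += magnitudeAt(hz);
        }
        if (energy > bestEnergy) { bestEnergy = energy; best = pc; }
    }
    return best;                                      // 0 = E, 1 = F, ... 11 = D#
}

int bestOctave(int pitchClass, const std::function<double(double)>& magnitudeAt)
{
    // Stage 2: for each octave candidate of the winning pitch class,
    // sum the magnitudes at its first few harmonics and keep the strongest.
    const double baseHz = 82.41;
    int best = 0;
    double bestScore = 0.0;
    for (int octave = 0; octave < 4; ++octave) {
        double f0 = baseHz * std::pow(2.0, octave + pitchClass / 12.0);
        double score = 0.0;
        for (int h = 1; h <= 5; ++h)
            score += magnitudeAt(h * f0);
        if (score > bestScore) { bestScore = score; best = octave; }
    }
    return best;
}
```

A real detector would weight the harmonics and handle ties, but the split into a pitch-class stage followed by an octave stage is the part this answer describes.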

https://github.com/CreativeDetectors/PitchScope_Player

https://en.wikipedia.org/wiki/Transcription_(music)#Pitch_detection

(Diagram referenced above: frequencies with peak levels in the modified Log DFT output.)