43

I'd like to write a simple C# application to monitor the line-in audio and give me the current (well, the rolling average) beats per minute.

I've seen this gamedev article, and that was absolutely no help. I went through and tried to implement what he was doing but it just wasn't working.

I know there have to be tons of solutions for this, because lots of DJ software does it, but I'm not having any luck in finding any open-source library or instructions on doing it myself.

Helder Pereira
Karl
  • Another article that might be interesting for you... [http://werner.yellowcouch.org/Papers/bpm04/](http://werner.yellowcouch.org/Papers/bpm04/) You can find existing BPM detection libraries here: [http://www.mmartins.com/mmartins/bpmdetection/bpmdetection.asp](http://www.mmartins.com/mmartins/bpmdetection/bpmdetection.asp) ...and a C# BPM detection library here: [http://adionsoft.net/bpm/](http://adionsoft.net/bpm/) – sachaa Sep 20 '08 at 18:31
  • 1
    Two names: Eric D. Scheirer and Masataka Goto. Google them; they have written about beat detection (realtime and offline) at great length. Very interesting material. Also, as a side note, I think besides beat-detection you might be interested in beat-*prediction*. – Led May 30 '09 at 23:58
  • I believe Accord has the necessary algorithms you want. Also, lots of interesting machine learning algorithms for lots more fun! There's an example in the tutorial / sample programs if I remember correctly. accord.net – user3791372 May 23 '17 at 23:52

8 Answers

26

Calculate a power spectrum with a sliding-window FFT. Take 1024 samples:

// Assumes stream is an IEnumerable<double>; requires "using System.Linq;".
double[] signal = stream.Take(1024).ToArray();

Feed it to an FFT algorithm:

double[] real = new double[signal.Length];
double[] imag = new double[signal.Length];
FFT(signal, out real, out imag);

You will get a real part and an imaginary part. Do NOT throw away the imaginary part; process it exactly as you process the real part. Although the imaginary part is pi / 2 out of phase with the real part, it still carries 50% of the spectrum information.

EDIT:

Calculate the power as opposed to the amplitude so that you have a high number when it is loud and close to zero when it is quiet:

for (int i = 0; i < real.Length; i++) real[i] = real[i] * real[i];

Similarly for the imaginary part:

for (int i = 0; i < imag.Length; i++) imag[i] = imag[i] * imag[i];

Now you have a power spectrum for the last 1024 samples. The low-numbered bins hold the low frequencies, and frequency rises with the bin index up to bin 512 (half the sampling frequency); for real-valued input, the bins above that mirror the lower half.

If you want to find BPM in popular music you should probably focus on the bass. You can pick up the bass intensity by summing the lower part of the power spectrum. Which bin indices to use depends on the sampling frequency:

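double bassIntensity = 0;
// real[i] and imag[i] already hold squared values, so their sum is the power in bin i.
for (int i = 8; i < 96; i++) bassIntensity += real[i] + imag[i];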
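The bin range depends on the sampling frequency because bin k of an N-point FFT is centred on k * sampleRate / N Hz. Here is a minimal sketch of that conversion; the class and method names (`BinMath`, `BinToHz`, `HzToBin`) are hypothetical and not part of the answer above.

```csharp
using System;

static class BinMath
{
    // Centre frequency in Hz of FFT bin `bin` for an fftSize-point transform.
    public static double BinToHz(int bin, int fftSize, double sampleRate)
        => bin * sampleRate / fftSize;

    // Nearest bin index for a frequency given in Hz.
    public static int HzToBin(double hz, int fftSize, double sampleRate)
        => (int)Math.Round(hz * fftSize / sampleRate);
}
```

As an illustration, assuming a 44100 Hz sampling frequency and a 1024-sample window, each bin is about 43 Hz wide, so bins 8 to 96 span roughly 345 Hz to 4134 Hz, while a 40 to 200 Hz band would map to bins 1 to 5. Those rates and band edges are example values only.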

Now do the same again, but move the window forward by 256 samples before you calculate a new spectrum. You end up with one bassIntensity value for every 256 samples.

This is a good input for your BPM analysis. When the bass is quiet you do not have a beat and when it is loud you have a beat.
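The sliding-window procedure could be sketched as follows. This is only an illustration under some assumptions: a direct DFT over the chosen bins stands in for the FFT so that the code has no dependencies, the bin range is inclusive at both ends, and `BassEnvelope.Compute` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

static class BassEnvelope
{
    // Returns one bass-power value per hop of `hop` samples.
    public static double[] Compute(double[] samples, int window, int hop,
                                   int firstBin, int lastBin)
    {
        var result = new List<double>();
        for (int start = 0; start + window <= samples.Length; start += hop)
        {
            double power = 0;
            for (int k = firstBin; k <= lastBin; k++)
            {
                // Direct DFT of bin k over the current window.
                double re = 0, im = 0;
                for (int n = 0; n < window; n++)
                {
                    double angle = 2 * Math.PI * k * n / window;
                    re += samples[start + n] * Math.Cos(angle);
                    im -= samples[start + n] * Math.Sin(angle);
                }
                power += re * re + im * im; // power in bin k: real^2 + imag^2
            }
            result.Add(power);
        }
        return result.ToArray();
    }
}
```

Calling `BassEnvelope.Compute(samples, 1024, 256, 8, 95)` would cover the same bins as the loop above. For real-time use a proper FFT would be much faster than this direct sum.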

Good luck!

FreelanceConsultant
Hallgrim
  • You do realize that with just 1024 samples you will not be able to calculate the BPM (beats per MINUTE)... right? You are just calculating the top frequency in that "tiny" 1024-bytes long sample. – sachaa Sep 20 '08 at 18:22
  • It is correct that I calculate the top frequency in a tiny sample, but then I suggested to move the window 256 samples and calculate it again. The result is a series that can be used to calculate the BPM. I only described how to solve the first part of the problem. – Hallgrim Sep 28 '08 at 00:41
  • 2
    Sacha it's just a matter of scaling units. You don't need to analyze a minute's worth of samples before you can calculate the BPM. BPM is just a more musical way of conveying the period of a signal. – Bob Somers May 16 '09 at 04:00
  • 14
    Don't throw away the imaginary component, it is _not_ just the phase! The phase is given by the arctan(imag/real), which is useless for this problem. The important part, the magnitude, is given by real^2 + imag^2, not just by real^2. – Adam Rosenfield May 31 '09 at 00:16
  • 1
    As you said, this is only the first part of the solution... In fact, you just have calculated a series that shows how the low frequencies magnitude changes with the time (almost like applying a low-pass filter). In other words, the solution is far harder than it looks in your answer. – Alceu Costa Aug 22 '09 at 21:29
  • In popular music you can *often* count on a bass drum, but in a lot of music the tempo is driven by the hi-hat, snare drum, or other broadband acoustic devices. Good luck finding the BPM for a cappella music with a chorus of singers. This approach will get something that sort of works, but only by measuring something incidentally related to BPM in some material. If you're at all concerned about accuracy, look at one of the academic papers other people have mentioned below. – jbarlow Jun 28 '10 at 19:07
  • @AdamRosenfield I was a bit confused at first. I thought you meant the second half of the fft output, when you talked about not throwing away the imaginary component. Since this second half is basically just a mirror of the first half. But later I realised what you meant by imaginary part. – Jespertheend Mar 05 '17 at 01:14
15

There's an excellent project called Dancing Monkeys, which procedurally generates DDR dance steps from music. A large part of what it does is based on (necessarily very accurate) beat analysis, and their project paper goes into much detail describing the various beat detection algorithms and their suitability to the task. They include references to the original papers for each of the algorithms. They've also published the MATLAB code for their solution. I'm sure that between those you can find what you need.

It's all available here: http://monket.net/dancing-monkeys-v2/Main_Page

Nick Johnson
8

Not that I have a clue how to implement this, but from an audio engineering perspective you'd need to filter first. Bass drum hits would be the first to check. A low-pass filter that gives you anything under about 200Hz should give you a pretty clear picture of the bass drum. A gate might also be necessary to clean up any clutter from other instruments with harmonics that low.
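As a rough illustration of the filter-then-gate idea, here is a minimal sketch. It assumes a simple one-pole low-pass filter and a per-sample threshold gate; a real gate would normally follow a level envelope with attack and release times. `BassFilter`, `LowPass`, and `Gate` are hypothetical names.

```csharp
using System;

static class BassFilter
{
    // One-pole low-pass filter: attenuates content above cutoffHz.
    public static double[] LowPass(double[] input, double sampleRate, double cutoffHz)
    {
        double dt = 1.0 / sampleRate;
        double rc = 1.0 / (2 * Math.PI * cutoffHz);
        double alpha = dt / (rc + dt); // smoothing factor between 0 and 1
        var output = new double[input.Length];
        double previous = 0;
        for (int i = 0; i < input.Length; i++)
        {
            previous += alpha * (input[i] - previous);
            output[i] = previous;
        }
        return output;
    }

    // Simplified gate: silences samples whose absolute level is below the threshold.
    public static double[] Gate(double[] input, double threshold)
    {
        var output = new double[input.Length];
        for (int i = 0; i < input.Length; i++)
            output[i] = Math.Abs(input[i]) >= threshold ? input[i] : 0.0;
        return output;
    }
}
```

A possible chain would be `BassFilter.Gate(BassFilter.LowPass(samples, 44100, 200), 0.1)`, where the sampling frequency and the threshold are example values that would need tuning by ear.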

The next to check would be snare hits. You'd have to EQ this one. The "crack" from a snare is around 1.5kHz from memory, but you'd need to definitely gate this one.

The next challenge would be to work out an algorithm for funky beats. How would you programmatically find beat 1? I guess you'd keep track of previous beats and use a pattern matching something-or-other. So, you'd probably need a few bars to accurately find the beat. Then there are time signatures like 4/4, 3/4 and 6/8. Wow, I can't imagine what would be required to do this accurately! I'm sure it'd be worth some serious money to audio hardware/software companies.

Dan Harper
  • 1
    and that's probably exactly why a definitive answer (let alone with code samples!) is so hard to find on Google ;) – GONeale Oct 27 '14 at 02:44
6

I found this library, which seems to have a pretty solid implementation for detecting beats per minute: https://github.com/owoudenberg/soundtouch.net

It's based on http://www.surina.net/soundtouch/index.html, which is used in quite a few DJ projects: http://www.surina.net/soundtouch/applications.html

Community
eandersson
6

This is by no means an easy problem. I'll try to give you an overview only.

What you could do is something like the following:

  1. Compute the average (root-mean-square) loudness of the signal over blocks of, say, 5 milliseconds. (Having never done this before, I don't know what a good block size would be.)
  2. Take the Fourier transform of the "blocked" signal, using the FFT algorithm.
  3. Find the component in the transformed signal that has the largest magnitude.

A Fourier transform is basically a way of computing the strength of all frequencies present in the signal. If you do that over the "blocked" signal, the frequency of the beat will hopefully be the strongest one.

Maybe you need to apply a filter first, to focus on specific frequencies (like the bass) that usually contain the most information about the BPM.
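The three steps above could be sketched like this. It is an illustration under assumptions rather than a tested detector: the Fourier transform is evaluated directly at whole-number tempos in an assumed 60 to 180 BPM range instead of using an FFT, the mean is removed so the 0 Hz component cannot win, and `RmsTempo` with its methods is a hypothetical name.

```csharp
using System;

static class RmsTempo
{
    // Step 1: root-mean-square loudness of consecutive blocks.
    public static double[] BlockRms(double[] samples, int blockSize)
    {
        int blocks = samples.Length / blockSize;
        var rms = new double[blocks];
        for (int b = 0; b < blocks; b++)
        {
            double sumSq = 0;
            for (int n = 0; n < blockSize; n++)
            {
                double s = samples[b * blockSize + n];
                sumSq += s * s;
            }
            rms[b] = Math.Sqrt(sumSq / blockSize);
        }
        return rms;
    }

    // Steps 2 and 3: evaluate the Fourier transform of the loudness curve at
    // each candidate tempo and return the tempo with the largest magnitude.
    public static double StrongestBpm(double[] rms, double blocksPerSecond,
                                      int minBpm = 60, int maxBpm = 180)
    {
        double mean = 0;
        foreach (double v in rms) mean += v;
        mean /= rms.Length;

        double bestBpm = minBpm, bestPower = -1;
        for (int bpm = minBpm; bpm <= maxBpm; bpm++)
        {
            double hz = bpm / 60.0, re = 0, im = 0;
            for (int i = 0; i < rms.Length; i++)
            {
                double angle = 2 * Math.PI * hz * i / blocksPerSecond;
                re += (rms[i] - mean) * Math.Cos(angle);
                im -= (rms[i] - mean) * Math.Sin(angle);
            }
            double power = re * re + im * im;
            if (power > bestPower) { bestPower = power; bestBpm = bpm; }
        }
        return bestBpm;
    }
}
```

With 5 ms blocks, `blocksPerSecond` would be 200. Several seconds of audio are needed before neighbouring tempos can be told apart.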

Thomas
  • 1. Compute the average (root-mean-square) loudness of the signal over blocks of, say, 5 milliseconds. (Having never done this before, I don't know what a good block size would be.) 2. Take the Fourier transform of the "blocked" signal, using the FFT algorithm. 3. Find the component in the transformed signal that has the largest magnitude. I can't really see why you take the RMS of the signal first, or why it is done on blocks (as the FFT usually works on windows of the signal anyway). Furthermore, the FFT would give you the signal shifted into an energy-time domain so that is really the same as the firs –  Sep 17 '08 at 09:11
1

First of all, what Hallgrim is producing is not the power spectral density function. Statistical periodicities in any signal can be brought out through an autocorrelation function. The Fourier transform of the autocorrelation signal is the power spectral density. Dominant peaks in the PSD other than at 0 Hz will correspond to the effective periodicity in the signal (in Hz)...
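One way to apply the autocorrelation idea is to search directly for the lag with the strongest autocorrelation, without going on to the PSD. The sketch below assumes the input is an intensity envelope (for example one value per 256 samples) and that the tempo lies in a caller-supplied range; `TempoEstimator.EstimateBpm` is a hypothetical name.

```csharp
using System;

static class TempoEstimator
{
    // envelope: one intensity value per hop; envelopeRate: envelope values per second.
    public static double EstimateBpm(double[] envelope, double envelopeRate,
                                     double minBpm = 60, double maxBpm = 180)
    {
        // Remove the mean so a constant offset does not dominate every lag.
        double mean = 0;
        foreach (double v in envelope) mean += v;
        mean /= envelope.Length;

        // Convert the tempo range into a range of lags (in envelope samples).
        int minLag = (int)Math.Ceiling(envelopeRate * 60.0 / maxBpm);
        int maxLag = (int)Math.Floor(envelopeRate * 60.0 / minBpm);
        maxLag = Math.Min(maxLag, envelope.Length - 1);

        int bestLag = -1;
        double bestScore = double.NegativeInfinity;
        for (int lag = minLag; lag <= maxLag; lag++)
        {
            double sum = 0;
            for (int i = 0; i + lag < envelope.Length; i++)
                sum += (envelope[i] - mean) * (envelope[i + lag] - mean);
            if (sum > bestScore) { bestScore = sum; bestLag = lag; }
        }
        if (bestLag < 1)
            throw new ArgumentException("Envelope too short for the requested tempo range.");
        return 60.0 * envelopeRate / bestLag;
    }
}
```

Autocorrelation also peaks at whole multiples of the beat period, so the estimate can come out at half the true tempo if the allowed range is wide; narrowing `minBpm` and `maxBpm` is one way to limit that.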

pete
0

I'd recommend checking out the BASS audio library and the BASS.NET wrapper. It has a built-in BPMCounter class.

Details for this specific function can be found at http://bass.radio42.com/help/html/0833aa5a-3be9-037c-66f2-9adfd42a8512.htm.

Matt Williams
0

The easy way to do it is to have the user tap a button in rhythm with the beat, then divide the number of taps by the elapsed time.
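A minimal sketch of that calculation, assuming the tap timestamps are already collected in seconds (`TapTempo.Bpm` is a hypothetical name). Note that N taps span N - 1 beat intervals, so the count is reduced by one before dividing.

```csharp
using System;
using System.Collections.Generic;

static class TapTempo
{
    // tapTimesSeconds: timestamp of each tap in seconds, in ascending order.
    public static double Bpm(IReadOnlyList<double> tapTimesSeconds)
    {
        if (tapTimesSeconds.Count < 2)
            throw new ArgumentException("At least two taps are needed.");
        double elapsed = tapTimesSeconds[tapTimesSeconds.Count - 1] - tapTimesSeconds[0];
        if (elapsed <= 0)
            throw new ArgumentException("Tap times must be increasing.");
        // N taps span N - 1 beat intervals.
        return (tapTimesSeconds.Count - 1) * 60.0 / elapsed;
    }
}
```

For example, five taps spaced half a second apart cover two seconds and four intervals, which works out to 120 BPM.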

  • Tapping really isn't a good solution for what I'm trying to use the tool for. I'd like it to be fully automatic, which I know is possible. – Karl Mar 23 '09 at 21:03