I'm studying support vector machines, and in the process I've run into Lagrange multipliers with multiple constraints and the Karush–Kuhn–Tucker conditions.

I've been trying to study the subject, but I still can't get a good enough grasp of it. On Wikipedia:

(http://en.wikipedia.org/wiki/Lagrange_multiplier#Multiple_constraints)

it says that in order to find the extremum points of a function $f$ (subject to constraints $g_1, \dots, g_m$), we must find a point $\text{x}$ and multipliers $\lambda_1, \dots, \lambda_m$ such that

$$\sum_{i=1}^{m}\lambda_{i}\nabla g_i(\text{x}) = \nabla f(\text{x})$$
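As a concrete check of what the condition says, here is a minimal sketch on a toy problem with two constraints. The function, the constraints, the candidate point, and all names below are illustrative assumptions, not taken from the Wikipedia article. The example only verifies the condition numerically at a known solution; it does not explain why the condition holds.

```python
# Toy problem (an assumption chosen for illustration):
#   minimise  f(x, y, z) = x^2 + y^2 + z^2
#   subject to g1(x, y, z) = x + y + z - 1 = 0
#              g2(x, y, z) = x - z - 1     = 0

def f(p):
    return sum(c * c for c in p)

def grad_f(p):
    return [2 * c for c in p]

def g1(p):
    return p[0] + p[1] + p[2] - 1

def g2(p):
    return p[0] - p[2] - 1

# Both constraints are linear, so their gradients are constant.
GRAD_G1 = [1.0, 1.0, 1.0]
GRAD_G2 = [1.0, 0.0, -1.0]

# Candidate minimiser and multipliers, worked out by hand for this toy problem.
x_star = [5 / 6, 1 / 3, -1 / 6]
lam1, lam2 = 2 / 3, 1.0

# Left-hand side of the condition: lam1 * grad g1 + lam2 * grad g2.
combo = [lam1 * a + lam2 * b for a, b in zip(GRAD_G1, GRAD_G2)]

# Largest componentwise gap between that combination and grad f(x_star).
residual = max(abs(c - d) for c, d in zip(combo, grad_f(x_star)))

# The only direction that keeps both constraints satisfied is orthogonal
# to both constraint gradients; here that is d = (1, -2, 1).
d = [1.0, -2.0, 1.0]

# Directional derivative of f along d at x_star (should vanish).
slope = sum(a * b for a, b in zip(grad_f(x_star), d))

# A nearby feasible point, obtained by moving a little along d.
nearby = [xi + 0.1 * di for xi, di in zip(x_star, d)]

print("constraint values:", g1(x_star), g2(x_star))
print("gradient residual:", residual)
print("slope along feasible direction:", slope)
print("f(x_star), f(nearby):", f(x_star), f(nearby))
```

In this sketch, $\nabla f$ at the candidate point is a combination of $\nabla g_1$ and $\nabla g_2$, and $f$ has zero slope along the one direction in which both constraints stay satisfied.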

I understand Lagrange multipliers when there is only one constraint, but the case of multiple constraints is hard to grasp for some reason... :(

Could anyone give me an easy-to-understand explanation of why the equation above holds?

Thank you for any guidance :)

P.S.

If it is not too big a job, I'd be very grateful if someone could also explain the Karush–Kuhn–Tucker conditions, which generalize my question :) That would be super!