Starting with your second question: what do we gain when the problem has only equality constraints, and if it is going to be solved by numerical methods in any case, why bother?

There are many cases where the problem under study is posed in abstract terms - no numbers in sight, just symbols for the variables, and yet more symbols for the coefficients/parameters. In such a case, the exact quantitative solution is not something we care about - what we do care about is to *characterize* the solution, i.e. to determine its qualitative properties (existence, uniqueness, robustness to perturbations, etc.). But even if we do seek the specific quantitative solution, numerical optimization does not characterize the solution - it just hands it to you (and why are we so certain that the algorithm didn't mess up, especially with ill-behaved objective functions?). Knowing only the solution itself leaves a lot of uncertainty. Numerical optimization will give you the coordinates and the height of the peak of the mountain - don't you think it is important to know how snowy and slippery the rocks that surround the peak are?
As for your first question: in the analytic treatment of the problem, it is usually an algebraic nightmare *not* to formulate it in Lagrange, KKT or (even better) Fritz John terms.
Consider $\max_{x,y} f(x,y) \; \text{s.t.}\; g(x)=h(y) $.
Without using the multiplier approach, you would have to find, for example, $x=g^{-1}(h(y))$ and then face $\max_{y}f\left[g^{-1}(h(y)),y\right]$ - and I *didn't* say that $x$ and $y$ are one-dimensional. In my experience, this is *not* an equally easy problem to work with.
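
For comparison, here is a sketch of the multiplier formulation of the same problem. It assumes (none of this is stated in the problem above) that $f$, $g$ and $h$ are differentiable, that the constraint $g(x)=h(y)$ is scalar-valued, and that the constraint gradient does not vanish at the optimum; $\lambda$ is the multiplier, a symbol introduced here.

$$\mathcal{L}(x,y,\lambda) = f(x,y) + \lambda\,\big[h(y) - g(x)\big]$$

Under those assumptions, the first-order conditions are

$$\nabla_x f(x,y) = \lambda\,\nabla g(x), \qquad \nabla_y f(x,y) = -\lambda\,\nabla h(y), \qquad g(x) = h(y).$$

No inverse of $g$ appears anywhere, and the conditions have the same form whatever the dimensions of $x$ and $y$.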