I think this question is more like a physics question. If you wanted the mathematics, it's already answered - you can square both sides and the formula would still be correct. But why don't we do that?

If you have a bunch of gas particles, you might want to describe their *average* speed with a single number. Well, maybe not really the average, but a speed that is characteristic of that particular gas in the state considered. How can we do that?

**Average velocity**. As you correctly mention, averaging over all velocities gives 0 in many cases. This is sometimes useful: for example, you can use $\bar{\boldsymbol{v}}=(0,0,0)$ as a boundary condition when deriving the velocity distribution.

Let's look at the distribution of how many molecules move with a certain speed. You can see that almost none of the particles have zero speed or velocity. You might conclude that the average velocity is not that great a metric for characterizing the state of a gas. If the gas fills a container, its average velocity doesn't care about the state, only about the movement of the container. So let's just average over the graph linked above, shall we?

- **Average speed**. The speed is the magnitude (absolute value) of the velocity. You could take the average (mean) speed of the gas molecules.
- **Most probable speed**. When you looked at the graph... you might actually want to use the speed where the peak (maximum) is. That would describe the curve well, wouldn't it?
- **Root mean square speed**. Let's take the square of the speed. Find the mean value. Take the root. Sounds like a nightmare, but this is actually the most useful metric. It is the speed that characterizes the energy of the gas.

The molecules of a gas move with different speeds. Each of the molecules has some kinetic energy. You could calculate the total or the average energy if you could measure each speed and do quite a lot of maths (not doable in a lifetime for a reasonable container of gas). However, if all of the gas molecules were moving with a certain speed that we call the **root mean square speed**, their total and average kinetic energy would be the same as it really is. As energy is usually what we care about the most in physics/chemistry, this is the speed that describes the speeds of a gas in a way that is useful for us.
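
To make that statement concrete, here is a short worked equation. It assumes $N$ molecules that all have the same mass $m$ and speeds $v_i$ (the symbols $N$, $m$, $v_i$ and $v_{\mathrm{rms}}$ are my notation, not something from the question):

$$
\langle E_k \rangle
= \frac{1}{N}\sum_{i=1}^{N} \frac{1}{2} m v_i^2
= \frac{1}{2} m \left( \frac{1}{N}\sum_{i=1}^{N} v_i^2 \right)
= \frac{1}{2} m v_{\mathrm{rms}}^2,
\qquad
v_{\mathrm{rms}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} v_i^2}
$$

So replacing every $v_i$ by $v_{\mathrm{rms}}$ leaves the average (and therefore the total) kinetic energy unchanged. If you used the mean speed instead, you would in general get a smaller energy, because the mean of the squares is never smaller than the square of the mean.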

If you care about the others:

- Average speed describes the momentum in a similar way (the total/average magnitude of momentum would stay the same if all the molecules moved with the mean speed).
- Average velocity describes the motion as a whole (the motion of the center of mass if all the particles have equal mass, and also the momentum of the system).
- Most probable speed says that more molecules have about that speed than any other speed. To be precise, you should choose an interval: let's say you count the molecules having *speed* +/- 10 m/s. Then the number of molecules with a speed in that interval will be the highest if the *speed* is the most probable speed. That is the best use for this number that I can come up with.
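
If you want to play with these four numbers yourself, here is a minimal Python sketch. It is only an illustration under stated assumptions: each velocity component is drawn from a Gaussian with the same spread `sigma` (which is what the Maxwell-Boltzmann distribution gives for an ideal gas), `sigma = 300.0` m/s is an arbitrary illustrative value, and the function names `sample_velocities` and `characteristic_speeds` are made up for this example. The most probable speed is estimated exactly as described above, by counting molecules in speed intervals and taking the centre of the fullest one.

```python
import math
import random


def sample_velocities(n, sigma, seed=0):
    """Draw n 3D velocities; every component is Gaussian with spread sigma (m/s).

    Assumption: this models an ideal gas at rest in equilibrium.
    """
    rng = random.Random(seed)
    return [tuple(rng.gauss(0.0, sigma) for _ in range(3)) for _ in range(n)]


def characteristic_speeds(velocities, bin_width=20.0):
    """Return the average velocity and the three characteristic speeds."""
    n = len(velocities)
    speeds = [math.sqrt(vx * vx + vy * vy + vz * vz) for vx, vy, vz in velocities]

    # Average velocity: component-wise mean of the vectors (close to zero here).
    mean_velocity = tuple(sum(v[i] for v in velocities) / n for i in range(3))

    # Average speed: mean of the magnitudes.
    mean_speed = sum(speeds) / n

    # Root mean square speed: square, take the mean, take the root.
    rms_speed = math.sqrt(sum(s * s for s in speeds) / n)

    # Most probable speed: centre of the speed interval holding the most molecules.
    counts = {}
    for s in speeds:
        bin_index = int(s // bin_width)
        counts[bin_index] = counts.get(bin_index, 0) + 1
    fullest_bin = max(counts, key=counts.get)
    most_probable_speed = (fullest_bin + 0.5) * bin_width

    return {
        "mean_velocity": mean_velocity,
        "mean_speed": mean_speed,
        "most_probable_speed": most_probable_speed,
        "rms_speed": rms_speed,
    }


if __name__ == "__main__":
    result = characteristic_speeds(sample_velocities(100_000, sigma=300.0))
    for name, value in result.items():
        print(name, value)
```

For a sample like this you should typically see the most probable speed below the mean speed, and the mean speed below the root mean square speed. The default `bin_width=20.0` corresponds to the +/- 10 m/s interval mentioned above; with a narrow interval the peak estimate is noisy, so widen it or use more molecules if it jumps around.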

Some stuff for further (yet introductory) reading on Wikipedia.