I suggest there is a fundamental reason we do exponentiation before multiplication before addition: we do the "most powerful" operation first. I don't have any clear evidence to cite, though in pre-SE days Dr Math agreed with me that this is the key.

To clarify what I mean by "powerful", the hyperoperation sequence is a sequence of arithmetic operations, starting with the most basic: finding the successor. For instance, the successor of 5 is 6. If I want to add 3 to 5, I find that 5's successor is 6, 6's successor is 7, and 7's successor is 8. In other words, adding 3 is just succession repeated (iterated) 3 times. So addition is the next operation in the sequence.

If I do iterated addition, I perform a multiplication (e.g. $3 \times 5 = 5+5+5$), so that's next on the list. And iterated multiplication is exponentiation (e.g. $5^3=5\times 5\times 5$). These are considered the "elementary operations". Of course the hyperoperation sequence doesn't stop there: iterated exponentiation is tetration, e.g. ${}^{3}2 = 2^{2^{2}}=2^4=16$ (I had to pick one with small numbers as they get very big very quickly!). Next come pentation (iterated tetration), hexation (iterated pentation)... Knuth invented a lovely system of up-arrow notation to represent these hugely powerful operations in a neat way. Learn it, and you can now win all those "who can write down the biggest number" games that kids play!
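To make "each operation is the previous one, iterated" concrete, here is a minimal Python sketch. The helper names (`iterate`, `add`, `mul`, `power`, `tetrate`) are my own illustrative choices, and it only handles non-negative integers. Note that each level needs its own starting value.

```python
def succ(n):
    """The most basic operation: find the successor."""
    return n + 1


def iterate(step, times, start):
    """Apply `step` to `start`, `times` times in a row."""
    value = start
    for _ in range(times):
        value = step(value)
    return value


def add(a, b):
    # b successions, starting from a
    return iterate(succ, b, a)


def mul(a, b):
    # add a on, b times, starting from 0
    return iterate(lambda x: add(a, x), b, 0)


def power(a, b):
    # multiply by a, b times, starting from 1
    return iterate(lambda x: mul(a, x), b, 1)


def tetrate(a, b):
    # raise a to the running value, b times, starting from 1
    return iterate(lambda x: power(a, x), b, 1)


print(add(5, 3), mul(5, 3), power(5, 3), tetrate(2, 3))
```

Everything here bottoms out in `succ`, so it is hopelessly slow for anything but small inputs; the point is only to show the layering.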

So what's my point? There really is a clear and well-defined sense in which addition, multiplication and exponentiation belong on a sequence of increasing power. Our order of operations is defined so that we do the most powerful first, unless parentheses tell us to do things differently. It makes intuitive sense to me that more powerful operations should have priority, although if on some other world the least powerful ones get done first, people used to that system may see it as intuitive too! At any rate, this seems more convincing to me than typesetting convention (a circular argument, since different orders of operation may have led to different notation?) or compliance with certain practical examples (e.g. those in which we multiply first and then add; there are equally abundant practical examples where we add first and then multiply, and so have to resort to parentheses to express the order correctly).

There are many other interesting ways to order arithmetic. For anybody who hasn't done so, have a play with Reverse Polish notation sometime! I found this really clarified the importance of order of operations for me (you need to think carefully about what to type in first), as well as "how a computer/calculator thinks".
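If you don't have an RPN calculator to hand, a toy evaluator is easy to sketch. This one is only an illustration under my own assumptions: tokens are separated by spaces, operands are integers, and the only operators are `+`, `*` and `^`. The function name `eval_rpn` is made up for this example.

```python
import operator

# Supported binary operators (an assumption of this sketch).
OPS = {"+": operator.add, "*": operator.mul, "^": operator.pow}


def eval_rpn(expression):
    """Evaluate a space-separated Reverse Polish expression of integers."""
    stack = []
    for token in expression.split():
        if token in OPS:
            # An operator consumes the two most recent values.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(int(token))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


print(eval_rpn("2 3 4 * +"))  # 2 + 3 * 4   -> 14
print(eval_rpn("2 3 + 4 *"))  # (2 + 3) * 4 -> 20
```

Notice that there are no precedence rules and no parentheses anywhere: the order in which you type things in carries all of that information.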

Final thought: *why* do I find it intuitive to do most powerful first, other than being accustomed to it? A consequence of my earlier answer is that the more powerful operations are defined by iteratively applying less powerful operations, so maybe "more to less" is more natural. Someone who tries to sort out the less powerful operations first, and "move up" to the more powerful operations, will still end up breaking down the higher operations back into lower ones. In that sense "less to more" doesn't work so well. In fact if you really want to, you can break all the operations down to the successor function $\operatorname{succ}(n)=n+1$, its inverse, the predecessor function $\operatorname{pred}(n)=n-1$, and $H_n(a,b)$ defined by:

$$H_n(a, b) =
\begin{cases}
\operatorname{succ}(b) & \text{if } n = 0 \\
a &\text{if } n = 1, b = 0 \\
0 &\text{if } n = 2, b = 0 \\
1 &\text{if } n \ge 3, b = 0 \\
H_{\operatorname{pred}(n)}(a, H_n(a, \operatorname{pred}(b))) & \text{otherwise}
\end{cases}$$
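The definition above can be transcribed almost line for line into Python. This is just a sketch for experimenting with small non-negative integers; the function name `H` and the argument order `(n, a, b)` are my own choices, mirroring $H_n(a,b)$.

```python
def succ(n):
    return n + 1


def pred(n):
    return n - 1


def H(n, a, b):
    """Hyperoperation H_n(a, b) for non-negative integers n, a, b."""
    if n == 0:
        return succ(b)
    if b == 0:
        # Base cases: the "starting value" depends on the level n.
        if n == 1:
            return a
        if n == 2:
            return 0
        return 1  # n >= 3
    # Otherwise: apply the operation one level down to a and
    # the result of the same operation with the counter b reduced by 1.
    return H(pred(n), a, H(n, a, pred(b)))


print(H(0, 7, 5), H(1, 5, 3), H(2, 5, 3), H(3, 5, 3), H(4, 2, 3))
```

Because everything is ultimately reduced to `succ` and `pred`, the recursion gets deep very quickly; anything much larger than these inputs is likely to hit Python's recursion limit or simply take too long.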

Setting $n=0,1,2,3,4,\ldots$ gives succession, addition, multiplication, exponentiation, tetration... the whole hyperoperation sequence! In particular $H_0(a, b) = \operatorname{succ}(b)$, $H_1(a, b) = a + b$, $H_2(a, b) = a \times b$, $H_3(a, b) = a^{b} = a\uparrow{b}$ (in Knuth's notation), $H_4(a, b)={}^{b}a=a\uparrow\uparrow{b}$ and so on.

Try expanding by hand some of the examples in my answer: $H_0(7,5)= \operatorname{succ}(5)$, $H_1(5,3)=5+3$, $H_2(5,3)=5\times 3$, $H_3(5,3)=5^3$, $H_4(2,3)=2\uparrow\uparrow{3}$. It's quite instructive (once you get on to $H_2$ it's less tedious if you use the fact that $H_1$ means addition, and so on for higher $n$): one gets to see how the operations all follow from iterating successor and predecessor, and in what sense each is an extension of the previous one in the sequence. You'll notice how $b$ acts as a "counter" that ticks down to zero, and understand why you need the $a$, 0 and 1 for the cases $n=1$, $n=2$, $n\ge 3$. (Roughly, it's to handle inconsistencies in the way each operation repeats the previous one. When I say $5+3$ is succession repeated 3 times, I start applying those successions to the 5. When I say $5 \times 3$ is adding 5 on, 3 times, it is being added on to zero. And $5^3$ is only "multiplying by 5, 3 times" if I start at 1! I have found these inconsistencies to be a source of confusion to high school students.)
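For anyone who wants to check their working, here is how the first two non-trivial expansions go, using nothing but the definition of $H_n$ (and, in the second, the fact that $H_1$ means addition):

$$\begin{aligned}
H_1(5,3) &= H_0(5, H_1(5,2)) = \operatorname{succ}(H_1(5,2)) \\
&= \operatorname{succ}(\operatorname{succ}(H_1(5,1))) \\
&= \operatorname{succ}(\operatorname{succ}(\operatorname{succ}(H_1(5,0)))) \\
&= \operatorname{succ}(\operatorname{succ}(\operatorname{succ}(5))) = 8, \\[1ex]
H_2(5,3) &= H_1(5, H_2(5,2)) = 5 + H_2(5,2) \\
&= 5 + (5 + H_2(5,1)) \\
&= 5 + (5 + (5 + H_2(5,0))) \\
&= 5 + (5 + (5 + 0)) = 15.
\end{aligned}$$

In the first expansion the counter $b$ ticks down from 3 to 0 and the recursion bottoms out at $H_1(5,0)=5$, so the successions are applied to the 5; in the second it bottoms out at $H_2(5,0)=0$, so the 5s are added on to zero.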