I've always been nagged by how non-obviously related the two standard definitions of conic sections are (i.e. it seems mysterious/magical that slices of a cone are somehow the same as zero sets of degree-2 equations in 2 variables). Recently I came across the following pages/videos:

- This 3B1B video about ellipses, which reignited my desire to understand conics
- Why are quadratic equations the same as right circular conic sections?, which offers a very computational approach to resolving this question
- Another 3B1B video on visualizing Pythagorean triples (i.e. finding the rational points of a circle)
- and Manjul Bhargava's lecture on the Birch-Swinnerton-Dyer Conjecture, where minutes ~10-15 discuss the complete solution to the problem of rational points on conics.

While the first 3B1B video makes a lot of sense and is very beautiful from a geometric standpoint, it does not talk about any of the other conics, or discuss the relationship with "degree 2". Moreover, the second 3B1B video and Bhargava's lecture both highlight "degree 2" as something we understand well, compared to higher degrees (which reminds me a little of Fermat's Last Theorem and the non-existence of solutions for $n>2$).

So, I suppose my questions are as follows:

- Why, from an intuitive standpoint, should we expect cones to be deeply related to zero-sets of degree 2 algebraic equations?

and more generally:

- Is there some deep reason why "$2$" is so special? I've often heard the quip that "mathematics is about turning confusing things into linear algebra" because linear algebra is "the only subject mathematicians completely understand"; but it seems we also understand a lot of nice things about quadratics -- we have the aforementioned relationship with cones, a complete understanding of rational points, and the Pythagorean theorem (oh! and I just thought of quadratic reciprocity). $2$ is also special in all sorts of algebraic contexts, as well as being the only possible degree of a nontrivial finite extension of $\mathbb R$, leading in particular to $\mathbb C$ being 2-dimensional.

It's also interesting to note that many equations in physics involve "$2$" (second derivatives, inverse-square laws), though that may be a stretch. I appreciate any ideas you share!
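To make the first question concrete, here is a small numerical sanity check (a sketch, with an arbitrarily chosen slicing plane of the form $z = mx + c$; vertical planes would need separate handling): substituting the plane into the cone $z^2 = x^2 + y^2$ leaves a degree-2 equation in $(x, y)$, and points satisfying that equation really do land on the cone.

```python
# Slicing the right circular cone z^2 = x^2 + y^2 with the plane z = m*x + c:
# substituting z gives (1 - m^2) x^2 - 2 m c x - c^2 + y^2 = 0, which is a
# degree-2 equation in (x, y).  Numerical spot-check with arbitrary m, c:
import math

m, c = 0.5, 2.0   # arbitrary slicing plane z = m*x + c

for x in [-1.0, 0.0, 1.5, 3.0]:
    rhs = c**2 + 2*m*c*x - (1 - m**2) * x**2   # y^2 solved from the conic equation
    if rhs < 0:
        continue                               # no real point of the curve at this x
    y = math.sqrt(rhs)
    z = m*x + c
    assert abs(z**2 - (x**2 + y**2)) < 1e-9    # the point (x, y, z) lies on the cone
```

Varying $m$ sweeps out the different conic types: $|m| < 1$ gives an ellipse (the $x^2$ and $y^2$ coefficients share a sign), $|m| = 1$ a parabola, and $|m| > 1$ a hyperbola.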

$$\rule{5cm}{0.4pt}$$

EDIT 3/12/21: I was just thinking about variances and least squares regression. "$2$" is extremely special in these areas: Why square the difference instead of taking the absolute value in standard deviation?, Why is it so cool to square numbers (in terms of finding the standard deviation)?, and the absolutely mind-blowing animation of a physical realization of PCA with Hooke's law: Making sense of principal component analysis, eigenvectors & eigenvalues.

In the links I just listed, it seems the most popular (but still, to me, not very satisfying) answer is that squaring is convenient (smooth, easy to minimize, variances add for independent r.v.'s, etc.), a fact that may be a symptom of a deeper connection with the Hilbert-space structure of $L^2$. There may also be something in how, since we are dealing with squares, the Pythagorean theorem gives us that minimizing reconstruction error is the same as maximizing projection variance in PCA. Honorable mentions to Qiaochu Yuan's answer about rotation invariance, and Aaron Meyerowitz's answer about the arithmetic mean being the unique minimizer of the sum of squared distances from a given point. As for the incredible alignment with our intuition in the animation with springs and Hooke's law linked above, I suppose I'll chalk that one up to coincidence, or some sort of SF ;)
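A tiny sketch of that last point (toy data and a grid search rather than calculus, so just an illustration): the sum of squared distances is minimized exactly at the arithmetic mean, while the sum of absolute distances is minimized by any median.

```python
# The arithmetic mean is the unique minimizer of the sum of squared distances;
# the sum of absolute distances is minimized by a median instead (not unique
# for an even number of points).  Toy data, brute-force grid search.
data = [1.0, 2.0, 4.0, 10.0]
mean = sum(data) / len(data)           # 4.25

sq_loss  = lambda c: sum((x - c)**2 for x in data)
abs_loss = lambda c: sum(abs(x - c) for x in data)

grid = [i / 100 for i in range(0, 1201)]   # candidate centers 0.00 .. 12.00
best_sq  = min(grid, key=sq_loss)
best_abs = min(grid, key=abs_loss)

print(best_sq)    # 4.25, the mean
print(best_abs)   # lies in [2, 4]: any median minimizes the absolute loss
```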

$$\rule{5cm}{0.4pt}$$

EDIT 2/11/22:
I was thinking about Hilbert spaces, and wondering again why they behave so nicely -- namely, they have the closest point lemma (leading to the orthogonal decomposition $\mathcal H = \mathcal M \oplus \mathcal M^\perp$ for closed subspaces $\mathcal M$) and orthonormal bases (leading to Parseval's identity, and convergence of a series of orthogonal elements if and only if the sum of the squared lengths converges) -- and I came to the conclusion that the key result each time seemed to be the **Pythagorean theorem** (e.g. the parallelogram law is an easy corollary of Pythagoras). That raises the question: why is the Pythagorean theorem so special? The article linked in the accepted answer of this question: What does the Pythagorean Theorem really prove? tells us essentially that the **Pythagorean theorem boils down to the fact that a right triangle can be subdivided into two triangles both similar to the original**.

The fact that this subdivision is reached by projecting the right-angle vertex onto the hypotenuse (projection being deeply related to inner products) is likely also significant... ahh, indeed by the "commutativity of projection" (symmetry of the inner product), projecting a leg onto the hypotenuse is the same as projecting the hypotenuse onto that leg, but by orthogonality of the legs, the projection of the hypotenuse onto a leg is simply the leg itself! The **square** comes from the fact that projection scales linearly in each vector, and there are **two** vectors involved in the operation of projection.
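For reference, here is the similar-triangles argument written out in symbols (standard notation, not taken from the linked article): drop the altitude from the right angle onto the hypotenuse $c$, splitting it into segments $p$ and $q$ adjacent to the legs $a$ and $b$ respectively; each sub-triangle is similar to the original, so

```latex
% Similar-triangle / projection proof of the Pythagorean theorem:
% the altitude splits the hypotenuse c into p (under leg a) and q (under leg b),
% and each sub-triangle is similar to the original triangle.
\begin{align*}
  \frac{p}{a} &= \frac{a}{c} \implies a^2 = pc, \\
  \frac{q}{b} &= \frac{b}{c} \implies b^2 = qc, \\
  a^2 + b^2 &= (p + q)\,c = c \cdot c = c^2.
\end{align*}
```

In inner-product language, $a^2 = pc$ is exactly $\langle \vec a, \vec a \rangle = \langle \vec a, \vec c \,\rangle$ (with both vectors based at the vertex where $a$ meets $c$): the "projecting the hypotenuse onto the leg" step described above.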

I suppose this sort of "algebraic understanding" of the projection explains the importance of "2" better than the geometry does, since knowing only about the "self-similarity of the subdivisions" of the right triangle, one has to wonder why, say, tetrahedra or other shapes in other dimensions don't have this self-similarity property. However, it is still not clear to me why **projection seems to be so fundamentally "2-dimensional"**. Perhaps 1-dimensionally there is the "objective" perception of a vector, 2-dimensionally there is the "subjective" perception of one vector in the eyes of another, and there's just no good 3-dimensional analogue for 3 vectors?

There might also be some connection between the importance of projection and the importance of the Riesz representation theorem (all linear "projections" onto a 1-dimensional subspace, i.e. linear functionals, are actually literal projections against a vector in the space).

$$\rule{5cm}{0.4pt}$$

EDIT 2/18/22: touching again on the degree-2 Diophantine equations mentioned above, a classical example is the number of ways $r_n(k)$ to write $k$ as a sum of $n$ squares. There are a number of nice results here, the most famous being Fermat's two-square theorem and Jacobi's four-square theorem. A key step in the proof of the latter is the Poisson summation formula applied to the Euler/Jacobi theta function $\theta(\tau) := \sum_{n=-\infty}^\infty e^{i \pi n^2 \tau}$, which depends on / is heavily related to the fact that the Gaussian is stable under the Fourier transform. I still don't understand intuitively why this is the case (see Intuitively, why is the Gaussian the Fourier transform of itself?), but there seems to be some relation to Hölder conjugates and $L^p$ spaces (or, in the Gaussian case, to $L^2$), since those show up in generalizations of the Hardy uncertainty principle (**"completing the square", again an algebraic nicety of squares**, is used in the proof of Hardy's theorem, and the Hölder conjugates may have to do with the inequality $-x^p + xu \leq u^q$ -- Problem 4.1 in Stein and Shakarchi's Complex Analysis, where the LHS basically comes from computing the Fourier transform of $e^{-x^p}$). Of course, why the Gaussian itself appears everywhere is another question altogether: https://mathoverflow.net/questions/40268/why-is-the-gaussian-so-pervasive-in-mathematics.
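The Gaussian's stability under the Fourier transform is easy to check numerically (a crude sketch, not a proof): with the convention $\hat f(\xi) = \int f(x)\, e^{-2\pi i x \xi}\, dx$, the normalized Gaussian $f(x) = e^{-\pi x^2}$ satisfies $\hat f(\xi) = e^{-\pi \xi^2}$.

```python
# Numerical spot-check that e^{-pi x^2} is its own Fourier transform under
# the convention  \hat f(xi) = \int f(x) e^{-2 pi i x xi} dx.
# Plain Riemann sum on [-10, 10]; the Gaussian tails beyond that are negligible.
import cmath
import math

def ft_gaussian(xi, n=20001, half_width=10.0):
    dx = 2 * half_width / (n - 1)
    total = 0j
    for k in range(n):
        x = -half_width + k * dx
        total += math.exp(-math.pi * x * x) * cmath.exp(-2j * math.pi * x * xi) * dx
    return total

for xi in [0.0, 0.5, 1.3]:
    approx = ft_gaussian(xi)
    exact = math.exp(-math.pi * xi * xi)     # the claimed transform e^{-pi xi^2}
    assert abs(approx - exact) < 1e-6
```

The $\xi = 0$ case is just $\int_{\mathbb R} e^{-\pi x^2}\, dx = 1$, which after rescaling is the familiar $\int_{\mathbb R} e^{-x^2}\, dx = \sqrt\pi$.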

This (squares leading to a decent theory of $r_n(k)$, and squares leading to nice properties of the Gaussian) is probably also connected to the fact that $\int_{\mathbb R} e^{-x^2} \, dx$ has a nice explicit value, namely $\sqrt \pi$. I tried seeing if there was a connection between this appearance of $\pi$ and the value of $\pi$ one gets from calculating the area of a circle "shell-by-shell", $\frac 1{N} \sum_{k=0}^N r_2(k) \to \pi$, but I couldn't find anything: Gaussian integral using Euler/Jacobi theta function and $r_2(k)$ (number of representations as sum of 2 squares).
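The shell-by-shell limit is quick to check numerically (a sketch; the normalization is by $N$ because $\sum_{k=0}^{N} r_2(k)$ counts the lattice points in a disk of radius $\sqrt N$, whose area is $\pi N$ -- this is Gauss's circle problem):

```python
# sum_{k=0}^{N} r_2(k) = #{(x, y) in Z^2 : x^2 + y^2 <= N}, the number of
# lattice points in the disk of radius sqrt(N), so (1/N) * sum -> pi.
import math

N = 10_000
r = math.isqrt(N)
total = 0
for x in range(-r, r + 1):
    total += 2 * math.isqrt(N - x * x) + 1   # count of y with y^2 <= N - x^2

print(total / N)   # close to pi
assert abs(total / N - math.pi) < 0.05
```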