I also found this highly confounding when first starting continuum mechanics (CM). I think it is easiest to understand tensors using the "classical" approach from old Riemannian geometry, which is closer to how they are used in CM.
Let me just clear a few things up first.

The CM tensors are treated sort of like a special case of the tensors from classical physics and Riemannian geometry.

Note: in other fields, especially computer science, a tensor is defined as a multidimensional array of numbers; but in physics and mathematics, these are usually arrays of *functions* that change over space (or spacetime). These are more accurately called tensor fields (a generalization of vector fields), but let's just call them tensors.

**First fact**: although you can always write a tensor as an array of numbers, the actual numbers present depend on the coordinate system you choose. This is quite similar to matrices, of course. However, there are certain *invariants* of tensor fields (e.g. the trace of the stress tensor) that do not change as the coordinate system changes.
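To make this concrete, here is a small numerical sketch (the stress components are made up for illustration): the components change under a rotation of the coordinate frame, but the trace does not.

```python
import numpy as np

# A hypothetical symmetric stress tensor in some Cartesian frame (made-up numbers).
sigma = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 0.5],
                  [0.0, 0.5, 1.0]])

# An orthogonal change of coordinates: rotation about the z-axis.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

# The components change under the coordinate change...
sigma_rot = Q @ sigma @ Q.T
assert not np.allclose(sigma_rot, sigma)

# ...but the trace (an invariant) does not.
assert np.isclose(np.trace(sigma_rot), np.trace(sigma))
```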

**Second fact**: you can index into the arrays in two different ways, covariant (lower indices) or contravariant (upper indices). E.g. $R_{ijk}^\ell$ is type-(1,3), meaning 1 contravariant index and 3 covariant indices. The stress tensor is type-(0,2), for instance, although it should probably really be considered type-(1,1). You can raise or lower indices with the metric tensor and its inverse, but this is not so important in CM (more on this below).

**Definition**: how do we know a given array of functions defines a tensor? The classic definition: given a tensor $T_{\ell_1,\ldots,\ell_n}^{i_1,\ldots,i_m}$ and two coordinate systems $x^i$ and $y^i$, with Jacobian $J_{j}^i=\partial y^i/\partial x^j$, its components must transform as
$$
\hat{T}_{\ell_1',\ldots,\ell_n'}^{i_1',\ldots,i_m'} =
J_{i_1}^{i_1'}\ldots J_{i_m}^{i_m'}
{T}_{\ell_1,\ldots,\ell_n}^{i_1,\ldots,i_m}
(J^{-1})_{\ell_1'}^{\ell_1}\ldots (J^{-1})_{\ell_n'}^{\ell_n}
$$
using the Einstein summation convention (where seeing the same index twice, one lower and one upper, means summing over all possible values of those indices) to transform from $x$ coordinates to $y$ coordinates.

So, here the *defining property of a tensor is how it transforms when the coordinates change*. Note that the Jacobian $J$ is *not* a tensor by this definition, even though we write it like one!
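For a type-(1,1) tensor the transformation law above reduces to a similarity transform, which is easy to check numerically. Here is a sketch with an arbitrary (invertible, hypothetical) linear change of coordinates, verifying that the trace survives the transformation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Components of a hypothetical type-(1,1) tensor T^i_j in x-coordinates.
T = rng.normal(size=(3, 3))

# Jacobian J^{i'}_j of some invertible linear change of coordinates
# (shifted away from zero to guarantee invertibility).
J = rng.normal(size=(3, 3)) + 4.0 * np.eye(3)
J_inv = np.linalg.inv(J)

# Transformation law for a (1,1) tensor:
# T'^{i'}_{j'} = J^{i'}_i T^i_j (J^{-1})^j_{j'}
T_new = np.einsum('ai,ij,jb->ab', J, T, J_inv)

# The contraction T^i_i (the trace) is a scalar invariant:
assert np.isclose(np.trace(T_new), np.trace(T))
```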

**Intuition**: a tensor represents an entity that captures information that is a *geometric invariant*, i.e. does not change as the coordinate system changes. For instance, the Jacobian obviously depends on the coordinate system, so it is not a tensor. But the stress or strain tensors can be used to measure or compute physical quantities that will remain the same across coordinate systems (e.g. the strain tensor can measure changes in length, which is a scalar invariant).

Let's go back to CM now. Usually, in CM, we tend to *ignore* the type of the tensor. Why? Because normally one changes the index types of a tensor using the metric tensor $g_{ab}$ and its inverse $g^{\alpha\beta}$, e.g. $\sigma^i_j=\sigma_{jk}g^{ki}$. But often in CM it's assumed that $g=I$ (i.e. $g_{ij}=\delta_{ij}$). So even though the stress tensor is written $\sigma_{ij}$, it is actually often used as $\sigma_i^j$, e.g. to take its trace. This is a source of great confusion.
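The point about $g=I$ can be illustrated numerically (made-up components, hypothetical metric): with the Euclidean metric, raising an index leaves the array of numbers unchanged, which is exactly why CM can get away with ignoring index types; with a non-trivial metric it does not.

```python
import numpy as np

# sigma_{ij}: a hypothetical symmetric stress tensor (made-up values).
sigma_lower = np.array([[2.0, 1.0, 0.0],
                        [1.0, 3.0, 0.5],
                        [0.0, 0.5, 1.0]])

# With the Euclidean metric g = I, raising an index changes nothing:
# sigma^i_j = sigma_{jk} g^{ki}
g_inv = np.eye(3)
sigma_mixed = np.einsum('jk,ki->ij', sigma_lower, g_inv)
assert np.allclose(sigma_mixed, sigma_lower)

# With a non-trivial (hypothetical, e.g. curvilinear) metric, the
# type-(1,1) components genuinely differ from the type-(0,2) ones:
g_inv_curvi = np.diag([1.0, 0.25, 1.0])
sigma_mixed_curvi = np.einsum('jk,ki->ij', sigma_lower, g_inv_curvi)
assert not np.allclose(sigma_mixed_curvi, sigma_lower)
```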

In general, one can just assume that *most* of the CM tensors are type-(1,1), and in $\mathbb{R}^3$, so they can be written as $3\times 3$ arrays.
In these cases, they can be treated as linear maps OR as bilinear forms (under the assumptions about index movement above)!
One exception is the fourth-order elasticity tensor.

(And, yes, this is indeed horrifying and confusing from the notational point of view)

Ok, now for your actual questions.

> What's the difference between a linear transformation and a tensor? Somehow they can both be represented by a 3×3 matrix, but they do different things when acting on a vector? Like the columns of a 3×3 matrix of a linear transformation tell you where the basis vectors end up, but the same columns of a tensor don't represent basis vectors at all?

Some tensors are linear transformations, i.e. can be written as matrices in the classical way. As noted above, this is often the case for most CM tensors. For instance, the Cauchy stress tensor $\sigma(x)$ takes in a unit normal direction $n(x)$ at a specific point $x$, and outputs a new vector $T(x)=\sigma(x)n(x)$, which is the traction (using matrix notation rather than tensor indices). So, here, indeed, it is a linear transformation.
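A quick sketch of this "stress as a linear map" picture, with made-up stress components: the traction on a surface element is just a matrix-vector product.

```python
import numpy as np

# Hypothetical Cauchy stress at a point (made-up values, pressure units).
sigma = np.array([[10.0,  2.0, 0.0],
                  [ 2.0,  5.0, 1.0],
                  [ 0.0,  1.0, 3.0]])

# Unit normal of a surface element through that point.
n = np.array([1.0, 0.0, 0.0])

# The stress tensor acts on the normal as a linear map, giving the traction.
t = sigma @ n

# For n = e_1, the traction is just the first column of sigma.
assert np.allclose(t, [10.0, 2.0, 0.0])
```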

> Furthermore, a linear transformation transforms all of space but a tensor is defined at every point in space? Does a tensor act on vectors the same way as linear transformations do?

As mentioned above, CM usually refers to tensor fields, which you can think of as matrix fields (given that you specify the coordinate system). The second question is harder, but for CM specifically, generally, yes.

> What is the difference between a tensor product, dyadic product, and outer product and why are engineering tensors like the Cauchy stress built from the tensor product of two vectors (i.e. traction vector and normal vector)?

For two vectors, they are the same operation; the different names just come from different fields.

As for the second part, I'm not sure what you mean. One often writes $T^{(n)}_j := \sigma_{ij} n_i$ (in the unfortunate CM index notation) to relate $\sigma$ to the traction; this is not a tensor product. Also, I could be wrong, but if the stress tensor were always a tensor product of two vectors, then it would always have rank 1 and hence $\det(\sigma)=0$, which is not true (indeed the third invariant is $I_3(\sigma)=\det(\sigma)$)!
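The rank-1 point is easy to verify numerically (both vectors are made up): the outer product of any two vectors gives a 3×3 array, but its matrix rank is 1 and its determinant vanishes.

```python
import numpy as np

t = np.array([1.0, 2.0, 3.0])   # some traction-like vector (made up)
n = np.array([0.0, 1.0, 0.0])   # some normal-like vector (made up)

# The outer (dyadic/tensor) product of two vectors is a 3x3 array...
D = np.outer(t, n)

# ...but it always has matrix rank 1, hence zero determinant.
assert np.linalg.matrix_rank(D) == 1
assert np.isclose(np.linalg.det(D), 0.0)
```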

> Is it true that scalars and vectors are just 0th order and 1st order tensors, respectively? How are all these things related to each other?

Yes, scalars are order $0$ and vectors are type-$(1,0)$.
Intuitively, note that a scalar is automatically a geometric invariant (e.g. surface area), and hence a tensor.

(Aside that is not necessary for CM: note that this index notation makes understanding *covectors* easier. If $v^i$ is a vector field, then $s=v^ia_i$ is a scalar [remember the Einstein convention!]. So any type-$(0,1)$ tensor [i.e. covector] can be thought of as a way to map a vector $v$ to a scalar $s$ using an array of numbers $a_j$)
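In coordinates, that covector-acting-on-a-vector contraction is nothing more than a dot product of component arrays (made-up components below):

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])   # components v^i of a vector (made up)
a = np.array([0.5, 0.0, 1.0])   # components a_i of a covector (made up)

# The Einstein-summed contraction s = v^i a_i is just a dot product:
s = np.einsum('i,i->', v, a)
assert np.isclose(s, 3.5)
```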

> What topics and/or subtopics of linear algebra are essential to grasp the essence of tensors in the context of physics and engineering? Are they really just objects that act on vectors to produce other vectors (or numbers) or are they something more?

Tensors belong to an area called multilinear algebra, which might be worth studying. However, for CM, this is unnecessary. My suggestion is to look at "classical" (i.e. index-notation) tensor theory and Riemannian geometry, e.g. Lovelock and Rund's *Tensors, Differential Forms, and Variational Principles*.
On the other hand, CM has frankly developed its own confusing notation that doesn't match any other field (that I know of), and it uses tensors in its own way, ignoring the nuances of other areas of physics, such as contravariant vs. covariant indices or covariant derivatives (which matter where curvilinear coordinates are common), as well as the more powerful abstractions now used in mathematics. So perhaps just read the continuum mechanics literature instead.

Whew, well, normally I don't answer when there are plenty of other answers present, but here I felt you might like an answer focusing a bit more on CM and its idiosyncrasies :)