**Addition**: in a new answer I have briefly continued the explanation below to define general tensors and not just differential forms (covariant tensors).

Many of us got very confused with the notions of tensors in differential geometry not because of its algebraic structure or definition, but because of confusing old notation. The motivation for the notation $dx_i$ is perfectly justified once one is introduced to exterior differentiation of differential forms, which are just antisymmetric multilinear covectors (i.e. take a number of vectors and give a number, changing sign if we reorder its input vectors). The confusion in your case is higher because your are using $x_i$ for your basis of vectors, and $x_i$ are actually your local coordinate functions.

**CONSTRUCTION & NOTATION:**

Let $V$ be a finite vector space over a field $\mathbb{k}$ with basis $\vec{e}_1,...,\vec{e}_n$, let $V^*:=\operatorname{Hom}(V, \mathbb{k})$ be its dual vector space, i.e. the linear space formed by linear functionals $\tilde\omega: V\rightarrow\mathbb{k}$ which eat vectors and give scalars. Now the tensor product $\bigotimes_{i=1}^p V^*=:(V^*)^{\otimes p}$ is just the vector space of multilinear functionals $\tilde\omega^{(p)}:V^k\rightarrow\mathbb{k}$, e.g. $\tilde\omega (\vec v_1,...,\vec v_k)$ is a scalar and linear in its input vectors. By alternating those spaces, one gets the multilinear functionals mentioned before, which satisfy $\omega (\vec v_1,...,\vec v_k)=\operatorname{sgn}(\pi)\cdot \omega (\vec v_{\pi(1)},...,\vec v_{\pi(k)})$ for any permutation of the $k$ entries. The easiest example is to think in row vectors and matrices: if your vectors are columns, think of covectors as row vectors which by matrix product give you a scalar (actually its typical scalar product!), they are called one-forms; similarly any matrix multiplied by a column vector on the right and by a row vector on the left gives you a scalar. An anti-symmetric matrix would work similarly but with the alternating property, and it is called a two-form. This generalizes for inputs of any $k$ vectors.

Now interestingly, there is only a finite number of those alternating spaces: $$\mathbb{k}\cong V^0, V^*, (V^*)^{\wedge 2}:=V^*\wedge V^*,..., (V^*)^{\wedge \dim V}.$$
If you consider your $V$-basis to be $\vec{e}_1,...,\vec{e}_n$, by construction its dual space of covectors has a basis of the same dimension given by the linear forms $\tilde e^1,...,\tilde e^n$ which satisfy $\tilde e^i (\vec e_k)=\delta^i_k$, that is, they are the functionals that give $1$ only over its dual vectors and $0$ on the rest (covectors are usually indexed by superindices to use the Einstein summation convention). That is why any covector acting on a vector $\vec v=\sum_i v^i\vec e_i$ can be written $$\tilde\omega =\sum_{i=1}^n \omega_i\tilde e^i\Rightarrow \tilde\omega(\vec v) =\sum_{i=1}^n \omega_i\tilde e^i(\vec v)=\sum_{i=1}^n \omega_i\cdot v^i.$$

Finally, when you are working in differential manifolds, you endow each point $p$ with a tangent space $T_p M$ of tangent vectors and so you get a dual space $T_p^* M$ of covectors and its alternating space $\Omega_p^1(M)$ of alternating 1-forms (which in this case coincide). Analogously you define your $k$-forms on tangent vectors defining $\Omega_p^k(M)$. Since these spaces depend from point to point, you start talking about *fields* of tangent vectors and *fields* of differential $k$-forms, which vary in components (with respect to your chosen basis field) from point to point.

Now the interesting fact is that the constructive intrinsic definition of tangent vectors is based on directional derivatives at every point. Think of your manifold $M$ of dimension $n$ as given locally charted around a point $p$ by coordinates: $x^1,...,x^n$. If you have a curve $\gamma:[0,1]\subset\mathbb R\rightarrow M$ over your manifold passing through your point $p$ of interest, you can intrinsically define tangent vectors at $p$ by looking at the directional derivatives at $p$ of any smooth function $f:M\rightarrow\mathbb R$ which are given by the rate of change of $f$ over any curve passing through $p$. That is, in local coordinates your directional derivative at point $p$ along (in the tangent direction of) a curve $\gamma$ passing through it ($\gamma(t_0)=p$) is given by (using the chain rule for differentiation with the local coordinates for $\gamma (t)$): $$\frac{d f(\gamma(t))}{dt}\bigg|_{t_0}= \sum_{i=1}^n\frac{\partial f}{\partial x^i}\bigg|_p\frac{d x^i(\gamma (t))}{d t}\bigg|_{t_0}=\sum_{i=1}^n X^i\vert_p\frac{\partial f}{\partial x^i}\bigg|_p.$$
This is how that equation must be understood: at every point $p$, different curves have different "tangent vectors" $X$ with local components $X^i\vert_p\in\mathbb R$ given by the parametric differentiation of the curve's local equations (so actually you only have to care about equivalence classes of curves having the same direction at each point); the directional derivative at any point of any function in direction $X$ is thus expressible as the operator: $$X=\sum_{i=1}^n X^i\frac{\partial}{\partial x^i},$$ for all the possible components $X^i$ such that are smooth scalar fields over the manifold. In this way you have attached a tangent space $T_pM\cong\mathbb R^n$ at every point with canonical basis $\vec e_i=\frac{\partial}{\partial x^i}=:\partial_i$ for every local chart of coordinates $x^i$. This seems to be far from the visual "arrow"-like notion of vectors and tangent vectors as seen in surfaces and submanifolds of $\mathbb R^n$, but it can proved to be equivalent to the definition of the geometric tangent space by immersing your manifold in $\mathbb R^n$ (which can always be done by Whitney's embedding theorem) and restricting your "arrow"-space at every point to the "arrow"-subspace of vectors tangent to $M$ as submanifold, in the same way that you can think of the tangent planes of a surface in space. Besides this is confirmed by the transformation of components, if two charts $x$ and $y$ contain point $p$, then their coordinate vectors transform with the Jacobian of the coordinate transformation $x\mapsto y$: $$\frac{\partial}{\partial y^i}=\sum_{j=1}^n\frac{\partial x^j}{\partial y^i}\frac{\partial}{\partial x^j},$$ which gives the old-fashioned transformation rule for vectores (and tensors) as defined in theoretical physics. Now it is an easy exercise to see that, if a change of basis in $V$ from $\vec e_i$ to $\vec e'_j$ is given by the invertible matrix $\Lambda^i_{j'}$ (which is always the case), then the corresponding dual basis are related by the inverse transformation $\Lambda^{j'}_i:=(\Lambda^i_{j'})^{-1}$:
$$\vec e'_j = \sum_{i=1}^n \Lambda^i_{j'}\,\vec e_i\Rightarrow \tilde e'^j = \sum_{i=1}^n (\Lambda^i_{j'})^{-1}\,\tilde e^i=:\sum_{i=1}^n \Lambda^{j'}_i\,\tilde e^i,\;\text{ where }\sum_i \Lambda^{k'}_i\Lambda^i_{j'}=\delta^{k'}_{j'}.$$
Thus, in our manifold, the cotangent space is defined to be $T_p^*M$ with coordinate basis given by the dual functionals $\omega_{x}^i (\partial/\partial x^j)=\delta^i_j$, $\omega_{y}^i (\partial/\partial y^j)=\delta^i_j$. By the transformation law of tangent vectors and the discussion above on the dual transformation, we must have that tangent covectors transform with the inverse of the previously mentioned Jacobian:
$$\omega_y^i=\sum_{j=1}^n\frac{\partial y^i}{\partial x^j}\omega_x^j.$$
But this is precisely the transformation rule for differentials by the chain rule!
$$dy^i=\sum_{j=1}^n\frac{\partial y^i}{\partial x^j}dx^j.$$
Therefore is conventional to use the notation $\partial/\partial x^i$ and $dx^i$ for the tangent vector and covector coordinate basis in a chart $x:M\rightarrow\mathbb R^n$.

Now, from the previous point of view $dx^i$ are regarded as the classical differentials, just with the new perspective of being the functional duals of the differential operators $\partial/\partial x^i$. To make more sense of this, one has to turn to differential forms, which are our $k$-forms on $M$. A 1-form $\omega\in T_p^*M=\Omega_p^1(M)$ is then just
$$\omega = \sum_{i=1}^n \omega_i\, dx^i,$$
with $\omega_i(x)$ varying with $p$, smooth scalar fields over the manifold. It is standard to consider $\Omega^0(M)$ the space of smooth scalar fields. After defining wedge alternating products we get the $k$-form basis $dx^i\wedge dx^j\in\Omega^2(M), dx^i\wedge dx^j\wedge dx^k\in\Omega^3(M),..., dx^1\wedge\cdots\wedge dx^n\in\Omega^n(M)$, (in fact not all combinations of indices for each order are independent because of the antisymmetry, so the bases have fewer elements than the set of products). All these "cotensor" linear spaces are nicely put together into a ring with that wedge $\wedge$ alternating product, and a nice differentiation operation can be defined for such objects: the exterior derivative $\mathbf d:\Omega^k(M)\rightarrow\Omega^{k+1}(M)$ given by $$\mathbf d\omega^{(k)}:=\sum_{i_0<...<i_k}\frac{\partial\omega_{i_1,...,i_k}}{\partial x^{i_0}}dx^{i_0}\wedge dx^{i_1}\wedge\cdots\wedge dx^{i_k}.$$
This differential operation is a generalization of, and reduces to, the usual vector calculus operators $\operatorname{grad},\,\operatorname{curl},\,\operatorname{div}$ in $\mathbb R^n$. Thus, we can apply $\mathbf d$ to smooth scalar fields $f\in\Omega^0(M)$ so that $\mathbf df=\sum_i\partial_i f dx^i\in\Omega^1(M)$ (note that not all covectors $\omega^{(1)}$ come from some $\mathbf d f$, a necessary and sufficient condition for this "exactness of forms" is that $\partial_{i}\omega_j=\partial_{j}\omega_i$ for any pair of components; this is the beginning of de Rham's cohomology). In particular the coordinate functions $x^i$ of any chart satisfy $$\mathbf d x^i=\sum_{j=1}^n\frac{\partial x^i}{\partial x^j}dx^j=\sum_{j=1}^n\delta^i_jdx^j= dx^i.$$

THIS final equality establishes the correspondence between the infinitesimal differentials $dx^i$ and the exterior derivatives $\mathbf dx^i$. Since we could have written any other symbol for the basis of $\Omega_p^1(M)$, it is clear that $\mathbf dx^i$ are a basis for the dual tangent space. In practice, notation is reduced to $\mathbf d=d$ as we are working with isomorphic objects at the level of their linear algebra. All this is the reason why $\mathbf dx^i(\partial_j)=\delta^i_j$, since once proved that the $\mathbf dx^i$ form a basis of $T_p^*M$ it is straightforward without resorting to the component transformation laws. On the other hand, one could start after the definition of tangent spaces as above, with coordinate basis $\partial/\partial x^i$ and dualize to its covector basis $\tilde e^i$ such that $\tilde e^i(\partial_j)=\delta^i_j$; after that, one defines wedge products and exterior derivatives as usual from the cotangent spaces; then it is a theorem that for any tangent vector field $X$ and function $f$ on $M$ $$X(f)=\sum_{i=1}^n X^i\frac{\partial f}{\partial x^i}=\mathbf d f(X),$$
so in particular we get as a corollary that our original dual coordinate basis is in fact the exterior differential of the coordinate functions: $$(\vec e_i)^*=\tilde e^i=\mathbf d x^i\in\Omega^1(M):=T^*(M).$$
(That is true at any point for any coordinate basis from the possible charts, so it is true as covector fields over the manifold). In particular the evaluation of covectors $\mathbf d x^i$ on infinitesimal vectors $\Delta x^j\partial_j$ is $\mathbf d x^i(\Delta x^j\partial_j)=\sum_{j=1}^n\delta^i_j\Delta x^j$, so when $\Delta x^i\rightarrow 0$ we can see the infinitesimal differentials as the evaluations of infinitesimal vectors by coordinate covectors.

**MEANING, USAGE & APPLICATIONS:**

Covectors are the essential structure for differential forms in differential topology/geometry and most of the important developments in those fields are formulated or use them in one way or another. They are central ingredients to major topics such as:

Linear dependence of vectors, determinants and hyper-volumes, orientation, integration of forms (Stokes' theorem generalizing the fundamental theorem of calculus), singular homology and de Rham cohomology groups (Poincaré's lemma, de Rham's theorem, Euler characteristics and applications as invariants for algebraic topology), exterior covariant derivative, connection and curvature of vector and principal bundles, characteristic classes, Laplacian operators, harmonic functions & Hodge decomposition theorem, index theorems (Chern-Gauß-Bonnet, Hirzebruch-Riemann-Roch, Atiyah-Singer...) and close relation to modern topological invariants (Donaldson-Thomas, Gromow-Witten, Seiberg-Witten, Soliton equations...).

In particular, from a mathematical physics point of view, differential forms (covectors and tensors in general) are fundamental entities for the formulation of physical theories in geometric terms. For example, Maxwell's equations of electromagnetism are just two equations of differential forms in Minkowski's spacetime:
$$\mathbf d F=0,\;\;\star\mathbf d\star F=J,$$
where $F$ is a $2$-form (an antysymmetric 4x4 matrix) whose components are the electric and magnetivc vector field components $E_i,\,B_j$, $J$ is a spacetime vector of charge-current densities and $\star$ is the Hodge star operator (which depends on the metric of the manifold so it requires additional structure!). In fact the first equation is just the differential-geometric formulation of the classical Gauß' law for the magnetic field and the Faraday's induction law. The other equation is the dynamic Gauß' law for the electric field and Ampère's circuit law. The continuity law of conservation of charge becomes just $\mathbf d\star J=0$. Also, by Poincaré's lemma, $F$ can always be solved as $F=\mathbf d A$ with $A$ a covector in spacetime called the electromagnetic potential (in fact, in electrical engineering and telecommunications they always solve the equations by these potentials); since exterior differentiation satisfies $\mathbf{d\circ d}=0$, the potentials are underdetermined by $A'=A+\mathbf d\phi$, which is precisely the simplest "gauge invariance" which surrounds the whole field of physics. In theoretical physics the electromagnetic force comes from a $U(1)$ principal bundle over the spacetime; generalizing this for Lie groups $SU(2)$ and $SU(3)$ one arrives at the mathematical models for the weak and strong nuclear forces (electroweak theory and chromodynamics). The Faraday $2$-form $F$ is actually the local curvature of the gauge connection for those fiber bundles, and similarly for the other forces. This is the framework of Gauge Theory working in arbitrary manifolds for your spacetime. The only other force remaining is gravity, and again general relativity can be written in the Cartan formalism as a curvature of a Lorentz spin connection over a $SO(1,3)$ gauge bundle, or equivalently as the (Riemannian) curvature of the tangent space bundle.