For questions about Fisher information as it appears in mathematical statistics.

The Fisher information is a way of measuring the amount of information that an observable random variable $X$ carries about an unknown parameter $\theta$ upon which the probability distribution of $X$ depends. Let $f(X; \theta)$ be the probability density function (or probability mass function) for $X$ conditional on the value of $\theta$. This is also the likelihood function for $\theta$: it describes the probability of observing a given sample $X$ for a known value of $\theta$. If $f$ is sharply peaked with respect to changes in $\theta$, it is easy to infer the “correct” value of $\theta$ from the data, or equivalently, the data $X$ provides a lot of information about the parameter $\theta$. If the likelihood $f$ is flat and spread out, then it would take many samples like $X$ to estimate the actual “true” value of $\theta$ that would be obtained using the entire population being sampled. This suggests studying some kind of variance with respect to $\theta$.

$$I(\theta) = \operatorname{E}\left[\left(\frac{\partial}{\partial \theta} \log f(X; \theta)\right)^2 \,\middle|\, \theta\right]$$
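As a concrete illustration of this definition, here is a minimal Monte Carlo sketch (names and setup are my own, not from the text above) that estimates the expected squared score for a Bernoulli$(p)$ model, where the analytic Fisher information is $I(p) = 1/\bigl(p(1-p)\bigr)$:

```python
import numpy as np

def score(x, p):
    # Score function: d/dp log f(x; p) for f(x; p) = p^x (1-p)^(1-x).
    return x / p - (1 - x) / (1 - p)

rng = np.random.default_rng(0)
p = 0.3  # illustrative "true" parameter value
samples = rng.binomial(1, p, size=200_000)

# Fisher information as defined above: the expected squared score,
# evaluated at the true parameter.
I_hat = np.mean(score(samples, p) ** 2)
I_true = 1.0 / (p * (1 - p))
print(I_hat, I_true)
```

With this many samples the Monte Carlo estimate lands close to the analytic value $1/0.21 \approx 4.76$, matching the intuition that a sharply peaked likelihood (here, $p$ near 0 or 1) yields large Fisher information.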