73

How can I compute the standard deviation incrementally (using only the new value and the previously computed mean and/or standard deviation)?

For the non-incremental way, I just do something like: $$S_N=\sqrt{\frac1N\sum_{i=1}^N(x_i-\overline{x})^2}.$$

mean = Mean(list)
stdev = 0
for i = 0 to list.size - 1
   stdev = stdev + (list[i] - mean)^2
stdev = sqrRoot( stdev / list.size )
Andrew Chin
shn
  • The formula for this is a special case of the sample variance decomposition formula given in [O'Neill (2014)](https://www.tandfonline.com/doi/abs/10.1080/00031305.2014.966589) (Result 1). – Ben Jan 31 '19 at 03:54

8 Answers

63

I think the easiest way to do this is with an orthogonality trick. I'll show how to incrementally compute the variance instead of the standard deviation. Let $X_1, X_2, ...$ be an iid sequence of random variables with $\bar X = n^{-1} \sum_{j = 1} ^ n X_j$ and $s^2_n$ defined similarly as the $n$th sample variance (I use a denominator of $n-1$ instead of the $n$ in your formula to keep things unbiased for the variance, but you can use the same argument, just adding $1$ to all the terms with $n$ in them). First write $$ s^2_n = \frac{\sum_{j = 1} ^ n (X_j - \bar X_n)^2}{n - 1} = \frac{\sum_{j = 1} ^ n (X_j - \bar X_{n - 1} + \bar X_{n - 1} - \bar X_n)^2}{n - 1}. $$ Expand this to get $$ s^2_n = \frac{(n - 2)s^2_{n - 1} + (n - 1) (\bar X_{n - 1} - \bar X_n)^2 + 2 \sum_{j = 1} ^ {n - 1} (X_j - \bar X_{n - 1})(\bar X_{n - 1} - \bar X_n) + (X_n - \bar X_{n})^2}{n - 1} $$ and it is easy to show that the summation term above is equal to $0$, which gives $$ s^2_n = \frac{(n - 2)s^2_{n - 1} + (n - 1)(\bar X_{n - 1} - \bar X_n)^2 + (X_n - \bar X_{n})^2}{n - 1}. $$

EDIT: I assumed you already have an incremental expression for the sample mean. It is much easier to get that: $\bar X_n = n^{-1}[X_n + (n-1)\bar X_{n-1}]$.
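A minimal sketch of these two recurrences in Python (the function name and loop scaffolding are my own):

```python
import math

def update_mean_var(n, mean_prev, var_prev, x_new):
    """One incremental step: given the count n (including x_new), the previous
    mean and the previous sample variance (denominator n - 1), return the
    updated mean and sample variance."""
    mean = (x_new + (n - 1) * mean_prev) / n  # mean recurrence from the EDIT
    if n < 2:
        return mean, 0.0                      # sample variance needs n >= 2
    var = ((n - 2) * var_prev
           + (n - 1) * (mean_prev - mean) ** 2
           + (x_new - mean) ** 2) / (n - 1)
    return mean, var

# Feed values one at a time; the result matches the batch formula.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, var = 0.0, 0.0
for i, x in enumerate(data, start=1):
    mean, var = update_mean_var(i, mean, var, x)
print(mean, math.sqrt(var))  # mean and sample standard deviation
```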

guy
  • 33
    Nice answer. Note that your formula can be simplified to $$s_n^2={n-2\over n-1}s^2_{n-1}+{1\over n}(X_n-\bar X_{n-1})^2.$$ –  Jan 27 '12 at 18:49
  • 1
Thanks guy for the great explanation. It is exactly what I was looking for, but can you please elaborate on how the summation term equals zero, as I seem to be missing a trick... –  Jan 07 '13 at 18:09
  • 1
@user55490: $$ \sum_{j = 1} ^ {n - 1} (X_j - \bar X_{n - 1}) = \sum_{j = 1} ^ {n - 1} X_j - (n - 1) \bar X_{n - 1} = (n - 1) (\bar X_{n - 1} - \bar X_{n - 1}) = 0$$ – Jean Hominal Oct 26 '15 at 14:24
  • 2
    See also [this wikipedia post](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm) – Geoffrey De Smet Feb 01 '17 at 09:35
  • 6
Please note that the simplified formula of @user940 can only be used if we are computing the sample variance. If we want to use a custom DDOF (Delta Degrees Of Freedom), the formula becomes: $$s^2_n = \frac{(n - 1 - d)s^2_{n - 1} + (n - 1)(\bar X_{n - 1} - \bar X_n)^2 + (X_n - \bar X_{n})^2}{n - d}$$ where $d$ is the DDOF; usually $d = 0$ for population variance and $d = 1$ for sample variance. – user3019105 Feb 24 '18 at 18:41
  • In your first $s_n^2$, shouldn't the term be $X_j - \bar{X}_{n- 1} + \bar{X}_{n-1} - \bar{X}_n$? It isn't clear to me why you would be able to add $\bar{X}_n$ – Benjamin Jan 17 '19 at 13:07
  • @chandresh I reverted your edit. You did find a typo in the first line, but the final formula is correct, because the error did not propagate to the other lines (i.e., every line except for the first was correct). For future reference, it is a good idea to check the formula numerically before just throwing in an edit, otherwise you risk changing a correct answer to an incorrect one. I double-checked numerically that my original answer gives the correct answer on simulated data, whereas your version does not. – guy Oct 08 '19 at 19:14
@guy, do you know at which step it went wrong? If you look at the second term in the second equation (where you say "expand"), the sum runs over $j=1,\ldots,n$ and you are multiplying by $(n-1)$, which IMO should be $n$ since the other quantities are constant. Can you explain this? FYI, see my derivation [link](https://drive.google.com/open?id=1-ccUKJdHZGXb3p0RAiwEX6aa1pChBcbm) – CKM Oct 08 '19 at 20:05
  • @chandresh it is not wrong. As I said, **I checked it numerically**. I don't have time to go through your algebra and find your mistake, but you can play around with my formula, the formula in the first comment, and your formula in matlab for a few minutes. Our formulas work, yours does not. – guy Oct 08 '19 at 21:35
@guy. I cross-verified your formula and mine given in the [link](https://drive.google.com/file/d/1-ccUKJdHZGXb3p0RAiwEX6aa1pChBcbm/view). Both are CORRECT. The one I wrote in the comment previously mistakenly used the biased version of the variance in the equation. As a result, yours and mine went through different derivations. Cheers! – CKM Oct 09 '19 at 10:06
With DDOF $=d$, the base case is when the sample size is $n=d+1$; you need to calculate $s_{d+1}^2$ directly as the base case to avoid having your calculations ruined by a denominator of $n-d=0$. A simplification for when $d=1$: the base case is $s_{d+1}^2=s_{2}^2=\frac{(X_n - \bar X_{n-1})^2}{2}$. – El8dN8 Oct 02 '20 at 03:20
15

The standard deviation is a function of the totals $T_{\alpha}=\sum_{i=1}^{N}x_{i}^{\alpha}$ for $\alpha=0,1,2$, each of which can be calculated incrementally in an obvious way. In particular, $E[X]=T_{1}/T_{0}$ and $E[X^2]=T_{2}/T_{0}$, and the standard deviation is $$ \sigma = \sqrt{\text{Var}[X]} = \sqrt{E[X^2]-E[X]^2} = \frac{1}{T_0}\sqrt{T_{0}T_{2}-T_{1}^2}. $$ By maintaining totals of higher powers ($T_{\alpha}$ for $\alpha \ge 3$), you can derive similar "incremental" expressions for the skewness, kurtosis, and so on.
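A sketch of this in Python, keeping only the three running totals (the class name is my own; this computes the population standard deviation, per the formula above):

```python
import math

class RunningTotals:
    """Population mean/std from the incremental totals T0, T1, T2."""
    def __init__(self):
        self.t0 = 0    # count: sum of x**0
        self.t1 = 0.0  # sum of x**1
        self.t2 = 0.0  # sum of x**2

    def add(self, x):
        self.t0 += 1
        self.t1 += x
        self.t2 += x * x

    def mean(self):
        return self.t1 / self.t0

    def std(self):
        # population standard deviation: sqrt(T0*T2 - T1^2) / T0
        return math.sqrt(self.t0 * self.t2 - self.t1 ** 2) / self.t0

r = RunningTotals()
for x in [1.0, 2.0, 3.0]:
    r.add(x)
print(r.mean(), r.std())  # 2.0 and sqrt(2/3) ≈ 0.8165, as in the comments below
```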

mjqxxxx
  • 4
    I think the @user wanted a formula that lets you incrementally calculate the standard deviation as you increase the data set in a way that is computationally efficient, not this. – guy Jan 27 '12 at 17:52
  • 3
This is actually computationally efficient, because you just need to count the number of samples, the sum of the samples, and the sum of the squares of each sample. Once you need to "read" the standard deviation (or average), you calculate the square root, which costs two multiplies, one subtraction, one square root, and one divide. On most modern CPUs, this is highly efficient (and very cache friendly). – Jon Watte Aug 01 '14 at 03:39
  • 1
Furthermore, this method makes it easy to compute a running standard deviation as well, which is what I was looking for: in addition to adding the new value to the sums of values and squares, remove the oldest value from them as well. Sorry for the late comment, but I found this page after googling this issue and I hope it will let future people find this more easily. – Quentin Feb 11 '16 at 16:31
I have tried to implement this in R. However, I seem to be doing something wrong as I don't get correct standard deviation values. E.g. for the three x values 1, 2, and 3 the standard deviation should be 1. T0 = 3 (1^0 + 2^0 + 3^0), T1 = 6 (1^1 + 2^1 + 3^1), T2 = 14 (1^2 + 2^2 + 3^2). The final standard deviation value according to the formula above would be 0.8165, which is different from 1. Could you tell me where I am making a mistake? – Phil Oct 18 '18 at 16:00
  • 2
    @Phil: the formula I gave, $\sqrt{E[X^2]-E[X]^2}$, is the population standard deviation. Remember that the sample standard deviation differs from this by a factor of $\sqrt{n/(n-1)}$. To get the sample standard deviation in your case you would multiply by $\sqrt{3/2}$, giving the value of $1$ that you were expecting. – mjqxxxx Oct 22 '18 at 18:20
  • @mjqxxxx is there a name for this $T_\alpha$ notation you use? – toriningen May 24 '20 at 17:23
Never mind, I have found it: $\mu'_\alpha = \frac{T_\alpha}{T_0}$, where $\mu'_\alpha$ is the $\alpha$-th non-central moment. The $\frac{\sqrt{{T_0}{T_2} - {T_1}^2}}{T_0}$ part is actually just $\sqrt{\mu'_2-{\mu'_1}^2} = \sqrt{\kappa_2}$, where $\kappa_\alpha$ is the $\alpha$-th cumulant. – toriningen May 25 '20 at 01:48
10

What you refer to as an incremental computation is very close to the computer scientist's notion of an online algorithm. There is in fact a well-known online algorithm for computing the variance (and thus the standard deviation) of a sequence of data, documented here.
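For concreteness, the online algorithm described there (Welford's) can be sketched in Python as follows:

```python
import math

def welford(xs):
    """Welford's online algorithm: one pass, numerically stable."""
    n, mean, m2 = 0, 0.0, 0.0     # m2 accumulates the sum of squared deviations
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # uses the deviation from the old AND new mean
    return mean, m2 / (n - 1)     # sample variance; use m2 / n for population

mean, var = welford([1.0, 2.0, 3.0, 4.0])
print(mean, math.sqrt(var))
```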

obataku
  • 1
    Just for the record, the algorithm in the linked wikipedia entry was implemented [in an answer two years later](https://math.stackexchange.com/a/2356070/356647). – Lee David Chung Lin Feb 14 '19 at 14:28
9

Another way of saying the above is that you need to keep a count, an incremental sum of values, and an incremental sum of squares.
Let $N$ be the count of values seen so far, $S = \sum_{1,N} x_i$ and $Q = \sum_{1,N} x_i^2$ (where both $S$ and $Q$ are maintained incrementally). Then at any stage,
the mean is $\frac{S}{N}$
and the variance is $\frac{Q}{N} - \left( \frac{S}{N}\right)^2$
An advantage of this method is that it needs only a single pass and constant storage; a caveat is that the final subtraction can suffer from catastrophic cancellation (and hence large rounding error) when the variance is small relative to the mean.
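Because $N$, $S$, and $Q$ are plain sums, a value can be removed just as easily as added, which gives, for example, a sliding-window standard deviation (a sketch; the class name is my own):

```python
import math
from collections import deque

class WindowStats:
    """Mean/std over the last `size` values, via running N, S, Q."""
    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.s = 0.0  # running sum of values
        self.q = 0.0  # running sum of squares

    def add(self, x):
        self.window.append(x)
        self.s += x
        self.q += x * x
        if len(self.window) > self.size:   # evict the oldest value
            old = self.window.popleft()
            self.s -= old
            self.q -= old * old

    def mean(self):
        return self.s / len(self.window)

    def std(self):
        n = len(self.window)
        return math.sqrt(self.q / n - (self.s / n) ** 2)  # population std

w = WindowStats(size=3)
for x in [10.0, 20.0, 30.0, 40.0]:
    w.add(x)
print(w.mean(), w.std())  # stats over the window [20, 30, 40]
```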

Philip Kilby
This is the best way to achieve the goal: a single-pass calculation with a fixed amount of storage required. But is it mathematically valid? – Edward Ross Aug 25 '16 at 13:12
indeed (let $X=S/N$ be the mean): $s^2=\frac1N\sum_{i=1}^N (x_i-X)^2 = \frac1N\sum_{i=1}^N (x_i^2 - 2x_iX+X^2)=\frac1N\sum_{i=1}^N x_i^2-\frac{2X}{N}\sum_{i=1}^N x_i+\frac1N\sum_{i=1}^N X^2=Q/N-2X^2+X^2=Q/N-X^2=Q/N-(S/N)^2$ – dennyrolling Apr 26 '17 at 20:52
Programmatically, this is by far the most preferable way, and it's incredibly easy to apply the same logic to remove a value from the set (subtract from S, Q, and N appropriately) for a stable increment and decrement, which allows expanding, contracting, or maintaining some fixed window over which to calculate the mean and variance in a stream. – Brendano257 Apr 08 '22 at 00:33
9

Forgive my poor math background; what I needed was the detail!

I have added my full derivation here for someone like me.

$$ s_n^2=\frac {\sum_{i=1}^{n}(x_i-\bar{x}_n)^2}{n-1} \\ = \frac {\sum_{i=1}^n(x_i - \bar{x}_{n-1} + \bar{x}_{n-1} - \bar{x}_n)^2}{n-1} \\ = \frac {\sum_{i=1}^{n}(x_i - \bar{x}_{n-1})^2 + 2\sum_{i=1}^n(x_i - \bar{x}_{n-1})(\bar{x}_{n-1} - \bar{x}_n) + \sum_{i=1}^n(\bar{x}_{n-1} - \bar{x}_n)^2} {n-1} \\ = \frac {(\sum_{i=1}^{n-1}(x_i - \bar{x}_{n-1})^2 + (x_n - \bar{x}_{n-1})^2) + (2\sum_{i=1}^{n-1}(x_i - \bar{x}_{n-1})(\bar{x}_{n-1} - \bar{x}_n) + 2(x_n - \bar{x}_{n-1})(\bar{x}_{n-1} - \bar{x}_n)) + \sum_{i=1}^n(\bar{x}_{n-1} - \bar{x}_n)^2} {n-1} \\ = \frac {(n-2)s_{n-1}^2 + (x_n - \bar{x}_{n-1})^2 + 0 + 2(x_n - \bar{x}_{n-1})(\bar{x}_{n-1} - \bar{x}_n) + n(\bar{x}_{n-1} - \bar{x}_n)^2} {n-1} \\ = \frac {(n-2)s_{n-1}^2 + x_n^2 - 2x_n\bar{x}_{n-1} + \bar{x}_{n-1}^2 + 2x_n \bar{x}_{n-1} - 2x_n \bar{x}_n - 2\bar{x}_{n-1}^2 + 2\bar{x}_{n-1}\bar{x}_n + n\bar{x}_{n-1}^2 - 2n\bar{x}_{n-1}\bar{x}_n + n\bar{x}_n^2} {n-1} \\ = \frac {(n-2)s_{n-1}^2 + x_n^2 + \bar{x}_{n-1}^2 - 2x_n \bar{x}_n - 2\bar{x}_{n-1}^2 + 2\bar{x}_{n-1}\bar{x}_n + n\bar{x}_{n-1}^2 - 2n\bar{x}_{n-1}\bar{x}_n + n\bar{x}_n^2} {n-1} \\ = \frac {(n-2)s_{n-1}^2 + (x_n^2 - 2x_n\bar{x}_n + \bar{x}_n^2) + (n-1)(\bar{x}_{n-1}^2 - 2\bar{x}_{n-1}\bar{x}_n + \bar{x}_n^2)} {n-1} \\ = \frac {(n-2)s_{n-1}^2 + (n-1)(\bar{x}_{n-1} - \bar{x}_n)^2 + (x_n - \bar{x}_n)^2} {n-1} \\ = \frac {n-2}{n-1}s_{n-1}^2 + (\bar{x}_{n-1} - \bar{x}_n)^2 + \frac {(x_n - \bar{x}_n)^2}{n-1} $$

and

$$ (\bar{x}_{n-1} - \bar{x}_n)^2 + \frac {(x_n - \bar{x}_n)^2}{n-1} \\ = (\bar{x}_{n-1} - \frac {x_n + (n-1)\bar{x}_{n-1}}{n})^2 + \frac {(x_n - \frac {x_n + (n-1)\bar{x}_{n-1}}{n})^2}{n-1} \\ = \frac {1}{n} (x_n - \bar{x}_{n-1})^2 $$

so

$$ s_n^2 = \frac{n-2}{n-1}s_{n-1}^2 + \frac{1}{n}(x_n - \bar{x}_{n-1})^2 $$
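A quick numerical check of this recurrence against Python's statistics module (a sketch; the function name is my own):

```python
import statistics

def incremental_sample_var(xs):
    """s_n^2 = (n-2)/(n-1) * s_{n-1}^2 + (x_n - mean_{n-1})^2 / n."""
    mean, var = xs[0], 0.0
    for n, x in enumerate(xs[1:], start=2):
        var = (n - 2) / (n - 1) * var + (x - mean) ** 2 / n  # uses old mean
        mean = (x + (n - 1) * mean) / n                      # then update mean
    return var

data = [3.1, 4.1, 5.9, 2.6, 5.3, 5.8]
assert abs(incremental_sample_var(data) - statistics.variance(data)) < 1e-9
print(incremental_sample_var(data))
```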

secfree
  • Awesome! Thanks! – user3019105 Feb 24 '18 at 18:33
@secfree. I am just wondering whether the step from line 2 to line 3 is correct, since $(a+b)^2=a^2+2ab+b^2$ does not seem to be followed. – CKM Oct 08 '19 at 14:04
Surprisingly, this can be further extended to $(n-1)s_n^2 = (n-2)s_{n-1}^2 + (x_n - \bar{x}_{n-1})(x_n - \bar{x}_n)$, which is a "no division by $n$" version of this formula. – heiner Jun 30 '20 at 21:30
5

In case someone has to "decrement" and not only "increment" the standard deviation $\sigma$ (for example, when a result $x_i$ in the set is incorrect and needs to be removed or recalculated), you can use this formula:

$ \sigma_{\text{without } x_i} = \sqrt{\frac{n}{n -1} \left[ \sigma_n^2 - \frac{(\bar{x}_n-x_i)^2}{n-1} \right]} $

Here is the derivation:

\begin{equation} \label{varianceDecrementale} \begin{split} \sigma^2_{\text{without } x_i} & = \sum_{j \ne i} \frac{x_j^2}{n-1} - \bar{x}_{n \text{ without } x_i}^2 \\ & = \frac{x_1^2 + \cdots + x_{i - 1}^2 + x_{i + 1}^2 + \cdots + x_n^2}{n - 1} - \bar{x}_{n \text{ without } x_i}^2 \\ & = \frac{n}{n -1} \left[ \frac{x_1^2 + \cdots + x_{i - 1}^2 + x_{i}^2 + x_{i + 1}^2 + \cdots + x_n^2}{n} - \frac{x_{i}^2}{n}\right] - \bar{x}_{n \text{ without } x_i}^2 \\ & = \frac{n}{n -1} \left[ \sum_{j=1}^n \frac{x_j^2}{n} - \frac{x_{i}^2}{n} \right] - \bar{x}_{n \text{ without } x_i}^2 \\ & = \frac{n}{n -1} \left[ \sum_{j=1}^n \frac{x_j^2}{n} - \frac{x_{i}^2}{n} \right] - \frac{n^2}{(n-1)^2} \left[ \bar{x}_n - \frac{x_i}{n} \right]^2 \\ & = \frac{n}{n -1} \left[ \sum_{j=1}^n \frac{x_j^2}{n} - \frac{x_{i}^2}{n} - \frac{n}{n-1} \left[ \bar{x}_n - \frac{x_i}{n} \right]^2 \right] \\ & = \frac{n}{n -1} \left[ \sum_{j=1}^n \frac{x_j^2}{n} - \frac{x_{i}^2}{n} - \frac{n}{n-1} \left[ \bar{x}_n^2 - 2 \bar{x}_n \frac{x_i}{n} + \frac{x_i^2}{n^2} \right] \right] \\ & = \frac{n}{n -1} \left[ \sum_{j=1}^n \frac{x_j^2}{n} - \frac{x_{i}^2}{n} - \frac{n}{n-1} \bar{x}_n^2 + 2 \bar{x}_n \frac{x_i}{n - 1} - \frac{x_i^2}{n (n - 1)} \right] \end{split} \end{equation}

(using the mean without $x_i$: $\bar{x}_{n \text{ without } x_i} = \frac{n\bar{x}_n - x_i}{n-1} = \frac{n}{n-1}\left[\bar{x}_n - \frac{x_i}{n}\right]$).

We can remark that

\begin{equation} \begin{split} -\frac{n}{n-1} \bar{x}_n^2 & = -\frac{n}{n-1} \bar{x}_n^2 + \bar{x}_n^2 - \bar{x}_n^2 \\ & = -\bar{x}_n^2 - \frac{n}{n-1} \bar{x}_n^2 + \frac{n - 1}{n - 1} \bar{x}_n^2 \\ & = -\bar{x}_n^2 + \frac{-n \bar{x}_n^2 + n \bar{x}_n^2 - \bar{x}_n^2}{n-1} \\ & = - \left (\bar{x}_n^2 + \frac{\bar{x}_n^2}{n-1}\right) \end{split} \end{equation}

thus

\begin{equation} \begin{split} \sigma^2_{\text{without } x_i} & = \frac{n}{n -1} \left[ \sum_{j=1}^n \frac{x_j^2}{n} - \frac{x_{i}^2}{n} - \frac{n}{n-1} \bar{x}_n^2 + 2 \bar{x}_n \frac{x_i}{n - 1} - \frac{x_i^2}{n (n - 1)} \right] \\ & = \frac{n}{n -1} \left[ \sum_{j=1}^n \frac{x_j^2}{n} - \frac{x_{i}^2}{n} - \left (\bar{x}_n^2 + \frac{\bar{x}_n^2}{n-1}\right) + 2 \bar{x}_n \frac{x_i}{n - 1} - \frac{x_i^2}{n (n - 1)} \right]\\ & = \frac{n}{n -1} \left[ \sum_{j=1}^n \frac{x_j^2}{n} - \bar{x}_n^2 - \frac{x_{i}^2}{n} - \frac{\bar{x}_n^2}{n-1} + 2 \bar{x}_n \frac{x_i}{n - 1} - \frac{x_i^2}{n (n - 1)} \right]\\ & = \frac{n}{n -1} \left[ \sigma^2_n - \frac{x_{i}^2}{n} - \frac{\bar{x}_n^2}{n-1} + 2 \bar{x}_n \frac{x_i}{n - 1} - \frac{x_i^2}{n (n - 1)} \right] \\ & = \frac{n}{n -1} \left[ \sigma^2_n - \frac{x_{i}^2(n-1)}{n(n-1)} - \frac{x_i^2}{n(n-1)} - \frac{\bar{x}_n^2}{n-1} + 2 \bar{x}_n \frac{x_i}{n - 1} \right] \\ & = \frac{n}{n -1} \left[ \sigma^2_n - \frac{x_i^2}{n-1} - \frac{\bar{x}_n^2}{n-1} + 2\bar{x}_n\frac{x_i}{n-1} \right]\\ & = \frac{n}{n -1} \left[ \sigma^2_n - \frac{(\bar{x}_n-x_i)^2}{n-1} \right] \end{split} \end{equation}
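A numerical sanity check of the decrement formula (a sketch in Python; the function name is my own, and the population standard deviation is used, as in the derivation):

```python
import math
import statistics

def std_without(n, mean_n, pstd_n, x_i):
    """Remove x_i from a set of n values, given the population std of all n."""
    var = (n / (n - 1)) * (pstd_n ** 2 - (mean_n - x_i) ** 2 / (n - 1))
    return math.sqrt(var)

data = [4.0, 7.0, 13.0, 16.0]
n = len(data)
full_mean = statistics.fmean(data)
full_pstd = statistics.pstdev(data)
# Removing data[2] should match the population std of the remaining values.
print(std_without(n, full_mean, full_pstd, data[2]))
print(statistics.pstdev(data[:2] + data[3:]))
```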

3

If it is of any help, I found what seems to be a much nicer way here: [Wikipedia - online algorithm](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm).

The following is how I have implemented it in Java. It has worked very well for me. Hope it is clear enough; please feel free to ask me questions about it.

/**
 * standardDeviation() - designed to calculate the standard deviation of a data set incrementally, using the last entered value and the previously recorded sum of differences to the mean.
 * (i.e. upon adding a value to the data set, this function should immediately be called)
 * 
 * NOTE: do not call this function if the data set size is less than 2, since the standard deviation cannot be calculated on a single value
 * NOTE: sum_avg, sum_sd and avg are all static variables
 * NOTE: on attempting to use this on another set following previous use, the static values will have to be reset
 * 
 * @param vector - List<Double> - data with only one additional value from previous method call
 * @return updated value for the standard deviation
 */
public static double standardDeviation(List<Double> vector)
{   
    double N = (double) vector.size();                  //size of the data set
    double oldavg = avg;                                //save the old average
    avg = updateAverage(vector);                        //update the new average

    if(N==2.0)                                          //if there are only two, we calculate the standard deviation using the standard formula
    {
        for(double d:vector)                            //cycle through the 2 elements of the data set
        {
            sum_sd += Math.pow(d - avg, 2);             //sum the squared deviations from the mean (taking Math.abs(d) here would break for negative data)
        }
    }
    else if(N>2)                                        //once we have calculated the base sum_std  
    {   
        double newel = (vector.get(vector.size()-1));   //get the latest addition to the data set

        sum_sd = sum_sd + (newel - oldavg)*(newel-avg); //https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm
    }
    return Math.sqrt((sum_sd)/(N));                     //N or N-1 depends on your choice of sample of population standard deviation

}

/**
 * simplistic method for incrementally calculating the mean of a data set
 * 
 * @param vector - List<Double> - data with only one additional value from previous method call
 * @return updated value for the mean of the given data set
 */
public static double updateAverage(List<Double> vector)
{
    if(vector.size()==2){
        sum_avg = vector.get(vector.size()-1) + vector.get(vector.size()-2);
    }
    else{
        sum_avg += vector.get(vector.size()-1);
    }


    return sum_avg/(double)vector.size();

}
  • This points to exactly the same source as [the answer by @oldrinb](https://math.stackexchange.com/a/1379813/356647) two years earlier. The good thing here is that this is self-contained, while the older one is just a link (with a keyword reference). – Lee David Chung Lin Feb 14 '19 at 14:25
1

I think it can be done in a simpler form

$$s_n^2 = \frac{\sum_{i=1}^n(x_i - \bar{x}_n)^2}{n-1} = \frac{\sum_{i=1}^n(x_i^2-2x_i\bar{x}_n + \bar{x}_n^2)}{n-1} $$

Expanding the numerator:

$$\sum_{i=1}^n(x_i^2-2x_i\bar{x}_n + \bar{x}_n^2) = \sum_{i=1}^nx_i^2 - \sum_{i=1}^n(2x_i\frac{\sum_{j=1}^nx_j}{n}) + \sum_{i=1}^n(\frac{\sum_{j=1}^nx_j}{n})^2 = \sum_{i=1}^nx_i^2 - 2(\sum_{i=1}^nx_i)(\sum_{j=1}^n\frac{x_j}{n}) + n(\sum_{j=1}^n\frac{x_j}{n})^2 = \sum_{i=1}^nx_i^2 - \frac{2}{n}(\sum_{i=1}^nx_i)^2 + \frac{1}{n}(\sum_{j=1}^nx_j)^2 = \sum_{i=1}^nx_i^2 - \frac{1}{n}(\sum_{i=1}^nx_i)^2$$

so that

$$s_n^2 = \frac{\sum_{i=1}^n(x_i - \bar{x}_n)^2}{n-1} = \frac{\sum_{i=1}^nx_i^2 - \frac{1}{n}(\sum_{i=1}^nx_i)^2}{n-1}$$

Here is an example of how it can be implemented in Python:

from math import sqrt
x = int(input())
n = 0
total = 0      # running sum of values (renamed so it doesn't shadow the builtin sum)
total_sq = 0   # running sum of squares
while x != 0: # or whenever you want to stop
    n += 1
    total += x
    total_sq += x**2
    x = int(input())
print(sqrt((total_sq - total**2 / n)/(n-1)))
biunovich