1

I have a dataset that is has a number of groups.

I have to calculate two things:

The average value of some data per group - which I have done with the following code:

var results = from res in dt.AsEnumerable()
                      group res by res.Field<string>(key)
                          into grp
                          orderby Convert.ToInt32(grp.Key)
                          select new
                          {
                              date = grp.Key,
                              sum = grp.Average(r => Convert.ToDouble(r.Field<string>  (average)))

                          };

I also need to calculate the standard deviation of the values for each group, so for each individual group, I need to create an array or list of the individual results per group. How could I do this with LINQ?

Thanks.

EDIT: I appreciate my question my be a little vague; I know how to calculate the standard deviation, I would like to know how to extract the list of values for each group(as above) to allow the calculation to be performed. Thanks.

Darren Young
  • 10,381
  • 34
  • 88
  • 148

2 Answers2

1

Here's a SO question/answer on Calculating Standard Deviation: Standard Deviation in LINQ

For the list of items just do items = grp.ToList() or grp.Select(c => c.Field...).ToList()

var results = from res in dt.AsEnumerable()
              group res by res.Field<string>(key)
              into grp
              orderby Convert.ToInt32(grp.Key)
              select new
                  {
                     date = grp.Key,
                     sum = grp.Average(r => Convert.ToDouble(r.Field<string>(average))),
                     items = grp.ToList(),
                     sd = CalcSD(grp.Select(c => Convert.ToDouble(r.Field<string>(average))))
                  };
Community
  • 1
  • 1
Geoff Appleford
  • 17,742
  • 4
  • 58
  • 83
  • Thanks, but I know how to calculate the standard deviation, I really need to know how to extract the list of values from the datatable based upon my grouping key above. – Darren Young Mar 03 '11 at 10:58
  • 1
    @Darren - `grp` contains the rows for each grouping key. Get the values as you have done with the average calculation from the relevant column and use that to do your calculation. eg `grp.Select(c => c.Field....)`) – Geoff Appleford Mar 03 '11 at 11:01
1

Darren,

Just a quick SO linq (pun intended!!) to get you going until further answers arrive:

Standard deviation of generic list?

hope this helps

[edit] - based on the SO link above, you could amend that and try (inside a static class):

public static double StdDev(this IEnumerable<double> values)
{   
  double ret = 0;
  if (values.Count() > 0) 
  {      
     //Compute the Average      
     double avg = values.Average();
     //Perform the Sum of (value-avg)_2_2      
     double sum = values.Sum(d => Math.Pow(d - avg, 2));
     //Put it all together      
     ret = Math.Sqrt((sum) / (values.Count()-1));   
  }   
  return ret;
}

plus a little override (which i had worked up but then changed on looking at this link http://help.syncfusion.com/ug_84/User%20Interface/WPF/Grid/default.htm?turl=Documents%2Fcustomsummaries.htm):

public static double StdDev<T>(this IEnumerable<T> values, Func<T, double?> selector)
{
    double ret = 0;
    var count = values.Count();

    if (count > 0)
    {
        // Compute the Average
        double? avg = values.Average(selector);

        // Perform the Sum of (value-avg)^2
        double sum = values.Select(selector).Sum(d =>
        {
            if (d.HasValue)
            {
                return Math.Pow(d.Value - avg.Value, 2);
            }
            return 0.0;
        });

        // Put it all together
        ret = Math.Sqrt((sum) / (count - 1));
    }
    return ret;
}

and then simply apply as such:

var results = from res in dt.AsEnumerable()
                  group res by res.Field<string>(key)
                      into grp
                      orderby Convert.ToInt32(grp.Key)
                      select new
                      {
                          date = grp.Key,
                          sum = grp.Average(r => Convert.ToDouble(r.Field<string>(average))),
                          stdDev =  gp.Select(s => Convert.ToDouble(s.Field<string>(average))).StdDev(),
                          // alternative override - note no need to convert
                          stdDev2 = gp.StdDev(s => s.Field<string>(average))
                      };

haven't tested, so may not work straight away, but theory is there.

Community
  • 1
  • 1
jim tollan
  • 21,831
  • 4
  • 43
  • 62
  • Thanks, but I know how to calculate the standard deviation, I really need to know how to extract the list of values from the datatable based upon my grouping key above – Darren Young Mar 03 '11 at 10:59
  • Darren - the above edit may work if you've got sufficinet smoke and mirrors laying around :) – jim tollan Mar 03 '11 at 11:24
  • Work's perfectly with a little tweaking to the anonymous type. Thanks. – Darren Young Mar 03 '11 at 11:34
  • This was the change (if you are interested) stdDev = grp.Select(c => (Convert.ToDouble(c.Field(average)))).ToList().StandardDeviation(), – Darren Young Mar 03 '11 at 11:35
  • glad that worked for you - nice one... also note, i've changed the extension method sig a little since the 1st attempt, so hopefully you can invoke without the ToList() if you change your code to match. – jim tollan Mar 03 '11 at 11:37