I came across this question in an ML book, although it is really more of a math question than an ML one. I would be grateful for a solution that uses a statistics package or any other Python package.

As per a survey on the use of pesticides among 1000 farmers in grape farming for around 10 acres of grape farmland, it was found that the grape farmers spray 38 liters of pesticides in a week on average, with a corresponding standard deviation of 5 liters. Assume that the pesticide spray per week follows a normal distribution. Write code to answer the following questions:

(a) What proportion of the farmers is spraying more than 50 liters of pesticide in a week?
(b) What proportion of farmers is spraying less than 10 liters?
(c) What proportion of farmers is spraying between 30 liters and 60 liters?

Andreas Rossberg

1 Answer

I'll post my answer with the assumption that this is not part of a homework assignment for a class.

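The question provides the mean and standard deviation of the distribution, which are 38 and 5 liters, respectively (note that 5 is the standard deviation, not the variance). Assuming a Gaussian distribution, we can answer each part by plugging the appropriate values into the cumulative distribution function (CDF). To calculate the CDF, we use scipy, whose norm.cdf(x, loc, scale) takes the mean as loc and the standard deviation as scale.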
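To make the plug-in step explicit, here is a sketch of the standardization behind the three answers. The symbols are my own notation rather than the book's: X is the weekly pesticide use, Φ is the standard normal CDF, and 5 liters is taken as the standard deviation σ, as the question states.

```latex
\begin{align*}
P(X \le x) &= \Phi\!\left(\frac{x - \mu}{\sigma}\right), \qquad \mu = 38,\ \sigma = 5 \\
\text{(a)}\quad P(X > 50) &= 1 - \Phi\!\left(\frac{50 - 38}{5}\right) = 1 - \Phi(2.4) \\
\text{(b)}\quad P(X < 10) &= \Phi\!\left(\frac{10 - 38}{5}\right) = \Phi(-5.6) \\
\text{(c)}\quad P(30 < X < 60) &= \Phi\!\left(\frac{60 - 38}{5}\right) - \Phi\!\left(\frac{30 - 38}{5}\right) = \Phi(4.4) - \Phi(-1.6)
\end{align*}
```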

The first question is asking for "the proportion of farmers using more than 50 liters of pesticide a week." In code, this translates into:

from scipy.stats import norm

print(1 - norm.cdf(50, 38, 5)) # 0.008197535924596155, or ~ 0.8 percent

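Note that we subtract the CDF value from 1 because the question asks for more than 50 liters; norm.cdf(50, 38, 5) on its own gives the proportion using up to 50 liters. For a continuous distribution, "more than 50" and "50 or more" have the same probability, so the distinction does not change the result.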
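As a possible alternative to subtracting from 1 by hand, scipy also exposes the survival function norm.sf, which returns the upper-tail probability directly. This is only a sketch of the same calculation; the names MEAN, STD, and p_more_than_50 are mine.

```python
from scipy.stats import norm

MEAN, STD = 38, 5  # liters per week, taken from the question

# sf(x) is the survival function, i.e. P(X > x) = 1 - cdf(x)
p_more_than_50 = norm.sf(50, loc=MEAN, scale=STD)
print(p_more_than_50)
```

For tail probabilities this small the two forms agree closely, but sf tends to be the safer habit far out in the tail, where 1 - cdf loses precision.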

For (b), we can do

print(norm.cdf(10, 38, 5)) # 1.0717590258310887e-08

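This gives the proportion of farmers using 10 liters of pesticide or less. The value is on the order of 1e-08, so under this model effectively none of the 1000 surveyed farmers would fall in this group.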
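If you want to sanity-check the scipy result without any third-party package, the normal CDF can be written with the standard library's math.erfc. This is a minimal sketch; normal_cdf is a helper name I made up, not something from scipy or the book.

```python
import math

def normal_cdf(x, mean, std):
    """P(X <= x) for a normal distribution, via the complementary error function."""
    z = (x - mean) / std  # standardize: z = -5.6 for x = 10
    return 0.5 * math.erfc(-z / math.sqrt(2))

p_less_than_10 = normal_cdf(10, 38, 5)
print(p_less_than_10)
```

The printed value should agree with norm.cdf(10, 38, 5) to within floating-point rounding.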

For the last question, we can take the same approach with a slight alteration:

print(norm.cdf(60, 38, 5) - norm.cdf(30, 38, 5)) # 0.9451952957565343

We first calculate norm.cdf(60, 38, 5), which gives us the proportion of farmers using 60 liters of pesticide or less. From this, we subtract norm.cdf(30, 38, 5), which is the proportion of farmers who use 30 liters or less. The difference is the proportion of farmers whose usage falls within the range [30, 60], which is what the question asks for.
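To avoid repeating the mean and standard deviation in every call, the three calculations above can be collected around a single "frozen" distribution object. The sketch below also scales each proportion by the 1000 farmers in the survey; that head count is my own illustrative addition (the question only asks for proportions), and the names N_FARMERS, usage, and proportions are mine.

```python
from scipy.stats import norm

N_FARMERS = 1000               # survey size from the question
usage = norm(loc=38, scale=5)  # frozen normal: mean 38 L, std 5 L

proportions = {
    "more than 50 L": usage.sf(50),
    "less than 10 L": usage.cdf(10),
    "between 30 L and 60 L": usage.cdf(60) - usage.cdf(30),
}

for label, p in proportions.items():
    # Expected head count is the proportion scaled by the survey size.
    print(f"{label}: proportion={p:.6g}, expected farmers={N_FARMERS * p:.1f}")
```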

As for plotting, there is already a plethora of excellent answers here on SO, such as this one, so I'll leave that topic to those threads.

Jake Tae
  • Firstly, I would like to thank you a million times, because I have been working on this question from the book "Machine Learning using Python" by Manaranjan Pradhan and U Dinesh Kumar as self-study. I spent hours on it today but couldn't solve question no. 4 of chapter 3, so I was finally forced to ask here. I'm not a student and this is not an assignment, be sure of that. Thanks again for your consideration. – Ali Esfandiari May 05 '20 at 18:25
  • No problem, it's great to see someone self-studying for the sake of it. I just wanted to make sure because I've seen instances of students abusing this system for assignment completion, and that is certainly not in line with the spirit of this forum. Glad I was able to help, even if just a bit! – Jake Tae May 05 '20 at 18:35
  • Three things to mention, sir: (1) I totally agree with you about the abuse, which lowers the quality of students' learning around the globe. (2) I would like to dive a bit deeper into probability and statistics, and I would be really grateful if you could suggest a book that would help me level up. (3) Don't say that! You really complemented my comprehension of this chapter of the book. Thanks again, and sorry for taking up your precious time. – Ali Esfandiari May 05 '20 at 18:43
  • I'm also in the process of studying myself, so I'm by no means an expert. But here are two books to get you started: Introduction to Probability and Statistics by Ross, and Introduction to Probability by Blitzstein. The latter has a dedicated [course website](https://projects.iq.harvard.edu/stat110/home) with problem sets, YouTube videos, and more. Happy learning! – Jake Tae May 06 '20 at 02:01