I'm a complete Python noob, so I hope this question isn't too trivial.

Say I have a dataframe with different columns like 'country', 'year', 'suicides_no', 'GDP' etc.

My job is to find the top 6 countries by average yearly 'suicides_no', and also to plot the suicides of each of these countries across the different years.

This is the original dataframe df:

[image: the original dataframe df]

What I initially did was use groupby to create a new dataframe that stores the country names and the average suicides_no per country.

<!-- language: python -->

df1 = df.groupby(["country"])["suicides_no"].mean()
df1 = df1.reset_index()

This is what I got for df1:

[image: df1, average suicides_no per country]

Next, I found the top 6 values of the averaged suicides_no:

df2 = df1.nlargest(6, ["suicides_no"])
df2 = df2.reset_index()

Which got me this as df2:

[image: df2, the top 6 countries by average suicides_no]

This is where I'm stuck. How do I go back to the original dataframe df and pull out the rows for the countries in df2 (Russia, US, etc.) across all the years, so that I can find the average suicides_no per country and year? And how do I plot these?

Also, is there an easier way to do this? My method seems very inefficient.
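
For reference, here is a minimal sketch of one possible approach, not a definitive solution. It assumes df has the columns 'country', 'year' and 'suicides_no' as described above; the toy data and the helper names top_countries, yearly_means and plot_yearly are made up for illustration. It picks the top countries with groupby and nlargest, filters the original frame with isin, and reshapes with pivot_table so that each country becomes one line in a plot.

```python
import pandas as pd

# Hypothetical toy data standing in for df (the real values are not shown here).
df = pd.DataFrame({
    "country": ["A", "A", "B", "B", "C", "C"],
    "year": [2000, 2001, 2000, 2001, 2000, 2001],
    "suicides_no": [10, 20, 100, 300, 5, 7],
})


def top_countries(frame, n=6):
    """Return the n countries with the largest mean suicides_no."""
    means = frame.groupby("country")["suicides_no"].mean()
    return means.nlargest(n).index.tolist()


def yearly_means(frame, countries):
    """Rows are years, columns are countries, values are mean suicides_no."""
    # Keep only the rows of the original frame that belong to the chosen countries.
    subset = frame[frame["country"].isin(countries)]
    return subset.pivot_table(index="year", columns="country",
                              values="suicides_no", aggfunc="mean")


def plot_yearly(table):
    """Draw one line per country (needs matplotlib; imported only when called)."""
    import matplotlib.pyplot as plt
    ax = table.plot(marker="o")
    ax.set_xlabel("year")
    ax.set_ylabel("mean suicides_no")
    plt.show()


top = top_countries(df, n=2)    # n=2 only because the toy data has 3 countries
yearly = yearly_means(df, top)
# plot_yearly(yearly)           # uncomment to draw the plot
```

With the real data, top_countries(df) would use the default n=6. Whether aggfunc="mean" is the right choice depends on how many rows exist per country and year; if yearly totals are wanted instead, aggfunc="sum" would be the alternative.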
