How to Solve a Data Science Question Using Python's pandas Data Structure Syntax


Tags : python pandas

Viaan Prakash

461
Good afternoon.

I have a question I am trying to solve using pandas data structures and the related Python syntax. I have already graduated from a US university and am employed. I am currently taking the "Python for Data Science" course, offered online on Coursera's platform by the University of Michigan, purely for professional development. I'm not sharing answers with anyone either, as I abide by Coursera's Honor Code.

First, I was given this pandas DataFrame of Olympic medals won by countries around the world:
```
# Summer    Gold    Silver  Bronze  Total   # Winter    Gold.1  Silver.1    Bronze.1    Total.1 # Games Gold.2  Silver.2    Bronze.2    Combined total  ID

Afghanistan 13  0   0   2   2   0   0   0   0   0   13  0   0   2   2   AFG
Algeria 12  5   2   8   15  3   0   0   0   0   15  5   2   8   15  ALG
Argentina   23  18  24  28  70  18  0   0   0   0   41  18  24  28  70  ARG
Armenia 5   1   2   9   12  6   0   0   0   0   11  1   2   9   12  ARM
Australasia 2   3   4   5   12  0   0   0   0   0   2   3   4   5   12  ANZ
```
Second, the question asked is, "Which country has won the most gold medals in summer games?"

Third, the hint I was given on how to answer using pandas is this: "This function should return a single string value."

Fourth, I tried entering this as the answer:
```
import pandas as pd
    df = pd.read_csv('olympics.csv', index_col=0, skiprows=1)
def answer_one():
    if df.columns[:2]=='00':
        df.rename(columns={col:'Country'+col[4:]}, inplace=True)    
    df_max = df[df[max('Gold')]]
    return df_max['Country']
answer_one() 
```
Fifth, I have tried various other answers like this in Coursera's auto-grader, but it keeps giving this error message:

Could you please help me solve that question? Any hints/suggestions/comments are welcome for that.

Thanks, Kevin
September 8, 2021 12:48 PM IST

0
Tarun Reddy

84
```
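import pandas as pd

# Assumes df is loaded as in the question, with country names as the index
def answer_one():
    # Largest value in the summer "Gold" column
    max_gold = df['Gold'].max()
    # Keep only the row(s) whose gold count equals that maximum
    top = df[df['Gold'] == max_gold]
    # The index holds the country names, so return the first match
    return top.index[0]

answer_one()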
```
September 13, 2021 11:52 PM IST

0
Samar Patil

346 3
You can use pandas' loc indexer to find the country name corresponding to the maximum of the "Gold" column:
```
import pandas as pd

data = [('Afghanistan', 13),
        ('Algeria', 12),
        ('Argentina', 23)]

df = pd.DataFrame(data, columns=['Country', 'Gold'])

df['Country'].loc[df['Gold'] == df['Gold'].max()]
```
The last line returns a Series whose single value is Argentina.
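Since the grader expects a single string, here is a minimal sketch of pulling the plain value out of that result. It reuses the three sample rows from my example (not the real file), and the names `matches` and `top_country` are just ones I made up for illustration:

```python
import pandas as pd

data = [('Afghanistan', 13), ('Algeria', 12), ('Argentina', 23)]
df = pd.DataFrame(data, columns=['Country', 'Gold'])

# Boolean mask selects the row(s) holding the maximum "Gold" value
matches = df.loc[df['Gold'] == df['Gold'].max(), 'Country']

# matches is a Series; iloc[0] extracts the first value as a plain string
top_country = matches.iloc[0]
print(top_country)
```

If several countries tie for the maximum, `iloc[0]` simply takes the first one.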

Edit 1: I just noticed you import the .csv file using pd.read_csv('olympics.csv', index_col=0, skiprows=1). If the first line of the .csv file holds the column names, you can leave out the skiprows argument so that this line becomes the column names of the DataFrame, which makes the DataFrame much easier to handle in pandas. Second, I see that with the index_col=0 argument you use the country names as the index of the DataFrame. In that case, use the index rather than loc, as follows:
```
df.index[df['Gold'] == df['Gold'].max()][0]
```
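To check the index-based version without the real file, here is a rough sketch. It assumes the country names are the index (as with index_col=0) and uses only the summer "Gold" counts from the five sample rows in the question, so it is a toy DataFrame, not the course data:

```python
import pandas as pd

# Summer "Gold" counts copied from the five sample rows in the question
gold = {'Afghanistan': 0, 'Algeria': 5, 'Argentina': 18,
        'Armenia': 1, 'Australasia': 3}
df = pd.DataFrame({'Gold': pd.Series(gold)})

def answer_one():
    # Mask the index (country names) with the rows equal to the column maximum
    return df.index[df['Gold'] == df['Gold'].max()][0]

result = answer_one()
print(result)
```

Because the function returns an element of the index, the result is a single string, which is what the hint asks for.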
September 14, 2021 1:38 PM IST

0
Advika Banerjee

319 1
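The idxmax() function returns the index label of the maximum element of a Series. Note that in current pandas versions argmax() returns the integer position instead, so idxmax() is the one that gives the country name when the countries are the index:
```
def answer_one():
    return df['Gold'].idxmax()
```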
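A small sketch of the difference between the two calls, assuming a recent pandas version (1.0 or later) where argmax() gives a position and idxmax() gives a label. The three values are sample summer gold counts from the question, not the full file:

```python
import pandas as pd

gold = pd.Series({'Afghanistan': 0, 'Algeria': 5, 'Argentina': 18}, name='Gold')

position = gold.argmax()   # integer position of the maximum
label = gold.idxmax()      # index label of the maximum

print(position, label)
```

With country names as the index, the label is the string the grader wants, while the position would still have to be looked up in the index.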
November 20, 2021 12:16 PM IST

0
