I can't figure out what is wrong with this code. It raises an "invalid syntax" error, but I'm following the instructions exactly and it looks correct. I'm trying to select only the players with more than 30 doubles ('2B') who are in the AL from the merged data below (d820hw5p3). Any ideas what's going on?

d820hw5p6= d820hw5p3[(d820hw5p3.2B > 30) & (d820hw5p3.LEAGUE == 'AL')]
d820hw5p6

d820hw5p3 is this data:

        First         Last    R    H   AB LEAGUE  2B  3B  HR  RBI
0      Leonys       Martin   72  128  518     AL  17   3  15   47
1         Jay        Bruce   74  135  540     NL  27   6  33   99
2      Jackie  Bradley Jr.   94  149  558     AL  30   7  26   87
3      George     Springer  116  168  644     AL  29   5  29   82
4       Corey    Dickerson   57  125  510     AL  36   3  24   70
5      Dexter       Fowler   84  126  457     NL  25   7  13   48
6       Angel        Pagan   71  137  495     NL  24   5  12   55
7        Adam        Eaton   91  176  620     AL  29   9  14   59
8     Yasmany        Tomas   72  144  529     NL  30   1  31   83
9     Gregory      Polanco   79  136  527     NL  34   4  22   86
10      Nomar       Mazara   59  137  515     AL  13   3  20   64
11     Justin        Upton   81  140  569     AL  28   2  31   87
12      Bryce       Harper   84  123  506     NL  24   2  24   86
13       Kole      Calhoun   91  161  594     AL  35   5  18   75
14      Ender     Inciarte   85  152  522     NL  24   7   3   29
15     Jacoby     Ellsbury   71  145  551     AL  24   5   9   56
16     Curtis   Granderson   88  129  544     NL  24   5  30   59
17     Mookie        Betts  122  214  673     AL  42   5  31  113
18     Denard         Span   70  152  571     NL  23   5  11   53
19       Adam       Duvall   85  133  552     NL  31   6  33  103
20      Brett      Gardner   80  143  548     AL  22   6   7   41
21       Matt         Kemp   89  167  623     NL  39   0  35  108
22      Khris        Davis   85  137  555     AL  24   2  42  102
23       Mike        Trout  123  173  549     AL  32   5  29  100
24      Melky      Cabrera   70  175  591     AL  42   5  14   86
25       Jose     Bautista   68   99  423     AL  24   1  22   69
26        Ian      Desmond  107  178  625     AL  29   3  22   86
27       Alex       Gordon   62   98  445     AL  16   2  17   40
28       Ryan        Braun   80  156  511     NL  23   3  30   91
29       Nick     Markakis   67  161  599     NL  38   0  13   89
30     Carlos     Gonzalez   87  174  584     NL  42   2  25  100
31     Yoenis     Cespedes   72  134  479     NL  25   1  31   86
32    Stephen     Piscotty   86  159  582     NL  35   3  22   85
33    Michael     Saunders   70  124  490     AL  32   3  24   57
34     Jayson        Werth   84  128  525     NL  28   0  21   69
35      Howie     Kendrick   65  124  486     NL  26   2   8   40
36       Adam        Jones   86  164  619     AL  19   0  29   83
37    Marcell        Ozuna   75  148  556     NL  23   6  23   76
38      Jason      Heyward   61  122  530     NL  27   1   7   49
39     Marwin     Gonzalez   55  123  484     AL  26   3  13   51
40   Starling        Marte   71  152  489     NL  34   5   9   46
41       J.D.     Martinez   69  141  459     AL  35   2  22   68
42      Kevin       Pillar   59  146  549     AL  35   2   7   53
43    Charlie     Blackmon  111  187  577     NL  35   5  29   82
44     Odubel      Herrera   87  167  584     NL  21   6  15   49
45  Christian       Yelich   78  172  577     NL  38   3  21   98
46     Andrew    McCutchen   81  153  598     NL  26   3  24   79
pixe anzu
  • try `d820hw5p3['2B'] > 30` instead of `d820hw5p3.2B > 30` – Quang Hoang Mar 04 '20 at 17:42
  • Can you please tell about the exact error? Meanwhile, can you please try replacing '&' with 'and' – Vatsal Gupta Mar 04 '20 at 17:43
  • Hmm, so the [ ] didn't help, but I did confirm the error is definitely in the first condition: when I run the LEAGUE condition alone it works fine, but the first condition alone does not. I will post the exact error below: – pixe anzu Mar 04 '20 at 17:50
  • d820hw5p6= d820hw5p3[(d820hw5p3.2B > 30) and (d820hw5p3.LEAGUE == 'AL')] d820hw5p6 File "", line 1 d820hw5p6= d820hw5p3[(d820hw5p3.2B > 30) and (d820hw5p3.LEAGUE == 'AL')] ^ SyntaxError: invalid syntax – pixe anzu Mar 04 '20 at 17:50
  • I wonder if the fact it starts with a 2 is problematic? Does this make Python think it's something else? I notice that when I pasted it here, at least, the 2B/3B columns are red along with their values throughout, but the rest of the columns are black. – pixe anzu Mar 04 '20 at 17:52
  • _I wonder if the fact it starts with a 2 is problematic??_ That might be it, and even if it isn't, I see no reason to use the `.`/attribute style over `[ ]`. Please provide a [mcve], at least more of your code. Why do you have a variable named `d820hw5p6` ? – AMC Mar 04 '20 at 18:01

1 Answer

I went off AMC's hunch that the column name starting with a 2 is problematic, and created this minimal reproducible example:

import pandas as pd
# define Data Frame
df= pd.DataFrame({
    'name': ['A', 'B', 'C'],
    '2b': [1, 2, 3],
    'b2': [4, 5, 6],
})

# Try to access column 2b
df.2b

This raises SyntaxError: invalid syntax

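By contrast, df['2b'] returns the expected Series. Attribute access with `.` only works when the column name is a valid Python identifier, and identifiers cannot start with a digit, so `df.2b` is rejected by the parser before pandas ever sees it. I did a brief search for pandas documentation about this and didn't find anything, but I expect it comes down to this: Variable names in Python cannot start with a number or can they?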

So in the end, while 2b is a valid column name, you will have to access its Series using the df['column'] method.
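Applied to the original question, a minimal sketch might look like the following. It rebuilds only a handful of rows from the posted table (the name `df` and the reduced column set are assumptions for illustration, not the asker's actual merged frame) and uses bracket access for both conditions:

```python
import pandas as pd

# A few rows copied from the table in the question (illustrative subset only)
df = pd.DataFrame({
    "First": ["Leonys", "Jay", "Corey", "Mookie", "Matt"],
    "Last": ["Martin", "Bruce", "Dickerson", "Betts", "Kemp"],
    "LEAGUE": ["AL", "NL", "AL", "AL", "NL"],
    "2B": [17, 27, 36, 42, 39],
})

# "2B" is not a valid Python identifier, so df.2B cannot even be parsed;
# "LEAGUE" is valid, which is why df.LEAGUE works.
print("2B".isidentifier())
print("LEAGUE".isidentifier())

# Bracket access works for any column name. Combine the boolean masks
# with & (element-wise), keeping each condition in parentheses.
al_doubles = df[(df["2B"] > 30) & (df["LEAGUE"] == "AL")]
print(al_doubles)
```

Note that the conditions are combined with `&` rather than `and`: `and` would try to reduce each whole Series to a single True/False, which pandas refuses to do.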

Nathan Clement