1

I have an excel file to analyze but have a lot of data that I don't want to analyze, can we delete a column if we don't find the value SpaceX string in the first row like following

SL#   State   District  10/01/2021  10/01/2021  10/01/2021  11/01/2021  11/01/2021   11/01/2021
                        SpaceX in     Star in       StarX out       SpaceX out      Star out       StarX in
1      wb      al        10           11          12          13        14           15
2      wb      not       23           22          20          24        25           25

Now here I want to delete the columns where in the rows SpaceX not there. And then Want to delete the SpaceX as well to shift up the rows ultimate output will look like as follows

SL#   State   District  10/01/2021    11/01/2021
1      wb      al        10            13      
2      wb      not       23            24

Tried with loc and iloc functions but no clue at the moment.

Also checked the answer: Drop columns if rows contain a specific value in Pandas but it's different. I'm checking the substring not the exact value match.

ThunderStorm
  • 125
  • 1
  • 1
  • 11

1 Answers1

2

Firstly create a boolean mask with startswith() method and fillna() method:

mask=df.loc[0].str.startswith('SpaceX').fillna(True)

Finally use Transpose(T) attribute,loc accessor and drop() method:

df=df.T.loc[mask].T.drop(0)

Output of df:

    SL#     State   District    2021-01-10 00:00:00     2021-01-11 00:00:00     2021-01-12 00:00:00
1   1.0     wb      al          10                              13                  16
2   2.0     wb      not         23                              13                  16
Anurag Dabas
  • 7,118
  • 4
  • 9
  • 28