How can I exclude columns in a DataFrame if the content of a row contains a specific substring?

Question

This question touches upon the problem that I am facing. However, instead of excluding columns based on an exact string match, I want to exclude columns if a particular substring matches.

For example, in the input table below, I would like to filter out columns A and C because they contain the substring 'is'.

How would I go about doing this? I replaced df.loc[:, ~(df == 'Salty').any()] from @cs95's answer with df.loc[:, ~(re.findall('/\w+(?:is)\w+/', df)).any()], but this gives me a

TypeError: expected string or bytes-like object

Any help would be appreciated!

Input

|---------------------|------------------|----------------|----------|
|                     |         A        |          B     |   C      |
|---------------------|------------------|----------------|----------|
|          Value      |        Red       |    Green       |Blue      |
|---------------------|------------------|----------------|----------|
|          12         |  HotisGood       |    Warm        |isGood    |
|---------------------|------------------|----------------|----------|

Output

|---------------------|--------------|
|                     |        B     |   
|---------------------|--------------|
|          Value      |   Green      |
|---------------------|--------------|
|          12         |  Warm        |
|---------------------|--------------|

Quang Hoang · Accepted Answer · 2020-05-19T19:52:03.490

IIUC, you can do:

cols = df.apply(lambda x: x.str.contains('is').any())
df.loc[:, ~cols]

Output:

           B
Value  Green
12      Warm
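
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The DataFrame is rebuilt by hand from the question's input table. The `astype(str)` cast and `regex=False` are my additions, not part of the original snippet: they assume you may have non-string columns and that 'is' should be treated as a literal substring rather than a regular expression.

```python
import pandas as pd

# Sample data rebuilt from the question's input table
# (row labels 'Value' and '12', columns A, B, C).
df = pd.DataFrame(
    {
        "A": ["Red", "HotisGood"],
        "B": ["Green", "Warm"],
        "C": ["Blue", "isGood"],
    },
    index=["Value", "12"],
)

# For each column, check whether any cell contains the substring 'is'.
# astype(str) is an added safeguard so that non-string columns do not raise;
# regex=False makes the match a plain substring test.
has_substring = df.apply(
    lambda col: col.astype(str).str.contains("is", regex=False).any()
)

# Keep only the columns in which no cell matched.
result = df.loc[:, ~has_substring]
print(result)
```

If the substring should instead be matched as a pattern, dropping `regex=False` lets `str.contains` interpret it as a regular expression.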

Note: Next time, please consider including your sample data and expected output as text.

edited May 19 '20 at 19:52

answered May 19 '20 at 19:32


Hi! I included the sample input table and output table - is the image not visible/corrupted? Should I re-upload it? – Mega_Noob May 19 '20 at 19:33
@Mega_Noob no, images are not encouraged, see a guide to [ask good question](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – Quang Hoang May 19 '20 at 19:35
Understood, thank you. I will update the question right away - jbtw the output is the exact opposite of what I want (which is to exclude the columns that are being shown now and keep everything else) – Mega_Noob May 19 '20 at 19:37
@Mega_Noob please see updated. As you also see, the text data helps me debug my code and reproduce your expected outcome easily. – Quang Hoang May 19 '20 at 19:53
Thanks a ton! I'll keep formatting in mind for the future – Mega_Noob May 19 '20 at 19:55
