Filtering by string giving me empty results

Question

I am asking for any other algorithm or method that you would use to detect anomalies on a single column.

Filtering by columns not showing the data.

I am using the following approach to limit my dataframe to only two columns:

X = pd.read_csv('C:/Users/Path/file.csv', usecols=["Describe_File", "numbers"])

Describe_File   numbers
0   This is the start   25
1   Ending is coming    42
2   Middle of the story 525
3   This is the start   65
4   This is the start   25
5   Middle of the story 35
6   This is the start   28
7   This is the start   24
8   Ending is coming    24
9   Ending is coming    35
10  Ending is coming    25
11  Ending is coming    24
12  This is the start   215

Now I want to go to the column **Describe_File**, filter by the string "This is the start", and then show the values of numbers.

To do so I usually use the following code, but for some reason it is not returning anything. The string exists in my CSV file.

X = X[X.Describe_File == "This is the start"]

score 1 · Answer 1 · answered Mar 02 '20 at 09:16


You can use .str.contains(), a vectorized substring search, e.g.:

df = X[X.Describe_File.str.contains("This is the start", regex=False)]
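To illustrate the difference between the two filters, here is a minimal sketch on an in-memory frame. The sample rows, including the stray leading and trailing spaces, are made up for illustration; whether the real CSV contains such padding is only an assumption.

```python
import pandas as pd

# Hypothetical sample: two of the "start" rows carry invisible padding.
X = pd.DataFrame({
    "Describe_File": [
        "This is the start",
        "This is the start ",   # trailing space (assumed, for illustration)
        " This is the start",   # leading space (assumed, for illustration)
        "Ending is coming",
    ],
    "numbers": [25, 65, 28, 42],
})

# Exact equality only matches values that are identical character for character.
exact = X[X["Describe_File"] == "This is the start"]

# Substring search also matches values with extra characters around the text.
substring = X[X["Describe_File"].str.contains("This is the start", regex=False)]

print(len(exact), len(substring))
print(substring["numbers"].tolist())
```

If both filters come back empty on the real data, the mismatch is probably inside the string itself rather than at its edges.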


Oleg O


@Oleg O I am still getting the same result: when I type `df.shape` I get `(0, 2)` – E199504 Mar 02 '20 at 09:27
There's something wrong with the strings then (some crazy symbols that snuck in). You can try to track it down by shortening the substring, e.g. `contains("start", regex=False)` – Oleg O Mar 02 '20 at 09:42
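One way to hunt for such hidden characters is to print the raw values with repr() and then normalize them before comparing. The sketch below assumes, purely for illustration, that the culprits are a non-breaking space and a trailing tab; the actual file may contain something else entirely.

```python
import pandas as pd

# Hypothetical sample: one value has a non-breaking space, one a trailing tab.
X = pd.DataFrame({
    "Describe_File": [
        "This is\u00a0the start",
        "This is the start\t",
        "Ending is coming",
    ],
    "numbers": [25, 65, 42],
})

# repr() exposes characters that look like ordinary spaces when printed.
for value in X["Describe_File"].unique():
    print(repr(value))

# Replace non-breaking spaces and strip surrounding whitespace before comparing.
cleaned = (
    X["Describe_File"]
    .str.replace("\u00a0", " ", regex=False)
    .str.strip()
)
result = X[cleaned == "This is the start"]
print(result["numbers"].tolist())
```

Comparing against a cleaned copy leaves the original column untouched, which makes it easier to see afterwards which rows were affected.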
