
I need to remove all rows that contain the word 'thread'. For example, one row in the file reads 'Post-Match Thread: Liverpool 4-0 Barcelona [4-3 on agg.]'.

I have tried the code below, as suggested in other answers:

df[~df.post_title.str.contains('Thread')]

but this appears to do nothing. The rest of the code is below:

import pandas as pd
from nltk.corpus import stopwords
import string
from nltk.stem import WordNetLemmatizer
import re

lemma = WordNetLemmatizer()

pd.read_csv('soccer.csv', encoding='utf-8')
df = pd.read_csv('soccer.csv')

df.columns = ['post_id', 'post_title', 'subreddit']

df[~df.post_title.str.contains('Thread')]

df.to_csv('clean_soccer2.csv', encoding='utf-8-sig')
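
A likely cause, offered as a sketch rather than a confirmed diagnosis: `df[~df.post_title.str.contains('Thread')]` returns a new DataFrame and leaves `df` unchanged, so the result has to be assigned back before calling `to_csv`. The example below uses a small made-up DataFrame in place of soccer.csv (the sample rows are assumptions, not data from the real file). It also passes `case=False` so that lowercase 'thread' is matched, and `na=False` so that missing titles count as non-matches.

```python
import pandas as pd

# Hypothetical sample rows standing in for soccer.csv (not the real data).
df = pd.DataFrame({
    "post_id": [1, 2, 3],
    "post_title": [
        "Post-Match Thread: Liverpool 4-0 Barcelona [4-3 on agg.]",
        "Daily discussion thread",
        "Origi scores the winner",
    ],
    "subreddit": ["soccer", "soccer", "soccer"],
})

# True for every row whose title contains 'thread' in any letter case.
# na=False makes rows with a missing title evaluate to False instead of NaN.
mask = df["post_title"].str.contains("thread", case=False, na=False)

# Boolean indexing does not modify df in place, so assign the result back.
df = df[~mask]

print(df["post_title"].tolist())
```

With the filtered frame assigned back to `df`, the existing `df.to_csv('clean_soccer2.csv', encoding='utf-8-sig')` call would write only the remaining rows. Adding `index=False` to that call would keep the row index out of the output file, if that is not wanted.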
plshelpme_
