1

I am analyzing some text files and want to extract a specific word each time it appears in a file.

For example, if the file contains 'Sports' and that word is in my list, I want to extract just the word 'Sports'.

I have the following code:

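content = ['Sports', 'Nature', 'Football']
path = filename  # filename holds the path to the text file
with open(path) as auto:
    for line in auto:
        # Case-insensitive check: does the line contain any word from the list?
        if any(x.lower() in line.lower() for x in content):
            print(line)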

My text file has this content:

Sports TV is the home of football videos. 
Complex game to follow.
home of football

My code prints every line that contains 'Sports' or 'Football':

Sports TV is the home of football videos. 

home of football

But I want to see the following result:

Sports
football

How can I print only the words from my list instead of the whole line?

Thanks!

Dlamini
Pedro Alves
  • What are you looking for exactly? Are you checking whether the text contains the word? Are you counting the number of times the words were found in the text? Are you looking at grammar (starts with a capital letter, plural, etc.)? – user3053452 Apr 02 '19 at 15:16
  • I will export a lot of analysis for each file, and one of the columns is the word found in the file that matches my list – Pedro Alves Apr 02 '19 at 15:24
  • And how will you handle multiple found words? For example, the first sentence in your case contains both football and sports. – user3053452 Apr 02 '19 at 15:26

2 Answers

1

list.txt:

Sports TV is the home of football videos. 
Complex game to follow.
home of football

Hence:

content = ['Sports', 'Nature', 'Football']
path = 'list.txt'

with open(path) as auto:
    print([[x.lower() for x in content if x.lower() in line.lower()] for line in auto])

OUTPUT:

[['sports', 'football'], [], ['football']]

Since:

Line 1 contains sports and football.

Line 2 contains no matching elements from the content list.

Line 3 contains football.
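One caveat with the substring test above: 'sports' would also match inside a longer word such as 'Transports', and the result shows the lowercased list word rather than the text found in the file. A minimal sketch of a whole-word variant, assuming the standard `re` module is acceptable here; the names `pattern`, `find_words`, and `sample` are made up for illustration, and `sample` stands in for the lines of list.txt:

```python
import re

content = ['Sports', 'Nature', 'Football']

# One case-insensitive pattern for all list words.
# \b (word boundary) keeps 'Sports' from matching inside 'Transports'.
pattern = re.compile(
    r'\b(?:' + '|'.join(re.escape(word) for word in content) + r')\b',
    re.IGNORECASE,
)

def find_words(lines):
    """Return, for each line, the listed words exactly as they appear in it."""
    return [pattern.findall(line) for line in lines]

# Stand-in for the lines of list.txt (an open file object works the same way).
sample = [
    'Sports TV is the home of football videos.',
    'Complex game to follow.',
    'home of football',
]

print(find_words(sample))
# [['Sports', 'football'], [], ['football']]
```

Passing an open file (`with open(path) as auto: find_words(auto)`) should behave the same, since the function only iterates over lines.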

DirtyBit
0

You are currently printing the entire line.

Try this instead:

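content = ['Sports', 'Nature', 'Football']
path = filename  # filename holds the path to the text file
with open(path) as auto:
    for line in auto:
        for x in content:
            # Print the list word itself (as spelled in content), not the whole line
            if x.lower() in line.lower():
                print(x)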
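Note that the loop above prints the word as spelled in the list and repeats it for every line that contains it, so the sample file gives Sports, Football, Football. If the goal is the output shown in the question (Sports, then football, each once), one possible sketch is below. The helper name `unique_matches` and the `sample` list are hypothetical, and slicing by position assumes plain ASCII text, where lowercasing does not change string length:

```python
def unique_matches(lines, words):
    """Collect each listed word once, spelled as it first appears in the text."""
    seen = set()   # lowercase forms of the words already reported
    found = []     # matches in order of first appearance
    for line in lines:
        lowered = line.lower()
        for word in words:
            key = word.lower()
            start = lowered.find(key)  # -1 when the word is not in this line
            if start != -1 and key not in seen:
                seen.add(key)
                # Slice the original line to keep the file's own capitalisation.
                found.append(line[start:start + len(key)])
    return found

content = ['Sports', 'Nature', 'Football']

# Stand-in for the lines of the text file from the question.
sample = [
    'Sports TV is the home of football videos.',
    'Complex game to follow.',
    'home of football',
]

for word in unique_matches(sample, content):
    print(word)
# Sports
# football
```

Like the loop above, this is a substring match, so it would still find 'sports' inside a longer word.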
Zulfiqaar