-2

I have a text file in which some lines are exact duplicates and others are only partly similar to other lines in the file.

Input.txt

I would like to play: Volleyball
I would like to play: Volleyball
I would like to play: TableTennis
I would like to play: Baseball
I do not know how to play: Volleyball
She would like to play: TableTennis
I want to learn how to play: Baseball
They like to play: all the three

From the input file, I want to remove the repeated lines, as shown:

I would like to play: Volleyball
I would like to play: TableTennis
I would like to play: Baseball
I do not know how to play: Volleyball
She would like to play: TableTennis
I want to learn how to play: Baseball
They like to play: all the three

In the next step, I want the output to be:

I would like to play
They like to play

A brief explanation of the output file: the statement "I would like to play" covers many different sports, so I want it printed. The last line, "They like to play", is a different case, so I want that line printed as well. (Could we also write these results in .csv format, printing the statements that cover the maximum number of sports and all the unique sports in different columns?)

Note: I don't want to print the following lines, because their three sports are already covered:

I do not know how to play: Volleyball
She would like to play: TableTennis
I want to learn how to play: Baseball

I am confused about how to compare one line with another in the same text file. Any help would be appreciated. Thank you.

sandy
  • Are you saying that if two lines end with the same word keep the first line only? – Cary Swoveland Jul 17 '20 at 04:22
  • try this, ```"\n".join(set(text.splitlines()))``` – sushanth Jul 17 '20 at 04:33
  • @Sushanth I do not want to join the lines. Sorry, I could not catch you. – sandy Jul 17 '20 at 05:15
  • @CarySwoveland Updated the question a bit more. Please have a look – sandy Jul 17 '20 at 05:16
  • Looks like you are looking to create a regex, but do not know where to get started. Please check [Reference - What does this regex mean](https://stackoverflow.com/questions/22937618) resource, it has plenty of hints. Also, refer to [Learning Regular Expressions](https://stackoverflow.com/questions/4736) post for some basic regex info. Once you get some expression ready and still have issues with the solution, please edit the question with the latest details and we'll be glad to help you fix the problem. – Wiktor Stribiżew Jul 17 '20 at 05:20

2 Answers

0

You can follow this:

import pandas as pd

with open('Input.txt') as f:
    content = f.readlines()
# pd.unique keeps the first occurrence of each line, in file order.
content = pd.unique(pd.Series(content)).tolist()

or

with open('Input.txt') as f:
    content = f.readlines()
result = []
for line in content:
    if line not in result:
        result.append(line)
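
A possible variation on the loop above, offered only as a sketch: `dict.fromkeys` removes exact duplicates while keeping first-seen order, and avoids rescanning the result list for every line. The helper name `dedupe_lines` and the sample data are illustrative assumptions, not part of the original answer. Trailing newlines are stripped first so that a final line without a newline still matches its duplicates.

```python
def dedupe_lines(lines):
    """Return lines with exact duplicates removed, keeping first-seen order."""
    # Strip the trailing newline so a last line without one still compares equal.
    stripped = (line.rstrip("\n") for line in lines)
    # dicts preserve insertion order (Python 3.7+), so keys come out in file order.
    # Empty lines are dropped.
    return list(dict.fromkeys(line for line in stripped if line))


# Hypothetical sample; in practice this could be f.readlines() from Input.txt.
sample = [
    "I would like to play: Volleyball\n",
    "I would like to play: Volleyball\n",
    "I would like to play: TableTennis\n",
    "I do not know how to play: Volleyball",
]
print(dedupe_lines(sample))
```

Unlike converting the list to a plain `set`, this keeps the lines in their original order.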
  • This is good. Maybe you could even just convert “content” from a list to a set and not create “result” at all. – Wilf Rosenbaum Jul 17 '20 at 04:22
  • For this situation converting a list to a set is a great idea. Thanks – Md. Fantacher Islam Jul 17 '20 at 04:28
  • @Md. Fantacher Islam With the above code I am getting the output in multiple lines by adding the next line which is not repeated. I do not want to print the line by appending every time. I updated the question for a bit more clarity. Thanks – sandy Jul 17 '20 at 05:06
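
One way to read the updated question is: walk the lines in order, split each into a statement and a sport at ": ", and keep a statement only when it introduces a sport that no earlier line has covered. The sketch below follows that reading, which is an assumption about what is wanted. The function name `covering_statements` and the `sep` parameter are hypothetical names chosen for illustration.

```python
def covering_statements(lines, sep=": "):
    """Map each statement to the sports it was the first to mention."""
    seen_sports = set()
    covered = {}  # statement -> sports first introduced by that statement
    for line in lines:
        statement, found, sport = line.strip().partition(sep)
        # Skip lines without the separator and sports that are already covered.
        if not found or sport in seen_sports:
            continue
        seen_sports.add(sport)
        covered.setdefault(statement, []).append(sport)
    return covered


# The input from the question, written inline instead of read from Input.txt.
lines = [
    "I would like to play: Volleyball",
    "I would like to play: Volleyball",
    "I would like to play: TableTennis",
    "I would like to play: Baseball",
    "I do not know how to play: Volleyball",
    "She would like to play: TableTennis",
    "I want to learn how to play: Baseball",
    "They like to play: all the three",
]
for statement in covering_statements(lines):
    print(statement)
```

Because a repeated line never introduces a new sport, exact duplicates are skipped without a separate de-duplication pass.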
-1

This is simple enough. Do it like this in your '.py' file:

"""Simple solution to your problem."""

# Create the input file `input.txt` with some sample lines.
sample_text = (
    'I would like to play: Volleyball\n'
    'I would like to play: Volleyball\n'
    'I do not know how to play: Volleyball\n'
    'I would like to play: Baseball\n'
    'I want to learn how to play: Volleyball'
)
with open('input.txt', encoding='utf-8', mode='w') as f:
    f.write(sample_text)
# Leaving the `with` block closes the file, so it can be read again.


# Next, print the lines at indexes 0, 3 and 4 (the 1st, 4th and 5th lines).
with open('input.txt', encoding='utf-8', mode='r') as f:
    lines = f.readlines()
wanted_lines = [0, 3, 4]
for each_line in wanted_lines:
    # Strip the newline so print() does not emit a blank line after each one.
    print(lines[each_line].rstrip('\n'))
HamBurger
  • Actually, the input lines are not fixed, so we cannot hard-code which lines we need. Could you please check again? I updated the question to make it a bit clearer. – sandy Jul 17 '20 at 04:55
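
Since the input lines are not fixed, the selection can be driven by the content of each line instead of by line numbers. The sketch below is one possible reading of the CSV idea in the question: group the unique sports under each statement, then write only the statements that cover the largest number of sports. The function names, the column headers, and the "; " joiner are all assumptions made for illustration. The standard-library `csv` module does the writing.

```python
import csv
import io


def summarize(lines, sep=": "):
    """Map each statement to the unique sports it mentions, in first-seen order."""
    sports_by_statement = {}
    for line in lines:
        statement, found, sport = line.strip().partition(sep)
        if found:
            sports = sports_by_statement.setdefault(statement, [])
            if sport not in sports:
                sports.append(sport)
    return sports_by_statement


def write_summary_csv(lines, out):
    """Write the statements that cover the most sports to the file-like `out`."""
    summary = summarize(lines)
    most = max((len(sports) for sports in summary.values()), default=0)
    writer = csv.writer(out)
    writer.writerow(["statement", "sports_count", "sports"])  # assumed headers
    for statement, sports in summary.items():
        if len(sports) == most:
            writer.writerow([statement, len(sports), "; ".join(sports)])


# The input from the question; a StringIO buffer stands in for a real .csv file.
lines = [
    "I would like to play: Volleyball",
    "I would like to play: Volleyball",
    "I would like to play: TableTennis",
    "I would like to play: Baseball",
    "I do not know how to play: Volleyball",
    "She would like to play: TableTennis",
    "I want to learn how to play: Baseball",
    "They like to play: all the three",
]
buffer = io.StringIO()
write_summary_csv(lines, buffer)
print(buffer.getvalue())
```

To write a real file, pass `open('output.csv', 'w', newline='')` in place of the buffer; the file name here is only an example.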