I'm fetching continuous updates from a site. Whenever I run my script I get an old_string, which is the string currently stored in my database. I also get a new_string, which contains the current text body fetched from the site.
Is there a smart way to check which sentences of new_string are not in old_string, so that I can find the newest updates/changes and store them in newest_updates?
An example, where I use --> x <-- to mark new or modified text:
old_string =
"Inbound restrictions:
The country’s airports closed to international flights on 18 March and will remain closed until 1
April. The land and sea borders at this time remain open.
Travellers coming from Brazil, China, Dominican Republic, French Guiana, Italy, Iran, Jamaica, Japan,
Malaysia, Panama, Singapore, South Korea, St Vincent, Thailand and the US should anticipate increased
screenings upon arrival. There is also a possibility that these individuals would be denied entry
into the country, according to government officials.
There are currently no known restrictions on individuals seeking to depart the country."
new_string =
"Inbound restrictions:
The country’s airports closed to international flights on 18 March and will remain closed until -->5
April<--. The land and sea borders at this time remain open.
Travellers coming from Brazil, China, Dominican Republic, French Guiana, Italy, Iran,-->Sweden<--, Jamaica, Japan,
Malaysia, Panama, Singapore, South Korea, St Vincent, Thailand and the US should anticipate increased
screenings upon arrival. There is also a possibility that these individuals would be denied entry
into the country, according to government officials.
There are currently no known restrictions on individuals seeking to depart the country.-->
Outbound restrictions:
There are currently no known restrictions on individuals seeking to depart the country.<--"
From this, the desired output would be:
newest_updates = "The country’s airports closed to international flights on 18 March and will remain
closed until 5 April.
Travellers coming from Brazil, China, Dominican Republic, French Guiana, Italy, Iran,Sweden,
Jamaica, Japan, Malaysia, Panama, Singapore, South Korea, St Vincent, Thailand and the US should
anticipate increased screenings upon arrival
Outbound restrictions:
There are currently no known restrictions on individuals seeking to depart the country."
What would be the best way to do this? One suggestion is to use difflib. But with difflib I also catch every sentence that is common to the two strings, even if no changes have been made.
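For illustration, here is a minimal sketch of one way difflib could be made to return only the changed sentences: split both strings into sentences, compare the two sentence lists with difflib.SequenceMatcher, and keep only the spans tagged "replace" or "insert". The helper names split_sentences and newest_sentences are made up for this example, the sample text is a shortened stand-in for the strings above, and the regex splitter is a naive assumption (it breaks after ".", "!", "?" or ":" and would mis-split abbreviations such as "U.S.").

```python
import re
from difflib import SequenceMatcher


def split_sentences(text):
    """Naive splitter (assumption): collapse whitespace, break after . ! ? or :"""
    flat = " ".join(text.split())
    return [s for s in re.split(r"(?<=[.!?:])\s+", flat) if s]


def newest_sentences(old_text, new_text):
    """Return the sentences of new_text that were changed or inserted."""
    old = split_sentences(old_text)
    new = split_sentences(new_text)
    # Compare sentence lists, not characters; autojunk only matters for long inputs.
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    changed = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "equal" spans are unchanged and "delete" spans exist only in old_text.
        if tag in ("replace", "insert"):
            changed.extend(new[j1:j2])
    return changed


# Shortened stand-in for the example above (not the real site text).
old_string = (
    "Inbound restrictions:\n"
    "Airports remain closed until 1\nApril. Borders remain open.\n"
    "There are no exit restrictions."
)
new_string = (
    "Inbound restrictions:\n"
    "Airports remain closed until 5\nApril. Borders remain open.\n"
    "There are no exit restrictions.\n"
    "Outbound restrictions:\n"
    "There are no exit restrictions."
)

newest_updates = "\n".join(newest_sentences(old_string, new_string))
print(newest_updates)
```

Because the comparison is positional, a sentence that already exists in old_string but is repeated in a newly added section (like the one under "Outbound restrictions:") is still reported. A plain set difference of the sentences would drop it, so which behaviour is wanted is a design choice.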