-2

I tried to compute the difference between two sentences as follows:

import difflib

text1_lines = "I understand how customers do their choice. Difference"
text2_lines = "I understand how customers do their choice."
diff = difflib.ndiff(text1_lines, text2_lines)

I would like to get the difference between the two, but I am not getting that. What am I doing wrong? Thanks for letting me know.

Alec
henry

4 Answers

2

From the docs:

import difflib
import sys

text1_lines = "I understand how customers do their choice. Difference"
text2_lines = "I understand how customers do their choice."
diff = difflib.context_diff(text1_lines, text2_lines)
for line in diff:
    sys.stdout.write(line)

Output:

*** 
--- 
***************
*** 41,54 ****
c  e  .-  - D- i- f- f- e- r- e- n- c- e--- 41,43 ----
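
The output above is broken up character by character because difflib treats each string as a sequence of single characters. A minimal sketch, assuming a word-level comparison is acceptable for this use case, splits both strings into words first (the names `words1`, `words2` and `removed` are illustrative, not from the original post):

```python
import difflib

text1_lines = "I understand how customers do their choice. Difference"
text2_lines = "I understand how customers do their choice."

# Split into words so difflib compares whole words rather than single characters.
words1 = text1_lines.split()
words2 = text2_lines.split()

# ndiff prefixes each item with "- " (only in the first sequence),
# "+ " (only in the second) or "  " (present in both).
removed = [item[2:] for item in difflib.ndiff(words1, words2) if item.startswith("- ")]

print(" ".join(removed))  # prints: Difference
```

Splitting on whitespace discards the original spacing, so this is only suitable when the exact whitespace does not matter.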
LinPy
  • Thanks for your answer. How do I get plain "Difference" without all the extra stuff (signs, etc.)? – henry May 16 '19 at 05:54
1

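Split the longer string on the shorter string, and what remains is the difference. Note that this only works when the shorter string occurs verbatim inside the longer one.

def string_difference(a, b):
    # If either string is empty, the other one is the whole difference.
    if len(a) == 0:
        return b
    if len(b) == 0:
        return a
    # Split the longer string on the shorter one and join the leftover pieces.
    if len(a) > len(b):
        res = ''.join(a.split(b))
    else:
        res = ''.join(b.split(a))
    return res.strip()

print(string_difference("I understand how customers do their choice. Difference",
                        "I understand how customers do their choice."))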
prashant
1

Use a simple list comprehension:

diff = [x for x in difflib.ndiff(text1_lines, text2_lines) if x[0] != ' ']

It keeps only the deletions and additions.

Output:

['-  ', '- D', '- i', '- f', '- f', '- e', '- r', '- e', '- n', '- c', '- e']

(everything with a minus in front of it was deleted)

Conversely, switching text1_lines and text2_lines would produce this result:

['+  ', '+ D', '+ i', '+ f', '+ f', '+ e', '+ r', '+ e', '+ n', '+ c', '+ e']

To remove signs, you can convert the above list:

diff_nl = [x[2] for x in diff]

To fully convert to a string, just use .join():

diff_nl = ''.join([x[2] for x in diff])
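
The steps above can be collected into one helper. This is a sketch only: the function name `char_diff` and its return shape are assumptions made for illustration, not part of difflib.

```python
import difflib

def char_diff(old, new):
    """Return (removed, added): the characters only in `old` and only in `new`."""
    removed, added = [], []
    for item in difflib.ndiff(old, new):
        if item.startswith("- "):    # character present only in `old`
            removed.append(item[2:])
        elif item.startswith("+ "):  # character present only in `new`
            added.append(item[2:])
    return "".join(removed), "".join(added)

removed, added = char_diff(
    "I understand how customers do their choice. Difference",
    "I understand how customers do their choice.",
)
print(repr(removed))  # prints: ' Difference'
print(repr(added))    # prints: ''
```

Call `removed.strip()` if the leading space is not wanted.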
Alec
0

Here is how to do it with difflib itself. The catch is that the diff functions return a generator, which produces its items lazily, so the only way to get the results out is to iterate over it.

import difflib
text1_lines = "I understand how customers do their choice. Difference"
text2_lines = "I understand how customers do their choice."
diff = difflib.unified_diff(text1_lines, text2_lines)

unified_diff differs from ndiff in that it shows only the changes plus a few items of surrounding context (three by default), whereas ndiff shows every item, changed or not. diff is now a generator object, and all that is left to do is iterate over it:

n = 0
result = ''
for difference in diff:
    n += 1
    if n < 7: # skip the first 6 items (3 header lines and 3 context characters), which are not needed here
        continue
    result += difference[1] # the character at this point will either be " x", "-x" or "+x"

And finally:

>>> result
' Difference'
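
The hard-coded count of items to skip depends on the header and context size. As an alternative sketch, assuming only the text removed from the first string is wanted, `difflib.SequenceMatcher.get_opcodes()` reports the changed spans directly (the name `removed_parts` is illustrative):

```python
import difflib

text1_lines = "I understand how customers do their choice. Difference"
text2_lines = "I understand how customers do their choice."

matcher = difflib.SequenceMatcher(None, text1_lines, text2_lines)

# Each opcode is (tag, i1, i2, j1, j2); "delete" and "replace" mark spans of
# text1_lines[i1:i2] that do not survive unchanged into text2_lines.
removed_parts = [
    text1_lines[i1:i2]
    for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    if tag in ("delete", "replace")
]

result = "".join(removed_parts)
print(repr(result))  # prints: ' Difference'
```

This avoids parsing diff text at all, since the opcodes carry index ranges rather than prefixed lines.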
Recessive