
Is there an elegant way to ignore whitespace in a diff in Python (using difflib or any other module)? Maybe I missed something, but I've scoured the documentation and was unable to find any explicit support for this in difflib.

My current solution is to just break my text into lists of words, and then diff those:

# d is presumably a difflib.Differ instance, e.g. d = difflib.Differ()
d.compare("".join(text1_lines).split(), "".join(text2_lines).split())

The disadvantage of this is that if one wants a report of line-by-line differences, rather than word-by-word, one must merge the output of the diff with the original file text. This is easily doable, but a bit inconvenient.
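One way to get the line-by-line report without a merge step might be to diff whitespace-normalized copies of the lines and use the resulting indices to report the original lines. The following is only a minimal sketch under that assumption, not built-in difflib behavior. The helper names `normalize` and `diff_ignoring_whitespace` are hypothetical; only `difflib.SequenceMatcher` and its `get_opcodes()` method come from the standard library.

```python
import difflib


def normalize(line):
    """Collapse runs of whitespace and strip both ends (hypothetical helper)."""
    return " ".join(line.split())


def diff_ignoring_whitespace(lines_a, lines_b):
    """Yield ('-' or '+', original_line) for lines that differ beyond whitespace.

    The comparison runs on normalized copies, but the opcode indices are
    applied to the original lists, so the report shows the untouched text.
    """
    norm_a = [normalize(line) for line in lines_a]
    norm_b = [normalize(line) for line in lines_b]
    matcher = difflib.SequenceMatcher(None, norm_a, norm_b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        for line in lines_a[i1:i2]:
            yield "-", line
        for line in lines_b[j1:j2]:
            yield "+", line


if __name__ == "__main__":
    old = ["foo  bar\n", "baz\n", "qux\n"]
    new = ["foo bar\n", "  baz\n", "quux\n"]
    for sign, line in diff_ignoring_whitespace(old, new):
        print(sign, line, end="")
```

Note that this only ignores differences in the amount of whitespace within a line. Unlike the word-splitting approach, it still reports text that has been re-wrapped across lines, and it treats "a b" and "ab" as different.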

Max Wallace
  • I'm not sure that's wise in a language in which whitespace is part of the syntax! You can change the behavior just by adding or removing whitespace. – Codie CodeMonkey Aug 05 '13 at 18:03
  • @CodieCodeMonkey: OP doesn't seem to be `diff`ing Python code (and even if he was, it can be useful to ignore whitespace changes, if, for example, you've changed tabs to spaces and only want to see what else you changed.) – Wooble Aug 05 '13 at 18:05
  • @Wooble, thanks, you're right, I misread. – Codie CodeMonkey Aug 05 '13 at 18:06
  • @Wooble correct, I'm not diffing Python code itself. – Max Wallace Aug 05 '13 at 18:07
  • Max, you might find something helpful in an older project of mine called adiff: https://bitbucket.org/agriffis/adiff/src/tip/adiff.py ... It's designed to be a command-line program, but you could also import and use the internal classes directly. – Aron Griffis Aug 05 '13 at 19:25
  • Thanks, Aron. I'll definitely take a look, if not for this project (as I've already implemented what I need), then for a future one. If I find anything insightful about how this problem is generally handled in diff tools more sophisticated than my own, I'll be sure to post it back here. – Max Wallace Aug 06 '13 at 15:30

0 Answers