I am trying to compare two Arabic strings using Python's difflib.HtmlDiff
class. I have looked at various ways of writing the output of HtmlDiff
to a file, but none seems to work for me. Methods I have tried so far:
Note: in all subsequent code snippets, original and mockinputs are lists of strings of Unicode text (specifically Arabic), as required by HtmlDiff.
Method 1
import difflib
hdiff = difflib.HtmlDiff()
html = hdiff.make_file(original, mockinputs)
with open('out_file.html', 'w', encoding='utf-8') as out_file:
    out_file.write(html)
This runs without error, but the HTML file it creates shows gibberish (things like الرØÙ) when opened in a browser.
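One thing that may be worth checking here, as a guess rather than a confirmed diagnosis: the page produced by make_file carries its own charset declaration, and before Python 3.5 that declaration was ISO-8859-1. A browser that trusts it would misread a file saved as UTF-8, which could explain this kind of output. The sketch below prints the declared charset and rewrites it to utf-8 if it differs. The sample lists are hypothetical stand-ins for the real original and mockinputs.

```python
import difflib
import re

# Hypothetical sample data standing in for the real inputs.
original = ["السلام عليكم", "مرحبا بالعالم"]
mockinputs = ["السلام عليكم", "مرحبا بكم"]

html = difflib.HtmlDiff().make_file(original, mockinputs)

# Look for the first charset declaration in the generated page.
# The optional quote covers both meta tag styles (content="...; charset=X"
# and charset="X").
pattern = r'(charset="?)([\w-]+)'
match = re.search(pattern, html)
declared = match.group(2) if match else None
print("declared charset:", declared)

# If the page declares something other than UTF-8, make the declaration
# agree with the encoding used to write the file.
if declared is not None and declared.lower() != "utf-8":
    html = re.sub(pattern, r"\1utf-8", html, count=1)

with open("out_file.html", "w", encoding="utf-8") as out_file:
    out_file.write(html)
```

On Python 3.5 and later, make_file also accepts a charset keyword argument that defaults to utf-8, so the rewrite step should only matter on older versions such as 3.4.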
Method 2
(as pointed out here)
import difflib
htmldiff = difflib.HtmlDiff()
html = htmldiff.make_file(original, mockinputs)
out_file = open('out_file.html', 'w')
out_file.write(html.encode('utf-8'))
out_file.close()
This gives me this error:
TypeError: must be str, not bytes
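For context on this error: in Python 3, a file opened in text mode ('w') accepts only str, while encode() returns bytes. A minimal sketch of the difference follows, using a hypothetical payload in place of the HtmlDiff output. Opening the file in binary mode ('wb') accepts the encoded bytes. This only avoids the TypeError and does not change which charset the HTML itself declares.

```python
# Hypothetical payload standing in for html.encode('utf-8').
payload = "<p>مرحبا</p>".encode("utf-8")  # bytes, not str

# Text mode rejects bytes, which reproduces the TypeError.
text_mode_failed = False
try:
    with open("out_file.html", "w") as out_file:
        out_file.write(payload)
except TypeError:
    text_mode_failed = True

# Binary mode accepts the encoded bytes as they are.
with open("out_file.html", "wb") as out_file:
    out_file.write(payload)
```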
So, how can I write the Unicode text produced by HtmlDiff,
as shown here, to an HTML file in Python 3?
I am using Python 3.4.3.