4

I have two files that contain the same number of lines.

"file1.txt" contains the following lines:

 Attitude is a little thing that makes a big difference
 The only disability in life is a bad attitude
 Abundance is, in large part, an attitude
 Smile when it hurts most

"file2.txt" contains:

 Attitude is a little thing that makes a big difference
 Everyone has his burden. What counts is how you carry it
 Abundance is, in large part, an attitude
 A positive attitude may not solve all your problems  

I want to compare the two files line by line and, for every line that differs between them, I want to

 print "mismatch in line no: 2"
 print "mismatch in line no: 4"   # in this case lines 2 and 4 differ from the second file

I tried, but I can only print the lines in file1 that differ from the lines in file2. I am not able to print the line numbers of the mismatched lines.

 My code:
 with open("file1.txt") as f1:
    lineset = set(f1)
 with open("file2.txt") as f2:
    lineset.difference_update(f2)
    for line in lineset:
        print line
Toni Toni Chopper
user3116273
  • Why are you making it a set? Do you want to erase duplicates? – Alex Chumbley Dec 19 '13 at 16:24
  • No, I don't want to erase that line. I want to print the line numbers in file1 that don't match the lines in file2. In my case lines 2 and 4 differ from file2, so I want to print a mismatch for lines 2 and 4 – user3116273 Dec 19 '13 at 16:26
  • Have you heard of `diff`? This is kinda reinventing the wheel .. – wim Dec 19 '13 at 16:28

4 Answers

8

Using itertools.izip and enumerate:

import itertools

with open('file1.txt') as f1, open('file2.txt') as f2:
    for lineno, (line1, line2) in enumerate(itertools.izip(f1, f2), 1):
        if line1 != line2:
            print 'mismatch in line no:', lineno
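For readers on Python 3, where `itertools.izip` no longer exists and the built-in `zip` is already lazy, the same idea can be sketched as a small function. The function name `mismatched_line_numbers` and the in-memory lists standing in for the two files are assumptions made for illustration; with real files you would pass the open file objects instead.

```python
def mismatched_line_numbers(lines1, lines2):
    """Return the 1-based numbers of the lines that differ.

    Both arguments can be any iterables of lines, such as open file objects.
    Comparison stops at the end of the shorter input.
    """
    return [
        lineno
        for lineno, (line1, line2) in enumerate(zip(lines1, lines2), 1)
        if line1 != line2
    ]


# In-memory stand-ins for file1.txt and file2.txt from the question.
file1_lines = [
    "Attitude is a little thing that makes a big difference\n",
    "The only disability in life is a bad attitude\n",
    "Abundance is, in large part, an attitude\n",
    "Smile when it hurts most\n",
]
file2_lines = [
    "Attitude is a little thing that makes a big difference\n",
    "Everyone has his burden. What counts is how you carry it\n",
    "Abundance is, in large part, an attitude\n",
    "A positive attitude may not solve all your problems\n",
]

for lineno in mismatched_line_numbers(file1_lines, file2_lines):
    print("mismatch in line no:", lineno)
```

Returning a list instead of printing inside the loop keeps the comparison reusable, for example to count mismatches or to feed the numbers into another report.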
falsetru
2

What if:

with open("file1.txt") as f1:
    with open("file2.txt") as f2:
        for idx, (lineA, lineB) in enumerate(zip(f1, f2), 1):
            if lineA != lineB:
                print 'mismatch in line no: {0}'.format(idx)

Or, if the files have different numbers of lines, you can try itertools.izip_longest, which pads the shorter file with None:

import itertools

with open("file1.txt") as f1:
    with open("file2.txt") as f2:
        for idx, (lineA, lineB) in enumerate(itertools.izip_longest(f1, f2), 1):
            if lineA != lineB:
                print 'mismatch in line no: {0}'.format(idx)
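As a rough Python 3 sketch of the unequal-length case: `itertools.izip_longest` is named `itertools.zip_longest` there, and it fills the missing side with `None` by default, so every extra line in the longer file is reported as a mismatch. The function name `mismatched_line_numbers_padded` and the sample lists are hypothetical, chosen only for this example.

```python
from itertools import zip_longest


def mismatched_line_numbers_padded(lines1, lines2):
    """Return 1-based numbers of differing lines, including unpaired ones.

    When one input runs out, zip_longest supplies None for it. A real line
    is never equal to None, so each leftover line counts as a mismatch.
    """
    return [
        lineno
        for lineno, (line1, line2) in enumerate(zip_longest(lines1, lines2), 1)
        if line1 != line2
    ]


# Hypothetical inputs: the second one has an extra line.
shorter = ["same\n", "left only\n"]
longer = ["same\n", "right only\n", "extra line\n"]

for lineno in mismatched_line_numbers_padded(shorter, longer):
    print("mismatch in line no: {0}".format(lineno))
```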
Artsiom Rudzenka
1

You might be able to use the difflib module. Here's a simple example using its difflib.Differ class:

import difflib
import sys

with open('file1.txt') as file1, open('file2.txt') as file2:
    line_formatter = '{:3d}  {}'.format
    file1_lines = [line_formatter(i, line) for i, line in enumerate(file1, 1)]
    file2_lines = [line_formatter(i, line) for i, line in enumerate(file2, 1)]
    results = difflib.Differ().compare(file1_lines, file2_lines)
    sys.stdout.writelines(results)

Output:

    1  Attitude is a little thing that makes a big difference
-   2  The only disability in life is a bad attitude
+   2  Everyone has his burden. What counts is how you carry it
    3  Abundance is, in large part, an attitude
-   4  Smile when it hurts most
+   4  A positive attitude may not solve all your problems

The minus and plus characters in the first column indicate lines that were replaced in typical diff utility program style. The absence of any indicator means the line was the same in both files -- you could suppress the printing of those if you wished, but to keep the example simple everything the compare() method creates is being printed.
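Suppressing the unchanged lines, as mentioned above, could look roughly like the following Python 3 sketch. It keeps only the entries that `Differ.compare()` prefixes with `'- '` or `'+ '`, which also drops the `'? '` hint lines that `Differ` sometimes emits for near-identical lines. The function name `changed_lines` and the in-memory sample lists are assumptions for illustration.

```python
import difflib


def changed_lines(lines1, lines2):
    """Return only the removed ('- ') and added ('+ ') entries of the diff."""
    line_formatter = '{:3d}  {}'.format
    numbered1 = [line_formatter(i, line) for i, line in enumerate(lines1, 1)]
    numbered2 = [line_formatter(i, line) for i, line in enumerate(lines2, 1)]
    diff = difflib.Differ().compare(numbered1, numbered2)
    # Unchanged entries start with two spaces and hint entries with '? ';
    # both are filtered out here.
    return [entry for entry in diff if entry.startswith(('- ', '+ '))]


# In-memory stand-ins for the two files from the question.
file1_lines = [
    "Attitude is a little thing that makes a big difference\n",
    "The only disability in life is a bad attitude\n",
    "Abundance is, in large part, an attitude\n",
    "Smile when it hurts most\n",
]
file2_lines = [
    "Attitude is a little thing that makes a big difference\n",
    "Everyone has his burden. What counts is how you carry it\n",
    "Abundance is, in large part, an attitude\n",
    "A positive attitude may not solve all your problems\n",
]

print(''.join(changed_lines(file1_lines, file2_lines)), end='')
```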

For reference, here's the contents of the two files side-by-side with line numbers shown:

1  Attitude is a little thing that makes a big difference    Attitude is a little thing that makes a big difference
2  The only disability in life is a bad attitude             Everyone has his burden. What counts is how you carry it
3  Abundance is, in large part, an attitude                  Abundance is, in large part, an attitude
4  Smile when it hurts most                                  A positive attitude may not solve all your problems
martineau
0
with open('file1.txt') as f1, open('file2.txt') as f2:
    # In Python 3 the built-in zip is already lazy, so itertools is not needed.
    for lineno, (line1, line2) in enumerate(zip(f1, f2), 1):
        if line1 != line2:
            print('mismatch in line no:', lineno)
Baptiste Mille-Mathias