19

Is there an existing string comparison method that returns a value based on the first occurrence of a non-matching character between two strings?

For example:

string A = "1234567890";

string B = "1234567880";

I would like to get back a value showing that the first mismatch is at A[8].

sll
Andy

6 Answers

7
/// <summary>
/// Gets the index of the first character at which two strings differ
/// </summary>
/// <param name="a">First string</param>
/// <param name="b">Second string</param>
/// <param name="handleLengthDifference">
/// If true, and one string is a prefix of the other, returns the length of the
/// shorter string (the first index at which they differ); if false, returns -1 in that case
/// </param>
/// <returns>
/// Returns first difference index or -1 if no difference is found
/// </returns>
public int GetFirstBreakIndex(string a, string b, bool handleLengthDifference)
{
    int equalsReturnCode = -1;
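    // Treat null as an empty string; two null/empty inputs then compare as equal (-1),
    // while null/empty against a non-empty string is handled by the length check below
    a = a ?? String.Empty;
    b = b ?? String.Empty;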

    string longest = b.Length > a.Length ? b : a;
    string shorten = b.Length > a.Length ? a : b;    
    for (int i = 0; i < shorten.Length; i++)
    {
        if (shorten[i] != longest[i])
        {
            return i;
        }
    }

    // Handles cases when length is different (a="1234", b="123")
    // index=3 would be returned for this case
    // If you do not need such behaviour - just remove this
    if (handleLengthDifference && a.Length != b.Length)
    {
        return shorten.Length;
    }

    return equalsReturnCode;
}
sll
  • 1
    Is there a reason why you check `a.Equals(b)` rather than `a == b` ? Your code will break if `a` is null. – Kevin Gosse Dec 14 '11 at 12:34
  • The call to equals will iterate the whole string until the first break barring short-cut cases, before you then iterate the whole string. I'd call `ReferenceEquals` as a short-cut, but leave out the rest of `Equals` as its work will be duplicated anyway. – Jon Hanna Dec 14 '11 at 12:39
  • Now "abc" equals null? Or with the alternative suggested, null mismatches null on its first character? – Jon Hanna Dec 14 '11 at 12:50
  • @Jon : now null is considered as zero-length string so can be handled as difference at 0-index when `handleLengthDifference` is enabled – sll Dec 14 '11 at 13:00
  • If a equals b, it returns -1; shouldn't it return the length of a or b instead? – Marc Apr 25 '13 at 06:33
3

If you have .NET 4.0 or later, this could be one way:

    string A = "1234567890";
    string B = "1234567880";

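    // Zip stops at the end of the shorter string, so only the common length is compared.
    // The cast makes the result null (rather than '\0') when no mismatch is found.
    char? firstocurrence = A.Zip(B, (p, q) => new { A = p, B = q })
        .Where(p => p.A != p.B)
        .Select(p => (char?)p.A)
        .FirstOrDefault();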

edit:

Though, if you need the position:

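    // Null means no mismatch within the common length; without the cast, FirstOrDefault
    // would return 0, which cannot be told apart from a mismatch at index 0.
    int? firstocurrence = A.Zip(B, (p, q) => new { A = p, B = q })
            .Select((p, i) => new { A = p.A, B = p.B, idx = i })
            .Where(p => p.A != p.B)
            .Select(p => (int?)p.idx)
            .FirstOrDefault();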
Francisco
  • That's not a requirement in the question. Kind of easy to check afterwards anyway, if two strings have different length – Francisco Dec 14 '11 at 21:11
  • 1
    Not really, because to find where "abc123432343234" mismatches "abcdefghijk" requires you to then do something that would have answered the question in the first place. – Jon Hanna Dec 14 '11 at 21:22
  • 1
    The LINQ is incorrect, it always returns position 0 due to the Where clause. The correct ending after Zip is: .Select((x, idx) => (item: x, idx)).Where(x => x.item.expected != x.item.actual).Select(x => x.idx).FirstOrDefault(); – dsschneidermann Jul 20 '19 at 18:09
2

An extension method along the lines of the below would do the job:

public static int Your_Name_Here(this string s, string other) 
{
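    // Pick the shorter string as "first" and the other as "second";
    // when the lengths are equal, first = s and second = other.
    string first = s.Length <= other.Length ? s : other;
    string second = s.Length <= other.Length ? other : s;

    for (int counter = 0; counter < first.Length; counter++)
    {
        if (first[counter] != second[counter])
        {
            return counter;
        }
    }
    // No mismatch in the common part: -1 if the lengths match,
    // otherwise the index just past the end of the shorter string.
    return first.Length == second.Length ? -1 : first.Length;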
}
glosrob
  • What happens if `other` is shorter than `s`? – Oded Dec 14 '11 at 12:20
  • It bombs :) fair comment, will amend – glosrob Dec 14 '11 at 12:23
  • Names are awful - prefer answer below from @sll – glosrob Dec 14 '11 at 12:29
  • 1
    Returns a false -1 if the longer string starts with the shorter. "abc" mismatches "abcdef" on char 3 (the char "abc" doesn't have at all), but this will return -1 indicating a match. – Jon Hanna Dec 14 '11 at 12:44
  • This answer should have been deleted years ago. If the strings are different, but the lengths are the same, then both `first` and `second` will be assigned to s, and the return value will be -1. – Mark Feldman Oct 30 '20 at 03:08
2

Not that I know of, but it's pretty trivial:

public static int FirstUnmatchedIndex(this string x, string y)
{
  if(x == null || y == null)
    throw new ArgumentNullException();
  int count = x.Length;
  if(count > y.Length)
    return FirstUnmatchedIndex(y, x);
  if(ReferenceEquals(x, y))
    return -1;
  for(int idx = 0; idx != count; ++idx)
    if(x[idx] != y[idx])
      return idx;
  return count == y.Length? -1 : count;
}

This is a simple ordinal comparison. An ordinal case-insensitive comparison is an easy change, but a culture-based one is tricky to define: "Weißbier" mismatches "WEISSBIERS" on the final S in the second string, but does that count as position 8 or position 9?
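
As a rough illustration of the "easy change" to a case-insensitive ordinal comparison, here is a minimal sketch. The class and method names (`StringDiffExtensions`, `FirstUnmatchedIndexIgnoreCase`) are made up for this example, and folding each character with `char.ToUpperInvariant` is an assumption about what "ordinal case-insensitive" should mean here; it does not attempt any culture-aware matching.

```csharp
using System;

public static class StringDiffExtensions
{
    // Hypothetical helper, not part of the answer above: returns the index of the
    // first position where x and y differ when case is ignored, or -1 if they match.
    public static int FirstUnmatchedIndexIgnoreCase(this string x, string y)
    {
        if (x == null || y == null)
            throw new ArgumentNullException(x == null ? "x" : "y");

        // Only the common length can be compared character by character.
        int count = Math.Min(x.Length, y.Length);
        for (int idx = 0; idx < count; ++idx)
        {
            // Assumption: per-character invariant upper-casing is an acceptable
            // way to ignore case; this is not a culture-aware comparison.
            if (char.ToUpperInvariant(x[idx]) != char.ToUpperInvariant(y[idx]))
                return idx;
        }

        // No mismatch in the common part: equal if the lengths match, otherwise
        // the first "missing" character in the shorter string is the mismatch.
        return x.Length == y.Length ? -1 : count;
    }
}
```

With this sketch, `"abc".FirstUnmatchedIndexIgnoreCase("ABD")` returns 2, and `"abc".FirstUnmatchedIndexIgnoreCase("ABCDEF")` returns 3.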

Jon Hanna
  • 1
    Hey... we all do that... SO should integrate an online compiler ;) – Oded Dec 14 '11 at 12:29
  • @Oded Serves me right for calling it "trivial". Well, it is, but there's still an imperfection in every answer's first draft. – Jon Hanna Dec 14 '11 at 12:45
1

This topic is a duplicate of "Comparing strings and get the first place where they vary from eachother", which contains a better one-line solution using LINQ.
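
The linked question is not reproduced here, so the following is only a guess at the kind of one-line LINQ approach meant, not a copy of that answer. It counts the length of the common prefix with `Zip` and `TakeWhile`; the names `FirstDifference` and `IndexOfFirstDifference` are made up for this sketch, and both inputs are assumed to be non-null.

```csharp
using System;
using System.Linq;

public static class FirstDifference
{
    // Hypothetical helper: returns the index of the first differing character,
    // or -1 if the strings are identical. Assumes a and b are not null.
    public static int IndexOfFirstDifference(string a, string b)
    {
        // The "one line": pair up characters, keep pairs while they are equal,
        // and count them. The count is the length of the common prefix.
        int common = a.Zip(b, (c1, c2) => c1 == c2).TakeWhile(same => same).Count();

        // If the common prefix covers both strings entirely, there is no difference.
        return common == a.Length && common == b.Length ? -1 : common;
    }
}
```

For the strings in the question, `FirstDifference.IndexOfFirstDifference("1234567890", "1234567880")` returns 8.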

Community
Cathartis
0

It is possible to write a string extension like this:

public static class MyExtensions
{
    public static IList<char> Mismatch(this string str1, string str2)
    {
        var char1 = str1.ToCharArray();
        var char2 = str2.ToCharArray();
        IList<Char> Resultchar= new List<char>();
        for (int i = 0; i < char2.Length;i++ )
        {
            if (i >= char1.Length || char1[i] != char2[i])
                Resultchar.Add(char2[i]);
        }
        return Resultchar;
    }
}

Use it like this:

var r = "1234567890".Mismatch("1234567880");

It is not an optimized algorithm for finding the mismatch.

If you are only interested in finding the first mismatch:

public static char FirstMismatch(this string str1, string str2)
{
    var char1 = str1.ToCharArray();
    var char2 = str2.ToCharArray();
    // Only str2 is walked, so extra trailing characters in str1 are not reported
    for (int i = 0; i < char2.Length; i++)
    {
        if (i >= char1.Length || char1[i] != char2[i])
            return char2[i];
    }
    // No mismatch found
    return '\0';
}
Riju