1

Is there a built-in method (I can't find one by searching the documentation) to count the letters that two strings have in common? The order of the letters is not relevant, so comparing "abc" to "cad" would give a 66% match for the characters 'a' and 'c'. The number of occurrences is also relevant: 'a' should match the first time around, but not the second, since there is only one common 'a' between the two strings. Is there a built-in way to do this, perhaps with some bitwise operation, or do I have to loop and compare manually?

randombits
  • Does the number of occurrences matter? For example, are "abc" and "abac" a 100% match since they both contain the characters 'a', 'b' and 'c'? – Chuck Mar 06 '13 at 21:42
  • The `commonPrefixWithString:options:` method of `NSString` could be helpful, I feel – nsgulliver Mar 06 '13 at 21:43
  • Number of occurrences is definitely relevant. I'll update my question to reflect that. – randombits Mar 06 '13 at 21:45
  • Number of similar letters? How is an a from abc any different from an a from cad? – El Tomato Mar 06 '13 at 21:46
  • TBlue: it is NOT any different. But it should only match ONCE if cad was caad, for instance. – randombits Mar 06 '13 at 21:47
  • You could use NSString's `stringByTrimmingCharactersInSet:`, passing an NSCharacterSet containing the characters you're comparing against; you'd then have a string containing only the characters that didn't match, and you could then extract a similarity level by comparing the original and trimmed string lengths. I leave it to you to figure out the number of occurrences part :) – Thiago Campezzi Mar 06 '13 at 22:03

2 Answers

3

You will have to build this yourself, but here is a shortcut for doing it. There is a built-in collection class called NSCountedSet. This object keeps each unique object and a count of how many of each were added.

You can take the two strings and load their characters into two separate NSCountedSet collections, then compare the resulting collections. For example, grab an object from the first NSCountedSet and check whether it exists in the second. The smaller of the two counts for that letter is the number of times the two strings have that letter in common; adding up those minimums over all letters gives the total number of shared letters. To reduce the number of iterations, enumerate the collection that holds fewer distinct objects.

Here is Apple's Documentation for NSCountedSet. https://developer.apple.com/library/ios/#documentation/Cocoa/Reference/Foundation/Classes/NSCountedSet_Class/Reference/Reference.html
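The steps above can be sketched roughly as follows. This is only an illustration, not a built-in API: the helper names `CountedCharacters` and `CommonCharacterCount` are made up for this example, and enumerating by composed character sequences (rather than raw `unichar` values) is an assumption about what should count as one "letter".

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: loads every character of a string into an NSCountedSet.
static NSCountedSet *CountedCharacters(NSString *string) {
    NSCountedSet *set = [[NSCountedSet alloc] init];
    // Composed character sequences keep accented letters and emoji as single units.
    [string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                               options:NSStringEnumerationByComposedCharacterSequences
                            usingBlock:^(NSString *substring, NSRange substringRange,
                                         NSRange enclosingRange, BOOL *stop) {
        [set addObject:substring];
    }];
    return set;
}

// Hypothetical helper: how many characters two strings share, respecting occurrences.
static NSUInteger CommonCharacterCount(NSString *a, NSString *b) {
    NSCountedSet *first = CountedCharacters(a);
    NSCountedSet *second = CountedCharacters(b);
    // Enumerate whichever set has fewer distinct characters.
    if (second.count < first.count) {
        NSCountedSet *swap = first;
        first = second;
        second = swap;
    }
    NSUInteger common = 0;
    for (NSString *character in first) {
        // countForObject: returns 0 for a character that is absent from a set.
        common += MIN([first countForObject:character],
                      [second countForObject:character]);
    }
    return common;
}

int main(void) {
    @autoreleasepool {
        NSLog(@"%lu", (unsigned long)CommonCharacterCount(@"abc", @"cad"));  // 2 ('a' and 'c')
        NSLog(@"%lu", (unsigned long)CommonCharacterCount(@"abc", @"caad")); // 2 (the second 'a' has no partner)
    }
    return 0;
}
```

To turn the count into a percentage you would divide by a string length; which length to use (the first string, the longer one, and so on) depends on how you define a "match", so that part is left out here.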

Jeff Wolski
1

I hesitate to say it, but there is probably no built-in method that meets your requirements. I'd do this:

Create a category on NSString. Let's call the method -(float)percentageOfSimilarCharactersForString:(NSString*)targetString

Here's a rough pseudocode that goes into this category:

  1. Make a copy of self called selfCopy and trim selfCopy to contain only unique characters.
  2. Similarly trim targetString to unique characters. For trimming to unique characters, you could utilize NSSet or a subclass thereof. Looping over each character and adding to a set would help.
  3. Now sort both sets by ASCII values.
  4. Loop through each character of the targetString-related NSSet and check for its presence in the selfCopy-related NSSet. For this you could use another category called containsString. You can find that here. Every time containsString returns true, increment a pre-defined counter.
  5. Your return value would be (counter_value/length_of_selfCopy)*100.
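
A rough sketch of what such a category might look like is below. It takes a few liberties that should be treated as assumptions: the category name `SimilarCharacters` and the helper `UniqueCharacters` are invented for the example, the sorting step is skipped because an NSSet is unordered and membership checks do not need it, and `containsObject:` on the set stands in for the containsString category. Note that working with unique characters only means repeated letters are not counted.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the set of distinct characters in a string.
static NSSet *UniqueCharacters(NSString *string) {
    NSMutableSet *set = [NSMutableSet set];
    [string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                               options:NSStringEnumerationByComposedCharacterSequences
                            usingBlock:^(NSString *substring, NSRange substringRange,
                                         NSRange enclosingRange, BOOL *stop) {
        [set addObject:substring];
    }];
    return set;
}

@interface NSString (SimilarCharacters)
- (float)percentageOfSimilarCharactersForString:(NSString *)targetString;
@end

@implementation NSString (SimilarCharacters)

- (float)percentageOfSimilarCharactersForString:(NSString *)targetString {
    NSSet *selfCharacters = UniqueCharacters(self);             // steps 1 and 2
    NSSet *targetCharacters = UniqueCharacters(targetString);
    if (selfCharacters.count == 0) {
        return 0.0f;                                            // avoid dividing by zero
    }
    NSUInteger counter = 0;
    for (NSString *character in targetCharacters) {             // step 4
        if ([selfCharacters containsObject:character]) {
            counter++;
        }
    }
    // Step 5: (counter_value / length_of_selfCopy) * 100
    return ((float)counter / (float)selfCharacters.count) * 100.0f;
}

@end

int main(void) {
    @autoreleasepool {
        // 'a' and 'c' out of three unique characters: roughly 66.7
        NSLog(@"%.1f", [@"abc" percentageOfSimilarCharactersForString:@"cad"]);
    }
    return 0;
}
```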
Ravi