
It is clear from this question that there are many ways to remove duplicates from an NSArray when the array's elements are primitive types, or when the elements are perfect duplicates. But is there a way to remove duplicates based on a transformation applied to each element, as Underscore.js's uniq function permits, rather than by comparing the whole elements? And if a manual implementation would be difficult to optimize, is there an efficient system-provided method (or third-party library algorithm) for this that I am missing?

danielmhanover
  • There can never be a "system-provided method" which is more efficient than manual, other than perhaps eliminating some loop overhead. – Hot Licks Aug 02 '14 at 23:21
  • System-provided methods are almost always more efficient than manual, simply because a lot more work goes into them, and because they have access to private APIs we may not. – danielmhanover Aug 02 '14 at 23:23
  • There's no "magic" in system-provided methods. Just because you can replace a loop with a single method doesn't mean the actual code is any simpler or faster. At best you save a few calls. And in many cases a system-provided method is actually slower, because it must account for all sorts of edge cases you don't need to worry about. – Hot Licks Aug 02 '14 at 23:32
  • I suppose, but I still don't see a need to reinvent the wheel if it isn't necessary (though it seems in this case that it is) – danielmhanover Aug 02 '14 at 23:36
  • Every day here I see folks spending hours seeking help finding some sort of supposedly efficient, built-in mechanism for performing something that could easily be coded up in 5-10 minutes using plain vanilla statements. Even if there is a "canned" solution, the time spent finding it is often not worth the effort invested. – Hot Licks Aug 02 '14 at 23:39

4 Answers


A simple approach:

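NSMutableArray* someArray = something;

// Iterate from the end toward the front. NSInteger avoids truncating the
// unsigned count and stays safe when the array is empty.
for (NSInteger i = (NSInteger)someArray.count - 1; i > 0; i--) {
    MyObject* myObject = someArray[i];
    // Compare against every earlier element; the earliest occurrence is kept.
    for (NSInteger j = 0; j < i; j++) {
        MyObject* myOtherObject = someArray[j];
        if ([myObject isSortaEqual:myOtherObject]) {
            [someArray removeObjectAtIndex:i];
            break;
        }
    }
}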

Yes, it's O(N²), but that's not a biggie unless the array is fairly large.

Hot Licks
  • (I'll let the reader figure out why the outer loop is backwards. It's a technique I learned before many of you were born.) – Hot Licks Aug 02 '14 at 23:40

If you want to redefine what equality means for your objects, then consider overriding -hash and -isEqual:. Then you can create an NSSet from your array if order is irrelevant, or an NSOrderedSet if it is relevant. Here's an example of a Person class where I want the name of the person to determine object equality.

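@interface Person : NSObject
@property (nonatomic, copy) NSString *name;
@end

@implementation Person

// Two people are equal when their names are equal.
- (BOOL)isEqual:(id)object
{
    if (self == object) return YES;
    // Guard against comparisons with nil or unrelated classes.
    if (![object isKindOfClass:[Person class]]) return NO;
    Person *otherPerson = (Person *)object;
    return [self.name isEqualToString:otherPerson.name];
}

// Must agree with -isEqual: equal objects have to return equal hashes.
- (NSUInteger)hash
{
    return [self.name hash];
}

@end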

Uniquing them now is rather easy:

NSArray *people = ...;

// If order is irrelevant, use an NSSet
NSSet *uniquePeople = [NSSet setWithArray:people];

// Otherwise use an NSOrderedSet (an alternative to the line above, not an addition to it)
NSOrderedSet *orderedUniquePeople = [NSOrderedSet orderedSetWithArray:people];

Anurag
  • I gather that you don't mean to have `uniquePeople` defined twice, but are simply giving two alternatives? – Hot Licks Aug 02 '14 at 23:53
  • Yeah that's to show a version without preserving order and one with. I'll update the code to clarify. Thanks – Anurag Aug 02 '14 at 23:54

Absolutely. You are looking for a way to pass your own method for testing for uniqueness (at least, that's what the uniq function you refer to does).

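indexesOfObjectsPassingTest: will allow you to pass your own block that decides, element by element, which objects to keep. The result is an NSIndexSet of the indexes of all the objects for which the block returned YES; pass it to objectsAtIndexes: to derive a new array. The block only sees one object at a time, so to test for uniqueness it has to remember what it has already seen (for example, in an NSMutableSet of transformed values). Used that way, the block is roughly equivalent to the Underscore iterator passed to uniq.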

The sister method, indexesOfObjectsWithOptions:passingTest:, also allows you to specify enumeration options (e.g. concurrent or reverse order).

As you mention in your question, there are lots of ways to accomplish this. NSExpressions with blocks, key-value coding collection operators, etc. could be used for this as well. indexesOfObjectsPassingTest: is probably the closest to what you seem to be looking for, though you can do much the same thing (with a lot more typing) using expressions.
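
As a rough sketch of how that could look (the function name UniqueObjectsByKey and its keyBlock parameter are hypothetical, not part of Foundation), the test block can record each transformed value in a set and accept only the first element that produces it. This assumes keyBlock never returns nil and that the enumeration is not concurrent, since the set is mutated inside the block:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not a Foundation API: keeps the first element for each key.
// Assumes keyBlock never returns nil (NSMutableSet cannot store nil).
static NSArray *UniqueObjectsByKey(NSArray *array, id (^keyBlock)(id obj))
{
    NSMutableSet *seenKeys = [NSMutableSet setWithCapacity:array.count];

    // No NSEnumerationConcurrent option is used here, so the block is not
    // called from several threads and mutating seenKeys inside it is safe.
    NSIndexSet *indexes = [array indexesOfObjectsPassingTest:^BOOL(id obj, NSUInteger idx, BOOL *stop) {
        id key = keyBlock(obj);
        if ([seenKeys containsObject:key]) {
            return NO;   // an earlier element already produced this key
        }
        [seenKeys addObject:key];
        return YES;      // first element seen with this key
    }];

    return [array objectsAtIndexes:indexes];
}
```

Calling UniqueObjectsByKey(people, ^id(id obj) { return [obj name]; }) would then keep one object per name, in the original array order.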

quellish

I just came up against this problem, so I wrote a category on NSArray:

@interface NSArray (RemovingDuplicates)

- (NSArray *)arrayByRemovingDuplicatesAccordingToKey:(id (^)(id obj))keyBlock;

@end

@implementation NSArray (RemovingDuplicates)

- (NSArray *)arrayByRemovingDuplicatesAccordingToKey:(id (^)(id obj))keyBlock
{
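    // Maps each key to the most recent item that produced it.
    NSMutableDictionary *temp = [NSMutableDictionary dictionaryWithCapacity:[self count]];

    // The array may hold any object type, so iterate as id rather than NSString *.
    for (id item in self) {
        temp[keyBlock(item)] = item;
    }

    return [temp allValues];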
}

@end

You can use it like this (this example removes duplicate words, ignoring case):

NSArray *someArray = @[ @"dave", @"Dave", @"Bob", @"shona", @"bob", @"dave", @"jim" ];

NSLog(@"result: %@", [someArray arrayByRemovingDuplicatesAccordingToKey:^(id obj){ 
    return [obj lowercaseString];
}]);

Output:

2015-02-17 17:44:10.268 Untitled[4043:7711273] result: (
    dave,
    shona,
    jim,
    bob
)

The 'key' is a block that returns an identifier used to compare the objects. So if you wanted to remove Person objects according to their name, you'd pass ^(id obj){ return [obj name]; }.

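This solution is O(n) on average (dictionary lookups are hash-based), so it is suitable for large arrays, but it doesn't preserve order, and when several elements share a key it keeps the last one. Keys must be non-nil and conform to NSCopying, because they are used as dictionary keys.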
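
If order matters, one possible variant is sketched below. The category name OrderedUniquing and the selector arrayByKeepingFirstObjectForEachKey: are made up for illustration. It tracks the keys already seen in an NSMutableSet and appends only the first item for each key, so it stays O(n) on average while keeping the original order. It assumes the key block never returns nil:

```objc
#import <Foundation/Foundation.h>

// Hypothetical category and selector names, chosen for illustration only.
@interface NSArray (OrderedUniquing)
- (NSArray *)arrayByKeepingFirstObjectForEachKey:(id (^)(id obj))keyBlock;
@end

@implementation NSArray (OrderedUniquing)

- (NSArray *)arrayByKeepingFirstObjectForEachKey:(id (^)(id obj))keyBlock
{
    NSMutableSet *seenKeys = [NSMutableSet setWithCapacity:self.count];
    NSMutableArray *result = [NSMutableArray arrayWithCapacity:self.count];

    for (id item in self) {
        // Assumption: keyBlock returns a non-nil object with sensible
        // -hash and -isEqual: implementations (NSString, NSNumber, ...).
        id key = keyBlock(item);
        if (![seenKeys containsObject:key]) {
            [seenKeys addObject:key];
            [result addObject:item];   // first item for this key, in original order
        }
    }

    return [result copy];
}

@end
```

Unlike the dictionary version, this keeps the first element for each key rather than the last, and the key only needs to be hashable, not copyable.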

joerick