71

I'm looking for a quick and easy way to strip non-alphanumeric characters from an NSString. Probably something using an NSCharacterSet, but I'm tired and nothing seems to return a string containing only the alphanumeric characters in a string.

Monolo
Jeff Kelley

8 Answers

147

We can do this by splitting the string on the unwanted characters and then joining the pieces back together. This requires OS X 10.5+ for componentsSeparatedByCharactersInSet:

NSCharacterSet *charactersToRemove = [[NSCharacterSet alphanumericCharacterSet] invertedSet];
NSString *strippedReplacement = [[someString componentsSeparatedByCharactersInSet:charactersToRemove] componentsJoinedByString:@""];
user102008
  • 2
    What are alphanumeric characters? E.g. would German "Umlaute", like ä, ö or ü be included in the set and hence not be trimmed? – Erik Jan 30 '13 at 20:50
  • 3
    To handle accented characters you need to create a NSMutableCharacterSet that is a union of alphanumericCharacterSet and nonBaseCharacterSet, and invert that – Greg Fodor Jul 16 '13 at 05:25
  • 1
    The `trimmedReplacement` is misleading. In all iOS NSString invocations, *trimmed* means from start and end. May I suggest **occurrencesReplacement** or **strippedReplacement** instead? – SwiftArchitect May 19 '15 at 22:18
  • @Erik, umlauts would be included. that makes it unusable for filenames :( – dy_ Nov 12 '15 at 17:13
  • 4
    @datayeah No worries, just change the first line to invert the 'Portable Filename Character Set' as per http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_277: `NSCharacterSet *charactersToRemove = [[NSCharacterSet characterSetWithCharactersInString:@"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789._-"] invertedSet];` – Erik Nov 14 '15 at 19:36
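
The two variations raised in these comments can be sketched in Swift 3+ syntax. This is only an illustration, not part of the original answer: the helper name `stripping(_:keeping:)` and the sample string are made up. Note that current Foundation documentation describes the alphanumeric set as already covering the Unicode mark categories, so the union with the non-base set may be redundant.

```swift
import Foundation

// Hypothetical helper: keep only the characters that belong to `allowed`.
func stripping(_ input: String, keeping allowed: CharacterSet) -> String {
    return input.components(separatedBy: allowed.inverted).joined(separator: "")
}

// Unicode-aware: letters and digits from any script, plus combining marks,
// as suggested in the comments.
let unicodeAllowed = CharacterSet.alphanumerics.union(.nonBaseCharacters)

// ASCII-only: the POSIX portable filename character set.
let portableAllowed = CharacterSet(charactersIn:
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789._-")

let sample = "Gr\u{FC}\u{DF}e, Welt! 2013.txt"   // "Grüße, Welt! 2013.txt"
let unicodeResult = stripping(sample, keeping: unicodeAllowed)    // umlaut and ß survive
let portableResult = stripping(sample, keeping: portableAllowed)  // "GreWelt2013.txt"
```

The first variant keeps accented letters, which is what you want for display text; the second drops them, which is safer for filenames.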
21

In Swift, the componentsJoinedByString is replaced by join(...), so here it just replaces non-alphanumeric characters with a space.

let charactersToRemove = NSCharacterSet.alphanumericCharacterSet().invertedSet
let strippedReplacement = " ".join(someString.componentsSeparatedByCharactersInSet(charactersToRemove))

For Swift 2:

var enteredByUser = field.text ?? ""  // or wherever your string comes from

let unsafeChars = NSCharacterSet.alphanumericCharacterSet().invertedSet

enteredByUser = enteredByUser
         .componentsSeparatedByCharactersInSet(unsafeChars)
         .joinWithSeparator("")

If you want to delete just one character, for example all newlines:

 enteredByUser = enteredByUser
         .componentsSeparatedByString("\n")
         .joinWithSeparator("")
Amr Hossam
19

What I wound up doing was creating an NSCharacterSet and calling the -invertedSet method on it (it's a wonder what an extra hour of sleep does for documentation-reading abilities). Here's the code snippet, assuming that someString is the string from which you want to remove non-alphanumeric characters:

NSCharacterSet *charactersToRemove =
[[ NSCharacterSet alphanumericCharacterSet ] invertedSet ];

NSString *trimmedReplacement =
[ someString stringByTrimmingCharactersInSet:charactersToRemove ];

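trimmedReplacement will then contain someString with the non-alphanumeric characters removed from its beginning and end; characters in the middle of the string are left in place.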

Jeff Kelley
  • 26
    FYI, stringByTrimmingCharactersInSet: only removes characters from the beginning and end of the string. Maybe that's what you wanted. – Ken Aspeslagh Jan 19 '10 at 16:14
  • Hmm, good point, Ken. I didn't know that. It still works for my needs, but that's good to know. – Jeff Kelley Jan 19 '10 at 16:40
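
To make the difference Ken points out concrete, here is a small Swift 3+ sketch comparing trimming with split-and-join. The sample string is made up for illustration.

```swift
import Foundation

let unwanted = CharacterSet.alphanumerics.inverted
let someString = "--foo bar!--"

// Trimming only removes matching characters at the two ends of the string.
let trimmed = someString.trimmingCharacters(in: unwanted)   // "foo bar"

// Splitting and joining removes matching characters everywhere.
let stripped = someString
    .components(separatedBy: unwanted)
    .joined(separator: "")                                   // "foobar"
```

The space in the middle survives trimming but not stripping, so trimming is only enough when the unwanted characters can appear solely at the edges.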
8

Swift 3 version of accepted answer:

let unsafeChars = CharacterSet.alphanumerics.inverted
let myStrippedString = myString.components(separatedBy: unsafeChars).joined(separator: "")
Travis M.
1

A Cleanup Category

I have two methods, clp_stringByStrippingCharactersInSet: and clp_stringByCollapsingWhitespace, that might be convenient to just drop in.

@implementation NSString (Cleanup)

- (NSString *)clp_stringByStrippingCharactersInSet:(NSCharacterSet *)set
{
    return [[self componentsSeparatedByCharactersInSet:set] componentsJoinedByString:@""];
}

- (NSString *)clp_stringByCollapsingWhitespace
{
    NSArray *components = [self componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
    components = [components filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"self <> ''"]];

    return [components componentsJoinedByString:@" "];
}

@end
Cameron Lowell Palmer
0

Here’s a Swift version of Cameron’s category as an extension:

extension String {

    func stringByStrippingCharactersInSet(set:NSCharacterSet) -> String
    {
        return (self.componentsSeparatedByCharactersInSet(set) as NSArray).componentsJoinedByString("")
    }

    func stringByCollapsingWhitespace() -> String
    {
        var components:NSArray = self.componentsSeparatedByCharactersInSet(NSCharacterSet.whitespaceCharacterSet())
        let predicate = NSPredicate(format: "self <> ''", argumentArray: nil)
        components = components.filteredArrayUsingPredicate(predicate)

        return components.componentsJoinedByString(" ")
    }
}
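
The extension above uses pre-Swift 3 Foundation names. A rough equivalent in Swift 3+ syntax might look like the sketch below; the method names `strippingCharacters(in:)` and `collapsingWhitespace()` are my own choice for illustration, not an existing API.

```swift
import Foundation

extension String {
    /// Removes every character that belongs to `set`.
    func strippingCharacters(in set: CharacterSet) -> String {
        return components(separatedBy: set).joined(separator: "")
    }

    /// Collapses each run of spaces or tabs into a single space and drops
    /// leading and trailing whitespace. Newlines are left alone, because
    /// `.whitespaces` does not include them.
    func collapsingWhitespace() -> String {
        return components(separatedBy: .whitespaces)
            .filter { !$0.isEmpty }
            .joined(separator: " ")
    }
}
```

For example, `"  hello   world ".collapsingWhitespace()` gives `"hello world"`. Filtering with a closure replaces the NSPredicate and avoids the NSArray bridging.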
Aral Balkan
0

A plain loop would probably execute faster, I think:

@implementation NSString(MyUtil)

- (NSString*) stripNonNumbers {
    NSMutableString* res = [NSMutableString new];
    //NSCharacterSet *numericSet = [NSCharacterSet decimalDigitCharacterSet];
    for ( NSUInteger i=0; i < self.length; ++i ) {
        unichar c = [self characterAtIndex:i];
        if ( c >= '0' && c <= '9' ) // this looks cleaner, but a bit slower: [numericSet characterIsMember:c])
            [res appendFormat:@"%C", c]; // %C is the format specifier for unichar
    }
    return res;
}

@end
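
For comparison, the same digits-only filter can be sketched in Swift 5 without an explicit loop. The function name `strippingNonDigits` is hypothetical, and I have not measured how its speed compares with the Objective-C loop above.

```swift
// Hypothetical helper: keep only the ASCII digits 0 through 9.
func strippingNonDigits(_ input: String) -> String {
    // isASCII rules out digits from other scripts; isNumber then
    // leaves only "0" through "9".
    return input.filter { $0.isASCII && $0.isNumber }
}

let digitsOnly = strippingNonDigits("(555) 123-4567")   // "5551234567"
```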
0

This is a more efficient approach than the accepted answer:

+ (NSString *)alphanumericString:(NSString *)s {

    NSCharacterSet * charactersToRemove = [[NSCharacterSet alphanumericCharacterSet] invertedSet];
    NSMutableString * ms = [NSMutableString stringWithCapacity:[s length]];
    for (NSInteger i = 0; i < s.length; ++i) {
        unichar c = [s characterAtIndex:i];
        if (![charactersToRemove characterIsMember:c]) {
            [ms appendFormat:@"%C", c]; // %C, not %c: c is a unichar and may be non-ASCII
        }
    }
    return ms;

}

or as a Category

@implementation NSString (Alphanumeric)

- (NSString *)alphanumericString {

    NSCharacterSet * charactersToRemove = [[NSCharacterSet alphanumericCharacterSet] invertedSet];
    NSMutableString * ms = [NSMutableString stringWithCapacity:[self length]];
    for (NSInteger i = 0; i < self.length; ++i) {
        unichar c = [self characterAtIndex:i];
        if (![charactersToRemove characterIsMember:c]) {
            [ms appendFormat:@"%C", c]; // %C, not %c: c is a unichar and may be non-ASCII
        }
    }
    return ms;

}

@end
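
A Swift 4+ sketch of the same single-pass idea is shown below. It walks Unicode scalars rather than UTF-16 units, so characters outside the Basic Multilingual Plane are tested as a whole instead of as surrogate halves. The function name simply mirrors the method above and is otherwise an assumption.

```swift
import Foundation

// Hypothetical helper mirroring the loop above, one Unicode scalar at a time.
func alphanumericString(_ input: String) -> String {
    let allowed = CharacterSet.alphanumerics
    var kept = String.UnicodeScalarView()
    for scalar in input.unicodeScalars where allowed.contains(scalar) {
        kept.append(scalar)   // copy only the scalars we want to keep
    }
    return String(kept)
}

let cleaned = alphanumericString("a-b_c 1!2?3")   // "abc123"
```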
Peter Lapisu