97

I want to remove special characters like:

- + ^ . : ,

from a String using Java.

Shashank Agrawal
  • 21,660
  • 9
  • 71
  • 105
Sameek Mishra
  • 8,234
  • 28
  • 86
  • 113
  • You apparently already know what a regex is based on how you've tagged your question. Did you try reading the documentation for the `String` class? In particular, look for the word 'regex'; there are a few methods, and a bit of thought should tell you how to proceed... :) – Karl Knechtel Sep 26 '11 at 08:12
  • 3
    The phrase "special character" is so overused to be almost completely meaningless. If what you mean is, "I have this list of **specific** characters I want to remove," then do as Thomas suggests and form your pattern with a regex character class and `replaceAll` them away. If you have more esoteric requirements, edit the question. :) – Ray Toal Sep 26 '11 at 08:18
  • 1
    those are not special characters... these are: äâêíìéè since they're not your common 1-byte character types like - + ^ are... anyway, as Ray stated, either do a `replaceAll` for them, or, do a parse on the string, add the chars that are not the chars you want to take out to another string and in the end just do a += to a String you'll be returning. – Gonçalo Vieira Sep 26 '11 at 09:16
  • `deleteChars.apply( fromString, "-+^.:," );` – [find deleteChars here](https://stackoverflow.com/questions/4576352/remove-all-occurrences-of-char-from-string/56916146#56916146) – Kaplan Oct 03 '19 at 11:20

8 Answers

267

That depends on what you define as special characters, but try replaceAll(...):

String result = yourString.replaceAll("[-+.^:,]","");

Note that the ^ character must not be the first one in the list, since you'd then either have to escape it or it would mean "any but these characters".

Another note: the - character needs to be the first or last one in the list; otherwise you have to escape it, or it is read as a range operator (e.g. :-, would be read as "the range from : to ,", which Java rejects as an illegal range because : sorts after ,).

So, in order to keep consistency and not depend on character positioning, you might want to escape all those characters that have a special meaning in regular expressions (the following list is not complete, so be aware of other characters like (, {, $ etc.):

String result = yourString.replaceAll("[\\-\\+\\.\\^:,]","");
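To make the positioning rules concrete, here is a minimal sketch. The class and method names are made up for illustration; only `String.replaceAll` comes from the answer. It shows the unescaped and escaped forms giving the same result, and what a leading `^` does to the class.

```java
class CharClassDemo {
    // '-' is first and '^' is not first, so both are taken literally.
    static String stripUnescaped(String s) {
        return s.replaceAll("[-+.^:,]", "");
    }

    // '^' and '-' are escaped, so their position in the class no longer matters.
    static String stripEscaped(String s) {
        return s.replaceAll("[\\^\\-+.:,]", "");
    }

    // A leading unescaped '^' negates the class: everything EXCEPT - + . : , is removed.
    static String keepOnlyListed(String s) {
        return s.replaceAll("[^-+.:,]", "");
    }

    public static void main(String[] args) {
        String input = "a-b+c^d.e:f,g";
        System.out.println(stripUnescaped(input)); // abcdefg
        System.out.println(stripEscaped(input));   // abcdefg
        System.out.println(keepOnlyListed(input)); // -+.:,
    }
}
```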


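If you want to get rid of all punctuation and symbols, try this regex: [\p{P}\p{S}] (keep in mind that in Java strings you have to escape the backslashes: "[\\p{P}\\p{S}]"). The square brackets matter: without them, \p{P}\p{S} only matches a punctuation character immediately followed by a symbol.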
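As a rough sketch of the punctuation-and-symbols approach (the class and method names are hypothetical), the two Unicode properties are placed inside one character class so that either of them matches a single character:

```java
class PunctuationDemo {
    // Removes every character in the Unicode categories P (punctuation) or S (symbol).
    static String stripPunctuationAndSymbols(String s) {
        return s.replaceAll("[\\p{P}\\p{S}]", "");
    }

    public static void main(String[] args) {
        // ',', '!', '(' and ')' are punctuation; '+' and '=' are math symbols.
        System.out.println(stripPunctuationAndSymbols("Hello, world! 1+1=2 (ok)"));
        // Hello world 112 ok
    }
}
```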

A third way could be something like this, if you can exactly define what should be left in your string:

String result = yourString.replaceAll("[^\\w\\s]","");

This means: replace everything that is not a word character (a-z in any case, 0-9 or _) or whitespace.
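One consequence worth seeing in code: by default `\w` is ASCII-only, so accented letters are removed too. The sketch below (hypothetical names) compares the default behaviour with the embedded `(?U)` flag, which enables `UNICODE_CHARACTER_CLASS` on Java 7 and later. Non-ASCII letters are written as `\u` escapes so the source file encoding does not matter.

```java
class WordCharDemo {
    // Default \w: only a-z, A-Z, 0-9 and _ count as word characters.
    static String keepAsciiWords(String s) {
        return s.replaceAll("[^\\w\\s]", "");
    }

    // (?U) makes \w and \s follow the Unicode definitions, so non-ASCII letters survive.
    static String keepUnicodeWords(String s) {
        return s.replaceAll("(?U)[^\\w\\s]", "");
    }

    public static void main(String[] args) {
        String input = "Gr\u00fc\u00dfe, Welt!"; // "Grüße, Welt!"
        System.out.println(keepAsciiWords(input).equals("Gre Welt"));               // true
        System.out.println(keepUnicodeWords(input).equals("Gr\u00fc\u00dfe Welt")); // true
    }
}
```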

Edit: please note that there are a couple of other patterns that might prove helpful. However, I can't explain them all, so have a look at the reference section of regular-expressions.info.

Here's a less restrictive alternative to the "define allowed characters" approach, as suggested by Ray:

String result = yourString.replaceAll("[^\\p{L}\\p{Z}]","");

The regex matches everything that is neither a letter (in any language) nor a separator (spaces, line and paragraph separators). Note that \p{Z} does not cover control characters such as tab or newline, so those are removed as well; add \\s to the class if you want to keep them. Also note that you can't use [\P{L}\P{Z}] (upper case P means not having that property), since that would mean "everything that is not a letter or not a separator", which matches almost everything, since letters are not separators and vice versa.
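A small sketch of the letters-and-separators pattern, with hypothetical class and method names. The second method adds `\s` to the class, an assumption about what you might want when tabs and newlines should survive, since those are control characters rather than `\p{Z}` separators.

```java
class LettersAndSeparatorsDemo {
    // Keeps letters of any script plus Unicode separators (category Z).
    static String keepLettersAndSeparators(String s) {
        return s.replaceAll("[^\\p{L}\\p{Z}]", "");
    }

    // Variant that also keeps tab, newline and other \s whitespace.
    static String keepLettersAndWhitespace(String s) {
        return s.replaceAll("[^\\p{L}\\p{Z}\\s]", "");
    }

    public static void main(String[] args) {
        String input = "Gr\u00fc\u00dfe, Welt!"; // "Grüße, Welt!"
        System.out.println(keepLettersAndSeparators(input).equals("Gr\u00fc\u00dfe Welt")); // true
        System.out.println(keepLettersAndSeparators("a\tb 1"));                  // ab  (tab and digit removed, space kept)
        System.out.println(keepLettersAndWhitespace("a\tb 1").equals("a\tb "));  // true
    }
}
```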

Additional information on Unicode

Some Unicode characters can cause problems because they can be encoded in more than one way (as a single code point or as a combination of code points). Please refer to regular-expressions.info for more information.
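One possible way to deal with this, sketched under the assumption that the input may contain decomposed characters, is to normalize the string to NFC with `java.text.Normalizer` before applying the regex. The class and method names are made up for the example; this is not something the answer itself prescribes.

```java
import java.text.Normalizer;

class NormalizeDemo {
    // Without normalization, a combining mark (category Mn) is not a letter and gets removed.
    static String stripRaw(String s) {
        return s.replaceAll("[^\\p{L}\\p{Z}]", "");
    }

    // NFC composes 'u' + U+0308 into the single letter U+00FC before filtering.
    static String stripNormalized(String s) {
        String composed = Normalizer.normalize(s, Normalizer.Form.NFC);
        return composed.replaceAll("[^\\p{L}\\p{Z}]", "");
    }

    public static void main(String[] args) {
        String decomposed = "u\u0308ber"; // 'u' followed by a combining diaeresis
        System.out.println(stripRaw(decomposed).equals("uber"));             // true
        System.out.println(stripNormalized(decomposed).equals("\u00fcber")); // true
    }
}
```

This may explain reports of umlauts disappearing: the text in question could have been stored in decomposed form.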

Thomas
  • 80,843
  • 12
  • 111
  • 143
  • +1 for the best general-purpose solution. Since you are listing a couple variations in the absence of details from the OP, you might as well show and explain patterns like `[\P{L}]` – Ray Toal Sep 26 '11 at 08:21
  • Also note that the `-` character must be the first or last one in the list or it needs to be escaped. – kapex Sep 26 '11 at 08:24
  • `[^\\p{L}\\p{Z}]` seems to eliminate German Umlauts (ä,ö,ü) as well (at least it does so for me:/), so "The regex matches everything that is not a letter in any language" doesn't seem to be 100% correct – Peter May 01 '13 at 10:19
  • @Peter it doesn't eliminate those characters in my tests. There might be another problem in your case, e.g. a different encoding of the text. I'll add a link to more information. – Thomas May 02 '13 at 09:07
  • 1
    @Thomas `String result = yourString.replaceAll("[^\w\s]","");` makes error `Invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )` – Visruth Jul 30 '13 at 05:12
  • @VisruthCV you'd need to escape the backslashes in Java strings, i.e. use `"[^\\w\\s]"`. I'll fix that error in my answer as well. – Thomas Jul 30 '13 at 07:08
  • @Thomas Thanks. I have optimized your answer youredittextvariable.setText((youredittextvariable.getText().toString().replaceAll("[″&<>′]","")).toString()); for android. – Abhijit Gujar Apr 01 '15 at 13:20
46

This removes every character except ASCII letters and digits:

String result = yourString.replaceAll("[^A-Za-z0-9]", "");
Stephen
  • 684
  • 6
  • 11
18

As described here http://developer.android.com/reference/java/util/regex/Pattern.html

Patterns are compiled regular expressions. In many cases, convenience methods such as String.matches, String.replaceAll and String.split will be preferable, but if you need to do a lot of work with the same regular expression, it may be more efficient to compile it once and reuse it. The Pattern class and its companion, Matcher, also offer more functionality than the small amount exposed by String.

import java.util.regex.Pattern;

public class RegularExpressionTest {

    // Compiled once and reused, as the quoted documentation recommends.
    private static final Pattern NON_DIGITS = Pattern.compile("[^0-9]");
    private static final Pattern NON_LETTERS_OR_SPACES = Pattern.compile("[^a-z A-Z]");

    public static void main(String[] args) {
        System.out.println("String is = " + getOnlyStrings("!&(*^*(^(+one(&(^()(*)(*&^%$#@!#$%^&*()("));
        System.out.println("Number is = " + getOnlyDigits("&(*^*(^(+91-&*9hi-639-0097(&(^("));
    }

    // Keeps only the ASCII digits 0-9.
    public static String getOnlyDigits(String s) {
        return NON_DIGITS.matcher(s).replaceAll("");
    }

    // Keeps only ASCII letters and spaces.
    public static String getOnlyStrings(String s) {
        return NON_LETTERS_OR_SPACES.matcher(s).replaceAll("");
    }
}

Result

String is = one
Number is = 9196390097
turbandroid
  • 2,206
  • 19
  • 30
15

Try the replaceAll() method of the String class.

Here is its signature, with return type and parameters:

public String replaceAll(String regex,
                         String replacement)

Example:

String str = "Hello +-^ my + - friends ^ ^^-- ^^^ +!";
str = str.replaceAll("[-+^]", "");

It should remove all the {'^', '+', '-'} chars that you wanted to remove!

Shashank Agrawal
  • 21,660
  • 9
  • 71
  • 105
omt66
  • 4,015
  • 1
  • 19
  • 20
7

To remove special characters:

String t2 = "!@#$%^&*()-';,./?><+abdd";

t2 = t2.replaceAll("\\W+","");

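Output will be: abdd

Note that \W matches everything except a-z, A-Z, 0-9 and _, so this also strips whitespace and non-ASCII letters while keeping underscores.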
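To make the scope of `\W` concrete, here is a small sketch (the class and method names are hypothetical) showing that spaces disappear along with the punctuation while underscores stay:

```java
class NonWordDemo {
    // Removes every run of characters that are not a-z, A-Z, 0-9 or _.
    static String stripNonWord(String s) {
        return s.replaceAll("\\W+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripNonWord("!@#$%^&*()-';,./?><+abdd")); // abdd
        System.out.println(stripNonWord("snake_case name!"));         // snake_casename
    }
}
```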

Devon_C_Miller
  • 15,714
  • 3
  • 40
  • 68
Akila
  • 655
  • 1
  • 8
  • 11
2

Use the String.replaceAll() method in Java. replaceAll should be good enough for your problem.

Shashank Agrawal
  • 21,660
  • 9
  • 71
  • 105
MT.
  • 1,857
  • 3
  • 17
  • 17
1

You can remove a single character as follows (the + has to be escaped because it is a regex quantifier):

String str = "+919595354336";

String result = str.replaceAll("\\+", "");

System.out.println(result);

OUTPUT:

919595354336
duggu
  • 35,841
  • 11
  • 112
  • 110
Satya
  • 140
  • 1
  • 10
0

If you just want to do a literal replacement in Java, use Pattern.quote(string) to turn any string into a regex that matches it literally.

myString.replaceAll(Pattern.quote(matchingStr), replacementStr)
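A short sketch of the literal approach, with made-up class and method names. It additionally wraps the replacement in `Matcher.quoteReplacement`, since `$` and `\` have special meaning in `replaceAll` replacements, and it shows `String.replace`, which treats both arguments literally and avoids regex entirely.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralReplaceDemo {
    // Regex-based, but with both the target and the replacement quoted as literals.
    static String viaQuote(String s, String target, String replacement) {
        return s.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    // String.replace(CharSequence, CharSequence) never interprets its arguments as regex.
    static String viaReplace(String s, String target, String replacement) {
        return s.replace(target, replacement);
    }

    public static void main(String[] args) {
        System.out.println(viaQuote("1+1=2", "+", "$"));   // 1$1=2
        System.out.println(viaReplace("1+1=2", "+", "$")); // 1$1=2
    }
}
```

For removing a fixed set of single characters, chaining `String.replace` calls or using a character class with `replaceAll` are both reasonable; which one reads better depends on how many characters are involved.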
Tezra
  • 7,096
  • 2
  • 19
  • 59