-4

I would like to extract every word from a file that contains different kinds of information, such as dates, percentages and some strings.

Input:

21-02-2015 wordA 22 wordB wordC

Result:

wordA wordB wordC

Please help; I am new to regex.

Med BEN
  • Have you tried anything? If you just have us give you the solution you will stay new to the subject forever – Dragondraikk Mar 16 '15 at 13:00
  • What's the rule here? Find words that begin with `word`? Find the second, fourth and fifth word? Find words not containing digits (which seems almost the opposite to the title)? Find words consisting of five characters? Also; are you using Java or Apple's library? Surely not both. And what has this got to do with lookarounds? – Biffen Mar 16 '15 at 13:08
  • @Biffen Actually, I want to find every string that makes up a word (e.g. Hello or Name). I am using Java, and my problem is that I cannot seem to catch every word in my input when it contains special characters – Med BEN Mar 16 '15 at 13:13
  • @MedBEN Then you might want to change the title of the question. And please clean up the tags. Anyway, in what way is `22` not a word? – Biffen Mar 16 '15 at 13:14
  • Will do about the title, but 22 is made of digits and I don't want it in my result – Med BEN Mar 16 '15 at 13:16
  • @MedBEN So is that the rule, then? You want to match all words that don't contain digits? – Biffen Mar 16 '15 at 13:19
  • Yes, but the words I am working with contain some special characters. – Med BEN Mar 16 '15 at 13:22
  • @MedBEN Which of the characters in `21-02-2015 wordA 22 wordB wordC` would you say are special? And I'm still curious about the tags; are you really parsing a regular language using `NSRegularExpression` in Java? – Biffen Mar 16 '15 at 13:24
  • @Biffen This is the regex I am using to fetch the words in my input: "([a-zA-Z]+)". It is not able to find all the words (e.g. Førtids) – Med BEN Mar 16 '15 at 13:30
  • @MedBEN How about split by spaces and then match each element against `[^\d]+`? – Biffen Mar 16 '15 at 13:35
  • Actually, I tried this one and it works: (([a-zA-Z]+)(\W)?([a-zA-Z]+)) – Med BEN Mar 16 '15 at 13:41

2 Answers

0

This is the regex I found that retrieves words, including ones that contain a special character:

 (([a-zA-Z]+)(\W)?([a-zA-Z]+))
  • ([a-zA-Z]+) matches one or more letters in a-z or A-Z
  • (\W)? optionally matches one special (non-word) character
  • ([a-zA-Z]+) matches the rest of the word when the special character is in the middle
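As a quick check, the pattern above can be run against the question's input. This is only a sketch: the class name `SelfAnswerRegexDemo` and the helper `findMatches` are made up for illustration. Note that `\W` also matches a space, so two adjacent words can come back as a single match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original answer.
class SelfAnswerRegexDemo {

  // Letters, optionally one non-word character, then letters again.
  private static final Pattern WORD =
      Pattern.compile("(([a-zA-Z]+)(\\W)?([a-zA-Z]+))");

  // Collects every match of the pattern in the input, in order.
  static List<String> findMatches(String input) {
    List<String> matches = new ArrayList<>();
    Matcher matcher = WORD.matcher(input);
    while (matcher.find()) {
      matches.add(matcher.group());
    }
    return matches;
  }

  public static void main(String... args) {
    // Each match is printed in brackets so embedded spaces are visible.
    for (String match : findMatches("21-02-2015 wordA 22 wordB wordC")) {
      System.out.println("[" + match + "]");
    }
  }
}
```

On this input the sketch prints `[wordA]` and then `[wordB wordC]`: the space between the last two words is consumed by `(\W)?`. The pattern also needs at least two letters, so a one-letter word is skipped.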
Thomas Dickey
Med BEN
-1

Java's regex implementation supports character class intersection, for which this is a textbook use case.

The class [\w&&[^\d]] matches any word character that is not a digit. Combined with the Pattern.UNICODE_CHARACTER_CLASS flag, it should also match the ‘special characters’, that is, non-ASCII letters such as ø.

Thus this code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class test {

  public static void main(String... args) {
    String  input   = "21-02-2015 wordA 22 wordB wordC Førtids";
    // Word characters intersected with "not a digit";
    // the flag makes \w and \d Unicode-aware.
    Matcher matcher = Pattern.compile("[\\w&&[^\\d]]+",
                                      Pattern.UNICODE_CHARACTER_CLASS)
                      .matcher(input);

    // Print each run of non-digit word characters on its own line.
    while (matcher.find()) {
      System.out.println(matcher.group());
    }
  }

}

Produces:

wordA
wordB
wordC
Førtids
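To see what the flag contributes, the same pattern can be compiled with and without `Pattern.UNICODE_CHARACTER_CLASS`. This is a minimal sketch: the class `UnicodeFlagDemo` and the helper `findWords` are hypothetical names, and `ø` is written as `\u00f8` so the result does not depend on the source file's encoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original answer.
class UnicodeFlagDemo {

  // Finds runs of non-digit word characters, compiling with the given flags.
  static List<String> findWords(String input, int flags) {
    Matcher matcher = Pattern.compile("[\\w&&[^\\d]]+", flags).matcher(input);
    List<String> words = new ArrayList<>();
    while (matcher.find()) {
      words.add(matcher.group());
    }
    return words;
  }

  public static void main(String... args) {
    String input = "21-02-2015 wordA 22 wordB wordC F\u00f8rtids";
    // No flags: \w is limited to ASCII letters, digits and underscore.
    System.out.println(findWords(input, 0));
    // Unicode-aware \w and \d.
    System.out.println(findWords(input, Pattern.UNICODE_CHARACTER_CLASS));
  }
}
```

Without the flag, `Førtids` is split into `F` and `rtids`; with the flag it is returned whole. The class still admits the underscore, so if that matters, `\p{L}+` (letters only) is one alternative to consider.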
Biffen