
I have a text file with the following String:

I want to know. bye bye. I found you. I hate you. I hear you.

What I want to do is to search for a target sentence inside a file. This is the code that I use:

public String lookFor(String target, File targetDestination) throws FileNotFoundException {
    Scanner scan = new Scanner(targetDestination);

    scan.useDelimiter("\\. ");
    while (scan.hasNext()) {
        if (scan.next().compareTo(target) == 0)
            return target;
    }
    return "Sorry,(" + target + ") cannot be found!";
}

The code works fine whenever I look for a sentence such as "I hate you": it returns "I hate you". But when I look for the last sentence, "I hear you", it says that it's not found. Only when I add a dot, "I hear you.", does it return it.

Can anyone explain what is happening exactly? I feel that it's the delimiter, but I don't know much about regular expressions.

Nick L.
fox
  • Do you know what `useDelimiter` does? `\\. ` is a regex that matches a dot followed by a space. – Maroun Apr 26 '15 at 11:31
  • Well, I have to do that in order to get back the same sentence. When I type a sentence to look for inside my text file, I need to scan the dot with the space that follows, otherwise I would have to put a space before the sentence I look for @MarounMaroun – fox Apr 26 '15 at 11:31
  • So, you got your answer: a dot alone is not "a dot followed by a space". So, the last token is `I hear you.`, not `I hear you`, since the last dot is not followed by a space, and is thus not a delimiter. Using a debugger would allow finding that by yourself. – JB Nizet Apr 26 '15 at 11:32
  • @JBNizet I know, but I would need to include a space before each search. For example: ` I hear you` instead of `I hear you`. That is why I need to use the dot with the space – fox Apr 26 '15 at 11:38
  • No. You can simply use a dot as the delimiter, and trim the token before comparing. That will have the additional advantage of also working if the dot is followed by two spaces, or a tab, or an end of line. – JB Nizet Apr 26 '15 at 11:41
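
The approach from the last comment (a plain dot as the delimiter, then `trim()` before comparing) could look roughly like the sketch below. The class name `TrimLookup` and the method `contains` are made up for illustration, and the text is read from a String instead of a file to keep the example self-contained.

```java
import java.util.Scanner;

class TrimLookup {
    // Hypothetical helper: true if `target` equals one of the dot-separated
    // sentences in `text`, ignoring whitespace around each sentence.
    static boolean contains(String text, String target) {
        try (Scanner scan = new Scanner(text)) {
            scan.useDelimiter("\\.");          // a literal dot, nothing else
            while (scan.hasNext()) {
                // trim() removes the spaces, tabs or newlines left around the token
                if (scan.next().trim().equals(target)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Two spaces, a tab and a newline after the dots, and no space after the last one.
        String text = "I want to know.  bye bye.\tI hate you.\nI hear you.";
        System.out.println(contains(text, "I hate you")); // prints true
        System.out.println(contains(text, "I hear you")); // prints true
        System.out.println(contains(text, "I hear"));     // prints false
    }
}
```

Because the whitespace is handled by `trim()` rather than by the regex, the delimiter stays trivial, at the cost of also splitting on dots inside a sentence.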

2 Answers


"\\. " is a regex that matches a dot followed by a space.

Look at your sentence:

I want to know. bye bye. I found you. I hate you. I hear you.
              ↑        ↑            ↑           ↑
      dot followed by a space    dot followed by a space 

However, the last dot, the one in "I hear you.", is not followed by a space, so it is not treated as a delimiter and the last token is "I hear you." (with the dot).
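
To see this without a debugger, here is a small sketch that prints every token the Scanner produces with the `"\\. "` delimiter. The class name `DelimiterDemo` and the helper `tokens` are made up for illustration, and the sentence is read from a String rather than from a file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterDemo {
    // Hypothetical helper: returns the tokens Scanner yields for the given delimiter regex.
    static List<String> tokens(String text, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner scan = new Scanner(text)) {
            scan.useDelimiter(delimiterRegex);
            while (scan.hasNext()) {
                result.add(scan.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "I want to know. bye bye. I found you. I hate you. I hear you.";
        // Brackets make leading or trailing characters visible.
        for (String token : tokens(text, "\\. ")) {
            System.out.println("[" + token + "]");
        }
        // prints:
        // [I want to know]
        // [bye bye]
        // [I found you]
        // [I hate you]
        // [I hear you.]
    }
}
```

Only the last token keeps its dot, which is why comparing it with "I hear you" fails.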

Note: Using a debugger will save you time and help you understand your code better.

Maroun
  • Believe me, I knew all of that before, and I used a debugger many times, but I have to use the dot with the space so I can type the target sentence alone and get a match. Otherwise I would have to put a space before each sentence I look for – fox Apr 26 '15 at 11:41
  • @fox since `useDelimiter` accepts a regex, you can use the following `\\.\\s*`. – Maroun Apr 26 '15 at 11:43
  • Would you please explain what the expression `\\.\\s*` means exactly? – fox Apr 26 '15 at 11:51
  • @fox `\\.` matches a dot, `\\s` matches a whitespace character (space, tab, newline), `\\s*` means "match *zero or more* whitespace characters". – Maroun Apr 26 '15 at 11:52

All comments and answers are right.

You need to change the delimiter regex to "\\.\\s*" so that the dot may be followed by zero or more whitespace characters.
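
As a rough sketch, this is the method from the question with that delimiter. Two further changes are my own choices and not part of the answer: `equals` replaces `compareTo(...) == 0`, and the Scanner is closed with try-with-resources. The class name `LookForFixed` and the temporary file in `main` exist only to make the example runnable.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Scanner;

class LookForFixed {
    static String lookFor(String target, File targetDestination) throws FileNotFoundException {
        try (Scanner scan = new Scanner(targetDestination)) {
            // A dot followed by zero or more whitespace characters.
            scan.useDelimiter("\\.\\s*");
            while (scan.hasNext()) {
                if (scan.next().equals(target)) {
                    return target;
                }
            }
        }
        return "Sorry,(" + target + ") cannot be found!";
    }

    public static void main(String[] args) throws IOException {
        // Throwaway file holding the sentence from the question.
        File file = File.createTempFile("sentences", ".txt");
        file.deleteOnExit();
        String text = "I want to know. bye bye. I found you. I hate you. I hear you.\n";
        Files.write(file.toPath(), text.getBytes());

        System.out.println(lookFor("I hate you", file)); // prints I hate you
        System.out.println(lookFor("I hear you", file)); // prints I hear you
    }
}
```

Since `\\s` also matches a line break, a newline after the final dot is consumed as part of the delimiter.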

Alex Salauyou
  • Thank you very much. That is what I was looking for. Would you please explain a little bit more what the regex does exactly, so I can better understand what happened? – fox Apr 26 '15 at 11:44
  • @fox Should your delimiter also affect `I leave in U.S.A.` and let you find `I leave in U` `S` `A`? – Pshemo Apr 26 '15 at 11:48
  • If you need to avoid splitting acronyms like `U.S.A.`, use `"\\.(?:\\s+|$)"`. It matches a dot followed by one or more whitespace characters *or* a dot at the end of the input. – Alex Salauyou Apr 26 '15 at 11:50
  • `\s+` means "1 or more whitespace characters", `$` means "end of input" (end of line only in multiline mode), `(?:X|Y)` means "X or Y", the same as `(X|Y)` but without capturing the group, which can be slightly faster. – Alex Salauyou Apr 26 '15 at 11:53
  • @SashaSalauyou Thank you for your help. That is really smart! – fox Apr 26 '15 at 11:57
  • @SashaSalauyou Consider moving `"\\.(?:\\s+|$)"` to your answer, with an explanation of when it should be used and how it works (non-capturing groups, OR operator, whitespace character class, anchor). If you don't want to describe the details, you can link to articles on http://www.regular-expressions.info/ where these subjects are explained. – Pshemo Apr 26 '15 at 12:47
  • @Pshemo okay, but please wait a little. Now I'm writing another question. – Alex Salauyou Apr 26 '15 at 13:05
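
The difference between the two delimiters discussed in these comments can be checked with a short sketch like the one below. The class name `AcronymDelimiterDemo`, the helper `tokens` and the sample sentence are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class AcronymDelimiterDemo {
    // Hypothetical helper: returns the tokens Scanner yields for the given delimiter regex.
    static List<String> tokens(String text, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner scan = new Scanner(text)) {
            scan.useDelimiter(delimiterRegex);
            while (scan.hasNext()) {
                result.add(scan.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "I live in U.S.A. I hear you.";

        // Dot followed by zero or more whitespace: also splits inside the acronym.
        System.out.println(tokens(text, "\\.\\s*"));
        // prints [I live in U, S, A, I hear you]

        // Dot followed by whitespace or by the end of the input: acronym stays together.
        System.out.println(tokens(text, "\\.(?:\\s+|$)"));
        // prints [I live in U.S.A, I hear you]
    }
}
```

Note that the dot closing the acronym still counts as a delimiter when whitespace follows it, so the first token comes back as `I live in U.S.A` without its final dot.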