Parsing a text file using a Java Scanner

Question

I am trying to create a method that parses a text file and returns, as a string, the URL that follows the colon. The text file looks as follows (it is for a bot):

keyword:url
keyword,keyword:url

So each line consists of either one keyword and a URL, or several comma-separated keywords and a URL.

Could anyone give me a bit of direction as to how to do this?

I believe I need to use a Scanner, but I couldn't find anything from anyone trying to do something similar.

Thank you.

Edit: my attempt using the suggestions below. It doesn't quite work; any help would be appreciated.

    public static void main(String[] args) throws IOException {
        String sCurrentLine = "";
        String key = "hello";

        BufferedReader reader = new BufferedReader(
                new FileReader(("sites.txt")));
        Scanner s = new Scanner(sCurrentLine);
        while ((sCurrentLine = reader.readLine()) != null) {
            System.out.println(sCurrentLine);
            if(sCurrentLine.contains(key)){
                System.out.println(s.findInLine("http"));
            }
        }
    }

output:

    hello,there:http://www.facebook.com
    null
    whats,up:http:/google.com

sites.txt:

    hello,there:http://www.facebook.com
    whats,up:http:/google.com

Use a `BufferedReader` to get the lines of the file and then you can use a `Scanner` or `split` or, probably easiest, regex to tokenise the line. — Boris the Spider, Aug 29 '13 at 07:58

score 2 · Answer 1 · answered Aug 29 '13 at 11:57

You should read the file line by line with a BufferedReader, as you are doing. I would then recommend parsing each line using regex.

The pattern

(?<=:)http://[^\\s]++

will do the trick. This pattern says:

http://
followed by one or more non-whitespace characters, matched possessively: [^\\s]++
and preceded by a colon: (?<=:)

Here is a simple example using a String to proxy your file:

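import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main { // class name is arbitrary
    public static void main(String[] args) throws Exception {
        final String file = "hello,there:http://www.facebook.com\n"
                + "whats,up:http://google.com";
        // An http:// URL that is immediately preceded by a colon.
        final Pattern pattern = Pattern.compile("(?<=:)http://[^\\s]++");
        final Matcher m = pattern.matcher(""); // one Matcher, reset for each line
        try (final BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(file.getBytes(StandardCharsets.UTF_8)), StandardCharsets.UTF_8))) {
            String line;
            while ((line = bufferedReader.readLine()) != null) {
                m.reset(line);
                while (m.find()) {
                    System.out.println(m.group());
                }
            }
        }
    }
}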

Output:

http://www.facebook.com
http://google.com
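
To return the URL for one particular keyword rather than print every URL, the same pattern could be combined with a check on the text before the colon. This is only a minimal sketch: the class `UrlLookup` and the method `findUrl` are hypothetical names, and it assumes keywords are matched exactly and never contain a colon.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlLookup {
    // Same pattern as in the answer: an http:// URL immediately preceded by a colon.
    private static final Pattern URL = Pattern.compile("(?<=:)http://[^\\s]++");

    /** Returns the URL on the first line that lists the keyword, or null if no line does. */
    static String findUrl(BufferedReader reader, String keyword) throws IOException {
        final Matcher m = URL.matcher("");
        String line;
        while ((line = reader.readLine()) != null) {
            m.reset(line);
            if (!m.find()) {
                continue; // no URL on this line
            }
            // m.start() - 1 is the index of the colon required by the lookbehind;
            // everything before it is the comma-separated keyword list.
            final String[] keywords = line.substring(0, m.start() - 1).trim().split(",");
            if (Arrays.asList(keywords).contains(keyword)) {
                return m.group();
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // A String stands in for sites.txt, as in the example above.
        final String file = "hello,there:http://www.facebook.com\n"
                + "whats,up:http://google.com";
        try (BufferedReader reader = new BufferedReader(new StringReader(file))) {
            System.out.println(findUrl(reader, "up"));
        }
    }
}
```

Returning null for an unknown keyword is a design choice of the sketch; an `Optional<String>` or an exception may suit the bot better.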

score 0 · Answer 2 · answered Aug 29 '13 at 08:01

Use a BufferedReader; for the text parsing you can use regular expressions.

slanecek

score 0 · Answer 3 · answered Aug 29 '13 at 08:28

You should use the split method. The limit of 2 splits at the first colon only, so the colon inside http:// is left alone:

String[] strCollection = yourScannedStr.split(":", 2);
String extractedUrl = strCollection[1];
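
As a fuller sketch of the split approach, the example below splits each line once for the URL and once more for the keywords, and collects the result in a map. The class `KeywordIndex`, the method `parse`, and the decision to skip lines without a colon are assumptions made for illustration, not part of the answer above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeywordIndex {
    /** Builds a keyword -> URL map from lines of the form "kw1,kw2:url". */
    static Map<String, String> parse(List<String> lines) {
        final Map<String, String> urlsByKeyword = new LinkedHashMap<>();
        for (String line : lines) {
            // parts[0] is the keyword list, parts[1] is the URL.
            final String[] parts = line.trim().split(":", 2);
            if (parts.length < 2) {
                continue; // assumption: silently skip lines without a colon
            }
            for (String keyword : parts[0].split(",")) {
                urlsByKeyword.put(keyword.trim(), parts[1].trim());
            }
        }
        return urlsByKeyword;
    }

    public static void main(String[] args) {
        final Map<String, String> index = parse(Arrays.asList(
                "hello,there:http://www.facebook.com",
                "whats,up:http://google.com"));
        System.out.println(index.get("hello"));
    }
}
```

Building the map once and looking keywords up afterwards avoids re-reading the file for every message the bot handles. If a keyword appears on several lines, the last one wins in this sketch.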

PythaLye

score -1 · Answer 4 · edited May 23 '17 at 11:51

Reading a .txt file using Scanner class in Java

http://www.tutorialspoint.com/java/java_string_substring.htm

That should help you.
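
A rough sketch of how those two pieces might fit together: a `Scanner` reads the input line by line, and `indexOf` plus `substring` take everything after the first colon. The class `ScannerUrlReader` and the method `urlFor` are hypothetical names. The sketch reads from a `String` to stay self-contained; a `Scanner` constructed from `new File("sites.txt")` could be passed in the same way.

```java
import java.util.Scanner;

class ScannerUrlReader {
    /** Returns the URL of the first line whose keyword list contains the keyword, else null. */
    static String urlFor(Scanner scanner, String keyword) {
        while (scanner.hasNextLine()) {
            final String line = scanner.nextLine().trim();
            final int colon = line.indexOf(':'); // the first colon separates keywords from the URL
            if (colon < 0) {
                continue; // assumption: ignore lines without a colon
            }
            for (String candidate : line.substring(0, colon).split(",")) {
                if (candidate.trim().equals(keyword)) {
                    return line.substring(colon + 1);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        final String contents = "hello,there:http://www.facebook.com\n"
                + "whats,up:http://google.com";
        try (Scanner scanner = new Scanner(contents)) {
            System.out.println(urlFor(scanner, "hello"));
        }
    }
}
```

Comparing whole keywords with `equals`, rather than calling `contains` on the entire line, avoids matching a keyword that merely appears inside the URL or inside a longer keyword.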

answered Aug 29 '13 at 08:00

Ben Dale
