
I am writing a program that reads input from a file and builds sentences from the words. I inspect each word to check whether it ends with one of the sentence terminators, which are:

  • period (.)
  • exclamation mark (!)
  • and question mark (?)

to decide whether I should create a new instance of my sentence object. This is what I have come up with so far:

ArrayList<Sentence2> sentences = new ArrayList<>();
String wordsJoin = "";
int numOfWords = 0;
try{
    input = new BufferedReader(new FileReader("final.txt"));
    strLine = input.readLine();
    while(strLine != null){
        String[] tokens = strLine.split("\\s+");
        for (int i = 0; i < tokens.length; i++){
            String s = tokens[i];
            if(s.charAt(s.length()-1) != '.' ||s.charAt(s.length()-1) !='?' ||s.charAt(s.length()-1) != '!'){
                wordsJoin += tokens[i] + " ";
                numOfWords += tokens.length;
            }else{
                sentences.add(new Sentence2(wordsJoin,numOfWords));


            }
        }
        strLine = input.readLine();
    }

The problem is I am getting out of bounds exception. The stack trace is here:

Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: -1
    at java.lang.String.charAt(String.java:658)

Long story short, my program reads the input and decides whether the last character of each word is a sentence terminator. If it is, I create an instance of the sentence class, which holds the sentence and the number of words it contains.

Some of the text from the file that I need to process is here:

Text that follows is based on the Wikipedia page on cryptography! Cryptography is the practice and study of hiding information. In modern times, cryptography is considered to be a branch of both mathematics and computer science and is affiliated closely with information theory, computer security, and engineering. Cryptography is used in applications present in technologically advanced societies; examples include the security of ATM cards, computer passwords, and electronic commerce, which all depend on cryptography.

I really need help with this, please. I have been going over it for quite some time now.

Saad
  • How is this question a duplicate of the array index out of bounds exception one? Mine is about processing strings using regex, whereas the other one is just about strings. @JarrodRoberson – Saad Oct 28 '16 at 16:34

3 Answers


Your regex is wrong. To split a String to get every word, you should use split("\\s+").

import java.util.ArrayList;

public class Main {
    public static void main(String... args) {
        ArrayList<Sentence2> sentences = new ArrayList<>();
        String wordsJoin = "";
        int numOfWords = 0;

        String strLine = "It will be split? Sentence by sentence? Sure!";

        String[] tokens = strLine.split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            String s = tokens[i];
            // A blank line or leading whitespace yields an empty token;
            // calling charAt(-1) on it is what throws the exception.
            if (s.isEmpty()) {
                continue;
            }

            wordsJoin += s + " ";
            numOfWords++; // one word per token, not tokens.length

            // The word ends a sentence if its last character is a terminator
            char last = s.charAt(s.length() - 1);
            if (last == '.' || last == '?' || last == '!') {
                sentences.add(new Sentence2(wordsJoin.trim(), numOfWords));
                // Reset the accumulators for the next sentence
                wordsJoin = "";
                numOfWords = 0;
            }
        }

        for (Sentence2 sentence2 : sentences) {
            System.out.println(sentence2.wordsJoin + " " + sentence2.numOfWords);
        }
    }

    public static class Sentence2 {
        private String wordsJoin;
        private int numOfWords;

        public Sentence2(String wordsJoin, int numOfWords) {
            this.wordsJoin = wordsJoin;
            this.numOfWords = numOfWords;
        }
    }
}
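
When the input spans several lines, as it does when reading a file, blank lines and leading whitespace produce empty tokens, and a sentence may continue from one line to the next. Here is a minimal sketch of that case, assuming a hypothetical `SentenceSplitter` class and `toSentences` helper (neither name is from the original post) and returning plain strings instead of `Sentence2` objects to keep it short:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SentenceSplitter {
    // Hypothetical helper (not from the original post): builds sentences across
    // several input lines and returns each one as "<sentence> -> <word count>".
    public static List<String> toSentences(List<String> lines) {
        List<String> sentences = new ArrayList<>();
        StringBuilder words = new StringBuilder();
        int numOfWords = 0;
        for (String line : lines) {
            for (String s : line.split("\\s+")) {
                if (s.isEmpty()) {
                    continue; // produced by a blank line or leading whitespace
                }
                if (words.length() > 0) {
                    words.append(' ');
                }
                words.append(s);
                numOfWords++;
                char last = s.charAt(s.length() - 1);
                if (last == '.' || last == '?' || last == '!') {
                    sentences.add(words.toString() + " -> " + numOfWords);
                    words.setLength(0); // start a new sentence
                    numOfWords = 0;
                }
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Includes a blank line and a line with leading whitespace on purpose
        List<String> lines = Arrays.asList("Hello there!", "", "  Is this", "working? Yes.");
        for (String sentence : toSentences(lines)) {
            System.out.println(sentence);
        }
    }
}
```

For a real file, the lines could come from the existing `readLine()` loop or from `java.nio.file.Files.readAllLines`; the guard on empty tokens is the part that matters.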
Paulo
  • I might have made a mistake here; otherwise, in my original program it is like yours. – Saad Oct 21 '16 at 00:17
  • But won't that mean that every last character has to be .!? in order for the statement to be true? – Saad Oct 21 '16 at 00:23
  • Just wanted to see how that would go. I just added AND and it still shows the same error. – Saad Oct 21 '16 at 00:25
  • It is still throwing the same exception. I just copied your code and the same error popped up. – Saad Oct 21 '16 at 00:29
  • @Saad Try copying the code now. First, test it with a single string. If the code runs, you can extend it to read the string from the file. – Paulo Oct 21 '16 at 00:29
  • @Saad You have to check that strLine is not empty. Otherwise it will throw an exception. – Paulo Oct 21 '16 at 00:41
  • It runs when I try yours; however, when I do it with my file, it shows the same error. Yes, I am checking for an empty string in my while loop. – Saad Oct 21 '16 at 00:46

Always check the length of each token returned from split(). A token may well be an empty string, in which case token.charAt(token.length()-1) asks for index -1, which does not exist.
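
To make this concrete, here is a small sketch of when `split("\\s+")` returns empty tokens and one way to filter them out. The `SplitDemo` class and `nonEmptyTokens` helper are hypothetical names used only for illustration:

```java
import java.util.ArrayList;
import java.util.List;

public class SplitDemo {
    // Hypothetical helper: splits on whitespace and drops empty tokens.
    public static List<String> nonEmptyTokens(String line) {
        List<String> result = new ArrayList<>();
        for (String token : line.split("\\s+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A blank line still yields one token: the empty string itself
        System.out.println("".split("\\s+").length);            // prints 1
        // Leading whitespace yields an empty first token
        System.out.println("  two words".split("\\s+").length); // prints 3
        System.out.println(nonEmptyTokens("  two words"));      // prints [two, words]
        System.out.println(nonEmptyTokens(""));                 // prints []
    }
}
```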

Also take a look at "How exactly does String.split() method in Java work when regex is provided?"

Florian Heer

Make sure you check that the string is not empty before calling charAt() on it. Something like the following:

int len = s.length();
// The last valid index is len - 1; '\0' marks "no last character"
char last = len > 0 ? s.charAt(len - 1) : '\0';
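
An alternative that avoids `charAt()` altogether is a regular expression, since `String.matches` simply returns false for an empty string. A minimal sketch, assuming a hypothetical `TerminatorCheck` class and `endsWithTerminator` helper (not from the original post):

```java
public class TerminatorCheck {
    // Hypothetical helper: true if the word ends with '.', '!' or '?'.
    // The pattern needs at least one character, so "" never matches.
    public static boolean endsWithTerminator(String word) {
        return word.matches(".*[.!?]");
    }

    public static void main(String[] args) {
        System.out.println(endsWithTerminator("Sure!"));  // prints true
        System.out.println(endsWithTerminator("word"));   // prints false
        System.out.println(endsWithTerminator(""));       // prints false
    }
}
```

Inside a tight loop the explicit length check is cheaper, because `matches` processes the pattern on every call; the regex form is mainly easier to read.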
Zamrony P. Juhara