1

I am trying to count the number of words in a file whose contents I read into a string. I am also displaying the string to make sure I am getting the exact contents of the file.

However, my word count method counts the last word of the previous line and the first word of the next line as one word.

Example: "Test word (newline) test words" outputs as "Test wordtest words"

I tried adding "\n" to my code. The output now displays correctly, but the words are still counted as before.

Any help would be appreciated.

Levy
  • In your `countWords` method, you are only incrementing the count when you see spaces. Increment the count when you see newlines (`\n`) as well. – David Choweller Mar 08 '17 at 19:13
  • Another way to do that is to use [BreakIterator](https://docs.oracle.com/javase/7/docs/api/java/text/BreakIterator.html). – Sanjeev Mar 08 '17 at 19:14
  • Assuming this isn't an assignment, you can also use the String#split method to split the line into tokens and count them (`line.split("\\s+");`). – David Choweller Mar 08 '17 at 19:14
  • You can also use `Character.isWhitespace(line.charAt(i))` to detect multiple types of whitespace (newlines, tabs, etc.). – David Choweller Mar 08 '17 at 19:17
  • This topic will help you count words in a file: [Read next word in java](http://stackoverflow.com/questions/4574041/read-next-word-in-java) – Alper Derya Mar 08 '17 at 19:24
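
The `BreakIterator` suggestion in the comments could look roughly like the sketch below. The class name `BreakIteratorWordCount` and the helper `containsLetterOrDigit` are made up for illustration, and the sketch assumes that a "word" is any segment between word boundaries that contains at least one letter or digit.

```java
import java.text.BreakIterator;

final class BreakIteratorWordCount {

    // Counts the segments between word boundaries that contain a letter or digit.
    // Whitespace and punctuation segments are skipped.
    static int countWords(String text) {
        BreakIterator boundaries = BreakIterator.getWordInstance();
        boundaries.setText(text);

        int count = 0;
        int start = boundaries.first();
        int end = boundaries.next();
        while (end != BreakIterator.DONE) {
            if (containsLetterOrDigit(text, start, end)) {
                count++;
            }
            start = end;
            end = boundaries.next();
        }
        return count;
    }

    // Hypothetical helper: true if text[start, end) has at least one letter or digit.
    private static boolean containsLetterOrDigit(String text, int start, int end) {
        for (int i = start; i < end; i++) {
            if (Character.isLetterOrDigit(text.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(countWords("Test word\ntest words"));
    }
}
```

Because the boundaries come from `BreakIterator` rather than from a fixed delimiter, newlines, tabs, and repeated spaces need no special handling here.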

5 Answers

2

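You can change the condition that checks for spaces to include newlines too. Checking the next character for a newline as well keeps a space followed by a newline from being counted twice:

if ((line.charAt(i) == ' ' || line.charAt(i) == '\n') && line.charAt(i + 1) != ' ' && line.charAt(i + 1) != '\n')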
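
Since the question's own loop is not shown, here is a minimal self-contained sketch of the same idea rather than a patch to the original code. It assumes a `countWords(String line)` method like the one mentioned in the comments, and it tracks whether the loop is currently inside a word instead of peeking at `charAt(i + 1)`, so it cannot read past the end of the string. The class name `WhitespaceWordCount` is made up for illustration.

```java
final class WhitespaceWordCount {

    // Counts runs of non-whitespace characters. Spaces, tabs, and newlines
    // all end a word because Character.isWhitespace covers each of them.
    static int countWords(String line) {
        int count = 0;
        boolean inWord = false;
        for (int i = 0; i < line.length(); i++) {
            if (Character.isWhitespace(line.charAt(i))) {
                inWord = false;
            } else if (!inWord) {
                // First character of a new word.
                inWord = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWords("Test word\ntest words")); // prints 4
    }
}
```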
Vladimír Bielený
2
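/*
 * Counts the number of words using a regular expression.
 * The input is trimmed first so that leading whitespace
 * does not produce an empty first token.
 */
public int countWord(String word) {
    String trimmed = word.trim();
    return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
}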
Thiago Gama
0

You can also count using regular expressions.

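import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static int countWords(String line) {
    // \w+ matches a run of letters, digits, or underscores,
    // so a word like "don't" is counted as two matches.
    Pattern pattern = Pattern.compile("\\w+");
    Matcher matcher = pattern.matcher(line);

    int count = 0;
    while (matcher.find()) {
        count++;
    }
    return count;
}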
Eduardo Dennis
0

Here's the reason why "Test word (newline) test words" outputs as "Test wordtest words":

in.nextLine() returns the line as a String, excluding the newline character at the end of the line, so when the lines are concatenated the last word of one line runs directly into the first word of the next. See https://docs.oracle.com/javase/8/docs/api/java/util/Scanner.html#nextLine--

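It would be more efficient, though, to keep a running word count instead of appending the lines to a String and then counting at the end. Assuming `in` is the Scanner reading the file, the loop would be something like this:

int wordCount = 0;
while (in.hasNextLine()) {
    // Trim so leading or trailing spaces do not create empty tokens.
    String line = in.nextLine().trim();
    if (!line.isEmpty()) {
        // Words on this line, separated by one or more whitespace characters.
        wordCount += line.split("\\s+").length;
    }
}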
Joe
  • Sorry, I forgot to mention that this is for an assignment and we are required to pass it into a string. – Levy Mar 08 '17 at 20:34
  • No problem :) You may want to use a StringBuilder instead of a String. Every time you do a String concatenation (line += ..), a new String is created. Your code would be something like this with a StringBuilder: `StringBuilder line = new StringBuilder(); int lines = 0; while(in.hasNextLine()){ lines++; line.append("\n").append(in.nextLine()); } ... int words = countWords(line.toString());` – Joe Mar 08 '17 at 20:46
  • Thanks for the input. Helps me through this class more than you think :) – Levy Mar 08 '17 at 20:53
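
Putting the StringBuilder comment together with the assignment's requirement to pass the contents into a string, a runnable sketch might look like the following. The class name `FileWordCount` and the method `readAll` are made up for illustration, and the Scanner reads from an in-memory string here rather than from a real file so that the example stays self-contained.

```java
import java.util.Scanner;

final class FileWordCount {

    // Reads every line and re-appends the newline that nextLine() strips,
    // so words on adjacent lines stay separated.
    static String readAll(Scanner in) {
        StringBuilder text = new StringBuilder();
        while (in.hasNextLine()) {
            text.append(in.nextLine()).append('\n');
        }
        return text.toString();
    }

    // Counts whitespace-separated tokens. Trimming first avoids an empty
    // leading token when the text starts with whitespace.
    static int countWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        // Stand-in for the file contents from the question's example.
        try (Scanner in = new Scanner("Test word\ntest words")) {
            String contents = readAll(in);
            System.out.print(contents);
            System.out.println(countWords(contents)); // prints 4
        }
    }
}
```

For a real file, the same `readAll` call should work with a Scanner constructed from a `java.io.File`.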
0

Why don't you just

String sentence = "This is a sentence.";
String[] words = sentence.split(" ");
System.out.println(words.length);

split your string at the " " and count the words.
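
One caveat for multi-line input like the question's: a single-space delimiter does not split at newlines, and repeated spaces produce empty tokens. The sketch below compares `split(" ")` with `split("\\s+")` on the question's example. The class name `SplitComparison` and its two helper methods are made up for illustration.

```java
final class SplitComparison {

    // Splits on a single space only. Newlines are not treated as separators.
    static int countBySpace(String text) {
        return text.split(" ").length;
    }

    // Splits on any run of whitespace (spaces, tabs, newlines).
    static int countByWhitespace(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        String text = "Test word\ntest words";
        System.out.println(countBySpace(text));      // prints 3: "word\ntest" stays one token
        System.out.println(countByWhitespace(text)); // prints 4
    }
}
```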

Master Azazel