14

I want to count the words in a specific string so I can validate it and prevent users from writing more than 100 words, for example.

I wrote this function, but I don't think it's robust enough. I used the explode function with a space as the delimiter, but what if the user puts two spaces instead of one? Can you give me a better way to do that?

function isValidLength($text , $length){

   $text  = explode(" " , $text );
   if(count($text) > $length)
          return false;
   else
          return true;
}
Waseem Senjer
  • http://stackoverflow.com/questions/21652261/using-str-word-count-for-utf8-texts – trante Feb 10 '14 at 17:29
  • You might find [`count(s($str)->words())`](https://github.com/delight-im/PHP-Str/blob/8fd0c608d5496d43adaa899642c1cce047e076dc/src/Str.php#L363) helpful, as found in [this standalone library](https://github.com/delight-im/PHP-Str). – caw Jul 27 '16 at 03:56

9 Answers

23

Maybe str_word_count could help

http://php.net/manual/en/function.str-word-count.php

$Tag  = 'My Name is Gaurav'; 
$word = str_word_count($Tag);
echo $word;
Bangkokian
Francesco Laurita
  • Just one other has mentioned `str_word_count`. Isn't it appropriate? – Francesco Laurita Jan 24 '11 at 20:35
  • 14
    str_word_count is BAD! It counts "the" multiple times if it is contained in bigger words like "theme" "theory" etc. str_word_count sucks and I see it all over on stackoverflow – giorgio79 Oct 14 '11 at 13:35
  • 7
    @giorgio79 What about offering an alternative rather than ranting like a madman. – Henrik Petterson Aug 09 '15 at 18:35
  • This function also counts hyphens as words. I got better results by first using preg_replace to replace all non-alpha characters, e.g.: str_word_count(preg_replace('/[^a-z]+/i', ' ', $string)) – TURTLE Dec 07 '15 at 19:08
  • str_word_count will consider "Yet" and "yet" as two different words. which is fair enough I guess. This can be solved by lower casing the string prior to testing. – prog_24 Mar 14 '17 at 08:49
  • `str_word_count()` is always returning 1 for some reason – mrid Jun 19 '18 at 05:11
16

Try this:

function get_num_of_words($string) {
    $string = preg_replace('/\s+/', ' ', trim($string));
    $words = explode(" ", $string);
    return count($words);
}

$str = "Lorem ipsum dolor sit amet";
echo get_num_of_words($str);

This will output: 5
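
One edge case worth knowing: explode() on an empty string returns an array with one empty element, so the function above reports 1 for empty or whitespace-only input. Here is a minimal sketch of a guard against that; the name get_num_of_words_safe is hypothetical, not part of the original answer.

```php
<?php
// Hypothetical variant of get_num_of_words() that returns 0 for empty input.
function get_num_of_words_safe(string $string): int
{
    // Collapse every run of whitespace into one space, then trim the ends.
    $string = trim(preg_replace('/\s+/', ' ', $string));

    // explode(' ', '') would yield [''], so handle the empty case explicitly.
    return $string === '' ? 0 : count(explode(' ', $string));
}

echo get_num_of_words_safe("Lorem ipsum dolor sit amet"), "\n"; // 5
echo get_num_of_words_safe("   "), "\n";                        // 0
```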

Amr
  • 5
    This is actually the best answer so far that is both concise and doesn't have serious issues of some kind. But I would simplify the function body as simply `return count(explode(' ', preg_replace('/\s+/', ' ', trim($string))));`. – orrd Sep 30 '15 at 22:04
10

You can use the built in PHP function str_word_count. Use it like this:

$str = "This is my simple string.";
echo str_word_count($str);

This will output 5.

If you plan on using special characters in any of your words, you can supply any extra characters as the third parameter.

$str = "This weather is like el ninã.";
echo str_word_count($str, 0, 'àáã');

This will output 6.
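
If the count looks wrong, it can help to see which words the function actually found. A small sketch, assuming the sample sentence used in the PHP manual: passing 1 as the second argument returns the words instead of the count. Treating digits as word characters here is only an illustrative choice.

```php
<?php
$str = "Hello fri3nd, you're looking good today!";

// Format 1 returns an array of the words found instead of their count.
// Digits are not word characters by default, so "fri3nd" is split in two.
print_r(str_word_count($str, 1));

// Default count: Hello, fri, nd, you're, looking, good, today.
echo str_word_count($str), "\n"; // 7

// Illustrative choice: add the digits as extra word characters.
echo str_word_count($str, 0, '0123456789'), "\n"; // 6
```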

Michael Irigoyen
4

str_word_count has its flaws: it treats underscores as separators, so this_is counts as two words.

You can use the following function to count words separated by spaces, even if there is more than one space between them.

function count_words($str){

    while (substr_count($str, "  ")>0){
        $str = str_replace("  ", " ", $str);
    }
    return substr_count($str, " ")+1;
}


$str = "This   is  a sample_test";

echo $str;
echo count_words($str);
//This will return 4 words;
Mackraken
4

This function uses a simple regex to split the input $text on any non-letter character:

function isValidLength($text, $length) {
    $words = preg_split('#\PL+#u', $text, -1, PREG_SPLIT_NO_EMPTY);
    return count($words) <= $length;
}

This ensures that it works correctly with words separated by multiple spaces or any other non-letter character. It also handles Unicode (e.g. accented letters) correctly.

The function returns true when the word count is less than or equal to $length.
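
A short usage sketch, assuming the source file and the input are UTF-8. The function is repeated under a hypothetical name (is_letter_word_limit_ok) so the snippet runs on its own, and the sample strings are made up for illustration.

```php
<?php
// Same logic as isValidLength() above, under a hypothetical name.
function is_letter_word_limit_ok(string $text, int $length): bool
{
    // \PL matches any character that is not a Unicode letter.
    $words = preg_split('#\PL+#u', $text, -1, PREG_SPLIT_NO_EMPTY);
    return count($words) <= $length;
}

// Four words despite the extra spaces and the comma.
var_dump(is_letter_word_limit_ok("El  niño,   está aquí", 4)); // bool(true)

// The apostrophe is a non-letter, so "don't" is split into "don" and "t".
var_dump(is_letter_word_limit_ok("don't stop", 2)); // bool(false)
```

Because digits and apostrophes are not letters, numbers are ignored and contractions count as two words, which may or may not be what you want.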

Arnaud Le Blanc
2

Use preg_split() instead of explode(). preg_split() accepts a regular expression as the delimiter, so it can split on runs of whitespace rather than on a single space.
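
Something like this might work as a starting point. It is only a sketch of the idea: the function name is_within_word_limit is hypothetical, and a "word" here means any run of non-whitespace characters.

```php
<?php
// Hypothetical rewrite of the asker's check using preg_split().
function is_within_word_limit(string $text, int $length): bool
{
    // '/\s+/' matches any run of spaces, tabs, or newlines.
    // PREG_SPLIT_NO_EMPTY drops the empty pieces produced by
    // leading or trailing whitespace.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    return count($words) <= $length;
}

var_dump(is_within_word_limit("two  spaces   here", 3)); // bool(true)
var_dump(is_within_word_limit("one two three four", 3)); // bool(false)
```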

Jeff Lamb
1

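Use substr_count to count the number of occurrences of any substring. To find the number of words, set $needle to ' '; the result is the number of spaces, so a single-spaced string has one more word than that. Signature: int substr_count ( string $haystack , string $needle )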

$text = 'This is a test';
echo substr_count($text, 'is'); // 2


echo substr_count($text, ' '); // 3 spaces, i.e. 3 + 1 = 4 words
Behzad-Ravanbakhsh
  • 1
    There are a few issues with this. It counts spaces, not words. So if there's one word it would return 0. And it counts multiple spaces as words (such as if you put two spaces after each period as is often done). – orrd Sep 30 '15 at 21:50
0

There are n-1 spaces between n words, so 100 words contain 99 spaces. You can choose an average word length, say 10 characters, multiply it by 100 (for 100 words), add 99 (for the spaces), and then base the limit on the number of characters (1099) instead.

function isValidLength($text){
    // 100 words * 10 characters + 99 spaces = 1099 characters
    return strlen($text) <= 1099;
}

Fenn-CS
0

I wrote a function that works better than str_word_count for my purposes, because that PHP function counts dashes and other characters as words.

My function also addresses the issue of double spaces, which many of the functions other people have written don't account for.

It handles HTML tags as well. If two tags sit next to each other and you simply use the strip_tags function, the text would be counted as one word when it's two. For example: <h1>Title</h1>Text or <h1>Title</h1><p>Text</p>

Additionally, I strip out JavaScript first; otherwise the code within the <script> tags would be counted as words.

Lastly, my function handles spaces at the beginning and end of a string, multiple spaces, line breaks, return characters, and tab characters.

###############
# Count Words #
###############
function count_words($str)
{
 $str = preg_replace("/[^A-Za-z0-9 ]/","",strip_tags(str_replace('<',' <',str_replace('>','> ',str_replace(array("\n","\r","\t"),' ',preg_replace('~<\s*\bscript\b[^>]*>(.*?)<\s*\/\s*script\s*>~is','',$str))))));
 while(substr_count($str,'  ')>0)
 {
  $str = str_replace('  ',' ',$str);
 }
 return substr_count(trim($str,' '),' ')+1;
}