
I have a list of tweets about mobile phones and a list of mobile phone names, and I need to count how many tweets mention each phone name. I store the phone names in an ArrayList as follows:

brand_list.add("Samsung Galaxy S5");
brand_list.add("Nokia Lumia 525");

Then I have a list of tweets about the phones, like "RT @protectyrbubble: #PYBS5giveaway #WIN a Samsung Galaxy S5. Just follow @protectyrbubble and RT! Details & T&Cs http://t.co/u0NTM00rhA ht…"

Then I used the following code to count the tweets for each phone:

// Compile once, outside the loop. This pattern only matches the words in this exact order.
pattern = Pattern.compile("(.*)Samsung(.*)Galaxy(.*)S5(.*)", Pattern.CASE_INSENSITIVE);
for (int j = 0; j < array_list.size(); j++)
{
    matcher = pattern.matcher(array_list.get(j).toString());
    while (matcher.find())
    {
        count++;
    }
}

In the code above, array_list holds the tweets about mobiles. This regex works fine for the tweet mentioned above, but it doesn't work for a string like

"Galaxy S5 Mini Sempat Nongol di Situs Samsung http://t.co/sinWiLpUNV"

So I need a regex that also matches tweets like the one above.

Thanks in advance

Oliver Charlesworth
Reddevil

1 Answer


A pattern that lists the words one after another, like yours, only matches them in that fixed order. But it seems you only want to know whether the strings "Samsung", "Galaxy" and "S5" are all contained in the tweet, so you could just test 3 separate patterns: ".*Samsung.*", ".*Galaxy.*" and ".*S5.*".

The String#contains() method is also a possibility, but unfortunately it has no case-insensitive variant.
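One common workaround for the missing case-insensitive variant is to lowercase both strings before calling contains(). The following is only a minimal sketch of that idea; the class name ContainsIgnoreCase and the helper containsAllWords are made up for illustration and are not part of the question's code.

```java
import java.util.Locale;

class ContainsIgnoreCase {
    // Hypothetical helper: true if every space-separated word of phoneName
    // occurs somewhere in the tweet, ignoring case. This is a plain substring
    // check, so "s5" would also be found inside a longer token.
    static boolean containsAllWords(String tweet, String phoneName) {
        String lowerTweet = tweet.toLowerCase(Locale.ROOT);
        for (String word : phoneName.toLowerCase(Locale.ROOT).split(" ")) {
            if (!lowerTweet.contains(word)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String tweet = "Galaxy S5 Mini Sempat Nongol di Situs Samsung http://t.co/sinWiLpUNV";
        System.out.println(containsAllWords(tweet, "Samsung Galaxy S5")); // true
        System.out.println(containsAllWords(tweet, "Nokia Lumia 525"));   // false
    }
}
```

Because the words are compared one by one, their order in the tweet does not matter.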

EDIT: It might work with something like "(.*(Samsung|Galaxy|S5))*.*" but I'm not sure about the right syntax... maybe you get my idea.
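Note that the alternation above would accept a string that contains only one of the words (and, since the group may repeat zero times, it matches any string). If a single regex is really wanted, a different technique that is not part of the original answer is to use one lookahead per word, which Java's java.util.regex supports. A minimal sketch, where the class name AnyOrderRegex and the helper anyOrderPattern are hypothetical:

```java
import java.util.regex.Pattern;

class AnyOrderRegex {
    // Hypothetical helper: for "Samsung Galaxy S5" this builds
    // ^(?=.*\QSamsung\E)(?=.*\QGalaxy\E)(?=.*\QS5\E)
    // Each lookahead asserts that one word occurs somewhere in the input.
    static Pattern anyOrderPattern(String phoneName) {
        StringBuilder regex = new StringBuilder("^");
        for (String word : phoneName.trim().split("\\s+")) {
            // Pattern.quote() makes the word a literal, even if it contains regex characters.
            regex.append("(?=.*").append(Pattern.quote(word)).append(")");
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    }

    public static void main(String[] args) {
        Pattern s5 = anyOrderPattern("Samsung Galaxy S5");
        String tweet = "Galaxy S5 Mini Sempat Nongol di Situs Samsung http://t.co/sinWiLpUNV";
        System.out.println(s5.matcher(tweet).find());                 // true
        System.out.println(s5.matcher("Samsung Galaxy Note").find()); // false
    }
}
```

Use find() rather than matches() here: the pattern consists only of zero-width assertions, so it matches the empty string at the start of the input when all words are present.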

If your phone names are inside your brand_list, you could just do:

for (int j = 0; j < array_list.size(); j++)
{
    String tweet = array_list.get(j).toString();
    for (String phoneName : brand_list)
    {
        // Reset for every phone name; otherwise a miss for one phone
        // would also mark all following phones as not found.
        boolean allIn = true;
        String[] phoneWords = phoneName.split(" ");

        for (int wordIndex = 0; wordIndex < phoneWords.length; wordIndex++)
        {
            // find() searches anywhere in the tweet, so no surrounding "(.*)" is needed.
            // Pattern.quote() treats the word as a literal.
            pattern = Pattern.compile(Pattern.quote(phoneWords[wordIndex]), Pattern.CASE_INSENSITIVE);
            matcher = pattern.matcher(tweet);

            if (!matcher.find())
            {
                allIn = false;
                break; // one missing word is enough
            }
        }
        System.out.println(phoneName + ": " + allIn); // false if one of the words
                                                      // couldn't be found in the tweet,
                                                      // true otherwise
    }
}
  • Thanks for your reply, @michel. Yes, you are right, but I have phone names of various lengths. How can I do that? – Reddevil May 10 '14 at 11:08
  • It depends on where your phones names come from. In your example the names are hard coded but I guess you just did it for the example. So if you have an array or a `List` with all the names in it you could just iterate through it and create a new `Pattern` with these elements. – Michel Michael Meyer May 10 '14 at 11:12
  • But for me the three words "Samsung galaxy s5" all have to be present in the line, possibly in a different order, like "galaxy s5 samsung" or "s5 samsung galaxy" – Reddevil May 10 '14 at 11:15
  • I iterated through the list and created a regex as follows "((.*)S5(.*)Galaxy(.*)Samsung(.*) | (.*)Galaxy(.*)S5(.*)Samsung(.*) | (.*)Galaxy(.*)Samsung(.*)S5(.*) | (.*)S5(.*)Samsung(.*)Galaxy(.*) | (.*)Samsung(.*)S5(.*)Galaxy(.*) | (.*)Samsung(.*)Galaxy(.*)S5(.*)) " but it doesn't work for me – Reddevil May 10 '14 at 11:18
  • I updated my answer. I still don't know where your phone names come from so I just assumed they are contained in an array like shown in my answer. EDIT: Sorry, I didn't notice your brand_list ... I'll update my answer...again :D – Michel Michael Meyer May 10 '14 at 11:23
  • In your code, it first checks for samsung and increases the count, then it loops again, checks for galaxy (and also s5) and increases the count again. But for me it has to increment only when all three of them exist – Reddevil May 10 '14 at 12:02
  • Ah, ok. Well then you just need to work with some `boolean`s and check if all of them are `true` (via `&&`). Or you use only one `boolean` which is `true` at the beginning and is set to `false` if one of the words cannot be found. – Michel Michael Meyer May 10 '14 at 12:13
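
The single-`boolean` idea from the last comment can be combined with a per-phone counter, so that the count for a phone is incremented only when all of its words occur in a tweet. This is only a sketch under assumptions: the class name BrandCounter, the method countMentions and the use of a `Map` for the counts are made up for illustration, and the first sample tweet is a shortened version of the one in the question.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class BrandCounter {
    // Hypothetical helper: counts, per phone name, the tweets that contain
    // all words of that name in any order, ignoring case.
    static Map<String, Integer> countMentions(List<String> tweets, List<String> brands) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String brand : brands) {
            counts.put(brand, 0);
        }
        for (String tweet : tweets) {
            String lowerTweet = tweet.toLowerCase(Locale.ROOT);
            for (String brand : brands) {
                boolean allIn = true; // starts true, flips on the first missing word
                for (String word : brand.toLowerCase(Locale.ROOT).split(" ")) {
                    if (!lowerTweet.contains(word)) {
                        allIn = false;
                        break;
                    }
                }
                if (allIn) {
                    counts.merge(brand, 1, Integer::sum); // at most one increment per tweet
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> tweets = Arrays.asList(
                "#WIN a Samsung Galaxy S5. Just follow @protectyrbubble and RT!",
                "Galaxy S5 Mini Sempat Nongol di Situs Samsung http://t.co/sinWiLpUNV");
        List<String> brands = Arrays.asList("Samsung Galaxy S5", "Nokia Lumia 525");
        System.out.println(countMentions(tweets, brands));
        // prints {Samsung Galaxy S5=2, Nokia Lumia 525=0}
    }
}
```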