
I am following the answer to this question to parse Google search results:

How can you search Google Programmatically Java API

However, when I run the code, an error occurs.

How should I modify it?

import java.net.URLDecoder;
import java.net.URLEncoder;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JavaApplication22 {

    public static void main(String[] args) {
        String google = "http://www.google.com/search?q=";
        String search = "stackoverflow";
        String charset = "UTF-8";
        String userAgent = "ExampleBot 1.0 (+http://example.com/bot)"; // Change this to your company's name and bot homepage!

        Elements links = Jsoup.connect(google + URLEncoder.encode(search, charset)).userAgent(userAgent).get().select(".g>.r>a");

        for (Element link : links) {
            String title = link.text();
            String url = link.absUrl("href"); // Google returns URLs in format "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>".
            url = URLDecoder.decode(url.substring(url.indexOf('=') + 1, url.indexOf('&')), "UTF-8");

            if (!url.startsWith("http")) {
                continue; // Ads/news/etc.
            }

            System.out.println("Title: " + title);
            System.out.println("URL: " + url);
        }
    }
}

I guess the problem is with the libraries, but when I press Ctrl+Shift+I, the IDE reports that there is nothing to fix in the import statements.

Error

Exception in thread "main" java.lang.RuntimeException: Uncompilable source code - unreported exception java.io.IOException; must be caught or declared to be thrown
    at javaapplication22.JavaApplication22.main(JavaApplication22.java:32)

How should I modify the code so that I can parse the Google search results?


1 Answer


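Replace your main method with the code below. Jsoup's get() throws the checked exception java.io.IOException, and URLEncoder.encode(String, String) and URLDecoder.decode(String, String) declare java.io.UnsupportedEncodingException, so main must either catch these exceptions or declare them. You also need to add imports for java.io.IOException and java.io.UnsupportedEncodingException.

public static void main(String[] args) throws UnsupportedEncodingException, IOException {
    String google = "http://www.google.com/search?q=";
    String search = "stackoverflow";
    String charset = "UTF-8";
    String userAgent = "ExampleBot 1.0 (+http://example.com/bot)"; // Change this to your company's name and bot homepage!

    // get() performs the HTTP request and can throw IOException.
    Elements links = Jsoup.connect(google + URLEncoder.encode(search, charset)).userAgent(userAgent).get().select(".g>.r>a");

    for (Element link : links) {
        String title = link.text();
        String url = link.absUrl("href"); // Google returns URLs in format "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>".
        // Assumes the href contains both '=' and '&'; substring() throws otherwise.
        url = URLDecoder.decode(url.substring(url.indexOf('=') + 1, url.indexOf('&')), "UTF-8");

        if (!url.startsWith("http")) {
            continue; // Ads/news/etc.
        }
        System.out.println("Title: " + title);
        System.out.println("URL: " + url);
    }
}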
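The substring call in the loop relies on every href containing both '=' and '&'; if '&' is missing, indexOf returns -1 and substring throws. As a minimal sketch of a more defensive extraction, assuming the redirect format quoted in the code comment still applies, the q parameter could be looked up by name. The class and method names here (GoogleRedirects, extractTarget) are hypothetical, and the Charset overload of URLDecoder.decode requires Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of jsoup or of the original answer.
class GoogleRedirects {

    /**
     * Returns the decoded value of the "q" parameter of a redirect link
     * such as "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>",
     * or null when the link has no query string or no "q" parameter.
     */
    static String extractTarget(String href) {
        int queryStart = href.indexOf('?');
        if (queryStart < 0) {
            return null; // no query string at all
        }
        // Walk the name=value pairs instead of relying on the position of '&'.
        for (String pair : href.substring(queryStart + 1).split("&")) {
            if (pair.startsWith("q=")) {
                return URLDecoder.decode(pair.substring(2), StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String href = "http://www.google.com/url?q=http%3A%2F%2Fstackoverflow.com%2F&sa=U&ei=abc";
        System.out.println(extractTarget(href)); // prints http://stackoverflow.com/
    }
}
```

In the loop, a null result could be skipped with continue, in the same way as the existing startsWith("http") check.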
  • Should I change `String userAgent = "ExampleBot 1.0 (+http://example.com/bot)"; // Change this to your company's name and bot homepage!`? – evabb Sep 20 '16 at 11:21
  • If I run the code directly, I get `Exception in thread "main" java.net.SocketTimeoutException: connect timed out`. – evabb Sep 20 '16 at 11:22
  • No, just add `throws UnsupportedEncodingException, IOException` after `public static void main(String[] args)`. – Nilkanth Rangoonwala Sep 20 '16 at 11:23
  • It works, thanks. By the way, I want to make two further adjustments: 1) search Google News instead (how should I change `String google`?), and 2) parse the search result preview. Can you give me some ideas, or should I open another question? – evabb Sep 20 '16 at 12:12
  • Anytime, bro. For news, just add the parameter `&tbm=nws` to your Google string, so the new string will be **"http://www.google.com/search?q=stackoverflow&tbm=nws"**. – Nilkanth Rangoonwala Sep 20 '16 at 12:24
  • I tried this: `String news = "&tbm=nws"; String join = search + news; Elements links = Jsoup.connect(google + URLEncoder.encode(join, charset)).userAgent(userAgent).get().select(".g>.r>a");`. The result is still the normal Google search result, not the news search. – evabb Sep 20 '16 at 12:35
  • Do it like this: `Elements links = Jsoup.connect(google + URLEncoder.encode(search, charset) + news).userAgent(userAgent).get().select(".g>.r>a");`. You will also have to change `.g>.r>a` to match the page source, because the markup of the news page may be different. – Nilkanth Rangoonwala Sep 20 '16 at 12:53
  • Yes, for that you have to change `.g>.r>a` to match the Google News page. – Nilkanth Rangoonwala Sep 20 '16 at 18:12
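The news attempt in the comments most likely failed because `&tbm=nws` was passed through URLEncoder.encode together with the search term, which turns `&` and `=` into `%26` and `%3D` so that they become part of the query text. A minimal sketch of the difference, assuming the `tbm=nws` parameter suggested above; the class and method names (GoogleSearchUrls, searchUrl) are hypothetical, and the Charset overload of URLEncoder.encode requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of jsoup or any Google API.
class GoogleSearchUrls {

    private static final String GOOGLE = "http://www.google.com/search?q=";

    /** Encodes only the search term, then appends the news parameter unencoded. */
    static String searchUrl(String query, boolean newsOnly) {
        String url = GOOGLE + URLEncoder.encode(query, StandardCharsets.UTF_8);
        return newsOnly ? url + "&tbm=nws" : url;
    }

    public static void main(String[] args) {
        // Parameter appended after encoding: Google sees a separate tbm parameter.
        System.out.println(searchUrl("stackoverflow", true));
        // prints http://www.google.com/search?q=stackoverflow&tbm=nws

        // Parameter encoded with the term: it becomes part of the search text.
        System.out.println(GOOGLE + URLEncoder.encode("stackoverflow&tbm=nws", StandardCharsets.UTF_8));
        // prints http://www.google.com/search?q=stackoverflow%26tbm%3Dnws
    }
}
```

The resulting string can be passed to Jsoup.connect(...) as before; the `.g>.r>a` selector still has to be adapted to the markup of the news page.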