24

It's working fine over HTTP, but when I try and use an HTTPS source it throws the following exception:

10-12 13:22:11.169: WARN/System.err(332): javax.net.ssl.SSLHandshakeException: java.security.cert.CertPathValidatorException: Trust anchor for certification path not found.
10-12 13:22:11.179: WARN/System.err(332):     at org.apache.harmony.xnet.provider.jsse.OpenSSLSocketImpl.startHandshake(OpenSSLSocketImpl.java:477)
10-12 13:22:11.179: WARN/System.err(332):     at org.apache.harmony.xnet.provider.jsse.OpenSSLSocketImpl.startHandshake(OpenSSLSocketImpl.java:328)
10-12 13:22:11.179: WARN/System.err(332):     at org.apache.harmony.luni.internal.net.www.protocol.http.HttpConnection.setupSecureSocket(HttpConnection.java:185)
10-12 13:22:11.179: WARN/System.err(332):     at org.apache.harmony.luni.internal.net.www.protocol.https.HttpsURLConnectionImpl$HttpsEngine.makeSslConnection(HttpsURLConnectionImpl.java:433)
10-12 13:22:11.189: WARN/System.err(332):     at org.apache.harmony.luni.internal.net.www.protocol.https.HttpsURLConnectionImpl$HttpsEngine.makeConnection(HttpsURLConnectionImpl.java:378)
10-12 13:22:11.189: WARN/System.err(332):     at org.apache.harmony.luni.internal.net.www.protocol.http.HttpURLConnectionImpl.connect(HttpURLConnectionImpl.java:205)
10-12 13:22:11.189: WARN/System.err(332):     at org.apache.harmony.luni.internal.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:152)
10-12 13:22:11.189: WARN/System.err(332):     at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:377)
10-12 13:22:11.189: WARN/System.err(332):     at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:364)
10-12 13:22:11.189: WARN/System.err(332):     at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:143)

Here's the relevant code:

try {
    doc = Jsoup.connect("https url here").get();
} catch (IOException e) {
    Log.e("sys", "couldn't get the html");
    e.printStackTrace();
}
jfisk
  • 1
    Just add `ignoreHttpErrors(true)`. It works for v1.7.2: `Jsoup.connect(url).userAgent(userAgent).ignoreHttpErrors(true).get()` – qwert_ukg Jun 25 '19 at 02:51

10 Answers

58

If you want to do it the right way, and/or you need to deal with only one site, then you basically need to grab the SSL certificate of the website in question and import it in your Java key store. This will result in a JKS file which you in turn set as SSL trust store before using Jsoup (or java.net.URLConnection).

You can grab the certificate from your webbrowser's store. Let's assume that you're using Firefox.

  1. Go to the website in question using Firefox, which is in your case https://web2.uconn.edu/driver/old/timepoints.php?stopid=10
  2. At the left of the address bar you'll see "uconn.edu" in blue (this indicates a valid SSL certificate)
  3. Click on it for details and then click on the More information button.
  4. In the security dialogue which appears, click the View Certificate button.
  5. In the certificate panel which appears, go to the Details tab.
  6. Click the deepest item of the certificate hierarchy, which is in this case "web2.uconn.edu" and finally click the Export button.

Now you've a web2.uconn.edu.crt file.

Next, open the command prompt and import it in the Java key store using the keytool command (it's part of the JRE):

keytool -import -v -file /path/to/web2.uconn.edu.crt -keystore /path/to/web2.uconn.edu.jks -storepass drowssap

The -file must point to the location of the .crt file which you just downloaded. The -keystore must point to the location of the generated .jks file (which you in turn want to set as SSL trust store). The -storepass is required; you can enter whatever password you want as long as it's at least 6 characters.
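Before wiring the new file into your application, it can help to confirm that the import actually left a certificate in it. The following is only a sketch using plain JDK classes; the class and method names (`TrustStoreCheck`, `countTrustedCerts`) are made up for illustration and are not part of Jsoup or keytool.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;
import java.util.Collections;

// Hypothetical helper for illustration; not part of Jsoup or the JDK.
final class TrustStoreCheck {

    /** Returns how many trusted-certificate entries the given JKS file contains. */
    static int countTrustedCerts(String jksPath, char[] password) {
        try (InputStream in = new FileInputStream(jksPath)) {
            KeyStore store = KeyStore.getInstance("JKS");
            store.load(in, password);

            int count = 0;
            for (String alias : Collections.list(store.aliases())) {
                // keytool -import creates "trusted certificate" entries
                if (store.isCertificateEntry(alias)) {
                    count++;
                }
            }
            return count;
        } catch (Exception e) {
            // A missing file, a wrong password, or a corrupt store all end up here
            throw new IllegalStateException("Cannot read trust store " + jksPath, e);
        }
    }
}
```

With the command above, `TrustStoreCheck.countTrustedCerts("/path/to/web2.uconn.edu.jks", "drowssap".toCharArray())` would be expected to return 1. A result of 0, or an exception because the path is wrong, is a likely explanation if you later see "the trustAnchors parameter must be non-empty".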

Now, you've a web2.uconn.edu.jks file. You can finally set it as SSL trust store before connecting as follows:

System.setProperty("javax.net.ssl.trustStore", "/path/to/web2.uconn.edu.jks");
Document document = Jsoup.connect("https://web2.uconn.edu/driver/old/timepoints.php?stopid=10").get();
// ...
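The system property applies to the whole JVM. If you would rather limit the custom trust store to a single connection, one possible approach is to build an SSLSocketFactory from the JKS file yourself. This is a sketch under assumptions: `TrustStoreSocketFactory` and `fromJks` are hypothetical names, and it relies only on standard JDK classes.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper for illustration; not part of Jsoup or the JDK.
final class TrustStoreSocketFactory {

    /** Builds a socket factory that trusts only the certificates in the given JKS file. */
    static SSLSocketFactory fromJks(String jksPath, char[] password) {
        try (InputStream in = new FileInputStream(jksPath)) {
            KeyStore trustStore = KeyStore.getInstance("JKS");
            trustStore.load(in, password);

            // Derive trust managers from the key store instead of the JVM defaults
            TrustManagerFactory tmf =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            tmf.init(trustStore);

            // No client key managers; default source of randomness
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, tmf.getTrustManagers(), null);
            return context.getSocketFactory();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot build socket factory from " + jksPath, e);
        }
    }
}
```

In Jsoup versions that offer `sslSocketFactory(...)` on the connection (as used in another answer on this page), the result could be passed there, for example `Jsoup.connect(url).sslSocketFactory(TrustStoreSocketFactory.fromJks(path, password)).get()`. Certificate validation stays on; only the set of trusted certificates changes.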

As a completely different alternative, particularly when you need to deal with multiple sites (e.g. you're creating a world wide web crawler), you can also instruct Jsoup (basically, java.net.URLConnection) to blindly trust all SSL certificates. See also section "Dealing with untrusted or misconfigured HTTPS sites" at the very bottom of this answer: Using java.net.URLConnection to fire and handle HTTP requests

BalusC
  • Just found this question. I have the same problem, but what do I do with the crt file if I'm using Eclipse? What is the keytool alternative for Eclipse? – Ali Elgazar Mar 08 '13 at 19:55
  • Apparently firefox allows the usage of a domain-level certificate to visit the subdomains as well. However, JSoup will not allow this. Any suggestions to fix this? – bvdb Nov 28 '14 at 12:12
  • Thanks for the tip! I still have a problem loading the .jks file; it looks like it's not included or accessible from an external /crt directory. File f = new File(Environment.getRootDirectory() + "/crt/www.loterie.lu.jks"); if(f.isFile()) Log.i("JSOUP", "Certificate file found"); else Log.i("JSOUP", "ERROR : Certificate file not found "+f.getAbsolutePath()); – Dax May 30 '15 at 07:15
  • the deepest item is **Thumbprint**, not a site! – user25 Mar 31 '18 at 00:26
  • is it even possible with Google Chrome? – user25 Mar 31 '18 at 00:29
  • I downloaded `cer` file (no `crt` file available from Google Chrome) but during keytool converting it asks `Trust this certificate? [no]:` I typed yes, I hope it will work – user25 Mar 31 '18 at 00:38
  • didn't work `Exception in thread "main" javax.net.ssl.SSLException: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty` – user25 Mar 31 '18 at 00:40
14

In my case, all I needed to do was to add the .validateTLSCertificates(false) in my connection

Document doc  = Jsoup.connect(httpsURLAsString)
            .timeout(60000).validateTLSCertificates(false).get();

I also had to increase the read timeout but I think this is irrelevant

johnmerm
  • It exists in version 1.8.3. I saw that in version 1.11.2 it is marked for deprecation: https://jsoup.org/apidocs/org/jsoup/Connection.html What version of Jsoup are you using? – johnmerm Apr 01 '18 at 11:03
  • Since version 1.12.1, validateTLSCertificates method has been officially removed. (see https://jsoup.org/news/release-1.12.1) – Stephan Apr 05 '20 at 03:41
10

I stumbled over the answers here and in the linked question in my search and want to add two pieces of information, as the accepted answer doesn't fit my quite similar scenario, but there is an additional solution that fits even in that case (cert and hostname don't match for test systems).

  1. There is a github request to add such a functionality. So perhaps soon the problem will be solved: https://github.com/jhy/jsoup/pull/343 edit: Github request was resolved and the method to disable certificate validation is: validateTLSCertificates(boolean validate)
  2. Based on http://www.nakov.com/blog/2009/07/16/disable-certificate-validation-in-java-ssl-connections/ I found a solution which seems to work (at least in my scenario where jsoup 1.7.3 is called as part of a maven task). I wrapped it in a method disableSSLCertCheck() that I call before the very first Jsoup.connect().

Before you use this method, you should be really sure that you understand what you are doing: not checking SSL certificates is a really stupid thing. Always use correct SSL certificates for your servers, signed by a commonly accepted CA. If you can't afford a commonly accepted CA, use correct SSL certificates nevertheless, following @BalusC's accepted answer above. If you can't configure correct SSL certificates (which should never be the case in production environments), the following method could work:

    private void disableSSLCertCheck() throws NoSuchAlgorithmException, KeyManagementException {
        // Create a trust manager that does not validate certificate chains
        TrustManager[] trustAllCerts = new TrustManager[] {new X509TrustManager() {
                public X509Certificate[] getAcceptedIssuers() {
                    // An empty array is safer than null, which some callers dereference
                    return new X509Certificate[0];
                }
                public void checkClientTrusted(X509Certificate[] certs, String authType) {
                }
                public void checkServerTrusted(X509Certificate[] certs, String authType) {
                }
            }
        };

        // Install the all-trusting trust manager as the JVM-wide default
        SSLContext sc = SSLContext.getInstance("SSL");
        sc.init(null, trustAllCerts, new java.security.SecureRandom());
        HttpsURLConnection.setDefaultSSLSocketFactory(sc.getSocketFactory());

        // Create all-trusting host name verifier
        HostnameVerifier allHostsValid = new HostnameVerifier() {
            public boolean verify(String hostname, SSLSession session) {
                return true;
            }
        };

        // Install the all-trusting host verifier as the JVM-wide default
        HttpsURLConnection.setDefaultHostnameVerifier(allHostsValid);
    }
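Because the method above replaces static defaults on HttpsURLConnection, every HTTPS connection made afterwards in the same JVM is affected. If you only need the relaxed settings around one call, a possible variation is to swap the defaults in and restore them afterwards. This is a sketch, not a tested recipe: `ScopedSslDefaults` and `withDefaults` are hypothetical names, and the approach is not thread-safe, since other threads see the temporary defaults while the action runs.

```java
import java.util.concurrent.Callable;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical helper for illustration; not part of Jsoup or the JDK.
final class ScopedSslDefaults {

    /**
     * Installs the given defaults, runs the action, then restores the previous
     * defaults even if the action throws. Not safe for concurrent use.
     */
    static <T> T withDefaults(SSLSocketFactory factory,
                              HostnameVerifier verifier,
                              Callable<T> action) throws Exception {
        SSLSocketFactory oldFactory = HttpsURLConnection.getDefaultSSLSocketFactory();
        HostnameVerifier oldVerifier = HttpsURLConnection.getDefaultHostnameVerifier();

        HttpsURLConnection.setDefaultSSLSocketFactory(factory);
        HttpsURLConnection.setDefaultHostnameVerifier(verifier);
        try {
            return action.call();
        } finally {
            // Put the original settings back for the rest of the application
            HttpsURLConnection.setDefaultSSLSocketFactory(oldFactory);
            HttpsURLConnection.setDefaultHostnameVerifier(oldVerifier);
        }
    }
}
```

The action would typically be the Jsoup call itself, for example `() -> Jsoup.connect(url).get()`, with the factory and verifier built as in the method above.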
mori
NextThursday
  • 1
    For future readers: be careful with this. It changes the behaviour of ANY class in your app that creates an instance of HttpsURLConnection, not only the class you run it in. – exoddus Nov 11 '16 at 16:01
  • How do I integrate this solution with the Jsoup.connect(httpsurl).get() method? – Luke Feb 12 '18 at 10:38
  • @Luke parse the result of HttpsURLConnection: jsoupDoc = Jsoup.parse(urlConnection.getInputStream() – CaptainCrunch Jul 21 '19 at 07:28
8

To suppress certificate warnings for a specific Jsoup connection, you can use the following approach:

Kotlin


val document = Jsoup.connect("url")
        .sslSocketFactory(socketFactory())
        .get()


private fun socketFactory(): SSLSocketFactory {
    val trustAllCerts = arrayOf<TrustManager>(object : X509TrustManager {
        @Throws(CertificateException::class)
        override fun checkClientTrusted(chain: Array<X509Certificate>, authType: String) {
        }

        @Throws(CertificateException::class)
        override fun checkServerTrusted(chain: Array<X509Certificate>, authType: String) {
        }

        override fun getAcceptedIssuers(): Array<X509Certificate> {
            return arrayOf()
        }
    })

    try {
        val sslContext = SSLContext.getInstance("TLS")
        sslContext.init(null, trustAllCerts, java.security.SecureRandom())
        return sslContext.socketFactory
    } catch (e: Exception) {
        when (e) {
            // Same two failure cases the Java version below handles
            is NoSuchAlgorithmException, is KeyManagementException -> {
                throw RuntimeException("Failed to create an SSL socket factory", e)
            }
            else -> throw e
        }
    }
}

Java



 Document document = Jsoup.connect("url")
        .sslSocketFactory(socketFactory())
        .get();


  private SSLSocketFactory socketFactory() {
    TrustManager[] trustAllCerts = new TrustManager[]{new X509TrustManager() {
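      public java.security.cert.X509Certificate[] getAcceptedIssuers() {
        // Return an empty array rather than null; some callers dereference the result
        return new java.security.cert.X509Certificate[0];
      }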

      public void checkClientTrusted(X509Certificate[] certs, String authType) {
      }

      public void checkServerTrusted(X509Certificate[] certs, String authType) {
      }
    }};

    try {
      SSLContext sslContext = SSLContext.getInstance("TLS");
      sslContext.init(null, trustAllCerts, new java.security.SecureRandom());
      return sslContext.getSocketFactory();
    } catch (NoSuchAlgorithmException | KeyManagementException e) {
      throw new RuntimeException("Failed to create a SSL socket factory", e);
    }
  }

NB: As mentioned before, ignoring certificates is not a good idea.

  • 2
    Since version 1.12.1, validateTLSCertificates method has been officially removed. (see https://jsoup.org/news/release-1.12.1) – Stephan Apr 05 '20 at 03:32
  • @Stephan - thanks! Removed that part from the answer as no longer relevant. – Dmitri Korobtsov Apr 06 '20 at 12:26
  • @Kumar got some specific error to share? What was the problem? Using same solution somewhere in my Kotlin code atm, works just fine. – Dmitri Korobtsov May 08 '20 at 09:44
  • This does not work while mori's answer works. Is this related to the non-standard port in the URL? I am using "https://192.168.1.10:5001". – eos1d3 Apr 16 '21 at 02:02
  • @eos1d3 was there some specific error returned? in general - not enough info to make any conclusions. Solution is valid, feels like you have some corner case. – Dmitri Korobtsov Apr 19 '21 at 12:24
  • @DmitriKorobtsov Using SSLSocketFactory returns the same error as if it is not used at all. And see my answer below. From my test, the SSLSocketFactory is totally useless. Only HttpsURLConnection.setDefaultHostnameVerifier is required and it works! – eos1d3 Apr 25 '21 at 10:35
3

I've had the same problem but took the lazy route - tell your app to ignore the cert and carry on anyway.

I got the code from here: How do I use a local HTTPS URL in java?

You'll have to import these classes for it to work:

import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSession;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

Just run that method somewhere before you try to make the connection and voila, it just trusts the cert no matter what. Of course this isn't any help if you actually want to make sure the cert is real, but good for monitoring your own internal websites etc.

RobinCJ
0

I was facing the same issue with Jsoup: I was not able to connect and get the document for https URLs. When I changed my JDK version from 1.7 to 1.8, the issue was resolved.

It may help you :)
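One plausible reason for this (an assumption, since the answer does not say which error was involved) is that JDK 8 enables TLS 1.2 for client connections by default while JDK 7 does not, so servers that require a newer protocol reject the JDK 7 handshake. A small sketch to see which protocols your runtime enables by default; `TlsProtocols` is a hypothetical class name:

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;

// Hypothetical helper for illustration; not part of Jsoup.
final class TlsProtocols {

    /** Returns the protocol names the default SSLContext enables for new connections. */
    static String[] enabledByDefault() {
        try {
            return SSLContext.getDefault().getDefaultSSLParameters().getProtocols();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }
}
```

Printing `java.util.Arrays.toString(TlsProtocols.enabledByDefault())` on both JDKs shows whether the protocol list differs between them.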

ramkishorbajpai
0

I had that problem only in the dev environment. I solved it by adding two flags to the VM options, which switch off TLS 1.1 and TLS 1.2:

-Ddeployment.security.TLSv1.1=false 
-Ddeployment.security.TLSv1.2=false
pawelini1
0

After testing the solutions here, I found, strangely, that the sslSocketFactory setting in Jsoup made no difference in my case; it never worked. So there was no need to get and set an SSLSocketFactory.

Actually, the second half of mori's solution works. You just need the following before using Jsoup:

// Create all-trusting host name verifier
HostnameVerifier allHostsValid = new HostnameVerifier() {
    public boolean verify(String hostname, SSLSession session) {
        return true;
    }
};

// Install the all-trusting host verifier
HttpsURLConnection.setDefaultHostnameVerifier(allHostsValid);

This is tested with Jsoup 1.13.1.
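A likely explanation, offered as an assumption: certificate trust and host name matching are two separate checks, and a URL with a bare IP address such as https://192.168.1.10:5001 often fails only the second one. HttpsURLConnection consults the HostnameVerifier when the URL's host does not match the certificate. Instead of returning true for every host, you could accept only the hosts you expect. The class name `AllowListHostnameVerifier` below is hypothetical:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLSession;

// Hypothetical helper for illustration; not part of Jsoup or the JDK.
final class AllowListHostnameVerifier implements HostnameVerifier {

    private final Set<String> trustedHosts;
    private final HostnameVerifier fallback;

    /**
     * @param fallback     verifier to ask for any host not on the list
     * @param trustedHosts hosts accepted even if the certificate name does not match
     */
    AllowListHostnameVerifier(HostnameVerifier fallback, String... trustedHosts) {
        this.fallback = fallback;
        this.trustedHosts = new HashSet<>(Arrays.asList(trustedHosts));
    }

    @Override
    public boolean verify(String hostname, SSLSession session) {
        if (trustedHosts.contains(hostname)) {
            return true;
        }
        // Everything else keeps the stricter behaviour of the fallback
        return fallback.verify(hostname, session);
    }
}
```

It could be installed with `HttpsURLConnection.setDefaultHostnameVerifier(new AllowListHostnameVerifier(HttpsURLConnection.getDefaultHostnameVerifier(), "192.168.1.10"))`. This narrows the bypass to one host but still skips name checking for it, so it is best kept to test systems.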

eos1d3
0

I'm no expert in this field but I ran into a similar exception when trying to connect to a website over HTTPS using java.net APIs. The browser does a lot of work for you regarding SSL certificates when you visit a site using HTTPS. However, when you are manually connecting to sites (using HTTP requests manually), all that work still needs to be done. Now I don't know what all this work is exactly, but it has to do with downloading certificates and putting them where Java can find them. Here's a link that will hopefully point you in the right direction.

http://confluence.atlassian.com/display/JIRA/Connecting+to+SSL+services

jeff
-5

Try the following (just put it before Jsoup.connect("https://example.com")):

    Authenticator.setDefault(new Authenticator() {
        @Override
        protected PasswordAuthentication getPasswordAuthentication() {
            return new PasswordAuthentication(username, password.toCharArray());
        }
    });
Roman Kazanovskyi