
I am using HTTPURLConnection to connect to a web service. I know how to use HTTPURLConnection but I want to understand how it works. Basically, I want to know the following:

  • On which point does HTTPURLConnection try to establish a connection to the given URL?
  • On which point can I know that I was able to successfully establish a connection?
  • Are establishing a connection and sending the actual request done in one step/method call? What method is it?
  • Can you explain the function of getOutputStream and getInputStream in layman's term? I notice that when the server I'm trying to connect to is down, I get an Exception at getOutputStream. Does it mean that HTTPURLConnection will only start to establish a connection when I invoke getOutputStream? How about the getInputStream? Since I'm only able to get the response at getInputStream, then does it mean that I didn't send any request at getOutputStream yet but simply establishes a connection? Do HttpURLConnection go back to the server to request for response when I invoke getInputStream?
  • Am I correct to say that openConnection simply creates a new connection object but does not establish any connection yet?
  • How can I measure the read overhead and connect overhead?
Arci

5 Answers

String message = URLEncoder.encode("my message", "UTF-8");

try {
    // instantiate the URL object with the target URL of the resource to
    // request
    URL url = new URL("http://www.example.com/comment");

    // instantiate the HttpURLConnection with the URL object - A new
    // connection is opened every time by calling the openConnection
    // method of the protocol handler for this URL.
    // 1. This is the point where the connection is opened.
    HttpURLConnection connection = (HttpURLConnection) url
            .openConnection();
    // set connection output to true
    connection.setDoOutput(true);
    // instead of a GET, we're going to send using method="POST"
    connection.setRequestMethod("POST");

    // instantiate OutputStreamWriter using the output stream, returned
    // from getOutputStream, that writes to this connection.
    // 2. This is the point where you'll know if the connection was
    // successfully established. If an I/O error occurs while creating
    // the output stream, you'll see an IOException.
    OutputStreamWriter writer = new OutputStreamWriter(
            connection.getOutputStream());

    // write data to the connection. This is data that you are sending
    // to the server
    // 3. No. Sending the data is conducted here. We established the
    // connection with getOutputStream
    writer.write("message=" + message);

    // Closes this output stream and releases any system resources
    // associated with this stream. At this point, we've sent all the
    // data. Only the outputStream is closed at this point, not the
    // actual connection
    writer.close();
    // if there is a response code AND that response code is 200 OK, do
    // stuff in the first if block
    if (connection.getResponseCode() == HttpURLConnection.HTTP_OK) {
        // OK

        // otherwise, if any other status code is returned, or no status
        // code is returned, do stuff in the else block
    } else {
        // Server returned HTTP error code.
    }
} catch (MalformedURLException e) {
    // ...
} catch (IOException e) {
    // ...
}

The first 3 answers to your questions are listed as inline comments, beside each method, in the example HTTP POST above.

From getOutputStream:

Returns an output stream that writes to this connection.

Basically, I think you have a good understanding of how this works, so let me just reiterate in layman's terms. getOutputStream basically opens a connection stream, with the intention of writing data to the server. In the above code example "message" could be a comment that we're sending to the server that represents a comment left on a post. When you see getOutputStream, you're opening the connection stream for writing, but you don't actually write any data until you call writer.write("message=" + message);.

From getInputStream():

Returns an input stream that reads from this open connection. A SocketTimeoutException can be thrown when reading from the returned input stream if the read timeout expires before data is available for read.

getInputStream does the opposite. Like getOutputStream, it also opens a connection stream, but the intent is to read data from the server, not write to it. If the read timeout expires before data is available, reading from the stream throws a SocketTimeoutException; other connection or stream-opening failures surface as an IOException.
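To make the read-timeout case concrete, here is a minimal sketch rather than a definitive recipe. It assumes a throwaway local server built with the JDK's com.sun.net.httpserver.HttpServer; the class name ReadTimeoutDemo, the /slow path, and the delay values are all hypothetical and chosen only for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.net.URL;

class ReadTimeoutDemo {

    /** Starts a hypothetical local server whose /slow handler waits before answering. */
    static HttpServer startSlowServer(long delayMillis) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/slow", exchange -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            exchange.sendResponseHeaders(204, -1); // 204 No Content, no response body
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Returns true if waiting for the response hit the read timeout. */
    static boolean timesOut(URL url, int readTimeoutMillis) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setConnectTimeout(2000);           // limit for establishing the connection
        connection.setReadTimeout(readTimeoutMillis); // limit for waiting on response data
        try (InputStream in = connection.getInputStream()) {
            in.readAllBytes();
            return false;
        } catch (SocketTimeoutException e) {
            return true;
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startSlowServer(1000);
        try {
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/slow");
            System.out.println("timed out: " + timesOut(url, 200));
        } finally {
            server.stop(0);
        }
    }
}
```

The connect timeout and the read timeout are separate settings, so a server that accepts the connection quickly but answers slowly only trips the second one.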

How about the getInputStream? Since I'm only able to get the response at getInputStream, then does it mean that I didn't send any request at getOutputStream yet but simply establishes a connection?

Keep in mind that sending a request and sending data are two different operations. When you invoke ~~getOutputStream or getInputStream~~ url.openConnection(), you send a request to the server to establish a connection. A handshake then occurs in which the server sends back an acknowledgement that the connection is established. Only at that point are you prepared to send or receive data. Thus, you do not need to call getOutputStream to ~~establish a connection~~ open a stream unless your purpose for making the request is to send data.

In layman's terms, making a getInputStream request is the equivalent of making a phone call to your friend's house to say "Hey, is it okay if I come over and borrow that pair of vice grips?" and your friend establishes the handshake by saying, "Sure! Come and get it". Then, at that point, the connection is made, you walk to your friend's house, knock on the door, request the vice grips, and walk back to your house.

Using a similar example for getOutputStream would involve calling your friend and saying "Hey, I have that money I owe you, can I send it to you"? Your friend, needing money and sick inside that you kept it for so long, says "Sure, come on over you cheap bastard". So you walk to your friend's house and "POST" the money to him. He then kicks you out and you walk back to your house.

Now, continuing with the layman's example, let's look at some Exceptions. If you called your friend and he wasn't home, that could be a 500 error. If you called and got a disconnected number message because your friend is tired of you borrowing money all the time, that's a 404 page not found. If your phone is dead because you didn't pay the bill, that could be an IOException. (NOTE: This section may not be 100% correct. It's intended to give you a general idea of what's happening in layman's terms.)

Question #5:

~~Yes, you are correct that openConnection simply creates a new connection object but does not establish it. The connection is established when you call either getInputStream or getOutputStream.~~

openConnection creates a new connection object. From the URL.openConnection javadocs:

A new connection is opened every time by calling the openConnection method of the protocol handler for this URL.

The connection is established when you call openConnection, and the InputStream, OutputStream, or both, are opened when you request them.

Question #6:

To measure the overhead, I generally wrap some very simple timing code around the entire connection block, like so:

long start = System.currentTimeMillis();
log.info("Time so far = " + (System.currentTimeMillis() - start));

// run the above example code here
log.info("Total time to send/receive data = " + (System.currentTimeMillis() - start));

I'm sure there are more advanced methods for measuring the request time and overhead, but this generally is sufficient for my needs.

For information on closing connections, which you didn't ask about, see "In Java when does a URL connection close?".

jmort253
  • 1
    Hi. Thanks!!! That was indeed a detailed explanation and I really appreciate your answer. If I understand your answer correctly, both getOutputStream and getInputStream establish a connection if no connection has been established yet. If I call getOutputStream and then call getInputStream, internally HttpURLConnection won't be re-establishing a connection at getInputStream since I was already able to establish it at getOutputStream? HttpURLConnection will reuse whatever connection I was able to establish at getOutputStream in getInputStream. – Arci Apr 12 '12 at 05:39
  • Cont.: Or does it establish a new and separate connection for getOutputStream and getInputStream? Also, if I want to get the connect overhead, then the proper place to put my timer is before and after getOutputStream. If I want to get the read overhead, then the proper place to put my timer is before and after getInputStream. – Arci Apr 12 '12 at 05:40
  • Remember what the javadoc says about getInputStream and getOutputStream: `Returns an output stream that writes to this connection.` and `Returns an input stream that reads from this open connection.`. The outputstream and inputstream are separate from the connection. – jmort253 Apr 12 '12 at 05:51
  • Great question Arci, I think there is a flaw in my answer. The url.openConnection method does indeed establish the connection, it just doesn't deal with sending or receiving. Sending/Receiving is the job of the InputStream and OutputStream. I'll update my answer. You definitely have a good understanding of this, and you've helped me gain a better understanding as well. – jmort253 Apr 12 '12 at 05:53
  • @Arci - I updated the answer using strikethrough to highlight the changes. Now that I re-read this, it makes a lot more sense. Feel free to edit and remove the strikethrough once you are done. I just put them in there to make it easy for you to see my edits. :) – jmort253 Apr 12 '12 at 06:01
  • 1
    Thanks again. I've tried putting a timer before and after getOutputStream. I think the actual connection is established at getOutputStream because if for example my wiFi is disabled, the Exception is only thrown at getOutputStream. openConnection does not throw any Exception for that case. Also, when I set my connectTimeout to 10 seconds, the value of the timer which I put before and after getOutputStream is also 10 seconds after invoking getOutputStream. – Arci Apr 12 '12 at 06:16
  • Cont. As far as I know, connectTimeout is the maximum allowable time for trying to establish a connection. Since the value of the connectTimeout (which is the timeout for establishing a connection) is the same as the value of the timer after invoking getOutputStream, this means that the connection was only established when getOutputStream was called. – Arci Apr 12 '12 at 06:16
  • 1
    This leaves me to wonder what opening a connection means. If it does not establish a connection, then what does it do? Does it simply instantiate a new connection object? By the way, I've already put a check on your answer as I'm already able to understand what I wanted to understand earlier. But please feel free also to leave more comments as I also want to gain a deeper understanding of this. I really appreciate your help! Thank you very much! – Arci Apr 12 '12 at 06:16
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/9987/discussion-between-jmort253-and-arci) – jmort253 Apr 12 '12 at 06:18
  • 9
    It's worth noting that it appears that the HttpURLConnection object only reaches out to the target URL at the point at which it NEEDS to do so. In your example, you have input and output streams, which of course can't do anything until the connection is open. A much simpler case is a GET operation, in which you do nothing but initialize the connection and then check the response code. In that case, the connection isn't actually made until the getResponseCode() method is called. Otherwise, this is a great explanation and exploration of the connection lifecycle! – Spanky Quigman May 10 '13 at 15:52
  • @jmort253 : I'm not sure your answer correctly explains the situation when you might want to make many requests to the same URL. The connection is not established when calling "url.openConnection()" if it's already been established. Is this worth explaining in an edit? – Tim Cooper Sep 04 '13 at 00:59
  • @TimCooper - Is this the part that you're saying needs to be revised, in the inline comments for Q1? "instantiate the HttpURLConnection with the URL object - A new connection is opened every time by calling the openConnection method of the protocol handler for this URL." – jmort253 Sep 04 '13 at 03:25
  • The wording of the javadocs between Java 6 and Java 7 are slightly different. I'm not 100% sure it's incorrect to state what I've stated for Java 6, but for Java 7, you're definitely correct that the [connection is not opened a second time in Java 7](http://docs.oracle.com/javase/7/docs/api/java/net/URL.html#openConnection%28%29), but in Java 6 it omits that and instead says [A new connection is opened every time by calling the openConnection method of the protocol handler for this URL.](http://docs.oracle.com/javase/6/docs/api/java/net/URL.html#openConnection%28%29). Whatdya think? – jmort253 Sep 04 '13 at 03:39
  • @jmort253 : I have to confess I haven't actually tried any of this stuff - I'm just about to. I'll be sending a great many small messages down an HTTPS pipe (our customers are not letting us use raw sockets) so I really hope the SSL handshaking stuff happens once only. – Tim Cooper Sep 05 '13 at 04:57
  • 1
    openConnection() just creates a new Socket. The actual connect doesn't happen until getInputStream(). I have also noted that calling getResponseCode() before getInputStream() also works. – nightlytrails Sep 06 '13 at 00:08
  • 1
    I was confused before between the 'UrlConnection' instance and the underlying Tcp/Ip/SSL connection, 2 separate concepts. The former is basically synonymous with a single HTTP page request. The latter is something that hopefully will be created once only if you're doing multiple page requests to the same server. – Tim Cooper Sep 07 '13 at 20:12
  • 1
    `connection.setDoOutput(true);` implies `POST`. Also you must correct your examples with the friend - `openConnection()` _is_ a NOOP - till actually the connection is needed you won't get any exceptions - see [below](http://stackoverflow.com/a/17665771/281545) :`openConnection simply creates a new connection object but does not establish any connection yet`. At least in android - which is Java 6 - you won't get any exceptions till you try to create the streams, get response code etc – Mr_and_Mrs_D Sep 15 '13 at 18:50
  • @Mr_and_Mrs_D - Seems different versions of Java have different definitions of `openConnection()`, so I feel like changing the explanation would then just lead to confusion. Different implementations of Java, such as Android, will likely have slight behavioral differences in terms of how things are implemented. As for the edits, thank you for making the code comments easier to read by eliminating the horizontal scrollbars. :) – jmort253 Sep 15 '13 at 21:05
  • 1
    You are welcome - yes it was plain unreadable on a small screen :) Still `openConnection` seems not to do much in all Java implementations. Try `openConnection("non existent url"); sleep ; disconnect;` - what will happen ? (not sure but I guess no exceptions etc) – Mr_and_Mrs_D Sep 15 '13 at 21:15
  • Where should I put the `connection.connect()` and `disconnect()`? – Alston Jul 25 '14 at 14:53
  • @Stallman - I've not needed to call connect and disconnect. It seems that possibly happens under the hood. But check out this example http://alvinalexander.com/blog/post/java/how-open-url-read-contents-httpurl-connection-java to see where you'd place connect, if you were to call it. Looks like it should be called *after* setting all of the headers/properties on the HttpUrlConnection. Hope this helps. – jmort253 Jul 25 '14 at 18:42
  • Maybe you could include in your description when the request HTTP headers are sent and the response HTTP headers are received? – Paŭlo Ebermann Nov 23 '15 at 15:04
  • This edited version of the answer which claims that a connection is made when you call openConnection does not match my experience on Windows and Linux over several Java versions and monitoring traffic with a tool like Wireshark. My experience is more similar to the pre-edit version and what is noted at this other site. It's like the connection is lazy loaded when it is actually needed: http://www.tbray.org/ongoing/When/201x/2012/01/17/HttpURLConnection – jla Nov 18 '20 at 21:16
  • @jla - This seems to also match what https://stackoverflow.com/a/60349100/552792 was saying when he ran with Wireshark as well. It's one of the reasons I struck out the original text as opposed to deleting it, as some folks were saying they experienced it differently. I don't exactly remember as I did this several years ago, but I think I stepped through with a debugger and watched logs on my server to confirm the connection being established. I wonder if maybe different implementations of Java perhaps behave differently? – jmort253 Jan 27 '21 at 06:28

Tim Bray presented a concise step-by-step, stating that openConnection() does not establish an actual connection. Rather, an actual HTTP connection is not established until you call methods such as getInputStream() or getOutputStream().

http://www.tbray.org/ongoing/When/201x/2012/01/17/HttpURLConnection
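The lazy behaviour described here can be checked with a small sketch. It is only an illustration under stated assumptions: the class name LazyConnectDemo and both helper methods are hypothetical, and the sketch assumes a local port that was just released stays unused for the short duration of the check. It points a URL at that port, where nothing is listening, and observes that openConnection() returns normally while the failure only appears at connect().

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.URL;

class LazyConnectDemo {

    /** Returns a local port with no listener (assumption: it stays free for a moment). */
    static int findClosedPort() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    /** Returns true if openConnection() succeeded and the failure only showed up at connect(). */
    static boolean openSucceedsButConnectFails(URL url) throws IOException {
        // No network traffic is expected here; an exception at this line would propagate.
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setConnectTimeout(2000);
        try {
            connection.connect(); // the socket is actually opened here
            return false;
        } catch (IOException expected) {
            return true;
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        URL url = new URL("http://127.0.0.1:" + findClosedPort() + "/");
        System.out.println("failure deferred to connect(): " + openSucceedsButConnectFails(url));
    }
}
```

Replacing connect() with getInputStream(), getOutputStream(), or getResponseCode() should surface the same failure at that call instead, which matches the observations reported in the comments above.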

anonymous

On which point does HTTPURLConnection try to establish a connection to the given URL?

On the port named in the URL if any, otherwise 80 for HTTP and 443 for HTTPS. I believe this is documented.

On which point can I know that I was able to successfully establish a connection?

When you call getInputStream() or getOutputStream() or getResponseCode() without getting an exception.

Are establishing a connection and sending the actual request done in one step/method call? What method is it?

No and none.

Can you explain the function of getOutputStream() and getInputStream() in layman's term?

Either of them first connects if necessary, then returns the required stream.

I notice that when the server I'm trying to connect to is down, I get an Exception at getOutputStream(). Does it mean that HTTPURLConnection will only start to establish a connection when I invoke getOutputStream()? How about the getInputStream()? Since I'm only able to get the response at getInputStream(), then does it mean that I didn't send any request at getOutputStream() yet but simply establishes a connection? Do HttpURLConnection go back to the server to request for response when I invoke getInputStream()?

See above.

Am I correct to say that openConnection() simply creates a new connection object but does not establish any connection yet?

Yes.

How can I measure the read overhead and connect overhead?

Connect: take the time getInputStream() or getOutputStream() takes to return, whichever you call first. Read: time from starting first read to getting the EOS.
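A rough sketch of that measurement follows, with several assumptions: the class ConnectReadTimer, its methods, and the /hello path are hypothetical, and a throwaway local com.sun.net.httpserver.HttpServer stands in for the real service. It calls connect() explicitly so the connect phase is timed on its own, then times everything from getInputStream() until end of stream.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class ConnectReadTimer {

    /** Throwaway local server so the sketch runs without external network access. */
    static HttpServer startLocalServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/hello", exchange -> {
            byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    /** Returns {connectNanos, requestAndReadNanos, bytesRead} for one GET of the given URL. */
    static long[] time(URL url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        try {
            long t0 = System.nanoTime();
            connection.connect(); // connection setup only
            long t1 = System.nanoTime();

            long bytes = 0;
            // getInputStream() sends the request and waits for the response headers.
            try (InputStream in = connection.getInputStream()) {
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    bytes += n;
                }
            }
            long t2 = System.nanoTime();
            return new long[] {t1 - t0, t2 - t1, bytes};
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startLocalServer();
        try {
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/hello");
            long[] t = time(url);
            System.out.printf("connect: %d us, request+read: %d us, bytes: %d%n",
                    t[0] / 1000, t[1] / 1000, t[2]);
        } finally {
            server.stop(0);
        }
    }
}
```

Note that the second figure includes sending the request and waiting for the headers, not just draining the body. If an earlier request left a reusable connection behind, the connect figure may be close to zero.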

user207421
  • 2
    I think OP meant at which point the connection is established and at which point we can get to know the connection status, not the port the URL connects to. I am guessing this was directed toward openConnection() and getInputStream()/getOutputStream()/getResponseCode(), the answer to which comes later. – Aniket Thakur Apr 12 '17 at 16:22
  • @AniketThakur I've answered all of that, in considerable detail. And he did ask about 'the port URL connects to', so I answered that too. – user207421 Jan 09 '21 at 00:24

On which point does HTTPURLConnection try to establish a connection to the given URL?

It's worth clarifying that there's the 'URLConnection' instance and then there's the underlying TCP/IP/SSL socket connection, two different concepts. The 'URLConnection' or 'HttpURLConnection' instance is synonymous with a single HTTP page request, and is created when you call url.openConnection(). But if you call url.openConnection() multiple times on the one 'url' instance then, if you're lucky, the requests will reuse the same TCP/IP socket and SSL handshake, which is good if you're doing lots of page requests to the same server, and especially good if you're using SSL, where the overhead of establishing the socket is very high.

See: HttpURLConnection implementation
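One way to observe whether the socket is reused is sketched below. Everything named here is an assumption for illustration: the class KeepAliveDemo, the /ping path, and a throwaway local com.sun.net.httpserver.HttpServer that records the client-side TCP port of each request. Two requests arriving from the same client port suggest the same socket was reused; reuse is not guaranteed, so the sketch only prints what it saw.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class KeepAliveDemo {

    /** Client-side TCP port of every request the local server has handled. */
    static final List<Integer> clientPorts = Collections.synchronizedList(new ArrayList<>());

    static HttpServer startLocalServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/ping", exchange -> {
            clientPorts.add(exchange.getRemoteAddress().getPort());
            byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    /** One request per HttpURLConnection instance; reads the body fully and closes the stream. */
    static int fetch(URL url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        int status = connection.getResponseCode();
        try (InputStream in = connection.getInputStream()) {
            in.readAllBytes();
        }
        // No disconnect() here: it may close the socket instead of leaving it available for reuse.
        return status;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startLocalServer();
        try {
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/ping");
            fetch(url);
            fetch(url);
            // Equal ports suggest both requests travelled over the same socket.
            System.out.println("client ports seen by server: " + clientPorts);
        } finally {
            server.stop(0);
        }
    }
}
```

Reading the response to the end and closing the stream is what gives the implementation a chance to keep the socket for the next request.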

Tim Cooper

I went through the exercise of capturing the low-level packet exchange, and found that the network connection is only triggered by operations like getInputStream, getOutputStream, getResponseCode, getResponseMessage, etc.

Here is the packet exchange captured when I wrote a small program to upload a file to Dropbox.

[Image: packet capture of the exchange]

Below is my toy program with annotations:

    /* Create a connection LOCAL object,
     * the openConnection() function DOES NOT initiate
     * any packet exchange with the remote server.
     * 
     * The configurations only setup the LOCAL
     * connection object properties.
     */
    HttpURLConnection connection = (HttpURLConnection) dst.openConnection();
    connection.setDoOutput(true);
    connection.setRequestMethod("POST");
    // ... headers setup
    byte[] testContent = {0x32, 0x32};

    /**
     * This triggers a packet exchange with the remote
     * server to create a link. But writing/flushing
     * to the output stream does not send out any data.
     * 
     * The payload is buffered locally.
     */
    try (BufferedOutputStream outputStream = new BufferedOutputStream(connection.getOutputStream())) {
        outputStream.write(testContent);
        outputStream.flush();
    }

    /**
     * Trigger sending the payload to the server.
     * The client gets the whole response (including response code,
     * message, and content payload).
     */
    int responseCode = connection.getResponseCode();
    System.out.println(responseCode);

    /* Here no further exchange happens with remote server, since
     * the input stream content has already been buffered
     * in previous step
     */
    try (InputStream is = connection.getInputStream()) {
        Scanner scanner = new Scanner(is);
        StringBuilder stringBuilder = new StringBuilder();
        while (scanner.hasNextLine()) {
            stringBuilder.append(scanner.nextLine()).append(System.lineSeparator());
        }
    }

    /**
     * Trigger the disconnection from the server.
     */
    String responsemsg = connection.getResponseMessage();
    System.out.println(responsemsg);
    connection.disconnect();
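The buffering described in the annotations can also be observed without a packet sniffer. The sketch below is an assumption-laden illustration, not the program above: the class BufferedPostDemo, the /comment path, and the local com.sun.net.httpserver.HttpServer are hypothetical. It assumes the default OpenJDK behaviour where neither setFixedLengthStreamingMode nor setChunkedStreamingMode has been called; other implementations, such as Android's, may behave differently. The server counts requests, and the client reads that counter after closing the output stream and again after getResponseCode().

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

class BufferedPostDemo {

    /** Requests the local test server has received so far. */
    static final AtomicInteger requestsSeen = new AtomicInteger();

    static HttpServer startLocalServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/comment", exchange -> {
            requestsSeen.incrementAndGet();
            exchange.getRequestBody().readAllBytes(); // drain the POST body
            exchange.sendResponseHeaders(204, -1);    // 204 No Content, no response body
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Returns {requests seen after closing the output stream, requests seen after getResponseCode()}. */
    static int[] postAndObserve(URL url) throws IOException {
        int before = requestsSeen.get();
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setDoOutput(true);
        connection.setRequestMethod("POST");
        try {
            try (OutputStream out = connection.getOutputStream()) {
                out.write("message=hello".getBytes(StandardCharsets.UTF_8));
            }
            int afterWrite = requestsSeen.get() - before;    // body still buffered locally
            connection.getResponseCode();                    // request is sent here
            int afterResponse = requestsSeen.get() - before;
            return new int[] {afterWrite, afterResponse};
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startLocalServer();
        try {
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/comment");
            int[] seen = postAndObserve(url);
            System.out.println("after write/close: " + seen[0] + ", after getResponseCode(): " + seen[1]);
        } finally {
            server.stop(0);
        }
    }
}
```

With one of the streaming modes enabled, the body is written to the socket as it is produced, so the counts observed this way may differ.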
HarryQ