I am writing a server that receives HTTP requests (only the GET method, as a simplification for school work).

I use the Socket class to get the connection, then an InputStream wrapped in a Scanner to read the HTTP request.

However, while reading the HTTP request headers line by line using hasNext(), the program hangs at hasNext(), waiting for more input even though it has already consumed all the lines.

Below is my readRequest method:

public void readRequest(Socket client) throws BadRequestException {
    StringBuilder builder = new StringBuilder();

    try {
        Scanner sc = new Scanner(client.getInputStream());
        sc.useDelimiter("\\r\\n");

        while(sc.hasNext()){
            builder.append(sc.next());
            builder.append("\n");
        }

        parseRequestFromClient(builder.toString());
    } catch (IOException e) {
        throw new BadRequestException(e.getMessage());
    }
}
Xu Chen
  • Have you tried any debugging? – xenteros Sep 03 '16 at 11:16
  • Are you sure you are reading anything in your loop? I don't see any part that actually sends headers to the server saying what type of request it is (GET or something else, and which resource exactly you want to get from the server). Without these headers the server may simply wait and not send you anything, which may hold `hasNext()`. – Pshemo Sep 03 '16 at 11:31
  • @xenteros Yes, I used Eclipse to debug. And it always stops at that `while(sc.hasNext())` line. – Xu Chen Sep 03 '16 at 11:45
  • @Pshemo Sorry for the ambiguity. It is actually part of school work. So it is simplified in a way it only handles GET method for now. – Xu Chen Sep 03 '16 at 11:46
  • My question is: are you sure that the server is sending you anything back? If it's not, but the connection is still open (since `hasNext()` didn't return `false`), then it may mean that the server doesn't know what resource you want to get. To specify that resource (say it is `http://server.com/path/to/resource`) you would need to send the "get" header `GET /path/to/resource HTTP/1.1` (assuming your socket is connected to `http://server.com`). If your homework doesn't make you use sockets, then you can simplify your code with the URLConnection class http://stackoverflow.com/q/2793150/1393766. – Pshemo Sep 03 '16 at 12:05
  • Anyway why do you want to use `hasNext()` with `\r\n` delimiter? Scanner already provides `hasNextLine()` and `nextLine()` methods. – Pshemo Sep 03 '16 at 12:07
  • You aren't just reading the headers with this code. You are reading the entire response as lines, and you are running into HTTP keepalive. You need a good knowledge of RFC 2616 to implement HTTP, specifically the parts about content length, and there is nothing in this code that attempts to implement it. Given that `HttpURLConnection` already exists, it is hard to see the point of even trying. – user207421 Sep 03 '16 at 13:03

2 Answers

You're facing this issue because hasNext() reads from the underlying source behind the scenes to check whether another matching token is available. It only returns false once the source reaches its end (the read returns -1), which is not your case here: the socket is still open, so the end of the stream is never reached.

As a reminder, here is the Javadoc of the method hasNext():

Returns true if this scanner has another token in its input. This method may block while waiting for input to scan. The scanner does not advance past any input.
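
One way to avoid the blocking call is to stop at the empty line that terminates the header section instead of waiting for the end of the stream. The following is only a minimal sketch under that assumption: the class name `HeaderReader` and the method `readHead` are made up for illustration, and it handles requests without a body (such as the GET requests in the question).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original question or of any library.
final class HeaderReader {

    // Reads the request line and the headers, stopping at the blank line
    // that ends the header section instead of waiting for end of stream.
    static String readHead(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.US_ASCII));
        StringBuilder builder = new StringBuilder();
        String line;
        // readLine() returns null at end of stream and "" for the blank line.
        while ((line = reader.readLine()) != null && !line.isEmpty()) {
            builder.append(line).append('\n');
        }
        return builder.toString();
    }
}
```

In the question's readRequest method this would be called as `HeaderReader.readHead(client.getInputStream())`. Note that the BufferedReader may buffer bytes beyond the blank line, so this sketch is not suitable as-is if a request body has to be read afterwards.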


You should avoid reinventing the wheel and use a library that will do it for you, like DavidWebb and many others.

Nicolas Filotto
  • Thanks for the post shared. In this case, using merely sc.hasNext() will not let me consume all the content of the http request then? For now, I used `do{ nextLine = sc.next(); builder.append(nextLine); builder.append("\n"); } while (!nextLine.equals(""));` to get around this issue. – Xu Chen Sep 03 '16 at 11:48
  • You should consider using a BufferedReader instead https://docs.oracle.com/javase/8/docs/api/java/io/BufferedReader.html – Nicolas Filotto Sep 03 '16 at 11:52

According to RFC 7230 (RFC 2616 is obsolete), read raw bytes from the socket.

The HTTP request MUST be encoded in 7-bit US-ASCII. Any byte with bit 7 set, or any byte less than 0x20 other than 0x0a and 0x0d, should result in a 400 Bad Request.

Read until 0x0d 0x0a 0x0d 0x0a sequence.
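
The byte-level reading described above could be sketched roughly as follows. The names `RequestHeadReader`, `readHead` and `MAX_HEAD_BYTES` are hypothetical, the size limit is an assumed value, and an IOException whose message starts with "400" merely stands in for whatever error handling the server uses.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that follows the rules stated in this answer.
final class RequestHeadReader {
    private static final int MAX_HEAD_BYTES = 8192; // assumed limit, not taken from the RFC

    // Returns the request head without the final CR LF CR LF.
    static String readHead(InputStream in) throws IOException {
        ByteArrayOutputStream head = new ByteArrayOutputStream();
        int matched = 0; // number of bytes of CR LF CR LF seen in a row
        int b;
        while ((b = in.read()) != -1) {
            // Reject bytes with bit 7 set and control bytes other than CR and LF.
            if (b > 0x7f || (b < 0x20 && b != 0x0d && b != 0x0a)) {
                throw new IOException("400 Bad Request: illegal byte 0x" + Integer.toHexString(b));
            }
            head.write(b);
            if (head.size() > MAX_HEAD_BYTES) {
                throw new IOException("400 Bad Request: request head too large");
            }
            // Even positions of the terminator expect CR, odd positions expect LF.
            int expected = (matched % 2 == 0) ? 0x0d : 0x0a;
            if (b == expected) {
                matched++;
            } else {
                matched = (b == 0x0d) ? 1 : 0; // a stray CR may start a new terminator
            }
            if (matched == 4) {
                byte[] bytes = head.toByteArray();
                return new String(bytes, 0, bytes.length - 4, StandardCharsets.US_ASCII);
            }
        }
        throw new IOException("400 Bad Request: stream ended before CR LF CR LF");
    }
}
```

Reading one byte at a time never consumes anything past the terminator, so a body (if any) stays in the stream. Wrapping the socket's stream in a BufferedInputStream once, and passing that same object around, keeps the single-byte reads cheap.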

After that, the first line is the request line, terminated by 0x0d 0x0a; the rest are header lines.

Split the request line on 0x20; it should yield exactly 3 parts. Anything else means someone is trying to hack your server: send a 400 status.

Each header line should split on ": " into exactly 2 parts. Anything else: send a 400 status. If trim(headerKey) != headerKey, send a 400 status.

Only after this may you start URL-decoding the request target (the second part of the request line) and the header values.

Duplicate header keys? 400 status. Duplicate request parameters? 400 status.
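
The splitting rules above could look roughly like this in Java. This is only a sketch of the strict checks listed in this answer: `RequestHeadParser` and `ParsedHead` are hypothetical names, the input is assumed to be the request head without the trailing blank line, and an IllegalArgumentException stands in for sending a 400 response. Header names are compared in lower case because HTTP header names are case-insensitive.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical parser applying the strict checks described in this answer.
final class RequestHeadParser {

    // Hypothetical holder for the parsed request line and headers.
    static final class ParsedHead {
        final String method;
        final String target;
        final String version;
        final Map<String, String> headers;

        ParsedHead(String method, String target, String version, Map<String, String> headers) {
            this.method = method;
            this.target = target;
            this.version = version;
            this.headers = headers;
        }
    }

    static ParsedHead parse(String head) {
        String[] lines = head.split("\r\n", -1);
        // The request line must consist of exactly three space-separated parts.
        String[] requestLine = lines[0].split(" ", -1);
        if (requestLine.length != 3) {
            throw bad("request line must have exactly 3 parts");
        }
        Map<String, String> headers = new LinkedHashMap<>();
        for (int i = 1; i < lines.length; i++) {
            // Each header line must split on ": " into exactly two parts.
            String[] keyValue = lines[i].split(": ", -1);
            if (keyValue.length != 2) {
                throw bad("malformed header line");
            }
            String key = keyValue[0];
            if (key.isEmpty() || !key.equals(key.trim())) {
                throw bad("whitespace around header key");
            }
            // putIfAbsent returns the previous value if the key was already present.
            if (headers.putIfAbsent(key.toLowerCase(Locale.ROOT), keyValue[1]) != null) {
                throw bad("duplicate header key");
            }
        }
        return new ParsedHead(requestLine[0], requestLine[1], requestLine[2], headers);
    }

    private static IllegalArgumentException bad(String reason) {
        return new IllegalArgumentException("400 Bad Request: " + reason);
    }
}
```

URL decoding of `target` and of the header values is deliberately left out here, since it should only happen after all of these checks have passed.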

Doing anything else will result in a server that is known to be hackable by request parameter and/or header smuggling.

Only by interpreting the input as 7-bit US-ASCII will you be immune to UTF-8 hacks. 0x0a has only one representation in 7-bit US-ASCII, but a lenient UTF-8 decoder may accept several longer (overlong) encodings of it: 0x0a, 0x00a, 0x000a, 0x0000a and so on are all treated as the same character!

Read RFC 7230-7235 before doing any programming! Make your server very hard to hack.