I have been developing a Java web app that simply takes first_name, middle_name and last_name parameters via an HTML form and then embeds that data into an XML file and responds back to the client.

I set the Content-Type: text/xml.

Here is my servlet code:

package com.adi.request.xml;

import java.io.*;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class RequestToXMLServlet extends HttpServlet {

    private String lastName;
    private String firstName;
    private String middleName;

    /* Request Handling... */

    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) {

        setName(request);  // Initialising the firstName, middleName and lastName
        String xmlDoc = getXML();  // Build and receive the XML output

        response.setContentType("text/xml");  // TO BE NOTED...

        try(PrintWriter writer = response.getWriter()) {

            writer.print(xmlDoc);  // Printing the XML output
            writer.flush();
        } catch(IOException e) {

            e.printStackTrace();
        }
    }   


    // Setting the firstName, middleName and lastName
    private void setName(HttpServletRequest request) {

        firstName = request.getParameter("first_name");
        lastName = request.getParameter("last_name");
        middleName = request.getParameter("middle_name");
    }

    // Building the XML output
    private String getXML() {
        // The append() method just adds a \r\n at the end of every line.
        String xmlDoc = append("<?xml version=\"1.0\" encoding=\"utf-8\"?>")+
                        append("<Request>")+
                        append("    <FirstName>"+firstName+"</FirstName>")+
                        append("    <MiddleName>"+middleName+"</MiddleName>")+
                        append("    <LastName>"+lastName+"</LastName>")+
                        append("</Request>");

        return xmlDoc;              
    }

    private String append(String str) {

        return str + "\r\n";
    }
}  

The HTML form:

<!DOCTYPE html>
<html> 
<head>
 <title>Request to XML - Servlet</title>
</head>

<body>
  <form method="GET" action="Request.do">
   <label for="first_name">Firstname:</label>
   <input type="text" name="first_name" id="first_name" />
   
   <br>
   
   <label for="middle_name">Middlename</label>
   <input type="text" name="middle_name" id="middle_name" />
   
   <br>
   
   <label for="last_name">Lastname</label>
   <input type="text" name="last_name" id="last_name" />
   
   <br>
   
   <input type="submit" name="submit" value="GET" />
  </form>
 </body>
</html>

This works fine and my browser properly displays the XML formatted data.

The problem is this:

I wrote a small Jython app that makes an HTTP POST request to the above Java servlet using raw sockets. Though it receives properly formatted XML data, it also receives unwanted characters at the beginning and end of the actual XML data.

Here is my jython code:

from java.io import *
from java.net import *
from java.util import *

sock = Socket("localhost", 8080)

ostream = sock.getOutputStream()
writer = PrintWriter(ostream)

params="first_name=Aditya&middle_name=Rameshwarpratap&last_name=Singh"

writer.print("GET /RequestToXML/Request.do?"+params+" HTTP/1.1\r\n")
writer.print("Host: localhost:8080\r\n")
writer.print("Connection: Close\r\n")
writer.print("\r\n")
writer.flush()

istream = sock.getInputStream()
scanner = Scanner(istream)

while(scanner.hasNextLine()):
    print(scanner.nextLine())

istream.close()
ostream.close()
scanner.close()
writer.close()
sock.close()  

The output of this code is:

HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: text/xml;charset=ISO-8859-1
Transfer-Encoding: chunked
Date: Thu, 16 Jul 2015 18:46:37 GMT
Connection: close

bc  // What is this?    
<?xml version="1.0" encoding="utf-8"?>
<Request type="POST">
    <FirstName>Aditya</FirstName>
    <MiddleName>Rameshwarpratap</MiddleName>
    <LastName>Singh</LastName>
</Request>

0  // And this?

So my questions are:

  1. What are those characters and why are they even sent when the content type is text/xml?

  2. This is unrelated, but in my Jython code I close all the streams and the socket at the end. Is it necessary to close all of them, or would closing just a few of them do the cleanup job?


1 Answer

Those are chunk lengths in hexadecimal. The response body is being sent in chunks, as indicated by this header:

Transfer-Encoding: chunked

More detail about this transfer encoding can be found on Wikipedia. The line with bc indicates the start of a chunk that is 188 bytes long (0xBC = 188). The line with 0 indicates the terminating chunk, so the client knows it can stop reading and doesn't need to wait for further chunks with remaining content, in case the connection is set to keep alive.
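As a rough illustration of the framing described above, here is a minimal sketch of a chunked-body decoder in plain Java. The class name ChunkedBodyDecoder and its methods are hypothetical, made up for this example; it assumes the input stream is positioned just after the blank line that ends the response headers, and it ignores any trailer headers after the terminating chunk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
public class ChunkedBodyDecoder {

    // Reads one line and returns it without its \r\n line ending.
    private static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') break;
            if (b != '\r') line.write(b);
        }
        return line.toString("ISO-8859-1");
    }

    // Decodes a chunked body: hex size line, chunk data, CRLF, repeated until size 0.
    public static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semi = sizeLine.indexOf(';');      // drop optional chunk extensions
            if (semi >= 0) sizeLine = sizeLine.substring(0, semi);
            int size = Integer.parseInt(sizeLine.trim(), 16);  // e.g. "bc" -> 188
            if (size == 0) break;                  // terminating chunk
            byte[] chunk = new byte[size];
            int read = 0;
            while (read < size) {
                int n = in.read(chunk, read, size - read);
                if (n == -1) throw new IOException("Stream ended inside a chunk");
                read += n;
            }
            body.write(chunk, 0, size);
            readLine(in);                          // consume the CRLF after the chunk data
        }
        return body.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String raw = "5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n";
        byte[] decoded = decode(new ByteArrayInputStream(raw.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println(new String(decoded, StandardCharsets.ISO_8859_1)); // Hello, world
    }
}
```

A raw socket client would have to do something like this itself after parsing the headers; real HTTP clients do it internally.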

The servlet container will automatically switch to chunked encoding when the content length is unknown and the client has identified itself as an HTTP 1.1 capable client. This is even explicitly mentioned in the javadoc of doGet():

...

Where possible, set the Content-Length header (with the ServletResponse.setContentLength(int) method), to allow the servlet container to use a persistent connection to return its response to the client, improving performance. The content length is automatically set if the entire response fits inside the response buffer.

When using HTTP 1.1 chunked encoding (which means that the response has a Transfer-Encoding header), do not set the Content-Length header.

...

Your client is not written in a way that lets it consume chunked responses. It's an extremely basic socket which, in its request, pretends to be an HTTP 1.1 client.
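For comparison, here is a sketch of what a chunk-aware client looks like from the caller's side, using HttpURLConnection from the JDK. To keep it self-contained it does not talk to your servlet; instead it starts a throwaway local server with the JDK's built-in com.sun.net.httpserver.HttpServer, which sends its body chunked when the response length is passed as 0. The class name ChunkAwareClientDemo, the /demo path and the sample XML are assumptions made up for this illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original application.
public class ChunkAwareClientDemo {

    // Fetches a URL with HttpURLConnection, which understands HTTP 1.1 and
    // strips the chunk framing before handing the body to the caller.
    public static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] block = new byte[4096];
            int n;
            while ((n = in.read(block)) != -1) buf.write(block, 0, n);
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    // Starts a local server that answers with a chunked XML body,
    // fetches it once, and returns what the client saw.
    public static String demo() throws IOException {
        byte[] xml = "<Request><FirstName>Aditya</FirstName></Request>"
                .getBytes(StandardCharsets.UTF_8);
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/demo", exchange -> {
            exchange.getResponseHeaders().set("Content-Type", "text/xml; charset=UTF-8");
            exchange.sendResponseHeaders(200, 0); // length 0 = unknown, body is sent chunked
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(xml);
            }
        });
        server.start();
        try {
            return fetch("http://127.0.0.1:" + server.getAddress().getPort() + "/demo");
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // only the XML, without chunk size lines
    }
}
```

Pointing fetch() at the servlet URL from the question should behave the same way, since the chunk handling happens inside the connection rather than in the calling code.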

If it's not feasible to rewrite the client so that it can deal with chunked responses (at the very least, try having it pretend to be an HTTP 1.0 client), or to switch to a real HTTP 1.1 aware client (in Java terms, that would be e.g. URLConnection), then rewrite your servlet so that it sets the content length.

@Override
public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException {
    // ...
    String xmlDoc = getXML();
    byte[] content = xmlDoc.getBytes("UTF-8");  // Encode first, so the byte length is known

    response.setContentType("text/xml");
    response.setCharacterEncoding("UTF-8");
    response.setContentLengthLong(content.length);
    response.getOutputStream().write(content);
}

If you're not on Java EE 7 / Servlet 3.1 yet, and you can guarantee that the XML content is not larger than Integer.MAX_VALUE (2GB), then use

    response.setContentLength((int) content.length);

or if you can't guarantee that, then use

    response.setHeader("Content-Length", String.valueOf(content.length));

Note that it must represent the byte length, and thus certainly not the character (string) length. Also note that you don't need a try-with-resources statement; the container takes care of flushing and closing by itself.
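The difference between byte length and character length only shows up with non-ASCII content, so here is a small sketch of it. The class ContentLengthDemo and the sample name are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing why Content-Length must count bytes.
public class ContentLengthDemo {

    // Number of bytes the text occupies once encoded as UTF-8; this is the
    // value a Content-Length header needs for a UTF-8 encoded body.
    public static int utf8ByteLength(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String name = "Jos\u00e9";                // "José": the é takes 2 bytes in UTF-8
        System.out.println(name.length());        // 4 characters
        System.out.println(utf8ByteLength(name)); // 5 bytes
    }
}
```

With a name like this, a Content-Length based on String.length() would be one byte short and the client would cut off the end of the document.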


Unrelated to the concrete problem, your servlet is dealing with instance variables on a per-request basis. This is not threadsafe. Move those instance variables to inside the method block. For more detail, see also How do servlets work? Instantiation, sessions, shared variables and multithreading.
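One possible shape for that change, sketched without the servlet API so it compiles on its own: the names are passed in as parameters instead of being stored in fields. The class NameXmlBuilder and the method buildXml are hypothetical; in the servlet, doGet() would read the three request parameters into local variables and pass them along.

```java
// Hypothetical helper; holds no state, so it is safe to share between threads.
public class NameXmlBuilder {

    // All per-request data arrives as parameters and lives in local variables,
    // so concurrent requests cannot overwrite each other's values.
    public static String buildXml(String firstName, String middleName, String lastName) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"utf-8\"?>\r\n");
        xml.append("<Request>\r\n");
        xml.append("    <FirstName>").append(firstName).append("</FirstName>\r\n");
        xml.append("    <MiddleName>").append(middleName).append("</MiddleName>\r\n");
        xml.append("    <LastName>").append(lastName).append("</LastName>\r\n");
        xml.append("</Request>\r\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // In doGet() the arguments would come from request.getParameter(...).
        System.out.print(buildXml("Aditya", "Rameshwarpratap", "Singh"));
    }
}
```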
