
I want to build an application that parses various RSS feeds and sends the information to a remote server. The information is sent as XML over HTTP. At first I deployed this application on my own server, sending the XML with the method shown in this tutorial by Java Tips. Here is my code, which is copied from the example:

First Method

    String strURL = "http://localhost/readme/readme_xml";
    String strXMLFilename = "output.xml";
    File input = new File(strXMLFilename);


    PostMethod post = new PostMethod(strURL);
    post.setRequestEntity(new InputStreamRequestEntity(
            new FileInputStream(input), input.length()));
    post.setRequestHeader(
            "Content-type", "text/xml; charset=ISO-8859-1");
    HttpClient httpclient = new HttpClient();
    try {

        int result = httpclient.executeMethod(post);            
        System.out.println("Response status code: " + result);            
        System.out.println("Response body: ");
        System.out.println(post.getResponseBodyAsString());            
    } finally {
        post.releaseConnection();
    }

This works perfectly (I even tested it against a remote server rather than localhost). However, I can no longer use my own server to deploy this application, so I decided to migrate to Google App Engine. As we know, not all libraries are allowed in that environment, so I tried another method shown on ExampleDepot.com (I can't find the exact URL, though), as below:

Second Method

        try {
            /* fill up this url with the remote server url */
            URL url = new URL("http://localhost/readme/readme_xml");
            FileReader fr = new FileReader("output.xml");
            char[] buffer = new char[1024*10];
            int len = 0;
            if ((len = fr.read(buffer)) != -1){
                /* send http request to remote server */
                URLConnection conn = url.openConnection();
                conn.setRequestProperty("Content-Type","text/xml;charset=ISO-8859-1"); /* need to specify the content type */
                conn.setDoOutput(true);
                conn.setDoOutput(true);
                PrintWriter pw = new PrintWriter(conn.getOutputStream());
                pw.write(buffer, 0, len);
                pw.flush();
                /* receive response from remote server*/
                BufferedReader bf = new BufferedReader(new InputStreamReader(conn.getInputStream()));
                String input = null;
                while ((input = bf.readLine()) != null){
                    System.out.println(input);
                }
            }

        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

The second method, though, doesn't work and gives the following error (I use a PHP SimpleXMLElement object to parse the XML on the remote host):

Error message from remote server

Here's the PHP code from the remote server (for now, I just want SimpleXMLElement to parse the XML without doing anything fancy):

    $xml = new SimpleXMLElement('php://input', NULL, TRUE);
    foreach ($xml->attributes() as $name => $val) {
        echo "[".$name."] = ".$val."\n";
    }

I thought the cause of the problem was a malformed XML file (the Eclipse IDE reports an error of "invalid byte 1 of 1-byte utf-8 sequence"). But when I send the exact same XML file using the first method, it still works perfectly.

So is there any adjustment that I need to make to the second method? Or is there another method that I can use to send an XML file to a remote server? Let me know if I need to add any other details. Thanks for your help.


NOTE: I actually solved this problem using the solution given in the comments. I didn't use the approaches suggested in the answers, even though those answers are pretty useful, so I didn't accept any of them as the best answer. Nonetheless, I appreciate all of your help, and the answers deserve my upvotes. Cheers!

vandershraaf
  • How big is your input file (output.xml)? In your second solution, you'll only read at most the first 10240 characters of the file. Also, instead of flushing the print stream, you should close the output stream. – beny23 Aug 10 '11 at 13:29
  • As beny23 said, you're not reading the entire XML. Convert that `if` statement to a `while` loop and move the network code out of the loop. Also close the `PrintWriter`. – asgs Aug 10 '11 at 13:50
  • @beny23 and asgs: I just followed both of your suggestions and that fixed most of it (one other issue persists, but the errors above are gone). I didn't realize that I was reading only a limited part of the stream. So upvotes for both of you. Thanks! – vandershraaf Aug 10 '11 at 15:45
  • @asgs upvote for you too – vandershraaf Aug 10 '11 at 15:45

3 Answers


I guess you need to change the content type to multipart/form-data. See an already answered question for details. The file upload is discussed at the bottom of this example.

Santosh

I would, as the first answer suggests, read the file with an InputStream. Converting from bytes to chars and back again is unnecessary and a source of errors. Also, verify that the input file really is using the ISO-8859-1 encoding.

UPDATE:

When using a FileReader, you accept the default encoding (i.e. how to make chars from bytes). This encoding must match the encoding used for the input file, otherwise there's a great risk that the result is corrupted. The default Java encoding is different for different platforms, so it is generally not a good idea to rely on it.
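To make the mismatch concrete, here is a minimal sketch (not code from the question) of what happens when ISO-8859-1 bytes are decoded with a different charset and written back out. The class name `EncodingMismatchDemo` and the helper `roundTrip` are made up for illustration, and UTF-8 stands in for whatever the platform default happens to be.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingMismatchDemo {

    /** Decodes the bytes with one charset, then re-encodes the text with another. */
    static byte[] roundTrip(byte[] original, Charset decodeWith, Charset encodeWith) {
        String text = new String(original, decodeWith);
        return text.getBytes(encodeWith);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is the single byte 0xE9 in ISO-8859-1.
        byte[] latin1 = "<a>\u00e9</a>".getBytes(StandardCharsets.ISO_8859_1);

        // Matching charsets on both sides: the bytes survive unchanged.
        byte[] same = roundTrip(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);
        System.out.println("matching charset intact:  " + Arrays.equals(latin1, same));

        // Decoding as UTF-8 (assumed here to be the platform default) turns
        // the lone 0xE9 into a replacement character, so the data is damaged.
        byte[] damaged = roundTrip(latin1, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("mismatched charset intact: " + Arrays.equals(latin1, damaged));
    }
}
```

If a Reader is really needed, passing the charset explicitly (for example `new InputStreamReader(new FileInputStream(file), charset)`) avoids depending on the platform default.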

In your second example, there's no reason to read the file as characters, since it will be sent on the wire as bytes anyway. Using byte streams all the way also avoids the encoding issue (apart from the information in the content-type header).
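As a rough sketch of the byte-stream approach, the second example could look something like this. It assumes Java 7 or later for `java.nio.file.Files` (which may not be available in every hosting environment), reuses the URL and file name from the question, and the class and method names (`XmlBytePoster`, `writeBody`, `postXml`) are invented here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class XmlBytePoster {

    /** Streams the file's raw bytes to out, with no byte-to-char conversion. */
    static long writeBody(Path xmlFile, OutputStream out) throws IOException {
        return Files.copy(xmlFile, out);
    }

    /** POSTs the file unchanged and returns the HTTP status code. */
    static int postXml(URL url, Path xmlFile, String contentType) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        // The header should describe the encoding the file is actually stored in.
        conn.setRequestProperty("Content-Type", contentType);
        try (OutputStream out = conn.getOutputStream()) {
            writeBody(xmlFile, out); // closing the stream completes the request body
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        URL url = new URL("http://localhost/readme/readme_xml");
        int status = postXml(url, Paths.get("output.xml"), "text/xml; charset=ISO-8859-1");
        System.out.println("Response status code: " + status);
    }
}
```

Because the whole file is copied as bytes, the 10240-character limit of the single `read` call goes away as well.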

forty-two

Never read a file as chars unless you are reading a text file. XML is not text, it is a binary format. Copy the file using normal InputStreams and byte[]s.

Also, as @beny23 suggested in his comment, make sure you always copy streams using a loop, not a single read() (even if your buffer is big enough, it is not guaranteed that the InputStream will give you all the bytes in one call, even for a FileInputStream).
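A typical copy loop looks roughly like the sketch below. The class name `StreamCopy`, the method name `copy`, and the 8192-byte buffer size are arbitrary choices for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StreamCopy {

    /** Copies every byte from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        // read() may return fewer bytes than the buffer holds, so keep
        // going until it signals end of stream with -1.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[20_000]; // deliberately larger than one buffer
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out);
        System.out.println("Copied " + copied + " bytes");
    }
}
```

In the question's code, the input would be a FileInputStream on output.xml and the output would be the stream returned by `conn.getOutputStream()`.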

jtahlborn
  • wow, i love the downvotes! seriously, please read up on the xml format. then, please remove your downvotes once the code is fixed and you realize that my answer is correct. @Ryan Stewart - please do some background reading before posting comments on stuff you do not understand. – jtahlborn Aug 10 '11 at 13:59
  • to all those who are confused, yes, i understand that xml looks like text. however, if you understand all the details of parsing an xml document, you will understand that just blindly converting an xml document to chars is a sure-fire way to corrupt the document. – jtahlborn Aug 10 '11 at 14:01
  • I'd extend your statement to say that there's no such thing as a "text" file, period. There's a stream of bytes and an encoding that can be applied to those bytes. And if you pick the wrong encoding, you're screwed. XML has the benefit that its encoding is either specified in the file or implied. – parsifal Aug 10 '11 at 16:45
  • Per [Extensible Markup Language 1.1](http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-well-formed): "Definition: A textual object is a well-formed XML document if..." There is no corresponding, "A binary object is a well-formed XML document if..." Of course any data stored on or manipulated by a computer system is represented in binary format, but if that's what you meant, your answer is somewhat ambiguous. – Ryan Stewart Aug 10 '11 at 23:52
  • @Ryan Stewart - no, i mean if you blindly treat xml as textual data, then you are guaranteed to corrupt it, exactly like the example above. the fact that you aren't understanding this seems to indicate that you have not spent a lot of time dealing with it. i've fixed many a bug in systems that i've worked in and answered many a post on various forums where people treated their xml data like text and it bit them. if you are not parsing the xml, you should be treating it like bytes. if you want to parse it, then use an xml parser which correctly manages the byte -> char interpretation. – jtahlborn Aug 11 '11 at 00:12
  • I understand the basis of your statement now. I would agree with you in principle, but aren't you assuming there's a prolog with encoding information in it? That's not a given in this problem, and the fact that it's sent over HTTP means there could be external encoding information that could be important as well. That's aside from the fact that the problem wasn't with encoding anyway. If you clarify your answer, I'll upvote it to balance the downvote because properly handling the encoding *is* an important consideration, if irrelevant to this problem. – Ryan Stewart Aug 13 '11 at 00:49
  • @Ryan Stewart - regardless of the prolog (which certainly _is_ important) this line of code `new FileReader("output.xml");` is virtually guaranteed to be a bug. that will work/break depending on the system settings of the jvm. the part about http, etc. is completely irrelevant. if the data is broken when it is read from file, everything after that point is meaningless. – jtahlborn Aug 13 '11 at 02:28
  • You're right. I wasn't thinking about that part, but again, that can cause problems when reading any type of file, not just XML, and an XML parser with no encoding information is just as prone to failure as this. If you know otherwise, then yes, you know more than I, and I'd appreciate being enlightened. – Ryan Stewart Aug 13 '11 at 03:25
  • @Ryan Stewart - no, xml encoding has well defined semantics http://www.w3.org/TR/xml/#sec-guessing . if there is no prolog, then basically (slightly more complicated, but this is the gist) the xml is encoded using utf-8 or utf-16 if a BOM is included (unless you are using some other mechanism to convey the encoding). in a lot of ways, you are basically making my argument for me. treat encoded xml data like binary data at all times. if you do that, you won't break it. – jtahlborn Aug 13 '11 at 16:52