How to get a file from a curl get request in Java?

Question

I'm trying to use an API to download some XBRL files. In order to do that I need to do a curl request, like this:

curl -XGET http://distribution.virk.dk/offentliggoerelser --data-binary @query_regnskaber.json

The idea is, as I understand it, that "@query_regnskaber.json" is a JSON file containing a query that I need to send with my request, and in return I get one or more XBRL files or some other data. I'm using Java with the Play framework (I'm not specifically using Play for the curl request, but maybe someone knows of Play features for making this kind of request).

This is my current code:

        String jsonStr =
    "{" +
        "\"query\": {" +
        "\"bool\": {" +
            "\"must\": [" +
            "{" +
                "\"term\": {" +
                    "\"offentliggoerelse.dokumenter.dokumentMimeType\": \"application\"" +
            "}" +
            "}," +
            "{" +
                "\"term\": {" +
                    "\"offentliggoerelse.dokumenter.dokumentMimeType\": \"xml\"" +
            "}" +
            "}," +
            "{" +
                "\"range\": {" +
                    "\"offentliggoerelse.offentliggoerelsesTidspunkt\": {" +
                        "\"from\": \"2016-12-01\"" +
                    "}" +
                "}" +
            "}" +
            "]," +
            "\"must_not\": []," +
            "\"should\": []" +
        "}" +
    "}," +
        "\"size\": 1000" +
    "}";
    String urlStr = "http://distribution.virk.dk/offentliggoerelser";
    JSONObject jsonObj = new JSONObject(jsonStr);
    URL myURL = new URL(urlStr);
    HttpURLConnection urlCon = (HttpURLConnection)myURL.openConnection();
    urlCon.setRequestProperty("Content-Type", "application/json; charset=UTF-8");
    urlCon.setRequestMethod("GET");
    urlCon.setDoInput(true);
    urlCon.setDoOutput(true);
    urlCon.connect();
    OutputStream os = urlCon.getOutputStream();
    os.write(jsonObj.toString().getBytes("UTF-8"));
    os.close();
    BufferedReader br = new BufferedReader(new InputStreamReader((urlCon.getInputStream())));
    String output;
    System.out.println("Output from Server .... \n");
    while ((output = br.readLine()) != null) {
        System.out.println(output);
    }
    urlCon.disconnect();

Something goes wrong and I'm not sure whether it's because of some missing settings, my code or both. I get the 403 error on the "urlCon.getInputStream()" call.

The only documentation I can find for the API is in Danish. It also mentions that it uses ElasticSearch, which I assume is used to find specific XBRL files that can be found on "http://distribution.virk.dk/offentliggoerelser/_search". Finding specific XBRL files is something I want to be able to do too. Just in case, here is a link to the API documentation.

I'm using the example json query that can be found in the documentation, in my code.

Thank you for your help.

My json test query:

{
    "query": {
        "bool": {
            "must": [
                {
                    "term": {
                        "offentliggoerelse.dokumenter.dokumentMimeType": "application"
                    }
                },
                {
                    "term": {
                        "offentliggoerelse.dokumenter.dokumentMimeType": "xml"
                    }
                },
                {
                    "range": {
                        "offentliggoerelse.offentliggoerelsesTidspunkt": {
                            "from": "2014-10-01"
                        }
                    }
                }
            ],
            "must_not": [],
            "should": []
        }
    },
    "size": 1000
}

`403 Forbidden` is returned if you are not allowed to access this resource. Are the credentials set in the JSON document? — rmuller, Dec 15 '16 at 12:56
Have you tried testing your json request? I would suggest use a tool like postman and ensure that the API call returns as expected using the json request. — Soumik Mukherjee, Dec 15 '16 at 13:01
Possible duplicate of [http get request with body](http://stackoverflow.com/questions/27180431/http-get-request-with-body) — glee8e, Dec 15 '16 at 13:03
Notice the answer of Nick. It probably describe why you got a 403. — glee8e, Dec 15 '16 at 13:05
Login credentials should not be necessary. I have not tested my json request either, but I got postman installed (although I'm using the example query from the API documentation). Just gonna figure out how to use it again :) — Marcus, Dec 15 '16 at 13:09
@glee8e You're referring to that the url class automatically assumes it's a POST because I'm using outstream? That sucks :( I'll look into that too! — Marcus, Dec 15 '16 at 13:13
Yeah, I think it may be the cause. May you post the java version you are using? — glee8e, Dec 15 '16 at 13:15
@nafas I have added the full java code for my json query in the post. — Marcus, Dec 15 '16 at 13:22
@SoumikMukherjee When I run my code without the upload part, I receive the contents of the site. Same result with a normal GET in Postman. When I try to POST in Postman I get error 403, same as in my code when I try to upload. So, I assume that the java class I'm using can't handle a GET request that sends a file. Also, how can I do/simulate the same thing in Postman? :P — Marcus, Dec 15 '16 at 13:36
@Marcus check out my answer mate. that hopefully should explain it — nafas, Dec 15 '16 at 13:42

nafas · Accepted Answer · 2016-12-15T13:50:59.260

0

It seems there is ElasticSearch at the backend and not much has been changed.

Sending a query to http://distribution.virk.dk/offentliggoerelser is forbidden, as it is in ElasticSearch: you can't query the index directly.

But it should work if you send the query with POST (GET seems to work too) to http://distribution.virk.dk/offentliggoerelser/_search (note the /_search), so change

String urlStr = "http://distribution.virk.dk/offentliggoerelser";

to

String urlStr = "http://distribution.virk.dk/offentliggoerelser/_search";

Optionally change

urlCon.setRequestMethod("GET");

to

urlCon.setRequestMethod("POST");

NOTE:

In case you are wondering why your curl command works: it doesn't. Because you use -XGET instead of -XPOST, the query file you are sending is simply ignored, so it spits out information that doesn't correspond to your query, which is clearly wrong.
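Put together, the two changes described in this answer might look like the sketch below. It is only an illustration under some assumptions: Java 9 or later (for `InputStream.readAllBytes`), a made-up class name `SearchClient` and helper `postJson`, and a shortened query in `main` that reuses just the range clause from the question with an arbitrary `size` of 1. The endpoint is passed as a command-line argument rather than hard-coded.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class SearchClient {

    // Hypothetical helper: sends the JSON query as a POST body and
    // returns the response body as a string.
    static String postJson(String urlStr, String json) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(urlStr).openConnection();
        con.setRequestMethod("POST");
        con.setRequestProperty("Content-Type", "application/json; charset=UTF-8");
        con.setDoOutput(true); // we are going to write a request body

        try (OutputStream os = con.getOutputStream()) {
            os.write(json.getBytes(StandardCharsets.UTF_8));
        }

        int status = con.getResponseCode();
        // For 4xx/5xx responses the body is on the error stream, not the input stream.
        InputStream in = status < 400 ? con.getInputStream() : con.getErrorStream();
        try {
            if (in == null) {
                throw new IOException("HTTP " + status + " with empty body");
            }
            String body = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            if (status >= 400) {
                throw new IOException("HTTP " + status + ": " + body);
            }
            return body;
        } finally {
            if (in != null) {
                in.close();
            }
            con.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Expects the search endpoint as the first argument, e.g.
        // http://distribution.virk.dk/offentliggoerelser/_search
        if (args.length < 1) {
            System.err.println("usage: java SearchClient <search-url>");
            return;
        }
        // Shortened, illustrative query: only the range clause from the question.
        String query = "{\"query\": {\"range\": {"
                + "\"offentliggoerelse.offentliggoerelsesTidspunkt\": {\"from\": \"2016-12-01\"}"
                + "}}, \"size\": 1}";
        System.out.println(postJson(args[0], query));
    }
}
```

Reading the error stream on failure is a deliberate choice here: with the original code a 403 only surfaces as an exception from `getInputStream()`, whereas this version includes the server's error body in the exception message.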

edited Dec 15 '16 at 13:50

answered Dec 15 '16 at 13:31

nafas


Woah! Something happened. I got a ton of data back. This actually solved my stated problem, thanks! Now I need to check what the data is all about, because I doubt I'm supposed to get that much data considering my query, but I might be assuming wrong. Anyways, thanks :) – Marcus Dec 15 '16 at 13:44
@Marcus well, one way to check is to look at the number of results. Near the top of the returned JSON object you should see `"total":13542` with the query and `"total":1160917` without it. That means your query works (whether it's the right query is something for you to verify) :) – nafas Dec 15 '16 at 13:46
if you never worked with `ElasticSearch`, you should read the documentation, you can do wonders :) – nafas Dec 15 '16 at 13:49
I read a little bit, but I thought I would try doing some curl requests without the _search first... I guess that was a mistake :) The query I used was just for testing, so it could very well be correct. Also, the numbers you gave make perfect sense, so I assume it actually works as intended ;) So, again, thanks a lot! – Marcus Dec 15 '16 at 13:54
@Marcus `curl` is your friend, just know that you can't send data with `XGET` , you just need to use `XPOST` in order to send data (in this case your query), otherwise `curl` will ignore it. – nafas Dec 15 '16 at 13:58

score 0 · Answer 2 · answered Dec 15 '16 at 23:04


(Posted solution on behalf of the OP).

I added "/_search" to the site and changed the request method to POST. Explanation in nafas' answer.

answered Dec 15 '16 at 23:04

halfer


score 0 · Answer 3 · answered Dec 15 '16 at 23:13


Setting doOutput to true turns the request into a POST. Remove that call, along with whatever output you're sending. For GET requests, the request parameters should be in the URL.
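The effect can be observed locally with a small experiment. The sketch below is an assumption-laden illustration, not part of the original answer: it uses the JDK's built-in `com.sun.net.httpserver.HttpServer` as a throwaway echo server, assumes Java 9 or later, and the class and method names (`GetBecomesPost`, `startEchoServer`, `methodSeenByServer`) are invented for the example. It requests "GET" twice, once without and once with a written body, and prints the method the server actually received. On the standard OpenJDK implementation the second request should arrive as a POST.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class GetBecomesPost {

    // Starts a throwaway local server that replies with the HTTP method it received.
    static HttpServer startEchoServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            exchange.getRequestBody().readAllBytes(); // drain any request body
            byte[] reply = exchange.getRequestMethod().getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, reply.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(reply);
            }
            exchange.close();
        });
        server.start();
        return server;
    }

    // Asks for "GET", optionally writes a request body, and returns
    // the method name the server reports having seen.
    static String methodSeenByServer(int port, boolean writeBody) throws IOException {
        URL url = new URL("http://127.0.0.1:" + port + "/");
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("GET");
        if (writeBody) {
            con.setDoOutput(true);
            try (OutputStream os = con.getOutputStream()) {
                os.write("{}".getBytes(StandardCharsets.UTF_8));
            }
        }
        try (InputStream in = con.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startEchoServer();
        try {
            int port = server.getAddress().getPort();
            System.out.println("without body: " + methodSeenByServer(port, false));
            System.out.println("with body:    " + methodSeenByServer(port, true));
        } finally {
            server.stop(0);
        }
    }
}
```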

answered Dec 15 '16 at 23:13

user207421

