
I have a REST web service that currently exposes this URL:

http://server/data/media

where users can POST the following JSON:

{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873
}

in order to create a new Media metadata entry.

Now I need the ability to upload a file at the same time as the media metadata. What's the best way of going about this? I could introduce a new property called file and base64 encode the file, but I was wondering if there was a better way.

There's also multipart/form-data, like what an HTML form would send, but I'm using a REST web service and I want to stick to using JSON if at all possible.

Daniel T.
  • Sticking to using only JSON is not really required to have a RESTful web service. REST is basically just anything that follows the main principles of the HTTP methods and some other (arguably non-standardised) rules. – Erik Kaplun Oct 25 '12 at 20:25

7 Answers


I agree with Greg that a two-phase approach is a reasonable solution; however, I would do it the other way around. I would do:

POST http://server/data/media
body:
{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873
}

To create the metadata entry and return a response like:

201 Created
Location: http://server/data/media/21323
{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873,
    "ContentUrl": "http://server/data/media/21323/content"
}

The client can then use this ContentUrl and do a PUT with the file data.

The nice thing about this approach is that when your server starts to get weighed down with immense volumes of data, the URL that you return can just point to some other server with more space/capacity. Or you could implement some kind of round-robin approach if bandwidth is an issue.
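
The POST-then-PUT flow described above can be sketched end to end with Python's standard library. This is only an illustration under stated assumptions: the in-memory `MEDIA` and `CONTENT` stores, the sequential id scheme, and the `upload` and `demo` helpers are hypothetical and not part of the original service.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

MEDIA = {}    # media id -> metadata (hypothetical in-memory store)
CONTENT = {}  # media id -> raw file bytes
# Opener that ignores proxy settings so the local demo server is reachable.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class MediaHandler(BaseHTTPRequestHandler):
    def _read_body(self):
        return self.rfile.read(int(self.headers.get("Content-Length", 0)))

    def do_POST(self):
        # Phase 1: create the metadata entry and advertise where the file goes.
        if self.path != "/data/media":
            self.send_error(404)
            return
        metadata = json.loads(self._read_body())
        media_id = str(len(MEDIA) + 1)
        location = f"http://{self.headers['Host']}/data/media/{media_id}"
        metadata["ContentUrl"] = location + "/content"
        MEDIA[media_id] = metadata
        payload = json.dumps(metadata).encode()
        self.send_response(201)
        self.send_header("Location", location)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_PUT(self):
        # Phase 2: store the raw bytes sent to /data/media/<id>/content.
        parts = self.path.strip("/").split("/")
        known = len(parts) == 4 and parts[:2] == ["data", "media"] and parts[2] in MEDIA
        if not known or parts[3] != "content":
            self.send_error(404)
            return
        CONTENT[parts[2]] = self._read_body()
        self.send_response(204)
        self.end_headers()

    def log_message(self, *args):  # keep the demo quiet
        pass


def upload(base_url, metadata, file_bytes):
    """POST the metadata, then PUT the file to the returned ContentUrl."""
    post = urllib.request.Request(
        base_url + "/data/media",
        data=json.dumps(metadata).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with OPENER.open(post) as resp:
        created = json.loads(resp.read())
    put = urllib.request.Request(
        created["ContentUrl"],
        data=file_bytes,
        headers={"Content-Type": "application/octet-stream"},
        method="PUT",
    )
    with OPENER.open(put) as resp:
        return created, resp.status


def demo():
    """Start a throwaway local server and run one two-phase upload against it."""
    server = HTTPServer(("127.0.0.1", 0), MediaHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base_url = f"http://127.0.0.1:{server.server_address[1]}"
        metadata = {"Name": "Test", "Latitude": 12.59817, "Longitude": 52.12873}
        return upload(base_url, metadata, b"file bytes")
    finally:
        server.shutdown()
        server.server_close()
```

Because the client simply follows whatever `ContentUrl` it receives, the server side is free to hand out a URL on a different host later without any client change.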

Darrel Miller
  • One advantage to sending the content first is that by the time the metadata exists, the content is already present. Ultimately the right answer depends on the organisation of the data in the system. – Greg Hewgill Oct 15 '10 at 03:09
  • Thanks, I marked this as the correct answer because this is what I wanted to do. Unfortunately, due to a weird business rule, we have to allow the upload to occur in any order (metadata first or file first). I was wondering if there was a way to combine the two in order to save the headache of dealing with both situations. – Daniel T. Oct 15 '10 at 19:56
  • @Daniel If you POST the data file first, then you can take the URL returned in Location and add it to the ContentUrl attribute in the metadata. That way, when the server receives the metadata, if a ContentUrl exists then it already knows where the file is. If there is no ContentUrl, then it knows that it should create one. – Darrel Miller Oct 15 '10 at 21:10
  • If you were to POST the file first, would you post to the same URL (/server/data/media), or would you create another entry point for file-first uploads? – Matt Brailsford Dec 17 '10 at 11:39
  • @Matt No. I would return a link header with rel="metadata" and it would tell me where to put the metadata. – Darrel Miller Dec 17 '10 at 12:19
  • Say we wanted to upload multiple files, would it work by posting multiple meta data objects, like so -> `{ , , }`. We would then simply PUT the file data to `http://server/upload/file/` one at a time? Is this how facebook and other major players do it? – James111 Jul 24 '16 at 09:26
  • Out of curiosity, @DarrellMiller, why do you recommend PUT over POST for sending file data? We are using ASP.NET Core 2.0 and many examples use POST for sending the file data. Is this just so that we are more closely following the W3C standards? Or, is there a technical reason for it? – Christian Findlay Jan 23 '18 at 23:44
  • This post seems like a very elegant solution. But what about the case where you need to pass across some authentication credentials or an auth token? In this case, you would either have to pass them in the query string or as headers, which really takes us back to the original problem. The original problem being that we need to upload some details with the file data. So, I can't really see how breaking the call up into two separate parts (POST, then PUT) solves the problem. – Christian Findlay Jan 24 '18 at 00:02
  • IMO this is not a perfect solution in the sense that it breaks data integrity. Metadata and payload should be in one transaction. The question didn't say anything about this, but I suppose this could be important for many other similar use cases. Making it two separate requests simply breaks this. Erik Allik's answer is a (much) better way. – Faraway Feb 26 '18 at 07:01
  • @Faraway What if the metadata included the number of "likes" of an image? Would you treat it as a single resource then? Or more obviously, are you suggesting that if I wanted to edit the description of an image, I would need to re-upload the image? There are many cases where multi-part forms are the right solution. It is just not always the case. – Darrel Miller Feb 26 '18 at 16:42
  • @DarrelMiller hmm... ok, makes sense. I guess I had a narrow view. Though in my case (and I believe in some other people's cases) a multipart form is still a better way, this could be a good solution for the cases you mentioned. Thanks for the reply. Really appreciated. Downvote removed. (update: can't remove downvote... sorry) – Faraway Feb 27 '18 at 04:52

Just because you're not wrapping the entire request body in JSON doesn't mean it's not RESTful to use multipart/form-data to post both the JSON and the file(s) in a single request:

curl -F "metadata=<metadata.json" -F "file=@my-file.tar.gz" http://example.com/add-file

on the server side:

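import json
from twisted.web.resource import Resource

class AddFileResource(Resource):
    def render_POST(self, request):
        # Twisted parses multipart/form-data fields into request.args
        # (on Python 3 the keys are bytes, e.g. b'metadata')
        metadata = json.loads(request.args['metadata'][0])
        file_body = request.args['file'][0]
        ...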

to upload multiple files, it's possible to either use separate "form fields" for each:

curl -F "metadata=<metadata.json" -F "file1=@some-file.tar.gz" -F "file2=@some-other-file.tar.gz" http://example.com/add-file

...in which case the server code will have request.args['file1'][0] and request.args['file2'][0]

or reuse the same one for many:

curl -F "metadata=<metadata.json" -F "files=@some-file.tar.gz" -F "files=@some-other-file.tar.gz" http://example.com/add-file

...in which case request.args['files'] will simply be a list of length 2.

or pass multiple files through a single field:

curl -F "metadata=<metadata.json" -F "files=@some-file.tar.gz,some-other-file.tar.gz" http://example.com/add-file

...in which case request.args['files'] will be a single string containing all the files, which you'll have to parse yourself. I'm not sure how to do that, so it's probably better to just use one of the previous approaches.

The difference between @ and < is that @ causes the file to get attached as a file upload, whereas < attaches the contents of the file as a text field.

P.S. Just because I'm using curl as a way to generate the POST requests doesn't mean the exact same HTTP requests couldn't be sent from a programming language such as Python or using any sufficiently capable tool.
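
As a rough illustration of that P.S., the same kind of request body that `curl -F` produces can be assembled by hand in Python. This is a minimal sketch, not a full implementation: `build_multipart` is a hypothetical helper, the field names `metadata` and `file` mirror the curl examples, and the filename is assumed to contain no quotes or non-ASCII characters.

```python
import json
import uuid


def build_multipart(metadata, filename, file_bytes):
    """Return (content_type, body) for a multipart/form-data request that
    carries a JSON "metadata" field and one "file" upload."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode()
    lines = [
        delimiter,
        b'Content-Disposition: form-data; name="metadata"',
        b"Content-Type: application/json",
        b"",
        json.dumps(metadata).encode("utf-8"),
        delimiter,
        b'Content-Disposition: form-data; name="file"; filename="'
        + filename.encode("ascii") + b'"',
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        delimiter + b"--",  # closing delimiter
        b"",
    ]
    # Parts are separated by CRLF, as multipart/form-data requires.
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(lines)
```

The returned pair can be passed to any HTTP client, for example `urllib.request.Request(url, data=body, headers={"Content-Type": content_type})`.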

Erik Kaplun
  • I had been wondering about this approach myself, and why I hadn't seen anyone else put it forth yet. I agree, seems perfectly RESTful to me. – soupdog Oct 23 '13 at 02:37
  • YES! This is a very practical approach, and it isn't any less RESTful than using "application/json" as a content type for the whole request. – sickill Mar 08 '14 at 12:04
  • ..but that's only possible if you have the data in a .json file and upload it, which is not the case – itsjavi Apr 02 '15 at 13:38
  • @mjolnic your comment is irrelevant: the cURL examples are just, well, _examples_; the answer explicitly states that you can use anything to send off the request... also, what prevents you from just writing `curl -F 'metadata={"foo": "bar"}'`? – Erik Kaplun Apr 02 '15 at 15:18
  • I'm using this approach because the accepted answer wouldn't work for the application I'm developing (the file cannot exist before the data and it adds unnecessary complexity to handle the case where the data is uploaded first and the file never uploads). – BitsEvolved Mar 17 '16 at 22:55

One way to approach the problem is to make the upload a two-phase process. First, you would upload the file itself using a POST, where the server returns some identifier back to the client (an identifier might be the SHA1 of the file contents). Then, a second request associates the metadata with the file data:

{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873,
    "ContentID": "7a788f56fa49ae0ba5ebde780efe4d6a89b5db47"
}
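
A minimal sketch of this file-first flow, assuming the identifier is the SHA-1 of the file contents as suggested above. The in-memory `FILES` and `RECORDS` stores and both function names are hypothetical stand-ins for whatever the real server uses.

```python
import hashlib

FILES = {}    # ContentID -> file bytes (hypothetical in-memory store)
RECORDS = []  # accepted metadata records


def store_file(file_bytes):
    """Phase 1: store the upload and return its SHA-1 hex digest as the ID."""
    content_id = hashlib.sha1(file_bytes).hexdigest()
    FILES[content_id] = file_bytes
    return content_id


def store_metadata(metadata):
    """Phase 2: accept metadata only if it references an already stored file."""
    if metadata.get("ContentID") not in FILES:
        raise KeyError("unknown ContentID")
    RECORDS.append(metadata)
    return metadata
```

One side effect of a content-derived ID is that uploading identical bytes twice yields the same identifier, so the server can deduplicate without extra bookkeeping.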

Including the file data base64-encoded in the JSON request itself will increase the size of the data transferred by about 33%, because every 3 bytes of input become 4 output characters. This may or may not be important depending on the overall size of the file.
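
The overhead is easy to check directly. The snippet below is a small sketch that uses random bytes as a stand-in for a real file; the 3,000,000-byte size is an arbitrary choice that happens to be a multiple of 3, so no padding is added.

```python
import base64
import os

raw = os.urandom(3_000_000)      # stand-in for a 3 MB file
encoded = base64.b64encode(raw)  # what would be embedded in the JSON body

# Base64 maps every 3 input bytes to 4 output characters.
overhead = len(encoded) / len(raw) - 1
print(f"raw={len(raw)} bytes, base64={len(encoded)} bytes, overhead={overhead:.1%}")
```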

Another approach might be to use a POST of the raw file data, but include any metadata in the HTTP request header. However, this falls a bit outside basic REST operations and may be more awkward for some HTTP client libraries.

Greg Hewgill
  • You can use Ascii85 instead, which increases the size by just 1/4. – Singagirl Sep 13 '16 at 19:47
  • Any reference on why base64 increases the size that much? – jam01 Jan 31 '19 at 16:30
  • @jam01: Coincidentally, I just saw something yesterday which answers the space question well: [What is the space overhead of Base64 encoding?](https://lemire.me/blog/2019/01/30/what-is-the-space-overhead-of-base64-encoding/) – Greg Hewgill Jan 31 '19 at 17:30

I realize this is a very old question, but hopefully this will help someone else out as I came upon this post looking for the same thing. I had a similar issue, just that my metadata was a Guid and int. The solution is the same though. You can just make the needed metadata part of the URL.

POST accepting method in your "Controller" class:

public Task<HttpResponseMessage> PostFile(string name, float latitude, float longitude)
{
    //See http://stackoverflow.com/a/10327789/431906 for how to accept a file
    return null;
}

Then, wherever you register routes (WebApiConfig.Register(HttpConfiguration config) in my case):

config.Routes.MapHttpRoute(
    name: "FooController",
    routeTemplate: "api/{controller}/{name}/{latitude}/{longitude}",
    defaults: new { }
);
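
For readers outside ASP.NET, the same idea of carrying the metadata in the URL path can be sketched in Python. The `/api/media/...` prefix and both helper names are assumptions made for illustration; the name is percent-encoded so that spaces or slashes in it do not break the route.

```python
from urllib.parse import quote, unquote


def build_upload_path(name, latitude, longitude):
    """Client side: encode the metadata into the path the file is POSTed to."""
    return f"/api/media/{quote(name, safe='')}/{latitude}/{longitude}"


def parse_upload_path(path):
    """Server side: recover the metadata from /api/media/{name}/{lat}/{lon}."""
    parts = path.strip("/").split("/")
    if len(parts) != 5 or parts[:2] != ["api", "media"]:
        raise ValueError(f"unexpected upload path: {path!r}")
    return {
        "Name": unquote(parts[2]),
        "Latitude": float(parts[3]),
        "Longitude": float(parts[4]),
    }
```

The request body then contains nothing but the raw file bytes, which keeps the upload handler simple at the cost of a URL that grows with every metadata field.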
Greg Biles

I don't understand why, over the course of eight years, no one has posted the easy answer. Rather than encode the file as base64, encode the json as a string. Then just decode the json on the server side.

In Javascript:

let formData = new FormData();
formData.append("file", myfile);
formData.append("myjson", JSON.stringify(myJsonObject));

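POST it using Content-Type: multipart/form-data. (If you send the FormData object with fetch or XMLHttpRequest, leave the header unset; the browser adds it together with the required boundary.)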

On the server side, retrieve the file normally, and retrieve the json as a string. Convert the string to an object, which is usually one line of code no matter what programming language you use.
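
As one possible illustration of the server side, the sketch below splits a multipart/form-data body with Python's standard `email` parser and decodes the JSON field. `parse_multipart` is a hypothetical helper; most web frameworks already provide an equivalent, and a framework parser is likely the better choice for large or binary uploads.

```python
import json
from email.parser import BytesParser
from email.policy import HTTP


def parse_multipart(content_type, body):
    """Split a multipart/form-data body into {field name: (filename, bytes)}.

    `content_type` is the request's Content-Type header value (it carries
    the boundary); `body` is the raw request body.
    """
    # The email parser needs the Content-Type header to locate the boundary.
    prefix = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    message = BytesParser(policy=HTTP).parsebytes(prefix + body)
    fields = {}
    for part in message.iter_parts():
        name = part.get_param("name", header="content-disposition")
        # get_filename() is None for plain text fields such as the JSON string.
        fields[name] = (part.get_filename(), part.get_payload(decode=True))
    return fields
```

With the field names from the JavaScript above, the metadata would then be recovered with `json.loads(fields["myjson"][1])` and the upload's bytes read from `fields["file"][1]`.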

(Yes, it works great. Doing it in one of my apps.)

ccleve
  • I'm way more surprised that no one expanded on Mike's answer, because that's exactly how **multipart** stuff should be used: each part has its own MIME type, and DRF's multipart parser should dispatch accordingly. Perhaps it's hard to create this type of envelope on the client side. I really should investigate... – Melvyn Nov 26 '20 at 20:21

If your file and its metadata together create one resource, it's perfectly fine to upload them both in one request. A sample request would be:

POST https://target.com/myresources/resourcename HTTP/1.1
Accept: application/json
Content-Type: multipart/form-data; boundary=-----------------------------28947758029299
Host: target.com

-------------------------------28947758029299
Content-Disposition: form-data; name="application/json"

{"markers": [
        {
            "point":new GLatLng(40.266044,-74.718479), 
            "homeTeam":"Lawrence Library",
            "awayTeam":"LUGip",
            "markerImage":"images/red.png",
            "information": "Linux users group meets second Wednesday of each month.",
            "fixture":"Wednesday 7pm",
            "capacity":"",
            "previousScore":""
        },
        {
            "point":new GLatLng(40.211600,-74.695702),
            "homeTeam":"Hamilton Library",
            "awayTeam":"LUGip HW SIG",
            "markerImage":"images/white.png",
            "information": "Linux users can meet the first Tuesday of the month to work out harward and configuration issues.",
            "fixture":"Tuesday 7pm",
            "capacity":"",
            "tv":""
        },
        {
            "point":new GLatLng(40.294535,-74.682012),
            "homeTeam":"Applebees",
            "awayTeam":"After LUPip Mtg Spot",
            "markerImage":"images/newcastle.png",
            "information": "Some of us go there after the main LUGip meeting, drink brews, and talk.",
            "fixture":"Wednesday whenever",
            "capacity":"2 to 4 pints",
            "tv":""
        }
] }

-------------------------------28947758029299
Content-Disposition: form-data; name="name"; filename="myfilename.pdf"
Content-Type: application/octet-stream

%PDF-1.4
%
2 0 obj
<</Length 57/Filter/FlateDecode>>stream
x+r
26S00SI2P0Qn
F
!i\
)%!Y0i@.k
[
endstream
endobj
4 0 obj
<</Type/Page/MediaBox[0 0 595 842]/Resources<</Font<</F1 1 0 R>>>>/Contents 2 0 R/Parent 3 0 R>>
endobj
1 0 obj
<</Type/Font/Subtype/Type1/BaseFont/Helvetica/Encoding/WinAnsiEncoding>>
endobj
3 0 obj
<</Type/Pages/Count 1/Kids[4 0 R]>>
endobj
5 0 obj
<</Type/Catalog/Pages 3 0 R>>
endobj
6 0 obj
<</Producer(iTextSharp 5.5.11 2000-2017 iText Group NV \(AGPL-version\))/CreationDate(D:20170630120636+02'00')/ModDate(D:20170630120636+02'00')>>
endobj
xref
0 7
0000000000 65535 f 
0000000250 00000 n 
0000000015 00000 n 
0000000338 00000 n 
0000000138 00000 n 
0000000389 00000 n 
0000000434 00000 n 
trailer
<</Size 7/Root 5 0 R/Info 6 0 R/ID [<c7c34272c2e618698de73f4e1a65a1b5><c7c34272c2e618698de73f4e1a65a1b5>]>>
%iText-5.5.11
startxref
597
%%EOF

-------------------------------28947758029299--
Tenaciousd93
Mike Ezzati

To build on ccleve's answer, if you are using superagent / express / multer, on the front end side build your multipart request doing something like this:

superagent
    .post(url)
    .accept('application/json')
    .field('myVeryRelevantJsonData', JSON.stringify({ peep: 'Peep Peep!!!' }))
    .attach('myFile', file);

cf https://visionmedia.github.io/superagent/#multipart-requests.

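On the Express side, whatever was passed as a field will end up in req.body as a string (multer, not express.json, parses multipart bodies, so call JSON.parse on the field yourself). Plain JSON request bodies are still handled by: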

app.use(express.json({ limit: '3MB' }));

Your route would include something like this:

const multerMemStorage = multer.memoryStorage();
const multerUploadToMem = multer({
  storage: multerMemStorage,
  // Also specify fileFilter, limits...
});

router.post('/myUploads',
  multerUploadToMem.single('myFile'),
  async (req, res, next) => {
    // Find back myVeryRelevantJsonData :
    logger.verbose(`Uploaded req.body=${JSON.stringify(req.body)}`);

    // If your file is text:
    const newFileText = req.file.buffer.toString();
    logger.verbose(`Uploaded text=${newFileText}`);
    return next();
  },
  ...

One thing to keep in mind though is this note from the multer doc, concerning disk storage:

Note that req.body might not have been fully populated yet. It depends on the order that the client transmits fields and files to the server.

I guess this means it would be unreliable to, say, compute the target dir/filename based on JSON metadata passed along with the file.

Will59