24

Imagine a web application storing a data resource, identified by an id, which holds three attachments (e.g. PDFs) per datum.

The URL scheme is

data/{id}/attachment1
data/{id}/attachment2
data/{id}/attachment3

A RESTful API exists for the attachments, providing GET/PUT/DELETE operations that implement CRUD on the server side.

Letting the id be 123, I would like to perform an operation where

  • attachment1 is replaced by a new attachment (such that GET data/123/attachment1 returns the new attachment)
  • attachment2 is deleted (such that GET data/123/attachment2 returns 404)
  • attachment3 remains unchanged.

The update should be atomic - the complete update is performed by the server or nothing at all.

Applying a simple PUT data/123/attachment1 followed by DELETE data/123/attachment2 is not atomic, since the client could crash after the PUT and the server would have no hint that it should roll back in this case.
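To make the failure mode concrete, here is a minimal sketch. It assumes a hypothetical in-memory dictionary `store` standing in for the server and a simulated client crash (`ClientCrash`); none of these names come from a real API.

```python
# Hypothetical in-memory stand-in for the server's attachment storage.
store = {
    "data/123/attachment1": b"old-1",
    "data/123/attachment2": b"old-2",
    "data/123/attachment3": b"old-3",
}


class ClientCrash(Exception):
    """Simulates the client dying between two requests."""


def put(url, body):
    store[url] = body


def delete(url):
    store.pop(url, None)


def naive_update(crash_after_put=False):
    # Two independent requests: the server cannot tell they belong together.
    put("data/123/attachment1", b"new-1")
    if crash_after_put:
        raise ClientCrash("client died before sending DELETE")
    delete("data/123/attachment2")


try:
    naive_update(crash_after_put=True)
except ClientCrash:
    pass

# Only half of the update was applied.
print(store["data/123/attachment1"])    # b'new-1'
print("data/123/attachment2" in store)  # True
```

The server ends up with the new attachment1 and the old attachment2, which is exactly the inconsistent state the question wants to rule out.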

So how do I implement the operation in a RESTful way?

I've thought of two solutions, but neither seems to be 100% RESTful:

  • Use PATCH (could be PUT, but PATCH better reflects the semantics of a partial update) with multipart/form-data on data/123: the multipart/form-data body is a sequence of entities consisting of a new "application/pdf" associated with the field "attachment1" and something which would represent a null value to denote deletion of attachment2.

While this ensures atomicity, I doubt this is RESTful, since I overload the PATCH method with different parameter lists, which violates the uniform-interface constraint.

  • Use a resource representing a transaction. I could POST the data id 123 to a transaction URL, which would create a transaction resource representing a copy of the current state of the data resource stored on the server, e.g. transaction/data/123. Now I can call PUT and DELETE on the attachments of this temporary resource (e.g. DELETE transaction/data/123/attachment2) and communicate the commit of this version of the resource to the server via a PUT on transaction/data/123. This ensures atomicity, but I have to implement additional server-side logic to deal with multiple clients changing the same resource and with crashed clients that never commit.

While this seems to be consistent with REST, it seems to violate the constraint of statelessness. The state of the transactional resource is not service state but application state, since every transactional resource is associated with a single client.
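The server side of the first option (one PATCH carrying all changes) could be sketched roughly as follows. This is only an illustration under assumptions: the `resources` dictionary, the `patch_data` function and the convention that `None` marks a deletion are hypothetical, and parsing of the multipart body is left out.

```python
import copy

# Hypothetical server-side state: the attachments of data resource 123.
resources = {
    123: {"attachment1": b"old-1", "attachment2": b"old-2", "attachment3": b"old-3"}
}

ALLOWED = {"attachment1", "attachment2", "attachment3"}


def patch_data(data_id, changes):
    """Apply all changes or none. A value of None means 'delete this attachment'."""
    unknown = set(changes) - ALLOWED
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # Work on a copy so that a failure part-way through leaves the original untouched.
    draft = copy.deepcopy(resources[data_id])
    for name, body in changes.items():
        if body is None:
            draft.pop(name, None)
        elif isinstance(body, bytes):
            draft[name] = body
        else:
            raise TypeError(f"{name}: expected bytes or None")
    # Single assignment: the new state becomes visible in one step.
    resources[data_id] = draft


# Replace attachment1 and delete attachment2 in one request; attachment3 is untouched.
patch_data(123, {"attachment1": b"new-1", "attachment2": None})
print(sorted(resources[123]))  # ['attachment1', 'attachment3']
```

Because every change is applied to a copy first, a request that fails validation half-way leaves the stored resource exactly as it was.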

I'm kind of stuck here, so any ideas would be helpful, thanks!

mtsz
  • 2
    The second approach has the benefit of providing a nice history of data changes and might let you skip some logging. – Jasper Apr 06 '12 at 07:07
  • @mtsz I'm struggling with this problem right now. I like the answer you selected below, but it seems like a lot of work to create a transaction resource with a short, temporary lifespan. Do you think it would be bad to give the atomic transaction to be performed a name like "switcheroo" and just create a specific web service that performs that transaction? e.g., POST /doSwitcheroo with a body of {fileId: 123} .... This service would have the logic to atomically perform the actions you described above on the file with id 123 – Niko Bellic Nov 11 '15 at 02:22

4 Answers

16

You want to use the second option, the transaction option.

What you're missing is the creation of the transaction:

POST /transaction

HTTP/1.1 201 Created
Location: /transaction/1234

Now you have a transaction resource that is a first class citizen. You can add to it, delete from it, query to see its current contents, and then finally commit it or delete (i.e. rollback) the transaction.

While the transaction is in progress, it's just another resource. There's no client state here. Anyone can add to this transaction.

When it's all done, the server applies the changes all at once using some internal transaction mechanism that's out of scope here.

You can capture things like ETags and conditional headers (for example If-Match or If-Unmodified-Since) in the transaction's sub-actions, so that when they are all applied you know that nothing changed behind your back.
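A rough in-memory sketch of such a transaction resource is shown below. Everything in it is an assumption made for illustration: the function names, the `UNCOMMITTED`/`COMMITTED` state values, the use of a SHA-256 hash as ETag, and the choice to answer creation with 201 and a conflict with 409. A real server would persist this state and expose it over HTTP.

```python
import hashlib
import itertools

# Hypothetical in-memory service state.
attachments = {
    "data/123/attachment1": b"old-1",
    "data/123/attachment2": b"old-2",
    "data/123/attachment3": b"old-3",
}
transactions = {}
_ids = itertools.count(1234)


def etag(url):
    """ETag of the current representation, or None if the resource is absent."""
    body = attachments.get(url)
    return hashlib.sha256(body).hexdigest() if body is not None else None


def create_transaction():
    """Models POST /transaction: returns a status and a Location header."""
    tx_id = next(_ids)
    transactions[tx_id] = {"state": "UNCOMMITTED", "actions": []}
    return 201, {"Location": f"/transaction/{tx_id}"}


def add_action(tx_id, method, url, body=None):
    """Record a PUT or DELETE together with the ETag seen at recording time."""
    if method not in ("PUT", "DELETE"):
        raise ValueError("only PUT and DELETE are supported")
    transactions[tx_id]["actions"].append((method, url, body, etag(url)))


def commit(tx_id):
    """Apply every recorded action, or none if any target changed meanwhile."""
    tx = transactions[tx_id]
    if tx["state"] != "UNCOMMITTED":
        return 409
    # Validate first: nothing is written until every ETag still matches.
    if any(etag(url) != seen for _, url, _, seen in tx["actions"]):
        return 409  # a resource changed behind our back
    for method, url, body, _ in tx["actions"]:
        if method == "PUT":
            attachments[url] = body
        else:
            attachments.pop(url, None)
    tx["state"] = "COMMITTED"
    return 200


status, headers = create_transaction()
tx_id = int(headers["Location"].rsplit("/", 1)[1])
add_action(tx_id, "PUT", "data/123/attachment1", b"new-1")
add_action(tx_id, "DELETE", "data/123/attachment2")
print(commit(tx_id))  # 200
```

A client that crashes before calling `commit` simply leaves an uncommitted transaction behind, which the server can expire later without ever having touched the attachments.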

Will Hartung
  • Sounds reasonable. Probably the notion of "application state" is a bit over-exaggerated in this case and this "first class citizen" is normal service state. [Richardson & Ruby](http://shop.oreilly.com/product/9780596529260.do) promote this kind of solution, although it is applied in a scenario with multiple servers. I wonder, what would Roy Fielding do? :) – mtsz Apr 09 '12 at 18:22
  • 8
    You can look at the transaction the same way that you could look at a shopping cart. In a shopping cart, a client builds up their transaction over time, and then they simply go through the checkout process that eventually ends with them "confirming" or "canceling" the order. When you look at it in those terms, with that vocabulary, it makes more sense. But at 10,000 feet, they're basically the same problem. "Here's a list of all the stuff I want to do -- now, GO!" The fact that a shopping cart is bound to a specific user is a security/identity issue, not a "state" vs "statelessness" issue. – Will Hartung Apr 09 '12 at 18:30
  • I've implemented this solution, which works well. You get the benefit of resolving "edit races" without losing data for free, which is not possible with the "list" solution. – mtsz Apr 19 '12 at 18:35
  • "When its all done, the server..." – Niko Bellic Nov 11 '15 at 02:07
  • 2
    @Niko Bellic You could simply POST the transaction link to a /commit resource. Alternatively, you could expose the transaction's state (COMMITTED, UNCOMMITTED) as an attribute of the resource and, when you're done, change it to COMMITTED and PUT the resource. That can work fine as well. Another way to look at a transaction resource is like a Shopping Cart. A Shopping Cart is simply a specialized type of transaction. – Will Hartung Nov 11 '15 at 04:08
5

Very interesting question. A C.S. professor at the University of Lugano (Switzerland) wrote some slides about this situation:

http://www.slideshare.net/cesare.pautasso/atomic-transactions-for-the-rest-of-us

However, I'm not really sure that the solution he provides is totally RESTful, because it doesn't seem really stateless on the server side.

To be honest, since the transaction itself is composed of multiple states, I don't think there can be a totally RESTful solution to this problem.

thermz
  • 1
    I skimmed their paper (Towards Distributed Atomic Transactions over RESTful Services). They propose a Try-Cancel/Confirm protocol with a coordinator resource which finally confirms the changes on all participating resources (or initiates recovery of the prior state). The resources are distributed servers, so the scope is slightly larger than mine. Nevertheless, this TCC protocol should be quite similar to the second solution mentioned above. At least they conceptually solve the application-state problem by turning it into service state, by attaching a means to recover the previous state. Maybe that's the way. – mtsz Jan 30 '12 at 03:27
0

Assuming your URIs are hierarchical:

PUT data/{id}
[attachment2,attachment3]

Part of your problem is that attachment1/2/3 is a terrible identifier. An index should never be part of your URIs.

noah
  • The identifiers are arbitrary to make the scenario more generic. Imagine a concrete scenario where you have data blobs with expected model, description, picture etc. attached. I don't see how your solution solves the problem of a consistent update of multiple attached resources, since the PUTs are sequential? – mtsz Apr 06 '12 at 12:56
  • Think of your list of attachments as a resource. You're PUTing the contents and order. 1 operation. – noah Apr 06 '12 at 13:37
  • So, if I understand correctly, you opt for using a media type modelling a sequence, like multipart/form-data or a custom type. Since my original question contained this solution, it would be interesting to read how this solution is justified. – mtsz Apr 09 '12 at 18:01
0

I am not experienced, but I have an idea for a solution as I am facing exactly this problem in development.

Firstly, I use the analogy of me (client) sending a message to Fred1 in a house (server with resources) that I want him to turn off the light switch (change state of part of a resource) and turn on the kettle (change state of another part of the resource). After turning off the light switch Fred, unfortunately, has a heart attack.

Now I have got nothing back from Fred to say whether or not he did what I asked. Fred is replaced by another Fred. The message I sent has received no answer. The only way I can proceed is to ask Fred2 if the light switch is off and the kettle is on (the resource is in the state I would expect after I asked him to do stuff for me). This is an unfortunate state of affairs (error) and adds to my workload, but I can now proceed on the basis that I know what Fred1 did before his heart attack. I can either go back to the drawing board (inform user that something went wrong and we need to re-do it) or make the changes that would complete my request if that is still relevant (turn on the kettle).

This is the beginning of how I would do it. There are obviously concerns regarding scope, but if I have already defined my scope (I'm only interested in the light switch and the kettle) then I should have enough information (knowing the state of the light switch and the kettle) to give a new command to Fred2 without going back to the user for instructions.

How does that sound?

tentimes
  • Interesting, but problematic: when a Fred has a heart attack after turning off the light, you have an inconsistent state. It will remain inconsistent until you check the state using your client and send another message to the next Fred (who will hopefully not die). What happens if your client dies just after Fred died? Then the intended state is lost forever. Other clients will be confronted with the current inconsistent state, which can lead to serious trouble... – mtsz Apr 19 '12 at 18:32
  • Interesting indeed :) What about databases - how do they do it? A database has to be atomic internally, so maybe that is another good place to look. I am running PostgreSQL so will check the documentation. Here is a link to PostgreSQL atomic: http://www.postgresql.org/files/developer/transactions.pdf – tentimes Apr 20 '12 at 11:09
  • You can adopt the notion of a database session (e.g. an SQL session): the modifications you apply inside your session do not take effect until you commit the session. When something goes wrong with one of the modifications, the session is rolled back and the previous consistent state is preserved. So when Fred dies after turning off the light switch, he throws a HeartFailureException which gets caught, triggering a session rollback, and the light goes on again ;) – mtsz Apr 23 '12 at 15:37
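The session idea from the last comment can be tried out with Python's built-in sqlite3 module. This is only a sketch: the `house` table, the `run_errand` function and the in-memory database are assumptions made for the example, and `HeartFailureException` is just the name borrowed from the comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE house (device TEXT PRIMARY KEY, state TEXT)")
conn.executemany(
    "INSERT INTO house VALUES (?, ?)", [("light", "on"), ("kettle", "off")]
)
conn.commit()


class HeartFailureException(Exception):
    """Stands in for Fred failing half-way through the job."""


def run_errand(fail_midway):
    try:
        # 'with conn' commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute("UPDATE house SET state = 'off' WHERE device = 'light'")
            if fail_midway:
                raise HeartFailureException("Fred collapsed after the light switch")
            conn.execute("UPDATE house SET state = 'on' WHERE device = 'kettle'")
    except HeartFailureException:
        pass  # the rollback has already restored the previous state


def snapshot():
    """Return the current device states as a dictionary."""
    return dict(conn.execute("SELECT device, state FROM house"))


run_errand(fail_midway=True)
print(snapshot())  # the light is on again and the kettle is still off
```

Calling `run_errand(fail_midway=False)` afterwards applies both updates together, so other readers of the table only ever see the state before or after the whole errand.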