My application has a resource at /foo. Normally, it is represented by an HTTP response payload like this:

{"a": "some text", "b": "some text", "c": "some text", "d": "some text"}

The client doesn't always need all four members of this object. What is the RESTfully semantic way for the client to tell the server what it needs in the representation? e.g. if it wants:

{"a": "some text", "b": "some text", "d": "some text"}

How should it GET it? Some possibilities (I'm looking for correction if I misunderstand REST):

  • GET /foo?sections=a,b,d.
    • The query string (called a query string after all) seems to mean "find resources matching this condition and tell me about them", not "represent this resource to me according to this customization".
  • GET /foo/a+b+d. My favorite if REST semantics don't cover this issue, because of its simplicity.
    • Breaks URI opacity, violating HATEOAS.
    • Seems to break the distinction between resource (the sole meaning of a URI is to identify one resource) and representation. But that's debatable because it's consistent with /widgets representing a presentable list of /widget/<id> resources, which I've never had a problem with.
  • Loosen my constraints, respond to GET /foo/a, etc, and have the client make a request per component of /foo it wants.
    • Multiplies overhead, which can become a nightmare if /foo has hundreds of components and the client needs 100 of those.
    • If I want to support an HTML representation of /foo, I have to use Ajax, which is problematic if I just want a single HTML page that can be crawled, rendered by minimalist browsers, etc.
    • To maintain HATEOAS, it also requires links to those "sub-resources" to exist within other representations, probably in /foo: {"a": {"url": "/foo/a", "content": "some text"}, ...}
  • GET /foo, Content-Type: application/json and {"sections": ["a","b","d"]} in the request body.
    • Unbookmarkable and uncacheable.
    • HTTP does not define body semantics for GET. It's legal HTTP but how can I guarantee some user's proxy doesn't strip the body from a GET request?
    • My REST client won't let me put a body on a GET request so I can't use that for testing.
  • A custom HTTP header: Sections-Needed: a,b,d
    • I'd rather avoid custom headers if possible.
    • Unbookmarkable and uncacheable.
  • POST /foo/requests, Content-Type: application/json and {"sections": ["a","b","d"]} in the request body. Receive a 201 with Location: /foo/requests/1. Then GET /foo/requests/1 to receive the desired representation of /foo
    • Clunky; requires back-and-forth and some weird-looking code.
    • Unbookmarkable and uncacheable since /foo/requests/1 is just an alias that would only be used once and only kept until it is requested.
Henke
Jordan
  • I'd go for the query string approach. The query string is a part of the URL and that's where scoping information should go. It's also how Facebook does it. Here's a neat screencast discussing this matter: [Teach a dog to REST](https://blog.apigee.com/detail/restful_api_design/). It shows some other formats you might consider too. – toniedzwiedz Apr 29 '13 at 07:38
  • You could make them 2 representations of the same resource and with content negotiation.. 100% restful. vnd.jordan.foo would include all, vnd.jordan.foo.minimal would not include d. if that's just an example and not a common case you'd go with _fields that also 100% restful and customizable. Your favorite a+b+d is REALLY bad so avoid it :) foo/requests/1 is also bad.. not because it's not performant or logical but because it's so custom. REST is mainly about using best practices and avoiding surprises :) NO to custom headers. NO to GET with body.. Sorry for structureless comment :) – EralpB Apr 24 '17 at 11:44
  • _fields is what you call "sections", you are right it looks like a filter but _fields is commonly used and accepted, wouldn't surprise any developer. – EralpB Apr 24 '17 at 11:46
  • Revisiting this, it looks like GraphQL is basically what I was looking for. – Jordan Jan 24 '20 at 05:38

5 Answers

I would suggest the querystring solution (your first). Your arguments against the other alternatives are good arguments (and ones that I've run into in practise when trying to solve the same problem). In particular, the "loosen the constraints/respond to foo/a" solution can work in limited cases, but introduces a lot of complexity into an API from both implementation and consumption and hasn't, in my experience, been worth the effort.

I'll weakly counter your "seems to mean" argument with a common example: consider the resource that is a large list of objects (GET /Customers). It's perfectly reasonable to page these objects, and it's commonplace to use the querystring to do that: GET /Customers?offset=100&take=50 as an example. In this case, the querystring isn't filtering on any property of the listed object, it's providing parameters for a sub-view of the object.

More concretely, I'd say that you can maintain consistency and HATEOAS through these criteria for use of the querystring:

  • the object returned should be the same entity as that returned from the Url without the querystring.
  • the Uri without the querystring should return the complete object - a superset of any view available with a querystring at the same Uri. So, if you cache the result of the undecorated Uri, you know you have the full entity.
  • the result returned for a given querystring should be deterministic, so that Uris with querystrings are easily cacheable
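
A minimal sketch of these three criteria, assuming the `sections` parameter from the question and a hypothetical in-memory `FULL_FOO` entity (neither the function name nor the error handling comes from a real framework):

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical full representation of /foo, as in the question.
FULL_FOO = {"a": "some text", "b": "some text", "c": "some text", "d": "some text"}


def represent(url: str, entity: dict) -> dict:
    """Return the full entity, or the subset named by ?sections=..."""
    query = parse_qs(urlsplit(url).query)
    if "sections" not in query:
        # Undecorated URI: always the complete object (the superset).
        return dict(entity)
    wanted = [name for name in query["sections"][0].split(",") if name]
    unknown = [name for name in wanted if name not in entity]
    if unknown:
        # How to report this is a design choice; an exception is an assumption here.
        raise KeyError(f"unknown sections: {unknown}")
    # Iterate in the entity's own key order so the same query string
    # always yields the same result.
    return {key: value for key, value in entity.items() if key in wanted}
```

Here `represent("/foo?sections=a,b,d", FULL_FOO)` returns only the `a`, `b`, and `d` members, while `represent("/foo", FULL_FOO)` returns everything, so any view is a subset of what the undecorated URI gives you.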

However, what to return for these Uris can sometimes pose more complex questions:

  • returning a different entity type for Uris differing only by querystring could be undesirable (/foo is an entity but foo/a is a string); the alternative is to return a partially-populated entity
  • if you do use different entity types for sub-queries then, if your /foo doesn't have an a, a 404 status is misleading (/foo does exist!), but an empty response may be equally confusing
  • returning a partially-populated entity may be undesirable, but returning part of an entity may not be possible, or may be more confusing
  • returning a partially populated entity may not be possible if you have a strong schema (if a is mandatory but the client requests only b, you are forced to return either a junk value for a, or an invalid object)

In the past, I have tried to resolve this by defining specific named "views" of required entities, and allowing a querystring like ?view=summary or ?view=totalsOnly - limiting the number of permutations. This also allows for definition of a subset of the entity that "makes sense" to the consumer of the service, and can be documented.

Ultimately, I think that this comes down to an issue of consistency more than anything: you can meet HATEOAS guidance using the querystring relatively easily, but the choices you make need to be consistent across your API and, I'd say, well documented.

Dan Puzey
  • I like the limited combinations idea. I decided to use it, but in the path rather than the query. – Jordan Apr 30 '13 at 04:01
  • The caching points are important, especially about caching the result of the undecorated URI - the sole reason I'm supporting representations of a subset of a resource is to reduce the network and server load, and the best way to do that (assuming several requests to the same resource) is to cache everything the client might need to know about the resource and just get pieces of it from there. – Jordan Apr 30 '13 at 04:05

I've decided on the following:

Supporting few member combinations: I'll come up with a name for each combination. e.g. if an article has members for author, date, and body, /article/some-slug will return all of it and /article/some-slug/meta will just return the author and date.

Supporting many combinations: I'll separate member names by hyphens: /foo/a-b-c.

Either way, I'll return a 404 if the combination is unsupported.
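
A minimal sketch of the hyphen-separated path scheme with the 404 fallback, assuming a hypothetical in-memory `FOO` resource and a made-up `resolve` function that returns a status code and a body:

```python
# Hypothetical resource; the member names follow the question's example.
FOO = {"a": "some text", "b": "some text", "c": "some text", "d": "some text"}


def resolve(path: str):
    """Map /foo or /foo/<m1>-<m2>-... to (status, representation)."""
    parts = path.strip("/").split("/")
    if parts[0] != "foo" or len(parts) > 2:
        return 404, None
    if len(parts) == 1:
        return 200, dict(FOO)  # /foo: the whole resource
    members = parts[1].split("-")
    if any(member not in FOO for member in members):
        return 404, None  # unsupported combination
    # Keep FOO's own key order regardless of the order in the path.
    return 200, {key: FOO[key] for key in FOO if key in members}
```

For instance, `resolve("/foo/a-b-d")` yields status 200 with the three requested members, and `resolve("/foo/a-z")` yields 404.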

Architectural constraint

REST

Identifying resources

From the definition of REST:

a resource R is a temporally varying membership function M_R(t), which for time t maps to a set of entities, or values, which are equivalent. The values in the set may be resource representations and/or resource identifiers.

A representation being an HTTP body and an identifier being a URL.

This is crucial. An identifier is just a value associated with other identifiers and representations. That's distinct from the identifier→representation mapping. The server can map whatever identifier it wants to any representation, as long as both are associated by the same resource.

It's up to the developer to come up with resource definitions that reasonably describe the business by thinking of categories of things like "users" and "posts".

HATEOAS

If I really care about perfect HATEOAS, I could put a hyperlink somewhere in the /foo representation to /foo/members, and that representation would just contain a hyperlink to every supported combination of members.

HTTP

From the definition of a URL:

The query component contains non-hierarchical data that, along with data in the path component, serves to identify a resource within the scope of the URI's scheme and naming authority (if any).

So /foo?sections=a,b,d and /foo?sections=b are distinct identifiers. But they can be associated within the same resource while being mapped to different representations.

HTTP's 404 code means that the server couldn't find anything to map the URL to, not that the URL is not associated with any resource.

Functionality

No browser or cache will ever have trouble with slashes or hyphens.

Jordan
  • I was thinking on same lines, e.g /foo/view/summary , /foo/view/full looks more appropriate than matrix or query params – spats Sep 25 '13 at 07:27
  • They are different resources, while being mapped to the same entity, no? Also, perfect HATEOAS is a nice goal (it serves the client) but in order to get the links of the subset resources the client must first get the full resource? – Dave Van den Eynde Feb 19 '14 at 08:20

Actually it depends on the functionality of the resource. If for example the resource represents an entity:

/customers/5

Here the '5' is the id of the customer.

Response:

{
   "id": 5,
   "name": "John",
   "surename": "Doe",
   "marital_status": "single",
   "sex": "male",
   ...
}

If we examine it closely, each JSON property represents a field of the record for this customer resource instance. Let's assume the consumer would like a partial response, meaning only some of the fields. We can see it as the consumer wanting to select, via the request, the fields that interest them and nothing more (to save traffic, or to improve performance if some of the fields are expensive to compute).

I think in this situation, the most readable and correct API would be (for example, get only name and surename)

/customers/5?fields=name,surename

Response:

{
   "name": "John",
   "surename": "Doe"
}

HTTP/1.1

  • if an illegal field name is requested, 404 (Not Found) is returned
  • if different field names are requested, different responses are generated under different URLs, which fits well with caching.
  • Cons: if the same fields are requested in a different order (say fields=id,name versus fields=name,id), the responses are identical but will be cached separately.
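
One possible way to soften the ordering drawback is to normalize the `fields` parameter before the URL is used as a cache key (or to redirect to the normalized URL). This is only a sketch using Python's standard `urllib.parse`; the `canonical_url` helper is a made-up name and not part of the answer:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def canonical_url(url: str) -> str:
    """Sort and de-duplicate the names in ?fields= so equivalent URLs match."""
    parts = urlsplit(url)
    params = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == "fields":
            value = ",".join(sorted(set(value.split(","))))
        params.append((key, value))
    # safe="," keeps the commas readable instead of encoding them as %2C.
    query = urlencode(params, safe=",")
    return urlunsplit(parts._replace(query=query))
```

With this, `/customers/5?fields=name,id` and `/customers/5?fields=id,name` both normalize to the same string, so a cache keyed on the normalized URL stores the response once.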

HATEOAS

  • In my opinion pure HATEOAS is not suitable for solving this particular problem, because to achieve it you need a separate resource for every combination of fields, which is overkill and bloats the API extensively (with 8 fields in a resource there are already 2^8 − 1 = 255 non-empty combinations, and far more if field order matters).
  • if you model resources only for the individual fields and not for their combinations, performance suffers: the client needs one round trip per field, when the goal is to keep the number of round trips to a minimum.
Genry

If a, b, and c are properties of a resource (like admin for a role property), the right way is the first one you suggested, GET /foo?sections=a,b,d, because in this case you apply a filter to the foo collection. Otherwise, if a, b, and c are each a single resource in the foo collection, the way to follow is a series of GET requests: /foo/a, /foo/b, /foo/c. This approach, as you said, has a high per-request overhead, but it is the correct way to follow the RESTful approach. I would not use your second proposal, because the plus character in a URL has a special meaning.

Another proposal is to abandon GET, use POST, and create an action for the foo collection, like /foo/filter or /foo/selection or any verb that represents an action on the collection. That way, with a POST request body, you can pass a JSON list of the resources you want.

LuckyStarr
  • foo as collection of members is initially very attractive. However, I can't really consider it a collection in the same way that `/posts` is a collection because `/posts` represents all resources of type 'post', such that a query would limit the range of posts it represents (see my [answer](http://stackoverflow.com/a/16291445/972499) for how I define 'resource') whereas `/foo` still just represents one resource. So doing it this way would give the URI query a dual meaning. (Anyway, REST doesn't care about the query -- it just cares about which identifiers associate to which representations.) – Jordan Apr 30 '13 at 03:41
  • This is helpful, especially the point about `+` -- I can't seem to find an authoritative answer to whether `+` is reserved outside or just inside the query, but I'd rather be safe than sorry. – Jordan Apr 30 '13 at 03:53

You could use a second vendor media type in the request header, such as application/vnd.com.mycompany.resource.rep2. You can't bookmark this, however. Query parameters are not cacheable (/foo?sections=a,b,c); you could take a look at matrix parameters instead, which according to this question should be cacheable: URL matrix parameters vs. request parameters.
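
A rough sketch of the media-type approach, assuming a second hypothetical type `application/vnd.com.mycompany.resource` for the full representation and an invented mapping from media type to members. It deliberately ignores q-values and wildcards in the Accept header, which a real implementation would need to handle:

```python
# Both the full media type and the member lists are assumptions for this sketch.
FIELDS_BY_MEDIA_TYPE = {
    "application/vnd.com.mycompany.resource": ("a", "b", "c", "d"),
    "application/vnd.com.mycompany.resource.rep2": ("a", "b", "d"),
}


def negotiate(accept_header: str, entity: dict):
    """Pick a representation from the Accept header: (status, headers, body)."""
    for item in accept_header.split(","):
        # Drop parameters such as ";q=0.8"; preference order is not honored here.
        media_type = item.split(";")[0].strip().lower()
        fields = FIELDS_BY_MEDIA_TYPE.get(media_type)
        if fields is not None:
            body = {name: entity[name] for name in fields}
            # Vary: Accept tells caches the response depends on the Accept header.
            return 200, {"Content-Type": media_type, "Vary": "Accept"}, body
    return 406, {"Vary": "Accept"}, None
```

The URL stays the same for every representation, which is why this variant cannot be bookmarked: the choice lives in a request header rather than in the identifier.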

Community
cproinger
  • Good find on matrix parameters. However, it traces back to an [opinion essay](http://www.w3.org/DesignIssues/MatrixURIs.html) with no standardization since then, and I'd rather use a standard solution than a nonstandard one. – Jordan Apr 30 '13 at 01:04