Adventures in REST-alandia

Overview

I recently managed a REST project from inception to deployment. As in any endeavor, there were quite a few interesting lessons to be learned.

I had long evangelized REST at this company, and finally got a chance to implement my “dream”. Although I am loath to use the word “fun”, I admit it was, albeit a lot of work. REST, as we all know, is a “style” and not a “spec”, so much is left to non-standard interpretation. I worked on this project right after a SOAP/WSDL project, so it was interesting to compare the costs and benefits of both.

Our REST service exposes two kinds of features for video playlists and playlist groups – basic CRUD operations and more complex business-specific operations. The former are a good fit for REST – the latter are not such a good fit.

  • /service/v1/ACCOUNT/playlists/1776
  • /service/v1/ACCOUNT/videos/1066
  • /service/v1/ACCOUNT/players/1054
  • /service/v1/ACCOUNT/genres
  • /service/v1/ACCOUNT/permissions
  • /service/v1/ACCOUNT/release

One of the more interesting outcomes was the active use of the live REST system by our business analyst. Granted he’s a sharp cookie, but like any of us, he’s a very busy person. The fact that he managed to quickly teach himself the basics of HTTP GET, PUT and POST using Google’s restclient tool underscores REST’s lower barrier to entry compared with SOAP, and the former’s ultimate advantage. This is a good example of the empowering nature of an “on-site customer”, which is so important to agile projects. Once the basic REST model is comprehended, its “intuitive” nature is easily grasped.

Resource Representation

We ended up using our own custom XML vocabulary defined by an XSD. I had seriously considered using Atom Pub as our format since I had been on an intense Atom-learning binge, but after careful cost-benefit analysis, I opted for a custom XSD. XSD was a known quantity, and Java has better tools for binding XSD-defined XML, namely XMLBeans. I was initially “surprised” to learn that Atom doesn’t have a normative XSD or RelaxNG schema, and even though there are rocking Abdera and Rome toolkits out there, at the end of the day you are still obligated to manually map XML to Java (and vice versa). The automated and faithful binding of XMLBeans is a compelling advantage.

Our data model would have required significant Atom namespace extensions, which meant a lot of extra work in the too few months we had to roll out our application. We already had a doc/lit SOAP service deployed with its own XSD, so reusing this grammar was another reason to go the custom XML route. Finally, there was no compelling business reason to use Atom, and no one except me was familiar with (or interested in) Atom. All in all the risk outweighed the benefit. Shameless plug: this is what you get when you hire me – objective advice not tainted by the latest sexy trends, even though it might be better for my resume to have that latest trend on it!

Nevertheless, the framework did have built-in support for content negotiation, so besides the standard XML format, Atom, MRSS and JSON were returned for selected resources as an experimental next-generation feature. All you have to do is specify a query parameter “format=atom” on a GET and, lo and behold, the format is different. Conneg is truly one of the key features distinguishing REST from other paradigms, and it can take a while to get used to. Since it is such a paradigm shift, the consequent plethora of rich opportunities is not immediately apparent. From an implementation perspective, the pluggable nature of Spring dependency injection readily lends itself to variant formats.
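
To make the “format” parameter idea concrete, here is a minimal Java sketch of how a format name might be resolved to a media type. The class name FormatResolver, the media type strings and the fallback behavior are assumptions for illustration only; our in-house framework is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps the "format" query parameter to a media type.
class FormatResolver {
    private static final Map<String, String> MEDIA_TYPES = new LinkedHashMap<>();
    static {
        MEDIA_TYPES.put("xml", "application/xml");        // the standard format
        MEDIA_TYPES.put("atom", "application/atom+xml");
        MEDIA_TYPES.put("mrss", "application/rss+xml");    // assumption: MRSS served as RSS
        MEDIA_TYPES.put("json", "application/json");
    }

    // Returns the media type for the requested format. A missing parameter
    // falls back to XML; an unknown format returns null so the caller can
    // decide how to reject the request.
    static String resolve(String formatParam) {
        if (formatParam == null || formatParam.isEmpty()) {
            return MEDIA_TYPES.get("xml");
        }
        return MEDIA_TYPES.get(formatParam.toLowerCase(Locale.ROOT));
    }
}
```

With Spring, each media type could then be wired to its own pluggable renderer bean, which is where the dependency injection pays off.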

Errors

REST proponents make much of the simplicity of returning standard HTTP error codes, but upon further examination some serious limitations of this approach are apparent:

  • HTTP error codes are split into 4xx client and 5xx server errors. This is definitely unlike most standard programming use cases.
  • HTTP error codes are not rich enough to capture business errors.

As we all know, happy path programming is never adequate, and true enterprise-style applications have to account for a wide variety of errors. Paying business clients do NOT want to see Java exceptions or stack traces – but the cost of mapping these to intelligent error messages can carry a non-trivial cost. Much depends on the imagination of the product owners – they need to envision how negative the down-stream cost of confusing and incoherent error messages will be on the client and insist that development put in place an exception handling mechanism that will deliver accurate and understandable messages.

It just seems that a simple concept such as an error reporting mechanism and format could be standardized upon. SOAP has the SOAP Fault element, and all SOAP applications can interoperably process an application’s errors. We need to build upon HTTP errors and have a richer mechanism.

Client and Server Error Codes

The distinction between errors caused by the client and those caused by the server is not typically encountered in non-REST programming scenarios. This is a prime example of one of those pesky “devil in the details” issues (Tasmanian devil?) that can cause inordinate grief. For every error encountered, we have to decide whether it is a client or server error. Passing the client/server context down to the culprit exception is a non-trivial exercise.

For example, assume we have an XML syntax error on a playlist representation. Is this error generated by the client or the server? If it is in an inbound request, a 4xx client error code should be generated. If the error is encountered when returning a response, a 5xx code should be returned since it is the server that messed up. The upshot is this: unlike “normal cases” the context (client or server) has to be passed down to every place where an XML syntax error could be encountered. Ugh! This is definitely a lot of work. Much simpler to return the ubiquitous 500 for most everything even though it ain’t pretty.
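
One way to picture the extra plumbing is the following Java sketch, in which the exception itself carries the direction of the failing payload. The names Direction, XmlSyntaxException and ErrorStatusMapper are hypothetical and chosen only for this illustration.

```java
// Hypothetical: which way the XML was travelling when parsing failed.
enum Direction { INBOUND_REQUEST, OUTBOUND_RESPONSE }

// Hypothetical exception that remembers the client/server context.
class XmlSyntaxException extends RuntimeException {
    private final Direction direction;

    XmlSyntaxException(String message, Direction direction) {
        super(message);
        this.direction = direction;
    }

    Direction direction() {
        return direction;
    }
}

class ErrorStatusMapper {
    // Bad inbound XML is the client's fault (400); bad outbound XML is the
    // server's fault (500). Anything unrecognized defaults to 500.
    static int statusFor(RuntimeException e) {
        if (e instanceof XmlSyntaxException) {
            XmlSyntaxException x = (XmlSyntaxException) e;
            return x.direction() == Direction.INBOUND_REQUEST ? 400 : 500;
        }
        return 500;
    }
}
```

The mapping itself is trivial. The cost lies in making sure every parsing call site knows which Direction to put into the exception.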

The inevitable choice:

  • Dispense with this expensive error cause differentiation and return less helpful error messages.
  • Be faithful to the spirit of REST/HTTP and give clients informative and intelligent messages, but pay a non-trivial cost.

Adequacy of HTTP Error Codes

The HTTP model supports a limited range of error codes – some for the server and some for the client. Surprisingly, there is no extension mechanism available to supply application-specific error codes. Of course, the HTTP response Reason Phrase can be used, but what is missing is a code – a well-known extensible value that a program can switch on.
Only two server error codes – 500 (Internal Server Error) and 501 (Not Implemented) – are really usable for application errors; the remaining 5xx codes concern gateways, service availability and protocol versions. Obviously, this is hardly adequate to account for real-life error cases.

Client error codes are similar – 400 (Bad Request. The request contains bad syntax or cannot be fulfilled) or 409 (Conflict). For example, if the client specified a non-existent video for a playlist, how can we convey the precise message to the client? It is not a syntax error – it’s a semantic error. HTTP has no provision for the latter. A 409 typically represents a concurrent update clash or a unique constraint violation.

Two non-standard solutions for returning application-specific error codes:

  • Return the application error code in the response body
  • Return the application error code in a custom response header, e.g. X-MyApp-Error-Code

Response body error messages require the client to switch on the returned content. On the server side, errors that must be trapped by standard Servlet web.xml features are trapped for the whole web application. If the REST service owns the WAR that is OK – if it shares it with other services (our case), then it has to account for the error use cases of these other services – not necessarily a trivial task if you are adding a new service to an existing multi-service WAR.

Response headers have the advantage of consistency in that the client looks for errors only in one place. The client checks for “major” errors in the standard HTTP Status Code – and “minor” errors in a custom HTTP header.
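
A minimal Java sketch of the header approach follows, seen from both sides. The ErrorResponse and ErrorInspector classes and the VIDEO_NOT_FOUND code used in the note below are invented for illustration; only the X-MyApp-Error-Code header name comes from the list above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical minimal model of an HTTP response: a status plus headers.
class ErrorResponse {
    static final String ERROR_HEADER = "X-MyApp-Error-Code";

    final int status;
    final Map<String, String> headers = new LinkedHashMap<>();

    ErrorResponse(int status, String appErrorCode) {
        this.status = status;
        if (appErrorCode != null) {
            headers.put(ERROR_HEADER, appErrorCode); // the "minor" error
        }
    }
}

class ErrorInspector {
    // Client-side view: check the "major" error in the status code first,
    // then refine it with the "minor" code from the custom header, if any.
    static String describe(ErrorResponse response) {
        if (response.status < 400) {
            return "ok";
        }
        String major = response.status < 500 ? "client error" : "server error";
        String minor = response.headers.get(ErrorResponse.ERROR_HEADER);
        return minor == null ? major : major + ": " + minor;
    }
}
```

For instance, a 400 carrying a made-up VIDEO_NOT_FOUND code would be described as a client error with that code, and a program can switch on the header value without parsing the body.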

Resource Modeling Limitations

Classical CRUD semantics for resources are a good fit for REST services. Inevitably however, more complex business requirements that enlist several resources are not as easily supported by REST.

The unlimited operation name space of SOAP operations allows you to easily add any new functionality that the business mandates. The constraining nature of four HTTP verbs can often make it difficult to come up with an appropriate resource name. Noun-ifying a new operation doesn’t seem to be always the best course. As a conscientious designer, I want to keep my resource model as coherent as possible and REST compliant. Pragmatically, business people are less concerned about this, and often time pressures can lead us to add non-REST-ian resource names. With SOAP you just add a new operation name and some arguments – with REST there can be a lot of hand-wringing and “religious discussions” on what is the best approach. Again, this is one of the consequences of implementing a “style” as opposed to a “specification”.

Moving a Playlist from one Playlist Group to Another

For example, we encountered the problem of moving a child playlist from one playlist group to another. Assume we want to move “Beyonce 30-th Birthday” playlist from “My Playlist Group” to “Your Playlist Group”. There is no natural REST mapping of this use case – a lot of extra cognitive overhead has to be expended in creating a non-standard solution. In the procedural (e.g. SOAP) mode, this could be easily implemented as:

move("Beyonce 30-th Birthday", "My Playlist Group", "Your Playlist Group")

or if you use IDs:

move(1776, 1812, 1861)

In a REST-ian model, this action enlists three resources: playlist, source playlist group and target playlist group. Which resource is this operation to be based upon? The playlist or one of the playlist groups? Which HTTP method should be used: POST or PUT? One example:

PUT /playlist/1776

<source id="1812"/>
<target id="1861"/>

Subbu Allamaraju – a Yahoo REST thought leader – has proposed the “Account Transfer Pattern” where you create a new resource that atomically encompasses the participant resources. He also specified the operation as a POST – we are creating a new “transfer” resource – in this case a transfer of a playlist from one group to another.

To summarize, there is no standard REST “spec” or common usage for such use cases – and it is up to each application to create its own composite resources with non-trivial cognitive overhead.
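
As a rough illustration of the transfer idea, here is a minimal in-memory Java sketch. The class names PlaylistTransfer and PlaylistGroups, the choice of 201 and 409 as outcomes, and the map-based storage are all assumptions made for this example; they are not taken from our service or from the pattern's original description.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical body of a POST that creates a "transfer" resource.
class PlaylistTransfer {
    final long playlistId;
    final long sourceGroupId;
    final long targetGroupId;

    PlaylistTransfer(long playlistId, long sourceGroupId, long targetGroupId) {
        this.playlistId = playlistId;
        this.sourceGroupId = sourceGroupId;
        this.targetGroupId = targetGroupId;
    }
}

// Hypothetical in-memory stand-in for the playlist group store.
class PlaylistGroups {
    private final Map<Long, Set<Long>> members = new HashMap<>();

    synchronized void add(long groupId, long playlistId) {
        members.computeIfAbsent(groupId, k -> new HashSet<>()).add(playlistId);
    }

    synchronized boolean contains(long groupId, long playlistId) {
        return members.getOrDefault(groupId, Collections.emptySet()).contains(playlistId);
    }

    // Applies the move as a single step. Returns 201 (transfer created) on
    // success, or 409 if the playlist is not in the source group.
    synchronized int apply(PlaylistTransfer t) {
        Set<Long> source = members.get(t.sourceGroupId);
        if (source == null || !source.remove(t.playlistId)) {
            return 409;
        }
        add(t.targetGroupId, t.playlistId);
        return 201;
    }
}
```

The point of the composite resource is that removal from the source group and addition to the target group happen together, so a client never sees the playlist in both groups or in neither.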

Distributing and Recalling a Video

Another example was a new requirement that we add the capability of distributing and recalling a video. Conceptually (in an OO non-RESTian manner) this tells me to add a new method on the video object. But I don’t have this ability in REST for I am limited to the four holy verbs. So do I create new resources such as:

  • /service/v1/ACCOUNT/videodistribute
  • /service/v1/ACCOUNT/videorecall

To distribute a video, do I use a POST or PUT? But this just doesn’t seem intuitively correct to me. After some pondering I decided upon:

  • /service/v1/ACCOUNT/videos/distribute
  • /service/v1/ACCOUNT/videos/recall

using a POST where the request body contained a list of video IDs to distribute or recall plus other information such as a message.

Batching

REST is a good fit for manipulating individual resources, but when it comes to manipulating several resources at once things get a bit more difficult. The main problem with batching is how to report which batch items failed and which succeeded. The whole HTTP error mechanism proves inadequate here. If we submit ten items in a batch job and three fail, we do not want to return a 400 since this would not correctly convey that seven had succeeded. We are left with the task of inventing our own mechanism – again a non-standard way to solve a standard problem.

There has been recognition of this problem in the REST community, and each application is providing its own conventions – for one example, see YouTube batching.

Using our video example, a regular singular GET is:

  • /service/v1/ACCOUNT/videos/1492

Now we want to “batch-ify” the resource:

  • /service/v1/ACCOUNT/videos/1492,1519

Simple enough you say – just return a list of video representations. But what if 1492 doesn’t exist and 1519 does? In the singular version we can return the conventional 404. In the batch version we don’t have this option, and ultimately we have to somehow convey which batch items succeeded and which didn’t. We then have to enhance our response data format to contain this information. Things get messy pretty quickly.
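
One possible shape for such an enhanced response is a per-item status, sketched below in Java. The BatchItemResult and VideoBatch names, the reuse of 200 and 404 as item statuses, and the map standing in for the video store are assumptions for illustration, not our actual format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical per-item outcome inside a batch response.
class BatchItemResult {
    final long id;
    final int status;            // per-item status, borrowing HTTP codes
    final String representation; // null when the item failed

    BatchItemResult(long id, int status, String representation) {
        this.id = id;
        this.status = status;
        this.representation = representation;
    }
}

class VideoBatch {
    // Looks up each requested ID. The map is a stand-in for the real store.
    // Missing videos get their own 404 entry instead of failing the batch.
    static List<BatchItemResult> get(List<Long> ids, Map<Long, String> store) {
        List<BatchItemResult> results = new ArrayList<>();
        for (Long id : ids) {
            String representation = store.get(id);
            if (representation != null) {
                results.add(new BatchItemResult(id, 200, representation));
            } else {
                results.add(new BatchItemResult(id, 404, null));
            }
        }
        return results;
    }
}
```

The overall HTTP status then only says that the batch was processed, and the client has to walk the list to find out what happened to each item.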

In the first SOAP-based incarnation of the service, we had actually designed all our SOAP operations as batch. Every response object was derived from a base class that had a list of result statuses (success or error) and the actual result data if successful. Although this was a clean solution, for those clients that were only manipulating one object this proved irksome. Furthermore, implementing batch on the back end is a non-trivial exercise. Nevertheless, for those business use cases and for performance sake (i.e. avoiding multiple remote calls) batch is needed.

Conclusion

Alas, as Fred Brooks pointed out a while back, there is no silver bullet in our business! REST excels for “simple” cases – but the case can be made that SOAP has better support for arbitrary non-CRUD complex usage scenarios.

Web Linking

We all know that web linking is a “good” thing, but again, when it comes time to implement it things get a bit sticky. And that “we” might not extend to the business people involved in the resource modeling. Typically business folks aren’t that well versed in REST-ian convention and don’t really care that much. They just want the job done with minimal cost. Furthermore, IDs can be simpler to understand and manipulate. Amazon Web Services deal in IDs.

<video>
  <id>1453</id>
  <genres>
    <link href="1618" />
  </genres>
</video>

or

<video>
  <id>1453</id>
  <genres>
    <link href="http://www.foobar.com/service/v1/ACCOUNT/genres/1618" />
  </genres>
</video>

Our particular business requirements made links especially problematic. Firstly, the business wanted to use genre names and not IDs. IDs were too hard to remember and to keep track of. As long as both names and IDs are unique, this could have been done. However, our genres were hierarchical and names were not unique but full genre paths were. For example, “animals/canines/dogs” instead of “dogs” since there was nothing to prohibit you from having “pets/dogs”. As you can see, putting a slash delimited path in the URI can be confusing.

Furthermore, our requirements were to only expose genres and permissions as collections and there was no requirement to expose singular versions. GET /service/v1/ACCOUNT/genres would return a collection of genre representations but /service/v1/ACCOUNT/genres/1618 would return a 404. Granted it seems inconsistent and “not pretty”, but the alternative is to deploy and test a feature the business doesn’t explicitly want. So web links to individual genres could not be dereferenced – you would get a 404!

The outcome was to add a “kind” attribute to the link element with three possible values: id, name or uri. We rolled out the application using id or name links but retained the ability to support uri in the future.

Another problem with the uri version is that it required quite a bit of extra work on the implementation side since the request context had to be passed down the stack to every place a reference was being constructed. It wasn’t sufficient just to have the object’s ID or name – we also had to have the current URL of the application. Again, the devil is in the details.
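
The three link kinds, and the reason the uri kind needs more context, can be sketched in Java as follows. LinkKind and GenreLinks are hypothetical names, and the exact attribute layout of the rendered element is an assumption rather than our real schema.

```java
import java.util.Locale;

// The three values of the "kind" attribute described above.
enum LinkKind { ID, NAME, URI }

class GenreLinks {
    // Renders a genre link. Only the URI kind needs baseUrl, which is why
    // the request context has to be passed all the way down to this point.
    // Note: this sketch does no XML escaping of the href value.
    static String render(LinkKind kind, long id, String namePath, String baseUrl) {
        String href;
        switch (kind) {
            case ID:
                href = Long.toString(id);
                break;
            case NAME:
                href = namePath; // full genre path, since bare names are not unique
                break;
            case URI:
                href = baseUrl + "/genres/" + id;
                break;
            default:
                throw new IllegalArgumentException("Unknown link kind: " + kind);
        }
        return "<link kind=\"" + kind.name().toLowerCase(Locale.ROOT)
                + "\" href=\"" + href + "\" />";
    }
}
```

For the id and name kinds the caller can pass null for baseUrl, which matches what we shipped. Supporting uri later means every caller must be able to supply the current application URL.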

WADL

There has been apparent movement toward using WADL as a service definition language a la WSDL.
Some preliminary thoughts on WADL:

  • A good idea in theory – in practice questions remain.
  • Most surprising and interesting is that WADL hasn’t been improved since Nov. 2006. Que pasa? Does anyone own this still? Progress is needed, is anyone responsible for this? Hmmm. Red-ish flags.
  • Current WADL doesn’t really support the DRY principle – resource-wide features (query parameters, request/response headers, XML representations) are not specified in one place. Instead, they are repeated in-line for every resource URI. For any REST API with a significant number of resources, this quickly leads to repetitive, confusing and overwhelming feature descriptions.

Implementation

The REST service was implemented as a Java web application with a custom in-house REST framework based upon Spring. The reason for this was partially historical – the project had started as a skunkworks effort and was substantially developed by the time it was officially blessed. So considering the compressed delivery schedule and limited resources, we went ahead with what we had. In addition, I had reservations about the maturity of two prime candidate frameworks – Restlet and Jersey.

Neither had self-evident, easy support for Spring, and that was a deal breaker in itself. Jersey at that time didn’t seem mature enough, plus I have reservations about its exclusive use of annotations to define the REST contract. Mixing service-level declarative information with implementation Java code is not a good practice in my opinion even though it is popular in many circles today – the fad du jour syndrome. But that’s another topic. Furthermore our business requirements were relatively simple from the REST-ian perspective (no fancy matrix parameters or conneg) so the home-grown system was adequate.

Sidebar: I am pleasantly surprised with JPA’s well-thought out mechanism for over-riding annotations with XML descriptors. Too bad JAX-RS doesn’t provide a similar mechanism.

One thing I wished I had time to finish was a Spring namespace extension that defined a REST-ian DSL that could more succinctly capture REST declarative semantics. But time was short and darn – that Spring feature is quite complex. So many cool software toys – so little time.
