Java and Technology weblog
REST, or Representational State Transfer, is an architectural style or, more simply, a set of constraints.
We will look at the constraints REST imposes for web apps, but some highlights are:
- Uniform interfaces: all resources are identified by URIs (think: links)
- It relies on a stateless, client-server, cacheable communications protocol (think: HTTP).
- Interaction with resources is via a set of standard methods (think: HTTP verbs)
REST can be viewed as a lightweight alternative to mechanisms like RPC (Remote Procedure Calls) and Web Services protocols (SOAP, WSDL, etc.), but it is much more than that too! It is not an exaggeration to say that REST has been used to guide the design and development of the architecture for the modern Web.
The term REST was defined in 2000 by Roy Fielding in his doctoral dissertation at UC Irvine.
A brief history of WWW
Back in 1989, Tim Berners-Lee first proposed the “WorldWideWeb” project. Berners-Lee was a software engineer working at CERN, the large particle physics laboratory in Switzerland. Many scientists worked at CERN for periods of time, then returned to their own labs around the world, so there was a need for them to be able to share and link their research documents. To facilitate this, Berners-Lee proposed three technologies that would become the foundation of the Web:
- HTTP: Hypertext Transfer Protocol. HTTP is a protocol, or a formal set of rules, for exchanging information over the web. It allows for the retrieval of linked resources from across the Web.
- HTML: HyperText Markup Language. The publishing format for the Web, including the ability to format documents and link to other documents and resources.
- URI: Uniform Resource Identifier. A kind of “address” that is unique to each resource on the Web.
(We are not going to delve into HTML here; instead the focus is on HTTP and a little on URIs.)
The first documented version of HTTP was HTTP V0.9 (1991). It had only one method, namely GET, which would make a request to a server, and the server would respond with an HTML page. It was a good start, but needed many enhancements to support the exploding popularity of the Web.
So, Berners-Lee teamed up with researcher Roy Fielding, and others, to develop HTTP 1.0. HTTP 1.0 transformed HTTP from a trivial request/response application to a true messaging protocol. It described a complete message format for HTTP, and explained how it should be used for client requests and server responses, and supported multiple media types.
Unfortunately, some of the limitations of HTTP 1.0 were increasingly causing problems as web usage grew. For example, a separate connection to the server was made for every resource request. There was also a lack of support for caching and proxying.
Jump forward to 1994. The web was growing really fast. It was an exciting time. The WWW was becoming a buzzword and getting a huge amount of press. Sites like Hotmail, Yahoo and AltaVista were taking off. Google didn’t even exist yet.
But the architecture and technologies on which the web was built were beginning to creak at the seams. So Berners-Lee and Fielding, who were researchers at MIT and UC Irvine respectively, and a number of other leading technologists, including folks from Compaq, Xerox and Microsoft, got together to specify and improve the WWW infrastructure through the IETF working groups on URI, HTTP, and HTML.
Through this work, HTTP 1.1 was born.
Some of the big improvements introduced in HTTP 1.1 were:
- Multiple Host Name Support: Allows one Web server to handle requests for many different virtual hosts.
- Persistent Connections: Allows a client to send multiple requests for documents in a single TCP session.
- Partial Resource Selection: A client can ask for only part of a resource rather than the entire document, reducing load and required bandwidth
- Better Caching and Proxying Support
- Content Negotiation: Allows the client and server to exchange information to help select the best resource when multiple are available.
- Better Security: Defines authentication methods and is generally more “security aware”
Work began on HTTP 1.1 in 1994, and it was officially released in 1997.
And what version of HTTP is still in everyday use today? 1.1, over 25 years later, even with HTTP/2 and HTTP/3 now deployed alongside it! Considering how quickly technology changes, that is an incredible achievement. How many projects have you worked on that have stood the test of time so well?
Fielding had been involved in the web from its infancy and experienced first hand its rapid growth, both as a user and as an architect. He understood better than most the reasons for its success, and so after the release of HTTP 1.1, Fielding began to write about what he had learned working on HTTP and the other web technologies (Fielding was also involved in the development of HTML and URIs, and was a co-founder of the Apache HTTP Server project). He took the knowledge of the Web’s architectural principles and presented them as a framework of constraints, or as he called them, an architectural style. Specifically, Fielding wrote a PhD thesis focused on the rationale behind, and key architectural principles of, the design of the modern Web architecture.
Fielding’s thesis was published in 2000, and was called Architectural Styles and the Design of Network-based Software Architectures. I have to admit that I have not read many PhD theses, but his must be among the most readable of them. It even contains Monty Python quotes!
In it, Fielding discusses Network-based Application Architectures and Architectural Styles, before introducing and defining the term REST. Although the term was introduced in his dissertation, Fielding noted that “REST has been used to guide the design and development of the architecture for the modern Web”. So, while the term REST didn’t come about until afterwards, it is the design style behind HTTP. Fielding didn’t ‘invent’ REST in his paper; instead he developed it in collaboration with his colleagues while working on HTTP and URIs, but it was in his paper that the term was coined and defined.
Fielding tried to answer the question of why the Web has been such a successful platform by explaining its guiding principles and how they can be correctly applied when building distributed systems.
So, want to build a distributed web app? Not sure what architecture to use? Why not base it on the Web’s architecture!
Before diving in to what REST is, feel free to read the terminology section at the end.
What is REST?
REST is an architectural style, or a set of constraints, for distributed hypermedia systems.
Imagine you were designing a freeway. You might impose rules such as cars only (no trucks, pedestrians or bicycles), all traffic must travel between 40 and 70 mph, and no traffic lights (only on and off ramps). Although these rules constrain the system, they make it work better overall; in this case they allow more traffic to flow more freely and faster.
REST imposes constraints on web apps, or distributed hypermedia systems, in order to enable those apps to scale and perform as desired.
What were the constraints that Fielding suggested?
1) Client Server
By separating the user interface concerns from the data storage concerns, we improve the portability of the user interface across multiple platforms and improve scalability by simplifying the server components. Separation also allows the components to evolve independently.
2) Stateless
Communication must be stateless. Each request from client to server must contain all of the information necessary to understand the request. Session state is kept entirely on the client.
Reliability is improved because it eases the task of recovering from partial failures.
Scalability is improved because not having to store state between requests allows the server component to quickly free resources, and simplifies implementation.
3) Cache
Cache constraints require that the data within a response to a request be labeled as cacheable or non-cacheable. If a response is cacheable, then a client cache is given the right to reuse that response data for later, equivalent requests.
4) Uniform Interface
The central feature that distinguishes the REST architectural style from other network-based styles is its emphasis on a uniform interface between components. Implementations are decoupled from the services they provide, which encourages independent evolvability.
5) Layered System
The layered system style allows an architecture to be composed of layers by constraining component behavior such that each component cannot “see” beyond the immediate layer with which they are interacting.
6) Code-On-Demand
The final addition to our constraint set for REST comes from the code-on-demand style. REST allows client functionality to be extended by downloading and executing code in the form of applets or scripts. This simplifies clients by reducing the number of features required to be pre-implemented. Allowing features to be downloaded after deployment improves system extensibility. However, it also reduces visibility, and thus is only an optional constraint within REST.
Those are the constraints that make up REST. Next, HTTP.
HTTP has a very special role in web architecture, and with REST in particular.
Note however that REST doesn’t have to use HTTP. There are other application-level protocols that could, possibly, be candidates for use with REST: Gopher was widely used in the early days of the web, although it was overtaken by HTTP; Fielding himself has been working on a new HTTP-like protocol called waka; and there is also a Google-developed protocol called SPDY that has the goals of reducing web page load latency and improving web security.
However in practice REST and HTTP are closely related. Fielding not only introduced REST, he was also one of the principal authors of the HTTP specification, so it is not too surprising that the two are closely linked.
We will dive in to HTTP and look at some example requests & responses and the HTTP methods and response codes that are commonly used.
An example of an HTTP request:
GET /index.html HTTP/1.1
Host: www.example.com
This is made up of the following components:
- Method: GET
- URI: /index.html
- Version: HTTP/1.1
- Headers: Host: www.example.com
- Body: empty in this case
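To make those components concrete, here is a minimal sketch in Python that splits a raw request into the same pieces. The function name `parse_request` is my own invention for illustration, and the parsing is deliberately naive (a real server has to cope with many more edge cases).

```python
def parse_request(raw: str) -> dict:
    """Split a raw HTTP/1.x request into method, URI, version, headers and body."""
    # Headers and body are separated by one empty line (CRLF CRLF).
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # The first line is the request line: METHOD SP URI SP VERSION
    method, uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {"method": method, "uri": uri, "version": version,
            "headers": headers, "body": body}


raw_request = "GET /index.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
request = parse_request(raw_request)
print(request)
```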
An example of an HTTP response:
HTTP/1.1 200 OK
Date: Mon, 23 May 2005 22:38:34 GMT
Server: Apache/220.127.116.11 (Unix) (Red-Hat/Linux)
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
ETag: "3f80f-1b6-3e1cb03b"
Content-Type: text/html; charset=UTF-8
Content-Length: 131
Accept-Ranges: bytes
Connection: close

<html>
<head>
<title>An Example Page</title>
</head>
<body>
Hello World
</body>
</html>
This is made up of the following components:
- Version, status code and reason phrase: HTTP/1.1 200 OK
- Headers: Date through to Connection
- Body: the HTML document
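As a rough illustration of how a response like this is put together, the sketch below assembles a status line, a few headers and a body in Python. `build_response` is a hypothetical helper written for this post, not part of any library, and it only sets a handful of the headers shown above.

```python
def build_response(code: int, reason: str, body: str, content_type: str) -> bytes:
    """Assemble a minimal HTTP/1.1 response: status line, headers, blank line, body."""
    payload = body.encode("utf-8")
    lines = [
        f"HTTP/1.1 {code} {reason}",                     # version, status code, reason phrase
        f"Content-Type: {content_type}; charset=UTF-8",  # tells the client how to read the body
        f"Content-Length: {len(payload)}",               # length of the body in bytes
        "Connection: close",
    ]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + payload


response = build_response(200, "OK", "<html><body>Hello World</body></html>", "text/html")
print(response.decode("utf-8"))
```

Note that Content-Length is computed from the encoded body rather than typed in by hand, which is one of the things that makes the message self-describing.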
In the above request example, the verb is GET. HTTP verbs are also known as methods, and there are 8 defined in HTTP 1.1 (RFC 2616). First we will look at the 4 most commonly used verbs: GET, PUT, DELETE and POST. Then we will look at the lesser-used ones: HEAD, OPTIONS, TRACE and CONNECT.
However, before we dive in to the methods, let’s take a look at some characteristics, or groupings, of the messages. Specifically, the concept of safe methods and idempotency.
Safe methods are methods that do not modify resources; they are used only for retrieval. (Strictly speaking, some things may change, e.g. logs, caches etc., but the representation of the resource in question must not.)
Safe methods are: HEAD, GET, OPTIONS and TRACE.
By contrast, non-safe methods such as POST, PUT, DELETE and PATCH are intended to cause side effects on the server. (PATCH is not one of the 8 methods in RFC 2616; it was added later, in RFC 5789.)
Idempotent methods can be called many times without different outcomes. Call it once, or 1 thousand times, the result will be the same. For example, multiplying by 1 is an idempotent operation. So is the assignment ‘a=4;’
More formally “Methods can also have the property of idempotence in that (aside from error or expiration issues) the side-effects of N>0 identical requests is the same as for a single request.” 
The methods GET, HEAD, PUT and DELETE share this property. Also, the methods OPTIONS and TRACE SHOULD NOT have side effects, and so are inherently idempotent.
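Here is a small Python sketch of both groupings. The two sets simply restate the lists above; the `assign_four` and `add_four` functions are illustrative stand-ins for an idempotent and a non-idempotent operation, and are not HTTP calls.

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}
IDEMPOTENT_METHODS = SAFE_METHODS | {"PUT", "DELETE"}


def describe(method: str) -> dict:
    """Report whether a method is safe and/or idempotent."""
    return {"safe": method in SAFE_METHODS,
            "idempotent": method in IDEMPOTENT_METHODS}


def assign_four(state: dict) -> None:
    state["a"] = 4       # like 'a=4;': repeating it changes nothing


def add_four(state: dict) -> None:
    state["a"] += 4      # every repetition changes the outcome


def apply_n_times(operation, times: int) -> int:
    """Start from a == 0, apply the operation N times, return the final value."""
    state = {"a": 0}
    for _ in range(times):
        operation(state)
    return state["a"]


print(describe("PUT"))
print(apply_n_times(assign_four, 1), apply_n_times(assign_four, 1000))
print(apply_n_times(add_four, 1), apply_n_times(add_four, 3))
```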
And now, a look at the 4 most commonly used verbs: GET, PUT, DELETE, POST
GET
Retrieve the resource identified by the URI.
The simplest and most common method! The one you use every time you access a web page.
PUT
Store the supplied entity under the supplied URI.
If the resource already exists, update it (and return either 200 OK or 204 No Content).
If not, create it with that URI (and return a 201 Created response).
POST
Request that the server accept the entity as a new subordinate of the resource identified by the URI. For example:
- Submit data from a form to a data-handling process;
- Post a message to a mailing list or blog.
In plain English, create a resource.
DELETE
Requests that the server delete the resource identified by the URI.
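To see the four verbs side by side, here is a toy in-memory "server" in Python. Everything in it (the `ResourceStore` class, the `/books` URIs, the choice of 200 rather than 204 on update) is an assumption made for illustration; it shows the semantics described above and is not how any particular framework implements them.

```python
import itertools


class ResourceStore:
    """A toy resource store keyed by URI, returning (status code, payload) pairs."""

    def __init__(self):
        self._resources = {}
        self._ids = itertools.count(1)

    def get(self, uri):
        if uri in self._resources:
            return 200, self._resources[uri]
        return 404, None

    def put(self, uri, entity):
        # The client names the URI: create if absent (201), otherwise update (200).
        created = uri not in self._resources
        self._resources[uri] = entity
        return (201 if created else 200), entity

    def post(self, collection_uri, entity):
        # The server picks the URI of the new subordinate resource.
        uri = f"{collection_uri}/{next(self._ids)}"
        self._resources[uri] = entity
        return 201, uri

    def delete(self, uri):
        if uri not in self._resources:
            return 404, None
        del self._resources[uri]
        return 204, None


store = ResourceStore()
created = store.put("/books/42", {"title": "REST"})
updated = store.put("/books/42", {"title": "REST, 2nd ed."})
posted = store.post("/books", {"title": "HTTP"})
fetched = store.get("/books/1")
deleted = store.delete("/books/42")
gone = store.get("/books/42")
print(created, updated, posted, fetched, deleted, gone)
```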
PUT vs POST
OK, before we go on to the other, lesser-used HTTP verbs, let’s take a look at the 2 commonly used verbs above that people most often confuse: PUT and POST.
The official HTTP 1.1 specification (RFC 2616) states:
“The fundamental difference between the POST and PUT requests is reflected in the different meaning of the Request-URI. The URI in a POST request identifies the resource that will handle the enclosed entity. That resource might be a data-accepting process, a gateway to some other protocol, or a separate entity that accepts annotations. In contrast, the URI in a PUT request identifies the entity enclosed with the request — the user agent knows what URI is intended and the server MUST NOT attempt to apply the request to some other resource.”
That however is a bit of a mouthful!
PUT and POST can both be used to create or update a resource, but here are some (sometimes contradictory!) rules of thumb:
- PUT is for update; POST is for create
- PUT is idempotent; POST is not
- Who creates the URL of the resource?
  - PUT is for creating when you know the URL of the thing you will create
  - POST is for creating when the server decides the URL for you (you just know the URL of the “factory” or manager that does the creation)
- There is also a recent argument (from ThoughtWorks, for example) that says don’t use PUT, always use POST (and post events instead).
Short answer? There is no short answer! Use your best judgement.
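Two of those rules of thumb (idempotency, and who picks the URL) are easy to see in a few lines of Python. This is only a sketch: the `put` and `post` functions and the `/books` and `/orders` URIs are made up, and the way `post` numbers new resources is the simplest thing that works for a demo.

```python
def put(resources: dict, uri: str, entity: dict) -> str:
    """The client chooses the URI; repeating the same call changes nothing further."""
    resources[uri] = entity
    return uri


def post(resources: dict, factory_uri: str, entity: dict) -> str:
    """The server chooses the URI under the 'factory'; every call adds a resource."""
    uri = f"{factory_uri}/{len(resources) + 1}"  # naive numbering, fine for a demo
    resources[uri] = entity
    return uri


books = {}
for _ in range(3):
    put(books, "/books/42", {"title": "REST"})

orders = {}
order_uris = [post(orders, "/orders", {"book": "/books/42"}) for _ in range(3)]

print(len(books), len(orders), order_uris)
```

Three identical PUTs leave exactly one book, while three identical POSTs leave three orders, each at a URL the client did not know in advance.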
See some useful discussions at this stackoverflow posting.
Less Common Methods
The other 4, lesser-used HTTP verbs are: HEAD, OPTIONS, TRACE and CONNECT.
OPTIONS
A request for information about the capabilities of a server,
e.g. a request for the list of HTTP methods that may be used on this resource.
It would look something like this:
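As a rough sketch (the resource paths and the table of allowed methods are hypothetical, not taken from any real service), a server could answer OPTIONS by listing the permitted verbs in an Allow header:

```python
# Hypothetical table of which methods each resource supports.
ALLOWED = {
    "/books": ["GET", "POST", "HEAD", "OPTIONS"],
    "/books/42": ["GET", "PUT", "DELETE", "HEAD", "OPTIONS"],
}


def handle_options(uri: str) -> str:
    """Build the raw response to an OPTIONS request for the given URI."""
    methods = ALLOWED.get(uri)
    if methods is None:
        return "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"
    return ("HTTP/1.1 200 OK\r\n"
            f"Allow: {', '.join(methods)}\r\n"
            "Content-Length: 0\r\n\r\n")


# The request a client would send, followed by the sketched reply.
print("OPTIONS /books/42 HTTP/1.1\r\nHost: www.example.com\r\n")
print(handle_options("/books/42"))
```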
A somewhat obscure part of the HTTP standard. Potentially useful, but few web services actually seem to make it available.
HEAD
Identical to GET except that the server MUST NOT return a message-body in the response.
Used for obtaining meta-information about the entity implied by the request without transferring the entity-body itself.
Why use it? It is useful for testing links, e.g. for validity and accessibility.
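The relationship between HEAD and GET can be sketched in Python like this. The `handle` function and the `documents` dictionary are invented for the example; the point is only that HEAD returns the same status and headers as GET, minus the body, which is enough for a link checker.

```python
def handle(method: str, uri: str, documents: dict):
    """Return (status, headers, body); HEAD gets the same headers as GET but no body."""
    body = documents.get(uri)
    if body is None:
        return 404, {}, b""
    headers = {"Content-Type": "text/html", "Content-Length": str(len(body))}
    if method == "HEAD":
        return 200, headers, b""
    return 200, headers, body


def link_is_alive(uri: str, documents: dict) -> bool:
    """Check a link without transferring the document itself."""
    status, _, _ = handle("HEAD", uri, documents)
    return status == 200


documents = {"/index.html": b"<html><body>Hello World</body></html>"}
print(handle("HEAD", "/index.html", documents))
print(link_is_alive("/index.html", documents), link_is_alive("/missing.html", documents))
```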
TRACE
Used to invoke a remote, application-layer loop-back of the request message.
In plain English: echoes back the received request so that a client can see what (if any) changes or additions have been made by intermediate servers.
TRACE is often disabled since it can represent a security risk.
CONNECT
CONNECT is for use with a proxy that can dynamically switch to being a tunnel.
HTTP Response codes
|Code|Meaning|Plain English (from user perspective)|
|---|---|---|
|1xx|Informational; indicates a provisional response, e.g. 100|FYI, OK so far and client should continue with the request|
|2xx|Success|It worked|
|3xx|Redirection|Look somewhere else|
|4xx|Client Error|You messed up|
|5xx|Server Error|We messed up|
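Since the class of a response is just the first digit of its code, a client can sort any status into one of the five groups defined by the HTTP specification (1xx to 5xx) with a single integer division. A small Python sketch, using the standard library's `http.HTTPStatus` for the reason phrases (the `status_class` helper is my own):

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Map a status code to its class using the first digit."""
    return STATUS_CLASSES[code // 100]


for code in (100, 200, 301, 404, 500):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```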
Why REST and HTTP?
Because HTTP provides all the characteristics required by REST.
1) Client Server
HTTP is a “protocol in the client-server computing model”, so it meets the first requirement of REST. With HTTP, often the client is a web browser and the server is a piece of software serving content such as Apache, IIS or Nginx. With the “Internet of Things” however, things are becoming less conventional. The client could be your toaster!
2) Stateless
HTTP is a stateless protocol. HTTP servers are not required to keep any information or state between requests.
This can be circumvented by using things like cookies and sessions, but Fielding makes it clear in his dissertation that he strongly disapproves of cookies.
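What statelessness means for a handler can be sketched in a few lines of Python. The request fields (`authorization`, `page`, `page_size`), the token value and the item list are all made up for the example; the point is that the handler keeps nothing between calls, so any server instance can answer any request, in any order.

```python
ITEMS = [f"item-{n}" for n in range(1, 11)]


def handle_list(request: dict):
    """Every request carries everything needed: who is asking and which page they want."""
    if request.get("authorization") != "token-abc":  # hypothetical credential
        return 401, []
    start = (request["page"] - 1) * request["page_size"]
    return 200, ITEMS[start:start + request["page_size"]]


page_two = {"authorization": "token-abc", "page": 2, "page_size": 3}
print(handle_list(page_two))
print(handle_list({"page": 1, "page_size": 3}))
```

There is no "current page" remembered on the server: asking for page 2 twice gives the same answer twice, whether or not page 1 was ever requested.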
3) Cacheable
HTTP supports caching via three basic mechanisms: freshness, validation, and invalidation.
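The three mechanisms can be sketched with a tiny Python cache. This is a heavily simplified model: `TinyCache`, the `origin` function, the fixed 60-second lifetime and the fake clock are all assumptions made for the demo, standing in for what real caches derive from headers such as Cache-Control and ETag.

```python
class TinyCache:
    """Toy cache showing freshness, validation and invalidation."""

    def __init__(self, origin, clock):
        self.origin = origin   # callable: (uri, if_none_match) -> (status, etag, body, max_age)
        self.clock = clock     # callable returning the current time in seconds
        self.entries = {}

    def get(self, uri):
        entry = self.entries.get(uri)
        now = self.clock()
        # Freshness: reuse the stored response while it has not expired.
        if entry and now < entry["expires"]:
            return entry["body"], "fresh hit"
        # Validation: ask the origin whether the stored copy is still good.
        etag = entry["etag"] if entry else None
        status, new_etag, body, max_age = self.origin(uri, etag)
        if status == 304:
            entry["expires"] = now + max_age
            return entry["body"], "revalidated"
        self.entries[uri] = {"etag": new_etag, "body": body, "expires": now + max_age}
        return body, "miss"

    def invalidate(self, uri):
        # Invalidation: e.g. after a PUT or DELETE passes through for this URI.
        self.entries.pop(uri, None)


def origin(uri, if_none_match):
    """Hypothetical origin server with one unchanging representation."""
    etag, body = '"v1"', "Hello World"
    if if_none_match == etag:
        return 304, etag, None, 60
    return 200, etag, body, 60


now = {"t": 0}
cache = TinyCache(origin, clock=lambda: now["t"])
outcomes = [cache.get("/index.html")[1]]   # first request goes to the origin
now["t"] = 30
outcomes.append(cache.get("/index.html")[1])  # still fresh
now["t"] = 90
outcomes.append(cache.get("/index.html")[1])  # stale, but the ETag still matches
cache.invalidate("/index.html")
outcomes.append(cache.get("/index.html")[1])  # entry dropped, fetch again
print(outcomes)
```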
4) Uniform Interface
Using interfaces to decouple a client/caller from the implementation is a common concept in software.
a) Identification of resources
HTTP supports hyperlinks. Anything of interest can be a resource, and those resources can be identified uniquely by a URI.
How do you identify a book?
How do you identify a user?
All resources are identified by a uniform interface – the URI
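As a small illustration in Python, here is one way a service might build URIs for a book and a user. The base address, the `/books/` and `/users/` path layout, and the sample ISBN and username are all hypothetical; REST only asks that each resource has a URI, not that it looks like this.

```python
from urllib.parse import quote, urljoin

BASE = "http://www.example.com/"  # hypothetical service address


def book_uri(isbn: str) -> str:
    """Identify a book by its ISBN."""
    return urljoin(BASE, f"books/{quote(isbn)}")


def user_uri(username: str) -> str:
    """Identify a user by username, percent-encoding unsafe characters."""
    return urljoin(BASE, f"users/{quote(username)}")


print(book_uri("1234567890"))
print(user_uri("jane doe"))
```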
b) Manipulation of resources through these representations
URIs, in conjunction with the HTTP methods, can be used to manipulate resources.
c) Self-descriptive messages
In HTTP, messages can describe themselves using media (MIME) types, status codes, and headers to, for example, indicate their cacheability.
d) Hypermedia as the engine of application state (A.K.A. HATEOAS)
More later! See below.
Uniform Interface, in plain English
OK, that covers what Fielding had to say in his dissertation about Uniform Interfaces, but what does it all mean in plain English?
I mentioned earlier that using interfaces to decouple a client/caller from the implementation is a common concept in software.
Similarly, when designing GUIs, you ideally have a very simple user interface, but one that still allows the user to carry out complex tasks.
Generally, a simple interface that provides the client/user all the capabilities they need while hiding the underlying complexities of the implementations is the ideal goal, but tough to achieve.
But that is exactly what Fielding achieved with REST. The interface is simply a link (or more specifically, a URI), which is about the simplest interface you can think of!
Combine that with the other HTTP capabilities, such as methods and media types, and suddenly you have an incredibly powerful but deceptively simple, and widely understood, method of communicating intentions.
5) Layered System
The idea behind a layered system is that a client doesn’t know (or care) whether it is connected to the end server, or to an intermediary one. This feature can improve scalability via load-balancing and caches etc. Layers may also enforce security policies.
HTTP supports layering via proxy servers and caching.
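Layering can be sketched in Python as handlers that wrap other handlers while exposing the same interface. The names here (`origin_server`, `make_layer`, the "load-balancer" and "cache" labels) are illustrative only, and the layers do nothing except record that the request passed through them.

```python
def origin_server(request: dict) -> dict:
    """The end server that actually produces the representation."""
    return {"status": 200, "body": f"content of {request['uri']}"}


def make_layer(next_layer, name: str, log: list):
    """Wrap a handler in an intermediary that exposes the very same interface."""
    def layer(request: dict) -> dict:
        log.append(name)            # a real layer might cache, balance load or check policy
        return next_layer(request)
    return layer


def client_fetch(server, uri: str) -> dict:
    """The client only ever talks to the component directly in front of it."""
    return server({"method": "GET", "uri": uri})


log = []
stack = make_layer(make_layer(origin_server, "cache", log), "load-balancer", log)
direct = client_fetch(origin_server, "/index.html")
layered = client_fetch(stack, "/index.html")
print(direct == layered, log)
```

The client code is identical in both cases and gets an identical answer; only the log reveals that two intermediaries were involved.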
Hypermedia as the engine of application state (HATEOAS)
Clients know a few simple fixed entry points to the application but have no knowledge beyond that. Instead, they transition between states by using those links, and the links they lead to. In other words, state transitions are driven by the client based on options the server presents.
If you think of Hypermedia as simply links, then “Hypermedia as the engine of application state” is simply using the links you discover to navigate (or transition state) through the application.
And remember that it doesn’t need to be a user clicking on links; it can just as easily be another software component that is initiating the state transitions.
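Here is a minimal Python sketch of a software client doing exactly that. The `PAGES` dictionary plays the role of a server, and its URIs and link names ("books", "first", "buy") are invented for the example. The client is given only the entry point and the names of the links it wants to follow; it never constructs a URI itself.

```python
# Hypothetical in-memory "server": every representation carries its own links.
PAGES = {
    "/": {"title": "Home", "links": {"books": "/books"}},
    "/books": {"title": "All books", "links": {"first": "/books/1", "home": "/"}},
    "/books/1": {"title": "A book", "links": {"collection": "/books", "buy": "/orders"}},
    "/orders": {"title": "Order placed", "links": {"home": "/"}},
}


def fetch(uri: str) -> dict:
    """Stand-in for an HTTP GET returning a representation with links."""
    return PAGES[uri]


def navigate(entry_point: str, link_names: list) -> list:
    """Follow named links from the entry point, one state transition per link."""
    uri = entry_point
    visited = [uri]
    for name in link_names:
        uri = fetch(uri)["links"][name]   # the server's response decides where we can go
        visited.append(uri)
    return visited


path = navigate("/", ["books", "first", "buy"])
print(path)
```

If the server later moves orders to a different URI, this client keeps working as long as the "buy" link is still offered.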
To quote from Fielding himself:
“Representational State Transfer is intended to evoke an image of how a well-designed Web application behaves:
a network of web pages (a virtual state-machine),
where the user progresses through an application by selecting links (state transitions),
resulting in the next page (representing the next state of the application)
being transferred to the user and rendered for their use.”
What is REST?
- Pretty URLs?
- An alternative to SOAP or RPC?
Really it is an architectural style, or a set of constraints, that captures the fundamental principles that underlie the Web.
The emphasis of REST is on simplicity, and utilizing the power of the existing web technologies and standards such as HTTP and URI.
- Uniform interfaces: All resources are identified by URIs
- HTTP Methods: All resources can be created/accessed/updated/deleted by standard HTTP methods
- Stateless: No client session state is kept on the server
Terminology
Let’s define some useful terminology that is relevant in any discussion of REST.
Architecture
Wikipedia: Software architecture refers to the high level structures of a software system, the discipline of creating such structures, and the documentation of these structures. The architecture of a software system is a metaphor, analogous to the architecture of a building.
Fielding: A software architecture is an abstraction of the run-time elements of a software system during some phase of its operation 
Fowler: Architecture is a shared understanding of the system design, including how the system is divided into components and how the components interact through interfaces. 
Architectural style
Fielding: An architectural style is a named, coordinated set of architectural constraints that restricts the roles and features of architectural elements
Taylor, Medvidović and Dashofy: An architectural style is a named collection of architectural design decisions that (1) are applicable in a given development context, (2) constrain architectural design decisions that are specific to a particular system within that context, and (3) elicit beneficial qualities in each resulting system
REST or RESTful?
What is the difference between the terms REST and RESTful? From what I have read, there is not a lot of difference. We know that REST is an architectural style for distributed software. Services conforming to that architectural style, that is, conforming to the REST constraints, are referred to as being ‘RESTful’.
Or to put it another way: REST is a noun, RESTful is an adjective.
Hypertext
In plain English: Hypertext is text with links.
Wikipedia: Hypertext is text displayed on a computer display or other electronic devices with references (hyperlinks) to other text which the reader can immediately access, or where text can be revealed progressively at multiple levels of detail.
Roy Fielding: The simultaneous presentation of information and controls such that the information becomes the affordance through which the user obtains choices and selects actions [slide #50]
Hypermedia
In plain English: Interactive multimedia. If you see a booth at a mall with video, sound etc., that is multimedia. If you can interact with it (click links, or control the content using buttons or the like), it is hypermedia.
Wikipedia: Hypermedia, an extension of the term hypertext, is a nonlinear medium of information which includes graphics, audio, video, plain text and hyperlinks.
Roy Fielding: Hypermedia is defined by the presence of application control information embedded within, or as a layer above, the presentation of information. 
Resource
In plain English: A resource can be anything real, but typical examples would be files, web pages, customers, accounts etc.
Wikipedia: any physical or virtual component of limited availability within a computer system.
Roy Fielding: Any information that can be named can be a resource: a document or image, a collection of other resources, a non-virtual object (e.g. a person). In other words, any concept that might be the target of an author’s hypertext reference must fit within the definition of a resource. 
REST in practice: A resource is anything we expose to the Web, from a document or video clip to a business process or device. From a consumer’s point of view, a resource is anything with which that consumer interacts while progressing toward some goal.
URI – Uniform Resource Identifier
Wikipedia: a string of characters used to identify a name of a resource
W3: Uniform Resource Identifiers (URIs, aka URLs) are short strings that identify resources in the web: documents, images, downloadable files, services, electronic mailboxes, and other resources.
What is the difference between a URI and URL?
The difference between a URI and a URL is subtle, and I don’t think terribly important. A URI identifies a resource either by location and/or a name. A URI does not have to specify the location of a specific representation. If it does, it is also a URL.
“A Uniform Resource Locator (URL) is a subset of the Uniform Resource Identifier (URI) that specifies where an identified resource is available and the mechanism for retrieving it”.
So all URLs are URIs, but not all URIs are URLs. URIs can also be URNs (Uniform Resource Names).
Or: URLs and URNs are special forms of URIs.
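If you want to see the distinction in code, Python's standard `urllib.parse.urlparse` can split a URI into its parts. The `classify` function below is my own crude heuristic, written only to illustrate the idea: a "urn" scheme means a URN, a scheme plus a network location means a URL, and anything else is just reported as some other kind of URI. The sample identifiers are made up.

```python
from urllib.parse import urlparse


def classify(uri: str) -> str:
    """Crude illustration: name-based (URN), location-based (URL), or other."""
    parts = urlparse(uri)
    if parts.scheme.lower() == "urn":
        return "URN"
    if parts.scheme and parts.netloc:
        return "URL"
    return "other URI"


print(classify("http://www.example.com/index.html"))
print(classify("urn:isbn:1234567890"))
```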
For the most part, I think you can think of URIs and URLs as being the same thing. I may be flamed for saying that, but it keeps things simpler!
Sources, references, bibliography
1. Architectural Styles and the Design of Network-based Software Architectures (Fielding, 2000)
2. A little REST and relaxation (Fielding)
3. Who Needs An Architect? (Fowler)
4. Software Architecture: Foundations, Theory, and Practice; R. N. Taylor, N. Medvidović and E. M. Dashofy. Wiley, 2009.
5. Representational state transfer (Wikipedia)
6. REST in practice (Webber; Parastatidis; Robinson)