Or, even better, why your web framework should not adopt a CGI-based API.
For the past few years I have been closely following the development of several emerging languages, with a special focus on web frameworks and servers. Unfortunately, most of the new web frameworks follow the Rack/WSGI specification, which may be a mistake depending on the platform you are targeting. This is particularly true for Erlang and Node.js, which have very strong streaming foundations that are part of their stack by default.
This blog post is an attempt to detail the limitations in the Rack/CGI-based APIs that the Rails Core Team has found while working with the streaming feature that shipped with Rails 3.1 and why we need better abstractions in the long term.
Case in study
The use case we have in mind here is streaming. In Rails, we have focused on streaming as a way to quickly return the head of the HTML page to the browser, so the browser can start downloading assets (like JavaScript and stylesheets) while the server is generating the rest of the page. There is a great entry on the Rails weblog about streaming in general and a Railscast if you want to focus on how to use it in your Rails applications. However, streaming is not limited to HTML responses. It can also be useful in API endpoints, for example, to stream search results as they become available or to synchronize with mobile devices.
The Rack specification in a nutshell
In Rack, the response is made by an array with three elements: [status, headers, body]
The body can be any object that responds to the method each. This means streaming can be done by passing an object that will, for example, lazily read a file and stream chunks when each is called.
A Rack application is any object that implements the method call and receives an environment hash/dictionary with the request information. When I said above that most new web frameworks are following the Rack specification, it is because they are introducing an API similar to the one just described.
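As a minimal sketch of the description above (the class names `ChunkedBody` and `HelloApp` are made up for illustration, and a `StringIO` stands in for a file), a Rack-style application with a lazy body could look like this:

```ruby
require "stringio"

# Hypothetical lazy body: reads an IO in fixed-size chunks only when the
# server calls `each`, so nothing is read at the time the app returns.
class ChunkedBody
  def initialize(io, chunk_size = 4)
    @io = io
    @chunk_size = chunk_size
  end

  # Rack only requires that the body responds to `each` and yields strings.
  def each
    while (chunk = @io.read(@chunk_size))
      yield chunk
    end
  end
end

# A Rack application: any object that responds to `call(env)` and returns
# the [status, headers, body] triple.
class HelloApp
  def call(env)
    body = ChunkedBody.new(StringIO.new("Hello, #{env["PATH_INFO"]}!"))
    [200, { "Content-Type" => "text/plain" }, body]
  end
end

# Stand-in for what a server does: call the app, then iterate over the body.
status, headers, body = HelloApp.new.call("PATH_INFO" => "/world")
chunks = []
body.each { |chunk| chunks << chunk }
```

Note that the application has already returned by the time the first chunk is read, which is the property the rest of this post depends on.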
The issue
In order to understand the issue, we will consider three entities: the client, the server and the application. The client (for example a browser) sends a request to the server which forwards it to an application. In this case, the server and the application are communicating via the Rack API.
The issue in streaming cases is that sending a response back from the application does not mean the application has finished processing. For example, consider a middleware (an object that sits between the server and our application) that opens a connection to the database for the duration of the request and cleans it up afterwards:
def call(env)
  connection = DB.checkout_connection
  env["db.connection"] = connection
  @app.call(env)
ensure
  DB.checkin_connection connection
end
Without streaming, it would work as follows:
- The server receives a request and passes it down the stack
- The request reaches the middleware
- The middleware checks out the connection
- The application is invoked, renders a view accessing the database using the connection and returns the rendered view as a string
- The middleware checks the connection back in
- The response is sent back to the client
With streaming, this would happen:
- The server receives a request and passes it down the stack
- The request reaches the middleware
- The middleware checks out the connection
- The app is called but does not render anything. Instead it returns a lazy object as response body that will stream the HTML page in chunks as the `each` method is called
- The middleware checks the connection back in
- Back in the server, we have received the lazy body and will start to stream it
- While streaming the body, since the body is lazily calculated, now is the time it must access the database. But, since the middleware already checked the connection back in, our code will fail with a “not connected” exception
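The streaming sequence above can be reproduced in a few lines. This is only a sketch: `Connection` and `DB` are in-memory stand-ins written for this example (they mirror the names used in the middleware but are not a real library), and an `Enumerator` plays the role of the lazy body.

```ruby
# Hypothetical stand-in for a database connection.
class Connection
  attr_accessor :open

  def initialize
    @open = true
  end

  def query(sql)
    raise "not connected" unless open
    "result of #{sql}"
  end
end

# Hypothetical stand-in for the connection pool used by the middleware.
module DB
  def self.checkout_connection
    Connection.new
  end

  def self.checkin_connection(connection)
    connection.open = false
  end
end

class ConnectionMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    connection = DB.checkout_connection
    env["db.connection"] = connection
    @app.call(env)
  ensure
    DB.checkin_connection(connection)
  end
end

# The app returns immediately; the query only runs once `each` is called.
lazy_app = lambda do |env|
  body = Enumerator.new do |stream|
    stream << "<head>...</head>"
    stream << env["db.connection"].query("SELECT 1")
  end
  [200, { "Content-Type" => "text/html" }, body]
end

status, headers, body = ConnectionMiddleware.new(lazy_app).call({})

sent = []
error = nil
begin
  body.each { |chunk| sent << chunk } # the "server" streams the body
rescue RuntimeError => e
  error = e.message
end
```

In this sketch the head of the page reaches `sent`, and the second chunk fails because the `ensure` clause ran before the body was iterated.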
The first reaction to solve this issue is to ensure that all streaming happens inside the application, i.e. the application would have a mechanism to stream the response back and would return the Rack response only when it is done. However, if the application does this, any middleware that needs to modify the headers or the response body won’t be able to do so because the response was already streamed from inside the application.
Our work-around for Rails was to create proxies that wrap the response body:
def call(env)
  connection = DB.checkout_connection
  env["db.connection"] = connection
  response = @app.call(env)
  ResponseProxy.new(response).on_close do
    DB.checkin_connection connection
  end
end
However, this is inefficient and extremely limited (not all middleware can be converted to such an approach). In order for streaming to be successful, the underlying server API needs to acknowledge that the headers and the response body can be sent at different times. Not only that, it needs to provide proper callbacks around the response life-cycle (before sending headers, when the response is closed, on each stream, etc).
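To make the proxy idea concrete, here is one way `ResponseProxy` could be written. The name comes from the example above, but the implementation is an assumption and not the code used in Rails. It relies on the part of the Rack specification that says a server calls `close` on the body after iteration if the body responds to it.

```ruby
# Hypothetical sketch of ResponseProxy; the implementation is an assumption.
class ResponseProxy
  # Wraps the original body and runs a callback once the server closes it.
  class Body
    def initialize(body, callback)
      @body = body
      @callback = callback
    end

    def each(&block)
      @body.each(&block)
    end

    def close
      @body.close if @body.respond_to?(:close)
    ensure
      @callback.call
    end
  end

  def initialize(response)
    @status, @headers, @body = response
  end

  # Returns a regular Rack triple whose body triggers the block on close.
  def on_close(&callback)
    [@status, @headers, Body.new(@body, callback)]
  end
end

events = []
response = [200, {}, ["chunk 1", "chunk 2"]]
status, headers, body = ResponseProxy.new(response).on_close { events << :checked_in }

# What a Rack server does: iterate, then call `close` if the body responds to it.
body.each { |chunk| events << chunk }
body.close
```

Every middleware that needs cleanup adds one more wrapper object around the body, which hints at why this approach does not scale well across a whole stack.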
The trade-off here is that this can no longer be achieved with an API as simple as Rack’s. In general, we would like to have a response object that provides several life-cycle hooks. For example, the middleware above could be rewritten as:
def call(request, response)
  connection = DB.checkout_connection
  request.env["db.connection"] = connection
  response.on_close { DB.checkin_connection(connection) }
  @app.call(request, response)
end
The Java Servlet specification is a good example of how request and response objects could be designed to provide such hooks.
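As a rough illustration of what such a response object might look like, the sketch below sends headers lazily and exposes two hooks. Nothing here is an existing Rack API: the `Response` class, the hook names `before_headers` and `on_close`, and the use of a `String` in place of a socket are all assumptions made for the example.

```ruby
# Hypothetical response object with life-cycle hooks.
class Response
  attr_reader :headers
  attr_accessor :status

  def initialize(socket)
    @socket = socket
    @status = 200
    @headers = {}
    @headers_sent = false
    @before_headers = []
    @on_close = []
  end

  def before_headers(&hook)
    @before_headers << hook
  end

  def on_close(&hook)
    @on_close << hook
  end

  # Headers go out lazily, right before the first chunk of the body.
  def write(chunk)
    send_headers unless @headers_sent
    @socket << chunk
  end

  def close
    send_headers unless @headers_sent
    @on_close.each(&:call)
  end

  private

  def send_headers
    @before_headers.each { |hook| hook.call(self) }
    @socket << "HTTP/1.1 #{@status}\r\n" # simplified status line
    @headers.each { |name, value| @socket << "#{name}: #{value}\r\n" }
    @socket << "\r\n"
    @headers_sent = true
  end
end

socket = +"" # a mutable String stands in for the client socket
log = []
response = Response.new(socket)
response.before_headers { |res| res.headers["X-Streamed"] = "true" }
response.on_close { log << :connection_checked_in }
response.write("<head>...</head>")
response.write("<body>...</body>")
response.close
```

With this shape, a middleware can still change headers up to the moment the first chunk is written, and cleanup runs only after the last chunk.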
Other middleware
In the example above I focused on the database connection middleware, but this limitation exists, in one way or another, in the majority of middleware in a stack. For example, a middleware that rescues any exception inside the application to render a 500 page also needs to be adapted. Other middleware simply won’t work. For instance, Rails ships with a middleware that provides an ETag header based on the body, which has to be disabled when streaming.
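The ETag case is easy to see in code. The middleware below is a simplified, hypothetical version written for this post (it is not the one Rails ships): because the digest depends on every byte of the body, the whole body has to be generated before the headers can be returned.

```ruby
require "digest/md5"

# Hypothetical ETag middleware: it must consume the entire body to compute
# the header, which defeats streaming.
class BufferingETag
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    buffered = []
    body.each { |chunk| buffered << chunk } # forces the lazy body to run now
    body.close if body.respond_to?(:close)
    etag = %("#{Digest::MD5.hexdigest(buffered.join)}")
    [status, headers.merge("ETag" => etag), buffered]
  end
end

generated = []
streaming_app = lambda do |_env|
  body = Enumerator.new do |stream|
    %w[first second].each do |chunk|
      generated << chunk # records when each chunk is actually produced
      stream << chunk
    end
  end
  [200, {}, body]
end

status, headers, body = BufferingETag.new(streaming_app).call({})
# At this point every chunk has been generated, before the server sent anything.
```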
Looking back
Does this mean moving to Rack was a mistake? Not at all. Rack appeared when the web development Ruby community was fragmented and the simplicity of the Rack API made it possible to unify the different web frameworks and web servers available. Looking back, I would take the standardization provided by Rack any day regardless of the limitations it brings. Now that we have a standard, we are already working on addressing such issues, which leads us to…
Looking forward
Streaming will become more and more important. While working with HTML streaming requires special attention both technically and also in terms of usability, as outlined in Rails’ documentation, API endpoints could benefit from it with basically no extra cost. Not only that, features in HTML5 like server-sent events could easily be built on top of streaming without requiring a specific end-point in your application to handle them.
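For instance, a server-sent events endpoint is little more than a streamed body whose chunks follow the event stream format (a `data:` line followed by a blank line). The sketch below assumes a hypothetical `EventStream` wrapper and a fixed list of events; a real endpoint would wait for new data instead.

```ruby
# Hypothetical body that formats each item as a server-sent event.
class EventStream
  def initialize(events)
    @events = events
  end

  def each
    @events.each { |event| yield "data: #{event}\n\n" }
  end
end

ticker = lambda do |_env|
  events = Enumerator.new do |stream|
    3.times { |i| stream << "tick #{i}" } # a real app would wait for new data here
  end
  headers = { "Content-Type" => "text/event-stream", "Cache-Control" => "no-cache" }
  [200, headers, EventStream.new(events)]
end

status, headers, body = ticker.call({})
frames = []
body.each { |frame| frames << frame }
```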
While CGI was originally streaming friendly, the abstractions we built on top of it (like middleware) are not. I believe web frameworks should be moving towards better connection/socket abstractions and away from the old CGI-based APIs. They served us well, but it is finally time for us to let them go.
PS: Thanks to Aaron Patterson (who has also written about this issue in his blog), Yehuda Katz, Konstantin Haase and James Tucker for early review and feedback.
F.A.Q.
This section was added after the blog post was released based on some common questions.
Q: Isn’t it a bad idea to mix both streaming and non-streaming behavior in the same stack?
That depends on the stack. This is definitely not an issue with Erlang and Node.js since both stacks are streaming based. In Ruby, I believe a threaded JRuby or Thin will allow you to get away with keeping a socket open waiting for responses, but it will probably turn out to be a bad idea with other servers since the process holding the socket won’t be able to respond to any other request.
Q: Is there a need to do everything streaming based when a request/response would be fine?
No, there is no need. The point of the blog post is not to advocate for streaming-only frameworks, but simply to state that a Rack API may severely limit your streaming capability in case your platform supports it. Personally, I would like to be able to choose and mix both, if my stack allows me to do so.
Yes, it was clear that current Rack API is not suitable for streaming, websockets, etc. But are there any concrete plans? Rewrite Rack and drop compatibility with current middlewares?
I use Rack::Static, :urls=>{…} to identify cases where I just need a file delivered statically—perhaps there should be something like Rack::Stream, :urls=>{…}
Placing streaming and action-based webapps in the same basket is dangerous. You can’t just “enable streaming” and hope it will work for all apps at the same time. Building a streaming/always connected solution requires a massive change to the way the apps are built, just look at NodeJS in JavaScript, BlueEyes/Spray in Scala, they are built in a completely different way and that’s how it should be done.
Hiding the details and the pros and cons of building a streaming webapp behind a “common platform” using an action-based framework is just going to create new problems for developers. If you want streaming, you need to build your app from the ground up with that in mind.
Also, on common APIs, the Java Servlet API just added support for streaming/websockets apps using a specific (and different) request cycle, separate from the common request/response cycle of the common Java Servlet.
Jose and Aaron are spot on that this needs to change, the sooner the better. It is a huge hole in Rails concurrency. As mentioned in the post, Java does a lot right here, but so does node.js. If you look at how connect implements middleware (http://www.senchalabs.org/connect/), it has 2 key features that make anything on the radar today possible:
1) request and response can pub/sub events (on::data, on::end, etc).
2) middleware app has to explicitly call next for the chain to continue.
The coolest part about #1 is middleware can inject functionality to happen later, this is how they implement x-runtime:
module.exports = function responseTime(){
  return function(req, res, next){
    var start = new Date;
    if (res._responseTime) return next();
    res._responseTime = true;
    res.on('header', function(header){
      var duration = new Date - start;
      res.setHeader('X-Response-time', duration + 'ms');
    });
    next();
  };
};
> You can’t just “enable streaming” and hope it will work for all apps at the same time.
Yes, this is correct. However, I disagree with other points in your comment. First, I wouldn’t put streaming and always connected/websockets in the same bucket. The latter certainly can’t be mixed and that’s why I haven’t mentioned it in the blog post.
Also, the blog post is focusing at the framework infrastructure layer and there is absolutely no need to have two completely different infrastructures, as they certainly share a lot. After all, an action based app is simply a streaming application that streams just once (the headers + body at once). The problem is that the Rack API limits us on this subset scenario, making it hard for us to properly support streaming for those who need to.
Are you advocating that Rack evolve to support streaming with the callback hooks you describe in your final code example, or do you think we need something entirely new?
I believe Rack should evolve. We can’t afford to go back to the state we had before Rack.
So Python has a similar spec called WSGI that solves/helps this problem in a couple ways.
First, for some quick background: When a WSGI app is called, the function takes two arguments: the “environ” for the request and a “start_response” callable. The app then returns an iterator object, which the server iterates over until the end of the response. It’s the app author’s job to call the start_response callable no later than the first iteration of the iterator.
1. The generation of the HTTP status and headers is decoupled from the response body. This is accomplished by using the start_response callable and decoupling the call to it from returning the response body iterator.
2. It’s possible in WSGI to have a middleware that wraps an app delegate its own .next() call (.each in Ruby) to the underlying app it wraps. When the server calls the middleware (which wraps the app) it gets back the middleware’s iterator rather than the app’s iterator. The middleware can then be notified when the app is done iterating a response (via a StopIteration exception) and then it can do the right thing, i.e. in your example, check in the DB connection.
Hope that helps!
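A loose Ruby analogue of the second point in the comment above: the middleware hands the server its own body, delegates `each` to the wrapped one, and runs its cleanup when iteration ends. The class names `IterationHook` and `CheckinAfterIteration` are hypothetical, and a log array stands in for the connection pool.

```ruby
# Hypothetical wrapper: the cleanup runs when the wrapped body has finished
# iterating, or if iteration raises.
class IterationHook
  def initialize(body, &after_iteration)
    @body = body
    @after_iteration = after_iteration
  end

  def each(&block)
    @body.each(&block)
  ensure
    @after_iteration.call
  end
end

class CheckinAfterIteration
  def initialize(app, log)
    @app = app
    @log = log
  end

  def call(env)
    @log << :checkout
    status, headers, body = @app.call(env)
    # The server receives the middleware's body rather than the app's.
    [status, headers, IterationHook.new(body) { @log << :checkin }]
  end
end

log = []
app = lambda do |_env|
  body = Enumerator.new do |stream|
    log << :query # the lazy body touches the "database" here
    stream << "row"
  end
  [200, {}, body]
end

_status, _headers, body = CheckinAfterIteration.new(app, log).call({})
body.each { |chunk| log << chunk }
```

Unlike a `close` callback, this relies on the server iterating the body exactly once.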
A very simple example: streaming a large file (more than 100 MB). You can’t possibly do this in an action-based framework without magic (like X-Sendfile), simply because you would never want to tie up a thread doing expensive IO operations on both your disk and network. You would select an efficient async implementation that would only write to the socket if there is a buffer available for it.
Another example, reverse proxy, you just can’t write an efficient reverse proxy implementation using the usual request-response cycle inside of a thread, again due to the IO constraints.
So, while it’s perfectly possible to build an async backend and make it look synchronous for clients, it’s a serious waste of resources and could lead to subtle, hard-to-pinpoint bugs.
That’s why I really don’t believe having a single point of entry for both solutions would work. The abstractions would easily leak to the top level layers (the app layer) and people would be struggling to understand what’s going wrong. The Java servlet model is possibly the best: if you want to go request-response directly, just stay where you are; if you want to do streaming or async solutions, pick the new async API and enjoy it.
If there is code that can be shared between both implementations, awesome, share it, but don’t force a single model on top of two very different solutions to the problem.
I think we are agreeing then. This is very platform specific, so when running on Node.js or Erlang (or Thin in Ruby) the backend is inherently async and the cost you mention is quite low (which was what I was arguing). But you are totally correct that, depending on your platform, the cost may not be worth it and you should grab a different stack.
I think this is an oversimplification of the solution for which there is no problem. There are a number of use-cases for which streaming is not the answer. I can probably come up with more antipatterns than patterns.
Richard, thanks for the comment. Can you please be a bit more specific?
> I think this is an oversimplification of the solution for which there is no problem.
What is an oversimplification? Which solution (the middleware example given or a Rack API alternative)?
> There are a number of use-cases for which streaming is not the answer.
Agreed. I didn’t say at any moment that streaming is the solution to all problems. In fact, streaming html responses leads to a number of complications, as linked in the post above.
Yeah, I saw your gist earlier. It would be cool if Rack evolved its API while providing a compatibility layer for “non-streaming” apps (in Rails’ terms).
> In general, we would like to have a response objects that provides several life-cycle hooks
I seem to recall saying the same thing to Aaron at some conference a year or so ago. Rack is a great simplifying abstraction, but there’s a limit to what you can do with just one event hook (“on_request”, effectively).