Speeding up that first screenful of data
Website optimization for complex websites
Jennifer Simonds

The web is becoming a slower and slower place. As both content sites and storefronts compete to serve up more sophisticated information and ads, targeted to their visitors in ever more sophisticated ways, the backend tiers consist of an ever-increasing number of data servers, all trying to serve up these page fragments in a scalable way. Unfortunately, with this plethora of servers comes a risk of severely slowing down the response time for a given webpage if any one data server happens to be running slow at that moment. This can destroy the user experience if the website isn't designed with this risk in mind.

How many times have you come upon a webpage with important content, only to see a few graphical elements show up, and perhaps a menu or two... and then nothing more for several long seconds... until finally the article or search result comes up, along with the ad that your browser has obviously been patiently waiting for the ad server to cough up before rendering the rest of the page?

I assume you've already made sure your static content is cached properly, and perhaps you're even compressing your JavaScript includes to minimize the number of bytes sent over the wire. But these basic steps don't help much if your pages contain a lot of customized content that must be calculated anew every time a page is requested.

So let's examine some very simple steps you can take to improve your users' experience by speeding up the response time for your pages' essential content. You can improve the perceived speed of the secondary items as well. Note that these optimizations don't touch the backend data servers, nor do they involve major restructuring of the webserver itself.
The simple case

The simplest way to serve up a webpage with both primary and secondary content is to have the webserver directly call all of the data servers or make queries to the database itself, one after the other, when it gets the HTTP request.
When it comes to displaying the webpage quickly, this synchronous approach is completely at the mercy of a slow backend server or database query.

Critical path (essential data): Every data item up to the last essential item in the markup
Critical path (secondary data): Every data item

Here's an example page that demonstrates the results of using this approach. NOTE: All these examples contain some static content, plus two essential and six secondary page fragments, each of which takes 1 second to get served.

Example 0: Synchronous data requests
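To make the critical-path arithmetic concrete, here is a minimal sketch of the synchronous case. It only models the timing; the fragment names and the markup order are assumptions made up for illustration, while the 1-second latency and the two-essential, six-secondary mix come from the note above.

```python
def synchronous_critical_path(fragments):
    """Model a webserver that fetches every fragment one after the other.

    fragments: list of (name, is_essential, seconds) in markup order.
    Returns (seconds until the last essential item, seconds until the full page).
    """
    elapsed = 0.0
    essential_done = 0.0
    for _name, is_essential, seconds in fragments:
        elapsed += seconds  # each request waits for the previous one to finish
        if is_essential:
            essential_done = elapsed
    return essential_done, elapsed


# Hypothetical markup order: essential items are interleaved with ads and widgets.
PAGE = [
    ("banner_ad", False, 1.0),
    ("nav_widget", False, 1.0),
    ("article", True, 1.0),
    ("side_ad", False, 1.0),
    ("search_results", True, 1.0),
    ("related_links", False, 1.0),
    ("poll_widget", False, 1.0),
    ("footer_ad", False, 1.0),
]

essential_seconds, total_seconds = synchronous_critical_path(PAGE)
print(essential_seconds, total_seconds)
```

In this assumed ordering the visitor waits for five fragments before the last essential item can appear, and for all eight before the page is complete.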
First optimization: Rearranging the HTML markup

You can ameliorate the problem somewhat by placing the essential page fragments, such as the main content and perhaps the one or two most important widgets, before the secondary data in the HTML markup. In this case, you can put the secondary items in their own absolutely-positioned DIVs, and then specify their positions in the stylesheet as being above or to the side of the main content as necessary for your layout. As long as the essential content comes before the secondary items in the HTML, there's a good chance the essential data will get rendered before the browser stops to wait for the webserver to send back the rest of the page.
Check out Example 1, below. As you can see, depending on the browser, the essential data items may get rendered well before the page is fully loaded. (The worst performer in these first two tests was Internet Explorer. Neither IE 6 nor IE 7 displayed the bulk of the main content section until the page was fully loaded. However, Firefox 2 and Opera 7 both displayed the main content well before the end.)

Critical path (essential data): Every essential data item
Critical path (secondary data): Every data item

Example 1: Synchronous data requests, optimized page layout
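The reordering can be sketched as a page generator that emits the essential fragments first and leaves the visual placement to the stylesheet. This is only an illustration: the element ids, the CSS values, and the fetch callables are hypothetical, and a real webserver would stream these chunks to the browser as they are produced.

```python
# Hypothetical stylesheet: secondary items are lifted out of the normal flow,
# so their position on screen no longer depends on their position in the markup.
STYLESHEET = """
#article { margin-top: 120px; }
.secondary { position: absolute; }
#banner { top: 0; left: 0; }
"""


def render_page(essential, secondary):
    """Yield HTML chunks with every essential fragment ahead of the secondary ones.

    essential, secondary: lists of (element_id, fetch) pairs, where fetch is a
    callable that returns the fragment's HTML (and may be slow).
    """
    yield f"<html><head><style>{STYLESHEET}</style></head><body>\n"
    for element_id, fetch in essential:
        yield f'<div id="{element_id}">{fetch()}</div>\n'
    for element_id, fetch in secondary:
        yield f'<div id="{element_id}" class="secondary">{fetch()}</div>\n'
    yield "</body></html>\n"


ESSENTIAL = [("article", lambda: "<p>Main story</p>")]
SECONDARY = [("banner", lambda: "<p>Ad goes here</p>")]

html = "".join(render_page(ESSENTIAL, SECONDARY))
print(html)
```

The banner still appears at the top of the screen, but because its DIV comes last in the markup, a slow ad server no longer holds up the bytes of the article.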
Second optimization: IFRAMEs

A relatively simple yet powerful optimization is to place the secondary page fragments in IFRAMEs. Each IFRAME has a SRC= attribute that represents a direct query of a backend server. The IFRAME's URL contains a session or customer identifier of some kind so that the server can build its customized ad or widget for that visitor. Most third-party ad servers will keep their own cookie so they can track the visitor as they visit the ad servers' various client sites.
As the initial page is being parsed by the browser, each IFRAME it encounters spawns a request for a separate secondary page fragment, resulting in the secondary items being requested in parallel. Ideally these IFRAMEs should be placed after the important content in the HTML, as in the first optimization above, although as you can see in Examples 2a vs. 2b, the major gain in pageload speed comes from simply using the IFRAMEs.

With this approach, only the essential data items are requested directly by the webserver, so the initial page and its content are now limited only by the slowest server of essential data. As you can see in the example below, using IFRAMEs speeds things up considerably, even in IE. From here on out, IE is every bit as fast in bringing up the essential data items as Firefox and Opera.

The disadvantage of this method is that the servers of the secondary data don't even see a request until the essential data items are being displayed in the browser. So the user might not see a secondary item from a slow data server until a significant time after they've already waited for the essential content to appear.

Critical path (essential data): Every essential data item
Critical path (secondary data): Every essential data item plus the slowest secondary data item

Example 2a: Secondary requests come from IFRAMEs
Example 2b: Secondary requests come from IFRAMEs w/optimized layout
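As a rough sketch of what the webserver emits under this approach, the helper below builds one IFRAME per secondary fragment, with the visitor's identifier in the query string. The host names, the "session" parameter, and the dimensions are all hypothetical; substitute whatever your data servers actually expect.

```python
from html import escape
from urllib.parse import urlencode


def iframe_tag(base_url, session_id, width, height):
    """Build an IFRAME whose SRC queries a backend server directly."""
    src = f"{base_url}?{urlencode({'session': session_id})}"
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'width="{width}" height="{height}" frameborder="0"></iframe>'
    )


def secondary_frames(session_id, fragments):
    """fragments: list of (base_url, width, height), one per secondary item."""
    return "\n".join(
        iframe_tag(url, session_id, width, height) for url, width, height in fragments
    )


# Hypothetical secondary fragments for one visitor.
print(secondary_frames("abc123", [
    ("https://ads.example.com/banner", 728, 90),
    ("https://widgets.example.com/weather", 200, 200),
]))
```

The webserver itself never waits on these servers; the browser issues one request per IFRAME as it parses the tags.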
Third optimization: IFRAMEs displaying pre-staged data

We can improve on the response time for the secondary data items even more by having the webserver indirectly request the secondary data before requesting the essential data, and storing the secondary data for the IFRAMEs to request later on in the page load. (Thus the secondary data is "pre-staged" for immediate retrieval later.)
Each IFRAME spawns a request for a separate secondary page fragment as before. But since the webserver has already submitted a request for all of the secondary data items before the essential items, most secondary data items will already be sitting in a temporary table, available immediately. And for the data items that are very slow in coming, the perceived delay in their appearing on screen is greatly reduced, because their servers have been working on the data in parallel from the beginning of the page request, while the visitor has been busy watching the essential data appear on the page.

Another advantage of this design: To the backend data servers, it still looks like just another series of requests from the webserver. There should be little or no redesign needed in the backend tiers to see big improvements in the user experience.

Critical path (essential data): Every essential data item
Critical path (secondary data): Either every essential item or the single slowest secondary item, whichever is slower.

Example 3: Secondary requests for pre-staged data come from IFRAMEs
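Here is a minimal sketch of the pre-staging idea, assuming a thread pool plays the part of the middle tier and an in-memory dictionary stands in for the temporary table. The class, the function names, the "/fragment" URL, and the simulated latencies are all hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(name, seconds=0.05):
    """Stand-in for a call to a backend data server."""
    time.sleep(seconds)
    return f"<p>{name}</p>"


class StagingStore:
    """In-memory stand-in for the temporary table of pre-staged fragments."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=8)
        self._staged = {}

    def prestage(self, request_id, names):
        # Start every secondary request right away, without waiting for any of them.
        for name in names:
            self._staged[(request_id, name)] = self._pool.submit(fetch, name)

    def get(self, request_id, name):
        # Called when an IFRAME asks for its fragment. This blocks only if the
        # backend server has not answered yet.
        return self._staged.pop((request_id, name)).result()


def handle_page_request(store, request_id, essential, secondary):
    store.prestage(request_id, secondary)            # 1. secondary data starts first
    body = "".join(fetch(name) for name in essential)  # 2. essential data, synchronously
    frames = "".join(
        f'<iframe src="/fragment?req={request_id}&amp;name={name}"></iframe>'
        for name in secondary
    )
    return body + frames


store = StagingStore()
page = handle_page_request(store, "r1", ["article", "results"], ["ad1", "ad2"])
print(page)
print(store.get("r1", "ad1"))  # what the IFRAME's request would receive
```

By the time the browser parses the IFRAME tags, the secondary servers have had the whole essential-data phase to respond, so most calls to get() return immediately.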
Alternate third optimization: AJAX displaying pre-staged data

If you'd rather not use IFRAMEs to generate requests for the pre-staged data, you can also use AJAX to do the job:
In this approach, at the end of the onLoad handler we send an AJAX request to the page fragments server for a list of fragments we need to display. The page fragments server returns the XML or HTML for each fragment that is in its temporary table. The browser renders these fragments, and then keeps sending further AJAX requests for the remaining fragments until they've all been received.

Like the IFRAMEs and pre-staged data approach, this design shouldn't require any redesign of the backend tiers to see big improvements in the user experience, since the requests still look like they're coming from the webserver as before.

Critical path (essential data): Every essential data item
Critical path (secondary data): Either every essential item or the single slowest secondary item, whichever is slower.
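The request-and-repeat loop described above can be sketched as follows. In a real page the loop would live in the browser's JavaScript; it is written in Python here only to show the protocol, and the class name, the reply shape ("ready" and "pending"), and the wait callback are assumptions rather than a fixed design.

```python
from concurrent.futures import Future


class FragmentServer:
    """Stand-in for the page fragments server and its temporary table."""

    def __init__(self):
        self._staged = {}  # fragment name -> Future that will hold its HTML

    def stage(self, name):
        self._staged[name] = Future()
        return self._staged[name]

    def poll(self, wanted):
        """Return the fragments that have arrived and the names still outstanding."""
        ready, pending = {}, []
        for name in wanted:
            future = self._staged[name]
            if future.done():
                ready[name] = future.result()
            else:
                pending.append(name)
        return {"ready": ready, "pending": pending}


def load_secondary(server, wanted, render, wait):
    """Browser-side loop: render what is ready, then ask again for the rest."""
    pending = list(wanted)
    while pending:
        reply = server.poll(pending)
        for name, html in reply["ready"].items():
            render(name, html)
        pending = reply["pending"]
        if pending:
            wait()  # in a browser this would be a timer before the next request


server = FragmentServer()
ad = server.stage("ad")
news = server.stage("news")
ad.set_result("<p>ad</p>")  # the ad server has already answered

shown = []
load_secondary(
    server,
    ["ad", "news"],
    render=lambda name, html: shown.append(name),
    wait=lambda: news.set_result("<p>news</p>"),  # simulate the slow server finishing
)
print(shown)
```

Each round trip delivers whatever has arrived so far, so one slow data server delays only its own fragment.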
Bottom line

By dividing the task of optimizing your complex page loads between the browser, webserver, and middle tier, you should see a marked improvement in the perceived responsiveness of your webpages, possibly without having to touch the existing database or application layers of your system at all.

If you still need to improve the delivery times of your essential data, your webserver can build its requests for the essential data asynchronously instead of synchronously, by using a communications protocol such as COM (if it's an IIS server), or by opening sockets to the data servers. In either case you send the requests, and then poll them for return data or catch the response via a callback function. Or if your webserver is multithreaded, as in Java, you can simply spawn a thread for each request, and construct the final page when all the threads have gotten their data.

However you implement the asynchronous requests, the critical path for your essential data is no longer the sum of all the essential data items, but instead is now only the single slowest essential data item. And now you can concentrate on speeding up that one item alone. Which approach is best (or feasible) depends on the details of both your webserver and backend data servers.

Have you implemented any of these approaches at your site? What results or pitfalls have you found? Let me know of your experiences below.
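For the thread-per-request option mentioned above, a minimal sketch might look like this. It uses a thread pool instead of hand-managed threads; the fragment names and the simulated latencies are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(name, seconds):
    """Stand-in for a call to an essential data server."""
    time.sleep(seconds)
    return f"<section>{name}</section>"


def build_essential(requests):
    """Fetch all essential fragments in parallel and join them in markup order.

    requests: list of (name, seconds). The total wait is roughly the slowest
    single fetch rather than the sum of all of them.
    """
    with ThreadPoolExecutor(max_workers=len(requests)) as pool:
        futures = [pool.submit(fetch, name, seconds) for name, seconds in requests]
        return "".join(future.result() for future in futures)


print(build_essential([("article", 0.2), ("search_results", 0.1)]))
```

The results are collected in submission order, so the page layout stays the same no matter which data server answers first.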