This project was designed to answer the question: Is it possible to create a grid control that remains as responsive as a desktop app even in the face of a large amount of data, without resorting to Flash or a Java applet?

The short answer: Almost. With a lot of tweaking of JavaScript & AJAX, I was able to create a data grid that is surprisingly responsive when moving around the large dataset of a TV schedule—if the client has enough bandwidth to move the data quickly from the server.

To test this, I found a great source of TV schedule data from a service called Schedules Direct. This organization resells the data from Tribune Media Services, which is the major supplier of TV schedules for the cable channels and services like TiVo, newspaper websites, etc. Schedules Direct has a pretty good deal for hobbyists or for anyone who has a software DVR on their PC: $6 for a two-month subscription to the data feed, or $25/year.

Getting the data

Schedules Direct lets you download the listings for one lineup, which you specify when you open your account (in my case, Dish TV for Seattle). You specify a date/time range and it sends you the program data scheduled for all the channels in your lineup. This is a ton of data: 25 MB for 60 hours of schedule info. My grid only shows 2 hours of data for 10 channels at a time, so it would clearly be overkill to query Schedules Direct each time the user scrolls to a new page on the grid. So instead, every 12 hours a cron job downloads the next 60 hours' worth of data. The TVGrid control queries my local database for each AJAX request.

Now, here you may be asking: 60 hours? Why not top off the schedule data & only download 12 hours of data from 48-60 hours in the future? The problem is, stations & networks can make last-minute changes to their schedules. So, let's say the President calls an emergency press conference for tomorrow. If the cron job merely topped off the schedule data, it would miss that change. So there's no good alternative to reloading the full 60 hours of data every 12 hours.

Sadly, this app only downloads the data for Dish TV in Seattle. I could, in theory, let the user specify their location and their Schedules Direct account name & password, but then I'd have to be saving 60 hours' worth of schedule data for every user - and my poor little AWS micro instance isn't nearly up to the task. As it is, I have to break up the request into five chunks of 12 hours each, or I run out of memory!
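
As a rough illustration of that chunking, here is a minimal sketch of splitting a 60-hour window into 12-hour pieces. The function name splitIntoChunks and the shape of the returned objects are made up for this example, the actual request to Schedules Direct is left out, and it's written in JavaScript only to match the client-side code; the real cron job may look quite different.

```javascript
// Hypothetical helper: split a download window into fixed-size chunks so that
// each request stays small enough to process in limited memory.
const HOUR_MS = 60 * 60 * 1000;

function splitIntoChunks(startMs, totalHours, chunkHours) {
  const chunks = [];
  const endMs = startMs + totalHours * HOUR_MS;
  for (let from = startMs; from < endMs; from += chunkHours * HOUR_MS) {
    // Clamp the last chunk in case totalHours isn't a multiple of chunkHours.
    chunks.push({ from: from, to: Math.min(from + chunkHours * HOUR_MS, endMs) });
  }
  return chunks;
}

// 60 hours in 12-hour pieces gives the five chunks mentioned above.
const chunks = splitIntoChunks(Date.UTC(2024, 0, 1, 0), 60, 12);
console.log(chunks.length); // 5
```

Each chunk would then be downloaded and written to the local database before the next one is requested, so only one chunk's worth of data needs to be in memory at a time.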

The grid

There are several TV grids available on the net, such as IMDB's & Yahoo's old-school tables on static pages, or TiVo's & TV Guide's AJAX-y grids. All TV grids have one thing in common: They're desperately attempting to present a huge amount of data as smoothly & responsively as a native desktop app would. I mean, let's face it: A TV grid is going to be hampered by the fact that the data is off on the server instead of on the client's machine. Plus, HTML's and the DOM's native support for active interaction with the webpage is a lot weaker than the Windows or Mac SDKs.

Since a standard TV grid consists of rows of TV channels filled with columns representing programs shown through time, I first tried using a <table> to hold both the channels and the programs. But this quickly became problematic:

  • The <table> is designed to show columns of cells that are all the same width within a given column. But programs can run at odd lengths. We could create a table where each column represented 30 minutes, and a 2-hour program resided in a <td> that had colspan=4, but then what about the movie that runs 2:05? We could have each column represent 5 minute blocks, or even 1 minute blocks, so each show has a colspan=its length in minutes, but then for a table that spans 48 hours we now have a table with 2880 columns. This seems to trigger another hairy problem:
  • The standards give user agents some leeway in deciding how to fit a lot of data into a <table>. This makes sense, since they created the <table> so that webpages could show tabular data. And the major browsers seem to try real hard to fit as much of the data within the table's displayed width as possible. There may be a way to force a <table> to just show the darn columns at the size & position you tell it to, but I never found it. With long rows of data that are much wider than the visible table, both IE & Firefox kept trying to "help" me squeeze more columns into the screen.

I think that tells us why none of the other TV grids I've seen take this approach (unless they're only showing a static block of 2 hours at a time and load a whole new page when you scroll to the next time block).

So I bypassed tables altogether, and instead I place each station in its own <div>, and each program for that station is an absolutely positioned <div>. The program's X location is determined by the date & time it occurs. Even though I end up with thousands of <div>s, there's no noticeable impact on how long it takes to fill out the visible portion of the grid.
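
To make the positioning idea concrete, here is a minimal sketch of how a program's start time and length could be turned into a left offset and width. The PX_PER_MINUTE scale and the programBox function are assumptions made for this example, not the values or names the grid actually uses.

```javascript
// Assumed scale: how many pixels represent one minute of air time.
const PX_PER_MINUTE = 4;

// Hypothetical helper: map a program's start time and duration to the
// horizontal position and width of its <div> within the channel's row.
function programBox(gridStartMs, programStartMs, durationMinutes) {
  const minutesFromStart = (programStartMs - gridStartMs) / 60000;
  return {
    left: Math.round(minutesFromStart * PX_PER_MINUTE),
    width: Math.round(durationMinutes * PX_PER_MINUTE)
  };
}

// In a browser, the result would be applied to an absolutely positioned <div>:
//   div.style.position = 'absolute';
//   div.style.left = box.left + 'px';
//   div.style.width = box.width + 'px';

// A movie that starts 2 hours into the grid and runs 2:05.
const gridStart = Date.UTC(2024, 0, 1, 18, 0);
const box = programBox(gridStart, Date.UTC(2024, 0, 1, 20, 0), 125);
console.log(box); // { left: 480, width: 500 }
```

Because the width comes straight from the running time, an odd length like 2:05 needs no special handling, which is exactly what the colspan approach couldn't offer.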

Optimizing the response times

I tried several things to overcome the bottleneck of data transfer between the view (webpage) & the database on the server, so that the visual feedback is swift when the user moves around the data.
  • When the initial webpage is built on the server, its hashtable of channels (g.aStations) is preloaded with all the channels on this service (i.e. Dish TV), and the hashtable of program descriptions (g.aPrograms) is filled with all the programs that the user's Favorite channels are showing in the next 2 hours. So you immediately see all the programs within the next 2 hours playing on your Favorite channels.
  • When the user marks a channel as a Favorite, it gets displayed at the top of the grid in the first page they see when the page loads, in addition to its normal place in the lineup. This can be a big help when you only ever care about a dozen channels that are scattered among your lineup's 850 channels: You don't have to scroll down past hundreds of channels to locate the one you're interested in.
  • When a channel comes within 1 page of being in view for the first time, we query the server for its basic schedule data and build a blank <div> for each of its programs. Then we query the server again for the detailed data for any of its programs that we haven't already loaded. Creating a blank <div> is very fast, and this way when you scroll down to a page of channels you haven't seen before, you immediately at least see the outlines of their programs while you're waiting for the server to send their full data.
  • After building the blank <div>s for all the visible programs, we check to see if each program's detailed data is already in g.aPrograms. If it is, we fill its onscreen <div> with text. If not, we add its programid to the g.aNeedPrograms hashtable. At the end of scanning the visible program blocks, we send the AJAX request for all the programs in g.aNeedPrograms. Upon receiving the program details from the server we repeat this process. If the user hasn't scrolled away yet, the whole visible grid should now be updated.
    You can easily see this happening when you scroll over to a new time on an existing channel. First you see a few scattered programs get filled in with text, and then the rest of the programs get filled in. The first set of programs are those that were already in g.aPrograms, and the later ones were loaded in thru AJAX.
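
The fill-then-fetch loop described in the last two bullets can be sketched roughly as follows. Only g.aPrograms, g.aNeedPrograms and programid come from the description above; the function names, the title field, the sample program ids, and the use of plain objects in place of real <div>s are all assumptions made to keep the example self-contained. The AJAX call itself is left out.

```javascript
// Stand-ins for the global caches named above. Their real shape isn't shown
// here; plain objects keyed by programid are an assumption.
const g = { aPrograms: {}, aNeedPrograms: {} };

// visibleBlocks: objects standing in for the blank program <div>s on screen.
// Fills every block whose details are cached and returns the ids still missing.
function fillVisible(visibleBlocks) {
  g.aNeedPrograms = {};
  for (const block of visibleBlocks) {
    const details = g.aPrograms[block.programid];
    if (details) {
      block.text = details.title;               // in the browser: fill the <div>
    } else {
      g.aNeedPrograms[block.programid] = true;  // queue it for one batched request
    }
  }
  return Object.keys(g.aNeedPrograms);
}

// Called when the server responds: cache the details, then repeat the scan.
function onProgramsLoaded(programs, visibleBlocks) {
  for (const p of programs) {
    g.aPrograms[p.programid] = p;
  }
  return fillVisible(visibleBlocks);
}

// Made-up sample data: EP1 is already cached, EP2 is not.
const blocks = [
  { programid: 'EP1', text: '' },
  { programid: 'EP2', text: '' }
];
g.aPrograms['EP1'] = { programid: 'EP1', title: 'Evening News' };

const needed = fillVisible(blocks);
console.log(needed); // [ 'EP2' ]

// Pretend the AJAX response for EP2 has arrived.
const stillNeeded = onProgramsLoaded(
  [{ programid: 'EP2', title: 'Late Movie' }], blocks);
console.log(stillNeeded); // []
```

Collecting the missing ids first and sending them in a single request keeps the number of round trips down, which matters more than the size of any one response when the user is scrolling quickly.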

Navigating the dataset

By necessity, all TV grids present a small window on a large "area" of data - hundreds of channels and programs spanning the next several days. So it's important for the user to be able to find the information they want without getting lost. I tried a couple ideas to improve this situation over the other TV grids out there:
  • Standard scrollbars: All other TV grids I've seen make you hunt for a small [<] or [>] button to scroll back & forth in time. But in fact, the lowly standard scrollbar is a much better tool for this. It's ubiquitous, so everybody immediately knows how to navigate the data. It automatically shows the user where they are in relation to the total range of data, and they can quickly move forward or back more than a "page of time" at a time. (This last one is painfully evident with the non-AJAX TV grids: You can only move a single page (usually 2 hours) at a time, each time waiting, patiently, for the new page & its ads to load.)
  • Timeline overview: At the bottom of the grid is an overview of where the user is in the overall timeline. It currently only shows the hours of the day, so it could be further improved by putting the day in there as well.
  • Favorite channels: Out of several hundred channels, you only ever care about at most a dozen—which are interspersed throughout the long channel list. The channels that you mark as Favorites all show up right at the top of the table when the page first loads. Plus their program information is always in memory, so their program blocks get filled in before any others.
  • When you click on a program to see its full data, the popup box has <- and -> buttons. You can click on one to jump directly to the previous/next showing of that particular program on that channel, if any. In theory this idea could be expanded to lead the user to other channels if the program is being shown there instead—however that would require querying the server, since the client never has all the programs loaded for all the channels so it wouldn't know where to take the user. Still, it might be worth it.
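
The previous/next jump in the last bullet boils down to a search along one channel's schedule. Here is a minimal sketch, assuming each channel's showings are held in an array sorted by start time; the findShowing name and the field names are invented for this example.

```javascript
// Hypothetical helper: starting from the showing at currentIndex, walk the
// channel's schedule in the given direction (+1 = later, -1 = earlier) and
// return the index of the nearest showing of the same program, or -1 if none.
function findShowing(schedule, currentIndex, direction) {
  const programid = schedule[currentIndex].programid;
  for (let i = currentIndex + direction; i >= 0 && i < schedule.length; i += direction) {
    if (schedule[i].programid === programid) {
      return i;
    }
  }
  return -1;
}

// Made-up schedule for a single channel.
const schedule = [
  { programid: 'A', start: '18:00' },
  { programid: 'B', start: '19:00' },
  { programid: 'A', start: '20:00' },
  { programid: 'C', start: '21:00' }
];

console.log(findShowing(schedule, 0, +1)); // 2
console.log(findShowing(schedule, 0, -1)); // -1
console.log(findShowing(schedule, 2, -1)); // 0
```

This only works within the data the client already holds for that channel, which is why extending the jump to other channels would need a server query.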

Locked headers for columns & rows

The <table> element lacks one feature, and that gap hampers its usefulness when displaying large datasets: You can't lock the column & row headers. You can make the first row a set of <th> elements, and the first column can contain the row headers. But if you scroll down a long table, the column headers scroll off the top of the table. You can work around this by putting column headers above the <table> itself, perhaps in its own single-row table. But then what happens if the data table is also too wide for the page, so it has scrollbars for both directions? Now you have columns scrolling off the page to the left & right, so static column headers can't help you, and that column you were using to act as the row header scrolls off the page as well.

Some TV grid developers try to compensate for this deficiency by inserting a new row of column headers every few rows. They compensate for the horizontal scrolling problem by simply re-loading the whole page when you move back & forth in time. It does get the data in front of the user—but it's not pretty. Even the AJAX grids solve the problem by forcing you to scroll left/right a "page" at a time so they can totally rebuild the visible grid in-place.

My solution to this problem was to create a "table" for the column headers and another for the row headers. These are actually <div>s which contain the column & row labels. They're independent of the central <div> containing the data, but when you scroll the data the headers respond to the onscroll event, where I adjust their scrollTop & scrollLeft properties to match. This worked out quite well.
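
A minimal sketch of that scroll synchronization is below. The names dataEl, colHeaderEl and rowHeaderEl are hypothetical stand-ins for the three <div>s, and the sketch assumes the header <div>s hide their own scrollbars (for example with overflow: hidden) so they can only be moved from script. It relies on scrollTop and scrollLeft, the standard writable scroll-offset properties on DOM elements.

```javascript
// Copy the data area's scroll offsets to the two header strips.
function syncHeaders(dataEl, colHeaderEl, rowHeaderEl) {
  colHeaderEl.scrollLeft = dataEl.scrollLeft; // column headers follow horizontal scroll
  rowHeaderEl.scrollTop = dataEl.scrollTop;   // row headers follow vertical scroll
}

// In a browser this would be wired up roughly as:
//   dataEl.addEventListener('scroll', function () {
//     syncHeaders(dataEl, colHeaderEl, rowHeaderEl);
//   });

// Plain objects stand in for DOM elements so the sketch runs outside a browser.
const dataEl = { scrollLeft: 240, scrollTop: 90 };
const colHeaderEl = { scrollLeft: 0 };
const rowHeaderEl = { scrollTop: 0 };

syncHeaders(dataEl, colHeaderEl, rowHeaderEl);
console.log(colHeaderEl.scrollLeft, rowHeaderEl.scrollTop); // 240 90
```

Each header only tracks one axis: the column headers never move vertically and the row headers never move horizontally, which is what keeps them "locked" in place.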