The short answer: Yes & no. Using JavaScript & AJAX, I was able to create a data grid that is surprisingly responsive when moving around the large dataset of a TV schedule—at least in IE 6.0. But in Firefox 2 & 3 it's not nearly as smooth a user experience. More on that below.
To test this, I found a great source of TV schedule data from a service called Schedules Direct. They resell the data from Tribune Media Services, which is the major supplier of TV schedule info for the cable channels, TiVo, newspaper websites, etc. They have a pretty good deal for hobbyists, or for someone who has a software DVR on their PC: $5 for a two-month subscription (or $20/year) to their data feed.
Getting the data
Schedules Direct lets you download the listings for one lineup, which you specify when you open your account (in my case, DishTV for Seattle). You specify the date/time range and it sends you the program data scheduled for all the channels in your lineup. This is a ton of data: 4.7 MB for 28 hours, and that's excluding the production crew listings and genre information! My grid only shows 2 hours of data for 10 channels at a time, so it would clearly be overkill to query Schedules Direct each time the user scrolls to a new page on the grid. So instead, every morning I download 28 hours' worth of data (spanning 46-74 hours into the future, so the next 48 hours will always have data), and query my local database for each AJAX request.
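To check the arithmetic behind that 46-74 hour window, here is a small sketch. The function names are made up for illustration and are not from the actual batch script, and it assumes the download runs exactly once every 24 hours.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Window fetched by one morning's run: 46 to 74 hours after the run time.
function downloadWindow(runTimeMs) {
  return { start: runTimeMs + 46 * HOUR_MS, end: runTimeMs + 74 * HOUR_MS };
}

// Hours of schedule data available ahead of nowMs, assuming a run happened
// every 24 hours up to and including lastRunMs (so earlier windows leave no gap).
function hoursCoveredAhead(nowMs, lastRunMs) {
  return (downloadWindow(lastRunMs).end - nowMs) / HOUR_MS;
}

// Overlap between two consecutive daily windows, in hours.
function overlapHours(firstRunMs) {
  const today = downloadWindow(firstRunMs);
  const tomorrow = downloadWindow(firstRunMs + 24 * HOUR_MS);
  return (today.end - tomorrow.start) / HOUR_MS;
}
```

Under those assumptions, consecutive windows overlap by 4 hours, and even at the worst moment (just before the next morning's run) there are still 50 hours of data ahead, which is why the next 48 hours are always covered.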
The proper way to run this daily batch process in a production environment would be a cron job running PHP from the command line. This batch script would then query the SOAP server for the data, & populate the DB from the XML response. Unfortunately, my little $7.50/mo website account doesn't allow a PHP script to access a file via HTTP. That's not a problem, though: I have AutoIntern 2.1 Job Scheduler on my personal machine. So I have an early morning AutoIntern event on my PC, which downloads the data using a WIL script (AutoIntern's built-in script language). After stripping out the SOAP envelope, AutoIntern FTPs the XML data to my webserver, and then invokes a PHP script via HTTP. This script now sees the XML file as a local file, which it can read, so that script is the one that parses the XML into the DB.
Sadly, it only downloads the data for Dish TV in Seattle. I could, in theory, let the user specify their location and their Schedules Direct account name & password, but then I'd have to save 48 hours' worth of schedule data for every user, and my poor little web account isn't nearly up to the task.
The grid
There are several TV grids available on the net, such as IMDB's & Yahoo's old-school tables on static pages, or TiVo's & TV Guide's AJAX-y grids. All TV grids have one thing in common: They're desperately attempting to present a huge amount of data as smoothly & responsively as a native desktop app would. I mean, let's face it: A TV grid is going to be hampered by the fact that the data is off on the server instead of on the client's machine. Plus, the native support in HTML and the DOM for active interaction with the webpage is a lot weaker than the Windows or Mac SDKs.
Since a standard TV grid consists of rows of TV channels filled with columns representing programs shown through time, I first tried using a <table> to hold both the channels and the programs. But this caused two problems, which proved fatal:
<table> is designed to show columns of cells that are all the same width (within the column). But programs can run at odd lengths. We could create a table where each column represented 30 minutes, and a 2-hour program resided in a <td> that had colspan=4, but then what about the movie that runs 2:05? We could have each column represent 5 minute blocks, or even 1 minute blocks, so each show has a colspan=its length in minutes, but then for a table that spans 48 hours we now have a table with 2880 columns. This seems to trigger another hairy problem:
Browsers automatically resize the columns of a <table>. This makes sense, since they created the <table> so that webpages could show tabular data. And the major browsers seem to try real hard to fit as much of the data within the table's displayed width as possible. There may be a way to force a <table> to just show the darn columns at the size & position you tell it to, but I never found it. With long rows of data that are much wider than the visible table, both IE & Firefox kept trying to "help" me squeeze more columns into the screen.
So I bypassed tables altogether, and instead I place each program on the grid as its own <div>. The Y location is determined by the channel, and the X is determined by the date & time the program occurs. Even though I end up with thousands of <div>s, there's no noticeable impact on how long it takes to fill out the grid.
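Here is a minimal sketch of how such a position could be computed. It is not the site's actual code: the constants PIXELS_PER_MINUTE and ROW_HEIGHT, the function names, and the shape of the program object are all assumptions made for illustration.

```javascript
// Hypothetical layout constants; the real grid's values are not given in this post.
const PIXELS_PER_MINUTE = 4; // horizontal scale
const ROW_HEIGHT = 40;       // height of one channel row, in pixels

// Compute the absolute position and size of one program block.
// program.start and gridStart are Date objects; program.durationMinutes is a number.
// channelIndex is the zero-based row of the program's channel.
function programBox(program, channelIndex, gridStart) {
  const minutesFromStart =
    (program.start.getTime() - gridStart.getTime()) / 60000;
  return {
    left: Math.round(minutesFromStart * PIXELS_PER_MINUTE), // X from date & time
    top: channelIndex * ROW_HEIGHT,                         // Y from channel
    width: Math.round(program.durationMinutes * PIXELS_PER_MINUTE),
    height: ROW_HEIGHT,
  };
}

// Turn a box into an inline style string for an absolutely positioned <div>.
function boxToStyle(box) {
  return "position:absolute;left:" + box.left + "px;top:" + box.top +
    "px;width:" + box.width + "px;height:" + box.height + "px";
}
```

Because the width comes straight from the duration, a movie that runs 2:05 simply gets a block 125 minutes wide, with no colspan gymnastics.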
Optimizing the response times
I tried several things to overcome the bottleneck of data transfer between the view (webpage) & the database on the server, so that the visual feedback is swift when the user moves around the data.
The list of stations (g.aStations) is preloaded with all the channels on this service (Dish TV in my case), and the hashtable of program descriptions (g.aPrograms) is filled with all the programs that the user's Favorite channels are showing in the next 48 hours. You'll notice that the Favorites channels are the first ones to get filled with programs when the page loads.
For channels that haven't been loaded yet, we first query the server for their schedule and create a blank <div> for each of their programs. Then we query the server again for the detailed data for any of its programs that we haven't already loaded. Creating a blank <div> is very fast, and this way when you scroll down to a page of channels you haven't seen before, you immediately at least see the outlines of their programs while you're waiting for the server to send their full data.
After creating the blank <div>s for all the visible programs, we check to see if each program's detailed data is already in g.aPrograms. If it is, we fill its onscreen <div> with text. If not, we add its programid to the g.aNeedPrograms hashtable. At the end of scanning the visible program blocks, we send the AJAX request for all the programs in g.aNeedPrograms. Upon receiving the program details from the server we repeat this process. If the user hasn't scrolled away yet, the whole visible grid should now be updated.
You can easily see this happening when you scroll over to a new time on an existing channel. First you see several scattered programs get filled in with text, and then you see the rest of the programs get filled in. The first set of programs are those that were already in g.aPrograms.
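The cache check described above can be sketched roughly like this. The names g.aPrograms and g.aNeedPrograms come from the grid itself, but the function names, the fillBlock and requestPrograms callbacks, and the shape of the block objects are assumptions for illustration, not the real code.

```javascript
// Shared state, named after the grid's g.aPrograms and g.aNeedPrograms.
const g = {
  aPrograms: {},     // programid -> detailed program data already loaded
  aNeedPrograms: {}, // programid -> true for programs still to be fetched
};

// Scan the visible blocks: fill those whose details are cached, collect the rest,
// then ask for all missing programs in a single request.
// fillBlock(block, details) and requestPrograms(ids) are hypothetical callbacks.
function fillVisibleBlocks(visibleBlocks, fillBlock, requestPrograms) {
  g.aNeedPrograms = {};
  for (const block of visibleBlocks) {
    if (Object.prototype.hasOwnProperty.call(g.aPrograms, block.programid)) {
      fillBlock(block, g.aPrograms[block.programid]);
    } else {
      g.aNeedPrograms[block.programid] = true;
    }
  }
  const missing = Object.keys(g.aNeedPrograms);
  if (missing.length > 0) {
    requestPrograms(missing);
  }
  return missing;
}

// Called when the server responds: store the details, then repeat the scan.
// Assumption: the rescan passes a no-op requester so that programs the server
// did not return are not requested again in a loop.
function onProgramsReceived(programs, visibleBlocks, fillBlock) {
  for (const p of programs) {
    g.aPrograms[p.programid] = p;
  }
  return fillVisibleBlocks(visibleBlocks, fillBlock, function () {});
}
```

The first scan fills the scattered programs that were already cached, and the rescan after the response fills in the rest, which matches the two-step effect you see on screen.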
Firefox slowness
This all happens reasonably fast in IE, and scrolling to a new time or new set of channels is a pleasant experience. It takes under a second to fill out the programs whose data is in memory, and 1-2 seconds more to retrieve & fill out the other program blocks. But in Firefox it takes about twice as long for these steps, making the experience a bit frustrating for the user. The problem happens because it takes much longer to put content into a <div> than it does to create the blank <div> in the first place. This is true whether we set innerHTML to fill the <div> or build up its content by creating a document fragment of text nodes and call replaceChild to place the document fragment under the div. I don't know how to get around this problem: at some point the program blocks have to get filled in with text. I guess we could periodically fetch the data for channels and for time periods that are farther & farther away from the current page while the user is reading the current page. The browser tends to not respond very well to scrolling when it's busy filling out many <div>s at once, but we could at least send AJAX requests for those programs' underlying data.
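That prefetching idea is only a thought at this point, but here is one way the "farther & farther away" ordering could be sketched. Everything in it is hypothetical: the notion of a grid "page" (a block of channels by a block of time), the function name, and the choice of distance measure.

```javascript
// Return every grid page except the current one, ordered nearest first.
// current is { row, col }: row indexes a page of channels, col a page of time.
// Distance is the larger of the row and column offsets, so pages one scroll
// away in any direction (including diagonally) come before pages two away.
function prefetchOrder(current, rowCount, colCount) {
  const pages = [];
  for (let row = 0; row < rowCount; row++) {
    for (let col = 0; col < colCount; col++) {
      if (row === current.row && col === current.col) continue;
      const distance = Math.max(
        Math.abs(row - current.row),
        Math.abs(col - current.col)
      );
      pages.push({ row: row, col: col, distance: distance });
    }
  }
  pages.sort(function (a, b) { return a.distance - b.distance; });
  return pages;
}
```

A timer could then walk this list while the user is reading, sending one AJAX request per page and storing the results in g.aPrograms without touching any <div>s.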
Navigating the dataset
By necessity, all TV grids present a small window on a large "area" of data: hundreds of channels and programs spanning the next 48 or 72 hours. So it's important for the user to be able to find the information they want without getting lost. I tried a couple of ideas to improve this situation over the other TV grids out there:
Fixed headers for columns & rows
The <table> control lacks one feature that hampers its usefulness when displaying large datasets: The column & row headers are not fixed. You can make the first row a set of <th> columns, and the first column can contain the row headers. But if you scroll down a long table, the column headers scroll off the top of the table. You can work around this by putting column headers above the <table> itself, perhaps in its own single-row table. But then what happens if the data table is also too wide for the page, so it has scrollbars for both directions? Now you have columns scrolling to the left & right, so static column headers can't help you—and that first row-header column scrolls off the side.
Some TV grid developers try to compensate for this deficiency by inserting a new row of column headers every few rows. They compensate for the horizontal scrolling problem by simply re-loading the whole page when you move back & forth in time. It does get the data in front of the user, but it's not pretty. The other AJAX grids solve the problem by using static column headers above the table, putting the row headers in the leftmost column, and forcing you to scroll left/right a page at a time so they can totally rebuild the visible grid.
My solution to this problem was to create a "table" for the column headers and another for the row headers. These are actually <div>s which contain the column & row labels. They're independent of the central <div> containing the data, but when you scroll the data I handle the onscroll event and adjust the headers' scrollTop & scrollLeft properties to match. Once again, this works great in IE 6, but Firefox 2 & 3 take a relatively long time to redraw an element, and since scrolling the grid causes a flurry of onscroll events to be fired, it ends up being practically unusable. So I ended up ignoring the onscroll event in Firefox, and I only update the column & row headers once scrolling has stopped.
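A rough sketch of the header syncing follows. The function and parameter names are assumptions, and the debounce timer is a stand-in technique for "update once scrolling has stopped", not necessarily how the real page detects it.

```javascript
// Copy the data pane's scroll offsets to the header panes.
// The column headers follow horizontal scrolling (scrollLeft);
// the row headers follow vertical scrolling (scrollTop).
function syncHeaders(dataPane, columnHeaders, rowHeaders) {
  columnHeaders.scrollLeft = dataPane.scrollLeft;
  rowHeaders.scrollTop = dataPane.scrollTop;
}

// Wrap fn so it runs only after delayMs have passed without another call.
// During a flurry of onscroll events, only the last one triggers fn.
function debounce(fn, delayMs) {
  let timer = null;
  return function () {
    if (timer !== null) clearTimeout(timer);
    timer = setTimeout(function () {
      timer = null;
      fn();
    }, delayMs);
  };
}

// Possible wiring in the page (element variables and the 100 ms delay are hypothetical):
//   var update = function () { syncHeaders(dataPane, columnHeaders, rowHeaders); };
//   dataPane.onscroll = isFirefox ? debounce(update, 100) : update;
```

In IE the headers track every onscroll event directly, while the debounced version trades that live tracking for a single update at the end of the scroll.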
So this would be a good solution to the problem of fixed column & row headers, if Firefox can someday improve its performance in this area.