Dennis Forbes on Pragmatic Software Development


About the Author
Dennis Forbes is a Toronto-based software architect. While focused primarily on the .NET and SQL Server worlds, Dennis frequently ventures outside of this comfort zone into game development, Linux development, and image processing. He has been published in several industry magazines, has been quoted in the Wall Street Journal and has been interviewed by NPR.

He is a vice president and lead software architect at an innovative New York City hedge fund back-office services firm.

Dennis has been working on solutions for the financial, telecommunications, and power generation markets for over 13 years.


Tuesday, October 18 2005

OPTGROUP is a grouping element that can be used in SELECT elements. For instance instead of...

Which Web browser do you use most often?

...you might see...

Which Web browser do you use most often?

(That sample courtesy of the OPTGROUP page above, which is why the browser list is obsolete. I'm too lazy to update it)

Whether you see those or not depends on your browser (some ancient ones dislike form elements that aren't in a form), and the visual styling of the group headers will differ.

It's a pretty useful little addition to HTML, and has been around for quite some time. Remarkable, then, that it has seen negligible use in the real world. This despite its evident usefulness: lots of sites are doing exactly this sort of thing manually, adding custom hacked-in groupings where the group headers themselves are selectable but not really selectable (a header will be script-overridden or refused by the form handler if the user selects it, which is confusing from a UI perspective), for things like grouping states and provinces by country, and so on.
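For reference, the native markup is small. Here is a minimal Python sketch that renders a grouped SELECT for the states-and-provinces case. The function name render_select, the field name "region", and the sample data are all hypothetical, chosen only for illustration. The group headers it emits are real OPTGROUP elements, so the browser itself keeps them from being selected and no script or form-handler check is needed.

```python
from html import escape

def render_select(name, groups):
    """Return HTML for a SELECT whose options are grouped with OPTGROUP.

    `groups` maps a group label to a list of (value, text) pairs.
    """
    lines = [f'<select name="{escape(name)}">']
    for label, options in groups.items():
        # The group header is an OPTGROUP, not a fake OPTION, so it
        # cannot be chosen by the user.
        lines.append(f'  <optgroup label="{escape(label)}">')
        for value, text in options:
            lines.append(f'    <option value="{escape(value)}">{escape(text)}</option>')
        lines.append("  </optgroup>")
    lines.append("</select>")
    return "\n".join(lines)

# Hypothetical sample data, for illustration only.
regions = {
    "Canada": [("ON", "Ontario"), ("QC", "Quebec")],
    "United States": [("NY", "New York"), ("CA", "California")],
}

markup = render_select("region", regions)
print(markup)
```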

Of course my point here wasn't to evangelize OPTGROUP, though I do think that it's underused. The profound thing to me, brought to mind by the power of this trivial and underused element, is how marginal the advances in web technology (specifically HTML) have been over the past several years. Everyone is busy doing grossly redundant scripting and hackery for the most mundane of things, trying to use the coarse paintbrush available to build a subpar pseudo-fat client. Maybe they're spending weeks trying to replace their table layout with DIVs to satisfy the XHTML pedants.

How many lame derivatives of combo-boxes have been hacked out? How many intolerable scripted spell checkers? How many hacked out date/time selectors? Every one of them a mystery-meat to unwilling web victims.

Doesn't it seem reasonable that these things could be a part of the basic toolset of HTML by now, allowing rich clients to use their power to present it in the most intelligent way possible?

For that matter, why haven't we made the leap to "databound" controls? Would it really be that difficult to standardize some of the standard "AJAX" type functionality into declarative HTML 5.0? Not only is AJAX far from new, but the fundamentals behind it have been understood since the nascent days of the web. This is painfully obvious stuff.

Is it really that difficult?

Of course, it absolutely would be that difficult. Once a standard is entrenched, it becomes more and more difficult to change through consensus.

History is rife with examples where someone hacked something together and unleashed it on the world, and it was good, and it was revolutionary. The rate of innovation quickly slows, though, as more and more people get involved, and their vested interests and aversion to change take hold, to the point that the most ridiculously trivial and profoundly necessary of changes take years to see the light of day. RSS feeds are just barely gaining traction, yet already many evangelists have their feet firmly stuck in the concrete, unwilling to even consider trivial changes, such as alternatives to the absurd RSS/XML icon (who knew that people could get so defensive over a 36x14 icon?). RSS will eventually rust, until some new disruptive standard comes along and eats its lunch, and then the cycle will repeat. It is the way these things tend to go.

Oh well, I hear that they're developing Duke Nukem Forever using XForms.

Wednesday, October 19 2005

Just a brief entry today as time is short.

I got a lot of great feedback for yesterday's entry - OPTGROUP and the Pace of Standards. It was a mixed set of responses that were both educational and entertaining. I love the feedback, so if you have a comment please feel free to drop me a line (I might post an entry about your comments, though I won't quote you without your permission. Note that I do monitor referrals, so if you make your comment on a blog with a link, it will be noted just like a trackback).

It is remarkable how quagmired the base foundation of the web, HTML, has become. We live in the most dynamic software world that has ever existed, yet much of the infrastructure is the same artefacts from the 90s duct taped together in precarious (and grotesquely redundant) ways.

Thursday, October 20 2005

It's been around for a while, but a lot of people still haven't experienced it - The Quiet American's One-Minute Vacations Site. It's an expanding collection of user submitted 60-second audio samples from around the world. Absolutely fascinating to listen to, and many of them really do take you there. Take a minute break and go on a vacation.

While people often use the term "Audio Blogging" to refer to the spoken word (which, when fed through RSS, becomes podcasting), I see these sorts of audio samples as more analogous - though in the audio realm - to photo blogging. As much as I appreciate the Quiet American, it would be interesting to have a site like Flickr-for-audio-samples, with thousands or millions of samples from around the world. Heck, maybe just the Flickr we know and love, but with the addition of audio. It would be interesting to see photos of an Indian market, coupled with some audio samples, and be able to search and browse by keywords.

Naturally one would think "Duh...that's video with audio...That's Google Video". However, video remains too unwieldy, and in the hands of a less-than-expert it very seldom captures the essence of a scene the way a carefully taken photo does, nor does it facilitate quick and easy consumption.

Thursday, October 20 2005

One of the big marketing pushes to hype the release of SQL Server 2000 was a huge onslaught of benchmarks - before SQL Server 2000 was even available to buy, its results were dominating the TPC results, primarily via clustering. Shortly thereafter, it is purported, Oracle demanded that the TPC separate clustered and non-clustered results. Not long after, SQL Server was doing very well in the non-clustered category as well (on very, very, very expensive machines - Big Iron).

SQL Server had joined the big leagues. Any questions about its scalability dissolved.

Remarkably we're on the cusp of the real release of SQL Server 2005 (Nov. 7th I believe), yet there has been barely any noise at all in the TPC results. It has taken more of a lead in the price/performance TPC-C results, and it has pushed a little higher in the pure performance results - though that has more to do with beefier hardware - but all in all it has been very subdued in contrast with 2000's release. I wonder if the TPC results simply aren't considered important anymore (probable, given how old most of the leading results are: 50% of the top 10 are from 2003).

Is the TPC no longer relevant? Does SQL Server 2005 simply offer marginal scalability/performance advantages for the TPC suites?

On the topic of scalability, SQL Server's clustering capabilities could use some improvements. As it is, scaling your database out across two or more servers is most certainly a non-trivial task. It's something you really have to design around (distributed partitioned views don't partition themselves, and it's a leaky abstraction). In an ideal world you could add a new server, install SQL Server and choose "add to the cluster" and it'll automatically propagate some data over and start sharing the load transparently. If it were so easy and elegant Microsoft would see a tonne of license sales as people scaled out.

I'm not an Oracle expert, but I believe that's how their clustering solution has been built.

Of course that sort of clustering is really focusing on the computation end, which really isn't a problem for most scenarios. Instead most are limited by I/O, and we already have methods (via SANs) of tremendously and transparently scaling out our storage subsystem. Take a look at the full disclosure of the price/performance leader: a single (albeit dual-core) 2.8GHz processor - a relatively low-end head-end system - backed by a SAN hosting 56 "clustered" hard drives. The TPC-C benchmark is artificial, so this doesn't necessarily mirror the real world, but it is telling. Keep your data efficient through good design and delay the day that you need a 56-disk SAN.

Thursday, October 20 2005

[Note: Some have noted that it should be Daylight Saving Time, without the pluralization of Saving. I, like many, use it more as a general-use title rather than a literal statement - given that it isn't actually saving daylight - and I generally hear it referred to as Daylight Savings Time. Just thought I should mention that.

If one wants to be a pedant, I believe it should actually be Daylight-Saving Time]

The Ontario government caved today, rashly deciding to follow the lead of a ludicrous U.S. energy bill rider, extending Daylight-Saving Time by three weeks in the spring, and a week in the fall (switching into DST on the second Sunday of March, rather than the first Sunday of April as it currently is, and switching back to Standard time on the first Sunday of November rather than the last Sunday in October).
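To put dates on those two rules, here is a minimal Python sketch. It applies both the existing rule and the extended rule to 2005, purely for comparison (an assumption for illustration: the extended rule is not in force in 2005). The helper names nth_sunday, last_sunday and dst_span are hypothetical.

```python
from datetime import date, timedelta

SUNDAY = 6  # date.weekday(): Monday is 0, Sunday is 6

def nth_sunday(year, month, n):
    """Return the n-th Sunday (1-based) of the given month."""
    first = date(year, month, 1)
    offset = (SUNDAY - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))

def last_sunday(year, month):
    """Return the last Sunday of the given month."""
    # Step to the first day of the following month, then walk back.
    next_month = date(year + month // 12, month % 12 + 1, 1)
    last_day = next_month - timedelta(days=1)
    return last_day - timedelta(days=(last_day.weekday() - SUNDAY) % 7)

def dst_span(year, extended):
    """Return (start, end) dates of DST under the chosen rule."""
    if extended:
        # Second Sunday of March to first Sunday of November.
        return nth_sunday(year, 3, 2), nth_sunday(year, 11, 1)
    # First Sunday of April to last Sunday of October.
    return nth_sunday(year, 4, 1), last_sunday(year, 10)

for label, extended in (("current", False), ("extended", True)):
    start, end = dst_span(2005, extended)
    print(label, start.isoformat(), end.isoformat())
# current 2005-04-03 2005-10-30
# extended 2005-03-13 2005-11-06
```

For 2005 the spring change moves 21 days earlier and the fall change 7 days later, matching the three weeks and one week described above.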

Given that many don't entirely understand DST, I thought I'd share a graph I made some time back (I originally planned on turning the source algorithm into a web service to allow one to punch in the inputs such as location and generate their own graph, but could never justify spending the time on it).

Toronto Sunrise/Sunset DST

All values are calculated for Toronto, Ontario, for 2005. The red line represents EST sunrise, the yellow EST solar noon, while the cyan line represents EST sunset. The purple lines represent the 9-5 workday, adjusted in the summer months to account for DST (where 9-5 is really 8-4). The blue lines represent the extensions brought about by this change (three weeks earlier in the spring, one week later in the fall). To recap - only the workday period on the graph above factors in DST (e.g. the 8pm sunset during the summer is 9pm on the clock during DST, and the 4:20am sunrise is actually 5:20am on the clock).

As much as I dislike the incredibly costly confusion and complexity of DST (my two and a half year old still hasn't adjusted), there is a small amount of logic behind it - Presuming that human beings don't naturally adapt to the sun coming up earlier during the summer, DST moves an hour of this presumably unused time into the traditional post-work hours, "lengthening" the evening (not really lengthening the evening, but artificially doing so by moving the traditional 1950s work hours earlier in the day).

Many people would argue against this, saying that the summer hours give them an opportunity to jog, garden, go to the gym, and otherwise take advantage of the extended pre-work hours. Nonetheless, DST is geared towards those who do nothing until their pre-work preparation (e.g. the alarm clock goes off an hour and 30 minutes before the work day starts). For those people DST is entirely beneficial.

Extrapolate that logic out, though, and there should be a second layer of DST that moves the clock yet another hour forward during May to September. Maybe an hour more during June. Perhaps we should have a dynamic clock, such that 9am is an hour after sunrise year round.

Humor aside, there is tremendous risk in this DST extension, especially coming into force so quickly. Having worked on a number of daylight-saving time related software problems (please use UTC, people, or at the very least disregard DST), I would wager that there will be significant ramifications of this. Millions of dollars will need to be spent preparing for, and then cleaning up after, what many seem to think is a simple date change.
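To make the UTC plea concrete, here is a minimal Python sketch (an illustration only, not taken from any system mentioned here). It uses Toronto's 2005 fall-back date of October 30 and, as a simplifying assumption, models EDT and EST as fixed offsets of -4 and -5 hours so that it needs no time zone database. The wall-clock time 1:30am happens twice that night, and only the UTC values tell the two instants apart.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for Toronto's two clock regimes (assumption:
# a real system would use a time zone database instead).
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# 1:30am on October 30, 2005 occurs twice: once before the clocks fall
# back, and once after.
before_fallback = datetime(2005, 10, 30, 1, 30, tzinfo=EDT)
after_fallback = datetime(2005, 10, 30, 1, 30, tzinfo=EST)

# The wall-clock readings are identical, so a stored local time is ambiguous.
same_wall_clock = (
    before_fallback.replace(tzinfo=None) == after_fallback.replace(tzinfo=None)
)

# Stored as UTC, the two instants are distinct and correctly ordered.
utc_before = before_fallback.astimezone(timezone.utc)
utc_after = after_fallback.astimezone(timezone.utc)

print(same_wall_clock)          # True
print(utc_before.isoformat())   # 2005-10-30T05:30:00+00:00
print(utc_after.isoformat())    # 2005-10-30T06:30:00+00:00
print(utc_after - utc_before)   # 1:00:00
```

Anything that logs, sorts, or computes durations from the local 1:30am reading alone cannot know which of the two instants it has. Moving the switch dates does not change that, it only moves the nights on which it bites.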

Anyone interested in the source data that I generated for this can find it here (it's a Microsoft Excel worksheet).

http://www.yafla.com/

Friday, October 21 2005

The Seeds of Malcontent

I absolutely despise the acronym "AJAX". Something about it rubs me the wrong way.

Perhaps it's a silly hang-up. Perhaps I'm foolish for not getting with the current lingo.

Nonetheless, I just don't feel comfortable with it. Not only is it impossible for me to say or write it with a straight face, I find it difficult to hear it from others without unconsciously stereotyping the other party as some sort of malleable, misinformed Johnny-come-lately.

It just seems like an uninformed way of uselessly simplifying a complex ecosystem of evolving and varying solutions into a meaningless, trite acronym (don't get me started on "Web 2.0").

Of course AJAX isn't really an acronym for anything anymore - realizing that it was founded on a solid foundation of ignorance, along with an unhealthy lack of historical knowledge, it has become a more generalized term meaning some sort of nebulous "interactive web application" (what we historically called a rich web application, or even the logically descriptive Dynamic HTML or DHTML). Now it's popular for it to just be the non-acronym "Ajax", with a much less restrictive meaning. If it's a cool web application, well that's courtesy of the new-fangled Ajax!

Even ridiculously pedestrian uses of scripted objects are hopping on the AJAX bandwagon these days. "It uses JavaScript...and that's the J....so it's AJAX!"

The History of the Artist Currently Known as AJAX

As a bit of "AJAX" history, way back in 1999 the Microsoft MSXML team - these were the people who made the superlative XML parser - added an oddball little object into the MSXML library: the XMLHTTP COM object. This object - one which originally used its own HTTP transport outside of IE's, leading to all sorts of proxy configuration fun - allowed one to programmatically send parameterized GET and POST requests to HTTP servers, retrieving data back for processing or display. They deserve credit for inventing it, but at the same time it was one of those inevitable solutions (in fact there were already safe-for-scripting third-party HTTP clients, but of course they had very limited deployment).

XMLHTTP, like the rest of the MSXML COM library, was usable and valuable both by native applications, and by Internet Explorer client-side script (because they handily marked it as Safe For Scripting, which is the flag that reveals a COM component to the IE scripting engine). It isn't actually tied to XML in any meaningful way, and the XML misnomer was purely a result of it being created by the MSXML team (it was just a bit of namespace). Microsoft widely distributed the library alongside other products.

Voila, another method of background data loading was released to the world, adding to the already existing and utilized hidden-IFRAME technique. Like many developers in the Microsoft world, I saw the benefits of this relatively small enhancement, and started enabling internal web applications with partial page loading, rendering and post backs (using other parts of MSXML to transform received XML against an XSLT for display - a next generation technique that still has very limited deployment). Indeed, I even used this object in rich, native applications as an easy way of communicating with web services. It was a convenient little object.

All back in the year 2000, half a decade ago. I was hardly a pioneer, and there were thousands of others doing the same thing or more.

Nonetheless, given that it only worked with Internet Explorer, it was a complete non-starter for the public web, at least for the rational.

XMLHttpRequest Goes International

In early 2001, after seeing the value that it brought to the table, the Mozilla team added XMLHttpRequest to Mozilla 0.8. As the years passed, all of the other major-minor browsers incorporated compatible implementations. It was becoming a technology that was usable in the mainstream...but for one small issue: even into 2003, a lot of large corporations stuck with Netscape 4.x as their primary browser, or as their back-up cross-platform browser (e.g. they had Internet Explorer as their primary, with Netscape 4.x as the umbrella if Microsoft suddenly jacked up licensing costs and they needed to switch all of their client operating systems).

This lowest common denominator inhibited public sites, forcing them to stick with the tried and true, so even with the pervasive availability of this functionality across all of the late-model browsers it was still off-limits for widescale use because of a very small minority. Imagine a nation restricted by a 30km/h speed limit because there are a couple of people with old, beat-up Yugos, while everyone else is pounding their steering wheels in their beefed-up gas-guzzling 8-cylinders. That was the web world.

Google Shakes Up The Web World

In late-2004/early-2005 Google shocked a lot of the blissfully-unaware world by adding XMLHttpRequest-backed functionality to Google Suggest, offering "type-ahead" usability enhancements (it can also use IFRAMEs, which is one of the original background-communication techniques). The moment was right, especially for a non-critical beta service, when the old (e.g. Netscape 4.7) could finally be tossed aside, and we could move ahead with technologies that have long been available. Google was the biggest player leading the way, but of course they weren't alone, and there were already plenty of public implementations of this sort of dynamic content that demanded a modern browser (Outlook Web Access, a component of Microsoft Exchange, being one of the earliest and best known).

Google Suggest was the largest, most visible implementation, and it opened a lot of web naysayers' eyes to what many of us had been evangelizing for years. Suddenly the unaware could see potential that had been there for years. Google also set a new bar for the amount of server side processing and bandwidth that could be allocated to a single random web visitor (not long ago it would have been considered insane to do that sort of processing and data transfer constantly as users typed keystrokes, for instance, but the availability of cheap computer cycles and copious bandwidth has changed the equation).
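To give a feel for the per-keystroke work involved, here is a minimal Python sketch of the server side of a type-ahead lookup. It is an illustration under stated assumptions, not Google's implementation: the suggest function, the term list, and the binary-search-over-a-sorted-list approach are all hypothetical. Each keystroke would trigger one background request, and each request would run one lookup like this.

```python
from bisect import bisect_left

def suggest(sorted_terms, prefix, limit=5):
    """Return up to `limit` terms from `sorted_terms` that start with `prefix`."""
    if not prefix:
        return []
    # Binary search for the first term that is >= the prefix; every match
    # sits in one contiguous run starting there.
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:start + limit]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches

# Hypothetical term list, for illustration only.
terms = sorted([
    "optgroup", "option", "opera", "oracle",
    "outlook web access", "xmlhttp", "xmlhttprequest",
])

# Simulate a user typing "opt" one keystroke at a time.
for typed in ("o", "op", "opt"):
    print(typed, suggest(terms, typed))
# o ['opera', 'optgroup', 'option', 'oracle', 'outlook web access']
# op ['opera', 'optgroup', 'option']
# opt ['optgroup', 'option']
```

Three keystrokes means three round trips and three lookups for a single query, which is the cost that cheap cycles and bandwidth made tolerable.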

Human Nature - The Defensive Compensation Effect

I'm going to segue to a little human-nature story here - I'm a Microsoft-centric user and developer: I chose to professionally focus on their tools and technologies as my specialty early on, and I've done very well with it (though I'll switch to other primary specialties if and when it's in my best interest). Nonetheless I regularly use Linux, primarily through the magic of virtual machines, and I try to keep myself fairly competent with it. (As an aside, I'd recommend that everyone do this, especially now that VMware has made their Player free. You owe it to yourself to give Linux a try - you will probably be surprised at how rich and capable it really is on the desktop.) I have lots of experience with other OSs as well, though sadly I've yet to try OS X. Anyways, putting all of that operating system defensiveness aside, I've been with Windows for quite some time.

One thing I've noticed over the past several years, as a Windows user, is that quite a few historically anti-MSers are making the switch to (or back to) Windows on their desktop. Perhaps Microsoft's security initiative is having an impact, or some other extenuating circumstance is making them want to switch, but nonetheless they're "coming back".

Almost invariably they will defend their switch as being justified only by some relatively recent extraordinary revelation in the Windows platform - e.g. "That Windows 2000 was a real piece of crap, and I wouldn't have touched that with a 10-foot pole, but the new XP kernel is pretty good. Windows was crap before, but now it's good". Of course anyone who follows Windows knows that this is utter claptrap - XP is largely 2000 with a facelift and some relatively minor kernel changes. Really 2000 was already quite a well designed, nicely executed, very stable operating system, but they couldn't concede that prior iterations had value if they weren't a part of the game. Ultimately it only has value once they've entered the contest.

You can see this going the alternate direction as well - "Linux was garbage before kernel 2.x.x" - and for virtually any other technology. If someone is late to the game, there's a good chance that they'll try to manufacture some reason to validate making the switch now. Sometimes it's rational, but often it's just complete nonsense. I haven't spent any time with OS X, so instead I'll just wait a bit until the next iteration and declare whatever trivial change they've made to be the pivotal reason that justified my time.

This applies not only to engineers and their technologies, but to business hypsters as well - Instead of pimping themselves as yet-another entrant in the competitive web application market, they need to present the idea that they're coming at the perfect time: So much is possible now that wasn't possible yesterday.

I feel that a lot of the AJAX hype follows the same pattern - someone either poo-poohed web applications for years, or perhaps their career just finally brought them into web applications. Instead of feeling that they're a junior in a long established field, they have to invent or embrace some paradigm shift that renders everything that came before irrelevant. Disregard that the "technology stack" that facilitates a modern web application is extraordinarily wide and deep, and just pretend that this one small schism represents everything.

Conclusion

The crux of the matter, in my opinion, is that the acronym AJAX brings absolutely no clarity to the table, and instead introduces nothing but noise. Not only does it poorly define an implementation pattern that has been in use by experts for years, but its street use is so generic as to be detrimental (just look at that use of AJAX for the "AMASS" solution linked near the outset - Now AJAX just means "something done in a web app". How utterly inane). To others it is a buzzword that tries to simplify a multi-faceted technology platform into a nonsensically literal implementation detail, putting an inordinate amount of attention on one very small part of an interactive web application.

There is such a rich array of techniques and technologies available to a web developer to make a first rate platform, that such a simplistic and meaningless bit of language just confuses things.

Furthermore, there is this simpleton tendency for the hypsters and the unaware alike to see, or talk about, revolutions, where instead there has been a continual evolution (they're ignoring the fossil evidence!). One just has to read Tim O'Reilly's latest self-aggrandizement platform - Web 2.0 - to see this sort of distortion in action. Where many in-the-know see a constant evolution as the platform matured - everyone got faster computers, storage got cheaper, bandwidth became less expensive, the lowest common denominator got better, and so on - these people, with their enormous blind spot, instead see some momentous divergence between yesterday and today. Yesterday was old-sk00l 1.0, and today is 2.0. Get it? Yesterday's car is a Model-T 1.0, and today's car is a 2006 Honda Accord 2.0, with nothing in between.

Great, now come to my conferences, link my pages, and buy my not-open-source-but-open-source-is-great books!

Saturday, October 22 2005

Web 2.0 is, in its typical usage, a completely nebulous term. Yet remarkably its boosters will declare that it's all so clear, you idiot - it's "<INSERT THEIR OWN PARTICULARLY DIVERGENT INTERPRETATION HERE>". I've seen this play out in quite a few online and offline discussions, proving to me that it's an amorphous/eye-of-the-beholder sort of term.

Tim O'Reilly and friends were among the first to popularize it, so his take obviously deserves attention, yet even it lacks any degree of clarity. Really it appears to be nothing more than a freeze-frame of the web's continuing evolution.

Nonetheless, one recurring attribute of the "Web 2.0" religion deserves attention - Folksonomy, which is the loosely-controlled user-base keyword tagging of content (usually contrasted against the taxonomy of a site like Yahoo, where a central group of anointed ones classify content, albeit really they're just rubber stamping the classification provided by the website owner).

For instance I upload a picture to Flickr, keywording it flower and bee. Now people searching for related pictures can browse amongst pictures of bees, or flowers, or bees and flowers, and see my content amongst everyone else's. del.icio.us follows the same model, with users adding links and meta-data keywords that categorize them. Links can then be searched or related by keyword(s).
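Mechanically there isn't much to it. Here is a minimal Python sketch of the idea, assuming an in-memory index; the TagIndex class and the file names are hypothetical and say nothing about how Flickr or del.icio.us are actually built. Each tag maps to the set of items carrying it, and a multi-keyword search is a set intersection.

```python
from collections import defaultdict

class TagIndex:
    """Maps free-text tags to the set of items that carry them."""

    def __init__(self):
        self._items_by_tag = defaultdict(set)

    def tag(self, item, *tags):
        """Attach one or more tags to an item."""
        for t in tags:
            # Normalize lightly so "Flower" and "flower" land together.
            self._items_by_tag[t.strip().lower()].add(item)

    def search(self, *tags):
        """Return the items that carry every one of the given tags."""
        sets = [self._items_by_tag.get(t.strip().lower(), set()) for t in tags]
        if not sets:
            return set()
        return set.intersection(*sets)

index = TagIndex()
index.tag("my_photo.jpg", "flower", "bee")
index.tag("hive.jpg", "bee")
index.tag("tulips.jpg", "Flower")

print(sorted(index.search("bee")))            # ['hive.jpg', 'my_photo.jpg']
print(sorted(index.search("flower", "bee")))  # ['my_photo.jpg']
```

Note that the index takes whatever tags it is handed at face value. Nothing in it knows whether a tag is accurate, which is where the trouble starts.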

Everything old is new again.

In the early days of search engines, the content parsers were really quite dumb - they couldn't read the content of a web site and really figure out what the subject of the page was. As such the META tag was added, allowing website owners to attribute their content with a small, select group of keywords, and those keywords would allow the search engine to choose content appropriate for user searches.

Of course what started as a good idea quickly devolved into a tragedy of the commons: as more and more people got involved, nefarious website operators learned to put unrelated, popular terms in their keywords to earn additional hits, gaming the system for their own advantage.

The sort of things that work on small-scale, edge sites quickly degrade as those sites become larger and more important. Already many of the "social tagging" sites are getting overwhelmed with spam and false hits, and errors, most commonly errors of omission, are extraordinarily common on these tagging sites.

Of course for photos there are no other options - we're not at a point where an automated analyzer can look at a picture and determine what it's about, so tagging is the best we've got. However for some attributes, like location, free-text tagging is terribly unreliable, which is why I look forward to the automated GPS tags talked about previously.
