Dennis Forbes on Pragmatic Software Development
Tuesday, April 03 2007

When People Danced to the Macarena

The exploding importance of the Internet in the mid-90s brought tremendous change to the technology market. It forced industry leaders and followers to hastily adapt to the new opportunities and challenges.

It was a do-or-die time, and you had to embrace and adapt, or get extinguished.

To everyone but Microsoft, it seemed.

Despite the hurricane-force winds of change around them, the industry-leading behemoth looked to be stuck in a recursive loop. While upstarts were racing in every direction, envisioning and implementing new uses for this increasingly accessible platform, Microsoft seemed to be busy navel-gazing, more worried about how to maintain the status quo.

Despite the relative success of Windows 95 -- the long-overdue migration to mainstream 32-bit computing -- Microsoft's slow-moving heft seemed to make them incredibly vulnerable during this critical transition period, making them appear a lumbering giant that could be toppled by the smallest adversary.

The young upstart Netscape appeared a likely candidate to shoot the mortal stone: Sales of Netscape's server and browser products yielded a revenue growth curve exceeding that of any software company in history. They were actually running a profitable business, which was a remarkable feat for a technology company at the time.

Their Netscape Navigator browser had fortified a seemingly insurmountable position in the marketplace. The company image was hip, and Mozilla-adorned swag was flying out of their online store.

In that era of seemingly boundless opportunity, inebriated with the seemingly limitless potential of the company that he co-founded, Marc Andreessen made the infamous comment that the Netscape Navigator browser, coupled with the Java platform, would reduce Windows to an "unimportant collection of slightly buggy device drivers."

By then Gates had penned his famous internal "Internet Memo", demanding that the company focus on the Internet. The cruise ship Microsoft was ever so slowly changing course.

While the overwhelming majority of Microsoft's renewed focus turned out to be largely useless "internet-enabled" bedazzling of existing products -- the oft-lauded "turn on a dime" fiction about Microsoft's Internet revolution is grossly overstated -- where it really counted, the browser, Microsoft executed very well.

Microsoft's browser offering quickly became good enough that the average user couldn't be bothered to download and configure a competitor's product on their new PC (Microsoft didn't have to provide a better product, or even as good for that matter: It just had to be good enough to dissuade an average user from seeking out alternatives. This is a bundling reality used in all industries).

Add to that the fact that Netscape's development cycles got longer and longer, their innovation dried up, and their product got buggier.

Eventually Internet Explorer was the winning product on merit alone.

Soon we had an internet full of "Made for Internet Explorer" buttons. Much of the non-academic web had been Microsoft-ized, and you couldn't play unless you went where Microsoft was going today.

The rest is, of course, history: Internet Explorer rocketed to success, almost entirely at the expense of Netscape.

Knowing how things turned out, with the all-knowing clarity of hindsight, Andreessen's claims of course look like foolish bravado. Even at the time it sounded like nonsense: Java applets had shown little promise, delivering terrible performance, atrocious interfaces, and an awkward, crippled interaction with their host environment. The browser wasn't much better, limited mostly to rendering personal pages full of blink tags and gaudy color schemes.

I recall reading that quote from Andreessen back then (I believe in a Dvorak article in PC Magazine), pshaw-ing in disbelief. I couldn't believe his audacity, and as a junior Windows-targeting developer at the time, with perhaps a bit of a fear of change (nobody likes when their skills, even at a beginner stage, are being obsoleted), I cheered on a Microsoft response.

"Bill Gates is going to CRUSH this guy!" I thought.

And of course Microsoft easily won that battle.

But are they losing the war?

The Pillars of Our Reliance on Windows

Windows as an operating system certainly has a lot going for it: It is feature rich, demonstrates a lot of technical excellence, and can credibly measure up against any competitor.

Yet for many users over the past decade, there was no choice: Windows was obligatory. It was exactly this hegemony that Andreessen felt his platform was upsetting.

His prediction was just a decade or so early. And instead of Java being their tag-team partner, it's JavaScript/AJAX, Flash, and the innovation and power of modern console gaming.

"I dual-boot to play games"

I hit a local department store recently to look for some educational games for my pre-school aged daughter. This location never had an extensive PC software selection, but I was still surprised to find the entire section had been removed, save for a couple of relics sitting in a discount bin.

The entire area was taken over by game console and handheld software.

Thinking this was an anomaly, I drove across town and checked their competition, and then their competition's competition, only to find the same at each: No PC software at all was for sale.

No games. No typing tutors. No foreign language training. No photo management software. No pre-school aged games.

Baffled, I hit the local EB Games location. Over the years I'd purchased dozens of PC games there, so I was shocked to find no PC software at all (the exception being a couple of ratty late-90s era boxes in a wire-mesh bin).

Determined, I ventured to the local Future Shop (the Canadian equivalent to Best Buy, and in fact the chain was acquired by Best Buy a few years ago, causing much confusion as it came in concert with the actual Best Buy chain itself) to find a small PC software section. While it was much smaller than it once was -- where once there were rows dedicated to just productivity applications, now a minuscule little section caters to the entire gamut of software -- at least it was something.

However compelling, my personal anecdote doesn't really prove much, but it does correlate with industry metrics that have shown retail PC software sales to end users to be stagnating or in freefall. Businesses keep buying their Office and Windows licenses, of course, and niche groups keep satisfying their business need, but what once was a vibrant retail market for applications and games has virtually disappeared. Some of this has been supplanted by online purchases, including some new electronic delivery method (which is how I got Half-Life 2 -- an impulse purchase is well catered to by a simple online purchase with immediate satisfaction), but much of it has just disappeared.

Consumers just aren't consuming PC software anymore.

The reasons are obvious.

Deja Vu All Over Again - The Rise of Console Gaming

On the gaming front, the PC has seen incredible competition from gaming consoles. Not only have those competitors evolved into technical heavyweights, the simplification of the entire gaming genre has equalized the playing field: Where once a mouse and a keyboard were mandatory to play any decent game, most popular games now feature simple interfaces that are equally accommodated on any platform, and the complex simulator type games, once the consistent chart toppers, are largely unloved.

You don't need a mouse to interact with an onscreen flower menu. You don't need a keyboard to communicate via a headset and in-game Voice-over-IP.

Consoles aren't the only reason for PC gaming's decline -- general internet use has taken a lot of time that people would have spent gaming, some of that time being spent being entertained by the countless Flash-based, cross-platform games available now.

Doomsayers have been declaring the death of PC gaming for years, as generations of consoles have come and then gone and Windows gaming has remained, but never has it seemed as likely to actually happen. In response, Microsoft is attempting some Windows gaming branding, perhaps realizing that it was a linchpin of their occupation of the home, but their intervention is likely too late.

So what does any of this have to do with Windows and Netscape and buggy device drivers?

One of the primary reasons many users felt tied to the Windows platform was gaming: If you wanted to play any of the prominent games at the time, that collection of slightly buggy device drivers was very important, and the game-du-jour was usually very tightly coupled with the platform. Aside from a couple of exceptions, PC gaming overwhelmingly meant Windows gaming.

The Netscape browser certainly wasn't a replacement for this. Neither was the Java platform.

This situation led many prospective Windows migrants to declare that they would make the move to Linux or the Mac or FreeBSD or whatever, if only they could run their current gaming obsession on it. Dual-booting is a half-measure that seldom held, and the direct graphics card access meant that gaming couldn't be accommodated via virtualization, so more often than not they just stuck with Windows.

"But my applications only run on Windows!"

Across the industry hundreds of thousands of solutions have migrated to the web, and if anything the pace is accelerating. Despite Microsoft submarining the overly-capable Internet Explorer team -- a team that brought us many of the innovations that we now enjoy in competing browsers -- the genie was out of the bottle: Many had experienced the incredible platform freedom, wonderful deployment model, and rich interfaces provided by web applications.

The classic computer purchase justifications (as stated by a million pleading children trying to convince their parents that a new gaming rig will be productive for the household) -- balancing the checkbook, storing recipes, authoring and sending letters (now email), maintaining databases -- can all be very competently accomplished online, from any modern browser available on dozens of platforms. In many ways the experience is superior online, given the accessibility of the data from anywhere at any time.

Not every task can be performed online or from a web browser, and for those needs a plethora of cross-platform, often open source options have appeared (e.g. GIMP, OpenOffice). Yet it remains that for an average user, the overwhelming percentage of their computer time now will be spent in their add-in enabled web browser, perhaps accessorized by one of countless available, many-network supporting IM clients.

Which is, of course, where we circle back to Andreessen's prediction: The most popular, and arguably most capable, cross-platform browser is Firefox. It is the phoenix (it was in fact originally named Phoenix, and later Firebird) that rose out of the ashes of the collapse of Netscape, the source code open sourced and revitalized with a multi-year reworking. While its market share numbers remain relatively small, its influence has been absolutely extraordinary. Even for sites that see 100% Internet Explorer users, the freedom and diversity offered by Firefox often leads enlightened development teams to ensure that they facilitate it just as well.

The rules of the game have completely changed. While many were prematurely declaring the end of Microsoft's dominance for years (every year for the past 7 years or so has been declared "The Year of Linux" by some open source evangelist or other), it has been years since the field has been so open for actual competition.

It has been a long time since the choice of platform held so few caveats and limitations.

We are entering a glorious time when the operating system really is an unimportant collection of device drivers, no longer driving completely unrelated application choices.

Monday, April 16 2007

Being able to quickly and easily build team projects on a newly imaged PC is a development process necessity: A new team member, with not a whit of project knowledge beyond where to find the simple build instructions, should be able to follow a sequence of clearly documented steps -- automated where possible -- painlessly generating a build.

No unnecessary mapped drives and hard-coded UNC locations. No undocumented but necessary third-party tools at hard-coded locations. No byzantine by-hand registrations and muddifications.

This holds true for open source projects as well. While a grizzled kernel hacker obviously doesn't need hand-holding, they didn't start as a grizzled kernel hacker. At some point they were new to the code, and the number of obstacles they faced in those early days was probably a significant indicator of the likelihood that they would stick with it, overcoming administrative type nuisances and getting to the point where they were actually working on the code itself.

Some may see the barrier to entry that often exists as a useful filter, letting only the best of the best through, but that contention seems dubious. More likely an onerous getting-started process simply demotivates a lot of great talents from even bothering. Being an expert C++ developer doesn't mean that one wants to spend a day messing around with cygwin packages and dependencies, setting up countless poorly or incorrectly documented environment variables and configurations.

On this theme, I recently took a look at the state of the bleeding edge of Firefox -- I think Firefox 3 is going to be one of the most important applications in years, and is going to completely redefine the entire industry -- and was very pleasantly surprised to find how stunningly gorgeous the build process now is for a Visual Studio-using Windows developer.

-Download and install Mozilla Build.

-Run the appropriate start-msvc batch file (e.g. start-msvc8.bat for Visual Studio 2005 users). I updated mine to set the CL environment variable to the compilation flags that I wanted, as opposed to passing them on the --enable-optimize parameter of the .mozconfig file.

-In the appropriate location -- / is fine, given that it's actually in the msys subdirectory and not really at the root -- get the client.mk file via the following trivial command.

 cvs -d :pserver:anonymous@cvs-mirror.mozilla.org:/cvsroot co mozilla/client.mk

-Navigate into that folder
 
 cd mozilla

-Do a CVS get of the appropriate project (originally I was getting the source outside of the make script using the excellent TortoiseCVS, however it turns out that you can't just wholesale grab the tree, and should stick to the integrated CVS functionality).

 make -f client.mk checkout MOZ_CO_PROJECT=browser

-Configure an appropriate /mozilla/.mozconfig file (note that Windows will block setting that filename directly, so save it under another name from notepad or wherever and then rename it with the mv command in the MINGW32 shell. You'll likely just copy the block on the linked page for the appropriate project, however if you're adventurous you might try out the configurator tool).

-Build it!

 make -f client.mk build MOZ_CO_PROJECT=browser

This is the slickest, most painless process for such a large scale application that I've ever seen. I can just re-checkout and build daily if I'd like to be on the razor edge, though sometimes that will mean a broken build.

Now I'm running an ultra-optimized, stack-protected custom build of Firefox 3.

Type R Firefox

I'm actually delving through the code with relative ease, testing my custom changes absolutely painlessly (in my case, curiosity brought me into the JavaScript engine, found in the js subdirectory. While the code is inherently advanced -- it is a remarkably complex product -- it is reasonably easy to follow around and get a feel for).

Brilliant. Absolutely brilliant. Now I just have to find a way to put some obnoxious exhaust pipes on this bad boy.

Friday, April 27 2007

The success of Vista is of obvious importance to any developer targeting any Vista-only or Vista-enhanced technology, so the latest news that Vista sales are strong and higher than expected is of obvious interest.

It appears that Vista has been a stunning success for Microsoft, surprising even the optimists inside the Redmond empire. This stands in stark contrast with reports of general consumer apathy about this release, and many dire predictions about Vista's adoption (predictions that are dubious, given that Vista is pretty much assured a reasonable level of success given its automatic sale to virtually anyone buying a new PC. Corporations naturally delay adoption of new operating systems, as they have with every prior edition, so only the delusional expected it to storm across business desktops).

You can see their Q3 return on the SEC site. Under the Client division, you can clearly see that revenue has rocketed up, from $3.151 billion in Q3 2006, to $5.272 billion in Q3 2007.

There are a couple of massive, almost-Enronesque caveats to these numbers that deserve some serious scrutiny before you start your Vista-only product launch.

Firstly, they deferred $1.2 billion of XP sales from Q1 and Q2 2007 -- sales that qualified for the upgrade coupon -- rolling it into this result.

Client revenue increased for the three months ended March 31, 2007, primarily reflecting licensing of Windows Vista, including recognition of approximately $1.2 billion of revenue previously deferred in fiscal year 2007 pending the January 2007 release to consumers.

Backing out that rather dubious bit of accounting trickery drops the revenue growth to roughly $0.9 billion over the year earlier.

Secondly, some of the gain is attributed to the price premium applied to Vista Premium (effectively imposing a price increase over prior XP licenses, as most PC makers automatically deliver the Premium edition.)

During the quarter, the OEM premium mix increased 18 percentage points over the prior year to 71% driven by the demand for Windows Vista Home Premium.

Not to mention that the computer market in general has grown over a year earlier.

Based on our preliminary estimates, total worldwide PC shipments from all sources grew 10% to 12% from the third quarter of the previous year and approximately 8% to 10% from the first nine months of the previous year driven by strong consumer demand in both emerging and mature markets.

Not quite as successful as some reports are claiming: After accounting for a year of market growth, subtracting the hard-to-rationalize rolling-forward trickery, and considering that the price for the operating system was effectively raised via the Premium edition, the situation suddenly doesn't look quite as rosy.

Rough back-of-a-napkin calculations have OS unit sales remaining relatively flat after factoring in market growth. If this exceeded inside-of-Microsoft estimates, then clearly they're either lying, or they were expecting an implosion. Applause for Vista's lofty percentage of OS sales should be quelled by the reality that virtually every new client PC OS sold now is Vista -- if a corporation does want XP, their recourse now is to buy a Vista license that grants them the right to install XP, though it'll carefully get added as another vote of confidence by the Microsoft beancounters.
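For anyone who wants to redo the napkin math, here is a minimal sketch in Python using only the figures quoted above. It deliberately does not model the Premium price increase, since no per-license price appears in the report excerpts; the function names are mine, and the "residual" is simply whatever growth is left once the deferral and overall PC market growth are factored out.

```python
# Client division revenue, in billions of USD, as quoted above.
CLIENT_Q3_FY2006 = 3.151
CLIENT_Q3_FY2007 = 5.272
DEFERRED_REVENUE = 1.2  # "approximately $1.2 billion" recognized this quarter


def adjusted_growth(prior, current, deferred):
    """Year-over-year growth after removing revenue deferred from earlier quarters."""
    return (current - deferred) - prior


def residual_after_market(prior, current, deferred, market_growth):
    """Growth left over once overall PC market growth is factored out.

    Whatever remains has to come from price and edition mix (or from more
    licenses per PC shipped), not from the market simply getting bigger.
    """
    revenue_ratio = (current - deferred) / prior
    return revenue_ratio / (1 + market_growth) - 1


growth = adjusted_growth(CLIENT_Q3_FY2006, CLIENT_Q3_FY2007, DEFERRED_REVENUE)
print(f"Growth without the deferral: ${growth:.2f} billion")

# The report estimates PC shipments grew 10% to 12% over the year-earlier quarter.
for market_growth in (0.10, 0.12):
    residual = residual_after_market(
        CLIENT_Q3_FY2006, CLIENT_Q3_FY2007, DEFERRED_REVENUE, market_growth
    )
    print(f"Market growth {market_growth:.0%}: residual {residual:.1%}")
```

If the shift to the Premium edition accounts for a residual of that size, unit sales per PC shipped are roughly flat, which is the point of the paragraph above.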

There are several other surprises in their Q3 report. For instance that the entertainment division (Xbox, Zune) saw revenue drop over a year earlier, and that the Online Services Business -- this is where Microsoft put a lot of attention recently, with the huge push of the Live platform -- saw a marginal revenue increase, with a significant loss increase.

The only bright spot of the whole quarterly report, from my perspective, is the business systems division: Office 2007 has seen good adoption and has very healthily contributed to earnings.

Reports that this quarterly report validates Vista's success are unfounded. Further, it puts a huge question mark over Microsoft's web and hardware initiatives (the complete failure of the diversification to actually add to the bottom line, instead of just drowning in losses -- excused early on as teething pains, but there doesn't seem to be a point when they'll actually make money -- should raise serious concern. Microsoft is still held aloft by Windows and Office).

Thursday, May 10 2007

The need for always-available email recently had me equipping myself with a Motorola Q, a Windows Mobile 5-based smartphone. With Exchange Direct Push email capabilities (where the device opens an idled HTTP connection, being notified expediently when new messages are available with the minimum of data throughput), it has served the core purpose admirably, and is a wonderfully handy little device.

The addition of wireless, always-available communications has made it infinitely more useful than prior abandoned outings with PDAs.

Ultimately it's a Blackberry(TM) competitor. While I'm only 60km from RIM headquarters, for me this was a better device than the more commonly chosen option. In this case the technology infrastructure didn't require a third party to unnecessarily act as a middleman of messages.

I don't only use it for email, though. Every now and then the device serves secondary duty as a web browsing tool: The landscape QVGA screen isn't exactly copious, but it's enough for basic browsing for some sites, catching up on tech news and happenings in situations where a traditional network isn't available, and I don't want to open up a laptop.

This blog looks great on it, the "content" column perfectly filling the screen by luck rather than intent. The homepage requires an excessive 229KB of transfers, but at least most of those bytes are filled with content text. Nonetheless, I think I'm going to change the settings to show fewer days of history on the main page.

Despite the grand pronouncements by the telcos about their high speed, next generation networks, the speed is often closer to dial-up, and where throughput is high the latency is often poor, making pages with dozens of elements a time consuming affair. Even with a speedy connection, many telcos have low throughput limits with exorbitant fees beyond that.
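To see why latency hurts as much as raw throughput, here is a rough sketch of a page-load estimate. The model (transfer time plus one round trip per batch of requests) is a deliberate simplification, and the network figures in the example (request count, throughput, round-trip time) are hypothetical values chosen for illustration, not measurements.

```python
import math


def estimated_load_seconds(total_bytes, requests, throughput_bps,
                           rtt_seconds, parallel=1):
    """Crude page-load estimate: time on the wire plus request round trips.

    Ignores DNS, connection setup and rendering; it only shows how the
    number of page elements multiplies the cost of a high-latency link.
    """
    transfer = total_bytes * 8 / throughput_bps
    round_trips = math.ceil(requests / parallel) * rtt_seconds
    return transfer + round_trips


# Hypothetical link: 100 kbit/s with a 0.8 second round trip, two
# connections at a time, fetching a 229KB page made of 30 elements.
page_bytes = 229 * 1024
slow = estimated_load_seconds(page_bytes, 30, 100_000, 0.8, parallel=2)
lean = estimated_load_seconds(page_bytes, 5, 100_000, 0.8, parallel=2)
print(f"30 elements: {slow:.1f}s, 5 elements: {lean:.1f}s")
```

Under these assumptions the same number of bytes takes noticeably longer when split across dozens of requests, because each batch pays the round-trip cost again.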

Loading a page like Joel's discussion page -- a very basic text discussion site -- remarkably pegs in at 280KB or so of transfers (grab a copy of the extraordinary Firebug add-in for Firefox and look at the Net tab. It is often eye opening), overwhelmingly for scripts that have no use on the discussion site.

Sure, caching helps for subsequent requests, but on small devices there's often little room set aside to cache 100s of KBs of irrelevant scripts. Worse, the linked versioning -- adding a date version number as a parameter -- used by Joel and crew has seen the cached scripts invalidated frequently.
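A minimal sketch of why the version parameter defeats the cache: a browser cache is keyed on the full URL, query string included, so bumping the date produces what looks like a brand new resource. The `UrlCache` class and the example.com URLs below are hypothetical stand-ins, not the actual scripts or cache in question.

```python
class UrlCache:
    """Toy cache keyed on the complete URL, query string included."""

    def __init__(self):
        self._store = {}
        self.misses = 0

    def fetch(self, url, download):
        # Any difference in the URL, even just ?v=..., is a different key.
        if url not in self._store:
            self.misses += 1
            self._store[url] = download(url)
        return self._store[url]


def fake_download(url):
    """Stand-in for a network fetch; returns a dummy script body."""
    return f"/* contents of {url} */"


cache = UrlCache()
cache.fetch("http://example.com/scripts.js?v=20070501", fake_download)
cache.fetch("http://example.com/scripts.js?v=20070501", fake_download)  # hit
cache.fetch("http://example.com/scripts.js?v=20070508", fake_download)  # miss
print(f"Downloads needed: {cache.misses}")
```

The same script body gets downloaded twice here, once per version stamp, which is exactly the cost that hurts on a slow, metered connection.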

It's too bad there isn't a, err, "function-level linking" for JavaScript, automatically eliminating all of the unused script from pages that don't require it.
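The idea can be sketched as a reachability walk over a call graph: start from the functions a page actually calls and keep only what they can reach. This is an illustration of the concept in Python, not a working JavaScript tool; the call graph and function names are made up, and real JavaScript is dynamic enough (eval, computed property access) that building an accurate graph is the hard part.

```python
def reachable_functions(call_graph, entry_points):
    """Return every function reachable from the page's entry points."""
    needed = set()
    stack = list(entry_points)
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        needed.add(name)
        # Queue everything this function calls; unknown names call nothing.
        stack.extend(call_graph.get(name, ()))
    return needed


# Hypothetical site-wide script: which function calls which.
call_graph = {
    "showThread": ["formatDate", "escapeHtml"],
    "formatDate": [],
    "escapeHtml": [],
    "drawChart": ["computeAxes", "formatDate"],
    "computeAxes": [],
    "validateSignupForm": ["escapeHtml"],
}

# A discussion page that only ever calls showThread.
needed = reachable_functions(call_graph, ["showThread"])
unused = sorted(set(call_graph) - needed)
print("keep:", sorted(needed))
print("drop:", unused)
```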

One of the better sites for mobile browsing is Google: Recognizing the limits of the device (presumably by noting the Mozilla/4.0 (compatible; MSIE 4.01; Windows CE; Smartphone; 176x220) user agent string -- not sure why it says 176x220 when the actual resolution of the device is 320x240), it renders a pared-down (even more!) version of the search engine, better still automatically proxying search results through an agent that filters pages down to more mobile friendly forms. Very nice.
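As a sketch of how a site might pick up on those hints, the snippet below pulls the platform tokens and the advertised screen size out of that user agent string. This is only a guess at the kind of check involved, not how Google actually does it, and the `describe_device` helper is a name of my own.

```python
import re

# User agent string quoted above.
MOTO_Q_USER_AGENT = (
    "Mozilla/4.0 (compatible; MSIE 4.01; Windows CE; Smartphone; 176x220)"
)


def describe_device(user_agent):
    """Pull the mobile hints out of the parenthesised part of a user agent."""
    match = re.search(r"\(([^)]*)\)", user_agent)
    tokens = [t.strip() for t in match.group(1).split(";")] if match else []
    info = {
        "mobile": "Windows CE" in tokens or "Smartphone" in tokens,
        "width": None,
        "height": None,
    }
    for token in tokens:
        # A token such as "176x220" is the screen size the device advertises.
        size = re.fullmatch(r"(\d+)x(\d+)", token)
        if size:
            info["width"] = int(size.group(1))
            info["height"] = int(size.group(2))
    return info


print(describe_device(MOTO_Q_USER_AGENT))
```

A server could use the result to choose a pared-down template, though the advertised size is clearly not always the real one.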

It's a good thing that Google filters results, as many sites just render terribly in the small confines of QVGA on a Windows Mobile 5 device.

Obviously the mobile browsing market is a tiny, but growing, contingent of users, but it is something I'm going to pay more heed to. Too often we presume that everyone has a 7Mbps high speed pipe feeding an ultra high resolution display, when that isn't always the case. As smartphones continue to take off, and providers facilitate use by easing off on the restrictions and excessive charges, it's going to become a very important market.

Monday, July 16 2007

Cueing Up An Ad

"Come here and see this cool PBS promo that was just on!" I call to my wife, tapping pause on the PVR's remote. Given my history of calling her in for replays that weren't as hilarious or amazing as I originally imagined them to be, I knew this had to be a slick presentation if I wanted to impress with my PVR-fu.

As she enters the room the PVR finally reacts to my command, pausing, but then immediately playing again: Once again I've been caught out hitting the button twice, assuming that the first request got lost in the ether -- as often happens -- when it didn't seem to respond in a timely manner.

I hit pause again and this time it immediately reacts, coming to a halt.

"I just have to cue it up," I say, buying some time.

I tap rewind. Nothing happens. Come on, I think, I've got an impatient audience here! I tap it again, and the box launches into double speed rewind as I race for the play button. It plays on demand, but now it's several minutes before the desired start point.

Repeat.

Unpredictably high user interface latency strikes again. While this Motorola PVR is an exceptionally bad culprit for random non-responsiveness, it's hardly the only example of this seemingly growing trend.

User Interface Lag

My Moto Q smart phone is a great little device that I really enjoy, but the user interface responsiveness is enormously uneven, frequently lagging several seconds behind commands. Whether waiting for it to complete an application switch, or even during basic interactions such as entering a URL in the address bar of Pocket Internet Explorer, it's often out to lunch.

Presumably the questionable multitasking of Windows Mobile completely blocks the user-interface thread when it decides to chatter with a cell tower. Nothing else can explain its behaviour.

On startup my DVD player apparently needs to initialize its own little operating system, and if there's a disc in the drive it automatically demands that it determine the contents before it will eject. It insists that it be able to put "DVD" on the front-panel before any other activity occurs, achieving silicon self-satisfaction that it accurately determined the media type.

A common scenario has us getting ready to leave the house, preschool children bubbling with energy, when we realize that we have a disc that we should return to the movie store.

Turn on the entertainment unit. Wait for DVD to pre-power initialize. Hit eject on the front panel (which automatically turns the unit on). Wait as DVD player initializes and then unnecessarily spins up the disc in the drive to read the disc type and root menu.

Finally it ejects.

It isn't just household devices that show this worrisome trend. The bank recently "upgraded" their ATM machines, bringing a colourful, graphical façade to what was once a glowing-green, ASCII, very serious interface. What once was a quick navigation through the menus now sees the painful redrawing of screens and laggardly keystroke responses.

Seconds Add Up To Minutes Add Up To Hours Add Up To...

It might only be 10 seconds or so from beginning of DVD eject procedure to the actual ejecting of said optical disc, but that's approximately 9.75 seconds more than it needs to be.

When time is short, even small delays like this can be incredibly irritating.

Despite the increasing computational capacity of our devices, the problem seems to be getting worse. User interface responsiveness seems to lie low on the list of priorities in many contemporary electronic devices.

These devices seem to be growing slower and slower, yet the processors that power them are getting dramatically more powerful.

A Supercomputer in Your Hands

Slashdot recently had a story linking to some reviews of a new Windows Mobile 6 smartphone. Several of the comments provided variations of the argument that the primary weakness noted in the review -- poor performance -- was the result of "underpowered" hardware.

Throw some more hardware at it and everything would be okay, the argument goes.

Consider that for a moment: is hardware really the problem? My Moto Q -- a device that often demonstrates terrible responsiveness (I'm not trying to pick on Motorola -- I've noticed the same behaviour with Nokia and Audiovox phones) -- is powered by an Intel XScale PXA272 processor running at 312Mhz. It comes with 64MB of RAM, 128MB of flash memory, and I've supplemented it with 1GB of miniSD flash storage.

Is that an insufficient bit of hardware to manage the awesome tasks of a smartphone with a 320x240 screen?

As a point of historic comparison, in the late 80s I was a proud owner of an Atari 520ST. It was a multimedia powerhouse powered by an 8Mhz Motorola 68000.

Despite what now is a laughably anemic CPU, it seemed infinitely capable at the time: I used it to create complex reports for high school in a full featured desktop publishing app. I did hobbyist software development in a rich IDE on it. I wasted away countless hours trolling local BBSes. It even was a wonderful game platform, running richly challenging games with gusto (games far more advanced than what you often find running in J2ME on your cell phone).

Later I upgraded to the Atari 1040STE, still with the same 8Mhz 68000, but offering expansion capacity to bring it up to a colossal 4MB of memory. This was so much memory that I usually created a memory "disk" out of 3MB of it, and still never felt limited in the 1MB.

Seldom did my ST ever feel laggy or non-responsive -- it booted close to instantly from ROMs, and the simple UI was always extremely responsive. Demo programmers had it doing tricks that still impress me to this day. Later a UNIX-style OS was ported to it, including full pre-emptive multitasking.

So how does that relic of the past compare with something like the Moto Q? Comparing straight Mhz isn't a valid comparison (for instance the ST's 68000 is a CISC processor, versus the RISC XScale), so I went searching and found some Dhrystone 1.1 benchmark numbers for both the XScale at 312Mhz and the 8Mhz 68000.

8Mhz 68000 (Atari ST) - ~1,603 Dhrystones / second

312Mhz XScale PXA272 - ~731,512 Dhrystones / second
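The ratio between those two scores is a one-line division, sketched here in Python. The only inputs are the two Dhrystone figures and clock speeds given above; the per-clock figure is just the benchmark ratio divided by the clock ratio.

```python
# Dhrystone 1.1 scores quoted above (Dhrystones per second).
ATARI_ST_68000 = 1_603
XSCALE_PXA272 = 731_512


def speedup(new_score, old_score):
    """Return how many times faster new_score is than old_score."""
    return new_score / old_score


ratio = speedup(XSCALE_PXA272, ATARI_ST_68000)
clock_ratio = 312 / 8  # 312 MHz XScale versus 8 MHz 68000

print(f"Benchmark ratio: about {ratio:.0f}x")
print(f"Clock ratio: {clock_ratio:.0f}x")
print(f"Gain per clock: about {ratio / clock_ratio:.1f}x")
```

Clock speed alone accounts for only part of the gap; the rest comes from the newer core doing more work per cycle on this benchmark.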

On this benchmark the PXA272 in the tiny little smartphone on my belt (yeah, I'm a nerd) is equal to 456 Atari STs. Let's look at that in a bar graph in case it isn't clear enough.

[Bar graph: 8Mhz 68000 at 1,603 Dhrystones / second versus 312Mhz XScale PXA272 at 731,512 Dhrystones / second]

Wow.

Memory wise the Moto Q has a virtually infinite amount of memory compared to my old Atari ST.

I'm not trying to pretend that the ST of old did what a PDA of today is doing: I remember first getting access to low resolution JPEGs on a local BBS (I was a teenager and they were swimsuit photos...pretty risqué at the time. This stuff was much tamer than an issue of Maxim magazine or an "Umbrella" video), having to go through the tedious process of first "decompressing" them to a TGA, waiting as the decompression processed for sometimes minutes, and then viewing the uncompressed image. There was no way it could realtime do something as complex as rendering a JPEG.

Yet considering this enormous increase in computational power, it does seem evident that many device developers aren't respecting the time of their users, and few users are calling out terrible interfaces for being unresponsive and disrespectful. Reviewers, in particular, seem blind to responsiveness when rating devices, presumably because the artificial environment of a review can't be compared to quickly trying to respond to an email while standing in an airport terminal just as the last boarding call is made.

The Basics of a User Interface

A user interface should be predictable and consistent -- it should always respond in a short, consistent amount of time (I would honestly feel that the PVR would be better if it always took 3 seconds to react to a command, versus now when it's anywhere between 0 and 5 seconds), always allowing the user to cancel operations that they're no longer interested in.

Responding to the user's input should always be job #1.
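
As a rough illustration of that principle, here is a minimal Python sketch of a long-running job that stays cancellable. Everything in it (the function names, the step count, the polling interval) is a hypothetical stand-in rather than code from any real device: the slow work runs on a background thread and checks a cancel flag between small steps, so the thread handling input is never blocked.

```python
import threading


def long_operation(cancel: threading.Event, total_steps: int, result: dict) -> None:
    """Do the slow work in small steps, checking for cancellation between them."""
    result["steps_done"] = 0
    result["completed"] = False
    for _ in range(total_steps):
        # wait() doubles as the simulated unit of work and the cancellation
        # check: it returns True as soon as the user has asked to cancel.
        if cancel.wait(timeout=0.01):
            return
        result["steps_done"] += 1
    result["completed"] = True


def run_and_cancel(total_steps: int = 1000) -> dict:
    """Start the job in the background, then cancel it as a user might."""
    cancel = threading.Event()
    result: dict = {}
    worker = threading.Thread(target=long_operation, args=(cancel, total_steps, result))
    worker.start()
    # The "UI" thread is free at this point, so it can react immediately.
    cancel.set()
    worker.join()
    return result


if __name__ == "__main__":
    print(run_and_cancel())
```

The point of the design is that the cost of cancelling is bounded by one small step, not by the length of the whole operation.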

Tuesday, September 04 2007

A recent article on the utility of multiple cores has been making the rounds. Despite being largely a copy/paste of other articles and graphics, with a smidge of editorial commentary, it is anxiously heralded by dual-core owners as purchase justification in the face of progressing technology.

[As fair disclosure, let me say that I'm about to purchase a quad-core processor based system, and this article and its sources did absolutely nothing to dissuade me from this choice]

The meat of the article (or rather of the articles that it references -- someone else did the dirty, arduous footwork of benchmarking) is a showdown between a 2.4Ghz quad-core and a 3.0Ghz dual-core, which is reasonable given that they're comparable in price [at writing the 3.0Ghz dual-core E6850 can be had for $384 CDN, while the 2.4Ghz quad-core Q6600 is $319 CDN]. Given that many games and applications are effectively single-threaded as a legacy of lowest-common-denominator development, the faster-clocked dual-core processor abstractly takes the lead in such fundamentally synthetic benchmarks for the pricepoint.

Leave aside the questionable "it's good to have one extra core to allow you to kill bad processes" premise. What if those bad processes are multithreaded? Do you just have to buy bad-process-threads+1 cores? Maybe set the affinity such that you've dedicated a core solely to the task manager? In the real world of modern schedulers, the only reason you can't get control of the machine to kill a rogue process is some absolutely atrocious elements of the implementation of Windows, and a scheduler that is effectively broken in some situations. Neither is necessarily improved by more cores. What really gets me about the whole exercise is how utterly synthetic it is, using contrived benchmarks instead of rationally considering how people actually use their PCs, and where their real need for more power comes from.

Firstly, it largely focuses on games benchmarks. Even if gaming performance is pertinent to the reader, for the majority of users playing the majority of games, their video card is far more of a bottleneck than their processor (even if their processor is a dated affair). I'm saying this as a long time computer gamer -- one that finds the stuttering framerate on even top of the line game consoles intolerable: unless you've turned every quality setting to low and you're running at 800x600, it's doubtful that you're going to even measure, much less notice, a difference between a modern 2.4Ghz core and a 3.0Ghz core. Indeed, the very first benchmark I looked at on the referenced article says exactly that: "For this test, we set Oblivion's graphical quality to "Medium" but with HDR lighting enabled and vsync disabled, at 800x600 resolution". They did that to create a scenario where the differences are measurable.

So if you plan to game in a contrived way for the purposes of demonstrating CPU differences in benchmarks, then you'd better pay attention to core speed.

In the real world of gaming, after you've adjusted the quality and resolution settings to appropriate settings for your video card, the primary slowdowns during gaming tend to come about because of external applications rudely stealing your thread quanta: I'm about to toss the grenade into the bunker in Battlefield 2 when suddenly Windows Search has decided that this is a good time to rebuild its index corpus, for instance, so instead it falls to the ground and I take out my entire squad (Seriously, Windows Search guys - when a full-screen DirectX game is running, it probably isn't a good time to decide that the PC is "idle").

For moments like that, more cores make a huge difference. Dual-cores would be sufficient for that simple scenario, but what if my PC is even more active, as it always is? Perhaps the blog updater is running an update, I'm FTPing some files, a download is happening, and I'm gaming.

Every core works towards the ultimate goal of eliminating the real world problem of cycle theft from my hardcore gaming.

Presuming that you've passed a reasonable bar -- long behind you when you're talking about a 2.4Ghz Core 2 -- more cores will realistically improve things for gamers enjoying their vice in the real world. One day we might even have a world where we don't have to shut down services and trawl Task Manager violently killing processes before launching a game, fearful that they will disrupt our immersion.

My second problem with the article is that it doesn't question what people are really waiting for nowadays. Personally I see almost no difference between virtually any mainstream PC for the overwhelming majority of day to day operations (and this is as a developer) -- most activities are so fast the difference is negligible. I just switched laptops from a single-core 1.6Ghz Pentium M to one with a Core 2 Duo T7200 -- a significant improvement -- and from a day to day perspective I've indeed noticed that the new laptop has a better screen, a faster hard drive, and much better graphics, but the computational difference is largely unnoticed.

Until, humorously, I do something that is highly parallelizable, such as encoding a video pulled in from the miniDV video camera. In that case the dual-core processor strides to a massive lead over its single core predecessor. If it were a quad-core, it would storm even further ahead, even with the loss of frequency.

For something that I'm actually waiting on, more cores = more goodness.

I would definitely choose the quad-core processor for the legacy software reality that we have today, despite the many applications that individually fail to exploit the possibility. My conviction is amplified by the tremendous strides that application developers are making to parallelize their products. Once you've parallelized to 2 cores, it's generally a very small step to parallelize to 4 cores, or n cores for that matter.
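
To put rough numbers on the clock-versus-cores tradeoff, here is a small Python sketch based on Amdahl's law, which is a framing added here rather than anything from the benchmarked articles. It assumes throughput scales linearly with clock speed, that both chips do the same work per clock, and that a fixed fraction of the job parallelizes perfectly. None of that holds exactly in practice, and the function names are purely illustrative.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_throughput(clock_ghz: float, cores: int, parallel_fraction: float) -> float:
    """Crude figure of merit: clock speed scaled by the Amdahl speedup."""
    return clock_ghz * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    # Compare the two processors discussed above across a few workloads.
    for p in (0.0, 0.5, 0.9):
        dual = relative_throughput(3.0, 2, p)
        quad = relative_throughput(2.4, 4, p)
        print(f"parallel fraction {p:.1f}: dual-core {dual:.2f}, quad-core {quad:.2f}")
```

Under these assumptions the 3.0Ghz dual-core wins on purely single-threaded work, the two break even when a little over half of the work (about 57%) is parallelizable, and the 2.4Ghz quad-core pulls clearly ahead on something like video encoding.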

Bring on the cores!

Friday, December 21 2007

“What gets measured gets done.”

I decided to take the new SunSpider benchmarks for a spin, generating the pretty graphs found down below. Benchmarks are always entertaining, and it was enjoyable comparing the numbers yielded under various conditions (turning SpeedStep on and off [none of the benchmarks loaded the CPU long enough for it to bother raising up from its lowered power, 66% performance relaxation state], setting CPU affinity, running it on different PCs, trying different build options on my Firefox build, etc.)

"Why benchmark at all?" one might ask. Simple: If you find the right measures, the common wisdom goes, the inputs to the measure will improve as the various players work to improve the metrics.

Whether you’re measuring bugs per developer, lines-of-code, widgets per hour…whatever: Start measuring it and invariably it’ll start moving in the desired direction, whether this actually serves your end goal or not. Often such initiatives come at the cost of the unmeasured, but over time the system adapts and the measure starts serving as beneficial feedback.

The Assembly Line Benchmark - Widgets per Hour

During the summer in my late teens I worked on an assembly line building car parts (pieces that played some sort of role in the air conditioning system – basically widgets): Put a little bracket in a metal cylinder, add a circle of fiberglass, inject some desiccant beads using a machine, add another fiberglass circle and another metal bracket to hold it all in place and then put it in another machine that squashed another cylinder onto the top. Then I sent it down the line to the welder.

Atop my machine sat a little counter that monitored my progress, carefully recording every piece assembled. While this was a less advanced era — being the prehistoric early 90s and all — and I had to manually transfer the final count to my timecard for submission, every worker was kept somewhat honest by the metrics submitted by the other workers on the line.

Clearly I couldn’t have done 2000 parts in a day if the people before me and behind me in line only reported 1000, for instance, and vice versa.

Coupled with continuous, careful QA tests and random inspections (performed by people who had their own metrics to work towards), this struck me as an excellent system because it was difficult or impossible to game, and the onerous checks ensured that it didn’t come at the expense of quality.

It certainly worked wonders on me, as I whiled away the endless summer days performing the most awesomely brainless of tasks by competing with my own personal productivity “records”, endlessly trying to push out more quality parts per hour day after day.

I was there and had nothing better to do, and that little counter sat looking down on me, mocking me. It dared me to do just a couple more pieces per hour, and I willingly complied.

Somewhere a paper pusher and cackling middle manager would sum up the part counts and rub their hands together in giddy glee, eager for my zombie-state quest for worth to pad their bonus cheques.

It’s good I was a summer employee, because my pace didn’t make me friends on the line.

Test Driven Development tries to create a similar spirit of metrics, giving you a goal to strive for as you build out your product. It’s a comforting bit of feedback when all of the TDD tests come back with green checkmarks. The more tests you create, the higher the absolute count of passes you can brag about when the product sails through with flying colors, easily passing 497 of the 497 tests.

Performance benchmarks serve the same purpose for the performance and efficiency domain.

Consider the initial hardware-accelerating video cards for Windows. Early on they seemed to have little or no purpose, and were almost abstract to users. Then benchmarks started appearing, giving the manufacturers something to strive towards while also providing end users with an easy way to compare and choose amongst the options. “Card {A} can only do 10,000,000 accelerated rectangles per second, while card {B} can do 12,000,000. Clearly we need to get card {B} for our rectangle displaying needs.”

Gaming the Metrics

Of course some vendors started gaming the metrics in various creative ways (see Joel's excellent essay on poorly thought out metrics). Several created products that actually recognized running benchmarks in hardware and “optimized” for them by any means possible, including simply discarding many of the benchmark commands, knowing the end user would never notice if every second rectangle or piece of rendered text of the millions per second wasn’t being rendered. Worse, the benchmarks were so atrociously artificial, bearing so little similarity to actual everyday use, that the direction of progress was to optimize the performance of benchmarks, often to the detriment of everyday use.

Eventually the benchmarks matured, getting better and more realistic, the gaming was prevented or embarrassingly publicized, and benchmarking became a hugely beneficial tool in the march forward in the field. Various games have served the benchmarking role, the Doom and then Quake series being the most influential.

In the browser market, the growing interactivity of the web and the renewed competition amongst the big competitors has seen a flurry of benchmarks being widely discussed and debated, stereotyping each of the browsers into performance ghettoes. “Firefox is sooooo slow….” “IE is garbage. Opera is super speedy!”

Having some real tests is of obvious benefit to “set the record straight”, not to mention that it provides a carrot for the competitors to chase. Exactly that happened with me a while back when I came across a string concatenation benchmark, so I went in and streamlined the piece of Firefox code specifically impacted by that benchmark. My change in place, Firefox indeed did much better on that specific benchmark, though the real-world benefit was negligible.

In many ways the various web benchmarks available remind me of the early accelerated video card benchmarks: Crude, having little or no correlation with the pain points of real-world use, and opportune for gaming and false evangelism.

WebKit's SunSpider

Which brings me to the recently released SunSpider benchmark (which is a credible contender for the widely coveted “most poorly chosen project name” award: It’s bad enough that an Apple project uses “Sun” in their product name, but it's thrice as bad when it’s a project related to JavaScript – JavaScript being another nominee).

SunSpider is very easy to run and gives quick feedback, so quite a few charts and graphs have been sprouting on blogs across the land.

JavaScript/DOM performance is a huge concern right now, as web applications are growing in richness by leaps and bounds, so there is definitely a need to be filled.

Will SunSpider be what we've all been looking for?

Here’s just such a graph, charting the stacked benchmark runtime for the current tier-1 browsers for Windows.

SunSpider Benchmark Results

Benchmarks were performed on a 4GB, Q6600 quad-core Core 2 processor machine running Vista x64. Firefox 3 was built from the current CVS (as of this morning). The Y-axis represents milliseconds.

Such a benchmark provides immediate feedback regarding the biggest bang-for-the-buck optimization, at least with regard to improving the runtime of this particular benchmark. For IE 7 it is pretty clear that the benchmark killer is the bizarre and repeated use of string concatenation throughout the benchmark tests, particularly evident in the string-base64 and string-validate benchmarks.

Naive String Concatenations

I spent approximately 20 seconds (okay, maybe 22 seconds) "optimizing" the base64 and validate tests to use the extremely common Array push/join idiom that is used on pretty much any page that does more than the most trivial of string operations. My changes were rash and very simplistic (if I were motivated, if this were production code, I could do a much better job with it), yet the performance changed rather dramatically, as seen in the following graph (scroll up and down for dramatic flair).

SunSpider Benchmark Results
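
For readers who haven't seen the buffering idiom, here is a sketch of the two approaches. It is written in Python rather than JavaScript, so it is an analogue and not the actual change made to the SunSpider tests: a list with append and join stands in for the JavaScript Array. CPython also optimizes some repeated concatenation, so the timings here won't necessarily mirror what IE 7 shows.

```python
import timeit


def build_by_concatenation(pieces):
    """Naive approach: grow one string a piece at a time."""
    out = ""
    for piece in pieces:
        out += piece
    return out


def build_by_join(pieces):
    """Buffered approach: collect the pieces, then join them once."""
    buffer = []
    for piece in pieces:
        buffer.append(piece)
    return "".join(buffer)


if __name__ == "__main__":
    pieces = [str(i) for i in range(10_000)]  # built outside the timed calls
    assert build_by_concatenation(pieces) == build_by_join(pieces)
    for fn in (build_by_concatenation, build_by_join):
        seconds = timeit.timeit(lambda: fn(pieces), number=100)
        print(f"{fn.__name__}: {seconds:.4f}s for 100 runs")
```

Both functions produce the same string. Only the way the intermediate results are held differs, which is exactly the kind of detail a runtime may or may not handle gracefully.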

It's late and I'm tired. I'd guarantee that I could dramatically decrease the runtime of the largest remaining test -- string-tagcloud -- as well, but I think the point is proven.

Some will naturally draw from this the presumption that I'm just an Internet Explorer 7 fan, desperately manipulating the benchmark to best fit the strengths and avoid the weaknesses of my favourite browser.

They'd be wrong.

My browser of choice is Firefox. Not only do I find the featureset of Internet Explorer 7 uncompelling and anemic compared to a naked copy of Firefox (not even considering the enormous functionality offered by add-ins, such as the extraordinary Firebug), I find the performance of Microsoft's offering to be atrocious on real-world websites.

I don't like Internet Explorer on technical grounds, and I like it even less given the concerning conflict of interest it represents.

Perhaps I'm just bearing a grudge.

We're currently implementing a very rich, advanced web application, and one thing that we've found, in case after case, is that in real-world situations with extensive DOM manipulation and production JavaScript, Internet Explorer stumbles and groans under the load, while competing browsers complete the task with gusto (just rendering a dynamically loaded complex table takes 20x or more as long in IE as in Firefox 2. The disparity grows greater with Firefox 3). It's to the point that I can't help but wonder if Microsoft is trying to undermine the whole web thing intentionally, hoping to encourage the middle-grounders to flock to the boards proclaiming the deficiencies of web apps, manipulated into begging for some XAML goodness.

So if I wasn't looking to defend IE7, what was my point?

Lies, Damned Lies, and Benchmarks

Maybe the motivations of the team behind this benchmark were noble, and they weren't blinded into naturally biasing the benchmark towards their own project, but I can't help but see this benchmark as an entirely artificial, naive, unrealistic benchmark that adds little to the benchmarking landscape. A cursory glance through the benchmark reveals bizarre oddities that would never appear in real-world code, and a variety of implementation choices that are questionable for a benchmark. For instance, test/sample data is often constructed within the timed scope of the benchmark in the SunSpider tests, as if a production website needs to create 4000 random email addresses and ZIP codes. Normally such data is constructed outside of the timed loop, for obvious reasons.
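
The difference between the two timing approaches is easy to sketch. The following Python example uses a made-up validation workload (the helper names and the email format are assumptions for illustration, not taken from SunSpider) and times it twice: once with the sample data generated inside the timed region, and once with it generated beforehand.

```python
import random
import string
import time


def make_emails(count, seed=0):
    """Generate throwaway sample data; this is setup, not the work under test."""
    rng = random.Random(seed)
    return [
        "".join(rng.choices(string.ascii_lowercase, k=8)) + "@example.com"
        for _ in range(count)
    ]


def validate(emails):
    """The operation we actually want to measure (deliberately trivial)."""
    return sum(1 for e in emails if "@" in e and e.endswith(".com"))


def timed_including_setup(count):
    """Questionable: data generation is charged to the benchmark."""
    start = time.perf_counter()
    emails = make_emails(count)
    valid = validate(emails)
    return valid, time.perf_counter() - start


def timed_excluding_setup(count):
    """Usual practice: build the data first, then time only the work."""
    emails = make_emails(count)
    start = time.perf_counter()
    valid = validate(emails)
    return valid, time.perf_counter() - start


if __name__ == "__main__":
    for fn in (timed_including_setup, timed_excluding_setup):
        valid, seconds = fn(4000)
        print(f"{fn.__name__}: {valid} valid in {seconds * 1000:.2f} ms")
```

In the first variant the reported time is some mix of random number generation, string building, and validation, so a browser with a slow random number generator would be penalized in a test that is nominally about string handling.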

The lack of weighting, the lack of realistic test scenarios... I'm just not convinced that it holds much utility for cross-browser comparison. I do like the way they have the "driver", and the elegant and clean client-side way they aggregate the test values and do the same for comparison; the framework is a great foundation. I can see use in analyzing performance differences for a single browser (the results of turning Firebug on and off, for instance, were very surprising), just not as a valid comparison between different browsers.

Just as I dramatically changed the IE results in less than a minute of code changes, I'd guarantee that I could do the same with the other outliers (in particular the longer Firefox tests).

I'm still waiting for a good, real-world benchmark. Something that simulates sites like Digg, Slashdot, and Facebook, interacting with them the way a real-world user would.


Dennis Forbes - Dennis Forbes is a Toronto-based software architect and technology writer