Dennis Forbes on Pragmatic Software Development
Monday, April 21 2008

Paul Graham continues his popular series of "How To Get Rich Quick on the Internet. And Fast!" essays with his latest entry, "Be Good", in which he describes some startup attributes common among a sampling of successful internet businesses.

One such pattern, observes Paul, is that his selected success stories started with no real revenue plans beyond "get big and flip it".

No subscriptions. No advertisements. No pop-ups. No interstitials.

They started more like a charity than a real business, just bringing good to their userbase with nothing asked or expected in return. At least at first.

This isn't a new position for Mr. Graham. He has long advocated the idea that you just need to worry about getting the eyeballs and you can figure out how to make money from them later. Or better still — reading between the lines — you can let the sucker you flip the thing to worry about the gritty details of how to monetize it, suffering the consequences if the userbase burns it to the ground in defiance of any revenue generating scheme.

This isn't a surprising position for Paul to support. It is entirely aligned with his micro-VC organization's business model, which is to take young, time- and energy-rich grads, fresh out of the college mill, and bankroll them with a small investment (which they need because they won't be earning anything from their Internet baby anytime soon), and then cash in when/if they manage to flip it to a sucker that still buys into the many eyeballs model (a strict "No Returns" policy in effect).

Given the small investment, only a small percentage of Y Combinator's 'fundees' need to hit the jackpot for the strategy to succeed, at least for Paul. He's playing house odds in this startup casino.

Paul provides some examples to demonstrate his position that the charity-that'll-make-you-rich approach is the winning strategy: Google and Craigslist. Incontestably successful companies, and most would be over the moon to experience a fraction of their success.

Let's take a closer look at these examples, and see how relevant they are to Paul's central theme.

Craigslist

Craig Newmark, the founder of Craigslist, started compiling a list of upcoming local events in 1995, publishing it to subscribers via a listserver. Later, after the list was well established and had a healthy subscriber base, he started publishing entries to a website, adding functionality to allow users to email directly to categorized lists.

Today Craigslist is an internet superstar, constantly ranking among the top 50 websites worldwide. It pulls in impressive revenue numbers through a model that Craig himself describes in this entry on Yahoo! Answers.

Craigslist is a rare survivor among thousands, tens of thousands, or perhaps hundreds of thousands of exceedingly similar lists/classified upstarts, most of them run ad-free and for free (sites were often ad free as a simple side-effect of the barriers to entry to hosting ads pre-2000. At that time it wasn't as simple as signing up for an AdSense account).

Craigslist had the perfect, rare combination of the real-world personal connections of the founder, an ideal starting locality, userbase, and a progressive evolution that allowed it to build enough momentum that eventually the network effect took over, and you'd use craigslist because everyone uses craigslist, at least in some markets.

What You Can Learn From Craigslist: Craigslist is an extreme anomaly. Holding it up as an example of a path to follow is fundamentally akin to analyzing the number picking "strategy" of the latest lottery winner. It's also an excellent example of how corrupting, and falsely compelling, a survivorship bias can be. Countless sites have followed a close to identical path, failing miserably.

Google

Google hit the scene during the portal craze. This was a period when every other search engine, failing to sufficiently profit off of search alone, started merging with a gangly bunch of dance partners. Excite hooking up with @Home, for example. To "leverage the synergy", each quickly morphed into a "destination for all things" portal, fattening up their content until they featured a landing page absolutely packed from margin to margin with text, news, stock quotes, images, comics, horoscopes, etc.

Ads were almost invisible in this era of visual pollution, and were seldom the problem. Many users were using dial-up, so the content inflation wasn't just aesthetically deplorable, but it also made the search process a slow, unpleasant experience.

Sergey and Larry had been working on some algorithms for internet search during this period, and with the dot com boom peaking they managed to pull together an impressive million dollars in financing before even launching their beta website.

When they did finally get something online, they did the absolute minimal amount possible. Later they described the utter simplicity of the first version as a function of their lack of HTML knowledge: It was the best they could do, or cared to do, at the time.

They had differentiated themselves, however inadvertently, and it worked brilliantly. They had copious long-term financing (not "flip it to someone else" financing) before even launching, so they had no need to worry about making money immediately, using the website as a technology demo that would conceivably allow them to sell search technology and services to third parties and businesses.

While the quality of the search results got the Google buzz started, the dial-up bandwidth-friendly simplicity of their offering really won people over. Yet it was a simplicity that came primarily because the company only really had one product — search — and couldn't link to hundreds of other provided services.

When everyone else went heavyweight, the minimalism of Google got it a lot of attention among technology trend makers. That exposure on sites like Slashdot — amplified when the community learned that Google ran Linux — got them their next $25 million in financing, and the rest is history.

With their advertising initiatives Google took exactly the same approach, and when everyone else was using pop-up, high-bandwidth, obnoxious ads, Google zagged and had text ads. As others have adopted Google's text ad approach, Google has started adding in animated and full graphics ads to their docket.

What You Can Learn From Google: If you have an algorithm or technology that is impressive enough to get a million dollars in financing before you've even left the drawing board, maybe you can take lessons from Google beyond "differentiating from the competition is good" (which is fairly obvious advice).

The Cold, Hard Truth

The vast majority of "go big or go home" web ventures will fail. It isn't a meritocracy. Luck has a lot to do with it. There is little you can learn from the success stories unless you also learn from the failures, yet aside from the huge flameouts (which had to have enough success to even be notable), most failures fizzle out and disappear without a trace.

From the opposite angle, many "grow revenue from day one" websites have succeeded admirably (I just gave a local babysitter directory $39 for 3 months of lookups), albeit without the "lottery ticket" quick payoff that a minuscule percentage of the winner-take-all players yield.

Ethics and Morals

It's a risky and disingenuous proposition to build a website, baiting a community, on one model — the "Good" and charitable model — and then switch the userbase once the founder's numbers get drawn. It's a scummy behavior to engage in, much less evangelize. It also might have the opposite of the intended effect: I might not care whether a site like a social link voting site has a revenue model, but when I was considering photo sites I immediately discarded those that followed the Paul Graham business model, considering the risks too high: Either it would eventually be forced to flail about, obnoxiously trying different approaches to making it pay, or it would fold with a "Sorry We Got Bored Our Numbers Didn't Come Up" notice one fateful day. Instead I went for one with a sustainable business model, and haven't been unhappy with my decision (even if it did cost me a bit per year for all of the features).

Bits And Bytes

  • This entry was authored in emacs.
  • A quad-core is definitely your best choice, especially given the huge price drops just announced.
  • I called the whole Riya thing perfectly.
  • The standard for comments in code shouldn't be driven by the need to provide endless guideposts for incompetent programmers. If it describes something that should be obvious by the code, you're fixing the wrong problem (which can be either unclear code, or incompetent programmers, or both).
  • Most developers don't rely upon books anymore because the overwhelming majority of technical books are garbage.
  • Bits and Bytes was a brilliant educational program on TVO in the early 80s, and it is entirely responsible for beginning my love of computer hardware and software.
  • Nassim Nicholas Taleb explores survivorship and confirmation biases excellently in his books The Black Swan and Fooled by Randomness. While I was put off by his ego, and the expansion of a paragraph-worth idea into chapters, they're still great reads.
Tuesday, March 18 2008

Back in the late 1990s, when XML was still in its formative years (a state some would argue continues to this day), XML Spy was a very welcome entrant to the developer tools market, bringing intuitive, GUI-based schema and basic transformation authoring and validation to the developer’s desktop.

While some were productive and happy with just the W3C specs and a copy of emacs, many of us only used XML intermittently, building an export, import or transformation that simply worked, promptly forgetting all of the nuances of DTDs versus XSDs versus XDRs, or the quickly changing XSL(T) specifications.

It was a great step forward in the uptake and quality of XML utilization to have such an easy to use, up-to-date tool.

At the time XML Spy was basically shareware, offering a fully featured 30-day trial, at worst popping up the occasional “please register me!” exhortation.

Many just registered it: it was an easy sell at $54 a user, less if you bought multiple copies. That’s almost disposable money, and was an easy pitch to most managers. It was easy enough saying “let’s get a copy for everyone in the group. Even for the guy in the cube near the washroom, anti-XML rage bursting from his trembling lips in a spray of spittle and phlegm.”

Time went on and we all moved to different projects, divisions and companies, often with long gaps needing little or no in-depth XML. When those instances came up, we’d try to find an old licensed copy, or would download the latest trial, using yet another throwaway email address for the validation.

And XML Spy just kept getting more expensive. The company grew and grew (note that the domain on the original archive.org link above actually expired, and now sits in the hands of a domain ad purveyor), and the dollar signs in their dreams had them imagining, apparently, a day when millions of information workers sat toiling their days away in the pure awesomeness of XML Spy. In emacs-esque form, it had grown more and more functionality, even if many users never used it for anything more than creating and validating schemas and transformations.

By late 2000, the price of XML Spy had inflated to $149 a user. By the end of the next year it hit $399 a user. By late 2006 it was up to $499 a user (at some point dropping the space between XML and Spy, becoming XMLSpy).

As I write this it’s up to $539 a user.

Maybe XMLSpy is developed in a poorly insulated aircraft hangar in Siberia, and thus is strongly impacted by the price of oil?

XMLSpy versus Oil -- Peak XML?

A ten-fold jump in price in about 8 years seems excessive. What was once a wonderfully priced utility is now a considerably expensive development ecosystem. What was once an easy purchase (at one workplace I just paid for it myself rather than deal with the annoyance of a requisition form) is now a difficult to justify expenditure, requiring vendor comparisons, and negotiations with middle managers. When the money handlers are convinced, often it’s just for partial coverage of the development team.

You end up with the “XML guy”, rather than having a team appropriately equipped with a uniform set of tools.

Of course, clearly my complaints are off base. Altova obviously did appropriate research, and they determined that there really are people and groups who’ll happily pay more for an XML editor than they paid for their entire Visual Studio suite.

But come on, Altova – bring back a, err, “Semi-Professional” version – something with XML schema and transformation authoring and validation and nothing more. No grand vision where your product is the center point of a developer’s existence. Put a reasonable price tag on it – like $59 – and I’m sure you’ll get a lot of sales where right now you get none. I realize you probably have lots of big buildings with expensive lights, and layers and layers of bureaucracy to finance, but don’t do it all on the back of a simple little XML utility.

Friday, February 22 2008

Goto Considered Appropriate In Some Cases

One of the most referenced papers in software development has to be Dijkstra's seminal paper titled "Go To Statement Considered Harmful".

Dijkstra didn't actually author the title; it was the creation of an editor, added as the piece made its way into print in an ACM publication. It was changed from its original title of "A case against the goto statement".

While the core essence of the essay is indeed that the goto statement can be harmful, Dijkstra wasn't making an absolute statement (as is commonly claimed, absolutism being a tendency of far too many in this industry), but instead was commenting on the abuse of goto that was occurring in the industry, calling for a sober evaluation of where it is appropriate, but more importantly where it is not.

Nonetheless, the meme was created and has been reused and abused in innumerable Considered Harmful declarations since.

So...how does a C# 3.0 implementation of Fibonacci differ from a C# 2.0 version?

A month or so back the development webosphere was awash with references to Scott Hanselman's excellent blog, all excitedly linking (rel="titillating"?) to his piece titled "The Weekly Source Code 13 - Fibonacci Edition". This was particularly common in the .NET community, with many linkers describing it as an elucidating example of the many advantages of .NET 3.5 / C# 3.0.

I perused the entry, always eager to absorb that sort of information, but found it less than perfect. I withheld critical comment, hoping it would all just blow away.

Then this morning I opened up Visual Studio and happened to notice a link to his entry on the Start page.

Visual Studio 2008 Start Page

Maybe it's been there for a while (the last date is pretty old) and I just didn't notice it before, but the title used on the Start page pushed me over the edge, coercing me to comment.

Recursion Considered Harmful

There are several issues I have with Scott's Fibonacci entry.

First, the C# 2.0 (henceforth I'm dumping the subversion precision on the language versions) version is oddly dumbed down: C# 2 also has the ternary conditional operator, and it even has anonymous functions (including closure functionality). Yet the demonstrations given contrast the simplest possible C# 2 implementation with the most obtuse C# 3 example.

Basically the only novel difference with the C# 3 example is that it uses a lambda, though of course it would be an absolutely terrible thing to use a lambda for.

It's not a very good example of the implementation differences between the versions, which is the claim made by the Visual Studio start page, and was the description often used during the dissemination of this piece.

I like C# 3, but this isn't a good demonstration of any advantage of the language.

Worse yet, the only place you'll ever see recursion used to calculate Fibonacci numbers is in "Recursion for Dummies" type examples. To understand why that is, consider Scott's C# 3 example, which he leads into with the statement "Now, here's a great way using C# 3.0".

Here's a logarithmic-scaled chart of the number of function calls necessary to calculate Fibonacci numbers in the C# 3 example Scott gave.

The Horror!

Obviously it gets unusable pretty quickly. Try calculating the 90th Fibonacci number using recursive algorithms...
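To put rough numbers on that chart, here is a minimal sketch that counts the calls made by a naive doubly recursive Fibonacci. It is written in JavaScript rather than C# purely for illustration, the recursive shape is assumed to match the example under discussion, and `countNaiveCalls` and `projectedCalls` are hypothetical helper names.

```javascript
// Naive doubly recursive Fibonacci, instrumented to count invocations.
// Assumption: the recursion has the usual fib(k-1) + fib(k-2) shape.
function countNaiveCalls(n) {
  let calls = 0;
  function fib(k) {
    calls++;
    return k > 1 ? fib(k - 1) + fib(k - 2) : k;
  }
  const value = fib(n);
  return { value, calls };
}

// The call count obeys calls(n) = 1 + calls(n-1) + calls(n-2), with
// calls(0) = calls(1) = 1, so it can be projected for large n without
// recursing. BigInt is used because the count for n = 90 exceeds the
// range in which a plain Number is exact.
function projectedCalls(n) {
  let prev = 1n; // calls(0)
  let curr = 1n; // calls(1)
  for (let i = 2; i <= n; i++) {
    [prev, curr] = [curr, 1n + prev + curr];
  }
  return curr;
}

for (const n of [10, 20, 30]) {
  const { value, calls } = countNaiveCalls(n);
  console.log(`fib(${n}) = ${value}, calls = ${calls}`);
}
console.log(`projected calls for n = 90: ${projectedCalls(90)}`);
```

The count roughly doubles every step or two, and the projection for n = 90 is on the order of 10^18 calls, which is why actually running the recursive version for that input is not practical.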

In the same way that Goto can be harmful, the use of recursion is often a sign of badness, and this is no exception. Epic inefficiency is used instead of the obviously simple approach.

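long CalcFibonacciNumber(long n)
{
    // F(0) = 0; the loop below assumes n >= 1.
    if (n <= 0) return 0;

    long current = 1, previous = 0, swapholder;

    // Walk forward from F(1), keeping only the last two values.
    while (n-- > 1)
    {
        swapholder = previous;
        previous = current;
        current += swapholder;
    }
    return current;
}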

(Ignoring mathematical shortcuts)

Unrealistic Examples Considered Harmful

A lot of readers will be rolling their eyes right about now, muttering something along the lines of "Awww, come on...you didn't seriously think anyone thought that recursion was a good way to calculate Fibonacci numbers, did you? This is beginner's stuff, and no one really thinks that's the right way to do it!"

I'm optimistic about the profession, so no, I didn't really think it was a serious example (though I do think it nonetheless deserves some serious warnings to ensure no one becomes misled).

WARNING: The Code Contained In This Example Will Rot Your Brain. Never Do Something Like This In Real Life. Don't Let Peers See You Looking At Code Like This. Suspend All Critical Thought While Reading This Piece.

Instead it's a sample of "here's a demonstration of how to do something absolutely terrible — almost felony worthy — in a variety of programming languages....".

This is still a serious problem.

The example given is so very wrong — even if it is what's used in Recursion for Dummies books — that it makes it close to impossible to focus on the actual point being made, even if it had used comparable features of each language to demonstrate how the same task could be accomplished in each.

It reminds me of many early web service tutorials and advocacy pieces: Many used absurd examples like "a web service to add two numbers" (and amazing variations such as subtract two numbers, multiply two numbers, divide two numbers, compute the Log10 of a number, and so on. You get my point — things for which a web service would be entirely unsuited).

Stop it!

Stop with the ridiculous no-one-would-(or rather should)-ever-do-it-this-way examples. They completely undermine their own value.

Surely there are realistic examples that would be more appropriate for demonstrating the advantages of lambdas (recursion {is recursion}; [goto {is recursion}], so there isn't much enlightenment provided there). How about "how to build a rudimentary regular expression parser in a variety of languages", or for a web service "pulling weather data from a remote weather station".

Something that a developer isn't going to have to slog through with their brain fighting them on every line, demanding an explanation for the terrible design or algorithm they're supposed to accept at face value.

Friday, January 04 2008

To allow me to post quick and dirty reviews of hardware that I've used in home and in business — something I've long wanted to do much more frequently, empowering searchers with another user's perspective to help guide their choices — I've set up a separate blog (courtesy of this great new custom blog software). This will allow me to hash out and publish reviews without feeling that it's contrary to the direction or tone of this blog.

This carries on the original intent of my long unchanged mini-reviews page.

First up is a review of the Comstar 500GB USB RJ45 / Ximeta ND USB Netdisk Enclosure

Tuesday, January 01 2008

It’s been a quiet year, blogwise.

It wasn’t unexpected.

Early 2007 saw the addition of my 3rd child (my second boy, giving me three children under five years old). I can’t overstate how much time and work goes into children, especially when they’re this young (every outing of any sort is a campaign of epic proportions). The reward is worth it a trillion times over, but it means that things like blogging tend to get sacrificed on the priority list, and the entries that do happen occur through creative time usage. Such as this one that’s being authored in my son’s room while I wait for him to fall asleep.

Add the fact that I’d switched from running a mostly one-man consulting company (which was yafla’s former purpose, providing business justification for the time spent on every entry – they were gaining eyeballs, clients, PageRank, professional credibility, and so on for consulting and development opportunities) to instead dedicating my time and energies exclusively to one client.

My mental efforts these days are focused on an amazing New York City back-end financial services company, imagining and building the next generation of software for the industry. While it is very rewarding professionally and financially, basic professional courtesy and confidentiality restrictions limit how much I can discuss the discoveries, trials and tribulations of that adventure.

So where does material come from then? Much of the wisdom I had to dole out had been exhausted earlier in the lifecycle of this blog (all bloggers have a finite amount of accumulated wisdom to espouse before they start recycling, or worse outsourcing, content. Beware any blogger that operates on a schedule, because their creation will almost certainly be vacuous or intellectually stolen tripe). I've since accumulated a lot of observations and suggestions about mixing work and family life in this industry, and at some point I need to put virtual pen to virtual paper; however, the low-hanging-fruit material has already been stored in the archives.

When I agree with things I’ve read and seen elsewhere, I don’t see much point in a toss-off “I agree” post, so instead the limited content was usually seemingly negative – where I disagreed with something or someone. Like most people (excluding cults and fan clubs), I’m more motivated by disagreement than agreement, so it inspired the extra effort to find a moment to author a post.

Lacking the opportunity to post normal posts to balance things out, it gives an unsavory “critic” feel to the blog, which was never my intent.

There is hope, however!

With the dawn of a new year, yafla is going to gradually (but immediately) start morphing into something new (this isn’t a new year's resolution or anything of that sort, but was the planned timeline all along). For far too long I’ve sat on the sidelines waiting for the perfect idea for casual development to reuse a relatively well-ranked domain. Realizing that is a self-defeating bar, I’ve decided to go with an imperfect but viable secondary option.

So where is it going? Failing a market segment categorization, let’s just say Slashdot * Digg ^ Wikipedia * Reddit + StumbleUpon + Delicious ^ Blogger. There are a million and one competitors in this space already, but I’m targeting something special, implementing ideas that have clattered around in my head since the early Slashdot era (having one’s carefully crafted, timely submissions rejected by a Slashdot editor was undoubtedly the impetus behind the creation of a lot of the follow-up sites).

In a nutshell, the yafla realization will be quality above quantity, and value above distraction (I have no intent of catering to the crowd using it to avoid work. I want to cater to the people using it to further their pursuits, whatever they are pursuing, not to avoid life.) The differentiator will be the people empowered by the analytics, bringing a special perspective to information coalescence.

It’s going to start terribly and hackishly (a transparent, ultra-agile work in progress), and will probably be ignored and seldom used for a while after inception, but it will at least give me comfort that something is being done with the domain, and will most certainly provide unencumbered source material for the blog.

Let’s see what the new year brings.

Friday, December 21 2007

“What gets measured gets done.”

I decided to take the new SunSpider benchmarks for a spin, generating the pretty graphs found down below. Benchmarks are always entertaining, and it was enjoyable comparing the numbers yielded under various conditions (turning SpeedStep on and off [none of the benchmarks loaded the CPU long enough for it to bother raising up from its lowered power, 66% performance relaxation state], setting CPU affinity, running it on different PCs, trying different build options on my Firefox build, etc.)

"Why benchmark at all?" one might ask. Simple: If you find the right measures, the common wisdom goes, the inputs to the measure will improve as the various players work to improve the metrics.

Whether you’re measuring bugs per developer, lines-of-code, widgets per hour…whatever: Start measuring it and invariably it’ll start moving in the desired direction, whether this actually serves your end goal or not. Often such initiatives come at the cost of the unmeasured, but over time the measure adapts and starts serving as beneficial feedback.

The Assembly Line Benchmark - Widgets per Hour

During the summer in my late teens I worked on an assembly line building car parts (pieces that played some sort of role in the air conditioning system – basically widgets): Put a little bracket in a metal cylinder, add a circle of fiberglass, inject some desiccant beads using a machine, add another fiberglass circle and another metal bracket to hold it all in place and then put it in another machine that squashed another cylinder onto the top. Then I sent it down the line to the welder.

Atop my machine sat a little counter that monitored my progress, carefully recording every piece assembled. While this was a less advanced era — being the prehistoric early 90s and all — and I had to manually transfer the final count to my timecard for submission, every worker was kept somewhat honest by the metrics submitted by the other workers on the line.

Clearly I couldn’t have done 2000 parts in a day if the people before me and behind me in line only reported 1000, for instance, and vice versa.

Coupled with continuous, careful QA tests and random inspections (performed by people who had their own metrics to work towards), this struck me as an excellent system because it was difficult or impossible to game, and the onerous checks ensured that it didn’t come at the expense of quality.

It certainly worked wonders on me, as I whiled away the endless summer days performing the most awesomely brainless of tasks by competing with my own personal productivity “records”, endlessly trying to push out more quality parts per hour day after day.

I was there and had nothing better to do, and that little counter sat looking down on me, mocking me. It dared me to do just a couple more pieces per hour, and I willingly complied.

Somewhere a paper pusher and cackling middle manager would sum up the part counts and rub their hands together in giddy glee, eager for my zombie-state quest for worth to pad their bonus cheques.

It’s good I was a summer employee, because my pace didn’t make me friends on the line.

Test Driven Development tries to create a similar spirit of metrics, giving you a goal to strive for as you build out your product. It’s a comforting bit of feedback when all of the TDD tests come back with green checkmarks. The more tests you create, the higher the absolute count of passes you can brag about when the product sails through with flying colors, easily passing 497 of the 497 tests.

Performance benchmarks serve the same purpose for the performance and efficiency domain.

Consider the initial hardware-accelerating video cards for Windows. Early on they seemed to have little or no purpose, and were almost abstract to users. Then benchmarks started appearing, giving the manufacturers something to strive towards while also providing end users with an easy way to compare and choose amongst the options. “Card {A} can only do 10,000,000 accelerated rectangles per second, while card {B} can do 12,000,000. Clearly we need to get card {B} for our rectangle displaying needs.”

Gaming the Metrics

Of course some vendors started gaming the metrics in various creative ways (see Joel's excellent essay on poorly thought out metrics). Several created products that actually recognized running benchmarks in hardware and “optimized” for them by any means possible, including simply discarding many of the benchmark commands, knowing the end user would never notice if every second rectangle or piece of rendered text, of the millions per second, wasn’t being rendered. Worse, the benchmarks were so atrociously artificial, bearing so little similarity to actual everyday use, that the direction of progress was to optimize the performance of benchmarks, often to the detriment of everyday use.

Eventually the benchmarks matured, getting better and more realistic, and the gaming was prevented or embarrassingly heralded, and benchmarking became a hugely beneficial tool in the march forward in the field. Various games have served the benchmarking role, the Doom and then Quake series being the most influential.

In the browser market, the growing interactivity of the web and the renewed competition amongst the big competitors has seen a flurry of benchmarks being widely discussed and debated, stereotyping each of the browsers into performance ghettoes. “Firefox is sooooo slow….” “IE is garbage. Opera is super speedy!”

Having some real tests is of obvious benefit to “set the record straight”, not to mention that it provides a carrot for the competitors to chase. Exactly that happened with me a while back when I came across a string concatenation benchmark, so I went in and streamlined the piece of Firefox code specifically impacted by that benchmark. My change in place, Firefox indeed did much better on that specific benchmark, though the real-world benefit was negligible.

In many ways the various web benchmarks available remind me of the early accelerated video card benchmarks: Crude, having little or no correlation with the pain points of real-world use, and ripe for gaming and false evangelism.

WebKit's SunSpider

Which brings me to the recently released SunSpider benchmark (which is a credible contender for the widely coveted “most poorly chosen project name” award: It’s bad enough that an Apple project uses “Sun” in their product name, but it's thrice as bad when it’s a project related to JavaScript – JavaScript being another nominee).

SunSpider is very easy to run and gives quick feedback, so quite a few charts and graphs have been sprouting on blogs across the land.

JavaScript/DOM performance is a huge concern right now, as web applications are growing in richness by leaps and bounds, so there is definitely a need to be filled.

Will SunSpider be what we've all been looking for?

Here’s just such a graph, charting the stacked benchmark runtime for the current tier-1 browsers for Windows.

SunSpider Benchmark Results

Benchmarks were performed on a 4GB, Q6600 quad-core Core 2 processor machine running Vista x64. Firefox 3 was built from the current CVS (as of this morning). The Y-axis represents milliseconds.

Such a benchmark provides immediate feedback regarding the biggest bang for the buck optimization, at least with regard to improving the runtime of this particular benchmark. For IE 7 it is pretty clear that the benchmark killer is the bizarre and repeated use of string concatenation throughout the benchmark tests, particularly evident in the string-base64 and string-validate benchmarks.

Naive String Concatenations

I spent approximately 20 seconds (okay, maybe 22 seconds) "optimizing" the base64 and validate tests to use the extremely common Array push/toString idiom, which is used on pretty much any page that does more than the most trivial of string operations. My changes were rash and very simplistic; if I were motivated, and this were production code, I could do a much better job. Even so, the performance changed rather dramatically, as seen in the following graph (scroll up and down for dramatic flair).

SunSpider Benchmark Results
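
For anyone unfamiliar with the idiom, a minimal sketch of the sort of change involved might look like the following. This is an illustration only: the function names and sample data are hypothetical and are not taken from the SunSpider source or from the modified tests. It also assumes join("") as the final step, since calling toString() on an array separates the elements with commas.

```javascript
// Hypothetical illustration; not code from the SunSpider tests.

// Naive approach: grow the result with repeated += concatenation.
function buildByConcat(parts) {
  var result = "";
  for (var i = 0; i < parts.length; i++) {
    result += parts[i];
  }
  return result;
}

// Array idiom: collect the pieces, then join them once at the end.
// join("") is an assumption here; toString() would insert commas.
function buildByJoin(parts) {
  var buffer = [];
  for (var i = 0; i < parts.length; i++) {
    buffer.push(parts[i]);
  }
  return buffer.join("");
}

var sample = ["a", "b", "c"];
var concatenated = buildByConcat(sample);
var joined = buildByJoin(sample);
```

Both functions return the same string. Only the way the intermediate result is accumulated differs, and that is exactly the kind of detail a concatenation-heavy benchmark ends up measuring.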

It's late and I'm tired. I'd guarantee that I could dramatically cut the runtime of the largest remaining test -- string tagcloud -- but I think the point is proven.

Some will naturally draw from this the presumption that I'm just an Internet Explorer 7 fan, desperately manipulating the benchmark to best fit the strengths and avoid the weaknesses of my favourite browser.

They'd be wrong.

My browser of choice is Firefox. Not only do I find the featureset of Internet Explorer 7 uncompelling and anemic compared to a naked copy of Firefox (not even considering the enormous functionality offered by add-ins, such as the extraordinary Firebug), I find the performance of Microsoft's offering to be atrocious on real-world websites.

I don't like Internet Explorer on technical grounds, and I like it even less given the concerning conflict of interest it represents.

Perhaps I'm just bearing a grudge.

We're currently implementing a very rich, advanced web application, and one thing that we've found, in case after case, is that in real-world situations with extensive DOM manipulation and production JavaScript, Internet Explorer stumbles and groans under the load, while competing browsers complete the task with gusto (just rendering a dynamically loaded complex table takes 20x or more as long in IE as in Firefox 2. The disparity grows greater with Firefox 3). It's to the point that I can't help but wonder if Microsoft is trying to undermine the whole web thing intentionally, hoping to encourage the middle-grounders to flock to the boards proclaiming the deficiencies of web apps, manipulated into begging for some XAML goodness.

So if I wasn't looking to defend IE7, what was my point?

Lies, Damned Lies, and Benchmarks

Maybe the motivations of the team behind this benchmark were noble, and they weren't blinded into naturally biasing the benchmark towards their own project, but I can't help but see this benchmark as an entirely artificial, naive, unrealistic benchmark that adds little to the benchmarking landscape. A cursory glance through the benchmark reveals bizarre oddities that would never appear in real-world code, and a variety of implementation choices that are questionable for a benchmark. For instance, test/sample data is often constructed within the timed scope of the benchmark in the SunSpider tests, as if a production website needs to create 4000 random email addresses and ZIP codes. Normally such data is constructed outside of the timed loop, for obvious reasons.
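
To make the timed-scope point concrete, here is a rough sketch of a harness that prepares its sample data before the clock starts. Everything in it is a hypothetical stand-in: the function names, the email pattern, and the way the data is generated are assumptions for illustration, not how SunSpider is actually structured.

```javascript
// Hypothetical harness; names, pattern, and data are illustrative assumptions.

// Build the sample data. This is setup work, not the thing being measured.
function makeSampleData(count) {
  var data = [];
  for (var i = 0; i < count; i++) {
    data.push("user" + i + "@example.com");
  }
  return data;
}

// Stand-in for the work under test: count entries that look like emails.
function validateAll(addresses) {
  var pattern = /^[^@\s]+@[^@\s]+\.[a-z]+$/i;
  var valid = 0;
  for (var i = 0; i < addresses.length; i++) {
    if (pattern.test(addresses[i])) {
      valid++;
    }
  }
  return valid;
}

// Time only the validation; the data is prepared before the clock starts.
function timeValidation(count) {
  var data = makeSampleData(count);    // outside the timed scope
  var start = Date.now();
  var valid = validateAll(data);       // inside the timed scope
  var elapsedMs = Date.now() - start;
  return { valid: valid, elapsedMs: elapsedMs };
}

var run = timeValidation(4000);
```

If makeSampleData were called after start was recorded, the reported time would include the cost of building 4000 strings, which has nothing to do with the operation supposedly being measured.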

The lack of weighting, the lack of realistic test scenarios... I'm just not convinced that it holds much utility for cross-browser comparison. I do like the "driver", and the elegant and clean client-side way they aggregate the test values and do the same for comparison; the framework is a great foundation. I can also see use in analyzing performance differences for a single browser (the results from turning Firebug on and off, for instance, were very surprising), just not as a valid comparison between different browsers.

Just as I dramatically changed the IE results in less than a minute of code changing, I'd guarantee that I could do the same with the other outliers (in particular the longer Firefox tests).

I'm still waiting for a good, real-world benchmark: something that simulates sites like Digg, Slashdot, and Facebook, interacting with them the way a real-world user would.

Friday, December 07 2007

I'd been sitting on the sidelines of the HD-DVD vs. Blu-ray spectacle, reluctant to sink cash into either hardware or media until the dust settled and one victor remained.

I'm hardly alone in this sentiment: No one wants an expensive piece of hardware sitting unused, or a media collection that is only playable on the one TV down in the basement (after reconnecting the derelict player that had been disconnected to free an HDMI port).

Sales had been relatively slow for standalone players.

Instead the most successful uptake of the new formats has been via the Sony PS3 and its built-in Blu-ray player (Sony is the principal backer and beneficiary of Blu-ray), accounting for a hefty percentage of deployed Blu-ray players worldwide, whether their owners know that they're being counted as faithful Blu-ray fans or not. For those who were aware of the feature, I'm sure it helped them justify the purchase to their parents/wives/husbands: "But it's also a next generation DVD player!" (Countless PS2s were sold on the justification that it could double as a somewhat mediocre DVD player).

HD-DVD vs Blu-ray

This vaulted Blu-ray into an early lead considering that Microsoft, despite being an HD-DVD backer, didn't incorporate HD-DVD into the Xbox 360 -- there remains widespread confusion about this -- instead offering it later as an add-on player.

The boards filled with the Blu-ray faithful, hopeful that they could help the format succeed to vindicate their purchase justification.

Since I didn't want a PS3, my motivation to upgrade just wasn't as strong as, say, the desire to move from VHS to DVD had been. While the new formats technically offer improved resolution, and a much better video compression technology (greatly reducing irritants like posterization in dark sections), the improvement isn't dramatic compared to standard DVD run through a competent upscaling DVD player. Audio has theoretically improved on the new formats, but given the sparse availability of DTS-encoded movies on DVD media -- DTS being the higher quality alternative to Dolby Digital -- the audio capabilities of DVD were barely exploited at all already, so I don't expect much real improvement with the new formats, beyond looking better on paper.

The interactivity features of the platform have improved (even seemingly trivial things like accessing the chapter guide while the movie continues to play, the chapter guide translucently overlaid), but until the media makers start fully leveraging it, and unless you are the sort to draw a lot of value from the extras, that isn't a major selling point. DVD was a huge convenience win over VHS, with random access and no be-kind-rewind demands, but the new formats are just minor improvements over what we already have.

Which brings me to my recent desire to buy an upscaling DVD player, a unit that interpolated more elegantly to HDTV resolutions.

Then I came across a Toshiba HD-DVD HD-A3 player for less than $200 (with 2 free HD-DVDs in the box, and another 5 via a mail in form).

Toshiba HD-A3

So I picked sides, and chose HD-DVD. I've thus declared fealty to the format, and will now order the loyal minion t-shirt and ballcap, and debate the point passionately whenever the opportunity arises!

My purchase justification goes as follows:

  • It's cheap. Really cheap. Comes with a couple of movies as well, which is nice. It doesn't have every feature, but it's a good start. [For instance it's missing 1080p, but that didn't dissuade me: 1080p happens to be one of the most oversold, misrepresented features trumpeted today. I'd much rather have 1080/24]
  • We'll still be largely watching traditional DVDs, with the odd HD-DVD rented from zip.ca. I'm not going to cry any tears that some companies have been bribed or coerced into only supporting one format or the other -- I'll just go with the standard DVD option.
  • It's a really, really good upscaling DVD player, so I'm completely satisfied with the purchase even without playing next-generation media. Even if Blu-ray were victorious and tossed the HD-DVD consortium into the dustbin of history, I wouldn't have purchase regret (I would feel quite differently if I had bought one of the $700 players).
  • The storage differential between the two formats is irrelevant for the prescribed use. While the greater capacity of Blu-ray is a win if you want it as a backup format for a PC, it isn't pertinent for a 4 hour VC1-encoded motion picture with top quality audio. Indicative of this is the fact that quite a few Blu-ray releases have been encoded with the vastly inferior MPEG2 codec, wasting the extra space on an obsolete compression technology.
  • In a few years, this will all be moot anyways, as streaming technology and capacity improve. Ultimately these are holdover technologies.

I still don't plan on amassing a media collection, but I have been enjoying the higher quality rentals -- when a given release is available on HD-DVD -- for just a small premium over a decent upscaling DVD player.


Dennis Forbes - Dennis Forbes is a Toronto-based software architect and technology writer