Dennis Forbes on Pragmatic Software Development
http://www.yafla.com/dforbes/
Sun, 18 May 2008 12:37:40 GMT

Being "Bad" -or- The Startup Lottery Ticket
http://www.yafla.com/dforbes/Be_Bad_Or_The_Startup_Lottery_Ticket/
Mon, 21 Apr 2008 22:22:00 GMT

Paul Graham continues his popular series of "How To Get Rich Quick on the Internet. And Fast!" essays with his latest entry, "Be Good", in which he describes some startup attributes common among a sampling of successful internet businesses.

One such pattern, observes Paul, is that his selected success stories started with no real revenue plans beyond "get big and flip it".

No subscriptions. No advertisements. No pop-ups. No interstitials.

They started more like a charity than a real business, just bringing good to their userbase with nothing asked or expected in return. At least at first.

This isn't a new position for Mr. Graham. He has long advocated the idea that you just need to worry about getting the eyeballs and you can figure out how to make money from them later. Or better still — reading between the lines — you can let the sucker you flip the thing to worry about the gritty details of how to monetize it, suffering the consequences if the userbase burns it to the ground in defiance of any revenue generating scheme.

This isn't a surprising position for Paul to support. It is entirely aligned with his micro-VC organization's business model, which is to take young, time- and energy-rich grads, fresh out of the college mill, and bankroll them with a small investment (which they need because they won't be earning anything from their Internet baby anytime soon), and then cash in when/if they manage to flip it to a sucker that still buys into the many-eyeballs model (a strict "No Returns" policy in effect).

Given the small investment, only a small percentage of Y Combinator's "fundees" need to hit the jackpot for the strategy to succeed, at least for Paul. He's playing house odds in this startup casino.

Paul provides some examples to demonstrate his position that the charity-that'll-make-you-rich approach is the winning strategy: Google and Craigslist. Incontestably successful companies, and most would be over the moon to experience a fraction of their success.

Let's take a closer look at these examples, and see how relevant they are to Paul's central theme.

Craigslist

Craig Newmark, the founder of Craigslist, started compiling a list of upcoming local events in 1995, publishing it to subscribers via a listserver. Later, after the list was well established and had a healthy subscriber base, he started publishing entries to a website, adding functionality to allow users to email directly to categorized lists.

Today Craigslist is an internet superstar, consistently ranking among the top 50 websites worldwide. It pulls in impressive revenue numbers through a model that Craig himself describes in this entry on Yahoo! Answers.

Craigslist is a rare survivor among thousands, tens of thousands, or perhaps hundreds of thousands of exceedingly similar list and classified upstarts, most of them run ad-free and for free. (Sites were often ad-free simply because of the barriers to hosting ads before 2000; at that time it wasn't as simple as signing up for an AdSense account.)

Craigslist had the perfect, rare combination of the real-world personal connections of the founder, an ideal starting locality, userbase, and a progressive evolution that allowed it to build enough momentum that eventually the network effect took over, and you'd use Craigslist because everyone uses Craigslist, at least in some markets.

What You Can Learn From Craigslist: Craigslist is an extreme anomaly. Holding it up as an example of a path to follow is fundamentally akin to analyzing the number-picking "strategy" of the latest lottery winner. It's also an excellent example of how corrupting, and falsely compelling, survivorship bias can be. Countless sites have followed a nearly identical path and failed miserably.

Google

Google hit the scene during the portal craze. This was a period when every other search engine, failing to sufficiently profit off of search alone, started merging with a gangly bunch of dance partners. Excite hooking up with @Home, for example. To "leverage the synergy", each quickly morphed into a "destination for all things" portal, fattening up their content until they featured a landing page absolutely packed from margin to margin with text, news, stock quotes, images, comics, horoscopes, etc.

Ads were almost invisible in this era of visual pollution, and were seldom the problem. Many users were using dial-up, so the content inflation wasn't just aesthetically deplorable, but it also made the search process a slow, unpleasant experience.

Sergey and Larry had been working on some algorithms for internet search during this period, and with the dot com boom peaking they managed to pull together an impressive million dollars in financing before even launching their beta website.

When they did finally get something online, they did the absolute minimal amount possible. Later they described the utter simplicity of the first version as a function of their lack of HTML knowledge: It was the best they could do, or cared to do, at the time.

They had differentiated themselves, however inadvertently, and it worked brilliantly. They had copious long-term financing (not "flip it to someone else" financing) before even launching, so they had no need to worry about making money immediately, using the website as a technology demo that would conceivably allow them to sell search technology and services to third parties and businesses.

While the quality of the search results got the Google buzz started, the dial-up bandwidth-friendly simplicity of their offering really won people over. Yet it was a simplicity that came primarily because the company only really had one product — search — and couldn't link to hundreds of other provided services.

When everyone else went heavyweight, the minimalism of Google got it a lot of attention among technology trend makers. That exposure on sites like Slashdot — amplified when the community learned that Google ran Linux — got them their next $25 million in financing, and the rest is history.

With their advertising initiatives Google took exactly the same approach, and when everyone else was using pop-up, high-bandwidth, obnoxious ads, Google zagged and had text ads. As others have adopted Google's text ad approach, Google has started adding in animated and full graphics ads to their docket.

What You Can Learn From Google: If you have an algorithm or technology that is impressive enough to get a million dollars in financing before you've even left the drawing board, maybe you can take lessons from Google beyond "differentiating from the competition is good" (which is fairly obvious advice).

The Cold, Hard Truth

The vast majority of "go big or go home" web ventures will fail. It isn't a meritocracy. Luck has a lot to do with it. There is little you can learn from the success stories unless you also learn from the failures, yet aside from the huge flameouts (which had to have enough success to even be notable), most failures fizzle out and disappear without a trace.

From the opposite angle, many "grow revenue from day one" websites have succeeded admirably (I just gave a local babysitter directory $39 for 3 months of lookups), albeit without the "lottery ticket" quick payoff that a minuscule percentage of the winner-take-all players yield.

Ethics and Morals

It's a risky and disingenuous proposition to build a website, baiting a community, on one model (the "Good" and charitable model) and then pull a switch on the userbase once the founder's numbers get drawn. It's scummy behavior to engage in, much less evangelize. It also might have the opposite effect from the one intended: I might not care whether a site like a social link voting site has a revenue model, but when I was considering photo sites I immediately discarded those that followed the Paul Graham business model, considering the risks too high. Either such a site would eventually be forced to flail about, obnoxiously trying different approaches to making it pay, or it would fold with a "Sorry We Got Bored Our Numbers Didn't Come Up" notice one fateful day. Instead I went for one with a sustainable business model, and haven't been unhappy with my decision (even if it did cost me a bit per year for all of the features).

Bits And Bytes

  • This entry was authored in emacs.
  • A quad-core is definitely your best choice, especially given the huge price drops just announced.
  • I called the whole Riya thing perfectly.
  • The standard for comments in code shouldn't be driven by the need to provide endless guideposts for incompetent programmers. If a comment describes something that should be obvious from the code, you're fixing the wrong problem (which can be either unclear code, or incompetent programmers, or both).
  • Most developers don't rely upon books anymore because the overwhelming majority of technical books are garbage.
  • Bits and Bytes was a brilliant educational program on TVO in the early 80s, and it is entirely responsible for beginning my love of computer hardware and software.
  • Nassim Nicholas Taleb explores survivorship and confirmation biases excellently in his books The Black Swan and Fooled by Randomness. While I was put off by his ego, and by the expansion of a paragraph's worth of idea into chapters, they're still great reads.

The Hyperinflation of XMLSpy
http://www.yafla.com/dforbes/The_Hyperinflation_Of_XMLSpy/
Tue, 18 Mar 2008 12:00:00 GMT

Back in the late 1990s, when XML was still in its formative years (a state some would argue continues to this day), XML Spy was a very welcome entrant to the developer tools market, bringing intuitive, GUI-based schema and basic transformation authoring and validation to the developer’s desktop.

While some were productive and happy with just the W3C specs and a copy of emacs, many of us only used XML intermittently, building an export, import or transformation that simply worked, promptly forgetting all of the nuances of DTDs versus XSDs versus XDRs, or the quickly changing XSL(T) specifications.

It was a great step forward in the uptake and quality of XML utilization to have such an easy to use, up-to-date tool.

At the time XML Spy was basically shareware, offering a fully featured 30-day trial, at worst popping up the occasional “please register me!” exhortation.

Many just registered it: it was an easy sell at $54 a user, less if you bought multiple copies. That’s almost disposable money, and was an easy pitch to most managers. It was easy enough saying “let’s get a copy for everyone in the group. Even for the guy in the cube near the washroom, anti-XML rage bursting from his trembling lips in a spray of spittle and phlegm.”

Time went on and we all moved to different projects, divisions and companies, often with long gaps needing little or no in-depth XML. When those instances came up, we’d try to find an old licensed copy, or would download the latest trial, using yet another throwaway email address for the validation.

And XML Spy just kept getting more expensive. The company grew and grew (note that the domain on the original archive.org link above actually expired, and now sits in the hands of a domain ad purveyor), and the dollar signs in their dreams apparently had them imagining a day when millions of information workers would toil their days away in the pure awesomeness of XML Spy. In emacs-esque form, it had grown more and more functionality, even if many users never used it for anything more than creating and validating schemas and transformations.

By late 2000, the price of XML Spy had inflated to $149 a user. By the end of the next year it hit $399 a user. By late 2006 it was up to $499 a user (at some point dropping the space between XML and Spy, becoming XMLSpy).

As I write this it’s up to $539 a user.

Maybe XMLSpy is developed in a poorly insulated aircraft hangar in Siberia, and thus is strongly impacted by the price of oil?

XMLSpy versus Oil -- Peak XML?

A ten-fold jump in price in about 8 years seems excessive. What was once a wonderfully priced utility is now a considerably expensive development ecosystem. What was once an easy purchase (at one workplace I just paid for it myself rather than deal with the annoyance of a requisition form) is now a difficult to justify expenditure, requiring vendor comparisons, and negotiations with middle managers. When the money handlers are convinced, often it’s just for partial coverage of the development team.
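For a rough sense of what that works out to per year, here is a small sketch in JavaScript. The function name `annualGrowthRate` is made up for illustration, and the inputs are simply the $54 and $539 figures quoted above spread over an assumed eight years, so treat the result as a back-of-the-envelope number rather than a precise one.

```javascript
// Hypothetical helper: compound annual growth rate between two prices.
function annualGrowthRate(startPrice, endPrice, years) {
  return Math.pow(endPrice / startPrice, 1 / years) - 1;
}

// Figures quoted in the post; the eight-year span is an approximation.
const rate = annualGrowthRate(54, 539, 8);
console.log((rate * 100).toFixed(1) + "% per year");
```

Under those assumptions the price compounded at roughly a third more every year, which is a brisk pace for an XML editor.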

You end up with the “XML guy”, rather than having a team appropriately equipped with a uniform set of tools.

Of course, clearly my complaints are off base. Altova obviously did appropriate research, and they determined that there really are people and groups who’ll happily pay more for an XML editor than they paid for their entire Visual Studio suite.

But come on, Altova – bring back a, err, “Semi-Professional” version – something with XML schema and transformation authoring and validation and nothing more. No grand vision where your product is the center point of a developer’s existence. Put a reasonable price tag on it – like $59 – and I’m sure you’ll get a lot of sales where right now you get none. I realize you probably have lots of big buildings with expensive lights, and layers and layers of bureaucracy to finance, but don’t do it all on the back of a simple little XML utility.

Fibonacci Numbers, Recursion, And Terrible Examples
http://www.yafla.com/dforbes/Fibonacci_Numbers_Recursion_And_Terrible_Examples/
Fri, 22 Feb 2008 12:00:00 GMT

Goto Considered Appropriate In Some Cases

One of the most referenced papers in software development has to be Dijkstra's seminal "Go To Statement Considered Harmful".

Dijkstra didn't actually author the title; it was the creation of an editor as the piece made its way into print in an ACM publication. It was changed from its original title of "A Case Against the Go To Statement".

While the core essence of the essay is indeed that the goto statement can be harmful, Dijkstra wasn't making an absolute statement (as is commonly claimed; absolutism is a tendency of far too many in this industry), but instead was commenting on the abuse of goto that was occurring in the industry, calling for a sober evaluation of where it is appropriate and, more importantly, where it is not.

Nonetheless, the meme was created and has been reused and abused in innumerable Considered Harmful declarations since.

So...how does a C# 3.0 implementation of Fibonacci differ from a C# 2.0 version?

A month or so back the development webosphere was awash with references to Scott Hanselman's excellent blog, all excitedly linking (rel="titillating"?) to his piece titled "The Weekly Source Code 13 - Fibonacci Edition". This was particularly common in the .NET community, with many linkers describing it as an elucidating example of the many advantages of .NET 3.5 / C# 3.0.

I perused the entry, always eager to absorb that sort of information, but found it less than perfect. I withheld critical comment, hoping it would all just blow away.

Then this morning I opened up Visual Studio and happened to notice a link to his entry on the Start page.

Visual Studio 2008 Start Page

Maybe it's been there for a while (the last date is pretty old) and I just didn't notice it before, but the title used on the Start page pushed me over the edge, coercing me to comment.

Recursion Considered Harmful

There are several issues I have with Scott's Fibonacci entry.

First, the C# 2.0 (henceforth I'm dumping the subversion precision on the language versions) version is oddly dumbed down: C# 2 also has the ternary operator, and it even has anonymous methods (including closure functionality). Yet the demonstrations given contrast the simplest possible C# 2 implementation with the most obtuse C# 3 example.

Basically the only novel difference with the C# 3 example is that it uses a lambda, though of course it would be an absolutely terrible thing to use a lambda for.

It's not a very good example of the implementation differences between the versions, which is the claim made by the Visual Studio start page, and was the description often used during the dissemination of this piece.

I like C# 3, but this isn't a good demonstration of any advantage of the language.

Worse yet, the only place you'll ever see recursion used to calculate Fibonacci numbers is in "Recursion for Dummies" type examples. To understand why that is, consider Scott's C# 3 example, which he leads into with the statement "Now, here's a great way using C# 3.0".

Here's a logarithmic-scaled chart of the number of function calls necessary to calculate Fibonacci numbers in the C# 3 example Scott gave.

The Horror!

Obviously it gets unusable pretty quickly. Try calculating the 90th Fibonacci number using recursive algorithms...
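To put numbers on that, here is a minimal sketch (in JavaScript, so it can be pasted into any browser console) that counts the calls made by the textbook doubly recursive definition. It assumes the usual base case of returning n when n < 2, and the names `fibRecursive`, `countCalls` and `predictedCalls` are illustrative rather than taken from the original example.

```javascript
// Naive doubly recursive Fibonacci, instrumented to count every call.
let calls = 0;
function fibRecursive(n) {
  calls++;
  return n > 1 ? fibRecursive(n - 1) + fibRecursive(n - 2) : n;
}

// Runs the recursion for n and reports how many calls it took.
function countCalls(n) {
  calls = 0;
  fibRecursive(n);
  return calls;
}

// The call count follows its own recurrence, which works out to
// 2 * F(n + 1) - 1. BigInt keeps the large values exact.
function predictedCalls(n) {
  let previous = 0n; // F(0)
  let current = 1n;  // F(1)
  for (let i = 0; i < n; i++) {
    [previous, current] = [current, previous + current];
  }
  // After n steps, current holds F(n + 1).
  return 2n * current - 1n;
}

console.log(countCalls(20));               // measured by actually recursing
console.log(predictedCalls(20).toString()); // same figure, without recursing
console.log(predictedCalls(90).toString()); // far too many calls to measure directly
```

Under those assumptions the 20th number already costs 21,891 calls, and the 90th would need more than 9 × 10^18 of them, so the only practical way to learn the count for n = 90 is to predict it.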

In the same way that Goto can be harmful, the use of recursion is often a sign of badness, and this is no exception. Epic inefficiency is used instead of the obviously simple approach.

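long CalcFibonacciNumber(long n)
{
    // Treat F(0) as 0; the loop below handles n >= 1.
    if (n <= 0) return 0;

    long current = 1, previous = 0, swapholder;

    // Each pass advances the (previous, current) pair one step along the sequence.
    while (n-- > 1)
    {
        swapholder = previous;
        previous = current;
        current += swapholder;
    }

    return current; // overflows a 64-bit long beyond n = 92
}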

(Ignoring mathematical shortcuts)

Unrealistic Examples Considered Harmful

A lot of readers will be rolling their eyes right about now, muttering something along the lines of "Awww, come on...you didn't seriously think anyone thought that recursion was a good way to calculate Fibonacci numbers, did you? This is beginner's stuff, and no one really thinks that's the right way to do it!"

I'm optimistic about the profession, so no, I didn't really think it was a serious example (though I do think it nonetheless deserves some serious warnings to ensure no one becomes misled).

WARNING: The Code Contained In This Example Will Rot Your Brain. Never Do Something Like This In Real Life. Don't Let Peers See You Looking At Code Like This. Suspend All Critical Thought While Reading This Piece.

Instead it's a sample of "here's a demonstration of how to do something absolutely terrible — almost felony worthy — in a variety of programming languages....".

This is still a serious problem.

The example given is so very wrong — even if it is what's used in Recursion for Dummies books — that it makes it close to impossible to focus on the actual point being made, even if it had used comparable features of each language to demonstrate how the same task could be accomplished in each.

It reminds me of many early web service tutorials and advocacy pieces: Many used absurd examples like "a web service to add two numbers" (and amazing variations such as subtract two numbers, multiply two numbers, divide two numbers, compute the Log10 of a number, and so on. You get my point — things for which a web service would be entirely unsuited).

Stop it!

Stop with the ridiculous no-one-would-(or rather should)-ever-do-it-this-way examples. It completely undermines the value of the examples.

Surely there are realistic examples that would be more appropriate for demonstrating the advantages of lambdas (recursion {is recursion}; [goto {is recursion}], so there isn't much enlightenment provided there). How about "how to build a rudimentary regular expression parser in a variety of languages", or for a web service "pulling weather data from a remote weather station".

Something that a developer isn't going to have to slog through with their brain fighting them on every line, demanding an explanation for the terrible design or algorithm they're supposed to accept at face value.

Hardware Reviews
http://www.yafla.com/dforbes/Hardware_Reviews/
Fri, 04 Jan 2008 00:39:00 GMT

To allow me to post quick and dirty reviews of hardware that I've used at home and in business (something I've long wanted to do much more frequently, empowering searchers with another user's perspective to help guide their choices), I've set up a separate blog (courtesy of this great new custom blog software). This will allow me to hash out and publish reviews without feeling that it's contrary to the direction or tone of this blog.

This carries on the original intent of my long unchanged mini-reviews page.

First up is a review of the Comstar 500GB USB RJ45 / Ximeta ND USB Netdisk Enclosure

2007 An Introspective Blog Year In Review
http://www.yafla.com/dforbes/2007_An_Introspective_Blog_Year_In_Review/
Tue, 01 Jan 2008 22:00:00 GMT

It’s been a quiet year, blogwise.

It wasn’t unexpected.

Early 2007 saw the addition of my third child (my second boy, giving me three children under five years old). I can’t overstate how much time and work goes into children, especially when they’re this young (every outing of any sort is a campaign of epic proportions). The reward is worth it a trillion times over, but it means that things like blogging tend to get sacrificed on the priority list, and the entries that do happen occur through creative time usage, such as this one, which is being authored in my son’s room while I wait for him to fall asleep.

Add the fact that I’d switched from running a mostly one-man consulting company (which was yafla’s former purpose, providing business justification for the time spent on every entry – they were gaining eyeballs, clients, PageRank, professional credibility, and so on for consulting and development opportunities) to instead dedicating my time and energies exclusively to one client.

My mental efforts these days are focused on an amazing New York City back-end financial services company, imagining and building the next generation of software for the industry. While it is very rewarding professionally and financially, basic professional courtesy and confidentiality restrictions limit how much I can discuss the discoveries, trials and tribulations of that adventure.

So where does material come from then? Much of the wisdom I had to dole out had been exhausted earlier in the lifecycle of this blog. (All bloggers have a finite amount of accumulated wisdom to espouse before they start recycling or, worse, outsourcing content. Beware any blogger that operates on a schedule, because their creation will almost certainly be vacuous or intellectually stolen tripe.) I've since accumulated a lot of observations and suggestions about mixing work and family life in this industry, and at some point I need to put virtual pen to virtual paper; however, the low-hanging-fruit material has already been stored in the archives.

When I agree with things I’ve read and seen elsewhere, I don’t see much point in a toss-off “I agree” post, so instead the limited content was usually seemingly negative – where I disagreed with something or someone. Like most people (excluding cults and fan clubs), I’m more motivated by disagreement than agreement, so it inspired the extra effort to find a moment to author a post.

Lacking the opportunity to post normal posts to balance things out, it gives an unsavory “critic” feel to the blog, which was never my intent.

There is hope, however!

With the dawn of a new year, yafla is going to gradually (but immediately) start morphing into something new (this isn’t a new year's resolution or anything of that sort, but was the planned timeline all along). For far too long I’ve sat on the sidelines waiting for the perfect idea for casual development to reuse a relatively well-ranked domain. Realizing that is a self-defeating bar, I’ve decided to go with an imperfect but viable secondary option.

So where is it going? Failing a market segment categorization, let’s just say Slashdot * Digg ^ Wikipedia * Reddit + StumbleUpon + Delicious ^ Blogger. There are a million and one competitors in this space already, but I’m targeting something special, implementing ideas that have clattered around in my head since the early Slashdot era (having one’s carefully crafted, timely submissions rejected by a Slashdot editor was undoubtedly the impetus behind the creation of a lot of the follow-up sites).

In a nutshell, the yafla realization will be quality above quantity, and value above distraction. (I have no intent of catering to the crowd using it to avoid work. I want to cater to the people using it to further their pursuits, whatever they are pursuing, not to avoid life.) The differentiator will be the people empowered by the analytics, bringing a special perspective to information coalescence.

It’s going to start terribly and hackishly (a transparent, ultra-agile work in progress), and will probably be ignored and seldom used for a while after inception, but it will at least give me comfort that something is being done with the domain, and will most certainly provide unencumbered source material for the blog.

Let’s see what the new year brings.

Benchmark Driven Development - Lies, Damn Lies, and Benchmarks
http://www.yafla.com/dforbes/Benchmark_Driven_Development/
Fri, 21 Dec 2007 08:00:00 GMT

“What gets measured gets done.”

I decided to take the new SunSpider benchmarks for a spin, generating the pretty graphs found down below. Benchmarks are always entertaining, and it was enjoyable comparing the numbers yielded under various conditions (turning SpeedStep on and off [none of the benchmarks loaded the CPU long enough for it to bother raising up from its lowered power, 66% performance relaxation state], setting CPU affinity, running it on different PCs, trying different build options on my Firefox build, etc.)

"Why benchmark at all?" one might ask. Simple: If you find the right measures, the common wisdom goes, the inputs to the measure will improve as the various players work to improve the metrics.

Whether you’re measuring bugs per developer, lines of code, widgets per hour…whatever: start measuring it and invariably it’ll start moving in the desired direction, whether this actually serves your end goal or not. Often such initiatives come at the cost of the unmeasured, but over time the measurement adapts and starts serving as beneficial feedback.

The Assembly Line Benchmark - Widgets per Hour

During the summer in my late teens I worked on an assembly line building car parts (pieces that played some sort of role in the air conditioning system – basically widgets): Put a little bracket in a metal cylinder, add a circle of fiberglass, inject some desiccant beads using a machine, add another fiberglass circle and another metal bracket to hold it all in place and then put it in another machine that squashed another cylinder onto the top. Then I sent it down the line to the welder.

Atop my machine sat a little counter that monitored my progress, carefully recording every piece assembled. While this was a less advanced era — being the prehistoric early 90s and all — and I had to manually transfer the final count to my timecard for submission, every worker was kept somewhat honest by the metrics submitted by the other workers on the line.

Clearly I couldn’t have done 2000 parts in a day if the people before me and behind me in line only reported 1000, for instance, and vice versa.

Coupled with continuous, careful QA tests and random inspections (performed by people who had their own metrics to work towards), this struck me as an excellent system because it was difficult or impossible to game, and the onerous checks ensured that it didn’t come at the expense of quality.

It certainly worked wonders on me, as I whiled away the endless summer days performing the most awesomely brainless of tasks by competing with my own personal productivity “records”, endlessly trying to push out more quality parts per hour day after day.

I was there and had nothing better to do, and that little counter sat looking down on me, mocking me. It dared me to do just a couple more pieces per hour, and I willingly complied.

Somewhere a paper pusher and cackling middle manager would sum up the part counts and rub their hands together in giddy glee, eager for my zombie-state quest for worth to pad their bonus cheques.

It’s good I was a summer employee, because my pace didn’t make me friends on the line.

Test Driven Development tries to create a similar spirit of metrics, giving you a goal to strive for as you build out your product. It’s a comforting bit of feedback when all of the TDD tests come back with green checkmarks. The more tests you create, the higher the absolute count of passes you can brag about when the product sails through with flying colors, easily passing 497 of the 497 tests.

Performance benchmarks serve the same purpose for the performance and efficiency domain.

Consider the initial hardware-accelerating video cards for Windows. Early on they seemed to have little or no purpose, and were almost abstract to users. Then benchmarks started appearing, giving the manufacturers something to strive towards while also providing end users with an easy way to compare and choose amongst the options. “Card {A} can only do 10,000,000 accelerated rectangles per second, while card {B} can do 12,000,000. Clearly we need to get card {B} for our rectangle displaying needs.”

Gaming the Metrics

Of course some vendors started gaming the metrics in various creative ways (see Joel's excellent essay on poorly thought out metrics). Several created products that actually recognized running benchmarks in hardware and “optimized” for them by any means possible, including simply discarding many of the benchmark commands, knowing the end user would never notice if every second rectangle or piece of rendered text among the millions per second wasn’t being drawn. Worse, the benchmarks were so atrociously artificial, bearing so little similarity to actual everyday use, that the direction of progress was to optimize the performance of benchmarks, often to the detriment of everyday use.

Eventually the benchmarks matured, getting better and more realistic, the gaming was prevented or embarrassingly publicized, and benchmarking became a hugely beneficial tool in the march forward in the field. Various games have served the benchmarking role, the Doom and then Quake series being the most influential.

In the browser market, the growing interactivity of the web and the renewed competition amongst the big competitors has seen a flurry of benchmarks being widely discussed and debated, stereotyping each of the browsers into performance ghettoes. “Firefox is sooooo slow….” “IE is garbage. Opera is super speedy!”

Having some real tests is of obvious benefit to “set the record straight”, not to mention that it provides a carrot for the competitors to chase. Exactly that happened with me a while back when I came across a string concatenation benchmark, so I went in and streamlined the piece of Firefox code specifically impacted by that benchmark. My change in place, Firefox indeed did much better on that specific benchmark, though the real-world benefit was negligible.

In many ways the various web benchmarks available remind me of the early accelerated video card benchmarks: crude, having little or no correlation with the pain points of real-world use, and ripe for gaming and false evangelism.

WebKit's SunSpider

Which brings me to the recently released SunSpider benchmark (which is a credible contender for the widely coveted “most poorly chosen project name” award: It’s bad enough that an Apple project uses “Sun” in their product name, but it's thrice as bad when it’s a project related to JavaScript – JavaScript being another nominee).

SunSpider is very easy to run and gives quick feedback, so quite a few charts and graphs have been sprouting on blogs across the land.

JavaScript/DOM performance is a huge concern right now, as web applications are growing in richness by leaps and bounds, so there is definitely a need to be filled.

Will SunSpider be what we've all been looking for?

Here’s just such a graph, charting the stacked benchmark runtime for the current tier-1 browsers for Windows.

SunSpider Benchmark Results

Benchmarks were performed on a 4GB, Q6600 quad-core Core 2 processor machine running Vista x64. Firefox 3 was built from the current CVS (as of this morning). The Y-axis represents milliseconds.

Such a benchmark provides immediate feedback regarding the biggest bang for the buck optimization, at least with regard to improving the runtime of this particular benchmark. For IE 7 it is pretty clear that the benchmark killer is the bizarre and repeated use of string concatenation throughout the benchmark tests, particularly evident in the string-base64 and string-validate benchmarks.

Naive String Concatenations

After approximately 20 seconds (okay, maybe 22 seconds) of "optimizing" the base64 and validate tests to use the extremely common Array push/toString idiom, which is used on pretty much any page that does more than the most trivial of string operations, the performance had changed rather dramatically, as seen in the following graph (scroll up and down for dramatic flair). My changes were rash and very simplistic; if I were motivated -- if this were production code -- I could do a much better job with it.
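For readers who haven't run into it, the array-buffer idiom looks roughly like the sketch below. The function names and sample data are made up for illustration and are not taken from SunSpider. Note that the pieces are joined with join(""), since calling toString() on an array would insert commas between them.

```javascript
// Naive approach: each += may force older engines to allocate and copy
// a new, longer string on every iteration.
function buildNaive(parts) {
  var result = "";
  for (var i = 0; i < parts.length; i++) {
    result += parts[i];
  }
  return result;
}

// Buffer approach: collect the pieces in an array and join them once.
function buildWithArray(parts) {
  var buffer = [];
  for (var i = 0; i < parts.length; i++) {
    buffer.push(parts[i]);
  }
  return buffer.join("");
}

// Hypothetical sample input, purely for demonstration.
var sample = ["dGVz", "dA==", "\n"];
console.log(buildNaive(sample) === buildWithArray(sample)); // true
```

Both functions produce identical output; only the number of intermediate strings differs, which is where slower string implementations tend to lose their time.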

SunSpider Benchmark Results

It's late and I'm tired. I'd guarantee that I could dramatically decrease the runtime of the largest remaining test -- string tagcloud -- but I think the point is proven.

Some will naturally draw from this the presumption that I'm just an Internet Explorer 7 fan, desperately manipulating the benchmark to best fit the strengths and avoid the weaknesses of my favourite browser.

They'd be wrong.

My browser of choice is Firefox. Not only do I find the featureset of Internet Explorer 7 uncompelling and anemic compared to a naked copy of Firefox (not even considering the enormous functionality offered by add-ins, such as the extraordinary Firebug), I find the performance of Microsoft's offering to be atrocious on real-world websites.

I don't like Internet Explorer on technical grounds, and I like it even less given the concerning conflict of interest it represents.

Perhaps I'm just bearing a grudge.

We're currently implementing a very rich, advanced web application, and one thing that we've found, in case after case, is that in real-world situations with extensive DOM manipulation and production JavaScript, Internet Explorer stumbles and groans under the load, while competing browsers complete the task with gusto. Just rendering a dynamically loaded complex table takes 20x or more as long in IE as in Firefox 2, and the disparity grows greater with Firefox 3. It's to the point that I can't help but wonder if Microsoft is trying to undermine the whole web thing intentionally, hoping to encourage the middle-grounders to flock to the boards proclaiming the deficiencies of web apps, manipulated into begging for some XAML goodness.

So if I wasn't looking to defend IE7, what was my point?

Lies, Damned Lies, and Benchmarks

Maybe the motivations of the team behind this benchmark were noble, and they weren't blinded into naturally biasing the benchmark towards their own project, but I can't help but see this as an entirely artificial, naive, unrealistic benchmark that adds little to the benchmarking landscape. A cursory glance through the benchmark reveals bizarre oddities that would never appear in real-world code, and a variety of implementation choices that are questionable for a benchmark. For instance, test/sample data is often constructed within the timed scope of the SunSpider tests, as if a production website needs to create 4000 random email addresses and ZIP codes. Normally such data is constructed outside of the timed loop, for obvious reasons.
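To illustrate the point about timed scope, here is a minimal sketch of a harness that builds its sample data before the clock starts. The helper names (makeEmails, timeIt), the validation regex, and the deterministic fake data are assumptions made up for illustration, not SunSpider code.

```javascript
// Build the sample data up front so its cost is never measured.
function makeEmails(count) {
  var emails = [];
  for (var i = 0; i < count; i++) {
    emails.push("user" + i + "@example.com");
  }
  return emails;
}

// Time only the work under test; returns elapsed milliseconds.
function timeIt(work) {
  var start = Date.now();
  work();
  return Date.now() - start;
}

var emails = makeEmails(4000); // setup: outside the timed scope
var valid = 0;
var elapsed = timeIt(function () {
  // Hypothetical validation step standing in for the real test body.
  for (var i = 0; i < emails.length; i++) {
    if (/^[^@\s]+@[^@\s]+\.[a-z]+$/.test(emails[i])) {
      valid++;
    }
  }
});
console.log(valid + " addresses validated in " + elapsed + " ms");
```

With the data generation moved out, the reported time reflects only the validation loop, which is the thing a page would actually be doing.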

The lack of weighting, the lack of realistic test scenarios... I'm just not convinced that it holds much utility for cross-browser comparison. I do like the "driver", and the elegant and clean client-side way they aggregate the test values and do the same for comparison; the framework is a great foundation. I can also see use in analyzing performance differences for a single browser (the results of turning Firebug on and off, for instance, were very surprising), just not as a valid comparison between different browsers.

Just as I dramatically changed the IE results in less than a minute of code changing, I'd guarantee that I could do the same with the other outliers (in particular the longer Firefox tests).

I'm still waiting for a good, real-world benchmark. Something that simulates sites like Digg, Slashdot, and Facebook, interacting with them in the way that a real-world user really would.

]]>
Picking A Side In The HD-DVD vs Blu-ray War http://www.yafla.com/dforbes/Picking_A_Side_In_The_HD-DVD_vs_Blu-ray_War/ http://www.yafla.com/dforbes/Picking_A_Side_In_The_HD-DVD_vs_Blu-ray_War/ Fri, 07 Dec 2007 17:09:00 GMT I'd been sitting on the sidelines of the HD-DVD vs. Blu-ray spectacle, reluctant to sink cash into either hardware or media until the dust settled and one victor remained.

I'm hardly alone in this sentiment: No one wants an expensive piece of hardware sitting unused, or a media collection that is only playable on the one TV down in the basement (after reconnecting the derelict player that had been disconnected to free an HDMI port).

Sales had been relatively slow for standalone players.

Instead the most successful uptake of the new formats has been via the Sony PS3 and its built-in Blu-ray player (Sony is the principal backer and beneficiary of Blu-ray), accounting for a hefty percentage of deployed Blu-ray players worldwide, whether their owners know that they're being counted as faithful Blu-ray fans or not. For those who were aware of the feature, I'm sure it helped them justify the purchase to their parents/wives/husbands: "But it's also a next generation DVD player!" (Countless PS2s were sold on the justification that it could double as a somewhat mediocre DVD player).

HD-DVD vs Blu-ray

This vaulted Blu-ray into an early lead considering that Microsoft, despite being an HD-DVD backer, didn't incorporate HD-DVD into the Xbox 360 -- there remains widespread confusion about this -- instead offering it later as an add-on player.

The boards filled with the Blu-ray faithful, hopeful that they could help the format succeed to vindicate their purchase justification.

Since I didn't want a PS3, my motivation to upgrade just wasn't as strong as it was for, say, the move from VHS to DVD. While the new formats technically offer improved resolution, and a much better video compression technology (greatly reducing irritants like posterization in dark sections), the improvement isn't dramatic compared to standard DVD run through a competent upscaling DVD player. Audio has theoretically improved on the new formats, but given the sparse availability of DTS-encoded movies on DVD media -- DTS being the higher quality alternative to Dolby Digital -- the audio capabilities of DVD were barely being exploited already, so I don't expect much real improvement with the new formats, beyond looking better on paper.

The interactivity features of the platform have improved (even seemingly trivial things like accessing the chapter guide while the movie continues to play, the chapter guide translucently overlaid), but until the media makers start fully leveraging it, and unless you are the sort to draw a lot of value from the extras, that isn't a major selling point. DVD was a huge convenience win over VHS, with random access and no be-kind-rewind demands, but the new formats are just minor improvements over what we already have.

Which brings me to my recent desire to buy an upscaling DVD player, one that interpolated more elegantly to HDTV resolutions.

Then I came across a Toshiba HD-DVD HD-A3 player for less than $200 (with 2 free HD-DVDs in the box, and another 5 via a mail-in form).

Toshiba HD-A3

So I picked sides, and chose HD-DVD. I've thus declared fealty to the format, and will now order the loyal minion t-shirt and ballcap, and debate the point passionately whenever the opportunity arises!

My purchase justification goes as follows:

  • It's cheap. Really cheap. It comes with a couple of movies as well, which is nice. It doesn't have every feature, but it's a good start. [For instance it's missing 1080p, but that didn't dissuade me: 1080p happens to be one of the most oversold, misrepresented features trumpeted today. I'd much rather have 1080/24]
  • We'll still be largely watching traditional DVDs, with the odd HD-DVD rented from zip.ca. I'm not going to cry any tears that some companies have been bribed or coerced into only supporting one format or the other -- I'll just go with the standard DVD option.
  • It's a really, really good upscaling DVD player, so I'm completely satisfied with the purchase even without playing next-generation media. If Blu-ray were victorious and tossed the HD-DVD consortium into the dustbin of history, I wouldn't have purchase regret (I would feel quite differently if I had bought one of the $700 players).
  • The storage differential between the two formats is irrelevant for the prescribed use. While the greater capacity of Blu-ray is a win if you want it as a backup format for a PC, it isn't pertinent for a 4-hour VC1-encoded motion picture with top quality audio. Indicative of this is the fact that quite a few Blu-ray releases have been encoded with the vastly inferior MPEG2 codec, wasting the extra space on an obsolete compression technology.
  • In a few years, this will all be moot anyways, as streaming technology and capacity improve. Ultimately these are holdover technologies.

I still don't plan on amassing a media collection, but I have been enjoying the higher quality rentals -- when a given release is available on HD-DVD -- for just a small premium over a decent upscaling DVD player.

]]>
Mainstream Gaming Goes Multicore http://www.yafla.com/dforbes/Mainstream_Gaming_Goes_Multicore/ http://www.yafla.com/dforbes/Mainstream_Gaming_Goes_Multicore/ Fri, 16 Nov 2007 16:45:42 GMT The just released first-person shooter Call of Duty 4 hit store shelves a couple of weeks ago, targeting a variety of platforms, surprisingly including the PC. I grabbed a copy to supplement my Battlefield 2 outings -- I still enjoy the odd late-night frag-fest to ease me gently into sleep.

I was pleased to find it priced at under $50 at my local electronics superstore: 20 years ago, terrible games for a ColecoVision cost $70-$80, so it's amazing to pay less for such an incredible product. It's even more amazing after factoring in 20 years of inflation.

CoD4 is a remarkable feat of software engineering, demonstrating significant technical excellence. As a business-class software developer, it's a bit humbling. It's pretty clear that the team is extraordinarily capable. The game is riddled with remarkable details that are easy to miss, yet so impressive that you're surprised the developers didn't stop the action and highlight them to make sure the player savoured every laboriously created effect.

Probably worth mentioning that it also happens to be a lot of fun to play!

Featuring unparalleled visuals, incredibly intense combat, an interesting and immersive storyline, and a lot of gameplay variety, the single player game has been a pleasant surprise. I usually skip the often trite single-player games in first-person shooters, heading directly online for some multiplayer combat. For this one the single-player game has captured every moment of the few chances I get to game.

It isn't a perfect game, though. Like many games in the genre, the player is tightly corralled along a heavily orchestrated path (standing in stark contrast with even dated games like Operation Flashpoint, which allowed the player to basically do whatever they wanted within the confines of the island). The game features invisible triggers that cause predictable outcomes: When I pass this line the two guys will appear behind that door, and then I'll move right and the guy will run out at the top and the five guys will march in the door at the end of the hall.

It does randomize to a small degree, but I can't imagine that there's a lot of replayability.

One noteworthy attribute of the PC version -- what inspired me to write about it on here given the various entries about quad-cores over the past while -- is its ability to scale out across multiple cores. The following is a graph of CPU usage on my otherwise unladen quad-core PC.

Call of Duty 4 Multicore CPU Usage

The drop-off at the end was the period of time while it switched out to the desktop so I could take a screenshot of the task manager. While actual gameplay was underway, the application consumed about 75% of 3 cores at times, peaking at up to 75% on the fourth core, using this to apply the physics model used extensively throughout the game. Further evidence of the technical excellence is the fact that they didn't just spin out threads for each core and then busy wait for the duration (which would have pegged each at 100%), as many games sadly do (Netscape Navigator's download manager used to do that). Instead they perform some finite amount of calculation and then properly yield to any other tasks on the system that might have work to do.
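The difference between spinning and yielding can be sketched in a few lines. This is only an illustration of the general pattern, not how the game is implemented: the PhysicsWorker name, the per-tick budget, and the job queue are all hypothetical, and JavaScript's timers stand in for real worker threads.

```javascript
// A worker that performs a bounded amount of work per tick and then
// returns, instead of spinning in a loop waiting for more work.
function PhysicsWorker(budgetPerTick) {
  this.budgetPerTick = budgetPerTick;
  this.queue = [];
  this.completed = 0;
}

PhysicsWorker.prototype.submit = function (job) {
  this.queue.push(job);
};

// Runs at most budgetPerTick jobs, then hands control back to the caller.
// Returns true if work remains for a later tick.
PhysicsWorker.prototype.tick = function () {
  var done = 0;
  while (done < this.budgetPerTick && this.queue.length > 0) {
    var job = this.queue.shift();
    job();
    done++;
    this.completed++;
  }
  return this.queue.length > 0;
};

// Drive the worker cooperatively: after each tick, yield to the scheduler
// (here the event loop) rather than busy waiting.
function runCooperatively(worker, onIdle) {
  if (worker.tick()) {
    setTimeout(function () { runCooperatively(worker, onIdle); }, 0);
  } else if (onIdle) {
    onIdle();
  }
}
```

A busy-waiting version would replace the setTimeout with a tight loop that polls the queue, which is exactly what pins a core at 100% even when there is nothing to do.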

A brilliant game, making great use of modern hardware.

]]>
Virtuous Vista - Taking the x64 Plunge http://www.yafla.com/dforbes/Virtuous_Vista_Taking_The_x64_Plunge/ http://www.yafla.com/dforbes/Virtuous_Vista_Taking_The_x64_Plunge/ Mon, 12 Nov 2007 18:00:00 GMT Some recent hardware upgrades left one of my PCs in a fragile state: While I had managed to avoid the time sucking irritation of an OS reinstall over generations of hardware changes, the latest hardware swaps -- or rather wholesale migrations -- pushed the crufty stack of refusing-to-vacate legacy drivers and confused minibuses over the edge.

It had become a fragile platform, with frequent crashes and unexplained behaviour.

Not surprising, given that the OS install had seen the platform go through a myriad of transitions, including from a VIA chipset to an nforce chipset to a VIA chipset and then back to an nforce chipset; from nvidia to ATI to nvidia graphics.

Really it had dealt with change remarkably well, at least relative to the tradition of Windows being tightly bound to the installation hardware, with the slightest deviation demanding a reinstall.

Like many users, I'd been avoiding Vista. Actually that isn't entirely true: It wasn't that I was avoiding Vista, but rather that the benefits it offered just weren't compelling. Not compelling enough to reinstall a working system, or even to deal with the inevitable "the dev team just had to move stuff around to leave their fingerprint" search for traditional features.

When Flip 3D is a highlighted feature, you know that things can't be that great.

Nonetheless, given that a rebuild was in order anyways, and given that a professional in this field should generally ride the leading edge of technology (though nothing like the days of old -- once upon a time we had to eagerly consume every Microsoft beta as a necessity for job security, whereas nowadays technologies like WPF are a complete and utter non-issue in the general tech industry), I figured I'd force myself to embrace Vista, warts and all.

I'd eat my industry's dog food.

To make things more interesting, I decided to go full bore and install the x64 version of Vista. I had a fully 64-bit capable processor, and despite only having 2GB of RAM (with two slots eagerly waiting to be populated), it would nonetheless allow me to start playing around with the extended x64 instructions and registers.

x64 is much, much more than just bigger pointers, so I wanted to play around with the new features in my favourite development environment (like many such transitions, it will take a few years before the benefits of x64 are fully realized, though in the .NET world the 64-bit runtime immediately takes advantage of the new features for platform neutral assemblies).

The install and transition went very well. While I had to disable UAC quickly to make the experience marginally pleasant (I'd experienced Vista enough in dual-boots and virtual machines to quickly learn to despise this security "feature"), generally it just went well.

On the modern nforce4-based motherboard (featuring onboard sound and networking, all drivers installed by the single package of nvidia drivers), with a Q6600 quad-core processor and contemporary nvidia video card, everything simply worked after running a couple of driver packages. Even my dual-layer DVD burner just worked. All of the attached Canon printers, including the multifunction with faxing and scanning, auto-installed and immediately worked. Visual Studio 2005 worked great with the Vista patch, and while it's a 32-bit app it capably generates great 64-bit builds. Awesome!

Now I had IIS 7 to play around with, getting ready for the same in the server sphere. The integrated pipeline and web.config based configuration really is a wonderful step forward for the platform.

Several things didn't work. CameraWindow won't work with my Canon Digital Rebel XT, though by switching to PictBridge on the camera I found that Vista itself did a great job downloading and organizing the photos (and Windows Photo Gallery is a universe better than the terrible "Picture and Fax Viewer", not just for navigating and organizing photos, but more so because the venerable XP viewer did a horrible job resizing photos for display at monitor resolutions, managing to make great shots look terrible). The bundled burning software that came with my DVD drive wouldn't install, but I was delighted to find that Vista actually had pretty decent burning capabilities built in, competently writing 9GB backups to my DL discs.

I hadn't really exploited the benefits of the 64-bit address space at this point, but then I booted up a copy of Battlefield 2 for some late night mental-diversion fragging.

Imagine my surprise when this game that ran very well on a vastly less powerful 1GB XP system now yielded frequent stutters while significant memory paging occurred. Despite having twice the RAM to draw from in the rebuilt machine (if you add in the system baseline, it really had about 2.7x as much RAM to draw from), somehow it was now hitting the ceiling on a machine that was almost completely dedicated to it.

So I grabbed 2GB more PC6400 DDR2 from a local retailer, plugged it in, and with 4GB everything is running pretty smoothly. Another $120 to upgrade the memory to have the same experience that I had with 1/4 the memory in XP.

Vista was hardly a surprise to me given that I'd been playing with it in various forms since the Longhorn days, and the technical feature list is entirely underwhelming compared to the early vaporware promises. However, it isn't the travesty many are making it out to be. While it definitely needs work, it isn't going anywhere, and Microsoft isn't going to roll back to XP.

If you have modern, supported hardware, Vista x64 is a solid choice.

]]>
The Templates Behind The New Blog Engine http://www.yafla.com/dforbes/The_Templates_Behind_The_New_Blog_Engine/ http://www.yafla.com/dforbes/The_Templates_Behind_The_New_Blog_Engine/ Tue, 06 Nov 2007 12:00:00 GMT I recently opted to throw together my own blog software (after going through the standard Build or Buy analysis), expediting deployment as a means of forcing follow-thru. The goals of this micro-project were to improve the authoring and content management experience, to improve searchability of the content (without having to cast content out from the blog to a static form), and to improve the usability and navigation from the user's perspective (for instance the classic "date" navigator common on most blogs is something that I've opted to remove).

Despite having close to no time to allocate to this task, my tendency to over-engineer still showed through: The easiest option would have been a content-management system defined entirely in code (it's as easy for me to change and deploy code as it is to change templates and metadata), and of course to build it for a single author. Instead it supports many blogs through the same URLRedirector, as well as blog aggregations (where a blog is a publication of a set of blogs, each with distinct authors), each using its own templates and configurations.

Which brings me to templates. I failed to find a decent Smarty-type templating system for .NET (basic ASPX is really a templating system, but I'm speaking more of something that can enumerate sections, retrieving data based upon an object structure of relationships and containment).

So I had to build a basic templating system, yielding the templates that follow. The first is for HTML output--

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta>
<title>{#blog.Title} {#docTitle}</title>
<link rel="stylesheet" type="text/css" media="screen, projection" 
   href="http://www.yafla.com/dforbes/style/css/blog.css"></link>
<script type="text/javascript" src="http://www.haloscan.com/load/dforbes"> </script>
</head>
<body>
<div class="clsHeader">
<div class="clsBlogHeader"><a href="{#blog.BaseUrl}">{#blog.Title}</a></div>
<div class="clsSubheader">{#blog.Description}</div>
</div>
<div class="clsBody">
{foreach $entry in $entries}
<div class="clsEntry">
   <div class="clsDate">{#entry.EntryContent.PublishDateUTC|dddd, MMMM dd yyyy}</div>
   <div class="clsTitle"><a href="{#entry.Permalink}">{#entry.EntryContent.EntryTitle}</a></div>
   <div class="clsBody">{#entry.EntryContent.EntryContent}</div>
   <div class="clsKeywords">{foreach $keyword in $entry.EntryKeywords}&nbsp;
      <a href="{#blog.BaseUrl}{#keyword.KeywordText|escape}">{#keyword.KeywordText}</a>&nbsp;{/foreach}
   </div>
   <div class="clsPermalink">
      <a href="javascript:HaloScan('{#entry.MappingId}');" target="_self">
      <script type="text/javascript">postCount('{#entry.MappingId}'); </script></a>&nbsp;
      <a href="{#entry.Permalink}">permalink</a>
   </div>
{foreach $relatedentry in $entry.RelatedEntries}
{ifcond $LoopFirst = "True"}
<center>
<div class="clsRelatedEntries">
Related Entries
{/ifcond}
<div class="clsRelatedEntry">
   <a href="{#relatedentry.Permalink}">{#relatedentry.EntryContent.EntryTitle}</a>
</div>
{ifcond $LoopLast = "True"}
</div>
</center>
{/ifcond}
{/foreach}
</div>
{/foreach}
<div class="clsAdBlock">
{#adBlockHorizontal}
</div>
<div class="clsNavigator">
   <span class="clsNavigateEarlier">{#moveEarlier}</span><span class="clsNavigateLater">{#moveLater}
   </span>
</div>
</div>
<br/>
<div class="clsAttribution">
   <a href="mailto:{#entry.EntryContent.ContentAuthor.EmailAddress}">
      {#entry.EntryContent.ContentAuthor.Name}
   </a> - 
   {#entry.EntryContent.ContentAuthor.Description}
</div>
</body>
</html>

The next template is for RSS consumers--

<rss version="2.0">
  <channel>
    <title>{#blog.Title|escape}</title>
    <link>{#blog.BaseUrl}</link>
    <description>{#blog.Description|escape}</description>
    <lastBuildDate>{#buildDate|r}</lastBuildDate>
    <language>en-us</language>
		{foreach $entry in $entries}
		<item>
		  <title>{#entry.EntryContent.EntryTitle|escape}</title>
		  <link>{#entry.Permalink}</link>
		  <guid>{#entry.Permalink}</guid>
		  <pubDate>{#entry.EntryContent.PublishDateUTC|r}</pubDate>
		  <description><![CDATA[{#entry.EntryContent.EntryContent}]]></description>
		</item>
		{/foreach}
  </channel>
</rss>

All in all, I think it works pretty well, and I can successfully run the W3C validations on the vast majority of generated pages and get the comforting green checkmark.
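For a rough idea of how placeholders like {#blog.Title|escape} can be resolved, here is a minimal sketch in JavaScript. It is not the engine behind this blog (which runs on .NET): the renderTemplate and lookup functions are hypothetical, the escape modifier is assumed to mean XML escaping, other modifiers such as date formats are ignored, and foreach/ifcond blocks are left out entirely.

```javascript
// Walk a dotted path such as "entry.EntryContent.EntryTitle" through a
// nested object, returning "" when any step is missing.
function lookup(context, path) {
  var parts = path.split(".");
  var value = context;
  for (var i = 0; i < parts.length; i++) {
    if (value === null || value === undefined) {
      return "";
    }
    value = value[parts[i]];
  }
  return value === null || value === undefined ? "" : String(value);
}

// Escape the characters that are unsafe in XML/HTML text and attributes.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Replace every {#path} or {#path|modifier} placeholder in the template.
function renderTemplate(template, context) {
  return template.replace(/\{#([\w.]+)(?:\|([^}]*))?\}/g, function (match, path, modifier) {
    var value = lookup(context, path);
    return modifier === "escape" ? escapeXml(value) : value;
  });
}

// Hypothetical data, for demonstration only.
var page = renderTemplate(
  "<title>{#blog.Title|escape}</title><link>{#blog.BaseUrl}</link>",
  { blog: { Title: "Q & A", BaseUrl: "http://www.example.com/blog/" } }
);
console.log(page);
// <title>Q &amp; A</title><link>http://www.example.com/blog/</link>
```

Resolving paths against a plain object graph is what lets one template serve both the HTML and RSS outputs: only the markup around the placeholders changes.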

]]>
The New Blog Software Goes Live http://www.yafla.com/dforbes/The_New_Blog_Software_Goes_Live/ http://www.yafla.com/dforbes/The_New_Blog_Software_Goes_Live/ Tue, 30 Oct 2007 06:01:44 GMT My original foray into the land of blogging was delayed while I stumbled towards the goal of building my own blogging software: like many software developers, I have a sometimes irrational desire to build it myself rather than admit “defeat” and use one of the many (and in the realm of blogging, there are many) available products.

I took a couple of stabs at building it myself originally, but due to another common foible – a tendency to over-engineer (I couldn’t simply write some blog software to post and publish my own thoughts. No…it had to be a full multi-author aggregation and collaboration suite, meaning that weeks went by while I mentally debated the database model for such a machination) – it just never seemed to get finished.

Other priorities always trumped it, and the little time I did allot towards this goal saw me solving absurd edge conditions.

I finally set a deadline for myself, and when I couldn’t find the time to finish anything before my marker (billable hours always came first), I went and bought a copy of Radio Userland and started publishing content the blog way.

That worked well enough for a while, but Radio Userland is a venerable publishing tool that is really showing its age. Authoring to it is a less than pleasant experience – which has been a huge contributor towards the dearth of content (it’s always a bit of a roll of the dice to see which characters it randomly replaces in posts, or which carefully authored HTML blocks it’s decided to mangle) – and simple tasks like cross-linking posts (e.g. a “related posts” sidebar to allow users to easily see follow-ups) were just far too manual to be worth the bother.

Now that I have a powerful, fully dedicated server, it’s also grossly under-featured for users, making the experience of consuming and navigating through the information far less usable than it should be.

So I’ve gone and built my own blogging software, this time quickly bringing it to a sort of beta release.

Given that this is the venue through which I will publicize a ton of changes elsewhere on the site, I really considered this a roadblock on the critical path to the release of other web application functionality elsewhere on yafla.

With some focus, it took only a couple of hours this time, mostly accomplished while putting my toddler son to bed over the weekend. It was so ridiculously quick and easy that I kick myself for not having done it sooner.

I’m extremely pleased about the functionality built out (hey it isn't rocket science, and definitely falls within the realm of "trivial", but there's lots of little "gotchas" with software like this), though most of the kudos go towards .NET 2 and SQL Server 2005: A couple of tools that make short work of what would once have been an enormous task, bringing a robust, secure, high performance web application to a usable stage in less time than it takes to watch the Lord of the Rings trilogy.

Right now you’ll probably notice that the HTML version of the blog looks absolutely terrible. That is somewhat by design (or rather an intentional time compromise)…momentarily. I’m working on the template (it is, of course, driven by parameterized templates), and wanted to force myself to follow through by deploying (perhaps prematurely).

So what are the features of the blog software?

Well, firstly I migrated 100% of the old content over (including metadata such as categorization), running it all through Tidy first to try to make it a little more XHTML legitimate. Using an identifier mapping structure, every single link to the legacy content still works (which was important to me: I didn’t want to give link followers the frustrating “We moved everything so have fun trying to find it” 404 experience).

Everything works via URL remapping, and for now I’ve set it to redirect from old links to the new links where possible. E.g. http://www.yafla.com/dforbes/categories/softwareDevelopment/2005/09/28.html redirects to http://www.yafla.com/dforbes/Clean_Code. All new entries will follow that more transparent and obvious structure.
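The redirect side of this can be pictured as a simple lookup from legacy paths to new ones. The sketch below is an assumption about the general shape, not the actual implementation: the legacyMap table and resolveRedirect function are hypothetical, the 301 status is assumed, and the single entry is the example pair given above.

```javascript
// Hypothetical mapping table from legacy paths to their new locations.
var legacyMap = {
  "/dforbes/categories/softwareDevelopment/2005/09/28.html": "/dforbes/Clean_Code"
};

// Returns a redirect descriptor for a known legacy path, or null so the
// caller can fall through to normal routing. The 301 status is an assumption.
function resolveRedirect(path) {
  if (Object.prototype.hasOwnProperty.call(legacyMap, path)) {
    return { status: 301, location: legacyMap[path] };
  }
  return null;
}

console.log(resolveRedirect("/dforbes/categories/softwareDevelopment/2005/09/28.html"));
```

Keeping the mapping as data rather than code means every migrated entry gets its redirect for free as part of the import.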

But the URLs aren’t limited to just single documents – All entries in June of 2006 can be accessed via http://www.yafla.com/dforbes/2006/06. Add in a category and you can refine further – http://www.yafla.com/dforbes/2006/06/.NET (or http://www.yafla.com/dforbes/.NET/2006/06. Whatever makes you happy).

Want that in RSS form? http://www.yafla.com/dforbes/2006/06/.NET/rss.xml. Add in the day if you want to refine further.
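One way to picture how such URLs could be interpreted is to classify each path segment as a year, month, day, keyword, or feed marker. The sketch below is a guess at the general approach and not the real routing code; parseFilterPath and the shape of the returned object are hypothetical.

```javascript
// Classify the segments that follow the blog root (e.g. "2006/06/.NET/rss.xml").
// Numeric segments fill year, month, day in that order; everything else is
// treated as a keyword, so ".NET/2006/06" and "2006/06/.NET" give the same filter.
function parseFilterPath(path) {
  var filter = { year: null, month: null, day: null, keywords: [], rss: false };
  var segments = path.split("/");
  for (var i = 0; i < segments.length; i++) {
    var segment = segments[i];
    if (segment === "") {
      continue; // skip empty pieces from leading/trailing slashes
    }
    if (segment.toLowerCase() === "rss.xml") {
      filter.rss = true;
    } else if (/^\d{4}$/.test(segment) && filter.year === null) {
      filter.year = parseInt(segment, 10);
    } else if (/^\d{1,2}$/.test(segment) && filter.year !== null && filter.month === null) {
      filter.month = parseInt(segment, 10);
    } else if (/^\d{1,2}$/.test(segment) && filter.month !== null && filter.day === null) {
      filter.day = parseInt(segment, 10);
    } else {
      filter.keywords.push(segment);
    }
  }
  return filter;
}

console.log(parseFilterPath("2006/06/.NET/rss.xml"));
console.log(parseFilterPath(".NET/2006/06"));
```

Because the segments are classified by shape rather than position, the order of date and keyword parts in the URL doesn't matter.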

Of course, no longer are entries limited to the archaic “categories”. Now they’re basically keywords, so if you want to see the posts where I’ve abused categories and multi-tagged, take a look at

http://www.yafla.com/dforbes/.NET/SQL/Blogging/SoftwareDevelopment/Personal/IT/

Yikes!

So the tagging will be much more logical now that there aren’t broad categories, and given that anyone can filter content however they want (stick rss.xml on the end and you can get a feed of whatever you want).

There’s also search, though I’m not comfortable enough with the finality of the API to publish anything about that.

Entries now have versioning, given that I want to be more transparent with edits that I make (I’m endlessly doing minor corrections and improving wording, and for those who consider that deceptive there’ll be a little version history to see what changed and when, along with a label of why the change was made). All links are auto-parsed and logged, so every entry has a list of posts that link into it, making for much more elegant self follow-ups without resorting to post-editing some “UPDATE: See also…“ notes into old entries, and without resorting to the ugliness of trackbacks.
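The link logging described here amounts to building a reverse index from each entry to the entries that link to it. A minimal sketch follows, assuming entries are plain objects with a permalink and an HTML content string; those field names, the buildBacklinks function, and the regex-based link extraction are illustrative assumptions, not the real schema or parser.

```javascript
// Pull href targets out of an HTML fragment. A regex is good enough for a
// sketch; real code would more likely use an HTML parser.
function extractLinks(html) {
  var links = [];
  var pattern = /href="([^"]+)"/g;
  var match;
  while ((match = pattern.exec(html)) !== null) {
    links.push(match[1]);
  }
  return links;
}

// Build a map from each entry's permalink to the permalinks that link to it.
function buildBacklinks(entries) {
  var backlinks = {};
  var i, j;
  for (i = 0; i < entries.length; i++) {
    backlinks[entries[i].permalink] = [];
  }
  for (i = 0; i < entries.length; i++) {
    var source = entries[i].permalink;
    var targets = extractLinks(entries[i].content);
    for (j = 0; j < targets.length; j++) {
      var target = targets[j];
      // Only record links that point at another known entry, once each.
      if (Object.prototype.hasOwnProperty.call(backlinks, target) &&
          target !== source &&
          backlinks[target].indexOf(source) === -1) {
        backlinks[target].push(source);
      }
    }
  }
  return backlinks;
}
```

With an index like this, a follow-up post automatically shows up on the entry it references, with no manual editing of the older post.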

Extensive caching ensures that it’s still sprightly and capable of handling peak loads with no fuss.

Oh, and the system supports many blogs by many authors, including publishing multiple authors into one system…so I still over-engineered, but in the end it was workable and I’m extremely happy with the core structure.

Great things lie ahead.

]]>
The Legacy PC Gets An Upgrade to a Quad Core CPU http://www.yafla.com/dforbes/The_Legacy_PC_Gets_An_Upgrade_to_a_Quad_Core_CPU/ http://www.yafla.com/dforbes/The_Legacy_PC_Gets_An_Upgrade_to_a_Quad_Core_CPU/ Mon, 17 Sep 2007 14:26:10 GMT

One of my PCs is a bit of a Frankenstein, having gone through countless small upgrades over the years.

A video card here. Some memory modules there. A replacement primary hard drive here (thank you g4u). A supplementary hard drive there. Half a dozen different CD and then DVD and then Dual-Layer DVD burners.

Every now and then it'd see a larger upgrade that mandated a motherboard replacement alongside a new CPU. Often that would require new memory modules as well. Maybe even a new power supply as connection standards changed.

Motherboard replacements have always been the most disruptive, and it's been interesting to watch as each has negated the need for some add-in or other. First the USB+firewire board got punted, having been replaced by onboard functionality. Then the network card. Then the Soundblaster card. The only true add-in card usually needed nowadays is the video card, and I'm sure it's only a matter of time before the on-board video reaches a credible level of performance, eliminating even that.

I've pursued this piecemeal approach to upgrading primarily because it minimized the software disruption in my life, usually requiring just a quick module swap, some driver updates, and it's up and running again. I actually enjoy the modular, hybrid-PC pursuit, individually scoping out and replacing components with the best bang-per-dollar option available at the time. It's a bit of a hobby.

[Clearly I'm not alone: A local "Tiger Direct" store opened recently in my town, featuring a huge floorspace stocked with esoteric power supplies, mod cases, and other components for DIY builders. I'm surprised that the demand is still there, having thought that the self-builder was an endangered species.]

I've been negligent, however. Over the past while this PC had seen little attention. Running on an extremely dated Athlon XP 1800+ (overclocked to equal a 2200+), with a "measly" 1GB of DDR1 RAM and a dated collection of complementary components, it had fallen so far behind the times that it had dropped far off the current CPU charts. While it served its casual gaming task well (the video card is quite contemporary, and given that few games are constrained by the CPU, it held its own), and admirably provided the network storage for photos and videos, its anemic standings were a bit embarrassing. Sure, it didn't need to be decent given the various home and business laptops -- powerful, modern units that saw most of my computing activity -- but I felt like I was letting it down.

So following up the entry from a couple of weeks ago, I finally got around to ordering a new CPU and motherboard on Tuesday, choosing a retail boxed Intel Core 2 Quad Q6600 2.4Ghz processor from Direct Canada for the extraordinarily low price of $279.99 CAD. I'd been directed to their site via a link to "Shopbot.ca" that a search engine turned up, so I was a bit wary placing my order with this unfamiliar provider, but at 1pm the next day the box arrived at my door, amazingly delivered less than 24 hours after I ordered, coming from a shop 3000km away. I'm very satisfied with the price and speed. (I received no considerations for that comment, and know nothing about the shop beyond the fact that they sold me a killer piece of hardware at a great price, delivering it very quickly. Your mileage may vary.)

In the end I discovered that some new memory modules would be in order to fully yield the speed (going with 2GB to correlate with the oft claimed speed advantage that often flies in complete contradiction to actual memory usage metering). Oh, and a new case as it might make the whole process a little easier.

In the end, the only legacy pieces that made the migration to the "upgraded" box are the hard drives, and the video card.

Minutes later the full-retail copy of Windows was running the right drivers, and after a quick re-activation it was storming along.

I booted up.

In a word (and a punctuation) - Wow!

What a tremendous amount of computational power on the cheap. Day to day activity really feels no different than it did before -- browsing is the same fast browsing that it was before, and given that I don't try to use Excel as a warehousing database, Office seems the same as well. Battlefield 2 plays the same given that I have the same video card, albeit now with absolutely zero stutters or hiccups as other threads demanding timeslices are generally satisfied by one of the other cores.

For the things that actually keep me waiting -- encoding a home video from the MiniDV, or building Firefox from CVS, as I do regularly -- the improvement is enormous. Not only are these operations massively sped up by the four cores available to them, better still I can configure them to only use one, two, or three threads of parallel execution (via the -j build option for Firefox, for instance), constraining them as a coarse fix for the deficiencies of the Windows scheduler. I can now run a full Firefox 3 build in just 12 minutes with full parallelism, or run it (or other demanding applications) with little or no impact on the usability and functionality of this PC for other tasks.

Parallel Building Firefox: Full Build Times on Quad Core Processor (shorter is better)

 Build option            Full build time
 -j1 (default)           24 minutes 12 seconds
 -j2 (1 to 2 threads)    16 minutes 52 seconds
 -j4 (1 to 4 threads)    14 minutes 34 seconds

The build continued to speed up with more possible parallel operations, albeit with a decreased rate of return, with the fastest test build occurring in just over 12 minutes with the highest option tested: -j12. Having more parallel operations than cores can yield benefits when it increases the time utilization of a saturated resource, which in this case was the hard drive. At this point the cores were left twiddling their thumbs waiting for the storage to catch up.
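To put the diminishing returns in numbers, here is a small sketch that converts the three measured build times above into speedup and per-job efficiency figures relative to the -j1 run. Only the times come from the runs above; the helper names are made up for illustration.

```python
# Measured full-build times from the runs above, as (minutes, seconds).
BUILD_TIMES = {1: (24, 12), 2: (16, 52), 4: (14, 34)}


def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair into plain seconds."""
    return minutes * 60 + seconds


def speedup_table(times):
    """Return {jobs: (speedup, efficiency)} relative to the -j1 run.

    Efficiency is speedup divided by the -j value, so 1.0 would mean
    every extra job paid for itself in full.
    """
    baseline = to_seconds(*times[1])
    table = {}
    for jobs, (minutes, seconds) in sorted(times.items()):
        speedup = baseline / to_seconds(minutes, seconds)
        table[jobs] = (speedup, speedup / jobs)
    return table


if __name__ == "__main__":
    for jobs, (speedup, efficiency) in speedup_table(BUILD_TIMES).items():
        print(f"-j{jobs}: {speedup:.2f}x faster, {efficiency:.0%} efficiency per job")
```

Going from -j1 to -j2 works out to roughly a 1.43x speedup, and -j4 to roughly 1.66x, so each additional job buys less than the one before it.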

Limiting the build process to two cores via the process CPU affinity had it CPU starved beyond -j2, yielding no benefit via more parallelism.
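For anyone who wants to repeat the two-core experiment from a script rather than by hand, here is a minimal sketch. It assumes a Linux machine, because Python only exposes os.sched_setaffinity there; it is not how the numbers above were produced, since those runs were on Windows. The function name pin_to_first_cpus is hypothetical.

```python
import os


def pin_to_first_cpus(count):
    """Restrict this process to at most `count` of its currently allowed CPUs.

    Returns the set of CPU indices chosen. On Linux, child processes
    started afterwards inherit the restriction. On platforms without
    os.sched_setaffinity the set is computed but not applied.
    """
    if hasattr(os, "sched_getaffinity"):
        allowed = sorted(os.sched_getaffinity(0))  # CPUs we may use right now
    else:
        allowed = list(range(os.cpu_count() or 1))
    chosen = set(allowed[:max(1, count)])
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, chosen)  # pid 0 means the calling process
    return chosen


if __name__ == "__main__":
    cpus = pin_to_first_cpus(2)
    print("a build started from here would be confined to CPUs:", sorted(cpus))
```

With the process pinned like this, asking the build for more than two parallel jobs just makes them queue for the same two cores, which is the starvation described above.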

You can find a stacked graph detailing core processor usage for the above -j4 run (on 4 cores) at http://www.yafla.com/dforbes/images/Firefox_build_j4_4core.png. You can also look at a chart of building Firefox using the -j4 option, but setting the processor affinity to only allow the build access to two cores.

Not only is the build performance fantastic, but better still I can throttle it back to only run at most two parallel operations (-j2), getting a build in a still impressive 17 minutes while leaving two cores completely available for other tasks, like browsing the web with full responsiveness. I can even launch Battlefield 2, and remarkably it plays flawlessly...despite the fact that a full-scale, parallel build is going on in the background.

(Sidenote: Threads can still be left stalled, stranded waiting for a shared resource like the limited memory bandwidth and I/O paths, for instance. In the sample above my build was on a second harddrive -- a configuration that I recommend for all power users -- and clearly the other shared resources didn't impact the game to a perceivable degree)

What a revolution in computer usage. What a discount-priced computational powerhouse.

]]>
More Cores For Everyone! http://www.yafla.com/dforbes/More_Cores_For_Everyone/ http://www.yafla.com/dforbes/More_Cores_For_Everyone/ Tue, 04 Sep 2007 17:09:04 GMT

A recent article on the utility of multiple cores has been making the rounds. Despite being largely a copy/paste of other articles and graphics, with a smidge of editorial commentary, it is anxiously heralded by dual-core owners as purchase justification in the face of progressing technology.

[As fair disclosure, let me say that I'm about to purchase a quad-core processor based system, and this article and its sources did absolutely nothing to dissuade me from this choice]

The meat of the article (or rather the articles that are referenced by the article -- someone else did the dirty, arduous footwork of benchmarking) is comprised of a showdown between a 2.4Ghz quad-core and a 3.0Ghz dual-core, which is reasonable given that they're comparable in price [at writing the 3.0Ghz dual-core E6850 can be had for $384 CDN, while the 2.4Ghz quad-core Q6600 is $319 CDN]. Given that many games and applications are effectively single-threaded as a legacy of lowest-common-denominator development, the faster clock speed dual-core processor abstractly takes the lead in such fundamentally synthetic benchmarks for the pricepoint.

Aside from the questionable "it's good to have one extra core to allow you to kill bad processes" premise (what if those bad processes are multithreaded? Do you just have to buy bad-process-threads+1 cores? Maybe set the affinity such that you've dedicated a core solely for the task manager? In the real world of modern schedulers, the only time you can't get control of the machine to kill a rogue process is because of some absolutely atrocious elements of the implementation of Windows, and a scheduler that is effectively broken in the face of some situations. Neither is necessarily improved by more cores), what really gets me about the whole exercise is how utterly synthetic it really is, using contrived benchmarks instead of rationally considering how people actually use their PCs, and where their real need for more power comes from.

Firstly, it largely focuses on games benchmarks. Even if gaming performance is pertinent to the reader, for the majority of users playing the majority of games, their video card is far more of a bottleneck than their processor (even if their processor is a dated affair). I'm saying this as a long time computer gamer -- one that finds the stuttering framerate on even top of the line game consoles intolerable: unless you've turned every quality setting to low and you're running at 800x600, it's doubtful that you're going to even measure, much less notice, a difference between a modern 2.4Ghz core and a 3.0Ghz core. Indeed, the very first benchmark I looked at on the referenced article says exactly that: "For this test, we set Oblivion's graphical quality to "Medium" but with HDR lighting enabled and vsync disabled, at 800x600 resolution". They did that to create a scenario where the differences are measurable.

So if you plan to game in a contrived way for the purposes of demonstrating CPU differences in benchmarks, then you'd better pay attention to core speed.

In the real world of gaming, after you've adjusted the quality and resolution settings to appropriate settings for your video card, the primary slowdowns during gaming tend to come about because of external applications rudely stealing your thread quanta: I'm about to toss the grenade into the bunker in Battlefield 2 when suddenly Windows Search has decided that this is a good time to rebuild its index corpus, for instance, so instead it falls to the ground and I take out my entire squad (Seriously, Windows Search guys - when a full-screen DirectX game is running, it probably isn't a good time to decide that the PC is "idle").

For moments like that, more cores make a huge difference. Dual-cores would be sufficient for that simple scenario, but what if my PC is even more active, as it always is? Perhaps the blog updater is running an update, I'm FTPing some files, a download is happening, and I'm gaming.

Every core works towards the ultimate goal of eliminating the real world problem of cycle theft from my hardcore gaming.

Presuming that you've passed a reasonable bar -- long behind you when you're talking about a 2.4Ghz Core 2 -- more cores will realistically improve things for gamers enjoying their vice in the real world. One day we might even have a world where we don't have to shut down services and trawl taskmanager violently killing processes before launching a game, fearful that it will disrupt our immersion.

My second problem with the article is that it doesn't question what people are really waiting for nowadays. Personally I see almost no difference between virtually any mainstream PC for the overwhelming majority of day to day operations (and this is as a developer) -- most activities are so fast the difference is negligible. I just switched laptops from a single-core 1.6Ghz Pentium M to one with a Core 2 Duo T7200 -- a significant improvement -- and from a day to day perspective I've indeed noticed that the new laptop has a better screen, a faster harddrive, and much better graphics, but the computational difference is largely unnoticed.

Until, humorously, I do something that is highly parallelizable, such as encoding a video pulled in from the miniDV video camera. In that case the dual-core processor strides to a massive lead over its single core predecessor. If it were a quad-core, it would storm even further ahead, even with the loss of frequency.
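The "even with the loss of frequency" claim can be sanity-checked with a back-of-the-envelope model. The sketch below uses Amdahl's law and assumes that both chips do the same work per clock cycle, that the parallel part of the job scales perfectly across cores, and that memory and disk never get in the way. Those are simplifying assumptions for illustration, not measurements.

```python
def relative_time(parallel_fraction, cores, ghz):
    """Run time in arbitrary units under Amdahl's law, scaled by clock speed."""
    serial_fraction = 1.0 - parallel_fraction
    return (serial_fraction + parallel_fraction / cores) / ghz


def break_even_fraction(cores_a, ghz_a, cores_b, ghz_b):
    """Parallel fraction p at which chip A and chip B finish at the same time.

    Solves (1 - p + p/cores_a) / ghz_a == (1 - p + p/cores_b) / ghz_b for p.
    """
    ka = 1.0 - 1.0 / cores_a
    kb = 1.0 - 1.0 / cores_b
    return (ghz_a - ghz_b) / (ghz_a * kb - ghz_b * ka)


if __name__ == "__main__":
    # The two chips compared in the article: 3.0Ghz dual-core vs 2.4Ghz quad-core.
    p = break_even_fraction(2, 3.0, 4, 2.4)
    print(f"the quad-core pulls ahead once about {p:.0%} of the work is parallel")
    for fraction in (0.0, 0.5, 0.9):
        dual = relative_time(fraction, 2, 3.0)
        quad = relative_time(fraction, 4, 2.4)
        print(f"parallel={fraction:.0%}: quad/dual time ratio = {quad / dual:.2f}")
```

Under these assumptions the break-even point lands at about 57% parallel work: below it the faster dual-core wins, above it the quad-core does, and a job like video encoding sits well above it.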

For something that I'm actually waiting on, more cores = more goodness.

I would definitely choose the quad-core processor for the legacy software reality that we have today, despite the many applications that individually fail to exploit the extra cores. My conviction is amplified by the tremendous strides that application developers are making to parallelize their products. Once you've parallelized to 2 cores, it's generally a very small step to parallelize to 4 cores, or n cores for that matter.
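The "small step from 2 to n" point is easiest to see in code. In the sketch below the work is cut into independent chunks and handed to a pool, and the number of workers is nothing more than a parameter. Everything here is illustrative: encode_chunk is a stand-in that just sums numbers, and the pool is a thread pool to keep the example short. In CPython, threads do not run CPU-bound Python code in parallel, so real number crunching would swap in ProcessPoolExecutor, which offers the same max_workers and map interface.

```python
from concurrent.futures import ThreadPoolExecutor


def encode_chunk(frames):
    """Stand-in for real per-chunk work; here it just sums the frame numbers."""
    return sum(frames)


def split(items, parts):
    """Cut `items` into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for index in range(parts):
        end = start + size + (1 if index < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def process(items, workers):
    """Fan the chunks out to `workers` workers and combine the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(encode_chunk, split(items, workers)))


if __name__ == "__main__":
    frames = list(range(1000))
    # Going from 2 workers to 4 (or n) changes one argument, nothing else.
    print(process(frames, 2), process(frames, 4))
```

The hard part is getting the work into independent chunks in the first place; once that is done, the worker count falls out as a tuning knob.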

Bring on the cores!

]]>
Summer Nears Its End http://www.yafla.com/dforbes/Summer_Nears_Its_End/ http://www.yafla.com/dforbes/Summer_Nears_Its_End/ Mon, 27 Aug 2007 12:46:51 GMT

Summer is waning here in the Northern hemisphere. 

While it's sad that the warm weather and summer activities will soon be packed in the garage for another year, it's almost the time for fall fairs, rich soups, apple picking, walks in the gorgeous escarpment country when the leaves have changed color, pumpkins and costumes.

IMG_2741

'Tis a wonderful season ending, to be replaced by another great time of year.

With the decrease in outdoor activities, I'll be posting more frequently. I've been kicking SQL Server 2008 around, and look forward to writing about it (I'm excited about its new hierarchical functionality, which has echoes of versatile high performance hierarchies), along with many other thoughts that have percolated in my head.

]]>
Cell Phones and Radioactive Imaging - Measuring Flow http://www.yafla.com/dforbes/Cell_Phones_and_Radioactive_Imaging_Measuring_Flow/ http://www.yafla.com/dforbes/Cell_Phones_and_Radioactive_Imaging_Measuring_Flow/ Mon, 13 Aug 2007 15:02:45 GMT

Stuck in traffic a few days back, my idle mind wandered to the technical feasibility of pervasive, real-time traffic flow monitoring, and how this information could be communicated and utilized.

Such perfect, real-time information could help to redistribute the roadway load for the benefit of all (or, more realistically, let the suckers boil in the midday sun while the information insiders zip around congestion points), reducing transit times and energy use, and perhaps providing emergency services with optimized transport mapping, improving their efficacy.

Something had to be better than the sparse, time-lagged reporting of the radio station, or blindly rolling down an onramp, curious why a string of cars were dangerously reversing up the shoulder, to find the entire highway at a standstill, as I had that Friday afternoon.

There are quite a few implementation options apparent to a traffic layman like myself: Cameras with AI counting cars and estimating their speed. Underground (or overhead) magnetic sensors, or underground weight sensors. Laser relays for single lane roads.

The basic problem with such solutions, however, is that they tend to be expensive to install and maintain, and from that they tend to be infrequently deployed, at best spaced at distances that greatly reduce their utility (e.g. "between highway marker 70 and 112 there is some sort of disturbance"). Add to that the communications network required to relay these telemetrics.

Having worked in the telemetrics/remote monitoring industry before (in the late-90s), I was contemplating how cellular data technologies were just becoming feasible for such remote monitoring communications when the thought occurred to me: Most every cell phone now is constantly communicating digitally with its base station. Further, every cell phone can be either triangulated to a location, or more recently knows its precise position with the use of GPS. A tremendous percentage of cars on the road have at least one cell phone in them, and the phone company (or anyone listening in on the conversation) is capable of tracking location and speed, easily overlaying that on a mapping system to determine roadway flow.

Imagine an entire roadway system that overlaid the millions of cell phones moving around, easily visualizing slowdowns and congestion. It would be similar to the medical procedure where they inject radioactive particles into a patient's blood system, determining flow throughout the body by measuring their movement.
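As a rough sketch of the core calculation, the code below turns two timestamped position fixes from one phone into a speed, then averages the speeds seen on each stretch of road. It assumes that each fix is a (unix_seconds, latitude, longitude) tuple and that some earlier map-matching step, not shown, has already assigned each pair of fixes to a road segment. The function names and data layout are invented for illustration and do not describe any commercial product.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def speed_kmh(fix_a, fix_b):
    """Average speed between two (unix_seconds, lat, lon) fixes."""
    t1, lat1, lon1 = fix_a
    t2, lat2, lon2 = fix_b
    hours = (t2 - t1) / 3600.0
    if hours <= 0:
        raise ValueError("second fix must be later than the first")
    return haversine_km(lat1, lon1, lat2, lon2) / hours


def segment_flow(observations):
    """Map each road segment to the mean speed of the phones seen on it.

    `observations` is an iterable of (segment_id, fix_a, fix_b).
    """
    speeds = defaultdict(list)
    for segment_id, fix_a, fix_b in observations:
        speeds[segment_id].append(speed_kmh(fix_a, fix_b))
    return {seg: sum(vals) / len(vals) for seg, vals in speeds.items()}


if __name__ == "__main__":
    moving = ((0, 43.00, -80.0), (36, 43.01, -80.0))   # 0.01 degrees north in 36 s
    stuck = ((0, 43.00, -80.0), (36, 43.00, -80.0))    # has not moved at all
    print(segment_flow([("ramp", moving, stuck), ("highway", moving, moving)]))
```

A segment whose average drops well below its usual value is the "slowdown" that would light up on the overlay; a real system would also have to filter out pedestrians, parked phones, and noisy fixes.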

It turns out that I'm not the first to think of this. A quick Google search upon getting home made it apparent that there are several commercial products that do something similar. Nonetheless I thought it a fascinating example of passive data collection, deriving secondary advantages out of widely deployed technologies like cell phones.

]]>