Jeff Atwood, of Coding Horror fame, recently rebutted my post "Beginners and Hacks", which itself was a reply to his post "C# and the Compilation Tax".
Jeff makes some great points, but at the outset I have to disagree with his statement "The present model of software development is clearly monkeys all the way down. And if you're offended to be lumped in with the infinite monkey brigade, I'd say that's incontestable proof that you're one of us."
No, Jeff, I don't develop via the Infinite Monkeys Model. It disturbs me that any professional in this industry would volunteer for such a pejorative.
While humility is often a good thing, there is a limit. Not every developer can be Linus Torvalds or John Carmack, but every single developer should still have professional self-respect, and a desire to do and be the best that they can.
As for my denial of membership in the worldwide IMB representing "incontestable proof" that I'm among that group, that comment had me reminiscing about a shop I worked in about a decade ago: A new hire had proposed a questionable set of development changes, some of which I was passionately opposed to. He dismissed such disagreement via a hilarious bit of circular reasoning--
a) If you passionately disagree, you are being "defensive"
b) If you're being defensive per the definition given in a), it must be because you are wrong.
It's a simple, comforting way of dismissing opposing perspectives: Everyone who disagrees is just being defensive because they're wrong. It was so remarkable that it has always stuck with me as an example of self-delusional perception.
Jeff goes on to compare his apparent utter dependence on continuous compilation code checking with squiggle-line spell-checking. I don't accept that simile at all, but let's humor the comparison for a moment.
I've written about the importance of correct spelling before, and have lauded the integration of automatic, continuous spellchecking in Firefox. I'm typing this entry in Microsoft Word, which has helpfully alerted me to several misspellings (mostly the result of typos).
I greatly appreciate these tools, and how they help me with the craft of writing.
Yet I'm not a professional writer. I am, in actuality, a hack and a beginner.
By noting that differentiation, am I then saying that a professional, dedicated-to-the-craft writer would actively abhor such a tool (see the Frank Navasky character from You've Got Mail as just such an anti-technology luddite)?
Of course not, and that is not and has never been the argument I'm making. Those who jump to such a conclusion are just being defensive, and thus, we have learned, must be wrong. No, I'm not calling for editing in Notepad, or for making shoes the way we made them 150 years ago.
Instead I'd wager that you'd find the average professional writer, dedicated to the craft of putting words to print, has dramatically less dependency on such accoutrements than "beginners and hacks": They have elevated their creations to the point where something as rudimentary as spelling no longer represents a significant part of their "problem". They compose their creations so carefully that they're less likely to have such errors in the first place: When every line is a conscientious, careful, considered work of art, it's less likely that a typo-detection utility is as important.
For a blowhard blogger like me, vomiting paragraphs of raw thought into an editor, this sort of handholding is much more important, and the use of spell-checking actually speaks directly to my point. Writing is not my craft, and these literary creations aren't craftsmanship. I've even been known to mix up it's and its on occasion, to the delight of my critics.
This brings us to the crux of the whole "debate": It was never about the advanced functionality of tools, or even the use of said features or whether they "annoy" me or not, but instead I'm speaking to a growing trend of laziness and carelessness in coding, where developers emit screens of code (probably gloating about their remarkable LOC achievements), and after spending as much time fixing up the many automatically detected errors they spend weeks trying to diagnose the much more insidious logic, design and usage errors that almost certainly permeate their creation.
If their work is so carelessly authored that they consider continuous automated correctness checks a heavily leaned upon, necessary feature of their environment, then I wouldn't put much stock in the quality otherwise.
That is the problem that I argued against, simply stating that when you feel naked and abandoned without these assistants, finding yourself automatically doing frequent compilations to catch egregious mistakes, then you've probably lost touch with the craft, and your work isn't getting the loving attention it deserves.
yafla has moved to some new, dedicated hardware, opening up some tremendous possibilities.
Some very exciting changes are afoot!
I'm going through the process of upgrading some Infragistics NetAdvantage 2007 v1 components to 2007 v2, one step in the upgrade process being the uninstallation of v1. The uninstaller has now been running for some 65 minutes, saturating both the hard drive and the CPU during the entirety of that time.
What possible explanation is there for this? Remove some registrations, delete some files and directories. Done. Where's the big complexity?
"But it's doing complex things!" a friend of MSIEXEC might retort (this is hardly the first time I've encountered outrageous installer times). Like what? Calculating the next Mersenne Prime?
In the time that it has run it could have read and written my entire hard drive several times over, and from a computational perspective it has now processed trillions of CPU operations. Trillions.
Given the basic metrics, there is simply no rational explanation beyond absolutely mind-boggling inefficiency. Par for the course, unfortunately.
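For a rough sense of scale, here is the back-of-the-envelope arithmetic as a small Python sketch. Only the 65 minutes comes from the story above; the clock speed, disk throughput and disk size are illustrative assumptions, so treat the result as an order-of-magnitude estimate rather than a measurement.

```python
# All hardware figures below are illustrative assumptions, not measurements.
UNINSTALL_MINUTES = 65                 # from the post
CLOCK_HZ = 2_000_000_000               # assumed: 2 GHz, roughly one operation per cycle
DISK_BYTES_PER_SEC = 50 * 10**6        # assumed: 50 MB/s sustained throughput
DISK_SIZE_BYTES = 80 * 10**9           # assumed: 80 GB drive


def uninstall_budget(minutes, clock_hz, disk_rate, disk_size):
    """Return (CPU operations available, full disk passes possible) in the given time."""
    seconds = minutes * 60
    cpu_ops = seconds * clock_hz
    disk_passes = seconds * disk_rate / disk_size
    return cpu_ops, disk_passes


cpu_ops, disk_passes = uninstall_budget(
    UNINSTALL_MINUTES, CLOCK_HZ, DISK_BYTES_PER_SEC, DISK_SIZE_BYTES
)
print(f"{cpu_ops:.1e} CPU operations, {disk_passes:.1f} full passes over the disk")
```

With these assumed figures a single core gets through several trillion operations and the disk could be swept from end to end more than twice, which is the scale of work being compared against "remove some registrations, delete some files".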
One of my PCs is a bit of a Frankenstein, having gone through countless small upgrades over the years.
A video card here. Some memory modules there. A replacement primary hard drive here (thank you g4u). A supplementary hard drive there. Half a dozen different CD and then DVD and then Dual-Layer DVD burners.
Every now and then it'd see a larger upgrade that mandated a motherboard replacement alongside a new CPU. Often that would require new memory modules as well. Maybe even a new power supply as connection standards changed.
Motherboard replacements have always been the most disruptive, and it's been interesting to watch as each has negated the need for some add-in or other. First the USB+firewire board got punted, having been replaced by onboard functionality. Then the network card. Then the Soundblaster card. The only true add-in card usually needed nowadays is the video card, and I'm sure it's only a matter of time before the on-board video reaches a credible level of performance, eliminating even that.
I've pursued this piecemeal approach to upgrading primarily because it minimized the software disruption in my life, usually requiring just a quick module swap, some driver updates, and it's up and running again. I actually enjoy the modular, hybrid-PC pursuit, individually scoping out and replacing components with the best bang-per-dollar option available at the time. It's a bit of a hobby.
[Clearly I'm not alone: A local "Tiger Direct" store opened recently in my town, featuring a huge floorspace stocked with esoteric power supplies, mod cases, and other components for DIY builders. I'm surprised that the demand is still there, having thought that the self-builder was an endangered species]
I've been negligent, however. Over the past while this PC had seen little attention. Running on an extremely dated Athlon XP 1800+ (overclocked to equal a 2200+), with a "measly" 1GB of DDR1 RAM and a dated collection of complementary components, it had fallen so far behind the times that it had dropped far off the current CPU charts. While it served its casual gaming task well (the video card is quite contemporary, and given that few games are constrained by the CPU, it held its own), and admirably provided the network storage for photos and videos, its anemic standings were a bit embarrassing. Sure, it didn't need to be decent given the various home and business laptops -- powerful, modern units that saw most of my computing activity -- but I felt like I was letting it down.
So following up the entry from a couple of weeks ago, I finally got around to ordering a new CPU and motherboard on Tuesday, ordering a retail boxed Intel Core 2 Quad Q6600 2.4GHz processor from Direct Canada for the extraordinarily low price of $279.99 CAD. I'd been directed to their site from a search-engine yielded link to "Shopbot.ca", so I was a bit wary placing my order with this unfamiliar provider, but at 1pm the next day the box arrived at my door, amazingly delivered less than 24 hours after I ordered, coming from a shop 3000km away. I'm very satisfied with the price and speed. (I received no considerations for that comment, and know nothing about the shop beyond the fact that they sold me a killer piece of hardware at a great price, delivering it very quickly. Your mileage may vary.)
In the end I discovered that some new memory modules would be in order to fully yield the speed (going with 2GB to correlate with the oft claimed speed advantage that often flies in complete contradiction to actual memory usage metering). Oh, and a new case as it might make the whole process a little easier.
In the end, the only legacy pieces that made the migration to the "upgraded" box are the hard drives, and the video card.
I booted up.
Minutes later the full-retail copy of Windows was running the right drivers, and after a quick re-activation it was storming along.
In a word (and a punctuation mark) - Wow!
What a tremendous amount of computational power on the cheap. Day to day activity really feels no different than it did before -- browsing is the same fast browsing that it was before, and given that I don't try to use Excel as a warehousing database, Office seems the same as well. Battlefield 2 plays the same given that I have the same video card, albeit now with absolutely zero stutters or hiccups as other threads demanding timeslices are generally satisfied by one of the other cores.
For the things that actually keep me waiting -- encoding a home video from the MiniDV, or building Firefox from CVS, as I do regularly -- the improvement is enormous. Not only are these operations massively sped up by the four cores available to them, better still I can configure them to only use one, two, or three threads of parallel execution (via the -j build option for Firefox, for instance), constraining them as a coarse fix for the deficiencies of the Windows scheduler. I can now run a full Firefox 3 build in just 12 minutes with full parallelism, or run it (or other demanding applications) with little or no impact in the usability and functionality of this PC for other tasks.
The build continued to speed up with more possible parallel operations, albeit with a decreased rate of return, with the fastest test build occurring in just over 12 minutes with the highest option tested: -j12. Having more parallel operations than cores can yield benefits when it increases the time utilization of a saturated resource, which in this case was the hard drive. At this point the cores were left twiddling their thumbs waiting for the storage to catch up.
Limiting the build process to two cores via the process CPU affinity had it CPU starved beyond -j2, yielding no benefit via more parallelism.
You can find a stacked graph detailing core processor usage for the above -j4 run (on 4 cores) at http://www.yafla.com/dforbes/images/Firefox_build_j4_4core.png. You can also look at a chart of building Firefox using the -j4 option, but setting the processor affinity to only allow the build access to two cores.
Not only is the build performance fantastic, but better still I can throttle it back to only run at most two parallel operations (-j2), getting a build in a still impressive 17 minutes while leaving two cores completely available for other tasks, like browsing the web with full responsiveness. I can even launch Battlefield 2, and remarkably it plays flawlessly...despite the fact that a full-scale, parallel build is going on in the background.
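For anyone wanting to repeat this sort of -j sweep, a minimal Python sketch follows. It is not the script used for the numbers above: the plain `make` base command is a stand-in assumption for whatever your build actually invokes, and the helper names are made up for illustration. For a fair comparison the tree should be cleaned between runs.

```python
import subprocess
import time


def build_command(jobs, base=("make",)):
    """Return the build command with a -jN parallelism flag appended.

    `base` is a placeholder; substitute the real build invocation.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    return [*base, f"-j{jobs}"]


def time_command(cmd, cwd=None):
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, cwd=cwd, check=True)
    return time.perf_counter() - start


def sweep(job_counts, cwd=None, base=("make",)):
    """Time one build per -j value and return {jobs: seconds}."""
    return {jobs: time_command(build_command(jobs, base), cwd) for jobs in job_counts}


# Example (runs real builds, so it is left commented out):
# for jobs, seconds in sweep([1, 2, 4, 8, 12], cwd="path/to/source").items():
#     print(f"-j{jobs}: {seconds / 60:.1f} minutes")
```

Restricting the build to particular cores, as in the two-core test above, is a separate step done through the operating system's processor affinity settings and is not covered by this sketch.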
(Sidenote: Threads can still be left stalled, stranded waiting for a shared resource like the limited memory bandwidth and I/O paths, for instance. In the sample above my build was on a second harddrive -- a configuration that I recommend for all power users -- and clearly the other shared resources didn't impact the game to a perceivable degree)
What a revolution in computer usage. What a discount-priced computational powerhouse.
My original foray into the land of blogging was delayed while I stumbled towards the goal of building my own blogging software: like many software developers, I have a sometimes irrational desire to build it myself rather than admit “defeat” and use one of the many (and in the realm of blogging, there are many) available products.
I took a couple of stabs at building it myself originally, but due to another common foible – a tendency to over-engineer (I couldn’t simply write some blog software to post and publish my own thoughts. No…it had to be a full multi-author aggregation and collaboration suite, meaning that weeks went by while I mentally debated the database model for such a machination) – it just never seemed to get finished.
Other priorities always trumped it, and the little time I did allot towards this goal saw me solving absurd edge conditions.
I finally set a deadline for myself, and when I couldn’t find the time to finish anything before my marker (billable hours always came first), I went and bought a copy of Radio Userland and started publishing content the blog way.
That worked well enough for a while, but Radio Userland is a venerable publishing tool that is really showing its age. Authoring to it is a less than pleasant experience – which has been a huge contributor towards the dearth of content (it’s always a bit of a roll of the dice to see which characters it randomly replaces in posts, or which carefully authored HTML blocks it’s decided to mangle) – and simple tasks like cross-linking posts (e.g. a “related posts” sidebar to allow users to easily see follow-ups) were just far too manual to be worth the bother.
Now that I have a powerful, fully dedicated server, it’s also grossly under-featured for users, making the experience of consuming and navigating through the information far less usable than it should be.
So I’ve gone and built my own blogging software, this time quickly bringing it to a sort of beta release.
Given that this is the venue with which I will publicize a ton of changes elsewhere on the site, I really considered this a roadblock on the critical path to the release of other web application functionality elsewhere on yafla.
With some focus, it took only a couple of hours this time, mostly accomplished while putting my toddler son to bed over the weekend. It was so ridiculously quick and easy that I kick myself for not having done it sooner.
I’m extremely pleased about the functionality built out (hey it isn't rocket science, and definitely falls within the realm of "trivial", but there's lots of little "gotchas" with software like this), though most of the kudos go towards .NET 2 and SQL Server 2005: A couple of tools that make short work of what would once have been an enormous task, bringing a robust, secure, high performance web application to a usable stage in less time than it takes to watch the Lord of the Rings trilogy.
You’ll probably notice that – at this moment at least – the HTML version of the blog looks absolutely terrible. That is somewhat by design (or rather an intentional time compromise)…momentarily. I’m working on the template (it’s of course parameterized template driven), and wanted to force myself to follow through by deploying (perhaps prematurely).
So what are the features of the blog software?
Well, firstly I migrated 100% of the old content over (including metadata such as categorization), running it all through Tidy first to try to make it a little more XHTML legitimate. Using an identifier mapping structure, every single link to the legacy content still works (which was important to me: I didn’t want to give link followers the frustrating “We moved everything so have fun trying to find it” 404 experience).
Everything works via URL remapping, and for now I’ve set it to redirect from old links to the new links where possible. E.g. http://www.yafla.com/dforbes/categories/softwareDevelopment/2005/09/28.html redirects to http://www.yafla.com/dforbes/Clean_Code. All new entries will follow that more transparent and obvious structure.
But the URLs aren’t limited to just single documents – All entries in June of 2006 can be accessed via http://www.yafla.com/dforbes/2006/06. Add in a category and you can refine further – http://www.yafla.com/dforbes/2006/06/.NET (or http://www.yafla.com/dforbes/.NET/2006/06. Whatever makes you happy).
Want that in RSS form? http://www.yafla.com/dforbes/2006/06/.NET/rss.xml. Add in the day if you wanted to refine further.
Of course, no longer are entries limited to the archaic “categories”. Now they’re basically keywords, so if you want to see the posts where I’ve abused categories and multi-tagged, take a look at
http://www.yafla.com/dforbes/.NET/SQL/Blogging/SoftwareDevelopment/Personal/IT/
Yikes!
So the tagging will be much more logical now that there aren’t broad categories, and anyone can filter content however they want (stick rss.xml on the end and you can get a feed of whatever you want).
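To make the URL scheme above concrete, here is a rough Python sketch of how such a path could be broken into a filter. It is not the actual implementation (that is .NET), and the function name and the shape of the returned dictionary are made up for illustration. It takes the path relative to the blog root, treats a four-digit segment as the year, the next one- or two-digit segments as month and day, rss.xml as a format switch, and anything else as a keyword, so the date and keywords can appear in either order.

```python
import re


def parse_blog_path(path):
    """Split a blog-relative URL path into date parts, keywords and output format."""
    query = {"year": None, "month": None, "day": None, "keywords": [], "format": "html"}
    for segment in filter(None, path.split("/")):
        if segment.lower() == "rss.xml":
            query["format"] = "rss"
        elif re.fullmatch(r"[0-9]{4}", segment) and query["year"] is None:
            query["year"] = int(segment)
        elif re.fullmatch(r"[0-9]{1,2}", segment) and query["year"] is not None:
            # Once a year has been seen, short numbers fill month, then day.
            if query["month"] is None:
                query["month"] = int(segment)
            elif query["day"] is None:
                query["day"] = int(segment)
            else:
                query["keywords"].append(segment)
        else:
            query["keywords"].append(segment)
    return query


print(parse_blog_path("/2006/06/.NET/rss.xml"))
print(parse_blog_path("/.NET/2006/06") == parse_blog_path("/2006/06/.NET"))
```

A single-entry address such as /Clean_Code would come out of this sketch as a keyword; telling an entry title apart from a keyword needs a lookup against stored entries, which is left out here.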
There’s also search, though I’m not comfortable enough with the finality of the API to publish anything about that.
Entries now have versioning, given that I want to be more transparent with edits that I make (I’m endlessly doing minor corrections and improving wording, and for those who consider that deceptive there’ll be a little version history to see what changed and when, along with a label of why the change was made). All links are auto-parsed and logged, so every entry has a list of posts that link into it, making for much more elegant self follow-ups without resorting to post-editing some “UPDATE: See also…“ notes into old entries, and without resorting to the ugliness of trackbacks.
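The inbound-link list is simple enough to sketch. The following Python is only an illustration of the idea, not the code running here: the names are hypothetical, entries are assumed to be a plain mapping of permalink to HTML body, and only links whose href exactly matches another entry's permalink are counted.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(body):
    """Return all link targets found in an HTML fragment."""
    collector = LinkCollector()
    collector.feed(body)
    collector.close()
    return collector.links


def build_backlinks(entries):
    """Map each permalink to the sorted permalinks of the entries that link to it.

    `entries` maps permalink -> HTML body (an assumed, simplified data model).
    """
    backlinks = defaultdict(set)
    for source, body in entries.items():
        for target in extract_links(body):
            if target in entries and target != source:
                backlinks[target].add(source)
    return {target: sorted(sources) for target, sources in backlinks.items()}
```

Building the index when an entry is saved, rather than when a page is viewed, keeps the "entries linking here" list cheap to render.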
Extensive caching ensures that it’s still sprightly and capable of handling peak loads with no fuss.
Oh, and the system supports many blogs by many authors, including publishing multiple authors into one system…so I still over-engineered, but in the end it was workable and I’m extremely happy with the core structure.
Great things lie ahead.
I recently opted to throw together my own blog software (after going through the standard Build or Buy analysis), expediting deployment as a means of forcing follow-through. The goals of this micro-project were to improve the authoring and content management experience, to improve searchability of the content (without having to cast content out from the blog to a static form), and to improve the usability and navigation from the user's perspective (for instance the classic "date" navigator common on most blogs is something that I've opted to remove).
Despite having close to no time to allocate to this task, my tendency to over-engineer still showed through: The easiest option would have been a content-management system defined entirely in code (it's as easy for me to change and deploy code as it is to change templates and metadata), and of course to build it for a single author. Instead it supports many blogs through the same URLRedirector, as well as blog aggregations (where a blog is a publication of a set of blogs, each with distinct authors), each using its own templates and configurations.
Which brings me to templates -- I failed to find a decent Smarty-type templating system for .NET (basic ASPX is really a templating system, but I'm speaking more towards something that can enumerate sections, retrieving data based upon an object structure of relationships and containment).
So I had to build a basic templating system, yielding the templates that follow. The first for HTML output--
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta>
<title>{#blog.Title} {#docTitle}</title>
<link rel="stylesheet" type="text/css" media="screen, projection"
href="http://www.yafla.com/dforbes/style/css/blog.css"></link>
<script type="text/javascript" src="http://www.haloscan.com/load/dforbes"> </script>
</head>
<body>
<div class="clsHeader">
<div class="clsBlogHeader"><a href="{#blog.BaseUrl}">{#blog.Title}</a></div>
<div class="clsSubheader">{#blog.Description}</div>
</div>
<div class="clsBody">
{foreach $entry in $entries}
<div class="clsEntry">
<div class="clsDate">{#entry.EntryContent.PublishDateUTC|dddd, MMMM dd yyyy}</div>
<div class="clsTitle"><a href="{#entry.Permalink}">{#entry.EntryContent.EntryTitle}</a></div>
<div class="clsBody">{#entry.EntryContent.EntryContent}</div>
<div class="clsKeywords">{foreach $keyword in $entry.EntryKeywords}
<a href="{#blog.BaseUrl}{#keyword.KeywordText|escape}">{#keyword.KeywordText}</a> {/foreach}
</div>
<div class="clsPermalink">
<a href="javascript:HaloScan('{#entry.MappingId}');" target="_self">
<script type="text/javascript">postCount('{#entry.MappingId}'); </script></a>
<a href="{#entry.Permalink}">permalink</a>
</div>
{foreach $relatedentry in $entry.RelatedEntries}
{ifcond $LoopFirst = "True"}
<center>
<div class="clsRelatedEntries">
Related Entries
{/ifcond}
<div class="clsRelatedEntry">
<a href="{#relatedentry.Permalink}">{#relatedentry.EntryContent.EntryTitle}</a>
</div>
{ifcond $LoopLast = "True"}
</div>
</center>
{/ifcond}
{/foreach}
</div>
{/foreach}
<div class="clsAdBlock">
{#adBlockHorizontal}
</div>
<div class="clsNavigator">
<span class="clsNavigateEarlier">{#moveEarlier}</span><span class="clsNavigateLater">{#moveLater}
</span>
</div>
</div>
<br/>
<div class="clsAttribution">
<a href="mailto:{#entry.EntryContent.ContentAuthor.EmailAddress}">
{#entry.EntryContent.ContentAuthor.Name}
</a> -
{#entry.EntryContent.ContentAuthor.Description}
</div>
</body>
</html>
The next template is for RSS consumers--
<rss version="2.0">
<channel>
<title>{#blog.Title|escape}</title>
<link>{#blog.BaseUrl}</link>
<description>{#blog.Description|escape}</description>
<lastBuildDate>{#buildDate|r}</lastBuildDate>
<language>en-us</language>
{foreach $entry in $entries}
<item>
<title>{#entry.EntryContent.EntryTitle|escape}</title>
<link>{#entry.Permalink}</link>
<guid>{#entry.Permalink}</guid>
<pubDate>{#entry.EntryContent.PublishDateUTC|r}</pubDate>
<description><![CDATA[{#entry.EntryContent.EntryContent}]]></description>
</item>
{/foreach}
</channel>
</rss>
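For readers curious how placeholders like these can be resolved, here is a minimal Python sketch of a renderer for a subset of the syntax shown in the two templates. It is an illustration only, not the .NET engine behind this site: it handles {#path}, {#path|escape} and nested {foreach $x in $path} blocks against plain dictionaries and lists, assumes that "escape" means HTML/XML escaping, ignores date format strings such as |r, and does not implement {ifcond}.

```python
import html
import re

LOOP = re.compile(r"\{foreach \$(\w+) in \$([\w.]+)\}")
VAR = re.compile(r"\{#([\w.]+)(?:\|([^}]*))?\}")
TOKEN = re.compile(r"(\{foreach \$\w+ in \$[\w.]+\}|\{/foreach\}|\{#[\w.]+(?:\|[^}]*)?\})")


def lookup(path, scope):
    """Resolve a dotted path such as 'entry.Title' against nested dicts."""
    value = scope
    for part in path.split("."):
        value = value[part]
    return value


def parse(tokens, pos=0):
    """Turn the flat token list into nested nodes; returns (nodes, next position)."""
    nodes = []
    while pos < len(tokens):
        token = tokens[pos]
        pos += 1
        if token == "{/foreach}":
            break  # end of the loop body currently being parsed
        loop, var = LOOP.fullmatch(token), VAR.fullmatch(token)
        if loop:
            body, pos = parse(tokens, pos)
            nodes.append(("loop", loop.group(1), loop.group(2), body))
        elif var:
            nodes.append(("var", var.group(1), var.group(2)))
        else:
            nodes.append(("text", token))
    return nodes, pos


def render_nodes(nodes, scope):
    """Render parsed nodes, binding the loop variable for each item."""
    out = []
    for node in nodes:
        if node[0] == "text":
            out.append(node[1])
        elif node[0] == "var":
            value = str(lookup(node[1], scope))
            # Only the "escape" filter is honoured; other filters are ignored here.
            out.append(html.escape(value) if node[2] == "escape" else value)
        else:
            _, name, path, body = node
            for item in lookup(path, scope):
                out.append(render_nodes(body, {**scope, name: item}))
    return "".join(out)


def render(template, scope):
    """Render a template string against a dictionary of values."""
    nodes, _ = parse(TOKEN.split(template))
    return render_nodes(nodes, scope)
```

Parsing into a small tree before rendering is what lets an empty collection skip its loop body cleanly and lets loops nest, as the keyword loop does inside the entry loop in the HTML template.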
All in all, I think it works pretty well, and I can successfully run the W3C validations on the vast majority of generated pages and get the comforting green checkmark.
One of the most referenced papers in software development has to be Dijkstra's seminal paper titled "Go To Statement Considered Harmful".
Dijkstra didn't actually author the title; it was the creation of an editor as the piece was en route to being printed in an ACM publication. It was changed from its original title of "A case against the goto statement".
While the core essence of the essay is indeed that the goto statement can be harmful, Dijkstra wasn't making an absolute statement (as is commonly claimed, absolutism being a tendency of far too many in this industry), but instead was commenting on the abuse of goto that was occurring in the industry, calling for a sober evaluation of where it is appropriate, but more importantly where it is not.
Nonetheless, the meme was created and has been reused and abused in innumerable Considered Harmful declarations since.
A month or so back the development webosphere was awash with references to Scott Hanselman's excellent blog, all excitedly linking (rel="titillating"?) to his piece titled "The Weekly Source Code 13 - Fibonacci Edition". This was particularly common in the .NET community, with many linkers describing it as an elucidating example of the many advantages of .NET 3.5 / C# 3.0.
I perused the entry, always eager to absorb that sort of information, but found it less than perfect. I withheld critical comment, hoping it would all just blow away.
Then this morning I opened up Visual Studio and happened to notice a link to his entry on the Start page.

Maybe it's been there for a while (the last date is pretty old) and I just didn't notice it before, but the title used on the Start page pushed me over the edge, coercing me to comment.
There are several issues I have with Scott's Fibonacci entry.
First, the C# 2.0 (henceforth I'm dumping the subversion precision on the language versions) version is oddly dumbed down: C# 2 also has the ternary conditional operator, and it even has anonymous functions (including closure functionality). Yet the demonstrations given contrast the simplest possible C# 2 implementation with the most obtuse C# 3 example.
Basically the only novel difference with the C# 3 example is that it uses a lambda, though of course it would be an absolutely terrible thing to use a lambda for.
It's not a very good example of the implementation differences between the versions, which is the claim made by the Visual Studio start page, and was the description often used during the dissemination of this piece.
I like C# 3, but this isn't a good demonstration of any advantage of the language.
Worse yet, the only place you'll ever see recursion used to calculate Fibonacci numbers is in "Recursion for Dummies" type examples. To understand why that is, consider Scott's C# 3 example, which he leads into with the statement "Now, here's a great way using C# 3.0".
Here's a logarithmic-scaled chart of the number of function calls necessary to calculate Fibonacci numbers in the C# 3 example Scott gave.

Obviously it gets unusable pretty quickly. Try calculating the 90th Fibonacci number using recursive algorithms...
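The growth is easy to reproduce. The Python sketch below (Python rather than C#, purely to keep it short) counts the calls made by the textbook recursive definition, assuming the base case simply returns n for n <= 1, and checks the count against the closed form 2 * F(n + 1) - 1. The function names are made up for illustration.

```python
def naive_fib(n, counter):
    """Textbook recursive Fibonacci that also tallies every call in counter[0]."""
    counter[0] += 1
    if n <= 1:
        return n
    return naive_fib(n - 1, counter) + naive_fib(n - 2, counter)


def count_calls(n):
    """Number of function calls the naive recursion makes to compute F(n)."""
    counter = [0]
    naive_fib(n, counter)
    return counter[0]


def predicted_calls(n):
    """Same count without recursing: 2 * F(n + 1) - 1, computed iteratively."""
    previous, current = 0, 1  # F(0), F(1)
    for _ in range(n):
        previous, current = current, previous + current
    return 2 * current - 1  # current is now F(n + 1)


for n in (10, 20, 25):
    assert count_calls(n) == predicted_calls(n)
    print(n, count_calls(n))

print(90, predicted_calls(90))  # computed from the closed form; never recurse this far
```

For n = 90 the closed form gives a call count in the quintillions, which is why the chart has to use a logarithmic scale.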
In the same way that Goto can be harmful, the use of recursion is often a sign of badness, and this is no exception. Epic inefficiency is used instead of the obviously simple approach.
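long CalcFibonacciNumber(long n)
{
    // Treat F(0) as 0; without this guard the loop below would return 1 for n = 0.
    if (n <= 0) return 0;
    long current = 1, previous = 0, swapholder;
    // Advance the (previous, current) pair n - 1 times so that current ends up as F(n).
    // Note that a long overflows past F(92).
    while (n-- > 1)
    {
        swapholder = previous;
        previous = current;
        current += swapholder;
    }
    return current;
}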
(Ignoring mathematical shortcuts)
A lot of readers will be rolling their eyes right about now, muttering something along the lines of "Awww, come on...you didn't seriously think anyone thought that recursion was a good way to calculate Fibonacci numbers, did you? This is beginner's stuff, and no one really thinks that's the right way to do it!"
I'm optimistic about the profession, so no, I didn't really think it was a serious example (though I do think it nonetheless deserves some serious warnings to ensure no one becomes misled).
WARNING: The Code Contained In This Example Will Rot Your Brain. Never Do Something Like This In Real Life. Don't Let Peers See You Looking At Code Like This. Suspend All Critical Thought While Reading This Piece.
Instead it's a sample of "here's a demonstration of how to do something absolutely terrible — almost felony worthy — in a variety of programming languages....".
This is still a serious problem.
The example given is so very wrong — even if it is what's used in Recursion for Dummies books — that it makes it close to impossible to focus on the actual point being made, even if it had used comparable features of each language to demonstrate how the same task could be accomplished in each.
It reminds me of many early web service tutorials and advocacy pieces: Many used absurd examples like "a web service to add two numbers" (and amazing variations such as subtract two numbers, multiply two numbers, divide two numbers, compute the Log10 of a number, and so on. You get my point — things for which a web service would be entirely unsuited).
Stop it!
Stop with the ridiculous no-one-would-(or rather should)-ever-do-it-this-way examples. It completely undermines the value of the examples.
Surely there are realistic examples that would be more appropriate for demonstrating the advantages of lambdas (recursion {is recursion}; [goto {is recursion}], so there isn't much enlightenment provided there). How about "how to build a rudimentary regular expression parser in a variety of languages", or for a web service "pulling weather data from a remote weather station".
Something that a developer isn't going to have to slog through with their brain fighting them on every line, demanding an explanation for the terrible design or algorithm they're supposed to accept at face value.