Tuesday, November 06 2007

I recently opted to throw together my own blog software (after going through the standard Build or Buy analysis), expediting deployment as a means of forcing follow-through. The goals of this micro-project were to improve the authoring and content management experience, to improve searchability of the content (without having to cast content out from the blog to a static form), and to improve usability and navigation from the user's perspective (for instance, the classic "date" navigator common on most blogs is something that I've opted to remove).

Despite having close to no time to allocate to this task, my tendency to over-engineer still showed through. The easiest option would have been a content-management system defined entirely in code (it's as easy for me to change and deploy code as it is to change templates and metadata), built for a single author. Instead it supports many blogs through the same URLRedirector, as well as blog aggregations (where a blog is a publication of a set of blogs, each with distinct authors), each using its own templates and configurations.

Which brings me to templates. I failed to find a decent Smarty-type templating system for .NET (basic ASPX is really a templating system, but I'm speaking more of something that can enumerate sections, retrieving data based upon an object structure of relationships and containment).

So I had to build a basic templating system, yielding the templates that follow. The first for HTML output--

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta>
<title>{#blog.Title} {#docTitle}</title>
<link rel="stylesheet" type="text/css" media="screen, projection" 
   href="http://www.yafla.com/dforbes/style/css/blog.css"></link>
<script type="text/javascript" src="http://www.haloscan.com/load/dforbes"> </script>
</head>
<body>
<div class="clsHeader">
<div class="clsBlogHeader"><a href="{#blog.BaseUrl}">{#blog.Title}</a></div>
<div class="clsSubheader">{#blog.Description}</div>
</div>
<div class="clsBody">
{foreach $entry in $entries}
<div class="clsEntry">
   <div class="clsDate">{#entry.EntryContent.PublishDateUTC|dddd, MMMM dd yyyy}</div>
   <div class="clsTitle"><a href="{#entry.Permalink}">{#entry.EntryContent.EntryTitle}</a></div>
   <div class="clsBody">{#entry.EntryContent.EntryContent}</div>
   <div class="clsKeywords">{foreach $keyword in $entry.EntryKeywords}&nbsp;
      <a href="{#blog.BaseUrl}{#keyword.KeywordText|escape}">{#keyword.KeywordText}</a>&nbsp;{/foreach}
   </div>
   <div class="clsPermalink">
      <a href="javascript:HaloScan('{#entry.MappingId}');" target="_self">
      <script type="text/javascript">postCount('{#entry.MappingId}'); </script></a>&nbsp;
      <a href="{#entry.Permalink}">permalink</a>
   </div>
{foreach $relatedentry in $entry.RelatedEntries}
{ifcond $LoopFirst = "True"}
<center>
<div class="clsRelatedEntries">
Related Entries
{/ifcond}
<div class="clsRelatedEntry">
   <a href="{#relatedentry.Permalink}">{#relatedentry.EntryContent.EntryTitle}</a>
</div>
{ifcond $LoopLast = "True"}
</div>
</center>
{/ifcond}
{/foreach}
</div>
{/foreach}
<div class="clsAdBlock">
{#adBlockHorizontal}
</div>
<div class="clsNavigator">
   <span class="clsNavigateEarlier">{#moveEarlier}</span><span class="clsNavigateLater">{#moveLater}
   </span>
</div>
</div>
<br/>
<div class="clsAttribution">
   <a href="mailto:{#entry.EntryContent.ContentAuthor.EmailAddress}">
      {#entry.EntryContent.ContentAuthor.Name}
   </a> - 
   {#entry.EntryContent.ContentAuthor.Description}
</div>
</body>
</html>

The next template is for RSS consumers--

<rss version="2.0">
  <channel>
    <title>{#blog.Title|escape}</title>
    <link>{#blog.BaseUrl}</link>
    <description>{#blog.Description|escape}</description>
    <lastBuildDate>{#buildDate|r}</lastBuildDate>
    <language>en-us</language>
		{foreach $entry in $entries}
		<item>
		  <title>{#entry.EntryContent.EntryTitle|escape}</title>
		  <link>{#entry.Permalink}</link>
		  <guid>{#entry.Permalink}</guid>
		  <pubDate>{#entry.EntryContent.PublishDateUTC|r}</pubDate>
		  <description><![CDATA[{#entry.EntryContent.EntryContent}]]></description>
		</item>
		{/foreach}
  </channel>
</rss>

All in all, I think it works pretty well, and I can successfully run the W3C validations on the vast majority of generated pages and get the comforting green checkmark.
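For anyone curious how the {#name|modifier} placeholders in the templates above might be expanded, here is a minimal sketch. It is not the engine that runs this blog: the TemplateSketch class, the flat dictionary keyed by dotted path, and the treatment of the escape modifier as HTML encoding are all assumptions made for illustration, and it ignores {foreach} and {ifcond} entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not the blog's real templating code.
public static class TemplateSketch
{
    // Matches {#path} or {#path|modifier}, e.g. {#blog.Title} or {#buildDate|r}.
    private static readonly Regex Placeholder =
        new Regex(@"\{#([\w.]+)(?:\|([^}]+))?\}");

    public static string Expand(string template, IDictionary<string, object> values)
    {
        return Placeholder.Replace(template, match =>
        {
            string path = match.Groups[1].Value;
            object value;
            if (!values.TryGetValue(path, out value) || value == null)
                return string.Empty; // assumption: unknown names expand to nothing

            string modifier = match.Groups[2].Success ? match.Groups[2].Value : null;
            if (modifier == null)
                return Convert.ToString(value, CultureInfo.InvariantCulture);

            // Assumption: "escape" means HTML/XML-style entity encoding.
            if (modifier == "escape")
                return WebUtility.HtmlEncode(
                    Convert.ToString(value, CultureInfo.InvariantCulture));

            // Any other modifier is treated as a .NET format string.
            IFormattable formattable = value as IFormattable;
            return formattable != null
                ? formattable.ToString(modifier, CultureInfo.InvariantCulture)
                : value.ToString();
        });
    }
}
```

With a dictionary mapping "blog.Title" to "Tom & Jerry", expanding `<title>{#blog.Title|escape}</title>` gives `<title>Tom &amp; Jerry</title>`. A real engine would walk the object graph rather than use a flat dictionary, which is where most of the work lies.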

   
Friday, February 22 2008

Goto Considered Appropriate In Some Cases

One of the most referenced papers in software development has to be Dijkstra's seminal paper titled "Go To Statement Considered Harmful".

Dijkstra didn't actually author the title; it was the creation of an editor, applied en route to the piece being printed in an ACM publication. The original title was "A case against the goto statement".

While the core of the essay is indeed that the goto statement can be harmful, Dijkstra wasn't making an absolute statement (as is commonly claimed, reflecting a tendency towards absolutism found in far too many in this industry). Instead he was commenting on the abuse of goto that was occurring in the industry, calling for a sober evaluation of where it is appropriate, but more importantly where it is not.

Nonetheless, the meme was created and has been reused and abused in innumerable Considered Harmful declarations since.

So...how does a C# 3.0 implementation of Fibonacci differ from a C# 2.0 version?

A month or so back the development webosphere was awash with references to Scott Hanselman's excellent blog, all excitedly linking (rel="titillating"?) to his piece titled "The Weekly Source Code 13 - Fibonacci Edition". This was particularly common in the .NET community, with many linkers describing it as an elucidating example of the many advantages of .NET 3.5 / C# 3.0.

I perused the entry, always eager to absorb that sort of information, but found it less than perfect. I withheld critical comment, hoping it would all just blow away.

Then this morning I opened up Visual Studio and happened to notice a link to his entry on the Start page.

Visual Studio 2008 Start Page

Maybe it's been there for a while (the last date is pretty old) and I just didn't notice it before, but the title used on the Start page pushed me over the edge, coercing me to comment.

Recursion Considered Harmful

There are several issues I have with Scott's Fibonacci entry.

First, the C# 2.0 (henceforth I'm dumping the subversion precision on the language versions) version is oddly dumbed down: C# 2 also has the ternary conditional operator, and it even has anonymous functions (including closure functionality). Yet the demonstrations given contrast the simplest possible C# 2 implementation with the most obtuse C# 3 example.

Basically the only novel difference with the C# 3 example is that it uses a lambda, though of course it would be an absolutely terrible thing to use a lambda for.
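To make the comparison concrete, here is a sketch of the same naive recursive calculation written once with a C# 2 anonymous method and once with a C# 3 lambda. The class, delegate, and method names are mine, invented for illustration; this is not the code from Scott's entry, and neither version is something to use for real.

```csharp
using System;

// Hypothetical names throughout; for comparison only.
public static class VersionComparison
{
    // C# 2 shipped without System.Func, so the delegate type is declared by hand.
    public delegate int IntFunc(int n);

    // C# 2 style: anonymous method, ternary operator, closure over "fib".
    public static int Fib2(int n)
    {
        IntFunc fib = null;
        fib = delegate(int k) { return k > 1 ? fib(k - 1) + fib(k - 2) : k; };
        return fib(n);
    }

    // C# 3 style: the same logic with a lambda and Func<int, int>.
    public static int Fib3(int n)
    {
        Func<int, int> fib = null;
        fib = k => k > 1 ? fib(k - 1) + fib(k - 2) : k;
        return fib(n);
    }
}
```

Side by side, the difference comes down to the lambda arrow replacing the delegate keyword and an explicit return.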

It's not a very good example of the implementation differences between the versions, which is the claim made by the Visual Studio start page, and was the description often used during the dissemination of this piece.

I like C# 3, but this isn't a good demonstration of any advantage of the language.

Worse yet, the only place you'll ever see recursion used to calculate Fibonacci numbers is in "Recursion for Dummies" type examples. To understand why that is, consider Scott's C# 3 example, which he leads into with the statement "Now, here's a great way using C# 3.0".

Here's a logarithmic-scaled chart of the number of function calls necessary to calculate Fibonacci numbers in the C# 3 example Scott gave.

The Horror!

Obviously it gets unusable pretty quickly. Try calculating the 90th Fibonacci number using this kind of naive recursion...
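The growth is easy to measure for yourself. The following is a rough sketch that counts invocations of a naive recursive lambda of the same general shape; the RecursionCost class and CountCalls method are hypothetical names of my own, and the counter is a plain static field, so it is not thread-safe.

```csharp
using System;

// Hypothetical instrumentation, for illustration only.
public static class RecursionCost
{
    private static long calls;

    // Returns how many times the recursive lambda is invoked to compute fib(n).
    public static long CountCalls(int n)
    {
        calls = 0;
        Func<int, long> fib = null;
        fib = k =>
        {
            calls++; // one increment per invocation, including the base cases
            return k > 1 ? fib(k - 1) + fib(k - 2) : k;
        };
        fib(n);
        return calls;
    }
}
```

CountCalls(10) returns 177 and CountCalls(20) returns 21891. Each step up in n multiplies the count by roughly 1.6, which is why the chart needs a logarithmic scale.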

In the same way that Goto can be harmful, the use of recursion is often a sign of badness, and this is no exception. Epic inefficiency is used instead of the obviously simple approach.

long CalcFibonacciNumber(long n)
{
    // Iterative and 1-based: n = 1 and n = 2 both yield 1.
    long current = 1, previous = 0, swapholder;
    while (n-- > 1)
    {
        swapholder = previous;
        previous = current;
        current += swapholder;
    }
    return current;
}

(Ignoring mathematical shortcuts)

Unrealistic Examples Considered Harmful

A lot of readers will be rolling their eyes right about now, muttering something along the lines of "Awww, come on...you didn't seriously think anyone thought that recursion was a good way to calculate Fibonacci numbers, did you? This is beginner's stuff, and no one really thinks that's the right way to do it!"

I'm optimistic about the profession, so no, I didn't really think it was a serious example (though I do think it nonetheless deserves some serious warnings to ensure no one becomes misled).

WARNING: The Code Contained In This Example Will Rot Your Brain. Never Do Something Like This In Real Life. Don't Let Peers See You Looking At Code Like This. Suspend All Critical Thought While Reading This Piece.

Instead it's a sample of "here's a demonstration of how to do something absolutely terrible — almost felony worthy — in a variety of programming languages....".

This is still a serious problem.

The example given is so very wrong — even if it is what's used in Recursion for Dummies books — that it makes it close to impossible to focus on the actual point being made, even if it had used comparable features of each language to demonstrate how the same task could be accomplished in each.

It reminds me of many early web service tutorials and advocacy pieces: Many used absurd examples like "a web service to add two numbers" (and amazing variations such as subtract two numbers, multiply two numbers, divide two numbers, compute the Log10 of a number, and so on. You get my point — things for which a web service would be entirely unsuited).

Stop it!

Stop with the ridiculous no-one-would-(or rather should)-ever-do-it-this-way examples. It completely undermines the value of the examples.

Surely there are realistic examples that would be more appropriate for demonstrating the advantages of lambdas (recursion {is recursion}; [goto {is recursion}], so there isn't much enlightenment provided there). How about "how to build a rudimentary regular expression parser in a variety of languages", or for a web service "pulling weather data from a remote weather station".

Something that a developer isn't going to have to slog through with their brain fighting them on every line, demanding an explanation for the terrible design or algorithm they're supposed to accept at face value.

   
Thursday, May 12 2011

Max Schireson, president of 10gen (the makers of MongoDB), has made the case for document-based data systems – such as MongoDB and CouchDB – by arguing against the heavily-normalized relational model.

Max offers up his entry as a challenge to the “relational-is-always-best set”, asking them to prove that the complexity of storing data in a relational form is worth the trouble, at least for the scenario he describes.

Given that I’ve been anointed as an anti-NoSQL crusader on a number of occasions, I feel obligated to argue on behalf of the relational model, which I will do in a later entry.

This is despite being a big fan of MongoDB. As I have done many times in the past, I encourage everyone to download and play around with the excellent MongoDB product. Do yourself a favour by running through the tutorial.

All things have a place.

Recognizing the MySpace Angle And Its Inverse

I once sat in a meeting where a peer described the purportedly intractable complexity of a task they were failing at. They did this by drawing the various actors on the whiteboard and then detailing their many complex relationships.

Imagine the best path-finding algorithm. Now imagine the opposite: The least efficient, most unnecessarily sloppy routing imaginable.

That was how complexity was deceptively exaggerated, with absurdly circuitous relationship lines weaving to and fro. It was comical.

That memory, and how the deception goes both ways, came to mind while reading Max’s entry, and again when reading the linked entry by MongoDBer Kyle Banker.

When comparing the document model with the relational model, many if not all examples seem to contrast a complex relational model – one that encapsulates an end-to-end platform for a whole domain – against a trivial island of a tiny subset of data in a document structure. The former is usually built to support entire operations and systems, while the latter tends to be crafted for one single purpose (like "allow customer services to look at an order", as was used in Max's scenario).

Max highlights relational complexity by pointing to an Oracle end-to-end order reference platform containing “126 tables”. Kyle does the same thing when comparing a simple could-be-one-single-row document (which humorously includes four relationships, which to resolve would require four expensive round-trips to the MongoDB server given the platform’s bizarre lack of server-side joins) against a complex catalogue schema. Both explain their arguably deceptive comparisons with statements like “Of course, this is not a complete representation of a product”…

I would argue that in such a case such a comparison shouldn’t be made at all. Why contrast an incomplete example of a document-based implementation – simplistic in its useless innocence – against a fully scoped relational platform?

It is the “MySpace angle” used to hide the ugly reality of technology. If you have a MongoDB equivalent of the compared product, have at it, but simply hiding the ugly details and zooming in on a non-functional, cherry-picked subset just misleads potential suitors.

Realtors use this trick when taking photos of homes, showing just enough of the grass while avoiding nearby structures. Your mind naturally extrapolates, imagining expanses of lush green fields, when in reality there’s usually another house imposing itself four feet over.

In Defense of Relational Databases

I have a full workload right now, but in the near future, during a mental lull, I will respond to Max. There is a very compelling counterargument to be made.

 C#  .NET  MongoDB  NoSQL 
   


About the Author
Dennis Forbes is a Toronto-based software architect. While focused primarily on the .NET and SQL Server worlds, Dennis frequently ventures outside of this comfort zone into game development and image processing. He has been published in several industry magazines, has been quoted in the Wall Street Journal and has been interviewed by NPR.

He is a vice president and lead software architect at an innovative New York City hedge fund back-office services firm.

Dennis has been working on solutions for the financial, telecommunications, and power generation markets for over 15 years.





 
