Dennis Forbes on Pragmatic Software Development
Thursday, June 08 2006


FxCop As a Code Quality Tool

For the past while I've been using Visual Studio Team Edition for Software Developers, one of its benefits over the Professional Edition being the inclusion of static code analysis functionality right in the IDE.

This functionality comes via the FxCop codeset, which is an excellent -- albeit unpolished -- freely available tool for analyzing the probable code quality of Intermediate Language assemblies, testing code for compliance with naming standards and best practices, and highlighting areas of code that are suspect. While it's less than pleasant starting FxCop analysis from scratch on a long-existing project -- to be met with hundreds upon hundreds of error messages -- it's a painless process if you add it to your quality checks early on.

The standalone FxCop is largely the same as the VSTE version, and in some ways is superior. For instance, it retains the ability to pass configuration settings to rules, rather than accepting whatever the defaults for the rule are.

Cyclomatic Complexity

One of the few differences between the standalone application and the VSTE-included version is the addition of several new maintenance checks in the Team Edition code, one of the most useful being the cyclomatic complexity checks. Cyclomatic complexity, for those who haven't come across it before, is a count of the independent paths through a piece of code. It is often used to roughly gauge the complexity of that code, to determine likely candidates for refactoring, and to identify what will likely become a maintenance problem in the future. Finding the most complex pieces of code often brings you to the buggiest code as well.
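To make the metric concrete, here is a minimal sketch of the usual counting approach: start at 1 and add one for every decision point. It analyzes Python source through the standard `ast` module purely for illustration; it is not how FxCop or the Team Edition rule inspects IL, and the choice of which node types count as decision points is an assumption that varies from tool to tool.

```python
import ast

# Node types treated as decision points in this sketch (an assumption;
# real tools differ in exactly what they count).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp,
                  ast.ExceptHandler, ast.comprehension)


def cyclomatic_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' has three operands and adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical sample method: two ifs, one loop, and one 'and'.
SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

print(cyclomatic_complexity(SAMPLE))  # prints 5
```

Straight-line code scores 1, and every branch, loop, or short-circuit operator pushes the number up, which is why long chains of conditionals tend to surface first.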

Given that I still use the standalone FxCop, both the .NET 1.1 and .NET 2.0 versions, the lack of consistency between the two was an annoying gap. I stick with the standalone tool not least because the integrated version offers no ability to configure settings for rules, only allowing you to enable or disable them wholesale, which eliminates the ability to set thresholds for tests such as the cyclomatic complexity rules.

Introducing Cyclomatic Complexity Analysis For FxCop

So I implemented a simple cyclomatic counting rule for the standalone FxCop. While in there, I added checks for statement count (the number of intermediate language "statements", a high count of which can be indicative of overly complex methods), and callout count (the number of calls out to other methods, which again can be an indicator of overly complex/convoluted methods).

As one added benefit, I added the ability to log all of these metrics to an SQL-capable OleDb destination (e.g. SQL Server, Access, etc). If you configure an OleDb connection string, as detailed below, you can do data analysis after a run to create pretty reports of the complexity distributions of your projects, and so on.

Download Links

yafla FxCop Rules for .NET 1.1 (e.g. FxCop 1.32)
yafla FxCop Rules for .NET 2.0 (e.g. FxCop 1.35)

Caveats

Like any tool of this type, there is only a moderate correlation between the metrics measured and actual code quality or maintainability: It is entirely possible that the optimal implementation is a highly complex, lengthy method. This tool only provides guidance, helping to determine which code should get a complexity analysis. From there, experience and good judgement have to be applied to determine if it's really a fault. If you're using the .NET 2.0 version of FxCop, make use of the SuppressMessage attribute on methods that are necessarily highly complex.

Instructions

Drop yaflaRules.dll in your FxCop Rules subdirectory (e.g. C:\program files\Microsoft FxCop 1.32\Rules).

If you want more advanced settings, configure FxCop with your targets and selected rules and then save the project file. Open the newly created .FxCop file in an editor (for instance Notepad) and find the <Settings /> element. Expand it to an opening and closing tag (e.g. <Settings></Settings>), and between the two tags add

<Rule TypeName="MethodComplexity"></Rule>

Inside the Rule element, add any of the following entries, each as the Name attribute of an Entry element (an example follows the list) -

Connection String - an OleDb connection string determining where it will log metrics. e.g. Provider=SQLNCLI;Server=(local);Database=Analysis;Trusted_Connection=yes;
Target Table - The target table for metric logging. Default - MethodComplexity
Cyclomatic Critical Error - Level at which a critical error is triggered. Default - 60
Cyclomatic Error - Level at which an error is triggered. Default - 50
Cyclomatic Critical Warning - Level at which a critical warning is triggered. Default - 45
Cyclomatic Warning - Level at which a warning is triggered. Default - 40
Cyclomatic Information - Level at which an information event is triggered. Default - 20
Cyclomatic Recommended - Recommended level. Default - 20
Statements Critical Error - Statement count at which a critical error is triggered. Default - 500
Statements Error - Statement count at which an error is triggered. Default - 350
Statements Critical Warning - Statement count at which a critical warning is triggered. Default - 250
Statements Warning - Statement count at which a warning is triggered. Default - 200
Statements Information - Statement count at which an information event is triggered. Default - 150
Statements Recommended - Recommended maximum statement count per method. Default - 100
Callouts Critical Error - Callout count at which a critical error is triggered. Default - 100
Callouts Error - Callout count at which an error is triggered. Default - 75
Callouts Critical Warning - Callout count at which a critical warning is triggered. Default - 50
Callouts Warning - Callout count at which a warning is triggered. Default - 40
Callouts Information - Callout count at which an information event is triggered. Default - 30
Callouts Recommended - Recommended maximum callout count per method. Default - 30
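As an illustration of how a graded set of thresholds like this can be applied, the sketch below maps a measured cyclomatic value to the most severe level it reaches, using the defaults listed above. The `severity` function is hypothetical, and treating each threshold as inclusive (`>=`) is an assumption; the list above does not say whether a value exactly at a threshold triggers that level.

```python
# Default cyclomatic thresholds from the list above, most severe first.
CYCLOMATIC_LEVELS = [
    ("Critical Error", 60),
    ("Error", 50),
    ("Critical Warning", 45),
    ("Warning", 40),
    ("Information", 20),
]


def severity(value, levels=CYCLOMATIC_LEVELS):
    """Return the most severe level whose threshold the value reaches.

    Thresholds are treated as inclusive, which is an assumption of this
    sketch. Returns None when the value is below every threshold.
    """
    for name, threshold in levels:
        if value >= threshold:
            return name
    return None


print(severity(47))  # prints Critical Warning
print(severity(12))  # prints None
```

The same function works for the statement and callout metrics if you pass a different `levels` list built from their defaults.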

For instance, you might end up with a <Settings> element that looks like the following:

<Settings><Rule TypeName="MethodComplexity"><Entry Name="Connection String">Provider=SQLNCLI;Server=(local);Database=Analysis;Trusted_Connection=yes;</Entry><Entry Name="Callouts Warning">100</Entry><Entry Name="Cyclomatic Critical Warning">500</Entry></Rule></Settings>
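If you would rather generate that fragment than hand-edit it, something along these lines should work. It is only a sketch using Python's standard `xml.etree.ElementTree`; the `build_settings` helper is hypothetical, the threshold values passed in are arbitrary examples, and you would still need to paste the result into the saved .FxCop project file yourself.

```python
import xml.etree.ElementTree as ET


def build_settings(entries):
    """Build a <Settings> fragment for the MethodComplexity rule.

    `entries` maps an Entry name (e.g. "Cyclomatic Warning") to its value.
    """
    settings = ET.Element("Settings")
    rule = ET.SubElement(settings, "Rule", TypeName="MethodComplexity")
    for name, value in entries.items():
        entry = ET.SubElement(rule, "Entry", Name=name)
        entry.text = str(value)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(settings, encoding="unicode")


# Arbitrary example values, not recommendations.
print(build_settings({"Cyclomatic Warning": 30, "Statements Warning": 150}))
```

Letting the library write the element also takes care of escaping characters such as & or < that can turn up in a connection string.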

If you opt to take advantage of metrics logging, the destination table (which by default will be MethodComplexity, unless overridden with the Target Table entry) requires the following columns:

ContainingType - text (e.g. nvarchar(255))
MethodName - text (e.g. nvarchar(255))
Cyclomatic - int
Statements - int
Callouts - int

e.g.
CREATE TABLE [dbo].[MethodComplexity](
 [ContainingType] [nvarchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
 [MethodName] [nvarchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
 [Cyclomatic] [int] NOT NULL,
 [Statements] [int] NOT NULL,
 [Callouts] [int] NOT NULL
) ON [PRIMARY]
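Once a run has logged its metrics, a report on the complexity distribution is a simple GROUP BY away. The sketch below uses an in-memory SQLite database as a stand-in for the real OleDb destination so that it runs anywhere, and the rows inserted are made-up sample data, not output from any real project.

```python
import sqlite3

# In-memory SQLite stands in for the real logging destination (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE MethodComplexity (
        ContainingType TEXT NOT NULL,
        MethodName     TEXT NOT NULL,
        Cyclomatic     INTEGER NOT NULL,
        Statements     INTEGER NOT NULL,
        Callouts       INTEGER NOT NULL
    )""")

# Hypothetical sample rows, purely for illustration.
rows = [
    ("Billing.Invoice", "Calculate", 42, 310, 55),
    ("Billing.Invoice", "Print", 7, 60, 12),
    ("Billing.Customer", "Load", 3, 25, 4),
    ("Reports.Engine", "Render", 45, 400, 80),
    ("Reports.Engine", "Parse", 12, 90, 20),
]
conn.executemany("INSERT INTO MethodComplexity VALUES (?, ?, ?, ?, ?)", rows)


def complexity_distribution(conn, bucket_size=10):
    """Count methods per cyclomatic bucket (0-9, 10-19, ... by default)."""
    query = """
        SELECT (Cyclomatic / ?) * ? AS bucket, COUNT(*) AS methods
        FROM MethodComplexity
        GROUP BY bucket
        ORDER BY bucket"""
    return conn.execute(query, (bucket_size, bucket_size)).fetchall()


for bucket, methods in complexity_distribution(conn):
    print(f"{bucket:>3}+ : {methods} method(s)")
```

Against SQL Server the same idea applies, though the connection code and the integer-division bucketing may need adjusting for that engine.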


Hopefully someone finds this interesting. It scratched my itch.

Wednesday, June 07 2006
A frequent complaint these days is the feeling of being overwhelmed with information: We're getting hundreds of emails, dozens of voice mails, dozens of phone calls, Post-it notes, feed updates, correspondence, thousands of bookmarks and sites we've visited that we know had good information but we just can't find them again, pamphlets and brochures, and it goes on and on.

It gets to the point that it seems like an unmanageable, overwhelming mess.

Often it isn't the tools or the medium, but rather the way that we use them, that causes the problem. Email in particular is frequently misused, and gets maligned for being a productivity waste when it's really flawed usage that's the problem. As such, here are a couple of email tips from an avowed email lover.
  1. Give a concise, but detailed, subject line that accurately conveys exactly what the message is about. This is a huge issue with email, with many apparently hoping to add a note of suspense to their emails by giving vague or misleading subject lines (e.g. "Notes" or "Ideas" or "Feedback" or "Hrmmmm", instead of "SQL Server Presentation Summary Notes" or "New Product Ideas for Mobile Battery Market" or "Comments regarding Q1 2006 Finance Summary"). You should customize reply subject lines as well, adding specific suffixes if your reply deals with particulars (e.g. "RE: Lunch Party - Drink Menu" if you're replying specifically about the drink menu). Any email client worth its weight in electrons threads emails by a hidden message ID, so you shouldn't worry about fragmentation.

    Your subject line should be a critical piece of the communication, allowing the recipient to determine how to fit it into their communications flow.

  2. Provide an "executive summary" for longer emails, of 10 or fewer sentences. It should be an accurate subset of the larger message, minus technical details or discussion points that might not be applicable to all recipients. Everyone appreciates such summaries, and they can help fend off the anti-email crusaders who discourage email to avoid the responsibility of reading it.
These practices primarily benefit the recipients of your missives, but they benefit you as well. You'll have better organization of your sent items, for instance, not to mention that in the future you will go through your emails, amazed that you were the author, trying to quickly figure out what each of them was about in the search for something in particular.

On the theme of efficient communications, I caught a post a few days back where the author detailed how they categorize their RSS feeds into the "20% that matter" and the "80% that don't". This perplexed me, because if 80% don't matter, then why subscribe to them in the first place? Is it some sort of "junk collection" of the internet kind, where there's a feeling of accomplishment having more and more irrelevant information pouring in every day?

Personally I don't use an RSS reader, and I subscribe to no one -- I don't need to know every random thought that goes through Robert Scoble's mind (I think most of his entries are noise, which is how I feel about most frequently updated blogs; there's no way I want a little feed icon blinking every time some three-line snippet pours out), and even worthwhile writers like Joel Spolsky or Seth Godin don't demand immediate attention. Instead, every couple of days, or, for some, every few months, I browse around to all of the sites that I'm a fan of, quickly scanning past all of the flotsam for something worth reading.

This isn't to say that feed readers are bad: Like everything else it's the usage that really matters. Yet if people really, truly think that anxiously watching countless blogs is critical to their industry or technical knowledge, they're focused on entirely the wrong thing (unless they're in the blog industry and they rely upon commenting on other people's comments).

Speaking of being focused on the wrong thing, while doing my bi-weekly dive through the sites, I caught a post by the esteemed Eric Sink - WPF for Laggards - where he discussed WPF, the Windows Presentation Foundation. Having gone through various names and feature lists over the years, this is a new way of developing for the Windows platform, and it will change how a lot of us build software. Is it important to know right now, however (i.e. are Windows developers who aren't up on WPF "laggards")? Of course it isn't, and in many ways details of it are just communications noise that distracts people from the incredible amount of knowledge they need to do their job today.

When WPF is realized, eventually, sometime next year, and as it finally makes its way into the tools that we use, it'll be worth paying attention to. Until then there is little or no advantage -- though often there's a significant time and focus cost -- to jumping on the bandwagon before its time.
Friday, June 02 2006

Like most professionals in the technology field, I jump to Google and other search engines fairly frequently, in pursuit of hints and documentation to help with various technology dilemmas. A quick search on the web and the archives of newsgroups usually saves a tremendous amount of documentation diving and experimentation.

In return for the huge benefit that other people's documented successes, failures, and experiences bring, I have long been motivated to "pay it forward" by posting technology information online myself, hoping to help some future information seekers (in a similar vein, whenever I get a worthwhile answer on the newsgroups, I usually make a habit of hanging around and answering several questions myself, returning the favour to the community). If the search logs are to be believed, over the years quite a few people have found pertinent information here regarding their software development problems or questions.


Lately I've been noticing a declining ratio of utility to search time, however, with quality dropping largely as a result of a growing number of high-pagerank sites feeding cloaked/phantom pages to the Google search engine. Google sees a question/response that correlates closely with what the information seeker is asking, yet a visit by a real user (rather than a search engine spider) quickly finds only the question, with the answer suppressed until the user either a) goes through an irritating, arduous process to sign up as a member on yet another infrequently visited edge site that'll likely sell their email address and bombard them with endless ads, or b) signs up for some sort of pay membership. Given that many of these sites are simply siphoning their content from Usenet or other forums, I'm never going to bother with either option, instead hitting back and following the next link, often to find the same sort of nonsense.

Somehow a small number of these phantom page sites, most of them seemingly linked to by no one legitimate, have taken over the top rankings on Google for a huge range of technical searches. Somehow they haven't been banned by Google yet, despite the fact that cloaked pages are expressly forbidden (if the search engine sees the answer, then any random visitor should immediately see the answer by following the search link, as the search engine hint implies that the immediate page contains the result).

If they feel justified in forcing registration to read often co-opted content, or in charging for a membership, I have absolutely no problem with that -- in fact I think the net would be a better place if there were more commercial opportunities encouraging even more intellectual investments. However they shouldn't fraudulently mislead search engines and search users, and instead should rely upon normal advertising and word-of-excellence for their great utility. Otherwise they should fold, joining the heap of useless websites that could only fool users into visiting.

Don't waste our bloody time! Google shouldn't be implicitly complicit in these irritating schemes.

Speaking of the problem of apparent phantom pages, today I happened to be looking for CodeSmith, a free (albeit crippleware for as long as I can remember, and not freeware as the author continually claims) code generation tool that I had fond memories of several years back. Naturally I begin the search with codesmith freeware.

[Screenshot: Google results for the "codesmith freeware" search]

Great, so the page in question apparently talks about the freeware version of CodeSmith. Only it doesn't, and the text in question doesn't appear anywhere on the linked page. Go ahead -- look at that page and search the source for freeware.

In reality the obsolete and deprecated freeware/crippleware version exists on a totally different page (one which doesn't seem to be referenced anywhere else on the CodeSmith page). So why is my time being wasted with the first, desperate-to-turn-a-sale page? Why is Google entirely misleading me about the contents of said page? This sort of bait-and-switch has to stop.

Completely off-topic, but forcing people to register and anxiously wait for a download key to download a crippled, time-limited version is enormously irritating. It really tries my patience when I just want to validate a product, almost certain that it will turn into a multi-license sale, only to find that I'm forced to go through some B.S. that will inevitably yield annoying sales emails and followups.

Just let me download the demo, and if it's good I'll buy. If it isn't, you don't deserve the right to harass me with promotions and petitions.

Sunday, May 28 2006

I've been playing with Team Foundation Server, Whidbey (Visual Studio 2005), and Yukon (SQL Server 2005) since early in the beta cycles. All three of them are remarkable products, with enormous advances over their predecessors (in the case of TFS, I'm spuriously considering Visual SourceSafe the predecessor, although TFS is an elephant compared to the mouse of VSS), and all of them should be critical components for anyone developing in the Microsoft camp.

All three of them also happen to be a little unpolished, with odd little quirks and errata, hilariously incomplete documentation, and a tendency towards resource hoggishness.

One thing I've found remarkable, however, given that the three of them have been in final form for anywhere from two months to over half a year, is how little real information and how few first-hand accounts are available online. I'm continually hitting roadblocks where there are marginal functions or incomplete documentation, and it's surprising to find zero references to the same problems or questions on any of the normal forums (e.g. Google Groups, online searches, etc). Among the development community, outside of the desperate-to-get-anointed-free-support-MVP crowd, they just don't have the aura of excitement they probably deserve.

Given that there are literally millions of developers and technology hobbyists out there, it's usually the case that any problems one faces are well trodden, and a quick search on the newsgroups usually yields exactly the answer one needs, so this dearth of time-travel support really is disconcerting.

The only conclusion I can draw is that there simply aren't that many developers seriously using these technologies. Visual Studio 2005 is of course seeing some use, but there are still huge armies of developers sticking with 2003 (given the break between .NET 1.1 and 2.0). A lot of SQL shops are still taking a wait-and-see approach with 2005. Team Foundation Server, primarily because of the cost of the Team editions, and the cost of a TFS Server license if you grow past a 5-user team, seems to be fairly rare.

Saturday, May 27 2006

Some educational shows for development shops and development managers can be found, surprisingly, on the Food Network (in the US and Canada; many are also played on, and often originate from, various other "lifestyle" type channels).

Some of these shows are homegrown, such as Restaurant Makeover and Opening Soon, while others are imports, like the excellent Ramsay's Kitchen Nightmares, Jamie Oliver's School Dinners, and Jamie's Great Escape.

You've probably gained the impression that I'm an epicurean, interested in the operations of the restaurants, and probably dreaming of the day when I can open my own ("We'll make the best French onion soup ever!"). While I do like well-prepared menu delights, the food is the least interesting part of these shows, in my opinion. And I have zero interest in opening a restaurant (the crushed-dream rate among restaurateurs has to rank among the worst for passion pursuits), and I like small talk about as much as I like getting a tax bill.

Instead the real messages of these shows boil down to -

  • Passion - When you don't have passion, it's hard to enjoy yourself, much less produce a good product. That holds whether it's a cook that's using mix to make soup (a "copy/paste" chef), or a software developer judiciously copy-pasting, doing the minimum possible to stay employed, dreaming of whatever comes after the work day ends.
  • Communications - Open, honest communication is critical in a team, keeping everyone on the same page, letting everyone contribute to the success.
  • Simplicity - a bad core product isn't made better by embellishments and complexity. The more focused a product is, the more likely it will be of quality.
  • Realism - the end result of realism is usually simplicity, and it's a realization of what your strengths and domain really are, allowing you to narrow your focus. Trying to cater to all guests, or in the case of software to build solutions that handle any problem, is bound to lead to a third-rate solution (or meal) for a wide audience, but a first-rate solution for no one.

Situations analogous to the software development process endlessly play out between the chef and their staff (team managers/leads and their team members), the chef and the front-room staff (team managers/leads and business partners), and the restaurant and customers (the organization and end users). Many times the solutions parallel how the similar situation would be solved in the software development field.

If you relax to television on occasion, and mourn the summertime (speaking to the Northern hemisphere in that statement) dearth of original programming, check out some of these shows for an informative eye-opener.

We're the chefs and menu planners and sous chefs and pastry chefs of the digital world.

Tuesday, May 23 2006

For close to two months now, I've been rather negligent of this blog. The reasons are numerous, but the following is a list of the primary causes.

  • My wife is back to work as a laboratory scientist, now that maternity leave is complete, so "free time" (if such a thing exists with two small children) is getting squeezed entirely out, and...
  • ...Professionally I've been extraordinarily busy, pursuing some new business avenues and opportunities, making it very difficult to allocate time to finishing articles-in-progress. 

    A partial motivation in maintaining a blog/original content system at the outset was to get some "cheap" (if the time dedicated to creating content was valueless) PR to drum up some consulting/software development customers, however that necessity has largely disappeared (and it was only intended as a fail-safe anyway; I never had to actively look for clients, instead relying upon business contacts and word of mouth, and I've actually had to turn away most blog-sourced business due to a lack of capacity). Furthermore, as a PR vehicle for 360notes.com, I think the product itself will earn far more attention than any pimping in these entries ever would.
  • Lastly, but certainly not least, the incredible success of the DNS entries makes everything else almost seem anticlimactic.

    I remember when I first started posting online papers, getting giddy to see that a half a dozen people read them in a week (and I carefully did reverse-IPs to see where they came from, following every referral back to the source), which I knew by downloading and looking at the logs every 15 minutes. As time went on, however, and readership increased, the "dose" required to have any motivational effect inflated, such that having several thousand distinct viewers (e.g. 10,000+ "hits", however nebulous that metric is) in a given day starts to almost seem like a failure (I see newspaper articles gushing about whatever human interest blog of the day caught their eye, and it makes me cynical seeing that they only have 600,000 visitors in a month. "That's only 20,000 visitors a day!"). It's strangely discouraging to think that new efforts will yield only a small portion of the attention the disposable DNS entries did.

    I'm completely over the "hit craving" stage that most bloggers/original content producers go through, and almost entirely disregard the stats. From this perspective, and hoping that I can find a small amount of available time, I'm going to finish up some long-in-the-making articles, along with some other content that I've been wanting to explore. Through it all I promise to disregard the stats.

Thanks for reading along, and have a fantastic day and week ahead!

Dennis

Tuesday, May 23 2006

I've harped on the idea of securing your data several times over the past month. It isn't just a theoretical risk: the data vulnerability hits seem to keep on coming. This time it was an employee who had a production database, containing data for tens of millions of Americans that leaves them vulnerable to identity theft. Apart from the fact that critical production data was on a roaming PC, it seems likely from the response that the data wasn't encrypted or protected in any meaningful way.

It's sad given that this is hardly the first time this has happened, and it'll inevitably keep on happening.


Dennis Forbes - Dennis Forbes is a Toronto-based software architect and technology writer