Dennis Forbes on Software and Technology


About the Author
Dennis Forbes is a Toronto-based software architect. While focused primarily on the .NET and SQL Server worlds, Dennis frequently ventures outside of this comfort zone into game development and image processing. He has been published in several industry magazines, has been quoted in the Wall Street Journal, and has been interviewed by NPR.

He is a vice president and lead software architect at an innovative New York City hedge fund back-office services firm.

Dennis has been working on solutions for the financial, telecommunications, and power generation markets for over 15 years.





 
Tuesday, March 09 2010

Digg And Cassandra, sitting in a B-Tree

Digg recently started transitioning parts of their platform to the Cassandra open-source, Facebook-originated NoSQL solution.

They're the perfect customer for NoSQL: The value per user and transaction is very low, demanding solutions that allow them to scale at minimal cost; some data loss or inconsistency can be accepted; and a lot of the data can be effectively siloed into islands.

Nonetheless, the article they posted about the move is filled with the sort of thinking that has littered the web with misinformation about the relational database.

The fundamental problem is endemic to the relational database mindset, which places the burden of computation on reads rather than writes

The relational database "mindset" imposes no such burden.

It's All About Finding Balance

Indexes, for instance, are a rudimentary tool of every competent database user. Each additional index adds an expense to every write to the table, forcing row changes to update every index in addition to the base table, in return easing certain read scenarios.

You apply as appropriate, striving for the perfect balance between read and write performance.

I posted parts I and II of a very simple "introduction to databases" article back in 2005 (never getting around to finishing part III), and I strongly recommend it to anyone who doesn't understand how indexes work, or how important concepts like covering indexes are (which I'll touch upon later with regard to the Digg scenario).

Many relational database users make heavy use of triggers and cascade activities that slow writes while lubricating reads. While many are wary of triggers in general (especially where business logic gets embedded in the data layer), this is common in the relational database world and makes an appearance in most solutions.

For Digg's particular scenario, however, the RDBMS analogy to their NoSQL approach is a basic materialized view (aka indexed view), which is a feature of most RDBMS products, from big to small.

Implementing materialized views adds a sometimes substantial cost to writes in return for supercharged reads. If I have a particular set of joins and functions that are queried often, I can materialize the view with the appropriate indexes and every change to any of the source tables automatically, as an added cost of the DML, updates the materialized view as well.
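To make the trade-off concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no materialized views, so the sketch emulates one with a trigger-maintained table. The table and trigger names (friends, diggs, friend_diggs, diggs_fanout) are my own illustrative assumptions, not Digg's actual schema, and the trigger only handles inserts into diggs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friends (
    user_id   INTEGER NOT NULL,
    friend_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, friend_id)
);
CREATE TABLE diggs (
    item_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL,
    PRIMARY KEY (item_id, user_id)
);
-- Hypothetical stand-in for a materialized join of friends and diggs:
-- for each user, which of their friends dugg which item.
CREATE TABLE friend_diggs (
    user_id   INTEGER NOT NULL,
    item_id   INTEGER NOT NULL,
    friend_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, item_id, friend_id)
);
-- Every insert into diggs pays the extra cost of fanning out here.
CREATE TRIGGER diggs_fanout AFTER INSERT ON diggs
BEGIN
    INSERT OR IGNORE INTO friend_diggs (user_id, item_id, friend_id)
    SELECT f.user_id, NEW.item_id, NEW.user_id
    FROM friends AS f
    WHERE f.friend_id = NEW.user_id;
END;
""")

# User 1 follows users 2 and 3; user 4 follows user 2.
conn.executemany("INSERT INTO friends VALUES (?, ?)", [(1, 2), (1, 3), (4, 2)])
# Each digg is a single write from the application's point of view.
conn.executemany("INSERT INTO diggs VALUES (?, ?)", [(100, 2), (100, 3), (200, 2)])


def friends_who_dugg(user_id, item_id):
    """Read path: one range lookup on the precomputed table, no join."""
    rows = conn.execute(
        "SELECT friend_id FROM friend_diggs "
        "WHERE user_id = ? AND item_id = ? ORDER BY friend_id",
        (user_id, item_id),
    )
    return [row[0] for row in rows]


print(friends_who_dugg(1, 100))  # [2, 3]
print(friends_who_dugg(4, 200))  # [2]
```

The write side does more work (three diggs produce five rows in friend_diggs), and in return the read side becomes a single primary-key range lookup. A real materialized view would also track deletes and changes to the friends table, which this sketch ignores.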

Some RDBMS products support deferred materialized view updates, where the system automatically queues up the view changes without adding cost to writes against the source tables.

This is very old hat for virtually anyone competent with relational databases, though real-time materialized views need to be used judiciously because they fall under the auspices of ACID and can front-load write operations significantly.

Digg Don't Do Indexing (properly)

Ignoring the obvious solution of materialized views (which, to be fair, aren't natively supported by MySQL despite being a basic feature of most other database products), they reveal that they aren't using the database in the appropriate manner (or that MySQL is simply a broken platform and is turning people against the RDBMS when really they should be against MySQL) when they note that they are manually performing the joins in PHP, claiming that the join takes too long to run as a simple query.

A likely contributor to their poor performance is that while they've made the artificial key "id" their clustered index, their userid/friendid index is only a unique index, and I suspect, from the operation of their site, that they are likely making use of the denormalized friendname column in their consumer as well, forcing a full row lookup for every match.

If they retrieve columns outside of a non-clustered index (the most common mistake is doing a "SELECT *" when you don't actually care about all of the columns), on every lookup match the database server pulls the row id (in this case the primary key) and then has to do another lookup for the actual row data. In their case — given that the relationship is unique — they should have made the compound key of user_id/friend_id the clustered index and eliminated the id column altogether.

This oversight means that instead of doing a simple partial index scan by user_id and pulling the limited set, the query engine is forced to pull the list of rows, and then look up each and every row individually. So someone with 400 friends yields 400 IO cycles, versus 1 with a proper index.
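The difference between a covered and an uncovered lookup can be seen by asking a query planner directly. This is a rough sketch using SQLite (through Python's sqlite3 module) as a stand-in for MySQL. The column names are my approximation of the Digg example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friends (
    id          INTEGER PRIMARY KEY,   -- surrogate key, as in the Digg example
    user_id     INTEGER NOT NULL,
    friend_id   INTEGER NOT NULL,
    friend_name TEXT NOT NULL,         -- denormalized column
    UNIQUE (user_id, friend_id)        -- secondary (non-clustered) index
);
""")


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Only indexed columns requested: the index alone can answer the query.
covered = plan("SELECT friend_id FROM friends WHERE user_id = 42")

# friend_name is not in the index: every match needs a second lookup
# into the base table by its row id.
uncovered = plan("SELECT friend_id, friend_name FROM friends WHERE user_id = 42")

print(covered)
print(uncovered)
```

The first plan should report a covering index, while the second reports a plain index search, which is the per-row extra lookup described above. Making user_id/friend_id the clustered key would remove that second hop for every column.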

The same problem exists in the Diggs table, but is made worse. The userid index is of limited value given that, again, it only helps them look up the surrogate record key (again, why not a primary key on itemid/userid with a secondary index of userid/itemid? Surrogate keys are usually a mistake if there's another unique key on the table, though of course it depends upon the scenario: foreign keys or numerous secondary indexes might make such a simple key the best choice). The query engine is forced to look up the records by either the itemid or the userid (by the friendid), then look up the root record, and then compare the corresponding value.

So many developers are so blissfully ignorant of how databases work, quick to ascribe their own shortcomings to the platform. Most will wave their hands and talk about how hard to come by a "good DBA" is, which is akin to pushing brutal bubble-sort algorithms and just distributing them across a MapReduce deployment, claiming that a good "sort algorithm guy" is hard to find and "scaling out" is what the big boys do.

So they could see a major performance improvement by indexing properly (I'm allowing that maybe they just gave a bad example, though their atrocious query performance seems to validate its accuracy), but even then looking up hundreds of seemingly randomly distributed records can be a costly exercise.

Change Is In The Air

Let's step back for a minute and ignore materialized views and appropriately created and used indexes and look at the core performance issue that Digg faced — looking up several hundred rows in the Friends table, and interrogating the Diggs table by userid/itemid for the same. Presume that the dataset is very large and it can't be cached in memory, which should be a normal design assumption.

Why is looking up several hundred randomly distributed records such a big deal?

Hard Drive

That's why. Most hard drives can only manage to seek to different locations on the disk about a hundred times per second. If you're relying on Amazon's EBS you have it even worse, with an estimated 72 IOPS.

That's slow.

Imagine that the query engine has a hundred row locations in hand; it would take it a full second to jump around the disk to gather up the data necessary to retrieve the contents of those rows. That's a best-case scenario, because in the real world it usually has to walk the index b-trees, find the matching data, and, in the case of Digg's inappropriately indexed table, do yet another lookup to find the actual row itself.
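The arithmetic is simple enough to sketch. The function below is a back-of-envelope model of my own, assuming every row costs a fixed number of uncached random seeks; real storage adds caching, queuing, and read-ahead that this ignores.

```python
def seconds_for_lookups(rows, iops, seeks_per_row=1):
    """Time to fetch `rows` randomly placed rows from a device that
    sustains `iops` random reads per second."""
    return rows * seeks_per_row / iops


# A hundred row locations on a ~100 IOPS magnetic disk.
print(seconds_for_lookups(100, 100))                    # 1.0

# A user with 400 friends.
print(seconds_for_lookups(400, 100))                    # 4.0

# Same user when each match needs an index seek plus a row seek.
print(seconds_for_lookups(400, 100, seeks_per_row=2))   # 8.0

# The same 400 lookups at the estimated 72 IOPS of an EBS volume.
print(round(seconds_for_lookups(400, 72), 2))           # 5.56
```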

This is why database systems often completely ignore indexes if the estimated match count exceeds a relatively small percentage of the data, anemic storage systems forcing them to do expensive operations like full scans because in the end it's a cheaper choice. Why it often just reads and filters a burst of MBs of data rather than select a few sparse records from an index.
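A toy cost model shows why the scan wins so early. This is only a sketch under stated assumptions: 100 random IOPS as above, a sequential read rate of 100 MB/s (my assumption, not a measured figure), and one seek per matching row. Real optimizers use far richer statistics.

```python
def cheaper_plan(matches, table_mb, iops=100, seq_mb_per_s=100.0):
    """Pick the cheaper access path under a crude two-term cost model."""
    index_seconds = matches / iops            # one random seek per matching row
    scan_seconds = table_mb / seq_mb_per_s    # read the whole table sequentially
    return "index" if index_seconds < scan_seconds else "scan"


# A hypothetical 1,000 MB table: a full scan takes about 10 seconds.
print(cheaper_plan(500, 1000))    # index (5 s of seeks beats a 10 s scan)
print(cheaper_plan(5000, 1000))   # scan  (50 s of seeks loses to a 10 s scan)
```

With these numbers the break-even point is 1,000 matching rows. If that table held ten million rows, the index would stop paying off at around 0.01% of the data.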

It's why it's desirable to have the data in RAM, and why database servers should be loaded with copious memory. [Sidenote: It's also why denormalizing can paradoxically slow down a database in many scenarios because it grows a database beyond RAM unnecessarily. In the Digg case note the username and friendname fields in the Friends table]

The IOPS weak-point is why most enterprise databases add SANs with ranks and ranks of hard drives, ganging them together in such a way that many seeks occur simultaneously, vastly increasing the I/O rate.

A more attainable and far more disruptive advance is moving into reality, however, and that is SSDs.

Take a look at the Anandtech review of the OCZ Vertex LE 100GB MLC SSD. In particular look at the 4KB random read - MB/s results on page 10. Near the bottom are a couple of magnetic disks, including the esteemed VelociRaptor, which are absolutely decimated by the SSDs.

That is the test that is most applicable to the Digg scenario, and it is clearly evident how big of an impact it would have on their situation.

Instead of 100 IOPS, they would be looking at 15,000 IOPS. Put 6 of these in a RAID-10 array and you'd have a yield of 45,000 IOPS, plus reliability. Even without learning how to properly index they could see an easy 450x performance improvement in that class of RDBMS queries. Add a materialized view and...the speed would be so obscene it would get banned from the App Store.

Those units are just $400 a piece, and the technology keeps getting bigger and faster and cheaper. SSDs are a deeply, deeply disruptive change, especially to the large-scale database world.

The drive I mentioned is an MLC unit that isn't intended for the enterprise market, but in some ways it fits the same role as NoSQL: less reliable, but it gets the job done. The nature of the Digg table (that it is largely an additive table with likely little churn) is the perfect use-case for an MLC SSD.

And really, 100GB is a lot of space for an operational database, even for a social media site. While it isn't appropriate for Facebook's 25TB "figure out how to sell you junk you don't want" daily activity log, it is certainly adequate for all of the Diggs and Friend relationships Digg would need, especially when removing denormalization that was put in place because of the poor IOPS of magnetic disks. And of course with the magic of RAID you can scale it up to whatever heights you'd like.

For $400.

Soon we'll have even faster, larger drives that are cheaper, and so on. The nature of flash technology is that they can keep making it more and more parallel, so the IOPS are going to keep going up and up and up.

Optimizing against slow seek times is quickly going to become a negative-return activity. Many who embrace NoSQL are seeking a solution to yesterday's problem. Digg, for instance, derives their entire NoSQL benefit from optimizing data locality (all data for a given need is nicely bunched together), which of course is what materialized views do as well.

There Are Incredible MPP Options in the RDBMS World...But They'll Cost You

The people who really demand high levels of database performance usually have a lot of money, which is why many of the products that deliver options like column-oriented storage (an implementation detail of an RDBMS that is primarily suited to very large-scale column aggregations; it isn't suitable for an OLTP DB) or MPP (Massively Parallel Processing) cost absurdly high amounts.

Greenplum, Vertica, Teradata, ParAccel, Oracle RAC, Sybase ASE, DB2 MPP...these things are often priced out of all but the largest enterprises' reach.

Look at the pricing of the upcoming release of SQL Server 2008 R2, in particular the Parallel Data Warehouse product that brings MPP to that server. $58K per processor, which obviously excludes it from contention for the vast majority of applications.

Come on.

If there is one thing that I would like to see come out of the NoSQL advocacy movement, it would be that mainstream databases feel the pressure to push down the functionality that they currently limit to the people with the biggest bank accounts (which they sell using the "how much do you have?" pricing model).

  SQL   NoSQL 
Tuesday, March 02 2010

Getting Defensive

I work in the financial industry. RDBMSs and the Structured Query Language (SQL) can be found at the nucleus of most of our solutions.

The same was true when I worked in the insurance, telecommunication, and power generation industries.

So it piqued my interest when a peer recently forwarded an article titled “The end of SQL and relational databases”, adding the subject line “We’re living in the past”.

[Though as Michael Stonebraker points out, SQL the query language actually has remarkably little to do with the debate. It would be more clearly called NoACID]

That series focuses on NoSQL as the challenger to the throne. It isn’t alone, as the past year has yielded a bountiful crop of articles and blog entries declaring the imminent death of the decrepit relational database at the hands of this new innovation.

Most get posted with incendiary, absolute statements against the RDBMS.

The ACIDy, Transactional, RDBMS doesn’t scale, and it needs to be relegated to the proper dustbin before it does any more damage to engineers trying to write scalable software.

And they usually see later edits that blunt the original euphoria.

postnote: This isn’t about a complete death of the RDBMS. Just the death of the idea that it’s a tool meant for all your structured data storage needs.

Indeed.

Few hold the RDBMS as the only tool for all of your structured or unstructured data storage needs, though that strawman makes an appearance in many NoSQL advocacy pieces, adding some unintentional comedy (“irony”) given that the same entries usually call for the death of the RDBMS, with NoSQL declared the one true way to store and retrieve data.

Page 493 (as labelled by page) of the article “The Paradoxical Success of Aspect-Oriented Programming” includes a fantastic quote and graphic from an IEEE editorial by James Bezdek in IEEE Transactions on Fuzzy Systems.

[I quote indirectly given that the original source isn’t publicly available]

Every new technology begins with naive euphoria — its inventor(s) are usually submersed in the ideas  themselves; it is their immediate colleagues that experience most of  the wild enthusiasm. Most technologies are overpromised, more often  than not simply to generate funds to continue the work, for funding is an integral part of scientific development; without it, only the most  imaginative and revolutionary ideas make it beyond the embryonic stage. Hype is a natural handmaiden to overpromise, and most technologies build rapidly to a peak of hype. Following this, there is almost always an  overreaction to ideas that are not fully developed, and this inevitably leads to a crash of sorts, followed by a period of wallowing in the depths of cynicism. Many new technologies evolve to this point, and then fade away. The ones that survive do so because someone finds a good use (= true user benefit) for the basic ideas.

In the case of the NoSQL hype, it isn’t generally the inventors over-stating its relevance — most of them are quite brilliant, pragmatic devs — but instead it is loads and loads of terrible-at-SQL developers who hope this movement invalidates their weakness.

Some sort of Fight Club ground zero wiping of the records, rewriting the rules of the game.

It doesn’t.

Nonetheless there is indisputably a lot of fantastic work happening among the NoSQL camp, with a very strong focus on scalability.

So what is scalability, anyways?

Scalability is a poorly-defined concept that, more often than not, is twisted to suit the speaker’s agenda. Scalability is often the excuse to engage in absurd hypotheticals to sell a particular blend of fanaticism.

Putting aside wordplay — or perhaps to engage in some of my own — scalability is pragmatically the measure of a solution’s ability to grow to the highest realistic level of usage in an achievable fashion, while maintaining acceptable service levels.

Imagine that you’ve built an internal help ticket tracking system for your branch office of Money Bags Corporation. If you had to describe the data needs in three points, they would be:

  • Data is highly interrelated (relational)
  • High-value users and transactions
  • Data consistency and reliability are primary concerns

You decide to go against the hype and build it on a classic RDBMS system.

Will it scale to the real-world requirements?

There are some real scalability concerns with old school relational database systems. Adam Wiggins does a pretty good job of covering the techniques to scale a SQL database, though I strongly disagree with his end assertion.

You face those concerns on that glorious day the CEO calls to tell you that the board is super excited about your team’s help ticket system, built on SQL Server, and they want you to deploy it corporation wide. For data consistency purposes they want a single instance, instead of alternative deployment scenarios like pushing out an instance (“shard”) for each division.

Can you make it work?

When Money Is No Object

Of course you can. Even on the maligned Windows platform.

From a vertical scaling perspective — it’s the easiest and often the most computationally effective way to scale (albeit being very inefficient from a cost perspective) — you have the capacity to deploy your solution on powerful systems with armies of powerful cores, hundreds of GBs of memory, operating against SAN arrays with ranks and ranks of SSDs.

The computational and I/O capacity possible on a single “machine” are positively enormous. The storage system, which is the biggest limiting factor on most database platforms, is ridiculously scalable, especially in the bold new world of SSDs (or flash cards like the FusionIO).

Such a platform can yield very satisfactory performance for tens or hundreds of thousands of active users in most usage and application scenarios (where generally clients talk to a farm of middleware servers).

Of course if you index poorly or create some horrendous joins you can screw it up, but with competency it will be good times for all. Even with billions upon billions of help tickets.

For the purposes of the application, the scalability requirement is completely satisfied — total scalability is achieved in the context of the application.

But it doesn’t end there.

From a horizontal scaling perspective you can partition the data across many machines, ideally configuring each machine in a failover cluster so you have complete redundancy and availability. With Oracle RAC and Sybase ASE you can even add the classic clustering approach.
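As a rough illustration of what partitioning means at the application or middleware layer, here is a minimal hash-routing sketch. The function name, the key format, and the partition count are all hypothetical; real products typically partition by declared ranges or hash functions configured in the database itself.

```python
import hashlib


def partition_for(key: str, partitions: int) -> int:
    """Map a partitioning key to one of `partitions` machines.

    Uses a stable hash so every middleware server routes the same key
    to the same machine (Python's built-in hash() is randomized per process).
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


# Hypothetical help-ticket keys, routed across four database machines.
for ticket_key in ("division-east/ticket-1001", "division-west/ticket-1002"):
    print(ticket_key, "->", partition_for(ticket_key, 4))
```

One design caveat: with plain modulo routing, changing the number of partitions remaps most keys, which is why systems that expect to grow tend to use range maps or consistent hashing instead.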

Such a solution — even on a stodgy old RDBMS — is scalable far beyond any real world need because you’ve built a system for a large corporation, deployed in your own datacenter, with few constraints beyond the limits of technology and the platform.

Your solution will cost hundreds of thousands of dollars (if not millions) to deploy, but that isn’t a critical blocking point for most enterprises.

This sort of scaling is at the heart of virtually every bank, trading system, energy platform, retailing system, and so on.

To claim that SQL systems don’t scale, in defiance of such obvious and overwhelming evidence, defies all reason.

And you don't need to spend a million dollars. A mid-level Dell server can easily handle the vast majority of real-world database needs: No, your project likely isn't going to have the needs of Twitter, Flickr, or Facebook. You can grab a four CPU Dell server hosting a total of 24 cores of latest-tech computing power, with 128GB of RAM, for around $15,000. That is beefier than the systems that ran many enterprises just a few short years ago.

Artificially Limited Scalability

Imagine that you’re a start-up building your big new Social Media site.

Obviously you don’t have your own datacenter, but instead you’re going with cloud servers to host your solution.

You don’t have the option (much less the finances) to buy and install a Unisys 7600R, or even a loaded Dell R905. You don’t have TBs of memory or massive I/O at your disposal.

Instead you have to go with the options available on a host like Amazon’s EC2, where the most powerful choice available is the High-Memory Quadruple Extra Large (!) option at $2.40 / hour (at writing), or about $21,024 a year, which is a fairly reasonable rate given that an equivalent purchased server would run you about ten thousand dollars up front.
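The yearly figure is easy to check, along with a rough break-even against buying the hardware. This sketch uses only the two prices quoted above and deliberately ignores hosting, power, bandwidth, and administration costs on the purchased side, so treat the break-even as an illustration rather than a buying guide.

```python
HOURLY_RATE = 2.40        # dollars per hour, the rate quoted above
HOURS_PER_YEAR = 24 * 365
SERVER_PRICE = 10_000     # dollars, the "about ten thousand" purchase estimate

annual_cost = HOURLY_RATE * HOURS_PER_YEAR
break_even_days = SERVER_PRICE / HOURLY_RATE / 24

print(round(annual_cost))       # 21024
print(round(break_even_days))   # 174
```

Running the instance around the clock costs as much as the purchased server after roughly six months, which is why the rental only makes sense if you can turn it off.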

This is very powerful compared to their historic maxed-out image — the puny large image that used to represent the top end — and is large compared to the max of many other cloud hosts, yet it is entry level in the RDBMS database world.

I/O on the EBS has been measured with a throughput in the 30MB/second range with about 72 IOPS per volume, which is one-half the speed that my Atom-based home NAS achieves. You can stripe multiple volumes into a software RAID array, but you quickly hit the limit of the I/O available to your instance.

For comparison we’re currently looking at an entry level $8K 36TB iSCSI device that would offer our database a dedicated 400MB/second throughput and about 1500 IOPS, and this is for a pretty humble low-criticality need with low-end magnetic drives.

As a speculative start-up you don’t want to commit $20K/year to have a single instance hanging around, especially given that your traffic is extremely variable and most of the time it will sit idle. You want to run the smallest database layer possible, ramping up if the need (fingers crossed) arises.

In an ideal world you could float along on a small instance economically until that big day when you get mentioned on Digg, at which point you spool up ten extra large instances, turning them off when the need passes.

These financial and artificial limits explain the strong interest in technologies that allow you to spin up and cycle down as needed. It’s why the old guard has largely remained quiet (because it solves a problem that they don’t have, notwithstanding any manufactured “my friend has a super-duper 512CPU Sun box and it is always overloaded!” scenarios), while a million hopeful start-ups with their small EC2 instances are loudly bleating about the limits of scalability with SQL systems.

The Needs of a Bank Aren’t Universal

The world of financial firms and retailers and other RDBMS users is very different than the popular social media scenario usually played out.

If you had to describe your social media data needs in three points, they would be:

  • Largely unrelated islands of data
  • Very low user/transaction value
  • Data integrity is not critical. If you lose a Status Update, or several thousand of them, it will likely go unnoticed, or at least won't cause a major situation.

MySQL originally lacked many traditionally mandatory RDBMS elements, such as transactions, without which it is extremely difficult to maintain a high level of data integrity. That didn’t dissuade many of its boosters who declared that it was an unnecessary cost for the purposes that they used it.

They were right.  As MySQL has moved towards the values of traditional databases, it has moved away from its original bag-of-data values.

The truth is that you don’t need ACID for Facebook status updates or tweets or Slashdot comments. So long as your business and presentation layers can robustly deal with inconsistent data, it doesn’t really matter. It isn't ideal, obviously, and preferably you see zero data loss, inconsistency, or service interruption. However, accepting data loss or inconsistency (even just temporary) as a possibility breaks you free of by far the biggest scaling "hindrance" of the RDBMS world, and can yield dramatic flexibility.

This is the case for many social media sites: data integrity is largely optional, and the expense to guarantee it is an unnecessary expenditure. When you yield pennies for ad clicks after thousands of users and hundreds of thousands of transactions, you start to look to optimize.

The same efficiency applies to highly relational schemas — if you can just serialize object graphs and that’s all you need, why bother normalizing? Many would argue that it’s a premature optimization, but if it’s all you need it might be the best choice.
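For what that looks like in practice, here is a deliberately tiny sketch: a whole object graph stored as one serialized value under one key. The dict stands in for whatever key-value store you might use, and the profile shape and key format are made up for illustration.

```python
import json

store = {}  # stand-in for a key-value store; purely illustrative


def save_profile(user_id, profile):
    """Write the whole object graph as a single serialized value."""
    store[f"user:{user_id}"] = json.dumps(profile)


def load_profile(user_id):
    """Read it back with one key lookup: no joins, no schema."""
    return json.loads(store[f"user:{user_id}"])


save_profile(7, {
    "name": "example-user",
    "friends": [2, 3],
    "recent_diggs": [{"item_id": 100}, {"item_id": 200}],
})

print(load_profile(7)["friends"])  # [2, 3]
```

The cost is the one the surrounding text implies: nothing enforces that friend 2 exists, and answering "who has 2 as a friend?" means reading every profile.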

Both of those decisions would be outrageously negligent in many other industries, but the rules that apply for a banking system have woefully little applicability to a social media site.

SQL is Scalable and NoSQL Isn’t For Everyone

The point is one that I think all rational people already realize: The ACID RDBMS isn’t appropriate for every need, nor is the NoSQL solution.

A social media site is not an inventory system. A banking account management system is not a social news aggregator.

Picking and choosing database terminology from the Wikipedia entry on RDBMSs doesn’t equip the speaker with an expert level of knowledge to declare the truth about the database industry.

Scalability noise based upon the limitations of a cloud vendor’s offerings needs to be put into context: They don’t apply to most of the users of relational databases.

MySQL isn’t the vanguard of the RDBMS world. Issues and concerns with it on high load sites have remarkably little relevance to other database systems.

And of course the SQL/RDBMS world is changing (sidenote: Few love SQL, but I’ve yet to see a viable replacement). Wouldn’t it be a grand world where every desktop (platforms that spend about 99% of their time completely idle) in a corporation was a part of the corporate cloud, all seamlessly acting as a part of the corporate information system in a reliable, redundant way? A simple SQL statement silently and transparently fulfilled by hundreds of distributed systems?

We’ll get there.

Aside: I'm currently building a solution (to fill this space) that significantly leans on Project Voldemort. I have somehow managed to remain rational.

Postnote

This is one of those rants that strangely gets attention, with several taking it as anti-NoSQL, or even pro-RDBMS, I assume because positions so often seem to be polarized. It is neither, which is quite evident if read with an unbiased mind: Defending the real world practical scalability of the maligned RDBMS merely brings accuracy to the debate. Several have asked if I'm merely attacking a strawman: Aside from several specific links that I gave above (I am loath to add more as I've engaged in the blog-to-blog arguments too many times before), I find it hard to believe that these people take part in any technology discussion forum or group, where NoSQL is being quite widely, and often without question, held as successor to the RDBMS...the new evolution of database systems.

The motivation of the post is that the discussion is, by nature of the venue, hijacked by people building or hoping to build very large scale web properties (all hoping to be the next Facebook), and the values and judgments of that arena are then cast across the entire database industry — which comprises a set of solutions that absolutely dwarf the edge cases of social media — which is really...extraordinary. It's a bit like moving to the bottom of the ocean and declaring that everyone should start using submarines to commute.

There have been edge conditions in the database world for as long as there has been an industry. High performance logging/data acquisition (often distributed), for instance, has always been a case for which traditional RDBMS systems aren't suited, and where they should be jettisoned. The industry didn't rewrite the rules because of those fringe cases, however, for good reason.

  SQL   NoSQL 
Wednesday, February 24 2010

More Apple/Android Junk?

I have so many things I’d like to write about – topics having nothing whatsoever to do with Apple or Android, like IoC, ARM assembly, rational NoSQL, and of course pragmatic software architecture – but various mobile issues keep mentally demanding that I hop up on a soapbox about them for a bit. I have to get this out of the way.

Mobile computing is absolutely the most important realm for this industry over the coming decade, so pay attention to what is happening because it really, really matters.

CIBC – Leader and Innovator, or Me-Too Wannabe?

The CIBC Platinum Card

CIBC, a large Canadian bank, just launched a nationwide advertising campaign to promote their newly released iPhone banking application. You can see the video on YouTube.

It's notable that CIBC isn't targeting iPhone-only venues with these ads, which they could easily and cost-effectively do, but instead they are promoting this during primetime Olympics coverage. They're putting it front and center on their website.

If you caught the commercial you might have mistaken it for an Apple ad, given that the strongest takeaway is a subtle "there's an app for that" message, followed by the implicit declaration that iPhone customers – a small minority of CIBC customers – are the elite, their walled garden needing its own special flowers.

Maybe Apple subsidized the ads and the product itself. It’s a little surprising for a large and profitable bank to look for ad subsidies, but it’s the only conclusion I can draw.

It’s either that or CIBC has an Apple fanboy wreaking havoc from the executive level.

In any case, big deal: CIBC makes an app for the iPhone (first bank in Canada to do so, they proudly boast). Just serving customers, right?

Consider that CIBC has never offered a rich-client Windows application for banking, which is a statement that is true for every Canadian bank as far as I know.

They will let you download data to Quicken or Money or what have you, on whatever platform you’d like, but if you want to bank electronically as an end-user the cross-platform web browser is and has always been your electronic banking tool, even when it limited them to a very simplistic interface.

They knew not to fall into the ActiveX quagmire like, say, South Korea. The banks have always supported just about any modern client equally.

Think about that for a bit: They have never directly supported the rich interface of the overwhelmingly dominant client platform for PCs.

And for very good reasons.

Yet now they have special, premier support for one far-from-dominant smartphone.

Given that history of device and platform treatment, it’s natural to presume that they have some fantastic and compelling reason for making this change: Maybe they’re using amazing 3D graphics of money flow or something on the device. Maybe a breathtaking augmented reality experience that allows you to visualize your debt load increasing when the camera is pointed at that new must-have device you really want at the electronics superstore, a virtual banker sternly shaking his head no.

There must be something that they just couldn't do without "going native", right?

Nope.

Their iPhone app features shockingly basic functionality. The single place where it could use something even remotely client-rich – to get the user's location to find the nearest branch – they screw it up and force you to type your location in.

This Is What Web Apps Are Made For

This Is An HTML5 App on the iPhone

This application was made for HTML 5, which humorously would easily allow them to use the Geolocation API to get the user's current location for richer and more intuitive mapping.

And let’s give credit where it is definitely due: The iPhone features excellent web app support, arguably best of class, likely because that was originally the only way to create applications for the device.

Jobs’ original vision was that the phone would offer a native Apple experience enhanced by a rich and robust web application ecosystem. That was the phone that they originally delivered.

That web richness allows you to make apps that look and act just like an iPhone application with some simple targeted styles and scripting, offering rich and robust functionality and features. It also allows you to avoid going hat-in-hand through Apple's app review process for every update, as is demanded when you publish via the App Store.

So imagine a world where CIBC decided that they didn’t need to kneel in worship before Apple, trying to suck some Apple-idolizing droppings from the dirty ground, and they’d release this as an HTML 5 app.

It would feature the same look and feel, could easily support all of the same functionality (without breaking a sweat). It would almost certainly be far more maintainable, and could function like a minimized version of the web app they already have for PCs, without necessarily demanding new public-facing web service APIs.

Win all around, right? Well that’s just the start.

If CIBC did it the correct way – as an HTML 5 app – it would also work on Android devices (including crazy features like local databases and geolocation and all of the snazzy dynamics), such as the hot new Milestone coming to Telus, and the anticipated Acer Liquid and Sony X10 coming to Rogers, and more importantly – this is the land of RIM – it would work on the newly revamped Blackberry webkit browser coming shortly, which is worthwhile given that Blackberry remains atop Apple in the “smartphone” category, especially here in Canada.

Remember that Apple far from dominates the Smartphone category, and competition is only getting fiercer. Canada has never been as iPhone-crazed as the US, and a number of compelling non-iPhone smartphones are just hitting our shores.

So if they went the HTML 5 route, they could offer a rich experience on all capable devices, easily stylable and feature-scaling to optimize the platform experience. Anything would be better than the WAPishly rudimentary "everyone else" dumpbin interface they currently support for every other mobile device.

Didn’t They Listen To Steve Jobs?

Juxtapose Steve Jobs telling us that the iPad doesn't need Flash because HTML 5 makes it irrelevant – a premature statement, but the time will come when his words will seem prophetic – with organizations like CIBC porting absolutely rudimentary web functionality into native apps, wasting time and resources and cash, primarily benefiting Apple, while undermining marketplace choice.

Very backwards move, CIBC. It doesn't make you look hip and on-the-ball, but instead makes you look like Apple-salesmen hoping for your little bit of me-tooism hipster credibility.

Given how boastful CIBC is about being the first bank to feature an iPhone interface, it would be delightful to see another Canadian bank, such as my old workplace RBC, take the high road and come out with rich and robust mobile web apps that don’t favour one walled garden without cause.

They could show it running just as richly on a Blackberry, gaining benefit from the glow of Canadian patriotism.

A Place for Web Apps and a Place for Native Apps

While Jobs is quick to declare the end of Flash in pitching the iPad, the reality is that there are serious gaps in what HTML 5 web apps are capable of.

Graphical games, for instance, aren’t a web app reality without either adding Flash to the equation or going native (e.g. OpenGL on Android or the iPhone). Some day down the road it will be possible, but that isn’t reasonably the case right now, aside from some tech demos that make a high-end desktop grind to a halt.

Apps that exploit special features and functions of the hardware generally can’t be web apps either, at least until the feature is so common and prolific that it gets baked into the shared standards, as geolocation has. I’m sure at some future point we’ll have “camera” and “webcam” and “DSP” APIs to access from JavaScript, but for now those are native app domains.

Mr. Jobs’ statement could more honestly be worded as “Flash isn’t necessary when you have HTML....and apps from the Apple App Store”.

Platform specific apps are needed for a lot of solutions. That goes without saying.

Still, porting absolutely rudimentary functionality to native apps is a backwards repeat of mistakes made in the past, walling the garden off for no logical reason.

CIBC is hardly alone in making this foolish, foolish move, but given that they seem to be so proud of this mistake they deserve particular criticism.

So You Own An iPhone and You Don’t See The Problem

My Youngest Doesn't Have an iPhone...yet

Even if you own an iPhone, and you happily imagine a world where your children’s children will have iPhones, you should still view moves like CIBC’s with intense cynicism.

Not only are they limiting your choice unnecessarily if you ever decide to consider alternatives (as everyone should always be doing), but even if you’ve declared fealty to Apple forever and ever, the movements of organizations like CIBC are diminishing Apple’s need to be competitive.

Recall what happened to Internet Explorer after so much of the web (outside of Canadian banks, notably) decided that they would treat IE users as first class citizens, everyone else ignorable chumps. Once that lock-in was established Microsoft had little incentive to work on their browser and they took their users for granted. We’re still trying to pull ourselves out of that mistake.

Apple isn’t Microsoft, but by the end of 2010 they will likely exceed Microsoft’s market capitalization, which is absolutely shocking. Corporate sludge is inevitable at some point. If something happens to Steve Jobs, for instance, and they recruit Ballmer to run the place, you might decide to consider alternatives only to find that you're tied to the platform in a thousand seemingly minor but cumulative ways.

Competition is good. Building up the walled-garden of the iPhone undermines competition, and encourages a foolish Windowsification of the mobile world.

Thursday, January 28 2010

Reporting On A Twitter Feed Live

I passively monitored Apple’s much anticipated announcement yesterday via a TechCrunch live feed. Apple makes a lot of brilliant products, and their announcements have a big impact, so it's beneficial for anyone in this industry to keep interested.

The TechCrunch show consisted of a couple of people monitoring the Twitter feed of someone actually invited to the event while incompetently dealing with technical challenges like “show a graphic” or “don’t abruptly inject a floor audio feed without warning”.

One of the hosts demonstrated why so many of us have an automatic skepticism about the critical reception of new Apple products: As the picture of the product came onto their feed – carried down from the mountain by Jobs – her reaction was “Uhhhhh....it’s gorrrrrrrrgeous!”

Her observation is only shared by the truly faithful, though surely the rest of the Apple herd will inevitably come around. You can be sure that going forward this nondescript rectangle will become the new benchmark of product beauty.

Everything that follows will either be ugly in comparison, or declared a rip off. I just discovered a digital photo frame beside me which I sadly must report is a rip-off of the pure, blessed genius of Apple.

Early Prototype

A Big iPhone

We now know that the iPad is essentially an iPhone with a larger (low resolution, 4:3 ratio) screen, minus voice. Clearly it runs an ARM-derived processor, with performance likely very similar to a 1GHz Snapdragon. Apple is talking a big game about the A4 system on a chip (saying things like “Intel is looking to do this with their Atom”, ignoring all that came before to pretend that they lead the pack. It's like coming in last in the marathon yet talking about how you finished before next year’s winner), so it would be interesting to see it put to the test against, for instance, the Tegra 2.

One other feature of the iPad is that you can change the background. Apparently that’s a pretty big deal.

The iPad seems to be the continuation of Apple’s platform royalty play, and may be subsidized in the same way that Microsoft or Sony sell their consoles. With this device Apple is going upscale, moving beyond the repackaged web pages and novelty water cooler apps that overwhelmingly dominate the app store. Getting a cut of magazines and books and even more media will surely pad their pockets.

To repeat what I said before, Apple and Sony would be a perfect union. Their modus operandi is virtually identical. Aside from the common quest to act as the troll under the bridge collecting a toll, they share a profound propensity for endlessly reinventing things that came before, cluttering their devices with proprietary plugs and connectors and cards and slots.

The iPad puts into focus why Apple has been so vigilant about maintaining their strict ecosystem command and control of the iPlatform. While some points were debatable with the iPhone (and were cause for much stupidity when otherwise intelligent technical commentators made ridiculous excuses for the restrictions and limitations of the platform, trying to sell some piss water as lemonade), with the iPad it’s clear that it’s for the same reason that the console makers lock down their platform, though the lame excuses are already being doled out.

It certainly isn’t to benefit the consumer. We had shades of this years back as Microsoft built out the trusted-computing platform, and one feared possibility at the time was that we'd end up with a dominant platform where software had to pay a fee and pass a gatekeeper ("DENIED! Competes with Excel!"). Thankfully the massive chill was unfounded, or the objection was so loud that it discouraged that initiative.

Alas, the iPad is real. The faithful are pouring forth to tell us that it’s the end of netbooks. It’s the end of eReaders. It’s the future of computing! While usually it’s the Mac faithful that preach the message, in this case it’s the tech media that is pouring on the unabashed praise with no critical perspective. They’re all afraid of posting something negative, only to be mocked when Apple inevitably succeeds. They point nervously at the Slashdot summary of lore.

As Jobs creepily says during his demonstration, “It’s that easy.” Then again Jobs also told us that it will be the “best browsing experience you’ve ever had”, while showing us the device rendering websites like the NY Times, sans Flash or other accoutrements, much more slowly and less usably than virtually any PC, including higher resolution, vastly more capable $400 netbooks, would do the same.

“Flash is so yesterday! HTML 5 is the future!” you say. I agree with you, at least if you’re talking through a wormhole from about two or three years in the future, and with a vastly more powerful device. JavaScript and the canvas element can almost yield usable Flash facsimiles on a PC many magnitudes more powerful than this device. Even just for video it’s grossly premature, though Apple will be overjoyed if you’re restricted to their little ghetto, paying your toll while thanking them for it.

Alas, such is the pure innovation of the sort that only Apple can bring us.

A Blog Exclusive!

As a reader-of-my-blog exclusive, I want to let you all into some secret iPad specifications I stumbled across.

http://www.tabletpc2.com/Review-HPTC1100.htm

I knew I couldn’t fool you. That’s actually a tablet PC from 6 years ago. It’s a follow-up to tablet and hybrid PCs that have existed since the turn of the century (and of course supermarkets and science centers have had touch screens, including the revered multi-touch, for much longer. Am I the only one who finds that people endlessly pinching and unpinching on the screen look positively ridiculous?).

Of course it was far more expensive than the coming iPad. It weighed more too, and had a much shorter battery life.

Then again, it was probably faster than the iPad. It was completely open and could run hundreds of thousands of very rich applications (applications not gimped to a smartphone). It also had lots of standard expansion ports and capabilities.

The market generally didn’t care for it or its ilk because the only people who really wanted a screen like that are inventory takers at Home Depot. Most of the demonstrations of it were laughable.

Despite all of Bill Gates’ prayers before he went to bed, the format floundered. They're trying once more to make it stick.

Of course that device used a screen technology that required a stylus. Apple is into the capacitive touch screen technology, so maybe that is super new and innovative for a device like this?

No, it isn’t unique. That touchscreen device came out before Apple’s very first iPod (you know the one. It was the “me too” music player that saved Apple from dying at the hands of a failed computer business — though some gimmickry with the iMac kept it on life-support for a while longer — which they’ve since rebirthed by rebadging PC components, amazingly fooling the faithful into believing that these somehow came from the premium bin).

Where is the Innovation?

The iPad isn't innovative. Everything it does has been done many times before. Claiming that its restrictions are a benefit is like saying North Korea has a more refined sense of freedom.

Executing well is not innovation. Apple executes very well indeed, and they put incredible care and attention into their products. That is hugely laudable and worthwhile, but it isn’t innovative.

As for predictions that the iPad will take over the eReader market, that may come to pass, but it ignores precedent.

People don’t read books on LCD screens for the simple reason that people couldn’t accept that as a substitute for print when they wanted print. That led to the creation and adoption of e-Ink, mirroring how actual reflective print works. I have no doubt that a lot of teary-eyed iPad adopters will tell us that it’s the cat’s meow, but we’ve been down this path many times before. Yes, even with IPS screens.

That’s Apple innovation for you. If standards change for your product, how can you fail?

Of course, all of this is for naught. Apple has a precedent of going into markets with products that cost more while doing less, and achieving remarkable success. So this is my final cry before I smile and nod politely as I’m told about how Apple invented IPS display technology, the ARM reference processor, flash memory, and so on. The leader is truly wise and great.

Thursday, January 21 2010

The NAS Gets a New PSU

In March of last year I wrote about replacing the home NAS with a custom-built Linux box.  

Almost a year in and the device has served the purpose well, providing a solid foundation for a connected home. I’ve been very satisfied with the change.

The only downsides of the unit are the higher power consumption (averaging around 38W), and the groan of the two fans inside: the power supply and chipset fans. The audible part isn’t really an issue given that it’s stashed away, but considering that a probable failure point on most new electronics is the fans, it becomes a reliability concern.

I junked a laptop because of an impossible-to-repair broken fan. I’ve lost several video cards for the same reason.

I can even hear the irritating whirring of my blu-ray player’s fan (do not buy the Samsung BDP1600. The thing is complete junk even without factoring in the noisy fan trying to upstage the even noisier optical unit. Speaking of junk, the Sony alpha-200 is another garbage product that made me regret ever turning my back on Canon).

As promised in the original entry, I got around to replacing the power supply with a PicoPSU 90W unit, which was basically a plug and play swap.

In my original entry I estimated a 4-8W power reduction, which turned out to be an underestimation. With this PSU the power consumption dropped a whole 10W, going down to a constant 28W (only slightly spiking under load), making me feel a little less enviro-guilt. There’s still the noisy chipset fan, but that’ll be another project.

The case was built around the expectation of a power supply fan exhausting heat, so some extra natural ventilation was required. With that the sensor readings now hover at low operating levels.

Economically this is a change that will not pay off. From NCIX the new PSU cost me $73.49 all in. Given a savings of 0.01kWh per hour, and a fully loaded electric cost around $0.16/kWh, the 10W reduction saves about $0.0016 per hour, so it would take a little over 5 years of continuous operation to pay for the change.
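For anyone who wants to plug in their own numbers, the payback arithmetic can be sketched in a few lines of Python. The figures are the ones quoted above; the function name and the assumption that the box runs 24 hours a day, every day of the year, are mine.

```python
def payback_years(cost_dollars, watts_saved, dollars_per_kwh, hours_per_day=24.0):
    """Years of operation needed for a power saving to repay an up-front cost.

    Assumes (hypothetically) a constant saving and a flat electricity rate.
    """
    kwh_saved_per_hour = watts_saved / 1000.0          # 10W -> 0.01kWh each hour
    dollars_saved_per_hour = kwh_saved_per_hour * dollars_per_kwh
    hours_needed = cost_dollars / dollars_saved_per_hour
    return hours_needed / (hours_per_day * 365.0)


# Figures from this entry: $73.49 PSU, 10W saved, roughly $0.16/kWh all in.
years = payback_years(cost_dollars=73.49, watts_saved=10.0, dollars_per_kwh=0.16)
print(f"Payback period: {years:.1f} years")
```

With a cheaper electricity rate, or a machine that isn't powered on around the clock, the payback period stretches out even further.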

It would be nice if all power supplies were mandated to be efficient, because right now inefficiency is the standard. Manufacturers don’t bother for most devices because they know efficiency plays zero part in your purchasing criteria; it’s unfortunately one of those areas where legislation is really the only effective solution. Of course environmental choices don't always yield the expected results.

The Dream is Over...Wake Up With New Phone

In July of last year I wrote about choosing a new smartphone to replace the MotoQ that I had been using. While the MotoQ served a good tour of duty, it was seriously showing its age and was falling behind in the empowering mobile revolution.

While I’d been using variants of Windows CE since before the turn of the century, Windows Mobile was obviously lost in the wilderness. Not only was each equipped device essentially abandoned right after being released, the clearest sign that Microsoft lost the plot could be seen in PocketIE, where the preloaded bookmarks to various Microsoft Mobile pages led to 404 errors.

The team moved on to something new and shiny and had no concern at all for the existing base. Microsoft has a very short attention span for products that don't earn them Windows- or Office-type revenue numbers, so it wasn't a surprise.

For various reasons I did not want an iPhone (we don’t need another restrictive and innovation crushing Microsoft scenario playing out, and I want to develop for the device without embracing the whole cult), despite it being the easy choice. I opined in the first entry that Android seemed to have a very bright future ahead, which is a prediction that seems quite obvious now given that it is the platform of so many incredible devices recently released or on the horizon.

The future is so bright for Android that the robots have to wear shades.

The options in Canada were (and remain) limited, so I went with an HTC Dream (G1) given that it had a keyboard and otherwise had largely the same specs as the newer HTC Magic, aside from what seemed like a minor difference in memory capacity.

I have to confess to being disappointed with the device.

Functionally it is amazing, and even with Android 1.5 the platform is simply brilliant. When everything operates correctly I am over the moon with the device.

The problem is that everything didn’t operate correctly. For whatever reason the device seems to be horrendously overloaded, so even with virtually no apps installed and nothing beyond the base system running, most actions are plagued by obnoxious pauses, even on a fresh start-up.

I hate pauses.

I stopped using brilliant apps like Weatherbug because they seemed to make the situation worse.

Alas, my long term plan was always that I would buy one of the newer, faster phones when they came to market, while using the starter device for development purposes until that time. If an unlocked Nexus One or Droid/Milestone worked on Rogers’ wireless band, I’d grab one of those when it was a possibility.

Nonetheless, I was pleasantly surprised recently to find that Rogers was offering all HTC Dream owners a free HTC Magic, with the caveat that your term length pushes out. Given that Dream owners can only possibly be 6 or 7 months into their term, that isn’t that tough a demand. I am on a very reasonable family plan that allows me 5GB / month (of which I seldom use more than 1%), so I feel fairly future-proofed with that foundation and for me it was all win.

So the next day a Magic arrived in the mail and moments later I was up and running with it. With the SIM card removed my existing Dream still works on wifi, where it can browse the web and play media and respond to emails and take pictures, and I can of course put another card in it and continue using it online. I’ll likely install Cyanogen on it now.

Quite pleased about that.

The most shocking thing, though, is that this Magic is much more responsive. It has the same processor as the Dream, so that doesn’t explain the difference. If I had to guess, I’d point to RAM, which on this device comes in at 288MB, compared to the 192MB in the Dream. For comparison both the Droid and the iPhone 3GS feature 256MB of RAM.

The extra headroom over the base OS seems to make all the difference in the world. On the Magic I can see that the free memory is usually less than 90MB, even on a fresh start-up, which means roughly 198MB is in use: more than the Dream’s entire 192MB.

HTC and Rogers claim that they’ll release Android 2.1 for this device in the near future, which makes me especially pleased.

Great move, Rogers. The new HTC Sense update and free month of data is icing on the cupcake.

Firefox 3.6 Released – Web Worker Performance Remains the Same

Back in June I wrote about Web Workers, a fantastic new method to move processing out of the UI thread. To support the entry I posted a variation of the SunSpider benchmark I named Moonbat.

Safari kicked Firefox around in this benchmark. I just tried it with the just released 3.6, and it doesn’t look like much has changed: FF 3.6 does 10 iterations with 4 threads in ~11 seconds, Chrome does it in 2.6 seconds, while Safari leads the pack at 2.3 seconds.

Alas, web worker performance isn’t a critical factor in choosing a browser (my favourite browser remains Firefox), but it would be nice to see it moving in the right direction.

Celebrating My First Home High Speed Overage

Got the cable bill (a bill that pushes into the $250 range per month these days) to find a surprising $11.25 "internet overage fee". Apparently I used 67.5GB last month, while my limit is 60GB. The Steam sales, several purchased HD movies and a couple of on-demand games for the kids on the 360, on top of the normal internet usage, apparently added up to a very atypical, data-heavy month.

I'm not going to cry many tears about it, even though I do think $1.50 a GB is a bit absurd (in an average month I doubt I use 10GB, so now I almost feel obligated to max it out), given that I think usage-based pricing would lead to a far better, more open, more honest system for everyone.
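The fee on the bill checks out against those numbers. As a rough sketch in Python, assuming the overage is billed pro rata per GB above the cap (the bill is consistent with that, but the function and its parameter names are my own illustration, not anything from the provider):

```python
def overage_fee(used_gb, cap_gb, dollars_per_gb):
    """Fee for usage above the monthly cap; nothing is charged under the cap."""
    over_gb = max(0.0, used_gb - cap_gb)
    return over_gb * dollars_per_gb


# Figures from the bill: 67.5GB used against a 60GB cap at $1.50 per GB.
fee = overage_fee(used_gb=67.5, cap_gb=60.0, dollars_per_gb=1.50)
print(f"Overage: ${fee:.2f}")
```

A typical 10GB month under the same cap would incur no fee at all, which is rather the point of pricing by usage.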

Thursday, October 08 2009

.NET/Microsoft detractors got an early Christmas present recently when the London Stock Exchange, under a relatively new CEO, decided to dump their .NET/SQL Server-based trading platform, TradElect, and replace it with the product of a company it is acquiring.

On Slashdot, news of this was submitted and accepted as “London Stock Exchange Rejects .NET for Open Source”, with the statement that “The switch is a pretty savage indictment of the costs of a complex .NET system.” The Digg submitter went with the title “London Stock Exchange dumps Windows for Linux” (which they took directly from the linked article) with the description “Fed up with Windows' failures, one of world's major stock exchanges is joining many others in making the switch to reliable Linux”.

The heavily-linked columnist in both cases is a guy who has been riding this "LSE dumps Windows!" horse for a while now. It has certainly provided him with lots of quality incoming visitors, drawing in those looking for validation, and angry hordes baited by his trolls. Encouraged, he seems to be accelerating the unsubstantiated hyperbole.

Let’s take a moment to go back in history for a bit.

Microsoft made a Really Big Deal about the LSE originally switching to this custom, Accenture-built, SQL Server 2000/.NET-based solution. This was sort of Microsoft’s coming out party, in a way saying “look, we’re big boys too! No more pull-ups for us”.

When the LSE had a very public failure on one of the biggest trading days in history, the detractors were screaming “I told you this would happen!” until their throats were sore, despite the cause of the failure never having been publicly detailed.

Failures have happened on every platform, most commonly as a result of application failure. To automatically assume the worst of a novel solution simply because it is atypical is the thought process of annoying simpletons, anxiously and eagerly hoping to try to pin any fault on anything that doesn’t fit their vanilla perspective of how things are supposed to work.

The software must have failed because Bob went with HP instead of IBM!

The payroll system miscalculated. It must be because they moved it to Linux from Solaris!

So what really happened with the LSE?

Accenture built a very expensive, custom solution for the LSE, purportedly costing somewhere in the neighborhood of $65 million. To operate this custom in-house (albeit designed by Accenture out-of-house) system the LSE built up a considerable technology workforce.

The worldwide recession hits and the LSE takes some financial hits. A swarming mass of competitors in Europe, many running off-the-shelf, superior systems that they’re paying less for, go live.

A new CEO takes over and immediately starts to swing the axe. He makes specific comments in the press about the high IT costs of the organization: both the large number of technology workers in London and the continuing significant payments to Accenture to finesse the TradElect platform.

He undoubtedly observed that all of this custom work hasn’t gained them any unique advantage in the relatively commodity task that they performed. In some ways it’s like writing and maintaining your own in-house operating system – if it doesn’t give you some advantage, and actually puts you behind as everyone else pools resources on a solution, then why would you do that?

So they go on the market for a replacement, eventually deciding to go with the product of a Sri Lankan company. The price is right, and the lure of low-priced Sri Lankan talent is enticing enough that they buy the whole company.

In the end they have switched from an extravagant, custom-developed solution built by a notoriously expensive consulting company, and a workforce of expensive talent in the West, to a basically off-the-shelf solution that has been subsidized to its current state by other organizations, in the process getting some low-paid talent in South Asia.

The new product isn’t open-source, and it runs on a range of non-open-source UNIX platforms. The Oracle database system it uses is the antithesis of open-source.

What about this story has anything to do with open source?

The LSE doesn’t think it has anything to do with open source, or even necessarily Linux.

Where this story gets legs among the zealots is that the LSE plans to deploy the new product on Linux, given that the underlying operating system in many cases has been commoditized. Who wouldn’t?

Zealots cheering on trolling columnists like Steven J. Vaughan-Nichols do the profession harm. Now this nonsense is going to be parroted by people who don’t know better, making them look worse for it, for years to come.

I love Linux. I love open source. And you know, I am even quite fond of .NET and SQL Server. I detest fanatics, fanboys, and hysterical columnists who distort or invent reality to get themselves hits.

Sunday, October 04 2009

I grabbed "Dirt 2" for the xbox 360 recently, looking for an accessible late-night gaming distraction from coding.

The game is a stunning technical achievement, and it is amazing what they squeeze out of the almost half-decade-old hardware of the device.

What makes the game spectacular isn't specific to some mystical art of console gaming, however, but is simply great software design and execution. While many in "mainstream" development (business processes, websites, etc.) consider game development foreign to what they do, it's all just algorithms and code: One person does financial projections and another does particle effects, differing less than many imagine.

The game was so excellent that I decided that I'd try to find out who the talent behind it was, but my quest was thwarted because this game, like many recent releases by large game studios, has an apparently anonymous development team. My search for credits has yielded only a listing of artists responsible for the songs in the game.

It would be great if there was an industry credits site similar to imdb, where you could find out the people responsible for games and applications: I can easily discover who did the foley mixer work on Joe Dirt, but can't discover the team behind Dirt 2 after a lot of digging. Maybe I'll make one.

I did find a "studio tour" video, in which the only person deemed worthy of naming was the "Senior Executive Producer". Maybe if I finish the game I'll discover who did the magic to make this game happen. I'd like to read how these guys operate and do what they do, because they are clearly successful at their craft, and I imagine they'd have interesting things to say.

Are they just cogs in the gears of Codemasters? Crank it and a great game pops out, quality determined only by your Senior Executive Director in charge of North American Marketing?

Are we past the era of superstars like John Carmack? Are we into an era where everyone is a nameless "team player"...unless of course they're in senior management/marketing, in which case their contributions and name will be heralded everywhere?

As a mostly unrelated aside, the "all contributors are equal, but some are more equal than others" policy reminds me of a conversation I once had with a peer, during which they bragged about how their workplace followed a policy that strongly discouraged fancy-pants work titles (e.g. no lead architect, senior developer, etc). My appreciation for that egalitarian workplace dissolved, however, when I learned that the speaker had granted themselves a lofty, important sounding title, as did the other senior members, and they failed to see the hypocrisy in it.

Sidenote: The website for the game is mildly offensive to Canadians. They decided that the landing page would require you to first select a country, with the options being the Netherlands, Belgium, Germany, Spain, the UK, Italy, France, and the USA.

As a Canadian I'm left not knowing which I'm supposed to pick. Maybe I'm supposed to pick the UK to get words with superfluous 'u' still intact. Maybe I'm supposed to pick the US just because of proximity? Two of those countries (the Netherlands and Belgium) are significantly smaller than Canada, so I have to guess it's a hybrid language/proximity thing.

Lots of websites pull this cheap navigation technique and it's lame. Often a US flag really means "English", other times a Union Jack means English. Nationality and language aren't the same thing, so it's a lazy tactic, made especially confusing when both appear together.

Then again, if I recall correctly the old Codemasters site worked by having you select on a world map, where all of North America was labeled "United States of America". Us Canadians get accustomed to it.
