Dennis Forbes on Pragmatic Software Development
Subscribe to RSS
 
Thursday, October 20 2005

One of the big marketing pushes to help hype the release of SQL Server 2000 was a huge onslaught of the benchmarks - before SQL Server 2000 was even available to buy, its results were dominating the TPC results, primarily via clustering. Shortly thereafter, it is purported, Oracle demanded that the TPC separate clustered and non-clustered results. Not long after SQL Server was doing very well in the non-clustered category as well (on very, very, very expensive machines - Big Iron).

SQL Server had joined the big leagues. Any questions about its scalability dissolved.

Remarkably we're on the cusp of the real release of SQL Server 2005 (Nov. 7th I believe), yet there has been barely any noise at all in the TPC results. It has taken more of a lead in the price/performance TPC-C results, and it has pushed a little higher in the pure performance results - though that has more to do with beefier hardware - but all-in-all it has been very sedated in contrast with 2000's release. I wonder if the TPC results simply aren't considered important anymore (probable, giving how old most of the leader results are. 50% of the top 10 are from 2003)

Is the TPC no longer relevant? Does SQL Server 2005 simply offer marginal scalability/performance advantages for the TPC suites?

On the topic of scalability, SQL Server's clustering capabilities could use some improvements. As it is, scaling your database out across two or more servers is most certainly a non-trivial task. It's something you really have to design around (distributed partitioned views don't partition themselves, and it's a leaky abstraction). In an ideal world you could add a new server, install SQL Server and choose "add to the cluster" and it'll automatically propagate some data over and start sharing the load transparently. If it were so easy and elegant Microsoft would see a tonne of license sales as people scaled out.

I'm not an Oracle expert, but I believe that's how their clustering solution has been built.

Of course that sort of clustering is really focusing on the computation end, which really isn't a problem for most scenarios. Instead most are limited by I/O, and we already have methods (via SANs) of tremendously and transparently scaling-out our storage subsystem. Take a look at the full disclosure of the price/performance leader: A single (albeit dual-core) 2.8Ghz processor - a relatively low-end head-end system - backed by a SAN hosting 56 "clustered" hard drives. The TPC-C benchmark is artificial, so this doesn't necessarily mirror the real world, but it is telling. Keep your data efficient through good design and delay the day that you need a 56-disk SAN. 

Reader Comments

Add Comment

Name *:

Email Address:

(your email address is not displayed)
Website:

Comment *:


Dennis Forbes - Dennis Forbes is a Toronto-based software architect and technology writer