Dennis Forbes on Software and Technology


About the Author
Dennis Forbes is a Toronto-based software architect. While focused primarily on the .NET and SQL Server worlds, Dennis frequently ventures outside of this comfort zone into game development and image processing. He has been published in several industry magazines, has been quoted in the Wall Street Journal and has been interviewed by NPR.

He is a vice president and lead software architect at an innovative New York City hedge fund back-office services firm.

Dennis has been working on solutions for the financial, telecommunications, and power generation markets for over 13 years.





 
Thursday, October 20 2005

One of the big marketing pushes to hype the release of SQL Server 2000 was a huge onslaught of benchmarks - before SQL Server 2000 was even available to buy, its results were dominating the TPC results, primarily via clustering. Shortly thereafter, it is purported, Oracle demanded that the TPC separate clustered and non-clustered results. Not long after, SQL Server was doing very well in the non-clustered category as well (on very, very, very expensive machines - Big Iron).

SQL Server had joined the big leagues. Any questions about its scalability dissolved.

Remarkably, we're on the cusp of the real release of SQL Server 2005 (Nov. 7th, I believe), yet there has been barely any noise at all in the TPC results. It has taken more of a lead in the price/performance TPC-C results, and it has pushed a little higher in the pure performance results - though that has more to do with beefier hardware - but all in all it has been very subdued in contrast with 2000's release. I wonder if the TPC results simply aren't considered important anymore (probable, given how old most of the leading results are: 50% of the top 10 are from 2003).

Is the TPC no longer relevant? Does SQL Server 2005 simply offer marginal scalability/performance advantages for the TPC suites?

On the topic of scalability, SQL Server's clustering capabilities could use some improvements. As it is, scaling your database out across two or more servers is most certainly a non-trivial task. It's something you really have to design around (distributed partitioned views don't partition themselves, and it's a leaky abstraction). In an ideal world you could add a new server, install SQL Server, and choose "add to the cluster", and it would automatically propagate some data over and start sharing the load transparently. If it were that easy and elegant, Microsoft would see a tonne of license sales as people scaled out.

I'm not an Oracle expert, but I believe that's how their clustering solution has been built.
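To make the "don't partition themselves" point a little more concrete, here is a rough Python sketch of the decision a range-partitioned design pushes onto you: somebody has to pick the key ranges and decide which server owns each one (in SQL Server, that is what the CHECK constraints on the member tables express). The server names, the `customer_id` key, and the range boundaries are all made up for illustration; this is not how SQL Server implements it internally.

```python
from bisect import bisect_right

# Hypothetical member servers, each paired with the exclusive upper bound
# of the customer_id range it holds. Names and ranges are invented.
PARTITIONS = [
    (100_000, "server_a"),  # customer_id 0 .. 99,999
    (200_000, "server_b"),  # customer_id 100,000 .. 199,999
    (300_000, "server_c"),  # customer_id 200,000 .. 299,999
]

# Sorted list of upper bounds, used for the binary search below.
_BOUNDS = [upper for upper, _ in PARTITIONS]


def route(customer_id: int) -> str:
    """Return the name of the server whose range contains customer_id."""
    if customer_id < 0:
        raise ValueError("customer_id must be non-negative")
    # Index of the first partition whose upper bound is greater than the key.
    index = bisect_right(_BOUNDS, customer_id)
    if index == len(PARTITIONS):
        raise KeyError(f"no partition covers customer_id {customer_id}")
    return PARTITIONS[index][1]


if __name__ == "__main__":
    for key in (42, 150_000, 299_999):
        print(key, "->", route(key))
```

Notice what happens when a fourth server shows up: nothing, until someone edits the range table and physically moves rows to match. That manual step is exactly the part that an "add to the cluster" button would have to automate.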

Of course, that sort of add-a-server clustering really focuses on the computation end, which isn't a problem for most scenarios. Instead, most are limited by I/O, and we already have methods (via SANs) of tremendously and transparently scaling out our storage subsystem. Take a look at the full disclosure of the price/performance leader: a single (albeit dual-core) 2.8 GHz processor - a relatively low-end head-end system - backed by a SAN hosting 56 "clustered" hard drives. The TPC-C benchmark is artificial, so this doesn't necessarily mirror the real world, but it is telling. Keep your data efficient through good design and delay the day that you need a 56-disk SAN.

Dennis Forbes