Monday, October 17 2005

For those who don't remember this long-defunct product, or never had a chance to experience it, PointCast was a multi-media push-technology news feed that took off back in 1996. Allowing the user to select a variety of channels from a number of sources (albeit all aggregated through PointCast central), along with stock tickers and customized weather, PointCast took the stage whenever the screensaver kicked in. It turned idle PCs across the land into customized news terminals, earning revenue for its corporate masters by displaying time-spliced advertisements amongst the news.

PointCast's rich graphics, generous content, and clean aesthetics made it a winner. Corporations were clamouring for PointCast caching servers to offset the thousands of workstations all polling for updates and overwhelming their networks. Its success led many to proclaim that push technology was where it was at. Microsoft and Netscape immediately engaged in a war of push, both integrating their own technologies: Microsoft created CDF, with Active Desktop as its canvas, while Netscape created a conceptual relative of RSS...called RSS. Both stagnated when the push fervour died down, though of course the modern RSS rose from the ashes several years later.

At the height of it all, in early 1997, PointCast was offered a staggering $450 million buyout. Feeling that they could do better, they held out. Not long after, they were sold for less than $10 million. This was a mini-.COM bubble popping, and should have served as a foreboding warning of the technology market implosion of the early 00s. Imagine how regretful the group who decided against the $450 million offer must have felt (and probably still feel).

I still look back fondly on PointCast. It, along with You Don't Know Jack - The Net Show, seemed to promise such a remarkable new internet world of rich content. And they managed to pull it off when most of us were lucky to have a 2 KB/s connection (I now get 600 KB/s).

It is amazing how much we have achieved technically, with both PCs and connections hundreds of times faster, yet rich content has in many ways wallowed.

* - PointCast wasn't really push. Indeed, neither is client-side RSS. Instead they're both polled/scheduled pulls. Contrast this with SMTP, which actually is push: When ServerA has something for ServerB, it actively connects to and "pushes" the message. Pedantic point for sure, but I thought it worth making.

   
Monday, October 17 2005

I try to avoid acting as a link-propagator, but I think Nicholas has penned a very intriguing entry. Definitely worth a read. Note that he didn't say immoral, but rather he said amoral.

Skip the existential stuff at the beginning if it doesn't interest you and jump right down to The Cult of the Amateur.

   
Monday, October 17 2005

[NOTE: For fun and giggles I updated this "tool", creating a "story" home for it, which you can find here: yaflaGUID - Create Sequential GUIDs in SQL Server]

As discussed in the entry on using GUIDs in your database, GUIDs in SQL Server 2000 are, at least from the user's perspective, "random". This can lead to fragmentation and page splits in your data, and it's a common reason to avoid GUIDs in the first place.


Of course, like most problems, there are a number of possible solutions. SQL Server 2005 offers one in the form of NEWSEQUENTIALID() (though it can only be used as a column default on a table, among other limitations).
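For reference, using it looks something like the following. This is only a minimal sketch for SQL Server 2005; the table, column, and constraint names (dbo.Orders and friends) are made up for illustration.

```sql
-- SQL Server 2005 or later. All object names here are hypothetical.
CREATE TABLE dbo.Orders
(
    OrderId  uniqueidentifier NOT NULL
        CONSTRAINT DF_Orders_OrderId DEFAULT NEWSEQUENTIALID()
        CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED,
    PlacedAt datetime NOT NULL
        CONSTRAINT DF_Orders_PlacedAt DEFAULT GETDATE()
)

-- The GUID can only come from the column default, so let the defaults fill the row.
INSERT INTO dbo.Orders DEFAULT VALUES
INSERT INTO dbo.Orders DEFAULT VALUES

-- A bare SELECT NEWSEQUENTIALID() is rejected, so read the values back from the table.
SELECT OrderId, PlacedAt FROM dbo.Orders ORDER BY OrderId
```

Because the function is only valid inside a DEFAULT definition, you cannot fetch a value into a variable first and insert it afterwards, which is one of the limitations mentioned above.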

Coincidentally I happened to be mucking around in the disassembly of rpcrt4.dll today, trying to once and for all nail down the current algorithm used for UuidCreate (which is used behind the scenes for CoCreateGuid, which itself is used by NEWID() and System.Guid.NewGuid()), when I noticed UuidCreateSequential in the exports. I'd never noticed this function before, and the docs verified that indeed it does create GUIDs the old sk00l way, starting with the unique MAC+time foundation, and then sequentially incrementing on each generation.

"This is like 3-lines and a minute or two to create an extended stored procedure!" think I, even though I infrequently use or advocate the use of GUIDs. Before I did that, though, I thought I'd look around to see what exists, and sure enough someone solved this problem before.

Nonetheless, for such a trivial component, especially for something that can adversely affect the stability of SQL Server, I'm prone to not trusting binaries from micro-outfits on random pages on the web. I looked for the source, and for whatever reason the source to XPGUID isn't released. I cannot overstate how ridiculously trivial this is (even adding some padding functions to make it seem more substantial). In essence it is two credible lines of code over and above the VS.NET 2003 Wizard created extended stored procedure project.

As such, I've made this available for download, source-code and all, at http://www.yafla.com/downloads/yaflaSQLGUID.zip. In it you'll find the source and a compiled Release binary, yaflaGUID.dll. You can place this (or a new build that you made yourself) in your SQL Server \binn directory and run the following command from the master database:

EXEC sp_addextendedproc 'xp_yaflaGUID2005', 'yaflaGUID.DLL'

(of course you can remove it with sp_dropextendedproc)
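Spelled out, checking on the registration and later removing it might look like the sketch below. It uses only the procedure and DLL names from above, and assumes you are connected as a sysadmin.

```sql
USE master

-- Confirm the registration: lists the procedure and the DLL behind it.
EXEC sp_helpextendedproc 'xp_yaflaGUID2005'

-- Unregister the extended stored procedure.
EXEC sp_dropextendedproc 'xp_yaflaGUID2005'

-- Unload the DLL from the SQL Server process so the file can be replaced.
DBCC yaflaGUID (FREE)
```

The DBCC step matters if you rebuild the DLL: dropping the procedure does not by itself release the file, so without it you may need to restart SQL Server before you can overwrite it.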

If you want, wrap it in a User-Defined Function for some inline scalar goodness.

CREATE FUNCTION dbo.SNEWID()
RETURNS uniqueidentifier AS 
BEGIN
  DECLARE @uuid uniqueidentifier
  EXEC master..xp_yaflaGUID2005 @uuid OUTPUT
  RETURN (@uuid)
END

Voila, the old style of quasi-sequential GUIDs, with far fewer page splits (the value still does jump around, but for any closely time-related sequence of GUIDs it is sequential). Theoretically the generation of the GUID should, on average, be faster, given that many are just sequentially created. In practice the extra indirection of the XP makes it slightly slower than NEWID() from a pure execution time perspective, but you should easily make that up in the DML calls.
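To put the function to work, one option is to hang it off a column default. The following is a rough sketch, assuming dbo.SNEWID() was created in the current database as shown above; dbo.Widgets and its columns are hypothetical names.

```sql
-- Hypothetical table that uses the UDF as the default for its clustered key.
CREATE TABLE dbo.Widgets
(
    WidgetId uniqueidentifier NOT NULL
        CONSTRAINT DF_Widgets_WidgetId DEFAULT (dbo.SNEWID())
        CONSTRAINT PK_Widgets PRIMARY KEY CLUSTERED,
    Name     varchar(50) NOT NULL
)

-- WidgetId is omitted, so each row gets its key from dbo.SNEWID().
INSERT INTO dbo.Widgets (Name) VALUES ('first')
INSERT INTO dbo.Widgets (Name) VALUES ('second')

-- The function can also be called inline, like any other scalar UDF.
INSERT INTO dbo.Widgets (WidgetId, Name) VALUES (dbo.SNEWID(), 'third')

SELECT WidgetId, Name FROM dbo.Widgets
```

Rows inserted close together should get keys that land near each other in the clustered index, which is where the savings in page splits come from.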

   
Tuesday, October 18 2005

OPTGROUP is a grouping element that can be used in SELECT elements. For instance instead of...

Which Web browser do you use most often?

...you might see...

Which Web browser do you use most often?

(That sample courtesy of the OPTGROUP page above, which is why the browser list is obsolete. I'm too lazy to update it)

Whether you see those or not depends on your browser (some ancient ones dislike form elements that aren't in a form), and the visual styling of the group headers will differ.

It's a pretty useful little addition to HTML, and has been around for quite some time. Remarkable, then, that it has seen negligible use in the real world. This is despite it being eminently useful, which is clear from the fact that lots of sites are doing exactly this sort of thing manually: adding custom hacked-in groupings for things like states and provinces by country, where the group headers look selectable but aren't really (a selection will be script-overridden or refused by the form handler, which is confusing from a UI perspective).

Of course my point here wasn't to evangelize OPTGROUP, though I do think that it's underused. The profound thing to me, brought to mind by the power of this trivial and underused element, is how marginal the advances in web technology (specifically HTML) have been over the past several years. Everyone is busy doing grossly redundant scripting and hackery for the most mundane of things, trying to use the coarse paintbrush available to build a subpar pseudo-fat client. Maybe they're spending weeks trying to replace their table layout with DIVs to satisfy the XHTML pedants.

How many lame derivatives of combo-boxes have been hacked out? How many intolerable scripted spell checkers? How many hacked out date/time selectors? Every one of them a mystery-meat to unwilling web victims.

Doesn't it seem reasonable that these things could be a part of the basic toolset of HTML by now, allowing rich clients to use their power to present it in the most intelligent way possible?

For that matter, why haven't we made the leap to "databound" controls? Would it really be that difficult to standardize some of the common "AJAX" type functionality into declarative HTML 5.0? Not only is AJAX far from new, but the fundamentals behind it have been understood since the nascent days of the web. This is painfully obvious stuff.

Is it really that difficult?

Of course, it absolutely would be that difficult. Once a standard is entrenched, it becomes more and more difficult to change through consensus.

History is rife with examples where someone hacked something together and unleashed it on the world, and it was good, and it was revolutionary. The rate of innovation quickly slows, though, as more and more people get involved, and their vested interests and aversion to change take hold, to the point that the most ridiculously trivial and profoundly necessary of changes take years to see the light of day. RSS feeds are just barely gaining traction, yet already many evangelists have their feet firmly stuck in the concrete, unwilling to even consider trivial changes. Consider, for instance, alternatives to the absurd RSS/XML icon (who knew that people could get so defensive over a 36x14 icon?). RSS will eventually rust, until some new disruptive standard comes along and eats its lunch, and then it will repeat. It is the way these things tend to go.

Oh well, I hear that they're developing Duke Nukem Forever using XForms.

   
Wednesday, October 19 2005

Just a brief entry today as time is short.

I got a lot of great feedback for yesterday's entry - OPTGROUP and the Pace of Standards. It was a mixed set of responses that were both educational and entertaining. I love the feedback, so if you have a comment please feel free to drop me a line (I might post an entry about your comments, though I won't quote you without your permission. Note that I do monitor referrals, so if you make your comment on a blog with a link, it will be noted just like a trackback).

It is remarkable how quagmired the foundation of the web, HTML, has become. We live in the most dynamic software world that has ever existed, yet much of the infrastructure is the same artefacts from the 90s, duct-taped together in precarious (and grotesquely redundant) ways.

   
Thursday, October 20 2005

It's been around for a while, but a lot of people still haven't experienced it - The Quiet American's One-Minute Vacations Site. It's an expanding collection of user submitted 60-second audio samples from around the world. Absolutely fascinating to listen to, and many of them really do take you there. Take a minute break and go on a vacation.

While people often use the term "Audio Blogging" to refer to the spoken word (which, when fed through RSS, becomes podcasting), I see these sorts of audio samples as more analogous - though in the audio realm - to photo blogging. As much as I appreciate the Quiet American, it would be interesting to have a site like Flickr-for-audio-samples, with thousands or millions of samples from around the world. Heck, maybe just the Flickr we know and love, but with the addition of audio. It would be interesting to see photos of an Indian market, coupled with some audio samples, and be able to search and browse by keywords.

Naturally one would think "Duh...that's video with audio...That's Google Video". However, video remains too unwieldy, and in the hands of a less-than-expert it very seldom captures the essence of a scene like a carefully taken photo does, nor does it facilitate quick and easy consumption.

   
Thursday, October 20 2005

One of the big marketing pushes to help hype the release of SQL Server 2000 was a huge onslaught of benchmarks - before SQL Server 2000 was even available to buy, its results were dominating the TPC results, primarily via clustering. Shortly thereafter, it is purported, Oracle demanded that the TPC separate clustered and non-clustered results. Not long after, SQL Server was doing very well in the non-clustered category as well (on very, very, very expensive machines - Big Iron).

SQL Server had joined the big leagues. Any questions about its scalability dissolved.

Remarkably, we're on the cusp of the real release of SQL Server 2005 (Nov. 7th I believe), yet there has been barely any noise at all in the TPC results. It has taken more of a lead in the price/performance TPC-C results, and it has pushed a little higher in the pure performance results - though that has more to do with beefier hardware - but all in all it has been very subdued in contrast with 2000's release. I wonder if the TPC results simply aren't considered important anymore (probable, given how old most of the leader results are: 50% of the top 10 are from 2003).

Is the TPC no longer relevant? Does SQL Server 2005 simply offer marginal scalability/performance advantages for the TPC suites?

On the topic of scalability, SQL Server's clustering capabilities could use some improvements. As it is, scaling your database out across two or more servers is most certainly a non-trivial task. It's something you really have to design around (distributed partitioned views don't partition themselves, and it's a leaky abstraction). In an ideal world you could add a new server, install SQL Server and choose "add to the cluster" and it'll automatically propagate some data over and start sharing the load transparently. If it were so easy and elegant Microsoft would see a tonne of license sales as people scaled out.

I'm not an Oracle expert, but I believe that's how their clustering solution has been built.
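To make the "design around it" point about distributed partitioned views concrete, here is a rough sketch of what the manual partitioning involves. Everything named here is hypothetical (the servers Server1 and Server2, the SalesDb database, the Customers tables, and the ID ranges), and it assumes linked servers have already been configured between the two machines.

```sql
-- Run on Server1: member table holding the low range of keys.
CREATE TABLE dbo.Customers_1
(
    CustomerId int NOT NULL PRIMARY KEY
        CHECK (CustomerId BETWEEN 1 AND 49999),
    Name       varchar(100) NOT NULL
)
GO

-- Run on Server2: member table holding the high range of keys.
CREATE TABLE dbo.Customers_2
(
    CustomerId int NOT NULL PRIMARY KEY
        CHECK (CustomerId BETWEEN 50000 AND 99999),
    Name       varchar(100) NOT NULL
)
GO

-- Run on Server1: the view that stitches the member tables together.
-- (A mirror-image view would be created on Server2.)
CREATE VIEW dbo.Customers
AS
    SELECT CustomerId, Name FROM SalesDb.dbo.Customers_1
    UNION ALL
    SELECT CustomerId, Name FROM Server2.SalesDb.dbo.Customers_2
GO
```

The CHECK constraints are what let the optimizer route a query to the right server, and they are entirely hand-picked: adding a third server means choosing new ranges, moving rows, and redefining the view on every member, which is exactly the leaky part.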

Of course that sort of clustering is really focusing on the computation end, which really isn't a problem for most scenarios. Instead most are limited by I/O, and we already have methods (via SANs) of tremendously and transparently scaling out our storage subsystem. Take a look at the full disclosure of the price/performance leader: a single (albeit dual-core) 2.8 GHz processor - a relatively low-end head-end system - backed by a SAN hosting 56 "clustered" hard drives. The TPC-C benchmark is artificial, so this doesn't necessarily mirror the real world, but it is telling. Keep your data efficient through good design and delay the day that you need a 56-disk SAN.

   


About the Author
Dennis Forbes is a Toronto-based software architect. While focused primarily on the .NET and SQL Server worlds, Dennis frequently ventures outside of this comfort zone into game development and image processing. He has been published in several industry magazines, has been quoted in the Wall Street Journal and has been interviewed by NPR.

He is a vice president and lead software architect at an innovative New York City hedge fund back-office services firm.

Dennis has been working on solutions for the financial, telecommunications, and power generation markets for over 15 years.





 