Thursday, October 20 2005

It's been around for a while, but a lot of people still haven't experienced it - The Quiet American's One-Minute Vacations site. It's an expanding collection of user-submitted 60-second audio samples from around the world. Absolutely fascinating to listen to, and many of them really do take you there. Take a one-minute break and go on a vacation.

While people often use the term "Audio Blogging" to refer to the spoken word (which, when fed through RSS, becomes podcasting), I see these sorts of audio samples as more analogous - though in the audio realm - to photo blogging. As much as I appreciate the Quiet American, it would be interesting to have a site like Flickr-for-audio-samples, with thousands or millions of samples from around the world. Heck, maybe just the Flickr we know and love, but with the addition of audio. It would be interesting to see photos of an Indian market, coupled with some audio samples, and be able to search and browse by keywords.

Of course one would naturally think "Duh...that's video with audio...That's Google Video". However, video remains too unwieldy, and in the hands of a non-expert it very seldom captures the essence of a scene the way a carefully taken photo does, nor does it facilitate quick and easy consumption.

   
Thursday, October 20 2005

One of the big marketing pushes to help hype the release of SQL Server 2000 was a huge onslaught of benchmarks - before SQL Server 2000 was even available to buy, it was dominating the TPC results, primarily via clustering. Shortly thereafter, it is purported, Oracle demanded that the TPC separate clustered and non-clustered results. Not long after, SQL Server was doing very well in the non-clustered category as well (on very, very, very expensive machines - Big Iron).

SQL Server had joined the big leagues. Any questions about its scalability dissolved.

Remarkably we're on the cusp of the real release of SQL Server 2005 (Nov. 7th I believe), yet there has been barely any noise at all in the TPC results. It has taken more of a lead in the price/performance TPC-C results, and it has pushed a little higher in the pure performance results - though that has more to do with beefier hardware - but all-in-all it has been very subdued in contrast with 2000's release. I wonder if the TPC results simply aren't considered important anymore (probable, given how old most of the leading results are: 50% of the top 10 are from 2003).

Is the TPC no longer relevant? Does SQL Server 2005 simply offer marginal scalability/performance advantages for the TPC suites?

On the topic of scalability, SQL Server's clustering capabilities could use some improvements. As it is, scaling your database out across two or more servers is most certainly a non-trivial task. It's something you really have to design around (distributed partitioned views don't partition themselves, and it's a leaky abstraction). In an ideal world you could add a new server, install SQL Server and choose "add to the cluster" and it'll automatically propagate some data over and start sharing the load transparently. If it were so easy and elegant Microsoft would see a tonne of license sales as people scaled out.

I'm not an Oracle expert, but I believe that's how their clustering solution has been built.

Of course that sort of clustering is really focusing on the computation end, which really isn't a problem for most scenarios. Instead most are limited by I/O, and we already have methods (via SANs) of tremendously and transparently scaling-out our storage subsystem. Take a look at the full disclosure of the price/performance leader: a single (albeit dual-core) 2.8GHz processor - a relatively low-end head-end system - backed by a SAN hosting 56 "clustered" hard drives. The TPC-C benchmark is artificial, so this doesn't necessarily mirror the real world, but it is telling. Keep your data efficient through good design and delay the day that you need a 56-disk SAN.

   
Thursday, October 20 2005

[Note: Some have noted that it should be Daylight Saving Time, without the pluralization of Saving. I, like many, use it more as a general-use title rather than a literal statement - given that it isn't actually saving daylight - and I generally hear it referred to as Daylight Savings Time. Just thought I should mention that.

If one wants to be a pedant, I believe it should actually be Daylight-Saving Time]

The Ontario government caved today, rashly deciding to follow the lead of a ludicrous U.S. energy bill rider, extending Daylight-Saving Time by three weeks in the spring, and a week in the fall (switching into DST on the second Sunday of March, rather than the first Sunday of April as it currently is, and switching back to Standard time on the first Sunday of November rather than the last Sunday in October).
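For anyone who wants to see the two rules as dates, here is a minimal Python sketch, applied to the 2005 calendar purely for illustration. The helper names (`nth_sunday`, `last_sunday`) are invented for this example, and it only finds the calendar days of the switch, not the hour.

```python
from datetime import date, timedelta

SUNDAY = 6  # date.weekday() numbers Monday as 0 and Sunday as 6


def nth_sunday(year, month, n):
    """Return the date of the n-th Sunday (1-based) of the given month."""
    first = date(year, month, 1)
    days_to_sunday = (SUNDAY - first.weekday()) % 7
    return first + timedelta(days=days_to_sunday + 7 * (n - 1))


def last_sunday(year, month):
    """Return the date of the last Sunday of the given month."""
    next_first = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    last_day = next_first - timedelta(days=1)
    return last_day - timedelta(days=(last_day.weekday() - SUNDAY) % 7)


year = 2005  # illustration only; the rules are applied to this calendar year

# Current rule: first Sunday of April to last Sunday of October.
old_start, old_end = nth_sunday(year, 4, 1), last_sunday(year, 10)
# Extended rule: second Sunday of March to first Sunday of November.
new_start, new_end = nth_sunday(year, 3, 2), nth_sunday(year, 11, 1)

print("current rule: ", old_start, "to", old_end)
print("extended rule:", new_start, "to", new_end)
print("extra days in spring:", (old_start - new_start).days)
print("extra days in fall:  ", (new_end - old_end).days)
```

On the 2005 calendar this works out to 21 extra days in the spring and 7 in the fall, matching the three weeks and one week described above.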

Given that many don't entirely understand DST, I thought I'd share a graph I made some time back (I originally planned on turning the source algorithm into a web service to allow one to punch in the inputs such as location and generate their own graph, but could never justify spending the time on it).

[Figure: Toronto Sunrise/Sunset DST]

All values are calculated for Toronto, Ontario, for 2005. The red line represents EST sunrise, the yellow EST solar noon, while the cyan line represents EST sunset. The purple lines represent the 9-5 workday, adjusted in the summer months to account for DST (where 9-5 is really 8-4). The blue lines represent the extensions brought about by this change (three weeks earlier in the spring, one week later in the fall). To recap - only the workday period on the graph above takes DST into account (e.g. the 8pm sunset during the summer is 9pm on the clock during DST, and the 4:20am sunrise is actually 5:20am on the clock).

As much as I dislike the incredibly costly confusion and complexity of DST (my two-and-a-half-year-old still hasn't adjusted), there is a small amount of logic behind it - presuming that human beings don't naturally adapt to the sun coming up earlier during the summer, DST moves an hour of this presumably unused time into the traditional post-work hours, "lengthening" the evening (not really lengthening the evening, but artificially doing so by moving the traditional 1950s work hours earlier in the day).

Many people would argue against this, saying that the summer hours give them an opportunity to jog, garden, go to the gym, and otherwise take advantage of the extended pre-work hours. Nonetheless, DST is geared towards those who do nothing until their pre-work preparation (e.g. the alarm clock goes off an hour and 30 minutes before the work day starts). For those people DST is entirely beneficial.

Extrapolate that logic out, though, and there should be a second layer of DST that moves the clock yet another hour forward during May to September. Maybe an hour more during June. Perhaps we should have a dynamic clock, such that 9am is an hour after sunrise year round.

Humor aside, there is tremendous risk in this DST extension, especially coming into force so quickly. Having worked on a number of daylight-saving-time-related software problems (please use UTC, people, or at the very least disregard DST), I would wager that there will be significant ramifications. Millions of dollars will need to be spent preparing for, and then cleaning up after, what many seem to think is a simple date change.
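To make the UTC plea concrete, here is a small Python sketch of the classic fall-back problem. It assumes fixed offsets of UTC-4 for EDT and UTC-5 for EST and uses October 30, 2005 (the last Sunday in October) as the switch date; the variable names are only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for Toronto: daylight time is UTC-4, standard time is UTC-5.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# When the clocks fall back, 1:30am happens twice on the wall clock.
first_pass = datetime(2005, 10, 30, 1, 30, tzinfo=EDT)
second_pass = datetime(2005, 10, 30, 1, 30, tzinfo=EST)

# Stored as UTC, the two instants are distinct and correctly ordered.
first_utc = first_pass.astimezone(timezone.utc)
second_utc = second_pass.astimezone(timezone.utc)

print(first_utc.isoformat())    # the earlier 1:30am
print(second_utc.isoformat())   # the later 1:30am
print(second_utc - first_utc)   # the real elapsed time between them
```

A log that recorded only "1:30am" local time cannot tell these two events apart, and a change to the switch dates silently moves the ambiguous hour to a different day. Timestamps stored in UTC are unaffected either way; only the conversion for display has to know the rules.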

Anyone interested in the source data that I generated for this can find it here (it's a Microsoft Excel worksheet).

http://www.yafla.com/

   
Friday, October 21 2005

The Seeds of Malcontent

I absolutely despise the acronym "AJAX". Something about it rubs me the wrong way.

Perhaps it's a silly hang-up. Perhaps I'm foolish for not getting with the current lingo.

Nonetheless, I just don't feel comfortable with it. Not only is it impossible for me to say or write it with a straight face, I find it difficult to hear it from others without unconsciously stereotyping the other party as some sort of malleable, misinformed Johnny-come-lately.

It just seems like an uninformed way of uselessly simplifying a complex ecosystem of evolving and varying solutions into a meaningless, trite acronym (don't get me started on "Web 2.0").

Of course AJAX isn't really an acronym for anything anymore - realizing that it was founded on a solid foundation of ignorance, along with an unhealthy lack of historical knowledge, it has become a more generalized term meaning some sort of nebulous "interactive web application" (what we historically called a rich web application, or even the logically descriptive Dynamic HTML or DHTML). Now it's popular for it to just be the non-acronym "Ajax", with a much less restrictive meaning. If it's a cool web application, well that's courtesy of the new-fangled Ajax!

Even ridiculously pedestrian uses of scripted objects are hopping on the AJAX bandwagon these days. "It uses JavaScript...and that's the J....so it's AJAX!"

The History of the Artist Currently Known as AJAX

As a bit of "AJAX" history, way back in 1999 the Microsoft MSXML team - these were the people who made the superlative XML parser - added an oddball little object into the MSXML library: the XMLHTTP COM object. This object - one which originally used its own HTTP transport outside of IE's, leading to all sorts of proxy configuration fun - allowed one to programmatically send parameterized GET and POST requests to HTTP servers, retrieving data back for processing or display. They deserve credit for inventing it, but at the same time it was one of those inevitable solutions (in fact there were already third-party HTTP clients that were safe for scripting, but of course they had very limited deployment).

XMLHTTP, like the rest of the MSXML COM library, was usable and valuable both by native applications, and by Internet Explorer client-side script (because they handily marked it as Safe For Scripting, which is the flag that reveals a COM component to the IE scripting engine). It isn't actually tied to XML in any meaningful way, and the XML misnomer was purely a result of it being created by the MSXML team (it was just a bit of namespace). Microsoft widely distributed the library alongside other products.

Voila, another method of background data loading was released to the world, adding to the already existing and utilized hidden-IFRAME technique. Like many developers in the Microsoft world, I saw the benefits of this relatively small enhancement, and started enabling internal web applications with partial page loading, rendering and post backs (using other parts of MSXML to transform received XML against an XSLT for display - a next generation technique that still has very limited deployment). Indeed, I even used this object in rich, native applications as an easy way of communicating with web services. It was a convenient little object.

All back in the year 2000, half a decade ago. I was hardly a pioneer, and there were thousands of others doing the same thing or more.

Nonetheless, given that it only worked with Internet Explorer, it was a complete non-starter for the public web, at least for the rational.

XMLHttpRequest Goes International

In early 2001, after seeing the value that it brought to the table, the Mozilla team added XMLHttpRequest to Mozilla 0.8. As the years passed, all of the other major-minor browsers incorporated compatible implementations. It was becoming a technology that was usable in the mainstream...but for one small issue: well into 2003, a lot of large corporations stuck with Netscape 4.x as their primary browser, or as their back-up cross-platform browser (e.g. they had Internet Explorer as their primary, with Netscape 4.x as the umbrella if Microsoft suddenly jacked up licensing costs and they needed to switch all of their client operating systems).

This lowest common denominator inhibited public sites, forcing them to stick with the tried and true, so even with the pervasive availability of this functionality across all of the late-model browsers it was still off-limits for widescale use because of a very small minority. Imagine a nation restricted to a 30 km/h speed limit because there are a couple of people with old, beat-up Yugos, while everyone else is pounding their steering wheels in their beefed-up gas-guzzling 8-cylinders. That was the web world.

Google Shakes Up The Web World

In late-2004/early-2005 Google shocked a lot of the blissfully-unaware world by adding XMLHttpRequest-backed functionality to Google Suggest, offering "type-ahead" usability enhancements (it can also use IFRAMEs, which is one of the original background-communication techniques). The moment was right, especially for a non-critical beta service, when the old (e.g. Netscape 4.7) could finally be tossed aside, and we could move ahead with technologies that have long been available. Google was the biggest player leading the way, but of course they weren't alone, and there were already plenty of public implementations of this sort of dynamic content that demanded a modern browser (Outlook Web Access, a component of Microsoft Exchange, being one of the earliest and best known).

Google Suggest was the largest, most visible implementation, and it opened a lot of web naysayers' eyes up to what many of us had been evangelizing for years. Suddenly the unaware could see potential that had been there for years. Google also set a new bar for the amount of server side processing and bandwidth that could be allocated to a single random web visitor (not long ago it would have been considered insane to do that sort of processing and data transfer constantly as users typed keystrokes, for instance, but the availability of cheap computer cycles and copious bandwidth has changed the equation).

Human Nature - The Defensive Compensation Effect

I'm going to segue to a little human-nature story here - I'm a Microsoft-centric user and developer: I chose to professionally focus on their tools and technologies as my specialty early on, and I've done very well with it (though I'll switch to other primary specialties if and when it's in my best interest). Nonetheless I regularly use Linux (primarily through the magic of virtual machines. I'd recommend that everyone does this, as an aside, especially now that VMware has made their Player free. You owe it to yourself to give Linux a try - you will probably be surprised at how rich and capable it really is on the desktop), and I try to keep myself fairly competent with it. I have lots of experience with other OSs as well, though sadly I've yet to try OS X. Anyways, putting all of that operating system defensiveness aside, I've been with Windows for quite some time.

One thing I've noticed over the past several years, as a Windows user, is that quite a few historically anti-MSers are making the switch to (or back-to) Windows on their desktop. Perhaps Microsoft's security initiative is having an impact, or some other extenuating circumstance is making them want to switch, but nonetheless they're "coming back".

Almost invariably they will defend their switch as being justified only by some relatively recent extraordinary revelation in the Windows platform - e.g. "That Windows 2000 was a real piece of crap, and I wouldn't have touched that with a 10-foot pole, but the new XP kernel is pretty good. Windows was crap before, but now it's good". Of course anyone who follows Windows knows that this is utter claptrap - XP is largely 2000 with a facelift and some relatively minor kernel changes. Really 2000 was already quite a well designed, nicely executed, very stable operating system, but they couldn't concede that prior iterations had value if they weren't a part of the game. Ultimately it only has value once they've entered the contest.

You can see this going the alternate direction as well - "Linux was garbage before kernel 2.x.x" - and for virtually any other technology. If someone is late to the game, there's a good chance that they'll try to manufacture some reason to validate making the switch now. Sometimes it's rational, but often it's just complete nonsense. I haven't spent any time with OS X, so instead I'll just wait a bit until the next iteration and declare whatever trivial change they've made to be the pivotal reason that justified my time.

This applies not only to engineers and their technologies, but to business hypesters as well - instead of pimping themselves as yet-another entrant in the competitive web application market, they need to present the idea that they're coming at the perfect time: So much is possible now that wasn't possible yesterday.

I feel that a lot of the AJAX hype follows the same pattern - someone either poo-poohed web applications for years, or perhaps their career just finally brought them into web applications. Instead of feeling that they're a junior in a long established field, they have to invent or embrace some paradigm shift that renders everything that came before irrelevant. Disregard that the "technology stack" that facilitates a modern web application is extraordinarily wide and deep, and just pretend that this one small schism represents everything.

Conclusion

The crux of the matter, in my opinion, is that the acronym AJAX brings absolutely no clarity to the table, and instead introduces nothing but noise. Not only does it poorly define an implementation pattern that has been in use by experts for years, but its street use is so generic as to be detrimental (just look at that use of AJAX for the "AMASS" solution linked near the outset - Now AJAX just means "something done in a web app". How utterly inane). To others it is a buzzword that tries to simplify a multi-faceted technology platform into a nonsensically literal implementation detail, putting an inordinate amount of attention on one very small part of an interactive web application.

There is such a rich array of techniques and technologies available to a web developer to make a first rate platform, that such a simplistic and meaningless bit of language just confuses things.

Furthermore, there is this simpleton tendency for the hypesters and the unaware alike to see, or talk about, revolutions, where instead there has been a continual evolution (they're ignoring the fossil evidence!). One just has to read Tim O'Reilly's latest self-aggrandizement platform - Web 2.0 - to see this sort of distortion in action. Where many in-the-know see a constant evolution as the platform matured - everyone got faster computers, storage got cheaper, bandwidth became less expensive, the lowest common denominator got better, and so on - these people, with their enormous blind spot, instead see some momentous divergence between yesterday and today. Yesterday was old-sk00l 1.0, and today is 2.0. Get it? Yesterday's car is a Model-T 1.0, and today's car is 2006 Honda Accord 2.0, with nothing in between.

Great, now come to my conferences, link my pages, and buy my not-open-source-but-open-source-is-great books!

   
Saturday, October 22 2005

Web 2.0 is, in its typical usage, a completely nebulous term. Yet remarkably its boosters will declare that it's all so clear, you idiot - it's "<INSERT THEIR OWN PARTICULARLY DIVERGENT INTERPRETATION HERE>". I've seen this play out in quite a few online and offline discussions, proving to me that it's an amorphous/eye-of-the-beholder sort of term.

Tim O'Reilly and friends were among the first to popularize it, so his take obviously deserves attention, yet even it lacks any degree of clarity. Really it appears to be nothing more than a freeze-frame of the web's continuing evolution.

Nonetheless, one recurring attribute of the "Web 2.0" religion deserves attention - Folksonomy, which is the loosely-controlled user-base keyword tagging of content (usually contrasted against the taxonomy of a site like Yahoo, where a central group of anointed ones classify content, albeit really they're just rubber stamping the classification provided by the website owner).

For instance I upload a picture to Flickr, keywording it flower and bee. Now people searching for related pictures can browse amongst pictures of bees, or flowers, or bees and flowers, and see my content amongst everyone else's. del.icio.us follows the same model, with users adding links and meta-data keywords that categorize them. Links can then be searched or related by keyword(s).
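The mechanics behind this are tiny. As a rough Python sketch (the `TagIndex` class and the photo names are hypothetical, and this is not how Flickr or del.icio.us are actually built), a tag index is just a map from keyword to the set of items carrying it:

```python
from collections import defaultdict


class TagIndex:
    """A toy folksonomy: free-form tags mapped to the items that carry them."""

    def __init__(self):
        self._items_by_tag = defaultdict(set)

    def add(self, item, tags):
        """Record that a user tagged `item` with each of `tags`."""
        for tag in tags:
            self._items_by_tag[tag.strip().lower()].add(item)

    def search(self, *tags):
        """Return the items carrying every one of the given tags."""
        matches = [self._items_by_tag.get(t.strip().lower(), set()) for t in tags]
        return set.intersection(*matches) if matches else set()


index = TagIndex()
index.add("photo1", ["flower", "bee"])
index.add("photo2", ["flower", "garden"])
index.add("photo3", ["Bee", "hive"])

print(sorted(index.search("bee")))            # every photo tagged bee
print(sorted(index.search("flower", "bee")))  # photos tagged both
```

Note that nothing in it checks whether a tag is accurate, which is where the trouble described below starts.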

Everything old is new again.

In the early days of search engines, the content parsers were really quite dumb - they couldn't read the content of a web site and really figure out what the subject of the page was. As such the META tag was added, allowing website owners to attribute their content with a small, select group of keywords, and those keywords would allow the search engine to choose content appropriate for user searches.

Of course what started as a good idea quickly devolved into a tragedy of the commons: as more and more people got involved, nefarious website operators learned to put unrelated, popular terms in their keywords to earn additional hits, gaming the system for their own advantage.

The sort of things that work on small-scale, edge sites quickly degrade as those sites become larger and more important. Already many of the "social tagging" sites are getting overwhelmed with spam and false hits, and of course errors - most often errors of omission - are extraordinarily common on these tagging sites.
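As a rough illustration of how little an engine had to go on, here is a Python sketch that pulls the declared keywords out of a page using the standard library's `html.parser`. The `MetaKeywordParser` class, the `declared_keywords` helper and both sample pages are made up for this example; no real search engine's parser is being reproduced.

```python
from html.parser import HTMLParser


class MetaKeywordParser(HTMLParser):
    """Collect the comma-separated terms from <meta name="keywords">."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "keywords":
            return
        content = attributes.get("content") or ""
        self.keywords.extend(
            term.strip().lower() for term in content.split(",") if term.strip()
        )


def declared_keywords(html):
    """Return the keywords a page claims for itself, taken entirely on trust."""
    parser = MetaKeywordParser()
    parser.feed(html)
    return parser.keywords


honest = '<head><meta name="keywords" content="Flowers, bees, gardening"></head>'
stuffed = '<head><meta name="keywords" content="flowers, free music, celebrity"></head>'

print(declared_keywords(honest))
print(declared_keywords(stuffed))
```

Both pages parse equally well, and the parser has no way to know that the second list has nothing to do with the page it came from.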

Of course for photos there are no other options - we're not at a point where an automated analyzer can look at a picture and determine what it's about, so tagging is the best we've got. However for some attributes, like location, free-text tagging is terribly unreliable, which is why I look forward to the automated GPS tags talked about previously.

   
Monday, October 24 2005

If you've participated in online forums, you've probably had the NSFW (Not Safe For Work) experience: Someone posts a link to something NSFW without adding appropriate warnings or disclaimers - either as a blatant troll, or without considering varied tastes and situations (e.g. someone sitting in their underwear at home posts a link to a questionable "funny" video that others are blindly opening in their workplace). Often the URL is obscured and indecipherable courtesy of "helpful" services like http://tinyurl.com/

Web 2.0 to the rescue!

Through the magic of collaboration and the power of many eyes making all troll/porn/questionable sites shallow, I propose an innovative new web service that allows users to add URLs, or whole domains, indicating the degrees and types of questionable content, perhaps using the power of folksonomy to tag the URLs. They can add descriptions using a Wiki-style interface, all powered by an AJAX-rich DHTML web application. Think http://del.icio.us/ + http://www.flickr.com + http://www.wikipedia.com + http://www.netnanny.com/.

Via an exposed open-API XML web service, an extension for Firefox can be created that would allow work users to browse more safely, as it automatically validates all URLs through the NSFW service, blocking or warning on potentially questionable content. For those who worry about the privacy ramifications of having all URLs auto-validated, they could manually validate select URLs, or perhaps download the latest "Current Widely Seen NSFW Websites" filter list and use that in their browser extension.
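The offline variant is easy enough to picture. Here is a hedged Python sketch of checking a link against a downloaded filter list; the `is_flagged` function, the list format (a plain set of domains) and the `example.test` addresses are all assumptions for illustration, since no such service exists.

```python
from urllib.parse import urlsplit


def is_flagged(url, flagged_domains):
    """Return True if the URL's host, or any parent domain of it, is on the list."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Check "img.example.test", then "example.test", then "test".
    candidates = (".".join(labels[i:]) for i in range(len(labels)))
    return any(candidate in flagged_domains for candidate in candidates)


# Hypothetical contents of a downloaded "widely seen NSFW websites" list.
flagged = {"example.test"}

print(is_flagged("http://img.example.test/funny.mpg", flagged))
print(is_flagged("http://notexample.test/funny.mpg", flagged))
```

Because the check runs locally, the browser never has to report the URLs it visits, at the cost of only catching whatever was on the list at download time. A URL hidden behind a redirection service would still have to be resolved first.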

Sounds good, doesn't it? Any VCs out there ready to send me some start-up capital? I'm thinking in the 7-figures range.

Of course in actuality I think it would be a disaster. Not only is it the sort of service that most people wouldn't pay a penny for (yet without a large userbase you can't have usable community-driven rankings, and of course you can't have a professional taxonomy classification of the sites without revenue, as that costs real money), there are limited ancillary revenue options (it'd just be a tiny service that people use almost unconsciously - there is no stickiness). Until it gets a large userbase - which it is doubtful that it ever would as a user-contributed service - it would suffer from significant false-negatives through exclusion as well, not to mention that it could be easily gamed to cause false-positives.

A classic chicken-egg problem.

On top of that, varying scales of puritanism, as well as trolling with the site itself, would lead to extraordinary pollution of the database. You could try some sort of web-of-trust with relationships and personal networks, for instance only trusting rankings coming from those you trust (or who people you trust trust, and so on) but again that would vastly limit the scope of applicable NSFW rankings, rendering it close to useless for all but the most common of links.
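A minimal Python sketch of that web-of-trust idea, with made-up users and URLs: walk the trust graph outward a limited number of hops, then keep only the rankings that come from inside that circle. The function names and data shapes are assumptions for illustration.

```python
from collections import deque


def trusted_users(trust, me, max_hops):
    """Breadth-first walk of the trust graph, at most `max_hops` from `me`."""
    seen = {me}
    queue = deque([(me, 0)])
    while queue:
        user, hops = queue.popleft()
        if hops == max_hops:
            continue
        for other in trust.get(user, ()):
            if other not in seen:
                seen.add(other)
                queue.append((other, hops + 1))
    seen.discard(me)
    return seen


def visible_ratings(ratings, trust, me, max_hops):
    """Keep only the labels submitted by users inside my trust circle."""
    circle = trusted_users(trust, me, max_hops)
    filtered = {}
    for url, votes in ratings.items():
        labels = [label for user, label in votes if user in circle]
        if labels:
            filtered[url] = labels
    return filtered


trust = {"me": ["alice"], "alice": ["bob"], "bob": ["carol"]}
ratings = {
    "http://example.test/video": [("bob", "nsfw"), ("troll", "safe")],
    "http://example.test/other": [("carol", "nsfw")],
}

print(visible_ratings(ratings, trust, "me", max_hops=2))
```

With two hops the troll's "safe" vote is ignored, but so is carol's perfectly good warning about the second link, which is exactly the scope problem: the tighter the circle, the fewer links it can say anything about.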

[I should also note that people were tossing around ideas for collaborative, community-driven rankings of websites over a decade ago. The idea of community-driven content is hardly new. What is new is that people think they can build a revenue model on it, largely on the back of Google's innovative and prolific AdSense]

   
Monday, October 24 2005

[UPDATE 2005-10-26 : One of the web tools is available at http://www.yafla.com/yaflaColor/ColorRGBHSL.aspx]

I have a couple of fun little web services that I worked on over the weekend that I'm going to stick on here in the coming days. One is a simple utility that allows you to punch in Hue/Saturation/Lightness (e.g. 327 degrees, 75% saturation, 55% lightness), or alternately an RGB combo (e.g. #8C235C), and it gives the alternative encoding (e.g. from RGB to HSL, or vice versa). Additionally, and usefully, it provides a listing of saturation/lightness variations so you can easily get correlated shades and saturations for color themes. This is really scratching my own itch, as I constantly find myself picking a base color, and then manually, and imprecisely, varying the RGB components to form a theme (e.g. a slightly lighter variation for a side element, a darker one to go behind another element, and so on).

HSL is a vastly superior way of dealing with colors, and is extraordinarily more intuitive than the 24-bit color-space RGB (e.g. #FF0000). Eventually HSL will be widely available via CSS 3, which means that you should be able to use it natively by around 2009 (if only I were joking...). HSL breaks color into its hue, which is from 0 to 360 degrees, saturation from 0 - 100%, and lightness from 0 - 100%. Understanding the relationship between RGB and HSL was eye opening to me, and I discovered that I had significant misperceptions about how they correlated.
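For anyone who wants to play with the relationship, Python's standard `colorsys` module already does the math. This is a small sketch, not the code behind the tool; the helper names `hex_to_hsl` and `hsl_to_hex` are invented here, and note that `colorsys` orders the components as H, L, S rather than H, S, L.

```python
import colorsys


def hex_to_hsl(hex_color):
    """Convert '#RRGGBB' to (hue in degrees, saturation %, lightness %)."""
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) / 255 for i in (0, 2, 4))
    h, l, s = colorsys.rgb_to_hls(r, g, b)  # colorsys returns H, L, S
    return round(h * 360), round(s * 100), round(l * 100)


def hsl_to_hex(hue, saturation, lightness):
    """Convert hue in degrees plus saturation/lightness percentages to '#RRGGBB'."""
    r, g, b = colorsys.hls_to_rgb(hue / 360, lightness / 100, saturation / 100)
    return "#{:02X}{:02X}{:02X}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


print(hex_to_hsl("#FF0000"))    # (0, 100, 50): pure red
print(hex_to_hsl("#8C235C"))    # (327, 60, 34)
print(hsl_to_hex(327, 75, 55))  # #E23695
```

The last two lines share a hue of 327 degrees but differ in saturation and lightness, which is the sort of relationship that is nearly invisible when staring at the hex values alone.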

To add to the usefulness, it also gives complementary and triad colors, as well as complete color sets for your theming fun based upon common color theory. Each of the colors is linked to be the new base color, so you can browse around to find the perfect collection. Fun little tool.
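In HSL terms these relationships are just arithmetic on the hue wheel, which is much of the appeal. A minimal Python sketch, assuming the usual color-theory definitions (complement at 180 degrees, triad at 120-degree spacing); the function names and the particular lightness steps are arbitrary choices for this example, not the tool's.

```python
def rotate_hue(hue, degrees):
    """Move around the 360-degree hue wheel, wrapping past 360."""
    return (hue + degrees) % 360


def complementary(hue):
    """The hue directly opposite on the wheel."""
    return rotate_hue(hue, 180)


def triad(hue):
    """Three hues evenly spaced 120 degrees apart, starting at the base hue."""
    return [rotate_hue(hue, 0), rotate_hue(hue, 120), rotate_hue(hue, 240)]


def lightness_variations(hue, saturation, steps=(25, 40, 55, 70, 85)):
    """Same hue and saturation at several lightness levels (HSL tuples)."""
    return [(hue, saturation, lightness) for lightness in steps]


base_hue = 327
print(complementary(base_hue))             # 147
print(triad(base_hue))                     # [327, 87, 207]
print(lightness_variations(base_hue, 75))  # lighter and darker shades of the base
```

Doing the same thing by nudging R, G and B by hand is the imprecise guesswork that HSL makes unnecessary.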

One other micro-application that I'm releasing, as a downloadable Win32 command-line EXE, is a tool that gives a weighted "suggested photo environment". It analyzes an image, using an outer weighting that you select, and recommends a core environment HSL for it to be contained within. You can then use that to create a color theme, and make gorgeous, aesthetically pleasing pages, as Flickr often does.

This tool alternately allows you to monotone an image to a given hue if that's your desire. e.g. to make a photo fit within an existing color theme. This will be a very easy, lightweight, free method of doing what you can do in most good imaging apps.
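One plausible way to monotone, sketched in Python on a plain list of RGB pixels: keep each pixel's lightness and replace its hue and saturation with fixed values. This is an assumption about the approach rather than a description of the actual EXE, and the `monotone` function and its parameters are invented for this example.

```python
import colorsys


def monotone(pixels, hue, saturation):
    """Recolor (r, g, b) pixels (0-255) to one hue, preserving each pixel's lightness.

    `hue` is in degrees; `saturation` is a fraction from 0.0 to 1.0.
    """
    result = []
    for r, g, b in pixels:
        # Only the lightness of the original pixel is kept.
        _, lightness, _ = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
        nr, ng, nb = colorsys.hls_to_rgb(hue / 360, lightness, saturation)
        result.append((round(nr * 255), round(ng * 255), round(nb * 255)))
    return result


pixels = [(0, 0, 0), (140, 35, 92), (255, 255, 255)]
print(monotone(pixels, hue=200, saturation=0.6))
```

Black and white come through untouched because they carry no color at either extreme of lightness, while everything in between takes on the target hue.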

Anyways, just a "weekend update". This work was all for some R&D of something much different.

   


About the Author
Dennis Forbes is a Toronto-based software architect. While focused primarily on the .NET and SQL Server worlds, Dennis frequently ventures outside of this comfort zone into game development and image processing. He has been published in several industry magazines, has been quoted in the Wall Street Journal and has been interviewed by NPR.

He is a vice president and lead software architect at an innovative New York City hedge fund back-office services firm.

Dennis has been working on solutions for the financial, telecommunications, and power generation markets for over 15 years.





 