Through the amazing magic of mobile computing and wireless networking, I'm finishing up some work while my daughter watches a (commercial free, courtesy of Treehouse TV here in Canada) episode of "Timothy Goes To School" beside me.
Today's episode features the "Japanese" character and her mother (both portrayed as cats, of course) doing a show-and-tell demonstration of a Japanese Tea Ceremony. While I wouldn't normally post an entry about an episode of a pre-schooler's television show, this provides a convenient segue to my interest in Japanese culture (from a philosophical and social-structure perspective - I don't own a single anime and I've never watched Robotech). The tea ceremony is one of the elements of Japanese culture that seems so...philosophically wise: Taking a moment to actually pay attention to, and appreciate, the most minute of details seems so enlightened, and is so contrary to most of Western culture.
To continue this long and drawn out transition to a post about the software business - About a year ago I decided to learn Japanese (with an eventual goal of an extended visit to Japan). When you have no immersion this can be extremely difficult, and in the limited time I've been able to allocate towards this fringe goal, I've learned the glyphs and pronunciation of the Hiragana and Katakana syllabaries. Given that real world Japanese intermixes a massive set of Kanji in with that, not to mention that there are crazy things like words and grammar used to communicate, it really means that I have no functional Japanese ability beyond transcribing Japanese sounds. Nonetheless, it's been very enjoyable, and has indirectly taught me a tremendous amount about human language - it's akin to learning a new programming language, and circuitously gaining insight about the languages that you already know.
So what does any of this have to do with anything? One night some time back I was weighing how much time I really wanted to dedicate towards this fringe goal, and what, if any, software tools existed to help the process. My search brought me to Declan Software's ReadWrite series. Judging from my own experience, which I think would be similar to others, Declan is an excellent case study for small software publishers to learn from.
Not only did they allow me to demo their software to an extent that made me feel confident in a purchase - while still leaving me hungry for more - they also let me satisfy my impulse immediately (I got a full unlock right away - no waiting for snail-mail. Obviously this sort of impulse buy would only happen for relatively inexpensive software, but if it hadn't been as immediate I would likely have put off the purchase, during which time I probably would've lost interest, or found a different product). Any worries I might have had about handing over credit card information to a small vendor dissolved when I saw that their payment processing was handled by RegNow (whom I'd worked with before).
To really put the icing on the cake, they also offer a small discount for buying a set of related products together (Kanji, Hiragana, and Katakana).
All fears were eliminated, they satisfied my impulse, and they maximized their revenue by selling me additional products I didn't originally intend to buy. All from a random search engine hit.
(For those considering or studying Japanese, I highly recommend the superb JWPce as a great little accessory tool. With the dictionary, including English to Japanese, it is remarkably useful)
[UPDATE 2005-10-26 : One of the web tools is available at http://www.yafla.com/yaflaColor/ColorRGBHSL.aspx]
I have a couple of fun little web services that I worked on over the weekend that I'm going to stick on here in the coming days. One is a simple utility that allows you to punch in Hue/Saturation/Lightness (e.g. 327 degrees, 75% saturation, 55% lightness), or alternately an RGB combo (e.g. #8C235C), and it gives the alternative encoding (e.g. from RGB to HSL, or vice versa). Additionally, and usefully, it provides a listing of saturation/lightness variations so you can easily get correlated shades and saturations for color themes. This is really scratching my own itch, as I constantly find myself picking a base color, and then manually, and imprecisely, varying the RGB components to form a theme (e.g. a slightly lighter variation for a side element, a darker one to go behind another element, and so on).
HSL is a vastly superior way of dealing with colors, and is extraordinarily more intuitive than the 24-bit color-space RGB (e.g. #FF0000). Eventually HSL will be widely available via CSS 3, which means that you should be able to use it natively by around 2009 (if only I were joking...). HSL breaks color into its hue, which is from 0 to 360 degrees, saturation from 0 - 100%, and lightness from 0 - 100%. Understanding the relationship between RGB and HSL was eye opening to me, and I discovered that I had significant misperceptions about how they correlated.
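For anyone who wants to poke at the relationship themselves, here is a minimal sketch of the RGB/HSL conversion using Python's standard `colorsys` module. The function names `hex_to_hsl` and `hsl_to_hex` are made up for illustration, and this is not the code behind the tool described above. Note that `colorsys` orders the components as hue, lightness, saturation (HLS), each scaled to 0..1.

```python
import colorsys


def hex_to_hsl(hex_color):
    """Convert '#RRGGBB' to (hue in degrees, saturation %, lightness %)."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) / 255.0 for i in (0, 2, 4))
    # colorsys returns hue, lightness, saturation (in that order), each 0..1
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return round(h * 360) % 360, round(s * 100), round(l * 100)


def hsl_to_hex(hue, saturation, lightness):
    """Convert hue (degrees), saturation (%) and lightness (%) to '#RRGGBB'."""
    r, g, b = colorsys.hls_to_rgb(
        (hue % 360) / 360.0, lightness / 100.0, saturation / 100.0
    )
    return "#{:02X}{:02X}{:02X}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


print(hex_to_hsl("#FF0000"))    # (0, 100, 50): pure red
print(hsl_to_hex(0, 100, 50))   # #FF0000
```

Because the percentages are rounded to whole numbers, a round trip through `hex_to_hsl` and back can shift a channel by a step or so for arbitrary colors.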
To add to the usefulness, it also gives complementary and triad colors, as well as complete color sets for your theming fun based upon common color theory. Each of the colors is linked to become the new base color, so you can browse around to find the perfect collection. Fun little tool.
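Part of the appeal of HSL is that these color-theory relationships reduce to simple arithmetic on the hue. A rough sketch, assuming colors are plain `(hue, saturation, lightness)` tuples in degrees and percent; the helper names and the default lightness steps are illustrative choices, not what the tool actually uses:

```python
def rotate_hue(hsl, degrees):
    """Return the same color with its hue rotated around the color wheel."""
    hue, saturation, lightness = hsl
    return ((hue + degrees) % 360, saturation, lightness)


def complementary(hsl):
    """The complement sits directly opposite on the wheel (180 degrees)."""
    return rotate_hue(hsl, 180)


def triad(hsl):
    """A triad is three hues spaced evenly, 120 degrees apart."""
    return [hsl, rotate_hue(hsl, 120), rotate_hue(hsl, 240)]


def lightness_variations(hsl, steps=(-20, -10, 10, 20)):
    """Darker and lighter shades of one hue, clamped to the 0..100 range."""
    hue, saturation, lightness = hsl
    return [
        (hue, saturation, min(100, max(0, lightness + step)))
        for step in steps
    ]


base = (327, 75, 55)
print(complementary(base))         # (147, 75, 55)
print(triad(base))                 # [(327, 75, 55), (87, 75, 55), (207, 75, 55)]
print(lightness_variations(base))  # lightness 35, 45, 65 and 75
```

Doing the same thing directly on RGB components means changing all three channels in proportion, which is exactly the manual, imprecise fiddling that HSL avoids.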
One other micro-application that I'm releasing, as a downloadable Win32 command-line EXE, is a tool that gives a weighted "suggested photo environment". It analyzes an image, using an outer weighting that you select, and recommends a core environment HSL for it to be contained within. You can then use that to create a color theme, and make gorgeous, aesthetically pleasing pages, much as Flickr often does.
This tool alternately allows you to monotone an image to a given hue if that's your desire. e.g. to make a photo fit within an existing color theme. This will be a very easy, lightweight, free method of doing what you can do in most good imaging apps.
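One plausible way to monotone an image, sketched here on a plain list of `(r, g, b)` pixels rather than a real image file: keep each pixel's lightness and replace its hue and saturation with fixed values. This is an assumption about the general technique, not a description of how the tool itself works, and `monotone_pixel`, `monotone` and the default saturation of 0.5 are hypothetical.

```python
import colorsys


def monotone_pixel(rgb, hue_degrees, saturation=0.5):
    """Recolor one (r, g, b) pixel (0..255) to a fixed hue, keeping lightness."""
    r, g, b = (channel / 255.0 for channel in rgb)
    # Only the lightness of the original pixel is kept
    _, lightness, _ = colorsys.rgb_to_hls(r, g, b)
    recolored = colorsys.hls_to_rgb(
        (hue_degrees % 360) / 360.0, lightness, saturation
    )
    return tuple(round(channel * 255) for channel in recolored)


def monotone(pixels, hue_degrees, saturation=0.5):
    """Apply monotone_pixel to every pixel in an iterable of (r, g, b) tuples."""
    return [monotone_pixel(pixel, hue_degrees, saturation) for pixel in pixels]


# Black and white are unaffected; everything in between takes on the hue.
print(monotone([(0, 0, 0), (255, 255, 255), (200, 50, 50)], 240))
```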
Anyways, just a "weekend update". This work was all for some R&D of something much different.
If you've participated in online forums, you've probably had the NSFW (Not Safe For Work) experience: Someone posts a link to something NSFW without adding appropriate warnings or disclaimers - either as a blatant troll, or without considering varied tastes and situations (e.g. someone sitting in their underwear at home posts a link to a questionable "funny" video that others are blindly opening in their workplace). Often the URL is obscured and indecipherable courtesy of "helpful" services like http://tinyurl.com/
Web 2.0 to the rescue!
Through the magic of collaboration and the power of many eyes making all troll/porn/questionable sites shallow, I propose an innovative new web service that allows users to add URLs, or whole domains, indicating the degrees and types of questionable content, perhaps using the power of folksonomy to tag the URLs. They can add descriptions using a Wiki-style interface, all powered by an AJAX-rich DHTML web application. Think http://del.icio.us/ + http://www.flickr.com + http://www.wikipedia.com + http://www.netnanny.com/.
Via an exposed open-API XML web service, an extension for Firefox can be created that would allow work users to browse more safely, as it automatically validates all URLs through the NSFW service, blocking or warning on potentially questionable content. For those who worry about the privacy ramifications of having all URLs auto-validated, they could manually validate select URLs, or perhaps download the latest "Current Widely Seen NSFW Websites" filter list and use that in their browser extension.
Sounds good, doesn't it? Any VCs out there ready to send me some start-up capital? I'm thinking in the 7-figures range.
Of course in actuality I think it would be a disaster. Not only is it the sort of service that most people wouldn't pay a penny for (yet without a large userbase you can't have usable community-driven rankings, and of course you can't have a professional taxonomy classification of the sites without revenue, as that costs real money), there are limited ancillary revenue options (it'd just be a tiny service that people use almost unconsciously - there is no stickiness). Until it gets a large userbase - which it is doubtful that it ever would as a user-contributed service - it would suffer from significant false-negatives through exclusion as well, not to mention that it could be easily gamed to cause false-positives.
A classic chicken-egg problem.
On top of that, varying scales of puritanism, as well as trolling with the site itself, would lead to extraordinary pollution of the database. You could try some sort of web-of-trust with relationships and personal networks, for instance only trusting rankings coming from those you trust (or who people you trust trust, and so on) but again that would vastly limit the scope of applicable NSFW rankings, rendering it close to useless for all but the most common of links.
[I should also note that people were tossing around ideas for collaborative, community-driven rankings of websites over a decade ago. The idea of community-driven content is hardly new. What is new is that people think they can build a revenue model on it, largely on the back of Google's innovative and prolific AdSense.]
Web 2.0 is, in its typical usage, a completely nebulous term. Yet remarkably its boosters will declare that it's all so clear, you idiot - it's "<INSERT THEIR OWN PARTICULARLY DIVERGENT INTERPRETATION HERE>". I've seen this play out in quite a few online and offline discussions, proving to me that it's an amorphous/eye-of-the-beholder sort of term.
Tim O'Reilly and friends were among the first to popularize it, so his take obviously deserves attention, yet even it lacks any degree of clarity. Really it appears to be nothing more than a freeze-frame of the web's continuing evolution.
Nonetheless, one recurring attribute of the "Web 2.0" religion deserves attention - Folksonomy, which is the loosely-controlled user-base keyword tagging of content (usually contrasted against the taxonomy of a site like Yahoo, where a central group of anointed ones classify content, albeit really they're just rubber stamping the classification provided by the website owner).
For instance I upload a picture to Flickr, keywording it flower and bee. Now people searching for related pictures can browse amongst pictures of bees, or flowers, or bees and flowers, and see my content amongst everyone else's. del.icio.us follows the same model, with users adding links and meta-data keywords that categorize them. Links can then be searched or related by keyword(s).
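Mechanically, this kind of tagging is little more than an inverted index from keyword to items. A toy sketch, with a hypothetical `TagIndex` class that has nothing to do with how Flickr or del.icio.us are actually built:

```python
from collections import defaultdict


class TagIndex:
    """A toy folksonomy: free-form tags mapped to the items that carry them."""

    def __init__(self):
        self._items_by_tag = defaultdict(set)

    def add(self, item, tags):
        """Record an item under each of its tags (case-insensitive)."""
        for tag in tags:
            self._items_by_tag[tag.strip().lower()].add(item)

    def search(self, *tags):
        """Return the items that carry every one of the given tags."""
        if not tags:
            return set()
        matches = [
            self._items_by_tag.get(tag.strip().lower(), set()) for tag in tags
        ]
        return set.intersection(*matches)


index = TagIndex()
index.add("my_photo.jpg", ["flower", "bee"])
index.add("someone_elses_photo.jpg", ["Bee"])

print(index.search("bee"))            # both photos
print(index.search("bee", "flower"))  # {'my_photo.jpg'}
```

Nothing in this structure checks that a tag is accurate or complete, which is where the trouble starts.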
In the early days of search engines, the content parsers were really quite dumb - they couldn't read the content of a web site and really figure out what the subject of the page was. As such the META tag was added, allowing website owners to attribute their content with a small, select group of keywords, and those keywords would allow the search engine to choose content appropriate for user searches.
Of course what started as a good idea quickly devolved into a tragedy of the commons as more and more people got involved - nefarious website operators learned to put unrelated, popular terms in their keywords to earn additional hits, gaming the system for their own advantage.
The sort of things that work on small-scale, edge sites quickly degrade as they become larger and more important. Already many of the "social tagging" sites are getting overwhelmed with spam and false hits, and errors, especially errors of omission, are extraordinarily common on these tagging sites.
Of course for photos there are no other options - we're not at a point where an automated analyzer can look at a picture and determine what it's about, so tagging is the best we've got. However for some attributes, like location, free-text tagging is terribly unreliable, which is why I look forward to the automated GPS tags talked about previously.
http://www.yafla.com
I absolutely despise the acronym "AJAX". Something about it rubs me the wrong way.
Perhaps it's a silly hang-up. Perhaps I'm foolish for not getting with the current lingo.
Nonetheless, I just don't feel comfortable with it. Not only is it impossible for me to say or write it with a straight face, I find it difficult to hear it from others without unconsciously stereotyping the other party as some sort of malleable, misinformed Johnny Come Lately.
It just seems like an uninformed way of uselessly simplifying a complex ecosystem of evolving and varying solutions into a meaningless, trite acronym (don't get me started on "Web 2.0").
Of course AJAX isn't really an acronym for anything anymore - realizing that it was founded on a solid foundation of ignorance, along with an unhealthy lack of historical knowledge, it has become a more generalized term meaning some sort of nebulous "interactive web application" (what we historically called a rich web application, or even the logically descriptive Dynamic HTML or DHTML). Now it's popular for it to just be the non-acronym "Ajax", with a much less restrictive meaning. If it's a cool web application, well that's courtesy of the new-fangled Ajax!
Even ridiculously pedestrian uses of scripted objects are hopping on the AJAX bandwagon these days. "It uses JavaScript...and that's the J....so it's AJAX!"
As a bit of "AJAX" history, way back in 1999 the Microsoft MSXML team - these were the people who made the superlative XML parser - added an oddball little object into the MSXML library: The XMLHTTP COM object. This object - one which originally used its own HTTP transport outside of IE's, leading to all sorts of proxy configuration fun - allowed one to programmatically send parameterized GET and POST requests to HTTP servers, retrieving data back for processing or display. They deserve credit for inventing it, but at the same time it was one of those inevitable solutions (in fact there were already safe-for-scripting third-party HTTP clients, but of course they had very limited deployment).
XMLHTTP, like the rest of the MSXML COM library, was usable and valuable both by native applications, and by Internet Explorer client-side script (because they handily marked it as Safe For Scripting, which is the flag that reveals a COM component to the IE scripting engine). It isn't actually tied to XML in any meaningful way, and the XML misnomer was purely a result of it being created by the MSXML team (it was just a bit of namespace). Microsoft widely distributed the library alongside other products.
Voila, another method of background data loading was released to the world, adding to the already existing and utilized hidden-IFRAME technique. Like many developers in the Microsoft world, I saw the benefits of this relatively small enhancement, and started enabling internal web applications with partial page loading, rendering and post backs (using other parts of MSXML to transform received XML against an XSLT for display - a next generation technique that still has very limited deployment). Indeed, I even used this object in rich, native applications as an easy way of communicating with web services. It was a convenient little object.
All back in the year 2000, half a decade ago. I was hardly a pioneer, and there were thousands of others doing the same thing or more.
Nonetheless, given that it only worked with Internet Explorer, it was a complete non-starter for the public web, at least for the rational.
In early 2001, after seeing the value that it brought to the table, the Mozilla team added XMLHttpRequest to Mozilla 0.8. As the years passed, all of the other major-minor browsers incorporated compatible implementations. It was becoming a technology that was usable in the mainstream...but for one small issue: Up into even 2003, a lot of large corporations stuck with Netscape 4.x as their primary browser, or as their back-up cross-platform browser (e.g. they had Internet Explorer as their primary, with Netscape 4.x as the umbrella if Microsoft suddenly jacked up licensing costs and they needed to switch all of their client operating systems).
This lowest common denominator inhibited public sites, forcing them to stick with the tried and true, so even with the pervasive availability of this functionality across all of the late-model browsers it was still off-limits for widescale use because of a very small minority. Imagine a nation restricted by a 30km/h speed limit because there are a couple of people with old, beat-up Yugos, while everyone else is pounding their steering wheels in their beefed-up gas-guzzling 8-cylinders. That was the web world.
In late-2004/early-2005 Google shocked a lot of the blissfully-unaware world by adding XMLHttpRequest-backed functionality to Google Suggest, offering "type-ahead" usability enhancements (it can also use IFRAMEs, which is one of the original background-communication techniques). The moment was right, especially for a non-critical beta service, when the old (e.g. Netscape 4.7) could finally be tossed aside, and we could move ahead with technologies that have long been available. Google was the biggest player leading the way, but of course they weren't alone, and there were already plenty of public implementations of this sort of dynamic content that demanded a modern browser (Outlook Web Access, a component of Microsoft Exchange, being one of the earliest and best known).
Google Suggest was the largest, most visible implementation, and it opened a lot of web naysayers' eyes up to what many of us had been evangelizing for years. Suddenly the unaware could see potential that had been there for years. Google also set a new bar for the amount of server side processing and bandwidth that could be allocated to a single random web visitor (not long ago it would have been considered insane to do that sort of processing and data transfer constantly as users typed keystrokes, for instance, but the availability of cheap computer cycles and copious bandwidth has changed the equation).
I'm going to segue to a little human-nature story here - I'm a Microsoft-centric user and developer: I chose to professionally focus on their tools and technologies as my specialty early on, and I've done very well with it (though I'll switch to other primary specialties if and when it's in my best interest). Nonetheless I regularly use Linux (primarily through the magic of virtual machines. I'd recommend that everyone does this, as an aside, especially now that VMware has made their Player free. You owe it to yourself to give Linux a try - you will probably be surprised at how rich and capable it really is on the desktop), and I try to keep myself fairly competent with it. I have lots of experience with other OSs as well, though sadly I've yet to try OS X. Anyways, putting all of that operating system defensiveness aside, I've been with Windows for quite some time.
One thing I've noticed over the past several years, as a Windows user, is that quite a few historically anti-MSers are making the switch to (or back-to) Windows on their desktop. Perhaps Microsoft's security initiative is having an impact, or some other extenuating circumstance is making them want to switch, but nonetheless they're "coming back".
Almost invariably they will defend their switch as being justified only by some relatively recent extraordinary revelation in the Windows platform - e.g. "That Windows 2000 was a real piece of crap, and I wouldn't have touched that with a 10-foot pole, but the new XP kernel is pretty good. Windows was crap before, but now it's good". Of course anyone who follows Windows knows that this is utter claptrap - XP is largely 2000 with a facelift and some relatively minor kernel changes. Really 2000 was already quite a well designed, nicely executed, very stable operating system, but they couldn't concede that prior iterations had value if they weren't a part of the game. Ultimately it only has value once they've entered the contest.
You can see this going the alternate direction as well - "Linux was garbage before kernel 2.x.x" - and for virtually any other technology. If someone is late to the game, there's a good chance that they'll try to manufacture some reason to validate making the switch now. Sometimes it's rational, but often it's just complete nonsense. I haven't spent any time with OS X, so instead I'll just wait a bit until the next iteration and declare whatever trivial change they've made to be the pivotal reason that justified my time.
This applies not only to engineers and their technologies, but to business hypesters as well - Instead of pimping themselves as yet-another entrant in the competitive web application market, they need to present the idea that they're coming at the perfect time: So much is possible now that wasn't possible yesterday.
I feel that a lot of the AJAX hype follows the same pattern - someone either poo-poohed web applications for years, or perhaps their career just finally brought them into web applications. Instead of feeling that they're a junior in a long established field, they have to invent or embrace some paradigm shift that renders everything that came before irrelevant. Disregard that the "technology stack" that facilitates a modern web application is extraordinarily wide and deep, and just pretend that this one small schism represents everything.
The crux of the matter, in my opinion, is that the acronym AJAX brings absolutely no clarity to the table, and instead introduces nothing but noise. Not only does it poorly define an implementation pattern that has been in use by experts for years, but its street use is so generic as to be detrimental (just look at that use of AJAX for the "AMASS" solution linked near the outset - Now AJAX just means "something done in a web app". How utterly inane). To others it is a buzzword that tries to simplify a multi-faceted technology platform into a nonsensically literal implementation detail, putting an inordinate amount of attention on one very small part of an interactive web application.
There is such a rich array of techniques and technologies available to a web developer to make a first rate platform, that such a simplistic and meaningless bit of language just confuses things.
Furthermore, there is this simpleton tendency for the hypesters and the unaware alike to see, or talk about, revolutions, where instead there has been a continual evolution (they're ignoring the fossil evidence!). One just has to read Tim O'Reilly's latest self-aggrandizement platform - Web 2.0 - to see this sort of distortion in action. Where many in-the-know see a constant evolution as the platform matured - everyone got faster computers, storage got cheaper, bandwidth became less expensive, the lowest common denominator got better, and so on - these people, with their enormous blind spot, instead see some momentous divergence between yesterday and today. Yesterday was old-sk00l 1.0, and today is 2.0. Get it? Yesterday's car is a Model-T 1.0, and today's car is 2006 Honda Accord 2.0, with nothing in between.
Great, now come to my conferences, link my pages, and buy my not-open-source-but-open-source-is-great books!
[Note: Some have noted that it should be Daylight Saving Time, without the pluralization of Saving. I, like many, use it more as a general-use title rather than a literal statement - given that it isn't actually saving daylight - and I generally hear it referred to as Daylight Savings Time. Just thought I should mention that.
If one wants to be a pedant, I believe it should actually be Daylight-Saving Time]
The Ontario government caved today, rashly deciding to follow the lead of a ludicrous U.S. energy bill rider, extending Daylight-Saving Time by three weeks in the spring, and a week in the fall (switching into DST on the second Sunday of March, rather than the first Sunday of April as it currently is, and switching back to Standard time on the first Sunday of November rather than the last Sunday in October).
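To make the two sets of rules concrete, here is a small sketch that computes the changeover dates for a given year with Python's standard `datetime` module. The helper names are made up for illustration, and 2007 is used purely as a sample year.

```python
import datetime


def nth_sunday(year, month, n):
    """Date of the n-th Sunday (1-based) of the given month."""
    first = datetime.date(year, month, 1)
    # weekday(): Monday is 0 ... Sunday is 6
    days_to_first_sunday = (6 - first.weekday()) % 7
    return first + datetime.timedelta(days=days_to_first_sunday + 7 * (n - 1))


def last_sunday(year, month):
    """Date of the last Sunday of the given month."""
    if month == 12:
        next_month = datetime.date(year + 1, 1, 1)
    else:
        next_month = datetime.date(year, month + 1, 1)
    last_day = next_month - datetime.timedelta(days=1)
    # Step back from the last day of the month to the nearest Sunday
    return last_day - datetime.timedelta(days=(last_day.weekday() + 1) % 7)


def old_rules(year):
    """First Sunday of April to last Sunday of October."""
    return nth_sunday(year, 4, 1), last_sunday(year, 10)


def new_rules(year):
    """Second Sunday of March to first Sunday of November."""
    return nth_sunday(year, 3, 2), nth_sunday(year, 11, 1)


old_start, old_end = old_rules(2007)
new_start, new_end = new_rules(2007)
print(old_start, old_end)               # 2007-04-01 2007-10-28
print(new_start, new_end)               # 2007-03-11 2007-11-04
print((old_start - new_start).days)     # 21 extra days in the spring
print((new_end - old_end).days)         # 7 extra days in the fall
```

The spring extension works out to three weeks in a year like this one; in years where April 1st falls later in the week relative to the first Sunday, the gap can be four.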
Given that many don't entirely understand DST, I thought I'd share a graph I made some time back (I originally planned on turning the source algorithm into a web service to allow one to punch in the inputs such as location and generate their own graph, but could never justify spending the time on it).
All values are calculated for Toronto, Ontario, for 2005. The red line represents EST sunrise, the yellow EST solar noon, while the cyan line represents EST sunset. The purple lines represent the 9-5 workday, adjusted in the summer months to account for DST (where 9-5 is really 8-4). The blue lines represent the extensions brought about by this change (3-weeks earlier in the spring, one week later in the fall). To recap - only the workday period on the graph above calculates in DST (e.g. the 8pm sunset during the summer is 9pm on the clock during DST, and the 4:20am sunrise is actually 5:20am on the clock).
As much as I dislike the incredibly costly confusion and complexity of DST (my two and a half year old still hasn't adjusted), there is a small amount of logic behind it - Presuming that human beings don't naturally adapt to the sun coming up earlier during the summer, DST moves an hour of this presumably unused time into the traditional post-work hours, "lengthening" the evening (not really lengthening the evening, but artificially doing so by moving the traditional 1950s work hours earlier in the day).
Many people would argue against this, saying that the summer hours give them an opportunity to jog, garden, go to the gym, and otherwise take advantage of the extended pre-work hours. Nonetheless, DST is geared towards those who do nothing until their pre-work preparation (e.g. the alarm clock goes off an hour and 30 minutes before the work day starts). For those people DST is entirely beneficial.
Extrapolate that logic out, though, and there should be a second layer of DST that moves the clock yet another hour forward during May to September. Maybe an hour more during June. Perhaps we should have a dynamic clock, such that 9am is an hour after sunrise year round.
Humor aside, there is a tremendous risk in this DST extension, especially coming into force so quickly. Having worked with a number of daylight-saving time related software problems (please use UTC, people, or at the very least disregard DST), I would wager that there will be significant ramifications of this. Millions of dollars will need to be spent preparing for, and then cleaning up after, what many seem to think is a simple date change.
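A small illustration of why local wall-clock time is a poor thing to store: on a fall-back night the same clock reading happens twice. This sketch uses fixed UTC offsets from Python's standard library rather than a real time zone database, so the `EDT` and `EST` objects and the `to_utc` helper are simplifications for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for Toronto's daylight and standard time
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# 1:30am occurs twice on the night the clocks fall back (October 30, 2005)
first_pass = datetime(2005, 10, 30, 1, 30, tzinfo=EDT)
second_pass = datetime(2005, 10, 30, 1, 30, tzinfo=EST)


def to_utc(moment):
    """Normalize an offset-aware datetime to UTC before storing or comparing."""
    return moment.astimezone(timezone.utc)


# The wall-clock readings are indistinguishable once the offset is dropped...
print(first_pass.replace(tzinfo=None) == second_pass.replace(tzinfo=None))  # True

# ...but in UTC they are plainly an hour apart.
print(to_utc(first_pass))                        # 2005-10-30 05:30:00+00:00
print(to_utc(second_pass))                       # 2005-10-30 06:30:00+00:00
print(to_utc(second_pass) - to_utc(first_pass))  # 1:00:00
```

Store and compare in UTC, and convert to local time only for display; the DST rules then become a presentation detail that can change without corrupting the data.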
Anyone interested in the source data that I generated for this can find it here (it's a Microsoft Excel worksheet).
One of the big marketing pushes to help hype the release of SQL Server 2000 was a huge onslaught of benchmarks - before SQL Server 2000 was even available to buy, its results were dominating the TPC results, primarily via clustering. Shortly thereafter, it is purported, Oracle demanded that the TPC separate clustered and non-clustered results. Not long after, SQL Server was doing very well in the non-clustered category as well (on very, very, very expensive machines - Big Iron).
SQL Server had joined the big leagues. Any questions about its scalability dissolved.
Remarkably we're on the cusp of the real release of SQL Server 2005 (Nov. 7th I believe), yet there has been barely any noise at all in the TPC results. It has taken more of a lead in the price/performance TPC-C results, and it has pushed a little higher in the pure performance results - though that has more to do with beefier hardware - but all-in-all it has been very sedate in contrast with 2000's release. I wonder if the TPC results simply aren't considered important anymore (probable, given how old most of the leading results are: 50% of the top 10 are from 2003).
Is the TPC no longer relevant? Does SQL Server 2005 simply offer marginal scalability/performance advantages for the TPC suites?
On the topic of scalability, SQL Server's clustering capabilities could use some improvements. As it is, scaling your database out across two or more servers is most certainly a non-trivial task. It's something you really have to design around (distributed partitioned views don't partition themselves, and it's a leaky abstraction). In an ideal world you could add a new server, install SQL Server and choose "add to the cluster" and it'll automatically propagate some data over and start sharing the load transparently. If it were so easy and elegant Microsoft would see a tonne of license sales as people scaled out.
I'm not an Oracle expert, but I believe that's how their clustering solution has been built.
Of course that sort of clustering is really focusing on the computation end, which really isn't a problem for most scenarios. Instead most are limited by I/O, and we already have methods (via SANs) of tremendously and transparently scaling-out our storage subsystem. Take a look at the full disclosure of the price/performance leader: A single (albeit dual-core) 2.8GHz processor - a relatively low-end head-end system - backed by a SAN hosting 56 "clustered" hard drives. The TPC-C benchmark is artificial, so this doesn't necessarily mirror the real world, but it is telling. Keep your data efficient through good design and delay the day that you need a 56-disk SAN.