One of the continuing trends of the Web 2.0 revolution is tag-mania -- sticking tags on everything and anything, hoping that it somehow improves the flow, digestion, and utility of information. From tag clouds on your blog, to Slashdot, to photos, to bookmarks, tags have continued to spread across the web landscape.
As with every tech "revolution", eager employees in corporations across the globe are embracing the trend, advocating tags on documents, directories, and files, and championing the concept of metadata.
As a bit of an explanation for those who haven't been following TechCrunch in morbid curiosity -- wondering what dubious business came out of super-secret stealth alpha invite-only mode today -- and thus aren't up on their Web 2.0 lingo, tags are, in essence, a set of words that one or more users apply to something to categorize it -- what we historically called keywords, albeit sometimes (though not always) with a "democratic" process determining the rendered tag set.
For instance, the tags of this post might be "Web 2.0, tags". Ten visitors might add "tripe", making it the dominant tag in the tag cloud.
Getting a variety of people adding tags to the same content, or building a common directory of information loosely categorized by tags, is what's commonly called a folksonomy. Consider, for comparison, the formal taxonomy of a system like Yahoo's classic categorization, where a submitter would choose exactly where in the hierarchy a link went, and the Yahoo overlords would validate it and insert it if appropriate. With tags, by contrast, categorization is loose: an item can carry multiple categorizations at once, and they adapt over time.
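As a rough illustration of that "democratic" process, here is a minimal Python sketch of how a tag cloud might tally what different users apply to the same post. The function name `tag_cloud` and the one-vote-per-user-per-tag rule are assumptions made for the example, not a description of how any particular site does it.

```python
from collections import Counter

def tag_cloud(submissions):
    """Count how many users applied each tag (hypothetical helper)."""
    counts = Counter()
    for user_tags in submissions:
        # Assumption: a user counts once per tag, and tags are
        # compared case-insensitively.
        counts.update({tag.strip().lower() for tag in user_tags})
    return counts

# The author's own tags, followed by ten visitors who each add "tripe".
submissions = [["Web 2.0", "tags"]] + [["tripe"]] * 10

cloud = tag_cloud(submissions)
dominant_tag, votes = cloud.most_common(1)[0]
print(dominant_tag, votes)  # prints: tripe 10
```

The point of the sketch is only that the rendered tag set is whatever the crowd votes for, which is not necessarily what the author chose.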
[Web 2.0 aware readers will probably shudder seeing an explanation of something so "basic", yet discussions in the field have led me to believe that much of this great revolution has gone unnoticed by the bulk of society, including even the majority of technology workers. I regularly converse with people who've never seen del.icio.us, don't know who 37signals are, and haven't been to Reddit or Digg or Flickr or Furl. Much like bloggers have grossly overestimated the impact of blogs on the general population, there seems to be a presumption that the Web 2.0 lingo and dogma are more universal than they actually are.]
While many of the Web 2.0 aficionados declare there to be a fundamental religious difference between the venerable keyword and tags, the difference is superficial at best (democratically selected keywords are still just keywords). The same keywords that have always existed as a data block in the JPEG file format, and exist in virtually every document format (Word, for instance), form the foundation of tags. Metadata has been around since we first started storing data, and tags are a continuation of that trend.
Many of the foundations of modern tagging, the evolution of the keyword, were first demonstrated widely by the superlative web photo organizing and sharing application Flickr.
Given the primitive state of image recognition, this was a perfect fit: without tagging your photo with keywords such as "bridge, burlington skyway, qew", there was no way a search could find that photo if asked, for instance, for pictures of the Burlington Skyway bridge. We aren't yet at a stage where software can reliably figure out what the subjects of a picture are, and mechanical metadata is still incomplete (although it's getting there), so keywords/tags/folksonomies fill a critical gap in the photography data process.
Outside of photos the use of tags is often much more dubious.
To go back in history a bit, when search engines first appeared they largely relied upon meta keywords. This was a compromise due to limits in the "comprehension" of content -- search engines got confused easily, and even when they could parse the content properly they couldn't truly figure out what the content was about.
Keywords came along, offering a simple, condensed, human-created subset of the data, categorizing the important attributes of the content. Search engines embraced and utilized keywords as an important element of fulfilling search requests.
The honeymoon didn't last long. It turned out that keywords were a prime stomping ground for search engine spammers, not to mention a horribly limited method of searching through data: not only were the choices of keywords entirely subjective -- often grossly incomplete and inconsistent -- but by design they covered only a very, very small subset of the content. If you really wanted content about metal railings, you might have missed my extensive discussion on that topic in my Burlington Skyway Bridge article because I didn't feel that metal railings made the cut for the keywords.
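To make the metal railings problem concrete, here is a small Python sketch contrasting a keyword-only lookup with a naive full-text match. The page record, its body text, and both function names are hypothetical, invented purely for illustration; real search engines of the era were considerably more involved.

```python
def keyword_search(pages, query):
    """Match the query only against each page's author-chosen keywords."""
    q = query.lower()
    return [p["title"] for p in pages
            if q in (k.lower() for k in p["keywords"])]

def full_text_search(pages, query):
    """Match the query against the page content itself (naive substring test)."""
    q = query.lower()
    return [p["title"] for p in pages if q in p["body"].lower()]

# Hypothetical page: the body discusses metal railings, the keywords do not.
pages = [{
    "title": "Burlington Skyway Bridge",
    "keywords": ["bridge", "burlington skyway", "qew"],
    "body": "A long look at the bridge, including the metal railings along the span.",
}]

print(keyword_search(pages, "metal railings"))    # prints: []
print(full_text_search(pages, "metal railings"))  # prints: ['Burlington Skyway Bridge']
```

Whatever the author leaves out of the keyword list is simply invisible to the first function, no matter how thoroughly the page covers it.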
Meta tags are largely dead now.
In their place, search engines have become much better at determining what a given page is about (or at least simulating a reasonable approximation thereof). By analyzing content, keeping a directory of similar and derivative words, and deriving information from context (such as links and related pages, and how they word links) and layout (noting that heading text, title, and early text hold more importance in classifying the page, though they are still used in concert with the rest of the content), search engines have come a long way in understanding content, and in correlating searches with appropriate results.
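The layout idea, that titles and headings count for more than body text but all of it is used together, can be sketched in a few lines of Python. The field names, the weights in `FIELD_WEIGHTS`, and the sample pages are all assumptions chosen for the example; this is a toy scoring function, not how any actual search engine ranks pages.

```python
import re

# Assumed weights: title matters most, then headings, then body text.
FIELD_WEIGHTS = {"title": 3.0, "headings": 2.0, "body": 1.0}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(page, query):
    """Sum weighted counts of query-term occurrences across all fields."""
    terms = set(tokenize(query))
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        tokens = tokenize(page.get(field, ""))
        total += weight * sum(1 for t in tokens if t in terms)
    return total

# Hypothetical pages, made up for the example.
pages = [
    {"title": "Weekend notes", "headings": "",
     "body": "We drove over a bridge with railings."},
    {"title": "Burlington Skyway Bridge", "headings": "Metal railings",
     "body": "The railings on the bridge are metal."},
]

ranked = sorted(pages, key=lambda p: score(p, "bridge railings"), reverse=True)
for page in ranked:
    print(page["title"], score(page, "bridge railings"))
```

Both pages mention the query terms in their body text, but the one that also uses them in its title and heading scores higher, with no author-supplied keywords involved.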
The loss of the keyword has proven to be very beneficial for search. Now it's the actual data that classifies the content, rather than artificial metadata.
With improvements in language processors and context-associative correlations (e.g. where the content parser understands that the paragraph on boxers is talking about the boxer breed of dog, determined by its correlation with other documents coupled with other details of the language, using language trees to classify probable meaning), things will only get better.
Content search has a very bright present, and a brighter future.
Yet tags continue to spread into woefully inappropriate domains, even where they serve as nothing more than the modern-day equivalent of the venerable META keyword. Instead of building reliable, feature-rich search tools into their products, appropriately determining relationships and context to understand content, product vendors are just tossing in a hack-job tag infrastructure and calling their job complete.
Worse still, users are accepting it and calling it a feature.