If you believe the Alexa graph, Digg is a significantly more popular site than Reddit, with the gap growing larger every day.
I find these results surprising. I've had half-a-dozen entries on the front page of Reddit, with each yielding 2500 to 5600 distinct referrals per day. In comparison, the one entry I've had on the front page of Digg for a day brought in only ~750 referrals.
Of course, a single entry isn't a very good sample, and it's entirely possible that most people just weren't interested in the link -- that it was a fluke that it got on the front page in the first place -- but I've seen several other stat watchers mention very similar stats (that a front-page of Digg yielded them 800 or so visitors per day), so I'm not basing this comment simply on my own observation. I've looked for exceptions to this, and found one individual who had a broad-interest link atop Digg's front page for 24 hours straight, and they claim that for that day they received a total of 7200 distinct referrals, and then it rapidly tailed off, disappearing in two days. That case seems to be the exception.
One possibility is that Digg offers more link diversity and thus the much greater traffic is dispersed, significantly reducing the impact on any one link. Alternatively, perhaps Digg users spend more of their time within the Digg community, rather than following the links (in the same way that many Slashdot readers just make assumptions about the linked article, responding accordingly, rather than RTFA).
Another possibility is that Digg caters to a crowd that is more likely to have the Alexa/A9 toolbars installed, both of which feed back the stats that are used to drive the Alexa popularity metrics. Given that they're somewhat infrequently used toolbars, and are much more likely among certain crowds (and seem to appear in clusters), the traffic rankings are a bit of a crapshoot outside of the top sites -- here on yafla I've had days with 6000 visitors where my Alexa ranking doesn't budge, whereas other days 2000 visitors cause it to quintuple.
One of the continuing trends of the Web 2.0 revolution is tag-mania -- sticking tags on everything and anything, hoping that it somehow improves the flow, digestion, and utility of information. From adding tag clouds to your blog, to slashdot, to photos, to bookmarks, tags have continued to spread across the web landscape.
As with every tech "revolution", in corporations across the globe eager employees are embracing the trend, advocating adding tags to documents and directories and files, and embracing the concept of metadata.
As a bit of an explanation for those who haven't been following TechCrunch in morbid curiosity -- wondering what dubious business came out of super-secret stealth alpha invite-only mode today -- and thus aren't up on their Web 2.0 lingo, tags are, in essence, a set of words that one or more users apply to something to categorize it -- what we historically called keywords, albeit sometimes (though not always) with a "democratic" process determining the rendered tag set.
For instance the tags of this post might be "Web 2.0, tags". Ten visitors might add "tripe", making it the dominant tag in the tag cloud.
Getting a variety of people adding tags to the same content, or building a common directory of information loosely categorized by tags, is what's commonly called a folksonomy. Consider, for comparison, a formal taxonomy of a system like Yahoo's classic categorization, where a submitter would choose exactly where in the hierarchy a link went, and the Yahoo overlords would validate it, and insert it if appropriate. By contrast, the loose addition of tags lets the same content pick up multiple categorizations over time.
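To make the "democratic" part concrete, here is a minimal Python sketch of how a tag cloud could be tallied from what individual users submit. The function name build_tag_cloud and the sample data are hypothetical, made up for this illustration; no real site's implementation is implied.

```python
from collections import Counter

def build_tag_cloud(taggings):
    """Count how many users applied each tag, most popular first."""
    counts = Counter()
    for user_tags in taggings:
        # Normalise case and whitespace, and count each tag once per user.
        counts.update({tag.strip().lower() for tag in user_tags})
    return counts.most_common()

# Hypothetical data: the author's own tags, plus ten visitors adding "tripe".
taggings = [["Web 2.0", "tags"]] + [["tripe"]] * 10
cloud = build_tag_cloud(taggings)
print(cloud[0])  # the dominant tag and its count
```

With ten visitors each adding "tripe", that tag outvotes the author's own two tags and dominates the cloud.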
[Web 2.0 aware readers will probably shudder seeing an explanation of something so "basic", yet discussions in the field have led me to believe that much of this great revolution has gone unnoticed by the bulk of society, including even the majority of technology workers. I regularly converse with people who've never seen del.icio.us, don't know who 37signals are, and haven't been to Reddit or Digg or Flickr or Furl. Much like bloggers have grossly overestimated the impact of blogs on the general population, there seems to be a presumption that the Web 2.0 lingo and dogma is more universal than it actually is.]
While many of the Web 2.0 aficionados declare there to be a fundamental religious difference between the venerable keyword and tags, the difference is superficial at best (democratically selected keywords are still just keywords). The same keywords that have always existed as a data block in the JPEG file format, and exist in virtually every document format (Word, for instance), form the foundation of tags. Metadata has been around since we first started storing data, and tags are a continuation of that trend.
Many of the foundations of modern tagging, the evolution of the keyword, were first demonstrated widely by the superlative web photo organizing and sharing application Flickr.
Given the primitive state of image recognition, this was a perfect fit: Without tagging your photo with keywords such as "bridge, burlington skyway, qew", there was no way searches could find that photo if asked, for instance, for pictures of the Burlington Skyway bridge -- we aren't yet at a stage where software can reliably figure out what the subjects of a picture are, and mechanical metadata is still incomplete (although it's getting there), so keywords/tags/folksonomies fill a critical gap in the photography data process.
Outside of photos the use of tags is often much more dubious.
To go back in history a bit, when search engines first appeared they largely relied upon meta keywords. This was a compromise due to limits in the "comprehension" of content -- search engines got confused easily, and even when they could parse the content properly they couldn't truly figure out what the content was about.
Keywords came along, offering a simple, condensed, human-created subset of the data, categorizing the important attributes of the content. Search engines embraced and utilized keywords as an important element of fulfilling search requests.
The honeymoon didn't last for long. It turned out that keywords were a prime stomping ground for search engine spammers, not to mention that it was a horribly limited method of searching through data: Not only were the choices of keywords entirely subjective -- often grossly incomplete and inconsistent -- but by design it was limited to a very, very small subset of the content. If you really wanted content about metal railings, you might have missed my extensive discussion on that topic in my Burlington Skyway Bridge article because I didn't feel that metal railings made the cut for the keywords.
Meta tags are largely dead now.
In their place search engines have become much better at determining what a given page is about (or at least simulating a reasonable proximity thereof). By analyzing content, having a directory of similar and derivative words, and by deriving information by context (such as links and related pages, and how they word links) and layout (noting that heading text, title, and early text hold more importance in classifying the page, though they are still used in concert with the rest of the content), search engines have come a long way in understanding content, and in correlating searches with appropriate results.
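As a rough illustration of the layout point, here is a toy Python scorer that counts query terms in a page but weights matches in the title and headings more heavily than matches in the body. The weights, the field names, and the sample page are all assumptions made up for this sketch; real search engines are far more sophisticated, and this is not how any particular one works.

```python
import re

# Hypothetical weights: title and heading matches count more than body text.
FIELD_WEIGHTS = {"title": 3.0, "headings": 2.0, "body": 1.0}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(page, query):
    """Sum the weighted occurrences of each query term across the page's fields."""
    terms = tokenize(query)
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = tokenize(page.get(field, ""))
        total += weight * sum(words.count(term) for term in terms)
    return total

# Hypothetical page, loosely modelled on the Burlington Skyway example.
page = {
    "title": "Burlington Skyway Bridge",
    "headings": "History of the bridge",
    "body": "The metal railings along the bridge were replaced.",
}

print(score(page, "bridge"))
print(score(page, "metal railings"))
```

Note that a query for "metal railings" still scores above zero here even though nobody listed those words as keywords: the match comes from the content itself.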
The loss of the keyword has proven to be very beneficial for search. Now it's the actual data that classifies the content, rather than artificial metadata.
With improvements in language processors and context associative correlations (e.g. where the content parser understands that the paragraph on boxers is talking about the boxer breed of dog, determined by its correlation with other documents coupled with other details of the language, using language trees to classify probable meaning), things will only get better.
Content search has a very bright present, and a brighter future.
Yet tags continue to spread in woefully inappropriate domains, even where they're serving as nothing more than the modern day equivalent of the venerable META keyword. Instead of building reliable, feature-rich search tools into products, appropriately determining relationships and context to understand content, product vendors are just tossing in a hack-job tag infrastructure and calling their job complete.
Worse still, users are accepting it and calling it a feature.
I like Reddit.
On average the signal to noise ratio is great, and a scan through the hot list is usually a very worthwhile venture. The wide range of topics makes it more entertaining and informative than many tech-only sites, but it still has enough tech-related info to feel pertinent to the software development profession.
I've also received a substantial number of hits from Reddit over the past couple of months, with no fewer than 5 entries hitting the front page for periods of time, each of them yielding 6000+ inbound visitors. Though these are of no profit to me, it is satisfying that many of these visitors left great comments and sent interesting emails, and found the entries informative or educational. After each onslaught the number of RSS subscribers jumps by a hefty amount.
Early on I admittedly submitted a couple of my longer, more thought-out posts to Reddit, thinking it would help exposure a bit, but became a bit discouraged by the whole exercise after seeing them instantly start descending into the negative range. Pure speculation, but my guess is that some rather unsportsmanlike submitters are automatically "voting down" everything in proximity of their addition, hoping to make their own submission stand out in relation (it's the only rational explanation for the almost instant vote downs). I would also guess that many users skip over low-ranked new items, so it basically becomes a race to get the first couple of up votes before it's voted into oblivion, and then a continued series of up votes to offset the continual downvotes.
This came to mind as I was just "testing out" the quicklinks that I just added on posts. I discovered a case of a single entry that had been submitted to Reddit three different times from different areas of the blog (which is a "benefit" of users who subscribe to and read the different areas). I've put these in order, determined by the obvious sequential ID that Reddit adds. I'm not sure of the specific times of each of them.
http://reddit.com/info?id=14ev - This was added from the home page version. It earned a forgettable score of 1.
http://reddit.com/info?id=14lu - This was apparently added referencing the static version located here. It earned a healthy -4 score. Perhaps because it was a duplicate of the prior one.
http://reddit.com/info?id=14sm - This was added from the Software Development version. It earned a very respectable score of 204, and I knew about this one because of a substantial impact on the visits over a two day period.
The exact same content, in different forms, yielded a 1, a -4, and a front-page-for-two-days 204. Whether it was because of titles, time of day, or simply luck of the draw (that the last one got momentum before the haters started downvoting), it is a fascinating demonstration that these sorts of web democracies aren't always a meritocracy.
Just thought that was a little fascinating.
I've received a couple of fantastic comments about troubles that people have faced adding items from here to their del.icio.us bookmarks, namely because Radio Userland uses a constant title for all entries (and del.icio.us automatically uses the title, so three different entries get the same title if you fail to manually override its choice). The common title problem was one of the reasons I created the notables static listing, though of course that listing is just a subsection of entries.
To help with this issue, I've added quicklinks below each entry to add it to your del.icio.us bookmarks, furl bookmarks, to Digg it or to Reddit it (which will link to an existing entry if one is already on there), and to check for Technorati links (there are seldom Technorati links because most of the readers here aren't bloggers, or they aren't the sort of bloggers that comment on every site they visit. I'd get a big boost in the Technorati rankings if I started pandering to the incestuous blogging community). I've mirrored these items to the static section as well.
I've received some great feedback regarding the entry on setting up a MediaWiki install on Windows. Many of the comments were kind words of thanks (which I really appreciate. Knowing that it helps people is my greatest motivation), and others helpfully suggested improvements to the instructions.
As an example of comment-driven improvements, my instructions have you installing the GNU diff utilities, in particular for the diff3.exe utility, however the MediaWiki setup scripts don't properly find it (i.e. as the instructions are currently written the GNU diff utilities are completely unused, although they can still be useful in your day-to-day travails). This is because a prior revision included fairly involved changes to the MediaWiki config/index.php script so it would properly locate diff3 on the Windows platform, as it is currently Unix-centric and doesn't look for the proper executable, not to mention that it parses the PATH environment variable incorrectly. After receiving two comments that those steps were a little too complex, however, I removed that section.
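For the curious, the sort of lookup involved can be sketched in a few lines. This is a Python illustration of the general technique only, not the MediaWiki PHP code and not the changes I removed; the function name find_executable and its parameters are hypothetical. The two platform details that matter are that PATH entries are separated by ";" on Windows (":" on Unix), and that the executable carries an extension such as ".exe".

```python
import os

def find_executable(name, path=None, extensions=None):
    """Return the first file called `name` found in the PATH directories, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    if extensions is None:
        if os.name == "nt":
            # On Windows, PATHEXT lists executable extensions separated by ';'.
            extensions = os.environ.get("PATHEXT", ".EXE").split(";")
        else:
            extensions = []
    # os.pathsep is ';' on Windows and ':' on Unix.
    for directory in path.split(os.pathsep):
        directory = directory.strip('"')
        if not directory:
            continue
        # Try the bare name first, then the name with each extension appended.
        for suffix in [""] + [ext for ext in extensions if ext]:
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                return candidate
    return None

print(find_executable("diff3"))  # a full path, or None if diff3 isn't on the PATH
```

In real Python code you would simply call shutil.which from the standard library, which handles these details; the longhand version is here only to show where a Unix-centric script goes wrong.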
My goal was to get people experimenting with MediaWiki, or even just wikis in general, so diff3 functionality really wasn't critical. I pared the instructions accordingly. Similarly one early draft included the building and installation of a PHP memory cache to improve performance, but that too is unnecessary to simply try out the product.
Another line of comments involved asking:
To answer this I really need to describe the philosophy of this blog, along with my resistance to "technology alliances".
In the byline of this blog I describe my philosophy as "pragmatic software development", and this really drives my recommendations. In this case there are a lot of development shops that are Windows-centric, with little or no UNIX/Linux experience, yet MediaWiki is one of the best, most feature-rich, "standard" wiki products out there. Choosing a solution that pairs what shops already know with the best product is a pragmatic approach.
Which brings me to my general philosophy towards Microsoft, as comments indicating that I'm either a Microsoft hater, or a Microsoft drone parroting the corporate line, have hit my inbox over the short history of this blog.
I am not subservient to Microsoft.
Unlike many Microsoft technology advocates (I truly love both SQL Server, and .NET, and I think they're remarkable solutions), I have no desire to ever work for Microsoft (Microsoft has some top notch, world-class talent, and I've met and worked with a lot of great talent from there, but they also have their share of both jerks and duds). I'm not going to praise their every move in hopes that I'll get noticed. yafla, my consulting/ISV company, has chosen to avoid any partnerships or tying to the Microsoft brand because we don't want to become another drone "consulting" company single-mindedly acting as a third-party sales force for Microsoft, desperately racking up Microsoft partner points by pushing less-than-optimal solutions on customers. We didn't choose to use .NET for our software because we're hoping to nestle into the Microsoft family -- we chose it on technical merit, and a pragmatic analysis of our current and prospective clients.
We work for our clients and ourselves, not Microsoft. This is a very important mantra for our services, and for the technology of our software, and if Microsoft wants their products to get recommended to our clients, and their technology to be the foundation of our software, they need to make great products at competitive prices. No sales gladhanding, or sad career dreaming, is going to change that.
Am I saying that Microsoft solutions are second rate? Of course there are examples of Microsoft products that are terrible, and customers are being misled into buying buzzword-laden atrocities because a Microsoft partner is hoping to get invited to the next Microsoft dinner party. Yet there are also Microsoft solutions that are extraordinary. Windows 2003R2 is a superlative operating system, and where you need the breadth of its functionality, it can be well worth the money. Microsoft Small Business Server can be an amazing package of value for some small organizations, within the constraints of the product. Other times, however, if you have the appropriate skills, a Linux machine is the best choice, along with a stack of the many available free or close to free server products on that platform. Sometimes IIS 6 is the superior solution for a problem, while other times Apache would be your best bet. Sometimes PHP and MySQL is a great solution, and other times C#/ASP.NET with SQL Server is the perfect combo.
I don't blindly assume the Microsoft product to be the best, but neither do I automatically presume it to be second rate. Instead I evaluate on merit, and propose solutions based upon the customer and their needs.
To do otherwise would be just biased noise, and wouldn't be to the service of clients and peers.
Tagged: [Software Development], [Programming], [Software-Development]
I've been doing this as a somewhat regularly updated blog for just over half a year now, and the results have been extremely satisfying: I get about 2500 direct unique visitors on an average day (increasing 2-6x when something ends up being a meme-of-the-day on sites like Reddit or Digg, and of course many read via aggregators), search engine referrals are up to 200 or so a day, and viewing the "who's on" list is a laundry list of influential corporations and locations across the globe.
It does feed my ego a little bit seeing visitors from various governments, the CIA, nuclear research labs, just about every large financial company, and visitors from every end of the globe. My numbers aren't huge, but it's a perfect composite of influential and knowledgeable readers.
The most popular entries thus far are as follows (I'm providing the static version links where possible):
Effectively Integrating Into Software Development Teams
Optimal Software Development Processes and Practices
Spelling Matters
Everyone Is Above Average - The Overpopulated Top 2%
I've tried to minimize the number of entries (outside of the personal category, this anniversary one being an exception) to keep the noise as low as possible -- if you're using a reader it won't constantly pretend there's new content when I'm just adding a peanut gallery comment about someone else's blog -- though on the flip side that means that I've delayed various .NET and SQL entries until they're "perfect".
Perhaps I might have to find a compromise somewhere in between.
After fielding several wiki-related queries by clients and associates, along with numerous questions and comments online, it is evident that it's the Year of the Wiki. Wikipedia has proven the concept, and users have a growing awareness of the benefits that organic information growth could bring to their teams.
As such, I'm putting together a feature covering wiki options and alternatives, including specific instructions for configuring and using Wikis on Windows (as this is a particularly neglected area, and much of the information that exists is terribly out of date or quite simply non-functional). Of course yafla provides turnkey Wiki solutions and training as a service as well.
One more point: The consulting work has always overflowed purely from word-of-mouth and associate networking, so the business website has always been terrible (sort of the whole "the cobbler has the worst shoes" thing). As yafla is now entering a growth stage, sailing past the temporary manpower limit, I'm finally going to change the corporate website to properly reflect the services and capabilities of the organization, and actually allow options for prospective clients to engage our services. That should be up shortly, growing and improving rapidly over the coming weeks.
Three yafla resource "shout-outs":
yaflaColor - A dynamic web tool to select colors, including proper saturation and lightness variations of colors
pureJpeg - Remove extraneous JPEG blocks
High Performance SQL Server - Information to ensure your database designs and usage are optimal
Have a fantastic weekend!