One of my PCs is a bit of a Frankenstein, having gone through countless small upgrades over the years.
A video card here. Some memory modules there. A replacement primary hard drive here (thank you g4u). A supplementary hard drive there. Half a dozen different CD and then DVD and then Dual-Layer DVD burners.
Every now and then it'd see a larger upgrade that mandated a motherboard replacement alongside a new CPU. Often that would require new memory modules as well. Maybe even a new power supply as connection standards changed.
Motherboard replacements have always been the most disruptive, and it's been interesting to watch as each has negated the need for some add-in or other. First the USB+FireWire board got punted, having been replaced by onboard functionality. Then the network card. Then the Sound Blaster card. The only true add-in card usually needed nowadays is the video card, and I'm sure it's only a matter of time before the on-board video reaches a credible level of performance, eliminating even that.
I've pursued this piecemeal approach to upgrading primarily because it minimized the software disruption in my life, usually requiring just a quick module swap, some driver updates, and it's up and running again. I actually enjoy the modular, hybrid-PC pursuit, individually scoping out and replacing components with the best bang-per-dollar option available at the time. It's a bit of a hobby.
[Clearly I'm not alone: A local "Tiger Direct" store opened recently in my town, featuring a huge floorspace stocked with esoteric power supplies, mod cases, and other components for DIY builders. I'm surprised that the demand is still there, having thought that the self-builder was an endangered species.]
I've been negligent, however. Over the past while this PC had seen little attention. Running on an extremely dated Athlon XP 1800+ (overclocked to equal a 2200+), with a "measly" 1GB of DDR1 RAM and a dated collection of complementary components, it had fallen so far behind the times that it had dropped far off the current CPU charts. While it served its casual gaming task well (the video card is quite contemporary, and given that few games are constrained by the CPU, it held its own), and admirably provided the network storage for photos and videos, its anemic standings were a bit embarrassing. Sure, it didn't need to be decent given the various home and business laptops -- powerful, modern units that saw most of my computing activity -- but I felt like I was letting it down.
So following up the entry from a couple of weeks ago, I finally got around to ordering a new CPU and motherboard on Tuesday, choosing a retail boxed Intel Core 2 Quad Q6600 2.4Ghz processor from Direct Canada for the extraordinarily low price of $279.99 CAD. I'd been directed to their site by a search-engine link to "Shopbot.ca", so I was a bit wary placing my order with this unfamiliar provider, but at 1pm the next day the box arrived at my door, amazingly delivered less than 24 hours after I ordered, coming from a shop 3000km away. I'm very satisfied with the price and speed. (I received no considerations for that comment, and know nothing about the shop beyond the fact that they sold me a killer piece of hardware at a great price, delivering it very quickly. Your mileage may vary.)
Along the way I discovered that some new memory modules would be in order to fully exploit the speed (going with 2GB in deference to the oft-claimed speed advantage, a claim that often flies in complete contradiction to actual memory usage metering). Oh, and a new case, as it might make the whole process a little easier.
In the end, the only legacy pieces that made the migration to the "upgraded" box are the hard drives, and the video card.
I booted up.
Minutes later the full-retail copy of Windows was running the right drivers, and after a quick re-activation it was storming along.
In a word (and a punctuation) - Wow!
What a tremendous amount of computational power on the cheap. Day to day activity really feels no different than it did before -- browsing is the same fast browsing that it was before, and given that I don't try to use Excel as a warehousing database, Office seems the same as well. Battlefield 2 plays the same given that I have the same video card, albeit now with absolutely zero stutters or hiccups as other threads demanding timeslices are generally satisfied by one of the other cores.
For the things that actually keep me waiting -- encoding a home video from the MiniDV, or building Firefox from CVS, as I do regularly -- the improvement is enormous. Not only are these operations massively sped up by the four cores available to them, better still I can configure them to only use one, two, or three threads of parallel execution (via the -j build option for Firefox, for instance), constraining them as a coarse fix for the deficiencies of the Windows scheduler. I can now run a full Firefox 3 build in just 12 minutes with full parallelism, or run it (or other demanding applications) with little or no impact on the usability and functionality of this PC for other tasks.
The build continued to speed up with more possible parallel operations, albeit with a decreasing rate of return, with the fastest test build occurring in just over 12 minutes with the highest option tested: -j12. Having more parallel operations than cores can yield benefits when it keeps the bottleneck resource, which in this case was the hard drive, busy for a larger share of the time. At this point the cores were left twiddling their thumbs waiting for the storage to catch up.
Limiting the build process to two cores via the process CPU affinity had it CPU starved beyond -j2, yielding no benefit via more parallelism.
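For anyone wanting to repeat this sort of experiment, here is a minimal sketch of a timing harness, not the script I used. It assumes the build is started from a command line that accepts a make-style -jN argument; the names `time_command` and `sweep_job_counts`, and the placeholder commands in the closing comment, are hypothetical.

```python
import subprocess
import time


def time_command(command):
    """Run one command to completion and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


def sweep_job_counts(build_command, job_counts, clean_command=None):
    """Time build_command once per job count, appending -jN each time.

    build_command and clean_command are argument lists supplied by the
    caller; nothing about the real Firefox build invocation is assumed.
    """
    results = {}
    for jobs in job_counts:
        if clean_command is not None:
            # Start every run from a clean tree so the timings are comparable.
            subprocess.run(clean_command, check=True)
        results[jobs] = time_command(build_command + [f"-j{jobs}"])
    return results


# Hypothetical usage (substitute the real build and clean commands):
#   timings = sweep_job_counts(["make"], [1, 2, 4, 8, 12], ["make", "clean"])
#   for jobs, seconds in sorted(timings.items()):
#       print(f"-j{jobs}: {seconds / 60:.1f} minutes")
```

Wall-clock time is the right thing to measure here, since the whole point is how long I sit waiting. Disk caching can flatter later runs, so repeating each job count a few times would give more trustworthy numbers.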
You can find a stacked graph detailing core processor usage for the above -j4 run (on 4 cores) at http://www.yafla.com/dforbes/images/Firefox_build_j4_4core.png. You can also look at a chart of building Firefox using the -j4 option, but setting the processor affinity to only allow the build access to two cores.
Not only is the build performance fantastic, but better still I can throttle it back to only run at most two parallel operations (-j2), getting a build in a still impressive 17 minutes while leaving two cores completely available for other tasks, like browsing the web with full responsiveness. I can even launch Battlefield 2, and remarkably it plays flawlessly...despite the fact that a full-scale, parallel build is going on in the background.
(Sidenote: Threads can still be left stalled, stranded waiting for a shared resource like the limited memory bandwidth and I/O paths, for instance. In the sample above my build was on a second hard drive -- a configuration that I recommend for all power users -- and clearly the other shared resources didn't impact the game to a perceivable degree.)
What a revolution in computer usage. What a discount-priced computational powerhouse.
A recent article on the utility of multiple cores has been making the rounds. Despite being largely a copy/paste of other articles and graphics, with a smidge of editorial commentary, it is anxiously heralded by dual-core owners as purchase justification in the face of progressing technology.
[As fair disclosure, let me say that I'm about to purchase a quad-core processor based system, and this article and its sources did absolutely nothing to dissuade me from this choice]
The meat of the article (or rather the articles that are referenced by the article -- someone else did the dirty, arduous footwork of benchmarking) is comprised of a showdown between a 2.4Ghz quad-core and a 3.0Ghz dual-core, which is reasonable given that they're comparable in price [at writing the 3.0Ghz dual-core E6850 can be had for $384 CDN, while the 2.4Ghz quad-core Q6600 is $319 CDN]. Given that many games and applications are effectively single-threaded as a legacy of lowest-common-denominator development, the faster clock speed dual-core processor abstractly takes the lead in such fundamentally synthetic benchmarks for the pricepoint.
Aside from the questionable "it's good to have one extra core to allow you to kill bad processes" premise (what if those bad processes are multithreaded? Do you just have to buy bad-process-threads+1 cores? Maybe set the affinity such that you've dedicated a core solely for the task manager? In the real world of modern schedulers, the only time you can't get control of the machine to kill a rogue process is because of some absolutely atrocious elements of the implementation of Windows, and a scheduler that is effectively broken in the face of some situations. Neither is necessarily improved by more cores), what really gets me about the whole exercise is how utterly synthetic it really is, using contrived benchmarks instead of rationally considering how people actually use their PCs, and where their real need for more power comes from.
Firstly, it largely focuses on games benchmarks. Even if gaming performance is pertinent to the reader, for the majority of users playing the majority of games, their video card is far more of a bottleneck than their processor (even if their processor is a dated affair). I'm saying this as a long time computer gamer -- one that finds the stuttering framerate on even top of the line game consoles intolerable: unless you've turned every quality setting to low and you're running at 800x600, it's doubtful that you're going to even measure, much less notice, a difference between a modern 2.4Ghz core and a 3.0Ghz core. Indeed, the very first benchmark I looked at on the referenced article says exactly that: "For this test, we set Oblivion's graphical quality to "Medium" but with HDR lighting enabled and vsync disabled, at 800x600 resolution". They did that to create a scenario where the differences are measurable.
So if you plan to game in a contrived way for the purposes of demonstrating CPU differences in benchmarks, then you'd better pay attention to core speed.
In the real world of gaming, after you've adjusted the quality and resolution settings to appropriate settings for your video card, the primary slowdowns during gaming tend to come about because of external applications rudely stealing your thread quanta: I'm about to toss the grenade into the bunker in Battlefield 2 when suddenly Windows Search has decided that this is a good time to rebuild its index corpus, for instance, so instead it falls to the ground and I take out my entire squad (Seriously, Windows Search guys - when a full-screen DirectX game is running, it probably isn't a good time to decide that the PC is "idle").
For moments like that, more cores make a huge difference. Dual-cores would be sufficient for that simple scenario, but what if my PC is even more active, as it always is? Perhaps the blog updater is running an update, I'm FTPing some files, a download is happening, and I'm gaming.
Every core works towards the ultimate goal of eliminating the real world problem of cycle theft from my hardcore gaming.
Presuming that you've passed a reasonable bar -- long behind you when you're talking about a 2.4Ghz Core 2 -- more cores will realistically improve things for gamers enjoying their vice in the real world. One day we might even have a world where we don't have to shut down services and trawl Task Manager violently killing processes before launching a game, fearful that it will disrupt our immersion.
My second problem with the article is that it doesn't question what people are really waiting for nowadays. Personally I see almost no difference between virtually any mainstream PC for the overwhelming majority of day to day operations (and this is as a developer) -- most activities are so fast the difference is negligible. I just switched laptops from a single-core 1.6Ghz Pentium M to one with a Core 2 Duo T7200 -- a significant improvement -- and from a day to day perspective I've indeed noticed that the new laptop has a better screen, a faster hard drive, and much better graphics, but the computational difference is largely unnoticed.
Until, humorously, I do something that is highly parallelizable, such as encoding a video pulled in from the miniDV video camera. In that case the dual-core processor strides to a massive lead over its single core predecessor. If it were a quad-core, it would storm even further ahead, even with the loss of frequency.
For something that I'm actually waiting on, more cores = more goodness.
I would definitely choose the quad-core processor for the software reality that we have today, despite the many legacy applications that individually fail to exploit the possibility. My conviction is amplified by the tremendous strides that application developers are making to parallelize their products. Once you've parallelized to 2 cores, it's generally a very small step to parallelize to 4 cores, or n cores for that matter.
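To put a rough number on the clock-versus-cores tradeoff, here is a back-of-the-envelope sketch using a simple Amdahl-style model. The model is my own assumption and not something taken from the benchmarks: it treats the two chips as doing equal work per clock, and it ignores memory bandwidth, cache, and scheduler effects. The function names are hypothetical.

```python
def relative_time(parallel_fraction, clock_ghz, cores):
    """Relative run time of a job under a simple Amdahl-style model.

    The serial part runs on one core; the parallel part is split evenly
    across all cores. Only ratios between results are meaningful.
    """
    serial = (1.0 - parallel_fraction) / clock_ghz
    parallel = parallel_fraction / (clock_ghz * cores)
    return serial + parallel


def quad_vs_dual(parallel_fraction):
    """Dual-core time divided by quad-core time (above 1.0 the quad wins)."""
    quad = relative_time(parallel_fraction, clock_ghz=2.4, cores=4)
    dual = relative_time(parallel_fraction, clock_ghz=3.0, cores=2)
    return dual / quad


for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    speed = quad_vs_dual(fraction)
    print(f"parallel fraction {fraction:.2f}: quad runs at {speed:.2f}x the dual")
```

Under these assumptions a purely single-threaded job runs at 0.8x on the quad, a perfectly parallel one at 1.6x, and the break-even point is a job that is 4/7 (about 57%) parallelizable. That is one job in isolation; several independent programs running at once push the balance further toward the quad.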
Bring on the cores!
Summer is waning here in the Northern hemisphere.
While it's sad that the warm weather and summer activities will soon be packed in the garage for another year, it's almost the time for fall fairs, rich soups, apple picking, walks in the gorgeous escarpment country when the leaves have changed color, pumpkins and costumes.
'Tis a wonderful season ending, to be replaced by another great time of year.
With the decrease in outdoor activities, I'll be posting more frequently. I've been kicking SQL Server 2008 around, and look forward to writing about it (I'm excited about its new hierarchical functionality, which has echoes of versatile high performance hierarchies), along with many other thoughts that have percolated in my head.
I'm going through the process of upgrading some Infragistics NetAdvantage 2007 v1 components to 2007 v2, one step in the upgrade process being the uninstallation of v1. The uninstaller has now been running for some 65 minutes, saturating both the hard drive and the CPU during the entirety of that time.
What possible explanation is there for this? Remove some registrations, delete some files and directories. Done. Where's the big complexity?
"But it's doing complex things!" a friend of MSIEXEC might retort (this is hardly the first time I've encountered outrageous installer times). Like what? Calculating the next Mersenne Prime?
In the time that it has run it could have read and written my entire hard drive several times over, and from a computational perspective it has now executed trillions of CPU operations. Trillions.
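The arithmetic behind "trillions" is easy to sanity-check. The sketch below uses only the 65 minutes from above; the clock rate, the operations per cycle, and the drive figures in the comments are stand-in assumptions for illustration, not measurements of this machine.

```python
UNINSTALL_MINUTES = 65  # elapsed time quoted in the post

# Assumptions, not measurements: a 2 GHz core retiring one operation per cycle.
ASSUMED_CLOCK_HZ = 2.0e9
ASSUMED_OPS_PER_CYCLE = 1.0


def cpu_operations(minutes, clock_hz, ops_per_cycle):
    """Operations a single saturated core could execute in the given time."""
    return minutes * 60 * clock_hz * ops_per_cycle


def full_disk_passes(minutes, bytes_per_second, disk_bytes):
    """How many times the whole drive could be streamed in the given time."""
    return minutes * 60 * bytes_per_second / disk_bytes


ops = cpu_operations(UNINSTALL_MINUTES, ASSUMED_CLOCK_HZ, ASSUMED_OPS_PER_CYCLE)
print(f"about {ops / 1e12:.1f} trillion operations on one core")

# Hypothetical drive: 80 GB capacity sustaining 60 MB/s. Substitute real figures.
passes = full_disk_passes(UNINSTALL_MINUTES, 60e6, 80e9)
print(f"about {passes:.1f} full passes over the hypothetical drive")
```

Even if the assumed figures are off by a factor of two in either direction, the conclusion doesn't change: the work an uninstaller legitimately needs to do is a rounding error against that budget.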
Given the basic metrics, there is simply no rational explanation beyond absolutely mind-boggling inefficiency. Par for the course, unfortunately.
"Come here and see this cool PBS promo that was just on!" I call to my wife, tapping pause on the PVR's remote. Given my history of calling her in for replays that weren't as hilarious or amazing as I originally imagined them to be, I knew this had to be a slick presentation if I wanted to impress with my PVR-fu.
As she enters the room the PVR finally reacts to my command, pausing, but then immediately playing again: Once again I've been caught out hitting the button twice, assuming that the first request got lost in the ether -- as often happens -- when it didn't seem to respond in a timely manner.
I hit pause again and this time it immediately reacts, coming to a halt.
"I just have to cue it up," I say, buying some time.
I tap rewind. Nothing happens. Come on, I think, I've got an impatient audience here! I tap it again, and the box launches into double speed rewind as I race for the play button. It plays on demand, but now it's several minutes before the desired start point.
Repeat.
Unpredictably high user interface latency strikes again. While this Motorola PVR is an exceptionally bad culprit for random non-responsiveness, it's hardly the only example of this seemingly growing trend.
My Moto Q smart phone is a great little device that I really enjoy, but the user interface responsiveness is enormously uneven, frequently lagging several seconds behind commands. Whether waiting for it to complete an application switch, or even during basic interactions such as entering a URL in the address bar of Pocket Internet Explorer, it's often out to lunch.
Presumably the questionable multitasking of Windows Mobile completely blocks the user-interface thread when it decides to chatter with a cell tower. Nothing else can explain its behaviour.
On startup my DVD player apparently needs to initialize its own little operating system, and if there's a disc in the drive it automatically demands that it determine the contents before it will eject. It insists that it be able to put "DVD" on the front-panel before any other activity occurs, achieving silicon self-satisfaction that it accurately determined the media type.
A common scenario has us getting ready to leave the house, preschool children bubbling with energy, when we realize that we have a disc that we should return to the movie store.
Turn on the entertainment unit. Wait for DVD to pre-power initialize. Hit eject on the front panel (which automatically turns the unit on). Wait as DVD player initializes and then unnecessarily spins up the disc in the drive to read the disc type and root menu.
Finally it ejects.
It isn't just household devices that show this worrisome trend. The bank recently "upgraded" their ATMs, bringing a colourful, graphical façade to what was once a glowing-green, ASCII, very serious interface. What once was a quick navigation through the menus now sees the painful redrawing of screens and laggardly keystroke responses.
It might only be 10 seconds or so from beginning of DVD eject procedure to the actual ejecting of said optical disc, but that's approximately 9.75 seconds more than it needs to be.
When time is short, even small delays like this can be incredibly irritating.
Despite the increasing computational capacity of our devices, the problem seems to be getting worse. User interface responsiveness seems to lie low on the list of priorities in many contemporary electronic devices.
These devices seem to be growing slower and slower, yet the processors that power them are getting dramatically more powerful.
Slashdot recently had a story linking to some reviews of a new Windows Mobile 6 smartphone. Several of the comments provided variations of the argument that the primary weakness noted in the review -- poor performance -- was the result of "underpowered" hardware.
Throw some more hardware at it and everything would be okay, the argument goes.
Consider that for a moment: is hardware really the problem? My Moto Q -- a device that often demonstrates terrible responsiveness (I'm not trying to pick on Motorola -- I've noticed the same behaviour with Nokia and Audiovox phones) -- is powered by an Intel XScale PXA272 processor running at 312Mhz. It comes with 64MB of RAM, 128MB of flash memory, and I've supplemented it with 1GB of miniSD flash storage.
Is that an insufficient bit of hardware to manage the awesome tasks of a smartphone with a 320x240 screen?
As a point of historic comparison, in the late 80s I was a proud owner of an Atari 520ST. It was a multimedia powerhouse powered by an 8Mhz Motorola 68000.
Despite what now is a laughably anemic CPU, it seemed infinitely capable at the time: I used it to create complex reports for high school in a full featured desktop publishing app. I did hobbyist software development in a rich IDE on it. I wasted away countless hours trolling local BBSes. It even was a wonderful game platform, running richly challenging games with gusto (games far more advanced than what you often find running in J2ME on your cell phone).
Later I upgraded to the Atari 1040STE, still with the same 8Mhz 68000, but offering expansion capacity to bring it up to a colossal 4MB of memory. This was so much memory that I usually created a memory "disk" out of 3MB of it, and still never felt limited in the 1MB.
Seldom did my ST ever feel laggy or non-responsive -- it booted close to instantly from ROMs, and the simple UI was always extremely responsive. Demo programmers had it doing tricks that still impress me to this day. Later a UNIX-style OS was ported to it, including full pre-emptive multitasking.
So how does that relic of the past compare with something like the Moto Q? Comparing straight Mhz isn't a valid comparison (for instance the ST's 68000 is a CISC processor, versus the RISC XScale), so I went searching and found some Dhrystone 1.1 benchmark numbers for both the XScale at 312Mhz and the 8Mhz 68000.
8Mhz 68000 (Atari ST) - ~1,603 Dhrystones / second
312Mhz XScale PXA272 - ~731,512 Dhrystones / second
On this benchmark the PXA272 in the tiny little smartphone on my belt (yeah, I'm a nerd) is equal to 456 Atari STs. Let's look at that in a bar graph in case it isn't clear enough.
[Bar graph: 8Mhz 68000 at 1,603 Dhrystones / second versus 312Mhz XScale PXA272 at 731,512 Dhrystones / second]
Wow.
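For anyone who wants to check the ratio, here is a small sketch that uses only the two Dhrystone figures and the two clock speeds quoted above. The per-clock comparison and the text bar chart are my own illustrative additions, and the names are hypothetical.

```python
ST_DHRYSTONES = 1_603          # 8Mhz 68000 (Atari ST), figure quoted above
PXA272_DHRYSTONES = 731_512    # 312Mhz XScale PXA272, figure quoted above

ratio = PXA272_DHRYSTONES / ST_DHRYSTONES  # throughput ratio
clock_ratio = 312 / 8                      # raw clock ratio
per_clock_gain = ratio / clock_ratio       # improvement beyond clock speed alone


def bar(value, largest, width=60):
    """Text bar scaled to the largest value, never shorter than one character."""
    return "#" * max(1, round(width * value / largest))


print(f"throughput ratio: {ratio:.1f}x")
print(f"clock ratio:      {clock_ratio:.0f}x")
print(f"per-clock gain:   {per_clock_gain:.1f}x")
print(f"8Mhz 68000           | {bar(ST_DHRYSTONES, PXA272_DHRYSTONES)}")
print(f"312Mhz XScale PXA272 | {bar(PXA272_DHRYSTONES, PXA272_DHRYSTONES)}")
```

The clock alone accounts for a factor of 39. The rest, a bit under 12x, is work done per clock on this one benchmark, which is exactly why comparing straight Mhz doesn't tell the story.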
Memory wise the Moto Q has a virtually infinite amount of memory compared to my old Atari ST.
I'm not trying to pretend that the ST of old did what a PDA of today is doing: I remember first getting access to low resolution JPEGs on a local BBS (I was a teenager and they were swimsuit photos...pretty risqué at the time. This stuff was much tamer than an issue of Maxim magazine or an "Umbrella" video), having to go through the tedious process of first "decompressing" them to a TGA, waiting as the decompression processed for sometimes minutes, and then viewing the uncompressed image. There was no way it could realtime do something as complex as rendering a JPEG.
Yet considering this enormous increase in computational power, it does seem evident that many device developers aren't respecting the time of their users, and few users are calling out terrible interfaces for being unresponsive and disrespectful. Reviewers, in particular, seem blind to responsiveness when rating devices, presumably because the artificial environment of a review can't be compared to quickly trying to respond to an email while standing in an airport terminal just as the last boarding call is made.
A user interface should be predictable and consistent -- it should always respond in a short, consistent amount of time (I would honestly feel that the PVR would be better if it always took 3 seconds to react to a command, versus now when it's anywhere between 0 and 5 seconds), always allowing the user to cancel operations that they're no longer interested in.
Responding to the user's input should always be job #1.
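As a minimal sketch of what that could look like in code (I have no idea how any of these devices are actually built): the input handler acknowledges every key press at once, hands the slow work to a background thread, and gives the caller a way to abandon an operation. The class name and the `slow_action` callback are hypothetical.

```python
import queue
import threading


class ResponsiveController:
    """Acknowledge every command immediately; run slow work off the input path."""

    def __init__(self, slow_action):
        # slow_action(command, cancel) is a hypothetical device operation that
        # is expected to check the cancel event periodically and stop if set.
        self._slow_action = slow_action
        self._commands = queue.Queue()
        self.acknowledged = []  # stands in for instant on-screen feedback
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def press(self, command):
        """Called from the input path; never waits on the slow work."""
        cancel = threading.Event()
        self.acknowledged.append(command)   # feedback first, work second
        self._commands.put((command, cancel))
        return cancel                       # set() this to abandon the command

    def _run(self):
        while True:
            item = self._commands.get()
            if item is None:                # sentinel queued by close()
                break
            command, cancel = item
            if not cancel.is_set():         # skip work cancelled while queued
                self._slow_action(command, cancel)

    def close(self):
        """Finish the queued commands, then stop the worker thread."""
        self._commands.put(None)
        self._worker.join()
```

The design choice is that `press` does a constant, tiny amount of work no matter what the device is busy with, so the user always sees the command register and is never tempted to hit the button twice. Whether queued commands should also be superseded by newer ones is a policy decision left out of this sketch.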
Jeff Atwood, of Coding Horror fame, recently rebutted my post "Beginners and Hacks", which itself was a reply to his post "C# and the Compilation Tax".
Jeff makes some great points, but at the outset I have to disagree with his statement "The present model of software development is clearly monkeys all the way down. And if you're offended to be lumped in with the infinite monkey brigade, I'd say that's incontestable proof that you're one of us."
No, Jeff, I don't develop via the Infinite Monkeys Model. It disturbs me that any professional in this industry would volunteer for such a pejorative.
While humility is often a good thing, there is a limit. Every developer can't be Linus Torvalds or John Carmack, but every single developer should still have professional self-respect, and a desire to do and be the best that they can.
As for my denial of membership in the worldwide IMB representing "incontestable proof" that I'm among that group, that comment had me reminiscing about a shop I worked in about a decade ago: A new hire had proposed a questionable set of development changes, some of which I was passionately opposed to. He dismissed such disagreement via a hilarious bit of circular reasoning--
a) If you passionately disagree, you are being "defensive"
b) If you're being defensive per the definition given in a), it must be because you are wrong.
It's a simple, comforting way of dismissing opposing perspectives: Everyone who disagrees is just being defensive because they're wrong. It was so remarkable that it has always stuck with me as an example of self-delusional perception.
Jeff goes on to compare his apparent utter dependence on continuous compilation code checking with squiggle-line spell-checking. I don't accept that simile at all, but let's humor the comparison for a moment.
I've written about the importance of correct spelling before, and have lauded the integration of automatic, continuous spellchecking in Firefox. I'm typing this entry in Microsoft Word, which has helpfully alerted me to several misspellings (mostly the result of typos).
I greatly appreciate these tools, and how they help me with the craft of writing.
Yet I'm not a professional writer. I am, in actuality, a hack and a beginner.
By noting that differentiation, am I then saying that a professional, dedicated-to-the-craft writer would actively abhor such a tool (see the Frank Navasky character from You've Got Mail as just such an anti-technology luddite)?
Of course not, and that is not and has never been the argument I'm making. Those who jump to such a conclusion are just being defensive, and thus, we have learned, must be wrong. No, I'm not calling for editing in notepad, or making shoes like we made them 150 years ago.
Instead I'd wager that you'd find the average professional writer, dedicated to the craft of putting words to print, has dramatically less dependency on such accoutrements than "beginners and hacks": They have elevated their creations to the point where something as rudimentary as spelling no longer represents a significant part of their "problem". They compose their creations so carefully that they're less likely to have such errors in the first place: When every line is a conscientious, careful, considered work of art, it's less likely that a typo-detection utility is as important.
For a blowhard blogger like me, vomiting paragraphs of raw thought into an editor, this sort of handholding is much more important, and the use of spell-checking actually speaks directly to my point. Writing is not my craft, and these literary creations aren't craftsmanship. I've even been known to mix up it's and its on occasion, to the delight of my critics.
This brings us to the crux of the whole "debate": It was never about the advanced functionality of tools, or even the use of said features or whether they "annoy" me or not, but instead I'm speaking to a growing trend of laziness and carelessness in coding, where developers emit screens of code (probably gloating about their remarkable LOC achievements), and after spending as much time fixing up the many automatically detected errors they spend weeks trying to diagnose the much more insidious logic, design and usage errors that almost certainly permeate their creation.
If their work is so carelessly authored that they consider continuous automated correctness checks a heavily leaned upon, necessary feature of their environment, then I wouldn't put much stock in the quality otherwise.
That is the problem that I argued against, simply stating that when you feel naked and abandoned without these assistants, finding yourself automatically doing frequent compilations to catch egregious mistakes, then you've probably lost touch with the craft, and your work isn't getting the loving attention it deserves.