Dennis Forbes on Software and Technology


About the Author
Dennis Forbes is a Toronto-based software architect. While focused primarily on the .NET and SQL Server worlds, Dennis frequently ventures outside of this comfort zone into game development and image processing. He has been published in several industry magazines, has been quoted in the Wall Street Journal and has been interviewed by NPR.

He is a vice president and lead software architect at an innovative New York City hedge fund back-office services firm.

Dennis has been working on solutions for the financial, telecommunications, and power generation markets for over 13 years.




Friday, June 26 2009

Web Worker Benchmark - Moonbat

If you're running Firefox 3.5 or Safari 4 [EDIT: Or Chrome 3.0], take a look at the "benchmark"/technology demo I just put up. [Safari 4 compatibility added based upon the great comment submitted by Oliver]

It's a modified variant of the SunSpider benchmark that I've written about before (in less than flattering terms), which I heavily altered to utilize the remarkable new Web Worker functionality you can now explore in Firefox 3.5. If you're really analyzing performance, be sure to disable Firebug as it significantly impacts the results.

Web Workers, a standardization of a feature of Google Gears, are a remarkably simple method of multi-threading JavaScript. They get script out of the UI thread, where it can be very detrimental to the user experience as the interface freezes while a script runs, and they also let it scale across multiple CPUs and cores on modern PCs. That may seem a ridiculous notion ("but it's just JavaScript! Multithreading?"), but it is becoming a real concern as the JavaScript engines continue to advance and the usage and scope of the language and related technologies continue to expand.

Through a simple, synchronized message passing system and a minimalist API, the Web Workers model lends itself to robust, elegant code that isn't as prone to classic multi-threading pitfalls. While it isn't part of the current implementations, in theory there is no reason why web workers couldn't be located on entirely different machines, given that each worker is essentially an isolated runtime that shares very little (the navigator properties and some basic security info for things like enforcing XmlHttp restrictions) and communicates via serialized messages.
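To make the message passing concrete, here is a minimal sketch of what a page and a worker exchanging messages can look like in a current browser. It is not the Moonbat driver: the `handleRequest` and `runInWorker` names and the `{ id, n }` message shape are made up for illustration, and building the worker from a Blob URL is just a convenience that keeps the example in one file.

```javascript
// Worker-side logic, kept as a plain function so it can be read on its own.
// The { id, n } message shape is a hypothetical example, not the benchmark's format.
function handleRequest(message) {
  let sum = 0;
  for (let i = 1; i <= message.n; i++) sum += i;
  return { id: message.id, result: sum };
}

// Browser-only wiring: builds a worker around handleRequest and resolves
// with the worker's reply. Nothing runs until runInWorker is called.
function runInWorker(message) {
  const source = handleRequest.toString() +
    "\nonmessage = function (e) { postMessage(handleRequest(e.data)); };";
  const url = URL.createObjectURL(new Blob([source], { type: "text/javascript" }));
  return new Promise(function (resolve, reject) {
    const worker = new Worker(url);
    worker.onmessage = function (e) {
      resolve(e.data);          // the reply arrives as a copy, not a shared object
      worker.terminate();
      URL.revokeObjectURL(url);
    };
    worker.onerror = reject;
    worker.postMessage(message); // the request is copied into the worker as well
  });
}
```

In a browser, `runInWorker({ id: 1, n: 100 })` would resolve with the worker's reply while the page stays free to handle input. The only contact between the two sides is the pair of `postMessage` calls, which is why there are no locks or shared variables to get wrong.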

Understanding the Benchmark

The benchmark/technology demo is operational in Chrome, Opera, and Internet Explorer, but only if you change Web Workers to 0. In that case it runs the set of tests sequentially in the main thread, as JavaScript has traditionally been run. I didn't intend for this to be used for cross-browser comparisons, even if I resort to presenting just such a comparison at the end of this entry. The focus is really on the technology, so the real "power" is seen once you start to turn up the web worker dial, all the way to 11.
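With Web Workers set to 0, the run boils down to a plain loop on the main thread. The sketch below shows that shape under some assumptions: `timeTests` is a hypothetical helper, the tests are assumed to be plain functions keyed by name, and the real driver may measure things differently.

```javascript
// Runs every test `cycles` times, one after another, on the calling thread
// and returns the total elapsed milliseconds per test.
// Hypothetical helper for illustration; not taken from the benchmark itself.
function timeTests(tests, cycles) {
  const totals = {};
  for (const name of Object.keys(tests)) {
    const start = Date.now();
    for (let i = 0; i < cycles; i++) {
      tests[name](); // the page cannot respond to input while this runs
    }
    totals[name] = Date.now() - start;
  }
  return totals;
}
```

Because nothing yields back to the browser inside the loop, the interface stays frozen until the last test finishes.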

Web worker multithreading isn't limited to Firefox 3.5. Oliver left a comment pointing to a Safari-ready variant he put up, so I modified the test accordingly (the difference being that Safari's implementation doesn't intrinsically include JSON encoding, so the caller and receiver have to do that themselves). I didn't realize that Safari had covered this ground, though it isn't shocking given how rapidly that browser has been advancing.
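For a string-only postMessage like the one Safari 4 shipped with, the workaround amounts to wrapping both ends of the channel. The helper names below (`encodeMessage`, `decodeMessage`, `sendJson`, `onJson`) are invented for this sketch and are not the ones used in Oliver's driver or mine.

```javascript
// Hypothetical helpers for a postMessage channel that only carries strings.
function encodeMessage(payload) {
  return JSON.stringify(payload);
}

function decodeMessage(text) {
  return JSON.parse(text);
}

// Sender side (page or worker): `target` is anything with a postMessage method.
function sendJson(target, payload) {
  target.postMessage(encodeMessage(payload));
}

// Receiver side: wraps a handler so it sees the decoded object
// rather than the raw string in event.data.
function onJson(handler) {
  return function (event) {
    handler(decodeMessage(event.data));
  };
}
```

On the page this would be used as `worker.onmessage = onJson(function (msg) { ... })` together with `sendJson(worker, { ... })`, with the mirror image inside the worker script.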

With one web worker, the UI remains fully responsive to user interaction, which is an experience quite unlike what was seen at 0 (where the browser essentially locks up during the run), and the actual run itself suffers little for the isolation. On a quad-core CPU, CPU usage over the test cycle hovers around 25%.

At two web workers, the individual tests take slightly longer to run, but the overall pace and completion time of the run as a whole is greatly improved. Not quite a halving of the runtime, but not too far off. Two cores are saturated for the duration of the test.

At three web workers, three of the cores are filled with work, and the total elapsed time improves somewhat, albeit not by the ratio that correlates with the 50% increase in computation power.

At four web workers, we've tapped out the parallelism, and despite all four cores being saturated for most of the run, the total runtime actually suffers slightly. Going above four doesn't cost much, but it also brings no real gain (beyond possible algorithmic gains from isolating various parts of the application).
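One way to reason about why extra workers stop paying off is a back-of-the-envelope model in which each test goes to whichever worker frees up first. This is an idealized sketch, not how the benchmark schedules work: `simulateElapsed` is a hypothetical function, the durations are invented numbers, and the model ignores contention entirely, so it cannot reproduce the slight slowdown I saw at four workers.

```javascript
// Hands each test, in order, to the worker that becomes free earliest and
// returns the time at which the last worker finishes.
// Idealized model: no startup cost, no contention between workers.
function simulateElapsed(durations, workerCount) {
  const freeAt = new Array(workerCount).fill(0);
  for (const d of durations) {
    let idx = 0;
    for (let i = 1; i < freeAt.length; i++) {
      if (freeAt[i] < freeAt[idx]) idx = i;
    }
    freeAt[idx] += d;
  }
  return Math.max.apply(null, freeAt);
}

// Invented per-test durations in milliseconds, purely for illustration.
const sampleDurations = [40, 30, 30, 20, 20, 10, 10];
```

With these made-up numbers the model gives 160 ms on one worker, 80 ms on two, and 40 ms on four or eight. Once the longest single test sets the floor, adding workers changes nothing, and that is before any real-world overhead is counted.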

You can also run a mode where, instead of running a modified js file directly in the worker thread, the code is passed as a string parameter, eval'd into a function reference, and the function is run. This mode turns up some interesting observations, such as the lack of TraceMonkey loop optimizations on eval'd code (see bitwise-and in particular, which suffers dramatically when run as an eval'd function relative to running as literal JavaScript). This surprised me, as the eval merely instantiates a function in the current context but doesn't run it, yet the performance penalty remains because the function was sourced from an eval.
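The eval mode has roughly the following shape. The test body here is a stand-in I made up (it is not the actual bitwise-and source), and `compileTest` is a hypothetical name; the point is only that the eval defines the function and the call happens separately.

```javascript
// Hypothetical test body, held as a string the way the eval mode receives it.
const testSource =
  "function () { var x = 0xFFFF; for (var i = 0; i < 1000; i++) { x = x & i; } return x; }";

// Wrapping the text in parentheses makes eval treat it as a function
// expression and hand back a reference instead of executing anything.
function compileTest(source) {
  return eval("(" + source + ")");
}

const testFn = compileTest(testSource); // defines the function only
const evalResult = testFn();            // running it is a separate step
```

Nothing in the loop executes during the `eval` call itself, which is why I expected the resulting function to be optimized like any other.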

Here are some results for 1-8 threads, running 10 cycles of each test, gathering the total elapsed time in Safari 4 and Firefox 3.5 RC2. This was run on a quad-core Q9400 machine, and of course your mileage will vary. While it is evident that Firefox 3.5 is using more of the available processing power as you move past 1 thread, with CPU usage rising through 25%, 50%, 75%, and 100% at 1, 2, 3, and 4 threads respectively, it doesn't fully benefit from the additional resources, yielding a greatly diminished rate of return. Safari, on the other hand, already started with a considerable lead, and it pulled away with each thread up to the optimal 4, really hitting its stride.

Multiple threads in Safari and Firefox 3.5

I'll add some charts and the like to this entry later, but just thought I'd drop a line on that demo of a very promising technology that will soon see fairly robust deployment (one huge benefit of Firefox -- shared by Chrome and Opera -- is that the uptake rate for new versions is extremely high).


Reader Comments

I've put up a version of the benchmark that runs on Safari 4 (Safari still has the string based postMessage API from earlier this year) at http://nerget.com/bugs/worker-sunspider/moonbat-driver.html -- it's basically your driver, but it explicitly serialises messages using JSON.
Oliver @ 6/26/2009 2:12:19 AM
Excellent post. Thank you Dennis.
Reginald @ 6/26/2009 5:25:50 AM
Do I really want my browser to eat up even more of my PC? One of the benefits of the single-threaded nature of the browser is that on my multi-core CPU I at least know it's going to be restricted to sucking up one whole core when it misbehaves. I'm a bit concerned about the idea that this unleashes that beast, and I hope I don't have to resort to setting processor affinity each time I start the browser to keep it in its box.
Tom Jacobs @ 6/26/2009 6:49:02 AM
You are my god.
Jeff @ 6/26/2009 7:56:26 AM
Looks like something has changed from Chrome 3 to Chrome 4 that breaks your benchmark. Sad - I was really looking forward to seeing the improvement for myself.
Greg @ 12/16/2009 3:31:03 PM
Hey there Greg, and a big thanks for the heads up.

This got caught up in the whole json2.js thing - http://www.stevesouders.com/blog/2009/12/10/crockford-alert/

Fixed now and should be functioning properly.

Again thank you for the notification.
Dennis Forbes @ 12/16/2009 4:03:50 PM

