Speed Hall of Fame
For some time in 2013-2015, we listed a performance improvement of the week. This page is now here for historical purposes.
But, feel free to nominate future changes! All Chromium contributors are eligible. To ensure a change is considered, nominate it to email@example.com.
This week, we highlight the performance sheriffing process working as it should due to reliability improvements. A few weeks ago, Joshua Bell landed a patch impacting IndexedDB. The performance sheriff Oystein Eftevaag filed a bug for an IndexedDB regression and the autobisect bot submitted a bisect job on his behalf. It returned with high confidence that Joshua's patch was to blame. While the regression was unexpected, Joshua investigated, determined it was his patch, and then posted a fix that resolved the performance regression. While we still have plenty of work to do improving reliability and decreasing latency, successes like these show that our improvements are making progress. Thank you to Oystein for the find and Joshua for the fix!
This week Balazs Engedy moved opening the LoginDatabase off of the UI thread and out of the critical path for opening Chrome. This shaved hundreds of milliseconds off startup times for Windows and Linux, decreasing the amount of time until the browser window becomes responsive. Thanks for your help Balazs!
This week's improvement of the week goes to Ulan Degenbaev. A few weeks ago we noticed recurring, long garbage collection activity and a seemingly related 80% spike in metrics from the field. The V8 team traced this back to an issue with their idle notification scheme, and Ulan landed a fix to make things right. The patch has already been merged to M39 and will go out with the next push. Thanks Ulan!
Last week Mike Klein turned on the new SkRecord-based backend for SkPicture, which resulted in 25-30% faster recording on Android devices. Recording is one phase of the painting process, so faster recording will result in shorter paint times and higher framerates. This new backend is the result of several months of Mike's work. Thanks Mike!
Last week Oystein Eftevaag landed a patch to start the commit upon receiving the last blocking stylesheet. This resulted in a 25% improvement from the time the user initiates a request to the time they see the first paint. Unlike most metrics we quote here, this didn't come from our devices in the lab, but from real users in the wild. Great work Oystein!
This week, Fadi Meawad fixed a bug in Chrome's power monitor on Windows, resulting in a 7% improvement in the entire system's battery life while running Chrome. This particular bug gathered over 7,000 stars on the issue tracker - the most of any bug by several thousand. While these savings are impressive, our power work is nowhere near complete. Stay tuned for more improvements!
This week, Emil Eklund finished up work on DirectWrite for Windows. In addition to being a widely requested feature by users and developers with over 500 stars on the bug, DirectWrite also has performance considerations. For example, we're seeing about a 7-10% warm page load time improvement on pages with non-Latin fonts. Thanks Emil, for your work toward a beautiful and speedy user experience.
Over the past few weeks, Sami Kyostila landed a series of patches for increased reliability and decreased cycle time of our performance tests. Performance improvements are fantastic, but without work like Sami's we would be flying blind with a bunch of broken tests. Thanks for your hard work Sami!
Simon Hatch upgraded the capabilities of our bisect bots so that they can now bisect functional breakages and changes in variance. You can find instructions in the "tips" section of the documentation, but it's as easy as setting bisect_mode to return_code or std_dev in your bisect jobs. Simon's work should help us quite a bit in our quest for reliable, stable benchmarks!
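To make the two modes concrete, here is a minimal sketch of how a bisect could classify revisions by something other than the mean of a metric. It is not the bisect bots' actual code: `summarize`, `first_bad_revision`, and `run_at` are hypothetical names, and only the mode names `return_code` and `std_dev` come from the entry above. It assumes the first revision is known good, the last is known bad, and the summary value goes up at the culprit.

```python
import statistics
from typing import Callable, Sequence, Tuple

# One benchmark run: (process return code, measured value).
Run = Tuple[int, float]


def summarize(runs: Sequence[Run], bisect_mode: str) -> float:
    """Reduce the runs at one revision to a single number to bisect on."""
    if bisect_mode == "return_code":
        # Functional breakage: 1.0 if any run exited non-zero, else 0.0.
        return float(any(code != 0 for code, _ in runs))
    if bisect_mode == "std_dev":
        # Change in variance: spread of the measured values.
        return statistics.pstdev([value for _, value in runs])
    raise ValueError(f"unsupported bisect_mode: {bisect_mode}")


def first_bad_revision(
    revisions: Sequence[int],
    run_at: Callable[[int], Sequence[Run]],
    bisect_mode: str,
) -> int:
    """Binary search for the first revision whose summary looks 'bad'.

    Assumes revisions[0] is good, revisions[-1] is bad, and the bad
    summary is larger than the good one.
    """
    good, bad = 0, len(revisions) - 1
    good_score = summarize(run_at(revisions[good]), bisect_mode)
    bad_score = summarize(run_at(revisions[bad]), bisect_mode)
    threshold = (good_score + bad_score) / 2
    while bad - good > 1:
        mid = (good + bad) // 2
        if summarize(run_at(revisions[mid]), bisect_mode) > threshold:
            bad = mid
        else:
            good = mid
    return revisions[bad]
```

The point of the sketch is that the search itself does not change between modes; only the per-revision summary does, which is why switching modes can be a one-line change in a job description.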
Dale Curtis landed a patch this week that increases the buffer size for audio streams when appropriate, reducing both CPU and power usage. For certain media types (such as mp3) the system-wide power consumption improved by up to 35%, with about 20% savings on average. This is the first of many great power patches we expect to see now that the infrastructure is in place. Great work Dale!
This week David Reveman landed a patch to remove task references from RasterWorkerPool on all platforms. I can't possibly produce a better summary than his own patch description: "This moves the responsibility to keep tasks alive while scheduled from the RasterWorkerPool to the client where unnecessary reference counting can be avoided. The result is a ~5x improvement in BuildRasterTaskQueue performance, which under some circumstances translate to almost 2x improvement in ScheduleTasks performance." Thanks David!
Last week Hayato Ito reduced the cost of checking whether a DOM tree is a descendant of another from O(N) in the height of the tree of trees to O(1). In smaller trees this produces 2-3x faster event dispatching, but in the deeply nested trees Hayato created he saw more than a 400x improvement! I'd also like to thank Hayato for the fantastic description of the patch and its effects in the CL description. Great work!
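For readers curious how an ancestor check can drop from O(height) to O(1), one well-known general technique is to label each node with an enter/leave interval in a single traversal, then compare intervals. The sketch below illustrates that technique only; it is not necessarily what Hayato's patch does, and `Scope`, `label`, and both check functions are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Scope:
    """Hypothetical stand-in for one tree in a 'tree of trees'."""
    name: str
    children: List["Scope"] = field(default_factory=list)
    parent: Optional["Scope"] = None
    enter: int = -1  # index assigned when the traversal first visits the node
    leave: int = -1  # counter value once the whole subtree has been visited

    def add(self, child: "Scope") -> "Scope":
        child.parent = self
        self.children.append(child)
        return child


def is_descendant_walk(node: Scope, ancestor: Scope) -> bool:
    """O(height): follow parent pointers until the ancestor or the root."""
    current = node.parent
    while current is not None:
        if current is ancestor:
            return True
        current = current.parent
    return False


def label(root: Scope) -> None:
    """One-time O(N) pass assigning enter/leave numbers (iterative DFS)."""
    counter = 0
    stack = [(root, False)]
    while stack:
        node, subtree_done = stack.pop()
        if subtree_done:
            node.leave = counter
            continue
        node.enter = counter
        counter += 1
        stack.append((node, True))
        for child in reversed(node.children):
            stack.append((child, False))


def is_descendant_fast(node: Scope, ancestor: Scope) -> bool:
    """O(1): a descendant's interval nests strictly inside its ancestor's."""
    return ancestor.enter < node.enter and node.leave <= ancestor.leave
```

The trade-off is that the labels must be recomputed (or otherwise kept up to date) whenever the tree structure changes, so this pays off when checks greatly outnumber mutations, as during event dispatch over a fixed path.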
This week's improvement goes to Chris Harrelson, who landed a patch speeding up CSS descendant selectors by an astounding factor of 20-30x across all platforms. Though it can be difficult to trust microbenchmark results, this change is expected to save 90%, or 50ms, of style recalc time from expand animations. One more step in the direction of silky smooth web apps!
This week Oystein Eftevaag landed a patch that allows faster first paints in the absence of pending stylesheet loads. This produced a Speed Index improvement of about 5% on Android, and a radically faster Google Search loading time -- beginning to show the page a full 2.5s sooner! Oystein's work also unlocked the possibility for several further performance enhancements, so stay tuned for more progress.
This week it was impossible to choose between three truly epic improvements! So they’ll have to share the winnings. First, Daniel Sievers dropped Windows cold message loop start time from ~4s to ~1s, returning us to pre-Aura levels. David Reveman pwnd some compositing benchmarks in a series of patches, improving them several fold across platforms. Last but not least, Camille Lamy shaved a couple hundred milliseconds off some page loads in the top 10 mobile sites suite by moving unload event handling off of the critical path.
This week's improvement has been over a year in the making. Toon Verwaest succeeded in reducing code duplication and complexity by removing call inline caches from the V8 codebase. The result was the deletion of over 10,000 lines of code and several unexpected perf improvements - not to mention the improvement to the lives of the developers who would have had to maintain that code. Thank you Toon for all your hard work!
This week’s improvement is in our ability to measure. We’re now receiving the first energy consumption metrics on the Mac 10.9 perf bots (with Android soon to follow). This enables us to begin optimizing while avoiding regressions. Huge thanks to Jeremy Moskovich, Elliot Friedman, and the Chrome Infra team for getting our first energy benchmark up and running!
Just before the holidays, the on-duty performance sheriff Victoria Clarke noticed significant startup time performance regressions; in some cases, the regression was as much as 350%. Victoria traced it back to an innocuous-looking change in the password manager. Upon revert our startup metrics recovered completely. Thanks Victoria!
Simon Hatch landed a patch that prioritizes the loading of visible images. This is a huge user win in perceived loading time, but the initial attempt also unexpectedly introduced several large performance regressions which probably would have gone to stable in the olden days. Pat Meenan, the performance sheriff on duty, tracked them back to Simon’s patch. This enabled Simon to revert the patch and reland it a few days later with all regressions resolved.
More compositor improvements this week, thanks to Vlad Levin. Vlad introduced the concept of TileBundles into the compositor, resulting in a 3-4x performance gain in updating tile priorities on desktop.
This week we saw a 70-90% improvement in composited layer tree host commit time across all platforms with impl-side painting. This is due in part to two changes, one by Adrienne Walker and one by Eric Penner. The former ensures that on a page with many layers, Chrome only spends time updating ones that have actually changed; the latter optimizes tiling resolution when scaling. Thanks to you both!
This week’s improvement comes from outside Google: Jun Jiang from Intel landed an order of magnitude speed increase for drawing dynamic WebGL to hardware-accelerated Canvas 2D.
This week we wanted to recognize the folks working on Aura and the Ubercompositor, who not only reversed a number of Windows regressions but pushed them further into solid performance enhancements. Specifically they were able to improve framerate (as much as double!) on the blob demo, deferred irradiance volume demo, and WebGL aquarium. The changes also affected real-world applications, similarly improving framerate on properties such as Google Maps.
This week’s top improvement is from a perf sheriff, Prasad Vuppalapu, who diagnosed a 50% regression in loading Japanese and Chinese web pages on Windows. The revert recovered the performance. Holding on to the speed we have is just as important as improving our speed in the first place.
This week’s top improvement is from Elliott Sprehn, who landed a massive 93.5% performance improvement to CSS/StyleSheetInsert. Great work Elliott!