Warning: this page is stale. Performance test results are now displayed at chromeperf.appspot.com.
Chromium's buildbot runs a battery of performance tests on each build it produces. These performance test results are collected and viewable on the web to help Chromium developers study how the project's builds perform over time. By adding new tests, tuning existing tests, and configuring monitors to alert developers to unexpected conditions (such as regressions or speedups), the Chromium project continually improves its overall code quality. This document explains how that performance testing system functions and how changes can be introduced.

Gathering performance results

Each performance test is run by buildbot by executing that test's harness. The harness sets up the initial conditions for the test, runs the test (possibly repeatedly), and reports results on the program's standard output. You can build and run a test harness on your own system -- these are the same programs the buildbot runs to measure performance. Note that your performance results could vary considerably from what the Chromium project's test systems report.

A sample performance plot

Test runs can be simple -- "test the amount of time it takes to switch between two tabs" -- or complicated -- "test warm/cold startup performance with/without themes." Most of the tests run by the buildbot are pretty straightforward, but the performance test output can be hard to use at first. Let's look at the most complicated kind of performance test, a page-cycler test, as an example. The page-cycler tests take a set of URLs that have been saved to local files, to eliminate the network as a source of performance noise, and load them several times each, tracking various data. Find a "Perf" builder, scroll down to any page_cycler step, and click the "results" link. You'll see a plot of the results.
The boxes along the top list the different graphs available, one for each kind of data collected by the page_cycler test. Click on one to see that information plotted. Each graph and trace name is built from short pieces describing what was measured and for which process; for example, "vm_final_b" is the final VM usage of the browser process.
Depending on the graph you're looking at, there will be one or more colored lines, or "traces". Generally one of these is the main data point for the build being tested ("t" for times, for instance) and one is data taken from a reference build so we can discount temporary glitches on the machine ("t_ref" for reference times). Other traces might be shown, too, and described in the legend below the plot.
If you click your mouse on the plot, one build will be highlighted by a grey vertical line, and the svn log for the build you've selected will be shown in a frame below the plot. Below the legend is the revision number and first data point (using the order on the legend) for that revision. It's followed by the data value for the spot your mouse is hovering over, so you can quickly compare two points. If you shift-click on the plot, you'll place a horizontal marker, and the offset between that marker and the mouse is shown in the lower right.

When a Chromium developer wants to measure a new aspect of Chromium's
performance, the developer either modifies an already-in-place harness
to provide additional data or creates a new harness to produce this
data. When a new harness is added, the Chromium buildbot's
configuration is updated to run it and told where to gather its
performance test results. Modifying a harness to add more data may not
require any buildbot configuration changes. Tests in Chromium typically use one of the methods in chrome/test/perf/perf_test.h to print their results in the standard format expected by the post-processing scripts.
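For illustration, a harness that measures tab-switch time might report a single measurement as in the sketch below. This is only a rough sketch: ReportTabSwitchTime is a hypothetical helper, and the exact set of overloads available in perf_test.h may differ from what is shown here.

// Hypothetical example; ReportTabSwitchTime is not a real Chromium function.
#include <cstddef>

#include "chrome/test/perf/perf_test.h"

// Reports one tab-switch timing so that it appears on the "times" graph as
// trace "t".
void ReportTabSwitchTime(size_t switch_time_ms) {
  // PrintResult writes a line to standard output along the lines of
  //   *RESULT times: t= <switch_time_ms> ms
  // which the buildbot's post-processing scripts parse into graph and trace
  // data for the dashboard.
  perf_test::PrintResult("times", "", "t", switch_time_ms, "ms",
                         true /* important */);
}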
Monitoring performance for regressions

Once the Chromium buildbot runs all of a build's tests, it gathers the results on the perf dashboard and reports whether each test met, exceeded, or missed its expectations. The buildbot follows a fixed algorithm, driven by the expectations file described below, to determine what to report after gathering results.
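The exact rules live in the buildbot's post-processing scripts and aren't reproduced on this page, but conceptually each monitored trace's current-build value is compared against the reference build's value using the improve and regress thresholds from the expectations file described below. A rough, hypothetical sketch of that per-trace comparison (assuming a lower-is-better metric such as page load time) might look like this:

// Illustrative only; names and structure do not come from the real scripts.
enum class PerfStatus {
  kMetExpectations,  // result is within the expected range
  kRegression,       // worse than the "regress" threshold: the step fails
  kSpeedup           // better than the "improve" threshold: a warning
};

// |current| and |reference| are the values measured for the current and
// reference builds; |improve| and |regress| come from perf_expectations.json.
PerfStatus CheckTrace(double current, double reference,
                      double improve, double regress) {
  const double delta = current - reference;  // e.g. page load time delta in ms
  if (delta > regress)
    return PerfStatus::kRegression;
  if (delta < improve)
    return PerfStatus::kSpeedup;
  return PerfStatus::kMetExpectations;
}

For metrics where larger values are better, the comparison direction would flip; the real scripts also take the reva/revb revision range into account, which this sketch ignores.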
Format of expectations file

Perf regression monitoring is controlled via src/tools/perf_expectations/perf_expectations.json. This is a JSON file with the following format:

{
  "PERFID/TEST/GRAPH/TRACE": {"reva": REVA, "revb": REVB, "improve": IMPROVE, "regress": REGRESS},
  "PERFID/TEST/GRAPH/TRACE": {"reva": REVA, "revb": REVB, "improve": IMPROVE, "regress": REGRESS},
  "PERFID/TEST/GRAPH/TRACE": {"reva": REVA, "revb": REVB, "improve": IMPROVE, "regress": REGRESS},
  ...,
  "load": true
}
Notes:
Perf expectations example

Take the following perf_expectations.json file as an example:

{
  "xp-release-dual-core/morejs/times/t": {"improve": 50, "regress": 110},
  ...
}

This declares two trace instances that should be monitored for regressions, both on Chromium's XP perf test system. The first -- xp-release-dual-core/morejs/times/t -- monitors results of the morejs page cycler's page load time. It alerts when the page load time for the current build less the reference build is greater than 110 ms. The second -- xp-release-dual-core/startup/warm/t -- monitors results of the warm startup test and alerts when the difference between the current build and the reference build is greater than 25 ms. For each of these traces, warnings (not failures) are generated when speedups occur. A speedup occurs when a given result is better than the expected improve value.

Updating performance expectations, selecting acceptable results, ...

See the Perf Sheriffing page for more details about updating perf expectations, selecting acceptable results, etc.