A Python script for automating the process of syncing, building, and testing commits for performance regressions.
The script works by syncing to both revisions specified on the command line, building them, and running performance tests to get baseline values for the bisection. After that, it will perform a binary search on the range of commits between the known good and bad revisions. If the pinpointed revision turns out to be a change in Blink, V8, or Skia, the script will attempt to gather a range of revisions for that repository and continue bisecting.
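The core of the script is an ordinary binary search over the revision range. A minimal sketch (not the actual script; `is_good` stands in for the sync/build/test cycle the real tool performs at each step):

```python
# Illustrative sketch of the bisection loop: given an ordered list of
# candidate revisions and a predicate that classifies a revision as good,
# binary-search for the first bad revision.

def bisect(revisions, is_good):
    """Return the first revision for which is_good() is False.

    Assumes revisions[0] is known-good and revisions[-1] is known-bad,
    mirroring the range the script takes on the command line.
    """
    lo, hi = 0, len(revisions) - 1  # indices of known-good and known-bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # The real script syncs, builds, and runs the perf test here.
        if is_good(revisions[mid]):
            lo = mid
        else:
            hi = mid
    return revisions[hi]

# Example: revisions 0..9, hypothetical regression introduced at revision 6.
revs = list(range(10))
culprit = bisect(revs, lambda r: r < 6)
```

Each probe costs a full sync, build, and benchmark run, which is why the number of revisions in the range (log2 of it, roughly) dominates total bisect time.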
Supported platforms include Android, Linux, Mac, and Windows. See http://build.chromium.org/p/tryserver.chromium.perf/builders.
Run on try bot
- Create a new git branch or check out an existing branch.
- Edit tools/auto_bisect/bisect.cfg (instructions in file).
- Take care to strip any src/ directories from the head of relative path names.
- You can search through the stdio from performance tests for run_benchmark to help figure out the command to use.
- You cannot specify a Blink, V8, or Skia revision range directly. You can only specify a range of Chrome revisions, and the tool will extract the other repositories' revision ranges by looking at the .DEPS.git file.
- Also note that cycle times for the bots can vary widely. Linux bisects are often the fastest, followed by Windows, Mac, and Android.
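The bisect.cfg file mentioned above is a Python fragment that assigns a dict named config. A hypothetical example is below; the field names follow the comments in tools/auto_bisect/bisect.cfg, but check that file for the authoritative list, and the benchmark name and revisions here are placeholders:

```python
# Hypothetical bisect.cfg contents (field names assumed from
# tools/auto_bisect/bisect.cfg; verify against the file itself).
config = {
    'command': 'tools/perf/run_benchmark -v --browser=release page_cycler.intl_ar_fa_he',
    'good_revision': '179776',
    'bad_revision': '179782',
    'metric': 'warm_times/page_load_time',
    'repeat_count': '20',
    'max_time_minutes': '20',
}
```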
Commit your changes locally. If this is the second or later change on the branch, amend via
git commit --amend (otherwise trychange.py cannot determine the diff).
Run the try job like so; you can substitute a specific bisect bot name for the --bot parameter.
git cl upload --bypass-hooks
git cl try --email=<email>@chromium.org -m tryserver.chromium.perf --bot=linux_perf_bisect
You can also see the results on the bisect try server waterfall page.
The try bot will send an email on completion as a regular "Try Success" email, showing whether bisect was successful and linking to output (the Results stdio link at the bottom).
Also note that the try bots run on LKGR, not ToT. If you just made a change to the bisect script itself, make sure you pass -r REV to ensure the bisect script contains your revision.
- If the bot seems to be down, you can try pinging a trooper.
Run locally
You probably don't want to run locally, except to debug settings before sending a job to a try bot, or when running overnight, since the tests will run in your local session and make your computer impossible to use. Further, you won't be able to run anything else without interfering with the tests.
- Recommended that you set power management features to "performance" mode.
-config set powermanagement performance
- Run locally in a private checkout (the recommended way of running locally). First make your changes to
tools/run-bisect-perf-regression.cfg (instructions in file), then run:
tools/run-bisect-perf-regression.py -w .. -p "$GOMA_DIR"
-w Working directory to store private copy of depot
-p Path to goma's directory (optional).
- Run locally from <chromium>/src:
tools/run-bisect-perf-regression.py -c "out/Release/performance_ui_tests --gtest_filter=PageCyclerTest.MozFile" -g 179776 -b 179782 -m times/t
tools/run-bisect-perf-regression.py -c "out/Release/performance_ui_tests --gtest_filter=ShutdownTest.SimpleUserQuit" -g 1f6e67861535121c5c819c16a666f2436c207e7b -b b732f23b4f81c382db0b23b9035f3dadc7d925bb -m shutdown/simple-user-quit
- Often you can get a clearer regression by looking at the other metrics in the same test. For example, if the metric is warm_times/page_load_time for a page_cycler regression, look at the individual pages. Often there's a page where the regression clearly stands out that you can bisect on.
- With tests that suddenly become noisy, bisecting on changes in the mean isn't very useful. There's a "bisect_mode" parameter in the config that allows you to specify "std_dev", and the bisect script will bisect on changes in standard deviation instead. There is currently no way to do this from the dashboard, so you'll have to initiate the bisect manually. The list of available modes can be found in bisect_utils.py; 'mean', 'std_dev', and 'return_code' are the values as of January 2015. 'return_code' can be used to find the point where perf tests begin failing (when metrics aren't being produced).
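To see why bisecting on standard deviation helps with tests that become noisy, consider two samples with the same mean but different spread. A small sketch (illustrative only, not part of the bisect script; the numbers are made up):

```python
import statistics

# Hypothetical benchmark results: before the culprit, stable values;
# after it, the same mean but much noisier values.
before = [100, 101, 99, 100, 100]
after = [80, 120, 95, 105, 100]

# A mean-based bisect sees no change across the culprit...
mean_delta = statistics.mean(after) - statistics.mean(before)

# ...but a std_dev-based bisect does.
std_before = statistics.stdev(before)
std_after = statistics.stdev(after)
```

Here mean_delta is 0, while the standard deviation jumps by an order of magnitude, so "std_dev" mode can pinpoint the change that "mean" mode would walk right past.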
- You can use the bisect script to find functional breakages as well. Specify "return_code" for the "bisect_mode". You can leave "metric" empty since this won't be used. There is currently no way to do this from the dashboard, so you'll have to initiate the bisect manually.
- If you suspect a Blink/V8/Skia roll, be sure that the range you specify includes when the DEPS file was submitted; the script will attempt to detect this situation and expand the range to include the DEPS roll.
- Remove "src" from paths in the command, and pass the --browser_executable flag.
- You are running the script from <chromium>/src.
- You have 'master' checked out in src.
- You are on the LKGR (which may not be HEAD).
- (You may be able to instead use a .diff file, as above.)
Known issues: See label Cr-Tests-AutoBisect on the issue tracker.