There are a few ways to profile Chromium and Blink. Here are some of the tools that work well for diagnosing performance problems. See also the Deep memory profiler. For JavaScript issues, the built-in profiler works very well. To use it, open the Chrome Dev Tools (right click, Inspect Element) and select the 'Profiles' tab.
For a broader understanding of Chromium speed and bottlenecks, as well as of how posted tasks and threads interact in aggregate, there is a cross-platform, task-level profiler built in. Profiler results can be seen in about:profiler (or equivalently chrome://profiler). For more details, visit http://www.chromium.org/developers/threaded-task-tracking.
See chrome://tracing for timelines showing TRACE_EVENT activity across all the different threads. It was originally used for GPU performance work, so for features outside of compositing and rendering you will probably need to add TRACE_EVENT calls yourself. (This page was named about://gpu through M14.)
C++
For native C++ code, the tools depend on the OS.
Note that basic printf debugging and using a general debugger (such as gdb) may be sufficient for some purposes. However, more specialized tools are available.
Linux
See LinuxProfiling for alternative discussion.
gperftools
The gperftools project, from which we get TCMalloc, also includes a very nice profiler: Google CPU Profiler.
GYP_DEFINES+=" profiling=1 release_extra_cflags=-fno-omit-frame-pointer disable_pie=1"
src/out/Release/chrome
src/out/Release/chrome --single-process
pprof
Use pprof to analyze the results:
pprof src/out/Release/chrome chrome-profile-browser-NNN
Some tips:
You should be able to increase (or decrease) the sampling frequency (defaults to 100 Hz, i.e. one sample every 10 milliseconds) via the CPUPROFILE_FREQUENCY environment variable.
For nice viewing, output in DOT format and view with one of these programs: XDot (packaged in Ubuntu), ZGRViewer.
For XDot, you need to trim off the first line, which reads "Total: XXX samples". You can do this with tail, as follows:
pprof --dot src/out/Release/chrome chrome-profile-browser-NNN | tail --lines=+2 > NNN.dot
(If using --focus, there will also be an "After ..." line, so trim off the first two lines, i.e. use tail --lines=+3.)
You can also pipe directly to xdot if you don't want a temporary file:
pprof --dot src/out/Release/chrome chrome-profile-browser-NNN | tail --lines=+2 | xdot -
test_shell
Previously you could control when the profiler starts and stops from within test_shell (from revision 41218 until revision 132841). This can help a lot when trying to isolate a certain action without polluting the profile with a lot of startup/shutdown code. To do this:
The profile will be saved in a file called "chrome-profile" in the working directory. Currently, you can't stop and restart the profiler without overwriting the previously stored data.
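Returning to the DOT-trimming tip in the pprof section: the number of header lines to drop depends on whether --focus is used. A minimal sketch of a filter that avoids counting lines, assuming the graph itself starts at the first line beginning with "digraph" (the helper name strip_pprof_header is hypothetical and not part of pprof):

```shell
# Hypothetical helper: drop any text that precedes the DOT graph.
# Assumption: the graph starts at the first line beginning with "digraph".
strip_pprof_header() {
  # Print from the first "digraph" line through the end of input.
  sed -n '/^digraph/,$p'
}

# Possible usage (not run here):
#   pprof --dot src/out/Release/chrome chrome-profile-browser-NNN | strip_pprof_header | xdot -
```

This should behave the same as the tail commands in the common cases, while also covering the extra "After ..." line without changing the line count by hand.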
perf
You can also use the standard Linux perf tool:
By default this saves "perf.data" in the current working directory, which can be renamed. perf report may be able to run on older data, but perf annotate will be inaccurate if you've since rebuilt the executable.
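A minimal sketch of a record-then-report session, assuming a reasonably recent perf and a binary such as src/out/Release/chrome. The function name profile_with_perf and the PERF override variable are hypothetical, added only so the commands can be printed in a dry run; the perf subcommands and flags (record -g -o, report -i) are the standard ones.

```shell
# Hypothetical wrapper around two standard perf subcommands.
# Set PERF=echo to print the commands instead of running them.
profile_with_perf() {
  # $1: program to profile; $2: output file (defaults to perf.data).
  # Record samples with call graphs (-g) into the chosen data file.
  "${PERF:-perf}" record -g -o "${2:-perf.data}" -- "$1" || return
  # Summarize the recorded samples from that same file.
  "${PERF:-perf}" report -i "${2:-perf.data}"
}

# Possible usage (not run here):
#   profile_with_perf src/out/Release/chrome chrome.perf.data
```

Writing to an explicitly named data file keeps each run's samples separate, which helps when comparing profiles taken before and after a rebuild.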
Chrome OS
Profiling for Chrome OS is very similar to Linux, with a couple of key differences.
OS X
DTrace and the pre-packaged "CPU Sampler" tool in Xcode work well. Shark and the command-line sample tool also work, though both spend an exceedingly long time processing symbols if you are running Leopard (10.5). Anecdotally, this is much faster in Snow Leopard (10.6).
Windows
SyzyProf is a made-to-order, license-free, instrumenting hierarchical performance profiler that works well with Chrome. The aim with SyzyProf is to allow comprehensive profiling of Chrome code, including profiling over tasks or IPC, as well as integrated profiling over JavaScript in V8 and C++. SyzyProf is implemented as a 20% labor of love by a small group of Chrome developers, and we're looking for more >= 20% help.
I've heard that Purify has a profiler but have no experience with this personally.
AMD CodeAnalyst is a free profiler that can run inside Visual Studio. It captures frequency counts for functions in every process on the computer. It can optionally capture call-stack information, %CPU, and memory usage statistics. Even with the Frame Pointer Omission optimization turned off (build\internal\release_defaults.gypi; under 'VCCLCompilerTool' set 'OmitFramePointers':'false'?), the call stack capture can have lots of bad information, but at least the most-frequent-caller seems accurate in practice.
Intel's VTune 9.1 does work in the Sampling mode (using the hardware performance counters), but call graphs are unavailable in Windows 7/64. Note also that drilling down into the results for chrome.dll is extremely slow (on the order of many minutes) and may appear hung. It does work (I suggest coffee or foosball). VTune has been essentially supplanted by Intel® VTune™ Amplifier XE, which is an entirely new code base and interface, AFAIK.
Very Sleepy (http://www.codersnotes.com/sleepy) is a light-weight standalone profiler that seems to work pretty well for casual use and offers a decent set of features.
GPU profiling
Both nVidia PerfHUD and Microsoft PIX are freely available. They may not run without making minor changes to how the graphics contexts are set up; check with the chrome-gpu team for current details.
The OpenGL Profiler for OSX allows real-time inspection of the top GL performance bottlenecks, as well as call traces. In order to use it with Chrome/Mac, you must pass --disable-gpu-sandbox on the command line. Some people have had more luck attaching it to the GPU process after-the-fact than launching Chrome from within the Profiler; YMMV.
GPUView is a Windows tool that utilizes ETW (Event Tracing for Windows) for visualizing low-level GPU, driver and kernel interactions in a time-based viewer. It's available as part of the Microsoft Windows Performance Toolkit, in %ProgramFiles%\Microsoft Windows Performance Toolkit\GPUView. There's a README.TXT in there with basic instructions, or see http://graphics.stanford.edu/~mdfisher/GPUView.html. N.B.: There's a known bug which causes GPUView to crash when visualizing traces captured on machines with more than 8 cores. On an HP Z600, disabling hyperthreading in the BIOS is enough to work around this issue.
