
Profiling Native Client applications on 64-bit Windows

Introduction
nacl_profile.py is a tool for profiling NaCl modules on 64-bit Windows. It is a thin wrapper around AMD CodeAnalyst. CodeAnalyst does the actual profiling, and nacl_profile.py maps the samples in the NaCl sandbox to symbols in the nexe files and the IRT (NaCl Integrated RunTime).

nacl_profile could also be made to work on Linux (using CodeAnalyst) and Mac (using Shark) with minor modifications. For now, developers on Linux can use oprofile for NaCl profiling. If you’d like nacl_profile support on Linux or Mac, let the NaCl team know on the Native Client discussion group. That’s also a great place to ask questions about nacl_profile. Found a bug? Please report it in our bug tracker.

Requirements

  • nacl_profile.py
    Attached at the bottom of this page.
  • Native Client SDK
    Known to work with x86_64-nacl-objdump revision 6808.
  • Python 2.6+
    Known to work with Python 2.6.4.
  • AMD CodeAnalyst
    Known to work with 3.3.1016.774 including careport.exe version 1.0.1.65, caprofile.exe version 1.2.4.62, and cadataanalyze.exe version 1.2.1.10.

Get profiling data for NaCl plus a system summary

Profile a NaCl module that is already running

For nacl_profile to resolve symbols (that is, to report which functions are being called), point it at a non-stripped nexe. To profile my_app_x86_64.nexe, navigate to your web app so that my_app_x86_64.nexe is already running in Chrome, then execute the following in a command prompt:

python nacl_profile.py --nexe_file=my_app_x86_64.nexe


This will start the profiler, profile the system for 20 seconds, and print three summaries to the console: (1) a system-wide process top 20, (2) a system-wide module top 20, and (3) a Native Client function top 30:

  Timer
Samples        %      PID Process Name
-------- -------- -------- ------------------------
3020338    92.53        0 System Idle
 184477     5.65      936 chrome.exe
  26361     0.80     4268 nacl64.exe
  19073     0.58        4 unknown module pid(4)
...

  Timer
Samples        % Module Name
-------- -------- -------------------------
3018223    92.47 ntoskrnl.exe
 148113     4.53 ntdll.dll
  24052     0.73 unknown module pid (4268)
  14695     0.45 libglesv2.dll
  12924     0.39 ntoskrnl.exe
...

Probable nexe process ids:  ['4268']

   % Address       Symbol
----- ------------- -----------------------------------------------------------
18.6 0x00c00024460 Tentacle::draw(unsigned long long, mat4 const&, mat4 const&
                   , mat4 const&)
16.4 0x00c000201e0 mat4::operator*(mat4 const&) const
 8.0 0x00c000ecd20 __kernel_sinf
 6.8 0x00c000ebce0 __kernel_cosf
 6.1 0x00c0fc322a0 gpu::CommandBufferHelper::WaitForAvailableEntries(int)
...

Want more than the top 30? Use --top=n or simply go --top=all to get everything.
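
If you want to post-process the Native Client function list (for example, to diff two runs), the console table can be parsed with a few lines of Python. The following is only a sketch that assumes the three-column layout shown above (percentage, address, symbol); parse_function_table is a hypothetical helper and not part of nacl_profile.

```python
import re

# One table row: percentage, hexadecimal address, symbol text.
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+(0x[0-9a-fA-F]+)\s+(.*\S)\s*$")

SAMPLE = """\
   % Address       Symbol
----- ------------- -----------------------------------------------------------
18.6 0x00c00024460 Tentacle::draw(unsigned long long, mat4 const&, mat4 const&
                   , mat4 const&)
16.4 0x00c000201e0 mat4::operator*(mat4 const&) const
 8.0 0x00c000ecd20 __kernel_sinf
...
"""


def parse_function_table(text):
    """Return a list of (percent, address, symbol) tuples.

    An indented line that is not a row is treated as the wrapped tail of
    the previous row's symbol and is appended to it.
    """
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            percent, address, symbol = match.groups()
            rows.append([float(percent), int(address, 16), symbol])
        elif rows and line.startswith(" ") and line.strip() not in ("", "..."):
            rows[-1][2] += line.strip()
    return [tuple(row) for row in rows]


for percent, address, symbol in parse_function_table(SAMPLE):
    print("%5.1f %#x %s" % (percent, address, symbol))
```

The header, the dashed separator, and the "..." marker are skipped because they neither match the row pattern nor qualify as wrapped symbol text.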

Including IRT (NaCl Integrated RunTime) symbols

In the list of NaCl functions you are likely to see lines with an address of the form 0xc0fchhhhh, e.g.,

   % Address       Symbol
----- ------------- -----------------------------------------------------------
15.6 0x00c000201e0 mat4::operator*(mat4 const&) const
                   ...
 1.3 0x00c0fc32706 0xc0fc32706
 1.0 0x00c0fc4c3bc 0xc0fc4c3bc
 0.9 0x00c0fc328ed 0xc0fc328ed

These addresses are inside the Native Client IRT. To get symbol information for them, point nacl_profile to the nexe for the IRT (nacl_irt_x86_64.nexe for 64-bit). You can find the IRT in the directory where chrome.exe resides or a subdirectory typically named after the Chrome version, e.g., 15.0.874.100\.

To have nacl_profile look up the symbols corresponding to these addresses, use the --irt_file argument:

python nacl_profile.py --nexe_file=my_app_x86_64.nexe ^
   --irt_file=nacl_irt_x86_64.nexe
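
To decide quickly whether unresolved addresses in a report belong to the IRT, a check along the following lines may help. It is a sketch only: it assumes the default sandbox offset of 0xc00000000 and infers the IRT range purely from the 0xc0fc... sample addresses shown above, so treat the IRT_PREFIX constant as an assumption rather than a documented value. is_irt_address is a hypothetical helper.

```python
SANDBOX_OFFSET = 0xc00000000  # default sandbox offset assumed by nacl_profile
IRT_PREFIX = 0x0fc            # assumption: inferred from the 0xc0fc... samples


def is_irt_address(address, offset=SANDBOX_OFFSET):
    """Return True if a sampled address appears to lie inside the IRT.

    The address is first made sandbox-relative; the low 20 bits are then
    discarded and the remainder is compared with the assumed IRT prefix.
    """
    return (address - offset) >> 20 == IRT_PREFIX


for sample in (0xc0fc32706, 0xc000201e0):
    print("%#x -> %s" % (sample, "IRT" if is_irt_address(sample) else "nexe"))
```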


Kick off a process, wait a bit, and then profile

Sometimes it is convenient, e.g., from within an IDE like Visual Studio or Eclipse, to start a NaCl process with Chrome, profile it, and exit when done. This can be achieved by using --delay=n --command_line=cmd, which runs cmd, waits for n seconds, profiles for 20 seconds, and exits. E.g.,

python nacl_profile.py --nexe_file=my_app_x86_64.nexe ^
   --irt_file=nacl_irt_x86_64.nexe --delay=5 ^
   --command_line="[...]\chrome.exe http://localhost:5103/my_app.html"

We recommend binding a key in your IDE to this command, so that you can profile the application with one keystroke. E.g., in Visual Studio you can set up a build configuration to execute this when you hit Ctrl-F5 (Start Without Debugging).
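
If your IDE makes it easier to invoke a script than a long command line, a small wrapper can assemble the arguments. This is a sketch under assumptions: build_profile_command is a hypothetical helper, it uses only the flags described on this page, and it assumes that python and nacl_profile.py can be found from the working directory. The Chrome path is left elided exactly as in the example above.

```python
import subprocess


def build_profile_command(nexe_file, irt_file=None, delay=None,
                          command_line=None):
    """Build an argument list for nacl_profile.py from documented flags."""
    args = ["python", "nacl_profile.py", "--nexe_file=" + nexe_file]
    if irt_file:
        args.append("--irt_file=" + irt_file)
    if delay is not None:
        args.append("--delay=%d" % delay)
    if command_line:
        # Passed as one list element, so no extra shell quoting is needed.
        args.append("--command_line=" + command_line)
    return args


command = build_profile_command(
    "my_app_x86_64.nexe",
    irt_file="nacl_irt_x86_64.nexe",
    delay=5,
    command_line=r"[...]\chrome.exe http://localhost:5103/my_app.html",
)
print(command)
# To actually start profiling, hand the list to subprocess:
# subprocess.call(command)
```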

A note on offsets

The profiling tool assumes a standard inner sandbox offset of 0xc00000000. If your offset differs, pass it with the --offset=0xhhhhhhhh option. To find the offset, connect a debugger to the running nexe and read the contents of register r15.
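
To illustrate why the offset matters, here is a sketch of how a sampled address could be translated into a nexe-relative address and matched to a function. It is not the code used by nacl_profile. The symbol table below is hypothetical and assumes that the addresses in the sample output above are function start addresses; in practice such a table would come from the unstripped nexe.

```python
import bisect

DEFAULT_OFFSET = 0xc00000000  # standard inner sandbox offset


def to_nexe_address(sample_address, offset=DEFAULT_OFFSET):
    """Translate an absolute sample address into a nexe-relative one."""
    return sample_address - offset


def lookup_symbol(nexe_address, symbols):
    """Return the name of the last function starting at or before the address.

    symbols is a list of (start_address, name) pairs sorted by start
    address. Returns None if the address precedes every known function.
    """
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, nexe_address) - 1
    if index < 0:
        return None
    return symbols[index][1]


# Hypothetical symbol table, for illustration only.
SYMBOLS = [
    (0x201e0, "mat4::operator*(mat4 const&) const"),
    (0x24460, "Tentacle::draw"),
]

address = to_nexe_address(0xc00024470)
print("%#x %s" % (address, lookup_symbol(address, SYMBOLS)))
```

With a wrong offset the subtraction yields addresses outside the nexe, and every lookup either fails or lands on the wrong function, which is why --offset must match the value in r15.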

Get profiling data for both NaCl and other processes (e.g., Chrome, GPU process)

Sometimes, e.g., when developing applications that render 3D graphics, it is important to know if there are bottlenecks elsewhere in the system, perhaps because the NaCl process is issuing more glDraw calls than the graphics stack on the machine can handle.

By default, nacl_profile (via CodeAnalyst) collects system-wide information that can be examined in CodeAnalyst. After running nacl_profile, you’ll see a .caw file, e.g., my_app_x86_64.caw. Open this file in CodeAnalyst to examine the profiling information for all non-NaCl sandboxed processes. Your nexe will typically show up as “unknown module pid (n)”, but CodeAnalyst will not be able to look up its symbols because of the sandbox offset.



nacl_profile overwrites the data from previous runs without prompting. If you want to keep the data from a particular run, put the .caw file, the .tbp.dir directory, and optionally the raw .prd file somewhere safe before running nacl_profile again.
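
Saving a run can be scripted. The sketch below copies the three items into a separate directory. It assumes that the .tbp.dir directory and the .prd file share the same base name as the .caw file (only my_app_x86_64.caw is confirmed above), and back_up_run is a hypothetical helper.

```python
import os
import shutil


def back_up_run(stem, src_dir, dest_dir):
    """Copy <stem>.caw, <stem>.tbp.dir and <stem>.prd into dest_dir.

    Items that do not exist are skipped. Returns the copied names.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for suffix in (".caw", ".tbp.dir", ".prd"):
        name = stem + suffix
        source = os.path.join(src_dir, name)
        target = os.path.join(dest_dir, name)
        if os.path.isdir(source):
            # copytree raises an error if target already exists.
            shutil.copytree(source, target)
        elif os.path.isfile(source):
            shutil.copy2(source, target)
        else:
            continue
        copied.append(name)
    return copied
```

For example, back_up_run("my_app_x86_64", ".", "runs/baseline") would store the current run under runs/baseline before the next profiling session overwrites it. Use a fresh destination directory for each run.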

If you want to see what Chrome is doing, you first need to get the symbols. There are at least two options:
  1. Point CodeAnalyst to the Chrome symbol servers at http://chromium-browser-symsrv.commondatastorage.googleapis.com (see instructions) in the CodeAnalyst options dialog.
  2. Use Chromium, download the symbols (.pdb files), and point CodeAnalyst to the symbols that you manually downloaded.
Here is an example showing both options in CodeAnalyst:



Here is a screenshot showing the files for Chromium used in the above set-up. Notice the .pdb files:



To make profiling and analyzing both NaCl and other processes easier, nacl_profile provides the flag --run_codeanalyst, which after profiling will launch CodeAnalyst in addition to printing the usual profiling information for the NaCl module:

python nacl_profile.py --nexe_file=my_app_x86_64.nexe ^
   --irt_file=nacl_irt_x86_64.nexe ^
   --run_codeanalyst

Get profiling data for other processes only (no NaCl)

If no profiling data is needed for the nexe, nacl_profile is not necessary. Instead one can simply start up the CodeAnalyst application directly or run it from within Visual Studio.

Command line options

nacl_profile supports several additional command line options, e.g., to disable system summaries, output CSV results, and avoid collapsing individual samples into function buckets. To see the complete summary, run the following:

c:\...\> python nacl_profile.py --help
Usage: nacl_profile.py [-h] [options]

nacl_profile.py is a tool for profiling NaCl modules on 64-bit Windows.

Options:
 -h, --help            show this help message and exit
 --command_line=COMMAND_LINE
                       Optional command to run before profiling (e.g.,
                       Chrome)
 --nexe_file=NEXE_FILE
                       Path to the unstripped .nexe file being run
 --irt_file=IRT_FILE   Path to the unstripped IRT being used
 --csv                 Output results (NaCl only) in comma-separated format
 --out=OUTPUT          Output file.
 --offset=OFFSET       Sandbox memory offset for NaCl.
 --delay=DELAY         Delay in seconds before starting profiler.
 --top=TOP             Number of lines to print (default 30) or --top=all for
                       everything.
 --no_collapse         Don't collapse profiler samples into a single sample
                       for each function.
 --run_codeanalyst     Launch CodeAnalyst to see system-wide non-NaCl data
                       when done.
 --no_system_data      Don't print system-wide module/process data. If you
                       don't print the data, you can still see it by opening
                       the .caw file in CodeAnalyst.
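
As a rough illustration of what collapsing samples into function buckets means, the sketch below counts how many samples fall into each function and converts the counts into percentages. This is a guess at the general idea based on the help text, not the tool's actual implementation. collapse is a hypothetical helper, and its input is assumed to be one symbol name per sample.

```python
from collections import Counter


def collapse(sample_symbols, top=30):
    """Collapse per-sample symbols into (percent, symbol) rows.

    Rows are sorted by sample count, largest first. Pass top=None to
    keep every function, similar in spirit to --top=all.
    """
    counts = Counter(sample_symbols)
    total = sum(counts.values())
    return [(100.0 * n / total, name) for name, n in counts.most_common(top)]


samples = ["mat4::operator*", "Tentacle::draw", "mat4::operator*",
           "__kernel_sinf", "mat4::operator*"]
for percent, name in collapse(samples):
    print("%5.1f %s" % (percent, name))
```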

Reporting performance issues

If you feel you've run into a performance issue in Chrome or Native Client, you can report it by creating an issue. Please be sure to include
  • the .caw file,
  • the entire .tbp.dir directory,
  • the raw .prd file,
  • a URL to the Native Client app (using optimized, but non-stripped nexes),
  • a detailed description of the issue (what do you see vs. what did you expect), and
  • all information shown under chrome://version.

Other profiling tools

  • In Chrome: navigate to chrome://gpu to see various debug output from the GPU process. This page will mention if hardware acceleration is active and show log messages at the bottom.
  • In Chrome: navigate to chrome://tracing to profile hardware accelerated graphics. Click “Record”, switch to the tab that renders graphics for 5-10 seconds, and switch back to the profiling tab.
  • In Chrome: navigate to chrome://memory to see process ids, memory use, etc.
  • To profile hardware accelerated graphics (i.e., DirectX on Windows), you can use PIX. See instructions for doing PIX profiling for apps running inside Chrome.
  • In-nexe profiling: look for the xray profiler in the Native Client source tree. The xray profiler is not in the SDK at this time.
nacl_profile.zip (4k), Christian Stefansen, Oct 22, 2011, 1:57 PM