Debugging x86-64 Native Client Modules on Linux with nacl64-gdb


Overview

This page describes nacl64-gdb, an experimental GDB-based debugger built specifically for debugging 64-bit Native Client modules on x86_64 Linux systems.

For debugging 32-bit Native Client modules, refer to the corresponding documentation.

Installation

Getting the Source

nacl64-gdb is not officially included in the toolchain or SDK. You can get it from the nacl branch of the nacl-gdb repository:

git clone http://git.chromium.org/native_client/nacl-gdb.git
cd nacl-gdb
git checkout nacl

Building

Configure and build it as a native debugger:

mkdir <build-dir>
cd <build-dir>
<src-dir>/configure --prefix=<install-dir>
make all-gdb
make install-gdb

Now you have <install-dir>/bin/gdb. Copy or link it to nacl64-gdb for convenience.

Usage

The debugger works by debugging the service runtime (see Glossary) together with the NaCl program. For debugging purposes, simply treat the NaCl program as just another shared library loaded by the service runtime.

ATTENTION: the service runtime and the NaCl program are likely to have identically named symbols; for example, a main function exists in both. This makes setting a breakpoint on a particular main somewhat tricky. See the detailed solutions below.

Debugging in sel_ldr

First, specify the service runtime binary in the usual way:

(gdb) file ./sel_ldr

Specify the full command line for the service runtime: the service runtime's command-line arguments, the NaCl binary, and the NaCl program's command-line arguments (if applicable). With sel_ldr, we recommend using -Q to disable the platform qualification test; otherwise you might get a SIGSEGV at startup (though you can simply continue from there):

(gdb) set args -Q ./t.nexe

Unfortunately, for some service runtimes there is no way to deduce the NaCl program name from the service runtime command line. For example, Chrome gets the NaCl program name from the HTML page.

For now, you always have to specify the NaCl binary explicitly, with this new gdb command:

(gdb) nacl-file ./t.nexe

Here is another example, for debugging a dynamically linked NaCl executable. In this case the file passed to nacl-file is the dynamic loader:

(gdb) set args -Q -c -a -f ./runnable-ld.so -- ./t.out
(gdb) nacl-file ./runnable-ld.so

Typing all of the above for every session can get tedious. Put the commands in a gdb script and pass it to gdb with the --command=<script> command-line option.
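
As a minimal sketch of that advice, the shell snippet below writes the three setup commands from this section into a gdb command file. The function name write_nacl_gdb_script and the file name nacl64.gdb are made up for illustration, and ./sel_ldr and ./t.nexe are the placeholder paths used in the examples above.

```shell
#!/bin/sh
# Sketch only: write_nacl_gdb_script and nacl64.gdb are hypothetical names.
# The gdb commands themselves (file, set args, nacl-file) are the ones
# shown earlier in this section.
write_nacl_gdb_script() {
    script=$1    # gdb command file to create
    runtime=$2   # service runtime binary, e.g. ./sel_ldr
    nexe=$3      # statically linked NaCl program, e.g. ./t.nexe
    cat > "$script" <<EOF
file $runtime
set args -Q $nexe
nacl-file $nexe
EOF
}

write_nacl_gdb_script nacl64.gdb ./sel_ldr ./t.nexe
cat nacl64.gdb
```

With the file in place, the session would presumably be started as nacl64-gdb --command=nacl64.gdb, leaving only breakpoints and "r" to type by hand.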

Debugging in Chrome

WORK IN PROGRESS. These notes are incomplete. Follow-up questions on debugging are welcome on native-client-discuss@googlegroups.com.

Getting satisfactory results from nacl64-gdb requires binaries with symbolic information for your NaCl module. As with any debugging situation, NaCl modules should be built with -g and not stripped. If you want symbolic information for the browser as well, you can find release builds of Chromium with symbolic information at http://commondatastorage.googleapis.com/chromium-browser-continuous/index.html. A typical workflow:
  • Remove nacl_helper_bootstrap from your Chromium installation. This causes Chrome to use the chrome binary, rather than nacl_helper, as the container for Native Client modules.
  • Start nacl64-gdb, specifying your Chromium binary as the debug target.
  • Use the "nacl-file" command as described above to load symbols for your Native Client module.
  • In a different terminal window from your nacl64-gdb session, start Chrome (or Chromium) from the command line.
  • Once Chrome is launched, open the Chrome task manager (wrench icon -> Tools -> Task Manager). Right click on the header and add the "Process ID" field to the display.
  • Now start your Native Client module in Chrome. When you see "Native Client module" in the task manager, suspend the browser (^Z) so that you can attach the debugger before anything interesting happens.
  • In nacl64-gdb, use "attach <process-id>" to attach the debugger to the Native Client module. Once you have attached, you can un-suspend Chrome and continue the Native Client process using "cont" in GDB.
This workflow is helpful when a crash happens relatively early, for example during initialization, and you would like to capture the crash in the debugger. You can skip the step of suspending Chrome when hunting for bugs that happen in response to a user action.
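
The debugger side of this workflow can be captured in a command file as well. The sketch below is an assumption-laden illustration, not a tested recipe: write_attach_script and attach.gdb are hypothetical names, ./chrome and ./t.nexe stand in for your Chromium binary and NaCl module, and 12345 stands in for the process ID read from the Chrome task manager.

```shell
#!/bin/sh
# Sketch only: all names, paths and the PID below are placeholders.
# The commands mirror the workflow steps: load the Chromium binary,
# load the NaCl module symbols with nacl-file, then attach.
write_attach_script() {
    script=$1   # gdb command file to create
    chrome=$2   # Chromium binary used as the debug target
    nexe=$3     # NaCl module built with -g and not stripped
    pid=$4      # process ID of the "Native Client module" entry
    cat > "$script" <<EOF
file $chrome
nacl-file $nexe
attach $pid
EOF
}

write_attach_script attach.gdb ./chrome ./t.nexe 12345
cat attach.gdb
```

After running nacl64-gdb --command=attach.gdb against the suspended browser, un-suspend Chrome and type "cont" at the gdb prompt, as described in the last workflow step.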

Another common debug scenario is an infinite loop. Keep in mind that sometimes a profiler can be a useful tool for finding such problems. See the Native Client developer pages on profiling. 

Setting Breakpoints

If the symbol is unambiguous, set the breakpoint as usual:

(gdb) b NaClCreateMainThread
Breakpoint 1 at 0x259f8: file src/trusted/service_runtime/sel_ldr_standard.c, line 645.

If you set a breakpoint on a NaCl symbol before the NaCl program is loaded, gdb asks whether the breakpoint should be made pending on a future shared library load. Recall that the NaCl program is treated as just another shared library, and answer 'y':

(gdb) b hello_world
Function "hello_world" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (hello_world) pending.

If the symbol is ambiguous, the breakpoint is set on whichever symbol the lookup finds, which depends on the current context. The result may not be stable: breakpoints can migrate when they are re-set, for example at program restart. To avoid this, specify the source file as well as the symbol name:

(gdb) b sel_main.c:main
Breakpoint 3 at 0x7ffff7f640a8: file src/trusted/service_runtime/sel_main.c, line 147.
(gdb) b t.c:main
Breakpoint 4 at 0x7ff4010005e4: file t.c, line 19.
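
File-qualified breakpoints are also convenient to keep in a gdb command file. The sketch below generates one; emit_breakpoints and breakpoints.gdb are hypothetical names. It additionally emits "set breakpoint pending on", a standard GDB setting that is not mentioned on this page, so that breakpoints on not-yet-loaded NaCl symbols become pending without the interactive y/n question. That nacl64-gdb handles this setting the same way as upstream GDB is an assumption.

```shell
#!/bin/sh
# Sketch only: emit_breakpoints and breakpoints.gdb are hypothetical names.
# Each argument is a FILE:FUNCTION location, as recommended above for
# symbols that exist in both the service runtime and the NaCl program.
emit_breakpoints() {
    # Standard GDB setting (assumed to behave the same in nacl64-gdb):
    # create pending breakpoints without asking.
    echo "set breakpoint pending on"
    for loc in "$@"; do
        echo "break $loc"
    done
}

emit_breakpoints sel_main.c:main t.c:main > breakpoints.gdb
cat breakpoints.gdb
```

The resulting file can be passed with --command=breakpoints.gdb or loaded from an existing session with gdb's source command.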


Sample session
 
Here is an example of debugging a dynamically linked executable:


(gdb) file ./sel_ldr
Reading symbols from /home/user/sel_ldr...done.

(gdb) set args -a -Q /home/user/nacl/toolchain/lib64/runnable-ld.so --library-path /home/user/nacl/toolchain/lib64 ./b.out

(gdb) nacl-file /home/user/nacl/toolchain/lib64/runnable-ld.so

(gdb) b sel_main.c:main
Breakpoint 1 at 0x150a8: file src/trusted/service_runtime/sel_main.c, line 147.

(gdb) b t.c:main
No source file named t.c.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (t.c:main) pending.

(gdb) r
Starting program: /home/user/sel_ldr -a -Q /home/user/nacl/toolchain/lib64/runnable-ld.so --library-path /home/user/nacl/toolchain/lib64 ./b.out
[Thread debugging using libthread_db enabled]

Breakpoint 1, main (argc=7, argv=0x7fffffffe048) at src/trusted/service_runtime/sel_main.c:147
147 int main(int  argc,

(gdb) li
142          " -Q disable platform qualification (dangerous!)\n"
143          " -E <name=value>|<name> set an environment variable\n"
144          );  /* easier to add new flags/lines */
145 }
146
147 int main(int  argc,
148         char **argv) {
149  int                           opt;
150  char                          *rest;
151  struct redir                  *entry;

(gdb) bt
#0  main (argc=7, argv=0x7fffffffe048) at src/trusted/service_runtime/sel_main.c:147

(gdb) c
Continuing.
DEBUG MODE ENABLED (bypass acl)
PLATFORM QUALIFICATION DISABLED BY -Q - Native Client's sandbox will be unreliable!
[21206,4159854368:19:11:31.579376] BYPASSING ALL ACL CHECKS
[21206,4159854368:19:11:31.629045] Entered NaClMakeDispatchThunk
[21206,4159854368:19:11:31.629066] NaCl_page_alloc_randomized: 0x6d9e0d18
[21206,4159854368:19:11:31.629072] NaCl_page_alloc_randomized: hint 0x6d9e0d180000
[21206,4159854368:19:11:31.629080] NaClMakeDispatchThunk: got addr 0x6d9e0d180000
[New Thread 0x7ffff7f47700 (LWP 21209)]
[New Thread 0x7ffff7f24700 (LWP 21210)]
[21206,4159850240:19:11:31.637227] munmap: rounded length to 0x10000
[21206,4159850240:19:11:31.637249] invalid mmap flags 04022, ignoring extraneous bits
[21206,4159850240:19:11:31.637267] invalid mmap flags 04022, ignoring extraneous bits
[21206,4159850240:19:11:31.652784] NaClHostDescOpen: open returned -1, errno 2
[21206,4159850240:19:11:31.668011] munmap: rounded length to 0x10000
[21206,4159850240:19:11:31.668035] invalid mmap flags 04022, ignoring extraneous bits
[21206,4159850240:19:11:31.668050] invalid mmap flags 04022, ignoring extraneous bits
[21206,4159850240:19:11:32.109906] munmap: rounded length to 0x1b0000
[21206,4159850240:19:11:32.109983] invalid mmap flags 04022, ignoring extraneous bits
[21206,4159850240:19:11:32.110005] invalid mmap flags 04022, ignoring extraneous bits
[Switching to Thread 0x7ffff7f24700 (LWP 21210)]

Breakpoint 2, main () at t.c:19
19  hello_world();

(gdb) li
14 {
15  printf("Hello, world!\n");
16 }
17
18 int main() {
19  hello_world();
20  return 0;
21 }

(gdb) bt
#0  main () at t.c:19
#1  0x00007ff4010707a0 in __libc_start_main (main=<optimized out>, argc=<optimized out>, ubp_av=<optimized out>, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ff4ffffff00) at libc-start.c:227
#2  0x00007ff4010003e0 in _start (info=<optimized out>) at ../sysdeps/nacl/start.c:39
#3  0x00007ff4000210e0 in _dl_start (arg=<optimized out>) at rtld.c:589
#4  0x0000000000000000 in ?? ()

(gdb) c
Continuing.
Hello, world!
[Thread 0x7ffff7f47700 (LWP 21209) exited]
[Thread 0x7ffff7f24700 (LWP 21210) exited]
[Inferior 1 (process 21206) exited normally]

That's it.

Known Issues

The major known issue is that nacl64-gdb is not stable. Please provide feedback and report errors!
