The current implementation is Win32-only, but the interface should be the same across all OSes. To handle exceptions, we run sel_ldr under a custom debugger process. This debugger process handles only sel_ldr exceptions and does not have full-blown debugger functionality. Information about the exception handler is stored by two syscalls in the fields NaClApp.exception_handler and NaClThreadContext.exception_stack. When an exception happens, the debugger receives a debug event. It then checks that the exception happened in untrusted code and that these two fields are correct. If everything is OK, it transfers control to the exception_handler address and continues sel_ldr execution. If exception_stack is not zero, the stack pointer is set to it; otherwise it remains unchanged. Then the previous value of the stack pointer (esp) and eip, together with a zero return address, are pushed onto the stack.
The return address is zero because it is hard to return to the interrupted program in NaCl, which doesn't have a ret instruction. To return, the handler would have to clobber some register. A syscall can't save us here, because the springboard code clobbers registers too. Additionally, we can't return to a bundle-unaligned address, but exceptions can happen anywhere.
All other registers are left unchanged, so the handler can save them if necessary.
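The dispatch steps described above can be sketched as a small simulation. This is not the sel_ldr code: SimSandbox, SimRegs and sim_dispatch_exception are hypothetical names, untrusted memory is modeled as a plain array, and the frame layout (return address at the lowest address, then eip, then the old esp) is an assumption derived from the handler prototype and the usual x86 cdecl argument order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model; none of these names come from sel_ldr. */
enum { SIM_MEM_BYTES = 4096 };

typedef struct {
    uint32_t eip;
    uint32_t esp;
} SimRegs;

typedef struct {
    uint32_t mem[SIM_MEM_BYTES / 4]; /* simulated untrusted memory, word-addressed */
    uint32_t exception_handler;      /* models NaClApp.exception_handler */
    uint32_t exception_stack;        /* models NaClThreadContext.exception_stack */
} SimSandbox;

/* Store one 32-bit value below *sp, as an x86 push would. */
static void sim_push(SimSandbox *sb, uint32_t *sp, uint32_t value) {
    *sp -= 4;
    sb->mem[*sp / 4] = value;
}

/*
 * Redirect the faulting thread to the registered handler.
 * Returns 1 if control was redirected, 0 if the exception must be
 * treated as fatal (trusted code, no handler, or unusable stack).
 */
int sim_dispatch_exception(SimSandbox *sb, SimRegs *regs,
                           int fault_in_untrusted_code) {
    uint32_t old_eip = regs->eip;
    uint32_t old_esp = regs->esp;
    uint32_t sp;

    if (!fault_in_untrusted_code || sb->exception_handler == 0) {
        return 0;
    }
    /* Switch stacks only if an alternative stack was registered. */
    sp = (sb->exception_stack != 0) ? sb->exception_stack : old_esp;

    /* The frame needs 12 bytes inside the simulated memory, 4-byte aligned. */
    if (sp % 4 != 0 || sp < 12 || sp > SIM_MEM_BYTES) {
        return 0;
    }
    sim_push(sb, &sp, old_esp); /* second handler argument */
    sim_push(sb, &sp, old_eip); /* first handler argument */
    sim_push(sb, &sp, 0);       /* zero return address: handler must not return */

    regs->eip = sb->exception_handler;
    regs->esp = sp;
    return 1;
}
```

Only eip and esp are modified in this model, which mirrors the statement that all other registers reach the handler unchanged.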
Problem: how are we going to handle nested exceptions? In the case of a non-zero exception_stack, the current implementation simply overwrites the previous stack contents. This could easily lead to an infinite exception loop, which is worse than a crash.
We can create a flag that is set by the debugger when an exception happens. The exception handler can clear this flag with a syscall when it transfers control to the catch block. But what should we do on a nested exception? We can either call the handler on the same stack or terminate the program.
If we call the handler on the same stack, we face a new range of problems. How will the handler notice that this is a nested exception? How should the flag-clearing function work? I propose adding an additional parameter that is set to 1 if the exception is nested; the nested exception handler shouldn't clear the exception flag if it resumes handling of the first exception. Triple exceptions are indistinguishable from double exceptions in this scheme.
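The flag scheme above can be sketched as a tiny state machine. All names here (ThreadExcState, NestedPolicy, ExcAction, on_exception, clear_exception_flag) are hypothetical and chosen only for illustration; the sketch covers both options for nested exceptions so they can be compared.

```c
#include <assert.h>

/* What to do when an exception arrives while the flag is already set. */
typedef enum {
    NESTED_TERMINATE,    /* abort the program */
    NESTED_CALL_HANDLER  /* call the handler again on the same stack */
} NestedPolicy;

typedef enum {
    ACTION_CALL_HANDLER,         /* first exception, nested parameter = 0 */
    ACTION_CALL_HANDLER_NESTED,  /* nested exception, nested parameter = 1 */
    ACTION_TERMINATE
} ExcAction;

/* Per-thread state; the flag is set by the debugger side only. */
typedef struct {
    int in_exception;
} ThreadExcState;

/* Debugger side: decide what to do with a newly reported exception. */
ExcAction on_exception(ThreadExcState *t, NestedPolicy policy) {
    if (!t->in_exception) {
        t->in_exception = 1;
        return ACTION_CALL_HANDLER;
    }
    /* The flag is a single bit, so a third-level exception takes the
     * same path as a second-level one. */
    return (policy == NESTED_TERMINATE) ? ACTION_TERMINATE
                                        : ACTION_CALL_HANDLER_NESTED;
}

/* Models the syscall the handler issues before jumping to the catch block. */
void clear_exception_flag(ThreadExcState *t) {
    t->in_exception = 0;
}
```

With NESTED_CALL_HANDLER, a handler that was entered with the nested parameter set and intends to resume handling of the first exception would skip clear_exception_flag, as proposed above.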
If we save both the stack address and the stack size instead of just the end of the stack, the debugger can check whether the exception happened while the stack pointer pointed into this area. Then we again have to choose whether to call the handler on the same stack or terminate the program. If the nested exception handler is called on the same stack, we can set the additional parameter to 1.
If the user hasn't set an alternative stack for exception handling, nested exceptions will be indistinguishable from normal ones.
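The stack-range check can be sketched as follows. The function name and parameters are hypothetical. The half-open interval is an assumption: once the handler has been entered, the pushed frame keeps esp strictly below the top of the exception stack, so an esp equal to the top is treated as outside.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if esp lies inside [stack_base, stack_base + stack_size),
 * which is taken to mean the exception was raised while an exception
 * handler was running on the alternative stack.
 * A stack_size of 0 models "no alternative stack registered"; in that
 * case nested exceptions cannot be detected and the result is 0.
 */
int esp_in_exception_stack(uint32_t esp, uint32_t stack_base,
                           uint32_t stack_size) {
    if (stack_size == 0 || esp < stack_base) {
        return 0;
    }
    /* Compare offsets rather than end addresses to avoid overflow of
     * stack_base + stack_size near the top of the address space. */
    return (esp - stack_base) < stack_size;
}
```

The debugger would call this with the esp of the faulting thread before switching stacks, and then either terminate the program or call the handler with the nested parameter set to 1.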
We can transfer control to another thread and stop the thread that raised the exception. The exception-handling thread should change the context of the stopped thread and then resume its execution. We need to create a NaCl-safe context manipulation API to prevent a sandbox breach. What should we do when an exception happens in the exception-handling thread? How should the exception-handling thread handle concurrent exceptions, or should we create an exception-handling thread automatically?
My current implementation uses the flag approach and aborts the program on nested exceptions. Syscall prototypes:
exception_handler (void (*handler) (int eip, int esp));
exception_stack (void *stack);