
Bitcode ABI

This document describes the requirements that bitcode files must meet to be compatible with PNaCl.  The accepted format will be a subset of LLVM bitcode, together with PNaCl-specific metadata.  The LLVM language reference is available at http://llvm.org/docs/LangRef.html.


NOTE: This document is a draft, and is subject to change while PNaCl remains under development. 



File Types

PNaCl recognizes three kinds of bitcode files:

Object files (.po)

Object files represent application code which has not yet been linked together. These files may contain symbols which are declared, but not defined.

Shared object files (.pso)

Shared object files represent application code which is intended to be shared between multiple applications. Normally, shared object files are produced by a linker which adds additional metadata to the bitcode file (see PNaCl-specific Metadata for details).

Executable files (.pexe)


Executable files represent a complete application. They are produced by a linker which adds additional metadata to the bitcode file (see PNaCl-specific Metadata for details). Executable files should have no undefined symbols, except those which are defined in an explicit external library dependency.



NOTE: The filename extensions are not actually required, but they are useful as shorthand for each file type.  You may continue to use ".o" instead of ".po" in Makefiles for object files, even though they are bitcode.

Module Properties


Target Datalayout

Only the fields that identify PNaCl as little-endian and ILP32 are set, for the purposes of bitcode optimization.

target datalayout = "e-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-p:32:32:32-v128:128:128"

Target Triple

target triple = "le32-unknown-nacl"

PNaCl-specific Metadata for Shared Linking (NOTE: not part of V1 ABI)

OutputFormat

object, shared, or executable. Stored using named metadata.

deplibs

List of library dependencies. Present for "shared" and "executable" files. Stored using existing field.

SOName

SOName of this library. Present only for "shared" files. Stored using named metadata.

Versioning

TODO: After this is implemented, document it.

Assembly

No assembly is permitted, including both module level and inline assembly.

Supported Intrinsic Variables, Instructions, and Functions

Bitcode may use any architecture-neutral LLVM intrinsics.

Compiler intrinsics
llvm.used
llvm.compiler.used
TEST=any test (we use this for various things)

llvm.global_ctors
llvm.global_dtors
TEST=native_client/tests/toolchain/initfini*  (indirectly)

Variable argument intrinsics (see below for more information)
llvm.va_start
llvm.va_end
llvm.va_copy
TEST=see any printf test, or native_client/tests/callingconv  (indirectly)

LibC intrinsics

llvm.memcpy.*  (TODO document alignment restrictions for memcpy, memmove, memset)
llvm.memmove.* 
llvm.memset.*
llvm.sqrt.*  (This is only valid for numbers >= -0.0)
llvm.powi.*
llvm.sin.*
llvm.cos.*
llvm.pow.*
llvm.exp.*
llvm.log.*
llvm.fma.* (can we really support this, or can we only support llvm.fmuladd.*?)
llvm.fabs.*
llvm.floor.*
TEST=TODO -- partly covered by native_client/tests/math/float_math (indirectly)

Bit Manipulation Intrinsics

llvm.bswap.*
llvm.ctpop.*
llvm.ctlz.*
llvm.cttz.*
TEST=native_client/tests/toolchain/llvm_bitmanip_intrinsics.c
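
C code usually reaches the bit manipulation intrinsics through compiler builtins rather than by naming them. The sketch below assumes a Clang- or GCC-compatible front end. The helper names are hypothetical, and the intrinsic named in each comment is the usual Clang lowering, not something this ABI guarantees.

```c
#include <stdint.h>

/* Hypothetical helpers. Each wraps one Clang/GCC builtin; the comment
   names the LLVM intrinsic that Clang normally emits for it. */

/* llvm.bswap.i32: reverse the byte order of a 32-bit value. */
uint32_t swap_bytes(uint32_t x) {
    return __builtin_bswap32(x);
}

/* llvm.ctpop.i32: count the bits that are set. */
int count_set_bits(uint32_t x) {
    return __builtin_popcount(x);
}

/* llvm.ctlz.i32: count leading zero bits.
   The builtin is undefined for 0, so that case is handled here. */
int leading_zeros(uint32_t x) {
    return x ? __builtin_clz(x) : 32;
}

/* llvm.cttz.i32: count trailing zero bits.
   The builtin is undefined for 0, so that case is handled here. */
int trailing_zeros(uint32_t x) {
    return x ? __builtin_ctz(x) : 32;
}
```

The explicit zero checks keep the helpers well defined in C, where the underlying builtins leave a zero input undefined.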

Arithmetic with Overflow Intrinsics
llvm.sadd.with.overflow.*
TEST=TODO
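
As an illustration of where llvm.sadd.with.overflow.* tends to come from, the following sketch uses the Clang/GCC builtin __builtin_sadd_overflow, which Clang normally lowers to llvm.sadd.with.overflow.i32. The helper name checked_add is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: adds two ints and reports overflow instead of
   relying on wraparound. Returns true when *sum holds the exact result.
   __builtin_sadd_overflow itself returns true when overflow occurred. */
bool checked_add(int a, int b, int *sum) {
    return !__builtin_sadd_overflow(a, b, sum);
}
```

A caller would test the return value before using the sum, for example `if (!checked_add(x, y, &r)) { /* handle overflow */ }`.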


Specialized Arithmetic Intrinsics
llvm.fmuladd.*  (similar to llvm.fma.*, but does not guarantee a fused multiply add)
TEST=TODO

Debugger Intrinsics
Supported for transient bitcode files during development/debug sessions.  The format is close to DWARF but may change, so it is not guaranteed to be supported.  Run "pnacl-strip --strip-debug" before shipping any bitcode.
TEST=TODO

Exception Handling Intrinsics
llvm.eh.*
TEST=tests/toolchain/eh_* (indirectly)

General Intrinsics
llvm.trap
llvm.expect
llvm.donothing
TEST=TODO

Unsupported Intrinsics

Half Precision Floating Point Intrinsics
llvm.convert.to.fp16
llvm.convert.from.fp16

    Trampoline Intrinsics
    llvm.init.trampoline
    llvm.adjust.trampoline

    Used to support nested functions (see nest parameter attribute).  llvm.init.trampoline currently expects a target specific trampoline size and alignment.  Perhaps a "large enough" trampoline size will suffice, but this has not been tested.

    General Intrinsics
    llvm.stackprotector

    Data Types

    Endianness

    Bitcode will assume that the machine is little-endian.

    Primitive Types

    Supported bitcode types: i8, i16, i32, i64, float, double, pointers, arrays, void, metadata.
    No other primitive types (like x86_fp80, fp128, ppc_fp128) are supported yet.

    Derived Types

    Aggregate types (arrays, structures, and opaque structures).
    Vectors may not make it into V1 (TODO) for ABI reasons.
    Pointers to other types.
    Functions.

    C Data Types

     C Type             Size (bytes)   Alignment (bytes)   Bitcode type
     void               undefined      undefined           void
     char               1              1                   i8
     short              2              2                   i16
     int                4              4                   i32
     long               4              4                   i32
     long long          8              8                   i64
     float              4              4                   float
     double             8              8                   double
     long double        8              8                   double
     void*              4              4                   i8*
     function pointer   4              4                   function pointer type (specific to function signature)
     va_list            16             4                   4 x i32
     struct             -              -                   Use bitcode struct type (see the LLVM struct reference)
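
The sizes and alignments listed for C types can be checked at compile time. The sketch below is one way to do so, assuming the PNaCl front end predefines the macro __pnacl__ (an assumption here, not something this document states). On other compilers the checks are skipped, because host sizes legitimately differ (for example, long is 8 bytes on LP64 systems). The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: __pnacl__ is predefined when compiling to PNaCl bitcode.
   The asserted values are the ones this document lists for C types. */
#if defined(__pnacl__)
_Static_assert(sizeof(long) == 4, "long is 4 bytes");
_Static_assert(sizeof(long long) == 8, "long long is 8 bytes");
_Static_assert(_Alignof(long long) == 8, "long long is 8-byte aligned");
_Static_assert(sizeof(long double) == 8, "long double is 8 bytes");
_Static_assert(sizeof(void *) == 4, "pointers are 4 bytes");
#endif

/* Hypothetical record that uses fixed-width types, so its field sizes
   do not depend on how a given target defines long or long double. */
struct record {
    int32_t id;
    int64_t timestamp;
    double  value;
};

/* Sum of the field sizes, excluding any padding the compiler inserts. */
unsigned record_payload_bytes(void) {
    return (unsigned)(sizeof(int32_t) + sizeof(int64_t) + sizeof(double));
}
```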

    Bitfields

    Assumes little endian layout (double check).

    Attributes

    Supported function attributes

    • alwaysinline
    • inlinehint
    • naked
    • noinline
    • noreturn
    • nounwind
    • optsize
    • readnone
    • readonly
    • returns_twice

    Unsupported function attributes

    • alignstack(<n>)
    • is_nsdialect
    • nonlazybind
    • ssp
    • sspreq

    Supported function parameter attributes

    • zeroext
    • signext
    • inreg (supported, but not guaranteed to have any effect. At most the first 2 arguments may be labeled inreg, and only if they are of integral type; alternatively, a single 64-bit integer may be labeled inreg. The first non-integral type in the argument sequence consumes the remaining available registers.)
    • byval
    • sret
    • noalias
    • nocapture

      Unsupported function parameter attributes

      • nest (see trampoline intrinsics) -- this is used for nested functions, but the trampoline area size and alignment are currently target-specific, so this needs investigation.

        Supported linkage types

        private, linker_private, linker_private_weak, linker_private_weak_auto, internal, available_externally, linkonce, weak, common, appending, extern_weak, linkonce_odr, weak_odr, externally_visible

        Supported calling conventions

        ccc, fastcc, coldcc



        Calling Conventions

        Stack Variables

        Bitcode must not make any assumption about stack direction, alignment, or layout.
        Always use alloca to allocate space on the stack.

        Function Arguments

        When invoking a function, do not lower or expand function arguments. Always use correctly typed arguments.

        For example, here is the correct signature of strncmp:

        declare i32 @strncmp(i8*, i8*, i32) nounwind readonly

        This C code:

        long long example(long long x, double y);

        Should produce:

        declare i64 @example(i64 %x, double %y) nounwind

        Passing Structures by Value

        Structures passed as arguments (by value) must be represented as a single argument to the function.

        %struct.foo = type { i32, i8 }
        declare void @example(%struct.foo* nocapture byval %arg) nounwind

        How the backend for a particular architecture actually lowers byval is backend-specific.  This may or may not match the System V ABI.

        Returning Structures

        Functions returning structures must use the sret attribute.

        declare void @example(%struct.foo* sret %result) nounwind
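
At the C level, the byval and sret rules correspond to ordinary by-value struct parameters and struct return values. The sketch below shows C source that a front end following this ABI would be expected to translate into a single byval pointer argument and a single sret pointer argument, respectively. The field and function names are hypothetical.

```c
/* Mirrors %struct.foo = type { i32, i8 } from the bitcode examples. */
struct foo {
    int  count;
    char tag;
};

/* Struct passed by value: expected to appear in bitcode as one
   %struct.foo* argument marked byval, not as separate i32 and i8 values. */
int foo_total(struct foo arg) {
    return arg.count + arg.tag;
}

/* Struct returned by value: expected to appear in bitcode as a void
   function whose %struct.foo* argument is marked sret. */
struct foo make_foo(int count, char tag) {
    struct foo result = { count, tag };
    return result;
}
```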

        Variadic Functions 

        Use "..." in bitcode to denote functions with variable arguments.

        declare i32 @printf(i8*, ...) nounwind

        Variadic functions should use the built-in LLVM intrinsics for accessing variable arguments.

        declare void @llvm.va_start(i32*) nounwind
        declare void @llvm.va_copy(i32*, i32*) nounwind
        declare void @llvm.va_end(i32*) nounwind

        %0 = va_arg i32* %ap, i32
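
For reference, the C source that gives rise to these intrinsics looks like the sketch below. It uses only the standard <stdarg.h> macros; the function name sum_and_max is hypothetical, and the intrinsic named in each comment is the usual Clang lowering.

```c
#include <stdarg.h>

/* Hypothetical example: sums `count` int arguments and also reports the
   largest one. The arguments are walked twice, the second time through
   a copy of the va_list, so that va_copy is exercised as well. */
int sum_and_max(int *max_out, int count, ...) {
    va_list ap, ap_copy;
    int total = 0;

    va_start(ap, count);        /* llvm.va_start */
    va_copy(ap_copy, ap);       /* llvm.va_copy */

    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* va_arg instruction */
    va_end(ap);                 /* llvm.va_end */

    for (int i = 0; i < count; i++) {
        int v = va_arg(ap_copy, int);
        if (i == 0 || v > *max_out)
            *max_out = v;
    }
    va_end(ap_copy);

    return total;
}
```

When count is 0, *max_out is left untouched.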

        Setjmp/Longjmp

        setjmp and longjmp are provided as functions (TODO: do we want to use intrinsics instead?)
        struct jmp_buf is currently padded to be a very large buffer (to handle exotic architectures).
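
Because setjmp and longjmp are ordinary functions here, portable C code uses them exactly as on other platforms and never inspects the contents of jmp_buf. A minimal sketch, with hypothetical function and variable names:

```c
#include <setjmp.h>

static jmp_buf recover_point;       /* treated as opaque; its size is target-defined */
static volatile int last_error;     /* carries the error code across the jump */

/* Hypothetical failure path: record the code and unwind to run_guarded. */
static void fail_with(int code) {
    last_error = code;
    longjmp(recover_point, 1);
}

/* Returns 0 on the normal path, or the code passed to fail_with. */
int run_guarded(int code) {
    if (setjmp(recover_point) != 0)
        return last_error;          /* reached via longjmp */
    if (code != 0)
        fail_with(code);
    return 0;
}
```

The error code travels through a volatile static rather than through the setjmp return value, which keeps the setjmp call in a plain comparison as the C standard requires.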

        C++

        Name Mangling

        We use the Itanium C++ mangling scheme.

        Initialization

        Constructors and destructors for static objects are listed in the bitcode arrays @llvm.global_ctors and @llvm.global_dtors.
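
The same arrays are also populated from C. As a sketch, the Clang/GCC attribute `constructor` marks a function to run before main, and Clang records such functions in @llvm.global_ctors (likewise `destructor` and @llvm.global_dtors). The names below are hypothetical.

```c
static int registry_ready = 0;

/* Clang/GCC extension: run this function before main. Clang lists it in
   @llvm.global_ctors, the same array used for C++ static constructors. */
__attribute__((constructor))
static void init_registry(void) {
    registry_ready = 1;
}

/* Reports whether the constructor has already run. */
int registry_is_ready(void) {
    return registry_ready;
}
```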

        Exception tables / Unwind Information

        Unwind information (e.g., DWARF) is not directly exposed to bitcode.  It is only generated into the .o file.

        C++ should invoke LLVM intrinsics and functions from libgcc_eh to perform exception handling.

        Bitcode can assume that the functions defined in libgcc_eh.a (or libgcc_s.so) are present.

        LLVM instructions and intrinsics supporting unwinding:
        invoke
        landingpad
        resume
        TEST=native_client/tests/toolchain/eh_*

        llvm.returnaddress
        TEST=native_client/tests/toolchain/return_address.c




        NOTE: more work is needed to make the "struct _Unwind_Exception" truly platform neutral


        Debugging Information

        LLVM debug metadata is not part of the stable PNaCl bitcode ABI. For debugging, we recommend only having debug information in nexes and not pexes.  I.e., run pnacl-translate to convert pexes to nexes before running a debugger.  Before shipping, run pnacl-finalize to strip debug info from the bitcode.

        Documentation on LLVM debug intrinsics:
        http://llvm.org/docs/SourceLevelDebugging.html#format_common_intrinsics

        Operating System Interface

        Startup

        Bitcode must not define _start. Instead, bitcode applications should define main.

        Main

        main has the following signature:

        define i32 @main(i32 %argc, i8** %argv, i8** %envp) nounwind


