MemorySanitizer

MemorySanitizer (MSan) is a tool for detecting uses of uninitialized memory. 
MSan is EXPERIMENTAL, currently supported on x86_64 Linux only.
Additional info on the tool is available at http://clang.llvm.org/docs/MemorySanitizer.html.

MSan bots are running on chromium.memory.fyi and client.webrtc. There are also two LKGR builders for ClusterFuzz: no origins, chained origins (see below for explanation). Blink and V8 deployment is ongoing.
Note: If your build includes instrumented libraries (see below), you need to install build dependencies for those before building:

sudo third_party/instrumented_libraries/install-build-deps.sh

This is how we currently build:

GYP_DEFINES='use_goma=1 msan=1 use_instrumented_libraries=1 instrumented_libraries_jobs=20' gclient runhooks
ninja -C out/Release base_unittests

Flags breakdown:
  • msan=1: Enable MSan support in Chromium binaries.
  • use_instrumented_libraries=1: shared libraries which Chrome depends on will be rebuilt from source with MSan instrumentation. Without this flag MSan will produce lots of false reports, making it unusable. This flag is EXPERIMENTAL. Additionally, it's not currently guaranteed to work correctly on systems other than Ubuntu Precise and Ubuntu Trusty.
  • instrumented_libraries_jobs=20: Use multiple parallel jobs when building instrumented libraries. This specifies the number of jobs to be spawned per package, so the total number of jobs will be much higher. If you're not using goma, it's probably best to omit this flag.
Instrumented libraries are enabled with a separate flag because certain smaller targets do not need them. This applies to targets such as pdfium_test and d8, which do not depend on any external libraries. When building those targets you may set this flag, but doing so will not bring any additional benefit. Larger targets like Chrome and most Chromium test binaries will definitely require it.
Note that setting this flag will add a number (<100) of extra steps to the build. These steps will have names such as msan-<package_name> and each of them will build an entire instrumented package, thus they are expected to take much longer than other steps. The build should still take a reasonable time though (perhaps 20 minutes longer than a regular build).

The following flags are implied by msan=1 (i.e. you don't have to set them explicitly):
  • use_custom_libcxx=1: Use a just-built, MSan-instrumented libc++ instead of the system-wide libstdc++. This is required to avoid false reports whenever the C++ standard library is used.
  • v8_target_arch=arm64: JavaScript code will be compiled for ARM64 and run on an ARM64 simulator. This allows MSan to instrument JS code. Without this flag there will be false reports.
Run the resulting binaries as usual. Pipe the output through tools/valgrind/asan/asan_symbolize.py to get symbolized reports.

Chrome must not use hardware OpenGL when running under MSan. This is because libgl.so is not instrumented and will crash the GPU process. OSMesa can be used as a software OpenGL implementation, although it is extremely slow. There are several ways to proceed:
  • --use-osmesa --disable-gpu-compositing: This is a reasonable combination of flags which enables Chrome's software compositor (much faster than OSMesa compositing) but still uses OSMesa for things like Flash and WebGL. Be aware that software compositing is not otherwise used by default, so you'll be testing a code path that is not enabled for most users. If you're not happy with that, you can drop the second flag.
  • --disable-gpu: This forces Chrome to use the software path for everything (not just compositing). WebGL will not be supported at all, but other GL-accelerated features may perform better compared to the combination of flags above.
  • --use-osmesa --disable-gl-drawing-for-tests: Use this if you don't care about the actual pixel output. This exercises the default code paths; however, expensive OSMesa calls are replaced with stubs (i.e. nothing actually gets drawn to the screen).
If none of these flags is specified, Chrome will fall back to the first option after the GPU process crashes with an MSan report.

Origin tracking

MSan allows the user to trade off execution speed for the amount of information provided in reports. This is controlled by the GYP flag msan_track_origins:
  • msan_track_origins=0: MSan will tell you where the uninitialized value was used, but not where it came from. This is the fastest mode.
  • msan_track_origins=1 (default): MSan will also tell you where the uninitialized value was originally allocated (e.g. which malloc() call, or which local variable).
  • msan_track_origins=2: MSan will also report the chain of stores that copied the uninitialized value to its final location. If there are more than 7 stores in the chain, only the first 7 will be reported. This mode is EXPERIMENTAL. Note also that compilation time may increase in this mode.
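
To make the three levels more concrete, here is a minimal sketch of the kind of code path they describe. All type and function names are hypothetical, and the comments state what each level would be expected to report if the allocation were left uninitialized. The exact report location can vary with optimization and inlining.

```cpp
#include <cassert>
#include <cstdlib>

// All names below are hypothetical and exist only for this illustration.
struct Header { int length; };
struct Packet { Header header; };

// Allocation site. If the caller never writes to the result, this malloc()
// is the origin that msan_track_origins=1 and =2 would be expected to name.
Header* AllocateHeader() {
  Header* header = static_cast<Header*>(malloc(sizeof(Header)));
  assert(header != nullptr);
  return header;
}

// Intermediate store. Copying poisoned bytes is not an error by itself;
// only msan_track_origins=2 would be expected to list this store.
void CopyIntoPacket(const Header* source, Packet* destination) {
  destination->header = *source;
}

// Use. Branching on the value is what triggers the report at every level,
// including msan_track_origins=0.
bool IsLong(const Packet& packet) {
  if (packet.header.length > 100)
    return true;
  return false;
}
```

Calling AllocateHeader(), CopyIntoPacket() and IsLong() in sequence without ever assigning length would be the buggy pattern; assigning length first makes the same sequence clean.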

Suppressions

MSan does not support suppressions. This is an intentional design choice.

We have a blacklist file which is applied at compile time, and is used mainly to compensate for tool issues. Blacklist rules do not work the way suppression rules do: rather than suppressing reports with matching stack traces, they change the way MSan instrumentation is applied to the matched function. In addition, blacklist changes require a full clobber to take effect. Please refrain from making changes to the blacklist file unless you know what you are doing.

Note also that instrumented libraries use separate blacklist files.

Debugging MSan reports

Important caveats:
  • Please keep in mind that simply reading/copying uninitialized memory will not cause an MSan report. Even simple arithmetic computations will work. To produce a report, the code has to do something significant with the uninitialized value, e.g. branch on it, pass it to a libc function or use it to index an array.
  • When you examine a stack trace in an MSan report, all third-party libraries you see in it (with the exception of libc and its components) should reside under out/Release/instrumented_libraries. If you see a DSO under a system-wide directory (e.g. /lib/), then the report is likely bogus and should be fixed by simply adding that DSO to the list of instrumented libraries (please ping earthdok@).
  • Inline assembly is also likely to cause bogus reports.
  • If you're trying to debug a V8-related issue, please keep in mind that MSan builds run V8 in ARM64 mode, as explained above.
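
The first caveat can be illustrated with a small sketch. The function name Bucket is hypothetical. If *input were uninitialized, MSan would be expected to stay silent through the copy and the arithmetic, and to report only at the branch (the exact line may shift under optimization).

```cpp
#include <cassert>

// Hypothetical helper, for illustration only.
int Bucket(const int* input) {
  assert(input != nullptr);
  int copy = *input;          // Read/copy: shadow is propagated, no report.
  int scaled = copy * 2 + 1;  // Simple arithmetic: still no report.
  if (scaled > 10)            // Branch on the value: this is where a report
    return 1;                 // would be expected for uninitialized input.
  return 0;
}
```

This is why the use site shown in a report can be far away from the place where the value was first read.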
MSan reserves a separate memory region ("shadow memory") in which it tracks the status of application memory. The correspondence between the two is bit-to-bit: if the shadow bit is set to 1, the corresponding bit in the application memory is considered "poisoned" (i.e. uninitialized). The header file <sanitizer/msan_interface.h> declares interface functions which can be used to examine and manipulate the shadow state without changing the application memory, which comes in handy when debugging MSan reports.
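
The bit-to-bit correspondence can be pictured with a toy model. This is only a sketch under simplifying assumptions: the class ToyShadowedBuffer and its methods are hypothetical, and real MSan keeps shadow in a separate address range and updates it through compile-time instrumentation rather than through method calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only; not how MSan is implemented.
class ToyShadowedBuffer {
 public:
  // Fresh memory starts out with every shadow bit set (fully poisoned).
  explicit ToyShadowedBuffer(size_t size)
      : app_(size, 0), shadow_(size, 0xFF) {}

  // Storing a fully initialized byte clears all eight shadow bits.
  void Store(size_t index, uint8_t value) {
    assert(index < app_.size());
    app_[index] = value;
    shadow_[index] = 0;
  }

  // Sets shadow bits (1 = uninitialized) without touching application data.
  void PoisonBits(size_t index, uint8_t mask) {
    assert(index < shadow_.size());
    shadow_[index] |= mask;
  }

  uint8_t Load(size_t index) const {
    assert(index < app_.size());
    return app_[index];
  }

  // Offset of the first byte with at least one poisoned bit, or -1 if none.
  intptr_t FirstPoisonedByte() const {
    for (size_t i = 0; i < shadow_.size(); ++i) {
      if (shadow_[i] != 0)
        return static_cast<intptr_t>(i);
    }
    return -1;
  }

 private:
  std::vector<uint8_t> app_;     // "Application memory".
  std::vector<uint8_t> shadow_;  // One shadow bit per application bit.
};
```

Note that poisoning a single bit is enough for the whole byte to count as (partially) poisoned, while the application data stays unchanged.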

The following prints the complete shadow state of a range of application memory, including the origins of all uninitialized values, if any. (Note: though initializedness is tracked at the bit level, origins have 4-byte granularity.)
void __msan_print_shadow(const volatile void *x, size_t size);

The following prints a more minimalistic report which shows only the shadow memory:
void __msan_dump_shadow(const volatile void *x, size_t size);

To mark a memory range as fully uninitialized/initialized:
void __msan_poison(const volatile void *a, size_t size);
void __msan_unpoison(const volatile void *a, size_t size);
void __msan_unpoison_string(const volatile char *a);

The following forces an MSan check, i.e. if any bits in the memory range are uninitialized the call will crash with an MSan report.
void __msan_check_mem_is_initialized(const volatile void *x, size_t size);

This milder check returns the offset of the first (at least partially) poisoned byte in the range, or -1 if the whole range is good:
intptr_t __msan_test_shadow(const volatile void *x, size_t size);

Hint: sometimes to reduce log spam it makes sense to query __msan_test_shadow() before calling __msan_print_shadow().
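
The hint above can be sketched as a small helper. The name PrintShadowIfPoisoned and the SKETCH_HAS_MSAN macro are hypothetical; the __has_feature(memory_sanitizer) guard is an assumption added so the snippet also compiles in non-MSan builds, where it simply reports a clean range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__has_feature)
#if __has_feature(memory_sanitizer)
#include <sanitizer/msan_interface.h>
#define SKETCH_HAS_MSAN 1
#endif
#endif

// Hypothetical helper: prints the shadow state only when some byte in the
// range is poisoned. Returns the offset of the first poisoned byte, or -1 if
// the range is clean (always -1 in builds without MSan).
intptr_t PrintShadowIfPoisoned(const void* data, size_t size) {
  assert(data != nullptr || size == 0);
#ifdef SKETCH_HAS_MSAN
  intptr_t offset = __msan_test_shadow(data, size);
  if (offset >= 0)
    __msan_print_shadow(data, size);
  return offset;
#else
  (void)data;
  (void)size;
  return -1;
#endif
}
```

Dropping a call such as PrintShadowIfPoisoned(&object, sizeof(object)) into a hot loop then produces output only for the iterations that actually carry uninitialized bits.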

The complete interface can be found in src/third_party/llvm-build/Release+Asserts/lib/clang/3.6.0/include/sanitizer/msan_interface.h. Functions such as __msan_unpoison() can also be used to permanently annotate your code for MSan, but please CC earthdok@ or eugenis@ if you intend to do so.
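
A permanent annotation might look roughly like the sketch below. MarkInitialized, FillFromUninstrumentedCode, ReadBlock and SKETCH_HAS_MSAN are hypothetical names; the filler function is a plain C++ stand-in for code MSan cannot see into (for example inline assembly), so that the sketch stays runnable. The wrapper is a no-op outside MSan builds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#if defined(__has_feature)
#if __has_feature(memory_sanitizer)
#include <sanitizer/msan_interface.h>
#define SKETCH_HAS_MSAN 1
#endif
#endif

// Hypothetical wrapper: tells MSan the range is initialized; compiles to
// nothing when MSan is not enabled.
inline void MarkInitialized(const void* data, size_t size) {
#ifdef SKETCH_HAS_MSAN
  __msan_unpoison(data, size);
#else
  (void)data;
  (void)size;
#endif
}

// Stand-in for uninstrumented code that writes to the buffer.
void FillFromUninstrumentedCode(unsigned char* out, size_t size) {
  memset(out, 0xAB, size);
}

// The annotation goes right after the write that MSan could not observe.
void ReadBlock(unsigned char* out, size_t size) {
  assert(out != nullptr);
  FillFromUninstrumentedCode(out, size);
  MarkInitialized(out, size);
}
```

Keeping the annotation next to the write it describes, and covering only the bytes known to be written, limits the risk of hiding a real bug.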

