This page is (mostly) obsolete

There used to be a "Memory Full" waterfall with dedicated memory sheriffs. The bots on that waterfall have been merged into the main waterfall, and the duties of the memory sheriffs have been transferred to the main waterfall sheriffs. There are no dedicated memory sheriffs anymore. The information below is obsolete and retained for historical purposes only.

Tools on the memory waterfall
Sheriffing Tools

We have several tools designed to simplify the sheriffing duties.

First, there is waterfall.sh. Try it like this:

    sh tools/valgrind/waterfall.sh

Please read the chromium-dev thread about this script for the basic idea and some how-tos.

Next, there is scan-build.py. It lets you scan through build logs for common terms, most often an error hash, so you can quickly see when an error first surfaced:

    tools/valgrind/scan-build.py --update         # updates the local cache to the latest state
    tools/valgrind/scan-build.py --find <string>  # looks through all build logs for a string

--find travels backwards until it hasn't encountered the search term for a given number of builds (CUTOFF, currently set to 100). If it travels further back than your current cache is filled, it automatically fetches more old logs. It will, however, _not_ fetch logs newer than those the last --update fetched. --update makes sure that at least CUTOFF builds are available locally and catches up to the latest build logs.

What to do with failures on the Memory FYI waterfall

There are two main types of failures you can observe on the memory bots: detected memory reports and test failures. Both are actionable by either fixing the code (probably by reverting a recent change) or suppressing/excluding the failures.

Recommendation: consider sending your patches to the next sheriff on the schedule. Memory errors are usually not fixed quickly, so it's good to be up to date before you start your sheriffing shift.

When to close the tree or revert

Since the bots on the Memory FYI waterfall cycle slowly, it's hard to keep up with what's happening on these builders, so we don't close the tree automatically as other waterfalls do. You may want to close the tree manually to throttle commits so you can commit your suppressions faster.
You can close the tree by typing "Tree is closed (Memory FYI waterfall is too red)" at http://chromium-status.appspot.com/

Please note that some of the reports indicate serious bugs (e.g. "unaddressable access", "use after free") that are likely to affect stability or security. If you see a new serious report and it's clear which change caused it, go ahead and revert. A similar rule applies to not-so-serious reports: if you see a recent commit with an obvious bug that showed up on the Valgrind bots, ask the committer whether they're OK with reverting and polishing the CL. This is something like an unsolicited code review, right? :)

Suppressing memory reports

We suppress some of the memory reports, either because they come from system libraries we can't do anything about, or because we already have bugs filed in the Chromium issue tracker.

By suppressing errors instead of excluding tests, we still get coverage for the tests with known memory reports.
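To make the notion of a suppression concrete, here is a minimal Python sketch of how a Valgrind-style suppression can be thought of: a list of fun: patterns compared frame by frame against the top of a report's stack, with * and ? acting as wildcards. The function name suppression_matches and the list-based representation are assumptions made for illustration. This is not the code used by the tools in tools/valgrind, and it deliberately ignores obj: lines and the "..." frame wildcard that real Valgrind suppressions also support. The sample frames are taken from the example report on this page.

```python
from fnmatch import fnmatchcase


def suppression_matches(patterns, stack):
    """Return True if each pattern matches the corresponding stack frame.

    patterns: suppression lines such as "fun:malloc"; the part after "fun:"
              may contain the glob wildcards * and ?.
    stack:    mangled function names of a report, innermost frame first.

    Hypothetical helper for illustration only. A suppression with fewer
    lines than the stack matches as long as the top frames agree.
    """
    if len(patterns) > len(stack):
        return False
    for pattern, frame in zip(patterns, stack):
        kind, _, glob = pattern.partition(":")
        if kind != "fun":
            return False  # this sketch only understands fun: lines
        if not fnmatchcase(frame, glob):
            return False
    return True


# Top frames of the leak report shown on this page (innermost first).
stack = [
    "malloc",
    "_ZN3WTF10fastMallocEj",
    "_ZN7WebCore10StringImpl19createUninitializedEjRPt",
    "_ZN7WebCore10StringImpl6createEPKtj",
]

# A narrow suppression: exact names for the top three frames.
narrow = [
    "fun:malloc",
    "fun:_ZN3WTF10fastMallocEj",
    "fun:_ZN7WebCore10StringImpl19createUninitializedEjRPt",
]

# A wildcarded suppression: covers any StringImpl method in the third frame.
wildcarded = [
    "fun:malloc",
    "fun:_ZN3WTF10fastMalloc*",
    "fun:_ZN7WebCore10StringImpl*",
]

print(suppression_matches(narrow, stack))
print(suppression_matches(wildcarded, stack))
```

Dropping trailing lines or adding wildcards makes a suppression match more reports, which is why an overly general suppression can hide unrelated bugs.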
The chrome_tests.sh/bat scripts read overall suppressions from several sources:
Suppressions for TSan v2 live in tools/valgrind/tsan_v2/suppressions.txt. See the ThreadSanitizer v2 documentation for more info.

In general, any suppression that is there because of a bug in Chromium should be named bug_NNNNNN, where NNNNNN is the Chromium bug number, and the changeset that adds that suppression should include the string BUG=NNNNNN in its description.

The runner script automatically generates suppressions for all unique errors reported, like this:

    22 bytes in 1 blocks are definitely lost in loss record 491 of 3,129 // this is a report
      malloc (mp/scripts/valgrind-memcheck/coregrind/m_replacemalloc/vg_replace_malloc.c:241)
      WTF::fastMalloc(unsigned int) (third_party/WebKit/JavaScriptCore/wtf/FastMalloc.cpp:249)
      WebCore::StringImpl::createUninitialized(unsigned int, unsigned short*&) (third_party/WebKit/JavaScriptCore/wtf/text/StringImpl.cpp:96)
      WebCore::StringImpl::create(unsigned short const*, unsigned int) (third_party/WebKit/JavaScriptCore/wtf/text/StringImpl.cpp:108)
      WebCore::StringImpl::substring(unsigned int, unsigned int) (third_party/WebKit/JavaScriptCore/wtf/text/StringImpl.cpp:186)
      WebCore::String::substring(unsigned int, unsigned int) const (third_party/WebKit/JavaScriptCore/wtf/text/WTFString.cpp:257)
      WebCore::KURLGooglePrivate::componentString(url_parse::Component const&) const (third_party/WebKit/WebCore/platform/KURLGoogle.cpp:313)
      [SNIP - some random stuff e.g. MessageLoop, DispatchToMethod etc]
    The report came from the `AutomationProxyVisibleTest.WindowGetViewBounds` test.
    Suppression (error hash=#0CAC77B0AD40A91D#):
    {
       <insert_a_suppression_name_here> // file a bug and replace it with bug_NNNNN before committing
       Memcheck:Leak
       fun:malloc
       fun:_ZN3WTF10fastMallocEj
       fun:_ZN7WebCore10StringImpl19createUninitializedEjRPt
       fun:_ZN7WebCore10StringImpl6createEPKtj
       fun:_ZN7WebCore10StringImpl9substringEjj
       fun:_ZNK7WebCore6String9substringEjj
       fun:_ZNK7WebCore17KURLGooglePrivate15componentStringERKN9url_parse9ComponentE
       [SNIP]
    }

First, check that there's no similar suppression in the corresponding suppression files; an existing one may just need some wildcarding. If there's no such suppression, copy everything between {...} and add it to the appropriate suppressions file, e.g. if a Dr. Memory failure is an uninitialized read, add the suppression to tools/valgrind/drmemory/suppressions_full.txt. Consider removing the bottom frames of a long call stack if they unnecessarily narrow the scope, but do not make the suppression so general that it hides other bugs. Make sure to file a bug (see recommendations below) and use the bug number as the name of the suppression:

    {
       bug_NNNNN
       Memcheck:Leak
       fun:malloc
       fun:_ZN3WTF10fastMallocEj
       fun:_ZN7WebCore10StringImpl19createUninitializedEjRPt
       fun:_ZN7WebCore10StringImpl6createEPKtj
       fun:_ZN7WebCore10StringImpl9substringEjj
       fun:_ZNK7WebCore6String9substringEjj
       fun:_ZNK7WebCore17KURLGooglePrivate15componentStringERKN9url_parse9ComponentE
    }

Sometimes the compiler produces a corrupted pdb file (crbug.com/371847), which causes Dr. Memory to report empty stack traces. A clobber rebuild on the builder bot is required to clear the corrupted pdb file and fix the problem.

Now send the patch for review.

Review recommendations:
When the leak gets fixed, make sure to ask the person who fixes it to remove the suppression again, ideally in the same CL that contains the fix.

Excluding tests

Some tests run slowly or poorly under heavyweight tools like Valgrind and Dr. Memory in full mode.

If they fail even without the tool (i.e., natively), just add the DISABLED_ prefix to the test case name.

If tests hang or crash only under Valgrind or Dr. Memory, disable them using the files in tools/valgrind/gtest_exclude/test_binary.gtest[-drmemory][_platform].txt, where test_binary is the name of the test binary (base_unittests, ui_tests, etc.), -drmemory limits the exclusion to Dr. Memory, and platform can be none (Linux and Windows), linux, or win32.

For ThreadSanitizer v2 there are no exclusion files. The only way to disable a test under TSan v2 or MSAN is to make it DISABLED_ under #if defined(THREAD_SANITIZER) or #if defined(MEMORY_SANITIZER), respectively.

Please file bug(s) for any tests you disable and point at the bug(s) where you exclude the test(s)!

For example, if ExampleTest.PeelOranges from unit_tests fails under Valgrind, add the following to tools/valgrind/gtest_exclude/unit_tests.gtest.txt:

    # Crashes when run under Valgrind. http://crbug.com/4567
    ExampleTest.PeelOranges

These files accept ' |