The JSON Test Results Format is a generic file format we use to record the results of each individual test in a test run (whether the tests are run on a bot or locally).
We use these files on the bots to determine whether a test step had any failing tests (using a separate file means that we don't need to parse the output of the test run, so that output can be tailored for human readability). We also upload the test results to dashboards like the Flakiness Dashboard (http://test-results.appspot.com).
The test format originated with the Blink layout tests, but has since been adopted by GTest-based tests and Python unittest-based tests, so we've standardized on it for anything related to tracking test flakiness.
Here's a very simple example for one Python test:
% python mojo/tools/run_mojo_python_tests.py --write-full-results-to results.json mojom_tests.parse.ast_unittest.ASTTest.testNodeBase
Running Python unit tests under mojo/public/tools/bindings/pylib ...
Ran 1 test in 0.000s
% cat results.json
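The contents of results.json will look something like the following (the exact values, such as seconds_since_epoch and time, are illustrative):

```json
{
  "version": 3,
  "interrupted": false,
  "path_delimiter": ".",
  "seconds_since_epoch": 1406662283.764424,
  "num_failures_by_type": {
    "FAIL": 0,
    "PASS": 1
  },
  "tests": {
    "mojom_tests": {
      "parse": {
        "ast_unittest": {
          "ASTTest": {
            "testNodeBase": {
              "actual": "PASS",
              "expected": "PASS",
              "time": 0.1
            }
          }
        }
      }
    }
  }
}
```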
As you can see, the format consists of a single top-level dictionary containing a set of metadata fields describing the test run, plus a 'tests' key that contains the results of every test that was run, structured as a hierarchical trie to reduce duplication of test suite names (note the deeply nested Python test name above).
The file is strictly JSON-compliant. In particular, the order in which the names appear in each object is unimportant.
Top-level field names
| Name|| Data Type||Description|
| interrupted || boolean || Required. Whether the test run was interrupted and terminated early (e.g., the runner bailed out or the user hit Ctrl-C). If true, not all of the tests in the suite were run, and the results are at best incomplete and possibly totally invalid.|
| num_failures_by_type|| dict|| Required. A summary of the totals of each result type. If a test was run more than once, only the first invocation's result is included in the totals. Each key is one of the result types listed below. A missing result type is the same as being present and set to zero (0).|
| path_delimiter || string|| Optional for now, but will become mandatory. The separator string to use between components of a test's name; normally "." for GTest- and Python-based tests and "/" for layout tests. If not present, you should default to "/" for backwards-compatibility (the sketch after this table applies that default). |
| seconds_since_epoch || float|| Required. The start time of the test run expressed as a floating-point offset in seconds from the UNIX epoch.|
| tests|| dict|| Required. The actual trie of test results. Each directory or module component in the test name is a node in the trie, and the leaf contains the dict of per-test fields as described below.|
| version|| integer|| Required. Version of the file format. Current version is 3.|
| build_number|| string|| Optional. If this test run was produced on a bot, this should be the build number of the run, e.g., "1234".|
| builder_name|| string|| Optional. If this test run was produced on a bot, this should be the builder name of the bot, e.g., "Linux Tests".|
| chromium_revision|| string|| Optional. The revision of the current Chromium checkout, if relevant, e.g. "356123".|
| has_pretty_patch|| bool|| Optional, layout test specific, deprecated. Whether the layout tests' output contains PrettyPatch-formatted diffs for test failures.|
| has_wdiff|| bool|| Optional, layout test specific, deprecated. Whether the layout tests' output contains wdiff-formatted diffs for test failures.|
| layout_tests_dir|| string|| Optional, layout test specific. Path to the LayoutTests directory for the test run (used so that we can link to the tests used in the run).|
| pixel_tests_enabled|| bool|| Optional, layout test specific. Whether the layout tests were run with the --pixel-tests flag. |
| fixable|| integer|| Optional, deprecated. The number of tests that were run but were expected to fail.|
| num_flaky|| integer|| Optional, deprecated. The number of tests that were run more than once and produced different results each time.|
| num_passes|| integer|| Optional, deprecated. The number of successful tests; equivalent to num_failures_by_type["PASS"].|
| num_regressions|| integer|| Optional, deprecated. The number of tests that produced results that were unexpected failures.|
| skips|| integer|| Optional, deprecated. The number of tests that were found but not run (tests should be listed in the trie with "expected" and "actual" values of "SKIP").|
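To make the defaulting rules above concrete, here is a minimal sketch of how a consumer might read the top-level fields. This is hypothetical glue code, not part of any checked-in tool:

```python
import json

def load_results(path):
    # Load a version-3 results file and apply the documented defaults.
    with open(path) as f:
        results = json.load(f)

    assert results['version'] == 3, 'unexpected file format version'

    # path_delimiter is optional; default to "/" for backwards-compatibility.
    delimiter = results.get('path_delimiter', '/')

    # A missing result type in num_failures_by_type is the same as zero.
    num_passes = results['num_failures_by_type'].get('PASS', 0)

    # If the run was interrupted, the results may be incomplete or invalid.
    if results['interrupted']:
        print('warning: test run was interrupted; treat results with care')

    return results, delimiter, num_passes
```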
Each leaf of the 'tests' trie is a dict containing the results for a particular test. If a test is run multiple times, the dict contains the results of each invocation in the 'actual' field. (A sketch that walks the trie and interprets these fields follows the table below.)
| Field Name|| Data Type|| Description|
| actual|| string|| Required. An ordered space-separated list of the results the test actually produced. "FAIL PASS" means that a test was run twice, failed the first time, and then passed when it was retried. If a test produces multiple different results, then it was actually flaky during the run.|
| expected|| string|| Required. An unordered space-separated list of the result types expected for the test, e.g. "FAIL PASS" means that a test is expected to either pass or fail. A test that contains multiple values is expected to be flaky.|
| bugs|| string|| Optional. A comma-separated list of URLs to bug database entries associated with each test.|
| is_unexpected|| bool|| Optional. If present and true, the failure was unexpected (a regression). If false (or if the key is not present at all), the failure was expected and will be ignored.|
| time|| float|| Optional. If present, the time it took in seconds to execute the first invocation of the test.|
| times|| array of floats|| Optional. If present, the times in seconds of each invocation of the test.|
| has_repaint_overlay|| bool|| Optional, layout test specific. If present and true, indicates that the test output contains the data needed to draw repaint overlays to help explain the results (only used in layout tests).|
| is_missing_audio|| bool|| Optional, layout test specific. If present and true, the test was supposed to have an audio baseline to compare against, and we didn't find one.|
| is_missing_text|| bool|| Optional, layout test specific. If present and true, the test was supposed to have a text baseline to compare against, and we didn't find one. |
| is_missing_video|| bool|| Optional, layout test specific. If present and true, the test was supposed to have an image baseline to compare against and we didn't find one.|
| is_testharness_test|| bool|| Optional, layout test specific. If present, indicates that the layout test was written using the W3C's test harness and we don't necessarily have any baselines to compare against.|
| reftest_type|| string|| Optional, layout test specific. If present, one of "==" or "!=" to indicate that the test is a "reference test" and the results were expected to match the reference or not match the reference, respectively (only used in layout tests).|
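Because the trie structure and the 'actual'/'expected' semantics are easy to get subtly wrong, here is a sketch of how one might flatten the trie and classify each leaf. The helper names are hypothetical, and the classification is one plausible reading of the fields above; real consumers should prefer the is_unexpected flag when it is present:

```python
def each_test(trie, delimiter='/', prefix=None):
    # Yield (test_name, result_dict) pairs from the 'tests' trie. A node is
    # a leaf (an individual test) once it has an 'actual' key; every other
    # node is an interior path component of the test name.
    for name, node in trie.items():
        full_name = name if prefix is None else prefix + delimiter + name
        if 'actual' in node:
            yield full_name, node
        else:
            yield from each_test(node, delimiter, full_name)

def classify(result):
    # A test is flaky if its invocations disagree, and looks like a
    # regression if the final result is not among the expected ones.
    actual = result['actual'].split()      # ordered, one entry per invocation
    expected = result['expected'].split()  # unordered set of allowed results
    flaky = len(set(actual)) > 1
    regression = actual[-1] not in expected
    return flaky, regression
```

Combined with the load_results sketch above, each_test(results['tests'], delimiter) enumerates every test by its full name.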
Test result types
Any test may fail in one of several different ways. There are a few generic types of failures, and the layout tests contain a few additional specialized failure types.
| Result type|| Description|
| "SKIP"|| The test was not run.|
| "PASS"|| The test ran as expected.|
| "FAIL"|| The test did not run as expected.|
| "CRASH"|| The test runner crashed during the test.|
| "TIMEOUT"|| The test hung (did not complete) and was aborted.|
| "MISSING"|| Layout test specific. The test completed but we could not find an expected baseline to compare against|
| "LEAK"|| Layout test specific. Memory leaks were detected during the test execution.|
| "SLOW"|| Layout test specific. The test is expected to take longer than normal to run.|
| "TEXT"|| Layout test specific, deprecated. The test is expected to produce a text-only failure (the image, if present, will match). Normally you will see "FAIL" instead.|
| "AUDIO"|| Layout test specific, deprecated. The test is expected to produce audio output that doesn't match the expected result. Normally you will see "FAIL" instead.|
| "IMAGE"|| Layout test specific. The test produces image (and possibly text output). The image output doesn't match what we'd expect, but the text output, if present, does.|
| "IMAGE+TEXT"|| Layout test specific, deprecated. The test produces image and text output, both of which fail to match what we expect. Normally you will see "FAIL" instead.|
| "REBASELINE" || Layout test specific. The expected test result is out of date and will be ignored (any result other than a crash or timeout will be considered as passing). This test result should only ever show up on local test runs, not on bots (it is forbidden to check in a TestExpectations file with this expectation). This should never show up as an "actual" result.|
| "NEEDSREBASELINE"|| Layout test specific. The expected test result is out of date and will be ignored (as above); the auto-rebaseline-bot will look for tests of this type and automatically update them. This should never show up as an "actual" result.|
| "NEEDSMANUALREBASELINE" || Layout test specific. The expected test result is out of date and will be ignored (as above). This result may be checked in to the TestExpectations file, but the auto-rebasline-bot will ignore these entries. This should never show up as an "actual" result.|
full_results.json and failing_results.json
The layout tests produce two variants of the above file. The "full_results.json" file matches the above definition and contains every test executed in the run. The "failing_results.json" file contains just the tests that produced unexpected results, so it is a subset of the full_results.json data. The failing_results.json file is also in JSONP format, so that it can be loaded via a <script> tag from an HTML file opened from the local filesystem without falling prey to the same-origin restrictions for local files. To form the JSONP, the JSON data is preceded by the string "ADD_RESULTS(" and followed by the string ");", so you can extract the JSON by stripping off that prefix and suffix.
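For example, a consumer could recover the JSON payload with something like the following minimal sketch (the wrapper strings are exactly the ones described above; the function name is hypothetical):

```python
import json

def load_failing_results(path):
    # Strip the ADD_RESULTS(...); JSONP wrapper and parse the payload.
    with open(path) as f:
        text = f.read().strip()
    prefix, suffix = 'ADD_RESULTS(', ');'
    assert text.startswith(prefix) and text.endswith(suffix)
    return json.loads(text[len(prefix):-len(suffix)])
```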