Time complexity tests verify the order of growth of time complexity T(n) for various operations, generally verifying that this is O(1) or O(n), rather than O(n^{2}), say. These are a form of performance test (see Adding Performance Tests), but since they have a binary answer (satisfies the bound or doesn't), and we are concerned with the overall growth, not the absolute speed, they are instead technically in the Blink Layout Tests, and produce PASS/FAIL. These are potentially flaky, and layout test failures break the CQ, so please take extra care in testing and committing these. Tests consist of:
Test formatThese tests are written in JavaScript, and stored in the Tests take this form:LayoutTests/perf directory in the Blink tree (as of 2013April). The library that these tests use is LayoutTests/resources/magnitudeperf.js .<!DOCTYPE html> <script src="../resources/magnitudeperf.js"></script> <script> if (window.testRunner) testRunner.dumpAsText(); function setupFoo(magnitude) { // ... } function testFoo(magnitude) { // ... } Magnitude.description('Verifies that fooing is linear in the number of bars. See http://crbug.com/42'); Magnitude.initialExponent = 0; Magnitude.numPoints = 10; Magnitude.trim = 1; Magnitude.tolerance = 0.10; Magnitude.run(setupFoo, testFoo, Magnitude.LINEAR); </script> run commands, each of which can have different parameters. classlistremove.html is a good example with two tests in one file.Test parametersYou can set some test parameters, specifying what range of magnitudes to test and how strict the tests should be. This is optional, but generally necessary. This is because the tests default to quite strict tolerances, and start testing from n = 1, which is usually too low, since constant terms often dominate for low n. The design reason for requiring parameters is to make the data of "over what range and to what tolerance the growth rate holds" an intrinsic part of the test: by specifying these parameters, someone reading the test file can see at a glance how big input needs to be to be "close enough to infinity", and how noisy the data is. You can also specify more trials, or a different cutoff for overall success, but this is generally not necessary. Calibration procedureTo set the parameters:
When committing, safest is to:
crbug.com/123 [ Release ] perf/foo.html [ Pass Failure ]
Common parametersYou will primarily vary the parameters initialExponent, numPoints, trim, and tolerance. For example:Magnitude.initialExponent = 0; Magnitude.numPoints = 10; Magnitude.trim = 1; Magnitude.tolerance = 0.10; This means:
Full listThe parameters are as follows: Range of magnitudes
You will usually need to modify these, as for low magnitudes low order terms (especially constants) often dominate, obscuring the high order terms (linear, quadratic, exponential) that we're interested in, but we don't want to test for 2^{20} to 2^{30}, say, both because this is slow, and has noise from memory management once you get this big. Usually around 2^{8} to 2^{15} is a good range, but it varies.
Usually leaving these at defaults (3 trials, 50% success threshold) is fine, and instead modify statistical parameters (trim and tolerance). You generally want an odd number of trials, so you can use majority rule: "Never go to sea with two chronometers; take one or three." (quoted in Fredrick Brooks, The Mythical ManMonth, p. 64), and 3 trials is usually enough. 50% (majority rule) optimizes the overall tradeoff between sensitivity and specificity (true positive rate and true negative rate). If you have the same rate for both, the complimentary rates (false negative rate and false positive rate) are attenuated linearly (as powers, i.e., on the log scale) in number of trials needed for success/failure. By "linear attenuation" we mean like "decreasing decibels of noise": if false negative rate is α and false positive rate is β, then requiring j successes for overall success or k = (n − j) + 1 failures for overall failure makes the false negative rate approximately (ignoring constant factor) α^{j} and false positive rate approximately β^{k}, which in log terms is j log α and k log β. For n = 3 and requiring 2 successes for overall success (hence 2 failures for overall failure), this doubles the attenuation of the false negative and false positive rate, which is usually enough. In rare circumstances you may want to change these:
Run time
At each magnitude, the test is run until this time is used up, then averaged. You can increase this if you have a slow test, or at running at high magnitudes where you're not getting enough iterations to get a good average. Consider if you can instead decrease the maximum magnitude. Statistical parameters
Given the times for all magnitudes, the test statistic (see below) is computed by first trimming the most extreme values (basically outliers in the times, actually derived values like ratio between successive times), then computed and checked against the tolerance. These default to quite strict values – no trim and 5% tolerance – which are quite safe, but usually too strict. You will thus need to relax them and specify how far from model the actual data is. Common reasonable values include: trim = 1; tolerance = 0.10; trim = 2; tolerance = 0.20; numPoints !Larger tolerances are ok when testing for O(1) – even values of 100% or more are ok if you're only interested in checking that it's not linear or worse (rather than not log) – but if you start getting to 50% (at which point you're not rejecting O(log n)), you have very noisy data. Interpreting outputFor linear and polynomial tests – O(n) and O(n^{2}) – the key numbers to look at are the logratios (base 2), which are effectively the exponent of the order of growth at that step. Thus for linear growth this should be almost 1, while for quadratic or greater growth it should be 2 or more. For example, given the following data: logratios (base 2): 0.44961898695576596,1.4623608881897614,2.432344314952156,2.5037162001090283,1.9487150784480256,1.2812301235421082 Maximum Absolute Deviation: 1.5037162001090283 logratios (base 2): 0.45,1.46,2.43,2.50,1.95,1.28 Maximum Absolute Deviation: 1.50 Testing procedureThe code at magnitudeperf.js is of course the ultimate reference.You can get very detailed diagnostics and intermediate calculations by running in the TestRunner (?), or setting an invalid expectation (e.g., expecting CONSTANT for data you expect to be LINEAR). In outline, testing proceeds as follows. We time the run time of the tested function at exponentially spaced magnitudes, e.g., 2^{0}, 2^{1}, 2^{2}, 2^{3}, .... We then verify if these times look like the time complexity we're expecting (constant, linear, or polynomial (quadratic or greater)). Tests are robust, nonparametric statistical tests, since timing is noisy (so need to be robust), and noise can take various forms (so nonparametric, since no particular model of noise). In more detail:
Statistical testsTests are all of the form: "use trimmed maximum absolute deviation (from model) as test statistic", meaning "compare this value against tolerance, and succeed if error doesn't exceed tolerance". In detail:
For constant, this compares each value against the median. (If deviation from median is only 20% or 30% as you vary n from 1 to 1,024, any linear or higher term is very small.) For linear and polynomial (quadratic or more), at each step, check what doubling the input does to the time – for linear it should double it, while for polynomial it should quadruple it or more. Taking log (base 2), this says the log of ratios of successive times (equivalently, difference of successive logs) should be 1 (linear) or more than 2 (quadratic or more). This is effectively looking at slope of successive steps on the loglog scale, as the slope of a monomial on the loglog scale is its exponent. These are simple tests, not sophisticated ones (we're not doing robust nonparametric linear regressions on the loglog scale, say), but they are thus transparent, and should be sufficient. More sophisticated tests that are still easily implemented include doing Ordinary Least Squares (OLS) on the loglog scale, and doing this iteratively (discarding point with greatest residual and repeating). CaveatsWhen writing tests, watch out for:
