Not all tests can be run on all machines, and the dynamic suite infrastructure currently has no way to deal with this. Our intent is to use the existing autotest control file DEPENDENCIES field to express what a test needs in order to run; any label in use in the autotest lab is valid in a test’s DEPENDENCIES list. At present, it seems that DEPENDENCIES in control files are parsed only when tests are imported for use via the web front end, and they may be honored when tests are scheduled via that method.
Dependencies can also be specified when running a test via the create_job() RPC exposed by the AFE.
There are two related issues here. The first is how individual tests will declare what they need from a host in order to run, and how the dynamic suite infrastructure will schedule those tests on appropriate machines. The second, more complex issue is how the dynamic suites code will ensure that, for a given suite, it reimages a set of hosts that allows every test in the suite to be scheduled.
Since the dynamic suite code already parses each individual test control file it intends to run, it will not be hard to add the contents of DEPENDENCIES to the metahost spec it already builds up when scheduling tests with the create_job() RPC.
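As a minimal sketch of that merge step: the helper name and the label strings below are invented for illustration, and the actual shape of the metahost spec passed to create_job() may differ.

```python
def build_metahost_spec(base_labels, dependencies):
    """Fold a test's DEPENDENCIES into the metahost label spec,
    de-duplicating while preserving order (hypothetical helper; the
    real spec handed to create_job() may be structured differently)."""
    seen = set()
    spec = []
    for label in list(base_labels) + list(dependencies):
        if label not in seen:
            seen.add(label)
            spec.append(label)
    return spec

# Example: a test whose control file declares a 'bluetooth' dependency.
print(build_metahost_spec(['board:lumpy', 'pool:suites'], ['bluetooth']))
# ['board:lumpy', 'pool:suites', 'bluetooth']
```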
There are several parameters that will inform dynamic suites’ decision about which machines to reimage for a suite:
These criteria introduce some new potential failure modes for suites:
There are also some resource efficiency concerns here:
There’s a set of machines in the lab, each of which is described by a set of labels. There’s also a suite (read: set) of tests, each of which is described by a set of labels (read: dependencies). For a given N, we need to find a set of N machines such that the dependencies of each test in the suite are satisfied by some machine in the set.
The difficulty is that we want to use ‘rare’ machines only if we really need them. If we weight hosts appropriately, we can build a scoring function for use in a simulated-annealing or hill-climbing style approximation algorithm.
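A toy hill-climbing search along these lines might look like the following. The host names, weights, penalty constant, and test deps are all invented for the sketch; the only ideas taken from the text are a host-weight score and a heavy penalty for any test no chosen machine can satisfy.

```python
import random

def satisfies(host_labels, test_deps):
    # A host can run a test iff it carries every label the test depends on.
    return set(test_deps) <= set(host_labels)

def score(chosen, hosts, tests, weights):
    # Lower is better. An unsatisfiable test costs far more than any host,
    # so cheap-but-incomplete sets never win; rare hosts carry high weights.
    penalty = sum(1000 for deps in tests
                  if not any(satisfies(hosts[h], deps) for h in chosen))
    return penalty + sum(weights[h] for h in chosen)

def hill_climb(hosts, tests, weights, n, iters=500, seed=0):
    # Start from a random n-host set and greedily accept improving swaps.
    rng = random.Random(seed)
    names = list(hosts)
    current = rng.sample(names, n)
    best = score(current, hosts, tests, weights)
    for _ in range(iters):
        candidate = list(current)
        candidate[rng.randrange(n)] = rng.choice(names)
        if len(set(candidate)) < n:
            continue  # skip sets that picked the same host twice
        s = score(candidate, hosts, tests, weights)
        if s < best:
            current, best = candidate, s
    return sorted(current), best

hosts = {'cheap1': ['X86'], 'cheap2': ['X86'], 'rare': ['X86', 'GOBI3K']}
weights = {'cheap1': 1, 'cheap2': 1, 'rare': 10}
# One test needs the rare GOBI3K label; the other runs anywhere.
tests = [['GOBI3K'], ['X86']]
print(hill_climb(hosts, tests, weights, n=2))
```

The search pays the weight-10 cost of the rare host only because a test demands it; with no GOBI3K test, the two weight-1 hosts would win.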
At first blush, this problem felt to me like a set-cover problem: merge all the dependencies of all tests in the suite into a single, de-dup’d list, and then find the smallest number of devices whose label sets cover the whole she-bang. The problem is that this algorithm can produce a solution that leaves some tests unsatisfiable. Consider the following example:
This algorithm would merge the test deps into [RPM, BT, GOBI3K, i5] and determine that choosing both hosts successfully covers that list. The problem is that the second test cannot be run in this testbed, as no single machine meets all of its dependencies.
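To make the failure mode concrete: the specific tests, labels, and hosts below are an illustrative reconstruction (only the merged list [RPM, BT, GOBI3K, i5] comes from the text above).

```python
def satisfies(host_labels, deps):
    # A host satisfies a test iff it has every label the test depends on.
    return set(deps) <= set(host_labels)

# Hypothetical testbed, chosen so the merged deps are [RPM, BT, GOBI3K, i5].
tests = {'test1': ['RPM', 'BT'], 'test2': ['GOBI3K', 'i5']}
hosts = {'host1': ['RPM', 'BT', 'GOBI3K'], 'host2': ['i5']}

# Set cover over the merged, de-dup'd list succeeds: every label in the
# merged list appears on some chosen host.
merged = {d for deps in tests.values() for d in deps}
print(merged <= set().union(*map(set, hosts.values())))  # True

# But test2 is still unschedulable: no single host has both of its deps.
print(any(satisfies(labels, tests['test2']) for labels in hosts.values()))
# False
```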
Since we only have tests with a single dependency these days, we’re going to start by gathering all the deps of all the tests in the suite, de-duping them, and finding <= N machines to satisfy those deps. We may try to be smart about how we pad out the machines to reach N, if we can: if we have 10 Gobi3K tests and 2 Gobi2K tests and N == 3, then we might get 2 Gobi3K devices and 1 Gobi2K device, for example. If we can satisfy all tests with < N devices and cannot pad to reach N total devices, we’ll WARN but continue to run the suite.
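The single-dependency strategy above can be sketched as follows. The helper name is hypothetical, device availability is not modeled (so the cannot-pad WARN case does not arise here), and padding simply favors the deps with the most tests behind them.

```python
from collections import Counter

def plan_reimage(test_deps, n):
    """Given one dependency label per test (the single-dep case), choose
    up to n host configurations: one per distinct dep, then pad toward n
    with the most-demanded deps. Hypothetical sketch only."""
    counts = Counter(test_deps)
    plan = list(counts)  # one machine per distinct, de-dup'd dep
    if len(plan) > n:
        raise ValueError('cannot satisfy suite with %d machines' % n)
    # Pad toward n, favoring deps with the most tests behind them.
    while len(plan) < n and counts:
        for dep, _ in counts.most_common():
            if len(plan) >= n:
                break
            plan.append(dep)
    return plan

# 10 Gobi3K tests, 2 Gobi2K tests, N == 3 -> 2 Gobi3K hosts + 1 Gobi2K.
deps = ['GOBI3K'] * 10 + ['GOBI2K'] * 2
print(sorted(plan_reimage(deps, 3)))  # ['GOBI2K', 'GOBI3K', 'GOBI3K']
```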