author sterni <sternenseemann@systemli.org> 2022-09-26T21·46+0200
committer sterni <sternenseemann@systemli.org> 2022-10-08T10·59+0000
commit 57d5988b340ec1b799882f00323010d9435892ca
tree dadde22436ba1e502b2b2c50d57dc1f797950be3 /ops
parent ca3bd5c7cabf517f23234501928912d55fef45b3
feat(nix/dependency-analyzer): find deps among a list of known drvs r/5060
This was written with the same intention (and reuses a little of its
code) as cl/5060 and cl/5063: We want to be able to emit dependencies
between //nix/buildkite pipeline steps, so that no agent is occupied
with waiting on locks for derivations built by a different agent.

This dependency information is already available to the Nix store
implementation (e.g. via `nix-store --query --references`) and can also
be obtained in the Nix language, which is important since the pipeline
is generated at evaluation time. (Note: For Nix 2.3, you either need a
strong convention about how derivations expose their dependencies (which
we don't have) or must rely on store implementation internals (drv
files). For Nix 2.6 there is a better trick, but it also relies on the
existence of drv files.)

The actual task can be formulated as follows: Given a set of
derivations, calculate for each derivation the closest derivations it
depends on that are also in the input. (We call these (next) known
dependencies.) This is crucial because pipeline steps often depend on
each other only indirectly, with any number of intermediate derivations
in between. For cl/5064 I determined that 6 intermediate layers is quite
common for dependencies that are perceived to be “direct”.

This problem is solved as follows:

1. Calculate the dependency graph of the combined dependency closure of
   all input derivations. This is quite easy and fairly quick thanks to
   the C++ implementation of builtins.genericClosure. One weak point of
   the current implementation is that the function to determine the
   direct derivation dependencies for Nix < 2.6 is quite hacky.

2. Take the graph from 1. and calculate a dependency graph that only
   connects the known derivations of the input, but retains all
   connections between them (minus intermediate nodes).

In practice the dependency graph is represented as an attribute set
mapping each derivation path to a list of the derivation paths it
depends on. The second step is performed by adding, for each entry, a
second list containing only the known derivation paths it depends on.
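
To illustrate step 1 and the graph representation, here is a minimal
sketch in Python (not the Nix code from this change). It uses a worklist
that visits every derivation once, similar in spirit to how
builtins.genericClosure deduplicates by key. The names
`dependency_graph`, `direct_deps` and `STORE` are hypothetical, and the
derivation paths are made up:

```python
from collections import deque
from typing import Callable, Dict, Iterable, List


def dependency_graph(
    roots: Iterable[str],
    direct_deps: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Map every derivation in the combined closure of `roots` to the
    list of derivations it directly depends on."""
    graph: Dict[str, List[str]] = {}
    queue = deque(roots)
    while queue:
        drv = queue.popleft()
        if drv in graph:
            # Already visited; each derivation is expanded only once.
            continue
        graph[drv] = list(direct_deps(drv))
        queue.extend(graph[drv])
    return graph


# Hypothetical stand-in for looking up direct dependencies in the store.
STORE = {
    "a.drv": ["x.drv", "b.drv"],
    "x.drv": ["y.drv"],
    "y.drv": ["c.drv"],
    "b.drv": ["c.drv"],
    "c.drv": [],
    "unrelated.drv": [],
}

graph = dependency_graph(["a.drv", "b.drv"], lambda drv: STORE[drv])
```

The result contains the inputs and all intermediate derivations, but
nothing outside their closure (`unrelated.drv` is absent).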

The main improvements over the previous concept (cl/5060 and cl/5063):

* We only try to find the closest known dependencies in the dependency
  graph, whereas previously we would traverse and emit dependencies for
  the entire dependency closure.

* We immediately store the calculation of the closest known dependency
  in the dependency graph, even for intermediate nodes. This avoids
  recalculating the connection (which was a big drawback of the previous
  approach) and makes the calculation itself cheaper.
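
The idea of storing the result for every node, including intermediate
ones, can be sketched as follows. This is an illustrative Python
version, not the Nix implementation; the function name `annotate` and
the field names `deps` and `knownDeps` are assumptions made for this
example, and the graph is assumed to be acyclic (as derivation graphs
are):

```python
from typing import Dict, List, Set


def annotate(graph: Dict[str, List[str]], known: Set[str]) -> Dict[str, dict]:
    """Attach to every node the closest known derivations it depends on.

    Results are stored per node, so an intermediate node shared by
    several known derivations is only resolved once.
    """
    annotated: Dict[str, dict] = {}

    def known_deps(drv: str) -> List[str]:
        if drv in annotated:
            # Reuse the stored result instead of walking the graph again.
            return annotated[drv]["knownDeps"]
        found: Set[str] = set()
        for dep in graph[drv]:
            if dep in known:
                # A known derivation ends the search along this path.
                found.add(dep)
            else:
                # Look through intermediate derivations.
                found.update(known_deps(dep))
        annotated[drv] = {"deps": graph[drv], "knownDeps": sorted(found)}
        return annotated[drv]["knownDeps"]

    for drv in graph:
        known_deps(drv)
    return annotated


# Made-up example: a.drv and b.drv both reach c.drv via x.drv and y.drv.
graph = {
    "a.drv": ["x.drv", "b.drv"],
    "b.drv": ["x.drv"],
    "x.drv": ["y.drv"],
    "y.drv": ["c.drv"],
    "c.drv": [],
}
result = annotate(graph, known={"a.drv", "b.drv", "c.drv"})
```

Here `x.drv` is resolved while processing `a.drv`, and the stored result
is reused when `b.drv` is processed. The recursion is only suitable for
small examples; a deep graph would need an iterative traversal.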

You can run `mg build //nix/dependency-analyzer:example` to build a
visualization of the internal dependencies between `depot.ci.targets` as
discovered by dependency-analyzer.

Change-Id: If8c0cdfc8470d4b337336257d9818aaa0d51110f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6832
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>