diff options
author | Florian Klink <flokli@flokli.de> | 2024-07-21T14·55+0200 |
---|---|---|
committer | clbot <clbot@tvl.fyi> | 2024-07-21T21·35+0000 |
commit | 62184ee35acac69666ba26c7db852a4059fa5723 (patch) | |
tree | efe0da94d5ff1211f6c7f6591b74ba69c640cdf6 | |
parent | e408783bacec6a4c3b63341b0764ea05adca3745 (diff) |
docs(tvix): document the builder API r/8393
This documents some thoughts and goals of the Tvix Build protocol, and how it is possible to express Nix builds with it. Additionally, it explains a proposed design for reference scanning. Change-Id: I4b1f3feb2278e3c7ce06de831eb8eb1715cba1c9 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12012 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: yuka <yuka@yuka.dev> Tested-by: BuildkiteCI
-rw-r--r-- | tvix/docs/src/SUMMARY.md | 3 | ||||
-rw-r--r-- | tvix/docs/src/build/index.md | 59 |
2 files changed, 62 insertions, 0 deletions
diff --git a/tvix/docs/src/SUMMARY.md b/tvix/docs/src/SUMMARY.md index d64aa4b4b1b2..cce6d8966df6 100644 --- a/tvix/docs/src/SUMMARY.md +++ b/tvix/docs/src/SUMMARY.md @@ -34,6 +34,9 @@ - [Data Model](./castore/data-model.md) - [Why not git trees?](./castore/why-not-git-trees.md) +# Builder +- [Build API](./build/index.md) + # Nix - [Specification of the Nix Language](./language-spec.md) - [Nix language version history](./lang-version.md) diff --git a/tvix/docs/src/build/index.md b/tvix/docs/src/build/index.md new file mode 100644 index 000000000000..cf9580c98a82 --- /dev/null +++ b/tvix/docs/src/build/index.md @@ -0,0 +1,59 @@ +# Builder Protocol + +The builder protocol is used by tvix-glue to trigger builds. + +One goal of the protocol is to not be too tied to the Nix implementation itself, +allowing it to be used for other builds/workloads in the future. + +This means the builder protocol is versatile enough to express the environment a +Nix build expects, while not being aware of "what any of this means". + +For example, it is not aware of how certain environment variables are set in a +nix build, but allows specifying environent variables that should be set. + +It's also not aware of what nix store paths are. Instead, it allows: + + - specifying a list of paths expected to be produced during the build + - specifying a list of castore root nodes to be present in a specified + `inputs_dir`. + - specifying which paths are write-able during build. + +In case all specified paths are produced, and the command specified in +`command_args` succeeds, the build is considered to be successful. + +This happens to be sufficient to *also* express how Nix builds works. + +Check `build/protos/build.proto` for a detailed description of the individual +fields, and the tests in `glue/src/tvix_build.rs` for some examples. + +The following sections describe some aspects of Nix builds, and how this is +(planned to be) implemented with the Tvix Build protocol. + +## Reference scanning +At the end of a build, Nix does scan a store path for references to other store +paths (*out of the set of all store paths present during the build*). +It does do this by (only) looking for a list of nixbase32-encoded hashes in +filenames (?), symlink targets and blob contents. + +While we could do this entirely outside the builder protocol, it'd mean a build +client would be required to download the produced outputs locally, and do the +refscan there. This is undesireable, as the builder already has all produced +outputs locally, and it'd make more sense for it do do it. + +Instead, we want to describe reference scanning in a generic fashion. + +One proposed way to do this is to add an additional field `refscan_needles` to +the `BuildRequest` message. +If this is an non-empty list, all paths in `outputs` are scanned for these. + +The `Build` response message would then be extended with an `outputs_needles` +field, containing the same number of elements as the existing `outputs` field. +In there, we'd have a list of numbers, indexing into `refscan_needles` +originally specified. + +For Nix, `refscan_needles` would be populated with the nixbase32 hash parts of +every input store path and output store path. The latter is necessary to scan +for references between multi-output derivations. + +This is sufficient to construct the referred store paths in each build output on +the build client. |