about summary refs log tree commit diff
path: root/tvix/docs
diff options
context:
space:
mode:
authorFlorian Klink <flokli@flokli.de>2024-07-21T14·55+0200
committerclbot <clbot@tvl.fyi>2024-07-21T21·35+0000
commit62184ee35acac69666ba26c7db852a4059fa5723 (patch)
treeefe0da94d5ff1211f6c7f6591b74ba69c640cdf6 /tvix/docs
parente408783bacec6a4c3b63341b0764ea05adca3745 (diff)
docs(tvix): document the builder API r/8393
This documents some thoughts and goals of the Tvix Build protocol, and
how it is possible to express Nix builds with it.

Additionally, it explains a proposed design for reference scanning.

Change-Id: I4b1f3feb2278e3c7ce06de831eb8eb1715cba1c9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12012
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: yuka <yuka@yuka.dev>
Tested-by: BuildkiteCI
Diffstat (limited to 'tvix/docs')
-rw-r--r--tvix/docs/src/SUMMARY.md3
-rw-r--r--tvix/docs/src/build/index.md59
2 files changed, 62 insertions, 0 deletions
diff --git a/tvix/docs/src/SUMMARY.md b/tvix/docs/src/SUMMARY.md
index d64aa4b4b1b2..cce6d8966df6 100644
--- a/tvix/docs/src/SUMMARY.md
+++ b/tvix/docs/src/SUMMARY.md
@@ -34,6 +34,9 @@
 - [Data Model](./castore/data-model.md)
 - [Why not git trees?](./castore/why-not-git-trees.md)
 
+# Builder
+- [Build API](./build/index.md)
+
 # Nix
 - [Specification of the Nix Language](./language-spec.md)
 - [Nix language version history](./lang-version.md)
diff --git a/tvix/docs/src/build/index.md b/tvix/docs/src/build/index.md
new file mode 100644
index 000000000000..cf9580c98a82
--- /dev/null
+++ b/tvix/docs/src/build/index.md
@@ -0,0 +1,59 @@
+# Builder Protocol
+
+The builder protocol is used by tvix-glue to trigger builds.
+
+One goal of the protocol is to not be too tied to the Nix implementation itself,
+allowing it to be used for other builds/workloads in the future.
+
+This means the builder protocol is versatile enough to express the environment a
+Nix build expects, while not being aware of "what any of this means".
+
+For example, it is not aware of how certain environment variables are set in a
+nix build, but allows specifying environent variables that should be set.
+
+It's also not aware of what nix store paths are. Instead, it allows:
+
+ - specifying a list of paths expected to be produced during the build
+ - specifying a list of castore root nodes to be present in a specified
+   `inputs_dir`.
+ - specifying which paths are write-able during build.
+
+In case all specified paths are produced, and the command specified in
+`command_args` succeeds, the build is considered to be successful.
+
+This happens to be sufficient to *also* express how Nix builds works.
+
+Check `build/protos/build.proto` for a detailed description of the individual
+fields, and the tests in `glue/src/tvix_build.rs` for some examples.
+
+The following sections describe some aspects of Nix builds, and how this is
+(planned to be) implemented with the Tvix Build protocol.
+
+## Reference scanning
+At the end of a build, Nix does scan a store path for references to other store
+paths (*out of the set of all store paths present during the build*).
+It does do this by (only) looking for a list of nixbase32-encoded hashes in
+filenames (?), symlink targets and blob contents.
+
+While we could do this entirely outside the builder protocol, it'd mean a build
+client would be required to download the produced outputs locally, and do the
+refscan there. This is undesireable, as the builder already has all produced
+outputs locally, and it'd make more sense for it do do it.
+
+Instead, we want to describe reference scanning in a generic fashion.
+
+One proposed way to do this is to add an additional field `refscan_needles` to
+the `BuildRequest` message.
+If this is an non-empty list, all paths in `outputs` are scanned for these.
+
+The `Build` response message would then be extended with an `outputs_needles`
+field, containing the same number of elements as the existing `outputs` field.
+In there, we'd have a list of numbers, indexing into `refscan_needles`
+originally specified.
+
+For Nix, `refscan_needles` would be populated with the nixbase32 hash parts of
+every input store path and output store path. The latter is necessary to scan
+for references between multi-output derivations.
+
+This is sufficient to construct the referred store paths in each build output on
+the build client.