-rw-r--r--  tvix/docs/component-flow.puml | 20
-rw-r--r--  tvix/docs/components.md | 74
2 files changed, 63 insertions, 31 deletions
diff --git a/tvix/docs/component-flow.puml b/tvix/docs/component-flow.puml
index 3bcddbe7464e..5b6d79b82313 100644
--- a/tvix/docs/component-flow.puml
+++ b/tvix/docs/component-flow.puml
@@ -28,7 +28,7 @@ note right
  Immediately starts streaming derivations as they are instantiated across
  the dependency graph so they can be built while the evaluation is still running.

- There are two types of build requests: One for regular "fire and forget" builds
+ There are two types of build requests: One for regular "fire and forget" builds,
  and another for IFD (import from derivation).

  These are distinct because IFD needs to be fed back into the evaluator for
@@ -42,27 +42,13 @@ loop while has more derivations
  Coord<--Store: Success response
  else Store does not have path
  Coord-->Build: Request derivation to be built
- note left
- The build request optionally includes a desired store.
- If a builder is aware of how to push to the store it will do so
- directly when the build is finished.
-
- If the store is not known by the builder results will be streamed
- back to the coordinator for store addition.
- end note
  alt Build failure
  Coord<--Build: Fail response
  note left: It's up to the coordinator whether to exit on build failure
  else Build success
- alt Known store
- Build-->Store: Push outputs to store
- Build<--Coord: Send success & pushed response
- else Unknown store
- Build<--Coord: Send success & not pushed response
- Coord<--Build: Stream build outputs
- Coord-->Store: Push outputs to store
- end
+ Build-->Store: Push outputs to store
+ Build<--Coord: Send success & pushed response
  end
  end

diff --git a/tvix/docs/components.md b/tvix/docs/components.md
index 19e7baa3ec8a..a7d61948c2fa 100644
--- a/tvix/docs/components.md
+++ b/tvix/docs/components.md
@@ -63,12 +63,14 @@ to generate configuration without any build or store involvement.
 command itself. We give it filesystem access to handle things like
 imports or `builtins.readFile`.

-In the future, we might abstract away raw filesystem access by
-allowing the evaluator to request files from the coordinator (which
-will query the store for it). This might get messy, and the benefits
-are questionable. We might be okay with running the evaluator with
-filesystem access for now and can extend the interface if the need
-arises.
+To support IFD, the Evaluator also needs access to store paths. This
+could be implemented by having the coordinator provide an interface to retrieve
+files from a store path, or by ensuring a "realized version of the store" is
+accessible by the evaluator (this could be a FUSE filesystem, or the "real"
+/nix/store on disk.
+
+We might be okay with running the evaluator with filesystem access for now and
+can extend the interface if the need arises.

 ## Builder

@@ -95,20 +97,64 @@ dominant Linux containerisation technology, by default. With a
 well-defined builder abstraction, it's also easy to imagine other
 backends such as a Kubernetes-based one in the future.

+The environment in which builds happen is currently very Nix-specific. We might
+want to avoid having to maintain all the intricacies of a Nix-specific
+sandboxing environment in every builder, and instead only provide a more
+generic interface, receiving build requests (and have the coordinator translate
+derivations to that format). [^1]
+
+To build, the builder needs to be able to mount all build inputs into the build
+environment. For this, it needs the store to expose a filesystem interface.
+
 ## Store

 *Purpose:* Store takes care of storing build results. It provides a
-unified interface to get file paths and upload new ones.
+unified interface to get store paths and upload new ones, as well as querying
+for the existence of a store path and its metadata (references, signatures, …).
+
+Tvix natively uses an improved store protocol. Instead of transferring around
+NAR files, which don't provide an index and don't allow seekable access, a
+concept similar to git tree hashing is used.
+
+This allows more granular substitution, chunk reusage and parallel download of
+individual files, reducing bandwidth usage.
+As these chunks are content-addressed, it opens up the potential for
+peer-to-peer trustless substitution of most of the data, as long as we sign the
+root of the index.
+
+Tvix still keeps the old-style signatures, NAR hashes and NAR size around. In
+the case of NAR hash / NAR size, this data is strictly required in some cases.
+The old-style signatures are valuable for communication with existing
+implementations.

-Most likely, we will end up with multiple implementations of store, a
-few possible ones that come to mind are:
+Old-style binary caches (like cache.nixos.org) can still be exposed via the new
+interface, by doing on-the-fly (re)chunking/ingestion.

-- Local
-- SSH
-- GCP
-- S3
-- Ceph
+Most likely, there will be multiple implementations of store, some storing
+things locally, some exposing a "remote view".
+
+A few possible ones that come to mind are:
+
+- Local store
+- SFTP/ GCP / S3 / HTTP
+- NAR/NARInfo protocol: HTTP, S3
+
+A remote Tvix store can be connected by simply connecting to its gRPC
+interface, possibly using SSH tunneling, but there doesn't need to be an
+additional "wire format" like the Nix `ssh(+ng)://` protocol.
+
+Settling on one interface allows composition of stores, meaning it becomes
+possible to express substitution from remote caches as a proxy layer.
+
+It'd also be possible to write a FUSE implementation on top of the RPC
+interface, exposing a lazily-substituting /nix/store mountpoint. Using this in
+remote build context dramatically reduces the amount of data transferred to a
+builder, as only the files really accessed during the build are substituted.

 # Figures

 ![component flow](./component-flow.svg)
+
+[^1]: There have already been some discussions in the Nix community, to switch
+ to REAPI:
+ https://discourse.nixos.org/t/a-proposal-for-replacing-the-nix-worker-protocol/20926/22
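
Note on the new Store text: the "concept similar to git tree hashing" can be illustrated with a minimal Python sketch. This is not Tvix's actual protocol or encoding. SHA-256, the `blob`/`tree` prefixes, the in-memory `store` dict, and the names `blob_digest` and `tree_digest` are all assumptions made for the example. It only shows why content addressing gives per-file reuse and a single root digest that could be signed.

```python
import hashlib


def blob_digest(data: bytes) -> str:
    # Content address of a single file (hypothetical encoding: "blob\0" + bytes).
    return hashlib.sha256(b"blob\0" + data).hexdigest()


def tree_digest(tree: dict, blobs: dict) -> str:
    """Hash a directory given as {name: bytes | dict}; record file blobs in `blobs`.

    A directory's digest covers the sorted (kind, name, digest) entries of its
    children, so the root digest changes whenever anything below it changes.
    """
    entries = []
    for name in sorted(tree):
        node = tree[name]
        if isinstance(node, dict):
            entries.append(f"dir {name} {tree_digest(node, blobs)}")
        else:
            digest = blob_digest(node)
            blobs[digest] = node  # identical content is stored only once
            entries.append(f"file {name} {digest}")
    return hashlib.sha256(("tree\0" + "\n".join(entries)).encode()).hexdigest()


# Two versions of a made-up store path that differ in a single file.
store = {}
v1 = {"bin": {"hello": b"#!/bin/sh\necho hello\n"}, "share": {"doc": b"manual v1"}}
v2 = {"bin": {"hello": b"#!/bin/sh\necho hello\n"}, "share": {"doc": b"manual v2"}}
root_v1 = tree_digest(v1, store)
root_v2 = tree_digest(v2, store)
```

In this sketch the two roots differ, but the `bin` subtree hashes identically in both versions and `store` holds only three blobs, so a client that already has `v1` would need to fetch just the changed file. A signature over the root digest would then cover everything reachable from it.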