-rw-r--r-- tvix/docs/component-flow.puml | 20
-rw-r--r-- tvix/docs/components.md | 74
2 files changed, 63 insertions, 31 deletions
diff --git a/tvix/docs/component-flow.puml b/tvix/docs/component-flow.puml
index 3bcddbe7464e..5b6d79b82313 100644
--- a/tvix/docs/component-flow.puml
+++ b/tvix/docs/component-flow.puml
@@ -28,7 +28,7 @@ note right
     Immediately starts streaming derivations as they are instantiated across
     the dependency graph so they can be built while the evaluation is still running.
 
-    There are two types of build requests: One for regular "fire and forget" builds
+    There are two types of build requests: One for regular "fire and forget" builds,
     and another for IFD (import from derivation).
 
     These are distinct because IFD needs to be fed back into the evaluator for
@@ -42,27 +42,13 @@ loop while has more derivations
         Coord<--Store: Success response
     else Store does not have path
         Coord-->Build: Request derivation to be built
-        note left
-            The build request optionally includes a desired store.
-            If a builder is aware of how to push to the store it will do so
-            directly when the build is finished.
-
-            If the store is not known by the builder results will be streamed
-            back to the coordinator for store addition.
-        end note
 
         alt Build failure
             Coord<--Build: Fail response
             note left: It's up to the coordinator whether to exit on build failure
         else Build success
-            alt Known store
-                Build-->Store: Push outputs to store
-                Build<--Coord: Send success & pushed response
-            else Unknown store
-                Build<--Coord: Send success & not pushed response
-                Coord<--Build: Stream build outputs
-                Coord-->Store: Push outputs to store
-            end
+            Build-->Store: Push outputs to store
+            Build<--Coord: Send success & pushed response
         end
 
     end
diff --git a/tvix/docs/components.md b/tvix/docs/components.md
index 19e7baa3ec8a..a7d61948c2fa 100644
--- a/tvix/docs/components.md
+++ b/tvix/docs/components.md
@@ -63,12 +63,14 @@ to generate configuration without any build or store involvement.
 command itself. We give it filesystem access to handle things like
 imports or `builtins.readFile`.
 
-In the future, we might abstract away raw filesystem access by
-allowing the evaluator to request files from the coordinator (which
-will query the store for it). This might get messy, and the benefits
-are questionable. We might be okay with running the evaluator with
-filesystem access for now and can extend the interface if the need
-arises.
+To support IFD, the Evaluator also needs access to store paths. This
+could be implemented by having the coordinator provide an interface to retrieve
+files from a store path, or by ensuring a "realized version of the store" is
+accessible by the evaluator (this could be a FUSE filesystem, or the "real"
+/nix/store on disk).
+
+We might be okay with running the evaluator with filesystem access for now and
+can extend the interface if the need arises.
 
 ## Builder
 
@@ -95,20 +97,64 @@ dominant Linux containerisation technology, by default.
 With a well-defined builder abstraction, it's also easy to imagine
 other backends such as a Kubernetes-based one in the future.
 
+The environment in which builds happen is currently very Nix-specific. We might
+want to avoid having to maintain all the intricacies of a Nix-specific
+sandboxing environment in every builder, and instead only provide a more
+generic interface that receives build requests, with the coordinator
+translating derivations to that format. [^1]
+
+To build, the builder needs to be able to mount all build inputs into the build
+environment. For this, it needs the store to expose a filesystem interface.
+
 ## Store
 
 *Purpose:* Store takes care of storing build results. It provides a
-unified interface to get file paths and upload new ones.
+unified interface to get store paths and upload new ones, as well as to query
+for the existence of a store path and its metadata (references, signatures, …).
+
+Tvix natively uses an improved store protocol. Instead of transferring around
+NAR files, which don't provide an index and don't allow seekable access, a
+concept similar to git tree hashing is used.
+
+This allows more granular substitution, chunk reuse and parallel download of
+individual files, reducing bandwidth usage.
+As these chunks are content-addressed, this opens up the potential for
+peer-to-peer trustless substitution of most of the data, as long as we sign the
+root of the index.
+
+Tvix still keeps the old-style signatures, NAR hashes and NAR size around. In
+the case of NAR hash / NAR size, this data is strictly required in some cases.
+The old-style signatures are valuable for communication with existing
+implementations.
 
-Most likely, we will end up with multiple implementations of store, a
-few possible ones that come to mind are:
+Old-style binary caches (like cache.nixos.org) can still be exposed via the new
+interface, by doing on-the-fly (re)chunking/ingestion.
 
-- Local
-- SSH
-- GCP
-- S3
-- Ceph
+Most likely, there will be multiple implementations of store, some storing
+things locally, some exposing a "remote view".
+
+A few possible ones that come to mind are:
+
+- Local store
+- SFTP / GCP / S3 / HTTP
+- NAR/NARInfo protocol: HTTP, S3
+
+A remote Tvix store can be used by simply connecting to its gRPC
+interface, possibly through an SSH tunnel; there doesn't need to be an
+additional "wire format" like the Nix `ssh(+ng)://` protocol.
+
+Settling on one interface allows composition of stores, meaning it becomes
+possible to express substitution from remote caches as a proxy layer.
+
+It'd also be possible to write a FUSE implementation on top of the RPC
+interface, exposing a lazily-substituting /nix/store mountpoint. Using this in
+a remote build context would dramatically reduce the amount of data transferred
+to a builder, as only the files actually accessed during the build are substituted.
 
 # Figures
 
 ![component flow](./component-flow.svg)
+
+[^1]: There have already been some discussions in the Nix community about
+  switching to REAPI:
+  https://discourse.nixos.org/t/a-proposal-for-replacing-the-nix-worker-protocol/20926/22
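
Note on the store composition described in the components.md change: the idea of expressing substitution from remote caches as a proxy layer could be sketched roughly as below. This is only an illustration under stated assumptions. The `Store` trait, `MemoryStore`, and `SubstitutingStore` are hypothetical names invented here, the interface is reduced to whole-path `get`/`put`, and none of it reflects the actual Tvix gRPC interface or its chunked, content-addressed protocol.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified store interface (an assumption for this
/// sketch, not the real Tvix store API).
trait Store {
    /// Return the contents stored under `path`, if present.
    fn get(&mut self, path: &str) -> Option<Vec<u8>>;
    /// Upload contents under `path`.
    fn put(&mut self, path: &str, contents: Vec<u8>);
}

/// In-memory store standing in for any backend (local disk, remote cache, ...).
#[derive(Default)]
struct MemoryStore {
    paths: HashMap<String, Vec<u8>>,
}

impl Store for MemoryStore {
    fn get(&mut self, path: &str) -> Option<Vec<u8>> {
        self.paths.get(path).cloned()
    }

    fn put(&mut self, path: &str, contents: Vec<u8>) {
        self.paths.insert(path.to_string(), contents);
    }
}

/// Proxy layer: answer from `local` when possible, otherwise substitute from
/// `remote` and keep a copy in `local` for later requests.
struct SubstitutingStore<L: Store, R: Store> {
    local: L,
    remote: R,
}

impl<L: Store, R: Store> Store for SubstitutingStore<L, R> {
    fn get(&mut self, path: &str) -> Option<Vec<u8>> {
        if let Some(hit) = self.local.get(path) {
            return Some(hit);
        }
        // Miss in the local store: fall back to the remote one.
        let fetched = self.remote.get(path)?;
        self.local.put(path, fetched.clone());
        Some(fetched)
    }

    fn put(&mut self, path: &str, contents: Vec<u8>) {
        // New paths are only written to the local store in this sketch.
        self.local.put(path, contents);
    }
}
```

Because `SubstitutingStore` itself implements `Store`, layers can be nested (for example a local store in front of a proxy in front of a binary cache), which is the kind of composition a single shared interface makes possible.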