about summary refs log tree commit diff
diff options
context:
space:
mode:
-rw-r--r--tvix/eval/docs/build-references.md87
1 files changed, 83 insertions, 4 deletions
diff --git a/tvix/eval/docs/build-references.md b/tvix/eval/docs/build-references.md
index 788b25c2be6c..badcea11550e 100644
--- a/tvix/eval/docs/build-references.md
+++ b/tvix/eval/docs/build-references.md
@@ -97,6 +97,9 @@ C++ Nix has several builtins that interface directly with string contexts:
 * `appendContext`: adds a given string context to the string in the same format
   as returned by `getContext`
 
+Most of the string manipulation operations will propagate the context to the
+result based on their parameters' contexts.
+
 ## Placeholders
 
 C++ Nix has `builtins.placeholder`, which given the name of an output (e.g.
@@ -111,16 +114,24 @@ passes etc.
 
 ## Tvix & string contexts
 
-Tvix does not track string contexts in its evaluator at all. Instead we are
-investigating implementing a system which allows us to drop string contexts in
-favour of reference scanning derivation attributes.
+In the past, Tvix did not track string contexts in its evaluator at all, see
+the historical section for more information about that.
+
+Tvix tracks string contexts in every `NixString` structure via a
+`HashSet<BuildReference>` and offers an API to combine the references while
+keeping the exact internal structure of that data private.
+
+## Historical attempt: Persistent reference tracking
+
+We were investigating implementing a system which allows us to drop string
+contexts in favour of reference scanning derivation attributes.
 
 This means that instead of maintaining and passing around a string context data
 structure in eval, we maintain a data structure of *known paths* from the same
 evaluation elsewhere in Tvix, and scan each derivation attribute against this
 set of known paths when instantiating derivations.
 
-Until proven otherwise, we take the stance that the system of string contexts as
+We believed we could take the stance that the system of string contexts as
 implemented in C++ Nix is likely an implementation detail that should not be
 leaking to the language surface as it does now.
 
@@ -173,3 +184,71 @@ discrepancies between Tvix and C++ Nix, which can make initial comparisons of
 derivations generated by the two systems difficult. If this occurs we need to
 discuss how to approach it, but initially we will implement the mutating
 builtins as no-ops.
+
+### Why this did not work for us?
+
+Nix has a feature to perform environmental checks of your derivation, e.g.
+"these derivation outputs should not be referenced in this derivation", this was
+introduced in Nix 2.2 by
+https://github.com/NixOS/nix/commit/3cd15c5b1f5a8e6de87d5b7e8cc2f1326b420c88.
+
+Unfortunately, this feature introduced a very unfortunate and critical bug: all
+usage of this feature with contextful strings will actually force the
+derivation to depend at least at build time on those specific paths, see
+https://github.com/NixOS/nix/issues/4629.
+
+For example, if you wanted to `disallowedReferences` to a package and you used a
+derivation as a path, you would actually register that derivation as a input
+derivation of that derivation.
+
+This bug is still unfixed in Nix and it seems that fixing it would require
+introducing different ways to evaluate Nix derivations to preserve the
+output path calculation for Nix expressions so far.
+
+All of this would be fine if the bug behavior was uniform in the sense that no
+one tried to force-workaround it. Since Nixpkgs 23.05, due to
+https://github.com/NixOS/nixpkgs/pull/211783 this is not true anymore.
+
+If you let nixpkgs be the disjoint union of bootstrapping derivations $A$ and
+`stdenv.mkDerivation`-built derivations $B$.
+
+$A$ suffers from the bug and $B$ doesn't by the forced usage of
+`unsafeDiscardStringContext` on those special checking fields.
+
+This means that to build hash-compatible $A$ **and** $B$, we need to
+distinguish $A$ and $B$. A lot of hacks could be imagined to support this
+problem.
+
+Let's assume we have a solution to that problem, it means that we are able to
+detect implicitly when a set of specific fields are
+`unsafeDiscardStringContext`-ed.
+
+Thus, we could use that same trick to implement `unsafeDiscardStringContext`
+entirely for all fields actually.
+
+Now, to implement `unsafeDiscardStringContext` in the persistent reference
+tracking model, you will need to store a disallowed list of strings that should
+not trigger a reference when we are scanning a derivation parameters.
+
+But assume you have something like:
+
+```nix
+derivation {
+   buildInputs = [
+     stdenv.cc
+   ];
+
+   disallowedReferences = [ stdenv.cc ];
+}
+```
+
+If you unregister naively the `stdenv.cc` reference, it will silence the fact
+that it is part of the `buildInputs`, so you will observe that Nix will fail
+the derivation during environmental check, but Tvix would silently force remove
+that reference.
+
+Until proven otherwise, it seems highly difficult to have the fine-grained
+information to prevent reference tracking of those specific fields. It is not a
+failure of the persistent reference tracking, it is an unresolved critical bug
+of Nix that only nixpkgs really workarounded for `stdenv.mkDerivation`-based
+derivations.