about summary refs log tree commit diff
diff options
context:
space:
mode:
authorsterni <sternenseemann@systemli.org>2022-11-23T18·51+0100
committersterni <sternenseemann@systemli.org>2022-11-25T10·27+0000
commit094f07ad9ab1d8afb44f1e98bca639c5a1fdd0e9 (patch)
treed69bb35f0e31399ce6c8dcb49d29346d2eee1a89
parent344d3050f3432d775e60704dd1032d1398179291 (diff)
docs(tvix): document pointer equality in (C++) Nix r/5317
semi-obscure = #nix-lang on freenode

Change-Id: Id235f7019bbee137bbde57a552c755261e23b58f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7368
Reviewed-by: tazjin <tazjin@tvl.su>
Reviewed-by: Adam Joseph <adam@westernsemico.com>
Tested-by: BuildkiteCI
-rw-r--r--tvix/docs/value-pointer-equality.md202
1 files changed, 202 insertions, 0 deletions
diff --git a/tvix/docs/value-pointer-equality.md b/tvix/docs/value-pointer-equality.md
new file mode 100644
index 0000000000..a57273c6b5
--- /dev/null
+++ b/tvix/docs/value-pointer-equality.md
@@ -0,0 +1,202 @@
+# Value Pointer Equality in Nix
+
+## Introduction
+
+It is a piece of semi-obscure Nix trivia that while functions are generally not
+comparable, they can be compared in certain situations. This is actually quite an
+important fact, as it is essential for the evaluation of nixpkgs: The attribute sets
+used to represent platforms in nixpkgs, like `stdenv.buildPlatform`,  contain functions,
+such as `stdenv.buildPlatform.canExecute`. When writing cross logic, one invariably
+ends up writing expressions that compare these sets, e.g. `stdenv.buildPlatform !=
+stdenv.hostPlatform`. Since attribute set equality is the equality of their attribute
+names and values, we also end up comparing the functions within them.  We can summarize
+the relevant part of this behavior for platform comparisons in the following (true)
+Nix expressions:
+
+* `stdenv.hostPlatform.canExecute != stdenv.hostPlatform.canExecute`
+* `stdenv.hostPlatform == stdenv.hostPlatform`
+
+This fact is commonly referred to as pointer equality of functions (or function pointer
+equality) which is not an entirely accurate name, as we'll see. This account of the
+behavior states that, while functions are incomparable in general, they are comparable
+insofar, as they occupy the same spot in an attribute set.
+
+However, [a maybe lesser known trick][puck-issue] is to write a function such as the
+following to allow comparing functions:
+
+```nix
+let
+  pointerEqual = lhs: rhs: { x = lhs; } == { x = rhs; };
+
+  f = name: "Hello, my name is ${name}";
+  g = name: "Hello, my name is ${name}";
+in
+[
+  (pointerEqual f f) # => true
+  (pointerEqual f g) # => false
+]
+```
+
+Here, clearly, the function is not contained at the same position in one and the same
+attribute set, but at the same position in two entirely different attribute sets. We can
+also see that we are not comparing the functions themselves (e.g. their AST), but
+rather if they are the same individual value (i.e. pointer equal).
+
+So what is _actually_ going on?
+
+## Nix (pointer) Equality in C++ Nix
+
+TIP: The summary presented here is up to date as of 2022-11-23 and was tested with Nix 2.3 and 2.11.
+
+The function implementing equality in C++ Nix is `EvalState::eqValues` which starts with
+[the following bit of code][eqValues-pointer-eq]:
+
+```cpp
+bool EvalState::eqValues(Value & v1, Value & v2)
+{
+    forceValue(v1);
+    forceValue(v2);
+
+    /* !!! Hack to support some old broken code that relies on pointer
+       equality tests between sets.  (Specifically, builderDefs calls
+       uniqList on a list of sets.)  Will remove this eventually. */
+    if (&v1 == &v2) return true;
+```
+
+So this immediately looks more like pointer equality of arbitrary *values* instead of functions. In fact
+there is [no special code facilitating function equality][eqValues-function-eq]:
+
+```cpp
+        /* Functions are incomparable. */
+        case nFunction:
+            return false;
+```
+
+So one takeaway of this is that pointer equality is neither dependent on functions nor attribute sets.
+In fact, we can also write our `pointerEqual` function as:
+
+```nix
+lhs: rhs: [ lhs ] == [ rhs ]
+```
+
+It's interesting that `EvalState::eqValues` forces the left and right hand value before trying pointer
+equality. It explains that `let x = throw ""; in x == x` does not evaluate successfully, but it is puzzling why
+`let f = x: x; in f == f` does not return `true`. In fact, why do we need to wrap the values in a list or
+attribute set at all for our `pointerEqual` function to work?
+
+The answer lies in [the code that evaluates `ExprOpEq`][ExprOpEq],
+i.e. an expression involving the `==` operator:
+
+```cpp
+void ExprOpEq::eval(EvalState & state, Env & env, Value & v)
+{
+    Value v1; e1->eval(state, env, v1);
+    Value v2; e2->eval(state, env, v2);
+    v.mkBool(state.eqValues(v1, v2));
+}
+```
+
+As you can see, two _distinct_ `Value` structs are created, so they can never be pointer equal even
+if the `union` inside points to the same bit of memory. We can thus understand what actually happens
+when we check the equality of an attribute set (or list), by looking at the following expression:
+
+```nix
+let
+  x = { name = throw "nameless"; };
+in
+
+x == x # => causes an evaluation error
+```
+
+Because `x` can't be pointer equal, as it'll end up in the distinct structs `v1` and `v2`, it needs to be compared
+by value. For this reason, the `name` attribute will be forced and an evaluation error caused.
+If we rewrite the expression to use…
+
+```nix
+{ inherit x; } == { inherit x; } # => true
+```
+
+…, it'll work: The two attribute sets are compared by value, but their `x` attribute turns out to be pointer
+equal _after_ forcing it. This does not throw, since forcing an attribute set does not force its attributes'
+values (as forcing a list doesn't force its elements).
+
+As we have seen, pointer equality can not only be used to compare function values, but also other
+otherwise incomparable values, such as lists and attribute sets that would cause an evaluation
+error if they were forced recursively. We can even switch out the `throw` for an `abort`. The limitation is
+of course that we need to use a value that behaves differently depending on whether it is forced
+“normally” (think `builtins.seq`) or recursively (think `builtins.deepSeq`), so thunks will generally be
+evaluated before pointer equality can kick into effect.
+
+## Summary
+
+When comparing two Nix values, we must force both of them (non-recursively!), but are
+allowed to short-circuit the comparison based on pointer equality, i.e. if they are at
+the same exact value in memory, they are deemed equal immediately. This is completely
+independent of what type of value they are. If they are not pointer equal, they are
+(recursively) compared by value as expected.
+
+However, when evaluating the Nix expression `a == b`, we *must* invoke our implementation's
+value equality function in a way that `a` and `b` themselves can never be deemed pointer equal.
+Any values we encounter while recursing during the equality check must be compared by
+pointer as described above, though.
+
+## Other Comparisons
+
+The `!=` operator uses `EvalState::eqValues` internally as well, so it behaves exactly as `!(a == b)`.
+
+The `>`, `<`, `>=` and `<=` operators all desugar to [CompareValues][] eventually
+which generally looks at the value type before comparing. It does, however, rely on
+`EvalState::eqValues` for list comparisons, so it is possible to compare lists with
+e.g. functions in them, as long as they are equal by pointer:
+
+```nix
+let
+  f = x: x + 42;
+in
+
+[
+  ([ f 2 ] > [ f 1 ]) # => true
+  ([ f 2 ] > [ (x: x) 1]) # => error: cannot compare a function with a function
+]
+```
+
+Finally, since `builtins.elem` relies on `EvalState::eqValues`, you can check for
+a function by pointer equality:
+
+```nix
+let
+  f = x: f x;
+in
+builtins.elem f [ f 2 3 ] # => true
+```
+
+## Stability of the Feature
+
+Keen readers will have noticed the following comment in the C++ Nix source code,
+indicating that pointer comparison may be removed in the future.
+
+```cpp
+    /* !!! Hack to support some old broken code that relies on pointer
+       equality tests between sets.  (Specifically, builderDefs calls
+       uniqList on a list of sets.)  Will remove this eventually. */
+```
+
+Now, I can't speak for the upstream C++ Nix developers, but sure can speculate.
+As already pointed out, this feature is currently needed for evaluating nixpkgs.
+While it's use could realistically be eliminated (only bothersome spot is probably
+the `emulator` function, but that should also be doable), removing the feature
+would seriously compromise C++ Nix's ability to evaluate historical nixpkgs
+revision which is arguably a strength of the system.
+
+Another indication that it is likely here to stay is that it has already
+[outlived builderDefs][], even though
+it was (apparently) reintroduced just for this use case. More research into
+the history of this feature would still be prudent, especially the reason for
+its original introduction (maybe performance?).
+
+[puck-issue]: https://github.com/NixOS/nix/issues/3371
+[eqValues-pointer-eq]: https://github.com/NixOS/nix/blob/05d0892443bbe92a6b6a1ee7b1d37ea05782d918/src/libexpr/eval.cc#L2342-L2350
+[eqValues-function-eq]: https://github.com/NixOS/nix/blob/05d0892443bbe92a6b6a1ee7b1d37ea05782d918/src/libexpr/eval.cc#L2405-L2407
+[ExprOpEq]: https://github.com/NixOS/nix/blob/05d0892443bbe92a6b6a1ee7b1d37ea05782d918/src/libexpr/eval.cc#L1856-L1861
+[outlived builderDefs]: https://github.com/NixOS/nixpkgs/issues/4210
+[CompareValues]: https://github.com/NixOS/nix/blob/master/src/libexpr/primops.cc#L536-L574