diff options
author | Florian Klink <flokli@flokli.de> | 2024-06-13T19·17+0300 |
---|---|---|
committer | clbot <clbot@tvl.fyi> | 2024-06-14T08·00+0000 |
commit | 5077ca70deb8ca8e84abb9608e08bf4485d3ec4b (patch) | |
tree | 7396411393a5b88b25d3adb08adcd3a5478e8e21 /tvix/eval | |
parent | 6947dc4349fa85cb702f46acfe3255c907096b12 (diff) |
chore(tvix/eval): move eval docs to tvix/docs r/8270
Change-Id: I75b33c43456389de6e521b4f0ad46d68bc9e98f6 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11809 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
Diffstat (limited to 'tvix/eval')
-rw-r--r-- | tvix/eval/docs/abandoned/thread-local-vm.md | 233 | ||||
-rw-r--r-- | tvix/eval/docs/bindings.md | 133 | ||||
-rw-r--r-- | tvix/eval/docs/build-references.md | 254 | ||||
-rw-r--r-- | tvix/eval/docs/builtins.md | 138 | ||||
-rw-r--r-- | tvix/eval/docs/catchable-errors.md | 131 | ||||
-rw-r--r-- | tvix/eval/docs/known-optimisation-potential.md | 162 | ||||
-rw-r--r-- | tvix/eval/docs/language-issues.md | 46 | ||||
-rw-r--r-- | tvix/eval/docs/opcodes-attrsets.md | 122 | ||||
-rw-r--r-- | tvix/eval/docs/recursive-attrs.md | 68 | ||||
-rw-r--r-- | tvix/eval/docs/vm-loop.md | 315 |
10 files changed, 0 insertions, 1602 deletions
diff --git a/tvix/eval/docs/abandoned/thread-local-vm.md b/tvix/eval/docs/abandoned/thread-local-vm.md deleted file mode 100644 index c6a2d5e07e5c..000000000000 --- a/tvix/eval/docs/abandoned/thread-local-vm.md +++ /dev/null @@ -1,233 +0,0 @@ -# We can't have nice things because IFD - -The thread-local VM work below was ultimately not merged because it -was decided that it would be harmful for `tvix::eval::Value` to -implement `Eq`, `Hash`, or any of the other `std` traits. - -Implementing `std` traits on `Value` was deemed harmful because IFD -can cause arbitrary amounts of compilation to occur, including -network transactions with builders. Obviously it would be -unexpected and error-prone to have a `PartialEq::eq()` which does -something like this. This problem does not manifest within the -"nixpkgs compatibility only" scope, or in any undeprecated language -feature other than IFD. Although IFD is outside the "nixpkgs -compatibility scope", it [has been added to the TVL compatibility -scope](https://cl.tvl.fyi/c/depot/+/7193/comment/3418997b_0dbd0b65/). - -This was the sole reason for not merging. - -The explanation below may be useful in case future circumstances -affect the relevance of the reasoning above. - -The implementation can be found in these CLs: - -- [refactor(tvix/eval): remove lifetime parameter from VM<'o>](https://cl.tvl.fyi/c/depot/+/7194) -- [feat(tvix/eval): [FOUNDLING] thread-local VM](https://cl.tvl.fyi/c/depot/+/7195) -- [feat(tvix/eval): [FOUNDLING] VM::vm_xxx convenience methods](https://cl.tvl.fyi/c/depot/+/7196) -- [refactor(tvix/eval): [FOUNDLING]: drop explicit `&mut vm` parameter](https://cl.tvl.fyi/c/depot/+/7197) - -# Thread-local storage for tvix::eval::vm::VM - -## The problem - -`Value::force()` takes a `&mut VM` argument, since forcing a value -requires executing opcodes. This means that `Value::nix_eq()` too -must take a `&mut VM`, since any sensible definition of equality -will have to force thunks. - -Unfortunately Rust's `PartialEq::eq()` function does not accept any -additional arguments like this, so `Value` cannot implement -`PartialEq`. Worse, structs which *contain* `Value`s can't -implement `PartialEq` either. This means `Value`, and anything -containing it, cannot be the key for a `BTreeMap` or `HashMap`. We -can't even insert `Value`s into a `HashSet`! - -There are other situations like this that don't involve `PartialEq`, -but it's the most glaring one. The main problem is that you need a -`VM` in order to force thunks, and thunks can be anywhere in a -`Value`. - -## Solving the problem with thread-locals - -We could avoid threading the `&mut VM` through the entire codebase -by making it a thread-local. - -To do this without a performance hit, we need to use LLVM -thread-locals, which are the same cost as references to `static`s -but load relative to -[`llvm.threadlocal.address`][threadlocal-intrinsic] instead of -relative to the data segment. Unfortunately `#[thread_local]` [is -unstable][thread-local-unstable] and [unsafe in -general][thread-local-unsafe] for most of the cases where we would -want to use it. There is one [exception][tls-const-init], however: -if a `!thread_local()` has a `const` initializer, the compiler will -insert a `#[thread_local]`; this special case is both safe and -stable. - -The difficult decision is what the type of the thread-local should -be. Since you can't get a mutable reference to a `thread_local!()` -it will have to be some interior-mutability-bestowing wrapper around -our current `struct VM`. Here are the choices: - -### `RefCell<VM>` - -This is the obvious first choice, since it lets you borrow a -`RefMut<Target=VM>`. The problem here is that we want to keep the -codebase written such that all the functions in `impl VM` still take -a `&mut self`. This means that there will be an active mutable -borrow for the duration of `VM::call_builtin()`. So if we implement -`PartialEq` by having `eq()` attempt a second mutable borrow from -the thread-local storage, it will fail since there is already an -active borrow. - -The problem here is that you can't "unborrow" a `RefMut` except by -dropping it. There's no way around this. - -#### Problem: Uglification - -The only solution here is to rewrite all the functions in `impl VM` -so they don't take any kind of `self` argument, and then have them -do a short-lived `.borrow_mut()` from the thread-local `RefCell` -*separately, each time* they want to modify one of the fields of -`VM` (currently `frames`, `stack`, `with_stack`, `warnings`). This -means that if you had a code sequence like this: - -``` -impl VM { - fn foo(&mut self, ...) { - ... - self.frame().ip += 1; - self.some_other_method(); - self.frame().ip += 1; -``` - -You would need to add *two separate `borrow_mut()`s*, one for each -of the `self.frame().ip+=1` statements. You can't just do one big -`borrow_mut()` because `some_other_method()` will call -`borrow_mut()` and panic. - -#### Problem: Performance - -The `RefCell<VM>` approach also has a fairly huge performance hit, -because every single modification to any part of `VM` will require a -reference count increment/decrement, and a conditional branch based -on the check (which will never fail) that the `RefCell` isn't -already mutably borrowed. It will also impede a lot of rustc's -optimizations. - -### `Cell<VM>` - -This is a non-starter because it means that in order to mutate any -field of `VM`, you have to move the entire `struct VM` out of the -`Cell`, mutate it, and move it back in. - -### `Cell<Box<VM>>` - -Now we're getting warmer. Here, we can move the `Box<VM>` out of -the cell with a single pointer-sized memory access. - -We don't want to do the "uglification" described in the previous -section. We are very fortunate that, sometime in mid-2019, the Rust -dieties [decreed by fiat][fiat-decree] that `&Cell<T>` and `&mut T` -are bit-for-bit identical, and even gave us mortals safe wrappers -[`from_mut()`][from_mut] and [`get_mut()`][get_mut] around -`mem::transmute()`. - -So now, when a `VM` method (which takes `&mut self`) calls out to -some external code (like a builtin), instead of passing the `&mut -self` to the external code it can call `Cell::from_mut(&mut self)`, -and then `Cell::swap()` that into the thread-local storage cell for -the duration of the external code. After the external code returns, -it can `Cell::swap()` it back. This whole dance gets wrapped in a -lexical block, and the borrow checker sees that the `&Cell<Box<VM>>` -returned by `Cell::from_mut()` lives only until the end of the -lexical block, *so we get the `&mut self` back after the close-brace -for that block*. NLL FTW. This sounds like a lot of work, but it -should compile down to two pointer-sized loads and two pointer-sized -stores, and it is incurred basically only for `OpBuiltin`. - -This all works, with only two issues: - -1. `vm.rs` needs to be very careful to do the thread-local cell swap - dance before calling anything that might call `PartialEq::eq()` - (or any other method that expects to be able to pull the `VM` out - of thread-local storage). There is no compile-time check that we - did the dance in all the right places. If we forget to do the - dance somewhere we'll get a runtime panic from `Option::expect()` - (see next section). - -2. Since we need to call `Cell::from_mut()` on a `Box<VM>` rather - than a bare `VM`, we still need to rewrite all of `vm.rs` so that - every function takes a `&mut Box<VM>` instead of a `&mut self`. - This creates a huge amount of "noise" in the code. - -Fortunately, it turns out that nearly all the "noise" that arises -from the second point can be eliminated by taking advantage of -[deref coercions][deref-coercions]! This was the last "shoe to -drop". - -There is still the issue of having to be careful about calls from -`vm.rs` to things outside that file, but it's manageable. - -### `Cell<Option<Box<VM>>>` - -In order to get the "safe and stable `#[thread_local]`" -[exception][tls-const-init] we need a `const` initializer, which -means we need to be able to put something into the `Cell` that isn't -a `VM`. So the type needs to be `Cell<Option<Box<VM>>>`. - -Recall that you can't turn an `Option<&T>` into an `&Option<T>`. -The latter type has the "is this a `Some` or `None`" bit immediately -adjacent to the bits representing `T`. So if I hand you a `t:&T` -and you wrap it as `Some(t)`, those bits aren't adjacent in memory. -This means that all the VM methods need to operate on an -`Option<Box<VM>>` -- we can't just wrap a `Some()` around `&mut -self` "at the last minute" before inserting it into the thread-local -storage cell. Fortunately deref coercions save the day here too -- -the coercion is inferred through both layers (`Box` and `Option`) of -wrapper, so there is no additional noise in the code. - -Note that Rust is clever and can find some sequence of bits that -aren't a valid `T`, so `sizeof(Option<T>)==sizeof(T)`. And in fact, -`Box<T>` is one of these cases (and this is guaranteed). So the -`Option` has no overhead. - -# Closing thoughts, language-level support - -This would have been easier with language-level support. - -## What wouldn't help - -Although it [it was decreed][fiat-decree] that `Cell<T>` and `&mut -T` are interchangeable, a `LocalKey<Cell<T>>` isn't quite the same -thing as a `Cell<T>`, so it wouldn't be safe for the standard -library to contain something like this: - -``` -impl<T> LocalKey<Cell<T>> { - fn get_mut(&self) -> &mut T { - unsafe { - // ... mem::transmute() voodoo goes here ... -``` - -The problem here is that you can call `LocalKey<Cell<T>>::get_mut()` twice and -end up with two `&mut T`s that point to the same thing (mutable aliasing) which -results in undefined behavior. - -## What would help - -The ideal solution is for Rust to let you call arbitrary methods -`T::foo(&mut self...)` on a `LocalKey<Cell<T>>`. This way you can -have one (and only one) `&mut T` at any syntactical point in your -program -- the `&mut self`. - - -[tls-const-init]: https://github.com/rust-lang/rust/pull/90774 -[thread-local-unstable]: https://github.com/rust-lang/rust/issues/29594 -[thread-local-unsafe-generally]: https://github.com/rust-lang/rust/issues/54366 -[fiat-decree]: https://github.com/rust-lang/rust/issues/43038 -[from_mut]: https://doc.rust-lang.org/stable/std/cell/struct.Cell.html#method.from_mut -[get_mut]: https://doc.rust-lang.org/stable/std/cell/struct.Cell.html#method.get_mut -[thread-local-unsafe]: [https://github.com/rust-lang/rust/issues/54366] -[deref-coercions]: https://doc.rust-lang.org/book/ch15-02-deref.html#implicit-deref-coercions-with-functions-and-methods -[threadlocal-intrinsic]: https://llvm.org/docs/LangRef.html#llvm-threadlocal-address-intrinsic diff --git a/tvix/eval/docs/bindings.md b/tvix/eval/docs/bindings.md deleted file mode 100644 index 17867acfe9f9..000000000000 --- a/tvix/eval/docs/bindings.md +++ /dev/null @@ -1,133 +0,0 @@ -Compilation of bindings -======================= - -Compilation of Nix bindings is one of the most mind-bending parts of Nix -evaluation. The implementation of just the compilation is currently almost 1000 -lines of code, excluding the various insane test cases we dreamt up for it. - -## What is a binding? - -In short, any attribute set or `let`-expression. Tvix currently does not treat -formals in function parameters (e.g. `{ name ? "fred" }: ...`) the same as these -bindings. - -They have two very difficult features: - -1. Keys can mutually refer to each other in `rec` sets or `let`-bindings, - including out of definition order. -2. Attribute sets can be nested, and parts of one attribute set can be defined - in multiple separate bindings. - -Tvix resolves as much of this logic statically (i.e. at compile-time) as -possible, but the procedure is quite complicated. - -## High-level concept - -The idea behind the way we compile bindings is to fully resolve nesting -statically, and use the usual mechanisms (i.e. recursion/thunking/value -capturing) for resolving dynamic values. - -This is done by compiling bindings in several phases: - -1. An initial compilation phase *only* for plain inherit statements (i.e. - `inherit name;`), *not* for namespaced inherits (i.e. `inherit (from) - name;`). - -2. A declaration-only phase, in which we use the compiler's scope tracking logic - to calculate the physical runtime stack indices (further referred to as - "stack slots" or just "slots") that all values will end up in. - - In this phase, whenever we encounter a nested attribute set, it is merged - into a custom data structure that acts like a synthetic AST node. - - This can be imagined similar to a rewrite like this: - - ```nix - # initial code: - { - a.b = 1; - a.c = 2; - } - - # rewritten form: - { - a = { - b = 1; - c = 2; - }; - } - ``` - - The rewrite applies to attribute sets and `let`-bindings alike. - - At the end of this phase, we know the stack slots of all namespaces for - inheriting from, all values inherited from them, and all values (and - optionally keys) of bindings at the current level. - - Only statically known keys are actually merged, so any dynamic keys that - conflict will lead to a "key already defined" error at runtime. - -3. A compilation phase, in which all values (and, when necessary, keys) are - actually compiled. In this phase the custom data structure used for merging - is encountered when compiling values. - - As this data structure acts like an AST node, the process begins recursively - for each nested attribute set. - -At the end of this process we have bytecode that leaves the required values (and -optionally keys) on the stack. In the case of attribute sets, a final operation -is emitted that constructs the actual attribute set structure at runtime. For -`let`-bindings a final operation is emitted that removes these locals from the -stack when the scope ends. - -## Moving parts - -WARNING: This documents the *current* implementation. If you only care about the -conceptual aspects, see above. - -There's a few types involved: - -* `PeekableAttrs`: peekable iterator over an attribute path (e.g. `a.b.c`) -* `BindingsKind`: enum defining the kind of bindings (attrs/recattrs/let) -* `AttributeSet`: struct holding the bindings kind, the AST nodes with inherits - (both namespaced and not), and an internal representation of bindings - (essentially a vector of tuples of the peekable attrs and the expression to - compile for the value). -* `Binding`: enum describing the kind of binding (namespaced inherit, attribute - set, plain binding of *any other value type*) -* `KeySlot`: enum describing the location in which a key slot is placed at - runtime (nowhere, statically known value in a slot, dynamic value in a slot) -* `TrackedBinding`: struct representing statically known information about a - single binding (its key slot, value slot and `Binding`) -* `TrackedBindings`: vector of tracked bindings, which implements logic for - merging attribute sets together - -And quite a few methods on `Compiler`: - -* `compile_bindings`: entry point for compiling anything that looks like a - binding, this calls out to the functions below. -* `compile_plain_inherits`: takes all inherits of a bindings node and compiles - the ones that are trivial to compile (i.e. just plain inherits without a - namespace). The `rnix` parser does not represent namespaced/plain inherits in - different nodes, so this function also aggregates the namespaced inherits and - returns them for further use -* `declare_namespaced_inherits`: passes over all namespaced inherits and - declares them on the locals stack, as well as inserts them into the provided - `TrackedBindings` -* `declare_bindings`: declares all regular key/value bindings in a bindings - scope, but without actually compiling their keys or values. - - There's a lot of heavy lifting going on here: - - 1. It invokes the various pieces of logic responsible for merging nested - attribute sets together, creating intermediate data structures in the value - slots of bindings that can be recursively processed the same way. - 2. It decides on the key slots of expressions based on the kind of bindings, - and the type of expression providing the key. -* `bind_values`: runs the actual compilation of values. Notably this function is - responsible for recursively compiling merged attribute sets when it encounters - a `Binding::Set` (on which it invokes `compile_bindings` itself). - -In addition to these several methods (such as `compile_attr_set`, -`compile_let_in`, ...) invoke the binding-kind specific logic and then call out -to the functions above. diff --git a/tvix/eval/docs/build-references.md b/tvix/eval/docs/build-references.md deleted file mode 100644 index badcea11550e..000000000000 --- a/tvix/eval/docs/build-references.md +++ /dev/null @@ -1,254 +0,0 @@ -Build references in derivations -=============================== - -This document describes how build references are calculated in Tvix. Build -references are used to determine which store paths should be available to a -builder during the execution of a build (i.e. the full build closure of a -derivation). - -## String contexts in C++ Nix - -In C++ Nix, each string value in the evaluator carries an optional so-called -"string context". - -These contexts are themselves a list of strings that take one of the following -formats: - -1. `!<output_name>!<drv_path>` - - This format describes a build reference to a specific output of a derivation. - -2. `=<drv_path>` - - This format is used for a special case where a derivation attribute directly - refers to a derivation path (e.g. by accessing `.drvPath` on a derivation). - - Note: In C++ Nix this case is quite special and actually requires a - store-database query during evaluation. - -3. `<path>` - a non-descript store path input, usually a plain source file (e.g. - from something like `src = ./.` or `src = ./foo.txt`). - - In the case of `unsafeDiscardOutputDependency` this is used to pass a raw - derivation file, but *not* pull in its outputs. - -Lets introduce names for these (in the same order) to make them easier to -reference below: - -```rust -enum BuildReference { - /// !<output_name>!<drv_path> - SingleOutput(OutputName, DrvPath), - - /// =<drv_path> - DrvClosure(DrvPath), - - /// <path> - Path(StorePath), -} -``` - -String contexts are, broadly speaking, created whenever a string is the result -of a computation (e.g. string interpolation) that used a *computed* path or -derivation in any way. - -Note: This explicitly does *not* include simply writing a literal string -containing a store path (whether valid or not). That is only permitted through -the `storePath` builtin. - -## Derivation inputs - -Based on the data above, the fields `inputDrvs` and `inputSrcs` of derivations -are populated in `builtins.derivationStrict` (the function which -`builtins.derivation`, which isn't actually a builtin, wraps). - -`inputDrvs` is represented by a map of derivation paths to the set of their -outputs that were referenced by the context. - -TODO: What happens if the set is empty? Somebody claimed this means all outputs. - -`inputSrcs` is represented by a set of paths. - -These are populated by the above references as follows: - -* `SingleOutput` entries are merged into `inputDrvs` -* `Path` entries are inserted into `inputSrcs` -* `DrvClosure` leads to a special store computation (`computeFSClosure`), which - finds all paths referenced by the derivation and then inserts all of them into - the fields as above (derivations with _all_ their outputs) - -This is then serialised in the derivation and passed down the pipe. - -## Builtins interfacing with contexts - -C++ Nix has several builtins that interface directly with string contexts: - -* `unsafeDiscardStringContext`: throws away a string's string context (if - present) -* `hasContext`: returns `true`/`false` depending on whether the string has - context -* `unsafeDiscardOutputDependency`: drops dependencies on the *outputs* of a - `.drv` in the context, passing only the literal `.drv` itself - - Note: This is only used for special test-cases in nixpkgs, and deprecated Nix - commands like `nix-push`. -* `getContext`: returns the string context in serialised form as a Nix attribute - set -* `appendContext`: adds a given string context to the string in the same format - as returned by `getContext` - -Most of the string manipulation operations will propagate the context to the -result based on their parameters' contexts. - -## Placeholders - -C++ Nix has `builtins.placeholder`, which given the name of an output (e.g. -`out`) creates a hashed string representation of that output name. If that -string is used anywhere in input attributes, the builder will replace it with -the actual name of the corresponding output of the current derivation. - -C++ Nix does not use contexts for this, it blindly creates a rewrite map of -these placeholder strings to the names of all outputs, and runs the output -replacement logic on all environment variables it creates, attribute files it -passes etc. - -## Tvix & string contexts - -In the past, Tvix did not track string contexts in its evaluator at all, see -the historical section for more information about that. - -Tvix tracks string contexts in every `NixString` structure via a -`HashSet<BuildReference>` and offers an API to combine the references while -keeping the exact internal structure of that data private. - -## Historical attempt: Persistent reference tracking - -We were investigating implementing a system which allows us to drop string -contexts in favour of reference scanning derivation attributes. - -This means that instead of maintaining and passing around a string context data -structure in eval, we maintain a data structure of *known paths* from the same -evaluation elsewhere in Tvix, and scan each derivation attribute against this -set of known paths when instantiating derivations. - -We believed we could take the stance that the system of string contexts as -implemented in C++ Nix is likely an implementation detail that should not be -leaking to the language surface as it does now. - -### Tracking "known paths" - -Every time a Tvix evaluation does something that causes a store interaction, a -"known path" is created. On the language surface, this is the result of one of: - -1. Path literals (e.g. `src = ./.`). -2. Calls to `builtins.derivationStrict` yielding a derivation and its output - paths. -3. Calls to `builtins.path`. - -Whenever one of these occurs, some metadata that persists for the duration of -one evaluation should be created in Nix. This metadata needs to be available in -`builtins.derivationStrict`, and should be able to respond to these queries: - -1. What is the set of all known paths? (used for e.g. instantiating an - Aho-Corasick type string searcher) -2. What is the _type_ of a path? (derivation path, derivation output, source - file) -3. What are the outputs of a derivation? -4. What is the derivation of an output? - -These queries will need to be asked of the metadata when populating the -derivation fields. - -Note: Depending on how we implement `builtins.placeholder`, it might be useful -to track created placeholders in this metadata, too. - -### Context builtins - -Context-reading builtins can be implemented in Tvix by adding `hasContext` and -`getContext` with the appropriate reference-scanning logic. However, we should -evaluate how these are used in nixpkgs and whether their uses can be removed. - -Context-mutating builtins can be implemented by tracking their effects in the -value representation of Tvix, however we should consider not doing this at all. - -`unsafeDiscardOutputDependency` should probably never be used and we should warn -or error on it. - -`unsafeDiscardStringContext` is often used as a workaround for avoiding IFD in -inconvenient places (e.g. in the TVL depot pipeline generation). This is -unnecessary in Tvix. We should evaluate which other uses exist, and act on them -appropriately. - -The initial danger with diverging here is that we might cause derivation hash -discrepancies between Tvix and C++ Nix, which can make initial comparisons of -derivations generated by the two systems difficult. If this occurs we need to -discuss how to approach it, but initially we will implement the mutating -builtins as no-ops. - -### Why this did not work for us? - -Nix has a feature to perform environmental checks of your derivation, e.g. -"these derivation outputs should not be referenced in this derivation", this was -introduced in Nix 2.2 by -https://github.com/NixOS/nix/commit/3cd15c5b1f5a8e6de87d5b7e8cc2f1326b420c88. - -Unfortunately, this feature introduced a very unfortunate and critical bug: all -usage of this feature with contextful strings will actually force the -derivation to depend at least at build time on those specific paths, see -https://github.com/NixOS/nix/issues/4629. - -For example, if you wanted to `disallowedReferences` to a package and you used a -derivation as a path, you would actually register that derivation as a input -derivation of that derivation. - -This bug is still unfixed in Nix and it seems that fixing it would require -introducing different ways to evaluate Nix derivations to preserve the -output path calculation for Nix expressions so far. - -All of this would be fine if the bug behavior was uniform in the sense that no -one tried to force-workaround it. Since Nixpkgs 23.05, due to -https://github.com/NixOS/nixpkgs/pull/211783 this is not true anymore. - -If you let nixpkgs be the disjoint union of bootstrapping derivations $A$ and -`stdenv.mkDerivation`-built derivations $B$. - -$A$ suffers from the bug and $B$ doesn't by the forced usage of -`unsafeDiscardStringContext` on those special checking fields. - -This means that to build hash-compatible $A$ **and** $B$, we need to -distinguish $A$ and $B$. A lot of hacks could be imagined to support this -problem. - -Let's assume we have a solution to that problem, it means that we are able to -detect implicitly when a set of specific fields are -`unsafeDiscardStringContext`-ed. - -Thus, we could use that same trick to implement `unsafeDiscardStringContext` -entirely for all fields actually. - -Now, to implement `unsafeDiscardStringContext` in the persistent reference -tracking model, you will need to store a disallowed list of strings that should -not trigger a reference when we are scanning a derivation parameters. - -But assume you have something like: - -```nix -derivation { - buildInputs = [ - stdenv.cc - ]; - - disallowedReferences = [ stdenv.cc ]; -} -``` - -If you unregister naively the `stdenv.cc` reference, it will silence the fact -that it is part of the `buildInputs`, so you will observe that Nix will fail -the derivation during environmental check, but Tvix would silently force remove -that reference. - -Until proven otherwise, it seems highly difficult to have the fine-grained -information to prevent reference tracking of those specific fields. It is not a -failure of the persistent reference tracking, it is an unresolved critical bug -of Nix that only nixpkgs really workarounded for `stdenv.mkDerivation`-based -derivations. diff --git a/tvix/eval/docs/builtins.md b/tvix/eval/docs/builtins.md deleted file mode 100644 index dba4c48c65e1..000000000000 --- a/tvix/eval/docs/builtins.md +++ /dev/null @@ -1,138 +0,0 @@ -Nix builtins -============ - -Nix has a lot of built-in functions, some of which are accessible in -the global scope, and some of which are only accessible through the -global `builtins` attribute set. - -This document is an attempt to track all of these builtins, but -without documenting their functionality. - -See also https://nixos.org/manual/nix/stable/expressions/builtins.html - -The `impl` column indicates implementation status in tvix: -- implemented: "" (empty cell) -- not yet implemented, but not blocked: `todo` -- not yet implemented, but blocked by other prerequisites: - - `store`: awaiting eval<->store api(s) - - `context`: awaiting support for string contexts - -| name | global | arity | pure | impl | -|-------------------------------|--------|-------|-------|---------| -| abort | true | 1 | | | -| add | false | 2 | true | | -| addErrorContext | false | ? | | context | -| all | false | 2 | true | | -| any | false | 2 | true | | -| appendContext | false | ? | | | -| attrNames | false | 1 | true | | -| attrValues | false | | true | | -| baseNameOf | true | | | | -| bitAnd | false | | | | -| bitOr | false | | | | -| bitXor | false | | | | -| builtins | true | | | | -| catAttrs | false | | | | -| compareVersions | false | | | | -| concatLists | false | | | | -| concatMap | false | | | | -| concatStringsSep | false | | | | -| currentSystem | false | | | | -| currentTime | false | | false | | -| deepSeq | false | | | | -| derivation | true | | | store | -| derivationStrict | true | | | store | -| dirOf | true | | | | -| div | false | | | | -| elem | false | | | | -| elemAt | false | | | | -| false | true | | | | -| fetchGit | true | | | store | -| fetchMercurial | true | | | store | -| fetchTarball | true | | | store | -| fetchurl | false | | | store | -| filter | false | | | | -| filterSource | false | | | store | -| findFile | false | | false | todo | -| foldl' | false | | | | -| fromJSON | false | | | | -| fromTOML | true | | | | -| functionArgs | false | | | | -| genList | false | | | | -| genericClosure | false | | | todo | -| getAttr | false | | | | -| getContext | false | | | | -| getEnv | false | | false | | -| hasAttr | false | | | | -| hasContext | false | | | | -| hashFile | false | | false | | -| hashString | false | | | | -| head | false | | | | -| import | true | | | | -| intersectAttrs | false | | | | -| isAttrs | false | | | | -| isBool | false | | | | -| isFloat | false | | | | -| isFunction | false | | | | -| isInt | false | | | | -| isList | false | | | | -| isNull | true | | | | -| isPath | false | | | | -| isString | false | | | | -| langVersion | false | | | | -| length | false | | | | -| lessThan | false | | | | -| listToAttrs | false | | | | -| map | true | | | | -| mapAttrs | false | | | | -| match | false | | | | -| mul | false | | | | -| nixPath | false | | | todo | -| nixVersion | false | | | todo | -| null | true | | | | -| parseDrvName | false | | | | -| partition | false | | | | -| path | false | | sometimes | store | -| pathExists | false | | false | | -| placeholder | true | | | context | -| readDir | false | | false | | -| readFile | false | | false | | -| removeAttrs | true | | | | -| replaceStrings | false | | | | -| scopedImport | true | | | | -| seq | false | | | | -| sort | false | | | | -| split | false | | | | -| splitVersion | false | | | | -| storeDir | false | | | store | -| storePath | false | | | store | -| stringLength | false | | | | -| sub | false | | | | -| substring | false | | | | -| tail | false | | | | -| throw | true | | | | -| toFile | false | | | store | -| toJSON | false | | | | -| toPath | false | | | | -| toString | true | | | | -| toXML | true | | | | -| trace | false | | | | -| true | true | | | | -| tryEval | false | | | | -| typeOf | false | | | | -| unsafeDiscardOutputDependency | false | | | | -| unsafeDiscardStringContext | false | | | | -| unsafeGetAttrPos | false | | | todo | -| valueSize | false | | | todo | - -## Added after C++ Nix 2.3 (without Flakes enabled) - -| name | global | arity | pure | impl | -|---------------|--------|-------|-------|-------| -| break | false | 1 | | todo | -| ceil | false | 1 | true | | -| fetchTree | true | 1 | | todo | -| floor | false | 1 | true | | -| groupBy | false | 2 | true | | -| traceVerbose | false | 2 | | todo | -| zipAttrsWith | false | 2 | true | todo | diff --git a/tvix/eval/docs/catchable-errors.md b/tvix/eval/docs/catchable-errors.md deleted file mode 100644 index ce320a921777..000000000000 --- a/tvix/eval/docs/catchable-errors.md +++ /dev/null @@ -1,131 +0,0 @@ -# (Possible) Implementation(s) of Catchable Errors for `builtins.tryEval` - -## Terminology - -Talking about “catchable errors” in Nix in general is a bit precarious since -there is no properly established terminology. Also, the existing terms are less -than apt. The reason for this lies in the fact that catchable errors (or -whatever you want to call them) don't properly _exist_ in the language: While -Nix's `builtins.tryEval` is (originally) based on the C++ exception system, -it specifically lacks the ability of such systems to have an exception _value_ -whilst handling it. Consequently, these errors don't have an obvious name -as they never appear _in_ the Nix language. They just have to be named in the -respective Nix implementation: - -- In C++ Nix the only term for such errors is `AssertionError` which is the - name of the (C++) exception used in the implementation internally. This - term isn't great, though, as `AssertionError`s can not only be generated - using `assert`, but also using `throw` and failed `NIX_PATH` resolutions. - Were this terminology to be used in documentation addressing Nix language - users, it would probably only serve confusion. - -- Tvix currently (as of r/7573) uses the term catchable errors. This term - relates to nothing in the language as such: Errors are not caught, we rather - try to evaluate an expression. Catching also sort of implies that a value - representation of the error is attainable (like in an exception system) which - is untrue. - -In light of this I (sterni) would like to suggest “tryable errors” as an -alternative term going forward which isn't inaccurate and relates to terms -already established by language internal naming. - -However, this document will continue using the term catchable error until the -naming is adjusted in Tvix itself. - -## Implementation - -Below we discuss different implementation approaches in Tvix in order to arrive -at a proposal for the new one. The historical discussion is intended as a basis -for discussing the proposal: Are we committing to an old or current mistake? Are -we solving all problems that cropped up or were solved at any given point in -time? - -### Original - -The original implementation of `tryEval` in cl/6924 was quite straightforward: -It would simply interrupt the propagation of a potential catchable error to the -top level (which usually happened using the `?` operator) in the builtin and -construct the appropriate representation of an unsuccessful evaluation if the -error was deemed catchable. It had, however, multiple problems: - -- The VM was originally written without `tryEval` in mind, i.e. it largely - assumed that an error would always cause execution to be terminated. This - problem was later solved (cl/6940). -- Thunks could not be `tryEval`-ed multiple times (b/281). This was another - consequence of VM architecture at the time: Thunks would be blackholed - before evaluation was started and the error could occur. Due to the - interaction of the generator-based VM code and `Value::force` the part - of the code altering the thunk state would never be informed about the - evaluation result in case of a failure, so the thunk would remain - blackholed leading to a crash if the same thunk was `tryEval`-ed or - forced again. To solve this issue, amjoseph completely overhauled - the implementation. - -One key point about this implementation is that it is based on the assumption -that catchable errors can only be generated in thunks, i.e. expressions causing -them are never evaluated strictly. This can be illustrated using C++ Nix: - -```console -> nix-instantiate --eval -E '[ (assert false; true) (builtins.throw "") <nixpkgs> ]' -[ <CODE> <CODE> <CODE> ] -``` - -If this wasn't the case, the VM could encounter the error in a situation where -the error would not have needed to pass through the `tryEval` builtin, causing -evaluation to abort. - -### Present - -The current system (mostly implemented in cl/9289) uses a very different -approach: Instead of relying on the thunk boundary, catchable errors are no -longer errors, but special values. They are created at the relevant points (e.g. -`builtins.throw`) and propagated whenever they are encountered by VM ops or -builtins. Finally, they either encounter `builtins.tryEval` (and are converted to -an ordinary value again) or the top level where they become a normal error again. - -The problems with this mostly stem from the confusion between values and errors -that it necessitates: - -- In most circumstances, catchable errors end up being errors again, as `tryEval` - is not used a lot. So `throw`s usually end up causing evaluation to abort. - Consequently, not only `Value::Catchable` is necessary, but also a corresponding - error variant that is _only_ created if a catchable value remains at the end of - evaluation. A requirement that was missed until cl/10991 (!) which illustrate - how strange that architecture is. A consequence of this is that catchable - errors have no location information at all. -- `Value::Catchable` is similar to other internal values in Tvix, but is much - more problematic. Aside from thunks, internal values only exist for a brief - amount of time on the stack and it is very clear what parts of the VM or - builtins need to handle them. This means that the rest of the implementation - need to consider them, keeping the complexity caused by the internal value - low. `Value::Catchable`, on the other hand, may exist anywhere and be passed - to any VM op or builtin, so it needs to be correctly propagated _everywhere_. - This causes a lot of noise in the code as well as a big potential for bugs. - Essentially, catchable errors require as much attention by the Tvix developer - as laziness. This doesn't really correlate to the importance of the two - features to the Nix language. - -### Future? - -The core assumption of the original solution does offer a path forward: After -cl/9289 we should be in a better position to introspect an error occurring from -within the VM code, but we need a better way of storing such an error to prevent -another b/281. If catchable errors can only be generated in thunks, we can just -use the thunk representation for this. This would mean that `Thunk::force_` -would need to check if evaluation was successful and (in case of failure) -change the thunk representation - -- either to the original `ThunkRepr::Suspended` which would be simple, but of - course mean duplicated evaluation work in some expressions. In fact, this - would probably leave a lot of easy performance on the table for use cases we - would like to support, e.g. tree walkers for nixpkgs. -- or to a new `ThunkRepr` variant that stores the kind of the error and all - necessary location info so stack traces can work properly. This of course - reintroduces some of the difficulty of having two kinds of errors, but it is - hopefully less problematic, as the thunk boundary (i.e. `Thunk::force`) is - where errors would usually occur. - -Besides the question whether this proposal can actually be implemented, another -consideration is whether the underlying assumption will hold in the future, i.e. -can we implement optimizations for thunk elimination in a way that thunks that -generate catchable errors are never eliminated? diff --git a/tvix/eval/docs/known-optimisation-potential.md b/tvix/eval/docs/known-optimisation-potential.md deleted file mode 100644 index 0ab185fe1be2..000000000000 --- a/tvix/eval/docs/known-optimisation-potential.md +++ /dev/null @@ -1,162 +0,0 @@ -Known Optimisation Potential -============================ - -There are several areas of the Tvix evaluator code base where -potentially large performance gains can be achieved through -optimisations that we are already aware of. - -The shape of most optimisations is that of moving more work into the -compiler to simplify the runtime execution of Nix code. This leads, in -some cases, to drastically higher complexity in both the compiler -itself and in invariants that need to be guaranteed between the -runtime and the compiler. - -For this reason, and because we lack the infrastructure to adequately -track their impact (WIP), we have not yet implemented these -optimisations, but note the most important ones here. - -* Use "open upvalues" [hard] - - Right now, Tvix will immediately close over all upvalues that are - created and clone them into the `Closure::upvalues` array. - - Instead of doing this, we can statically determine most locals that - are closed over *and escape their scope* (similar to how the - `compiler::scope::Scope` struct currently tracks whether locals are - used at all). - - If we implement the machinery to track this, we can implement some - upvalues at runtime by simply sticking stack indices in the upvalue - array and only copy the values where we know that they escape. - -* Avoid `with` value duplication [easy] - - If a `with` makes use of a local identifier in a scope that can not - close before the with (e.g. not across `LambdaCtx` boundaries), we - can avoid the allocation of the phantom value and duplication of the - `NixAttrs` value on the stack. In this case we simply push the stack - index of the known local. - -* Multiple attribute selection [medium] - - An instruction could be introduced that avoids repeatedly pushing an - attribute set to/from the stack if multiple keys are being selected - from it. This occurs, for example, when inheriting from an attribute - set or when binding function formals. - -* Split closure/function representation [easy] - - Functions have fewer fields that need to be populated at runtime and - can directly use the `value::function::Lambda` representation where - possible. - -* Apply `compiler::optimise_select` to other set operations [medium] - - In addition to selects, statically known attribute resolution could - also be used for things like `?` or `with`. The latter might be a - little more complicated but is worth investigating. - -* Inline fully applied builtins with equivalent operators [medium] - - Some `builtins` have equivalent operators, e.g. `builtins.sub` - corresponds to the `-` operator, `builtins.hasAttr` to the `?` - operator etc. These operators additionally compile to a primitive - VM opcode, so they should be just as cheap (if not cheaper) as - a builtin application. - - In case the compiler encounters a fully applied builtin (i.e. - no currying is occurring) and the `builtins` global is unshadowed, - it could compile the equivalent operator bytecode instead: For - example, `builtins.sub 20 22` would be compiled as `20 - 22`. - This would ensure that equivalent `builtins` can also benefit - from special optimisations we may implement for certain operators - (in the absence of currying). E.g. we could optimise access - to the `builtins` attribute set which a call to - `builtins.getAttr "foo" builtins` should also profit from. - -* Avoid nested `VM::run` calls [hard] - - Currently when encountering Nix-native callables (thunks, closures) - the VM's run loop will nest and return the value of the nested call - frame one level up. This makes the Rust call stack almost mirror the - Nix call stack, which is usually undesirable. - - It is possible to detect situations where this is avoidable and - instead set up the VM in such a way that it continues and produces - the desired result in the same run loop, but this is kind of tricky - to get right - especially while other parts are still in flux. - - For details consult the commit with Gerrit change ID - `I96828ab6a628136e0bac1bf03555faa4e6b74ece`, in which the initial - attempt at doing this was reverted. - -* Avoid thunks if only identifier closing is required [medium] - - Some constructs, like `with`, mostly do not change runtime behaviour - if thunked. However, they are wrapped in thunks to ensure that - deferred identifiers are resolved correctly. - - This can be avoided, as we statically analyse the scope and should - be able to tell whether any such logic was required. - -* Intern literals [easy] - - Currently, the compiler emits a separate entry in the constant - table for each literal. So the program `1 + 1 + 1` will have - three entries in its `Chunk::constants` instead of only one. - -* Do some list and attribute set operations in place [hard] - - Algorithms that can not do a lot of work inside `builtins` like `map`, - `filter` or `foldl'` usually perform terribly if they use data structures like - lists and attribute sets. - - `builtins` can do work in place on a copy of a `Value`, but naïvely expressed - recursive algorithms will usually use `//` and `++` to do a single change to a - `Value` at a time, requiring a full copy of the data structure each time. - It would be a big improvement if we could do some of these operations in place - without requiring a new copy. - - There are probably two approaches: We could determine statically if a value is - reachable from elsewhere and emit a special in place instruction if not. An - easier alternative is probably to rely on reference counting at runtime: If no - other reference to a value exists, we can extend the list or update the - attribute set in place. - - An **alternative** to this is using [persistent data - structures](https://en.wikipedia.org/wiki/Persistent_data_structure) or at the - very least [immutable data structures](https://docs.rs/im/latest/im/) that can - be copied more efficiently than the stock structures we are using at the - moment. - -* Skip finalising unfinalised thunks or non-thunks instead of crashing [easy] - - Currently `OpFinalise` crashes the VM if it is called on values that don't - need to be finalised. This helps catching miscompilations where `OpFinalise` - operates on the wrong `StackIdx`. In the case of function argument patterns, - however, this means extra VM stack and instruction overhead for dynamically - determining if finalisation is necessary or not. This wouldn't be necessary - if `OpFinalise` would just noop on any values that don't need to be finalised - (anymore). - -* Phantom binding for from expression of inherits [easy] - - The from expression of an inherit is reevaluated for each inherit. This can - be demonstrated using the following Nix expression which, counter-intuitively, - will print “plonk” twice. - - ```nix - let - inherit (builtins.trace "plonk" { a = null; b = null; }) a b; - in - builtins.seq a (builtins.seq b null) - ``` - - In most Nix code, the from expression is just an identifier, so it is not - terribly inefficient, but in some cases a more expensive expression may - be used. We should create a phantom binding for the from expression that - is reused in the inherits, so only a single thunk is created for the from - expression. - - Since we discovered this, C++ Nix has implemented a similar optimization: - <https://github.com/NixOS/nix/pull/9847>. diff --git a/tvix/eval/docs/language-issues.md b/tvix/eval/docs/language-issues.md deleted file mode 100644 index 152e6594a1d0..000000000000 --- a/tvix/eval/docs/language-issues.md +++ /dev/null @@ -1,46 +0,0 @@ -# Nix language issues - -In the absence of a language standard, what Nix (the language) is, is prescribed -by the behavior of the C++ Nix implementation. Still, there are reasons not to -accept some behavior: - -* Tvix aims for nixpkgs compatibility only. This means we can ignore behavior in - edge cases nixpkgs doesn't trigger as well as obscure features it doesn't use - (e.g. `__overrides`). -* Some behavior of the Nix evaluator seems to be unintentional or an - implementation detail leaking out into language behavior. - -Especially in the latter case, it makes sense to raise the respective issue and -maybe to get rid of the behavior in all implementations for good. Below is an -(incomplete) list of such issues: - -* [Behaviour of nested attribute sets depends on definition order][i7111] -* [Partially constructed attribute sets are observable during dynamic attr names construction][i7012] -* [Nix parsers merges multiple attribute set literals for the same key incorrectly depending on definition order][i7115] - -On the other hand, there is behavior that seems to violate one's expectation -about the language at first, but has good enough reasons from an implementor's -perspective to keep them: - -* Dynamic keys are forbidden in `let` and `inherit`. This makes sure that we - only need to do runtime identifier lookups for `with`. More dynamic (i.e. - runtime) lookups would make the scoping system even more complicated as well - as hurt performance. -* Dynamic attributes of `rec` sets are not added to its scope. This makes sense - for the same reason. -* Dynamic and nested attributes in attribute sets don't get merged. This is a - tricky one, but avoids doing runtime (recursive) merges of attribute sets. - Instead all necessary merging can be inferred statically, i.e. the C++ Nix - implementation already merges at parse time, making nested attribute keys - syntactic sugar effectively. - -Other behavior is just odd, surprising or underdocumented: - -* `builtins.foldl'` doesn't force the initial accumulator (but all other - intermediate accumulator values), differing from e.g. Haskell, see - the [relevant PR discussion][p7158]. - -[i7111]: https://github.com/NixOS/nix/issues/7111 -[i7012]: https://github.com/NixOS/nix/issues/7012 -[i7115]: https://github.com/NixOS/nix/issues/7115 -[p7158]: https://github.com/NixOS/nix/pull/7158 diff --git a/tvix/eval/docs/opcodes-attrsets.md b/tvix/eval/docs/opcodes-attrsets.md deleted file mode 100644 index 7026f3319dda..000000000000 --- a/tvix/eval/docs/opcodes-attrsets.md +++ /dev/null @@ -1,122 +0,0 @@ -# attrset-opcodes - -The problem with attrset literals is twofold: - -1. The keys of attribute sets may be dynamically evaluated. - - Access: - - ```nix - let - k = "foo"; - attrs = { /* etc. */ }; - in attrs."${k}" - ``` - - Literal: - ```nix - let - k = "foo"; - in { - "${k}" = 42; - } - ``` - - The problem with this is that the attribute set key is not known at - compile time, and needs to be dynamically evaluated by the VM as an - expression. - - For the most part this should be pretty simple, assuming a - theoretical instruction set: - - ``` - 0000 OP_CONSTANT(0) # key "foo" - 0001 OP_CONSTANT(1) # value 42 - 0002 OP_ATTR_SET(1) # construct attrset from 2 stack values - ``` - - The operation pushing the key needs to be replaced with one that - leaves a single value (the key) on the stack, i.e. the code for the - expression, e.g.: - - ``` - 0000..000n <operations leaving a string value on the stack> - 000n+1 OP_CONSTANT(1) # value 42 - 000n+2 OP_ATTR_SET(1) # construct attrset from 2 stack values - ``` - - This is fairly easy to do by simply recursing in the compiler when - the key expression is encountered. - -2. The keys of attribute sets may be nested. - - This is the non-trivial part of dealing with attribute set - literals. Specifically, the nesting can be arbitrarily deep and the - AST does not guarantee that related set keys are located - adjacently. - - Furthermore, this frequently occurs in practice in Nix. We need a - bytecode representation that makes it possible to construct nested - attribute sets at runtime. - - Proposal: AttrPath values - - If we can leave a value representing an attribute path on the - stack, we can offload the construction of nested attribute sets to - the `OpAttrSet` operation. - - Under the hood, OpAttrSet in practice constructs a `Map<NixString, - Value>` attribute set in most cases. This means it expects to pop - the value of the key of the stack, but is otherwise free to do - whatever it wants with the underlying map. - - In a simple example, we could have code like this: - - ```nix - { - a.b = 15; - } - ``` - - This would be compiled to a new `OpAttrPath` instruction that - constructs and pushes an attribute path from a given number of - fragments (which are popped off the stack). - - For example, - - ``` - 0000 OP_CONSTANT(0) # key "a" - 0001 OP_CONSTANT(1) # key "b" - 0002 OP_ATTR_PATH(2) # construct attrpath from 2 fragments - 0003 OP_CONSTANT(2) # value 42 - 0004 OP_ATTRS(1) # construct attrset from one pair - ``` - - Right before `0004` the stack would be left like this: - - [ AttrPath[a,b], 42 ] - - Inside of the `OP_ATTRS` instruction we could then begin - construction of the map and insert the nested attribute sets as - required, as well as validate that there are no duplicate keys. - -3. Both of these cases can occur simultaneously, but this is not a - problem as the opcodes combine perfectly fine, e.g.: - - ```nix - let - k = "a"; - in { - "${k}".b = 42; - } - ``` - - results in - - ``` - 0000..000n <operations leaving a string value on the stack> - 000n+1 OP_CONSTANT(1) # key "b" - 000n+2 OP_ATTR_PATH(2) # construct attrpath from 2 fragments - 000n+3 OP_CONSTANT(2) # value 42 - 000n+4 OP_ATTR_SET(1) # construct attrset from 2 stack values - ``` diff --git a/tvix/eval/docs/recursive-attrs.md b/tvix/eval/docs/recursive-attrs.md deleted file mode 100644 index c30cfd33e6c7..000000000000 --- a/tvix/eval/docs/recursive-attrs.md +++ /dev/null @@ -1,68 +0,0 @@ -Recursive attribute sets -======================== - -The construction behaviour of recursive attribute sets is very -specific, and a bit peculiar. - -In essence, there are multiple "phases" of scoping that take place -during attribute set construction: - -1. Every inherited value without an explicit source is inherited only - from the **outer** scope in which the attribute set is enclosed. - -2. A new scope is opened in which all recursive keys are evaluated. - This only considers **statically known keys**, attributes can - **not** recurse into dynamic keys in `self`! - - For example, this code is invalid in C++ Nix: - - ``` - nix-repl> rec { ${"a"+""} = 2; b = a * 10; } - error: undefined variable 'a' at (string):1:26 - ``` - -3. Finally, a third scope is opened in which dynamic keys are - evaluated. - -This behaviour, while possibly a bit strange and unexpected, actually -simplifies the implementation of recursive attribute sets in Tvix as -well. - -Essentially, a recursive attribute set like this: - -```nix -rec { - inherit a; - b = a * 10; - ${"c" + ""} = b * 2; -} -``` - -Can be compiled like the following expression: - -```nix -let - inherit a; -in let - b = a * 10; - in { - inherit a b; - ${"c" + ""} = b * 2; - } -``` - -Completely deferring the resolution of recursive identifiers to the -existing handling of recursive scopes (i.e. deferred access) in let -bindings. - -In practice, we can further specialise this and compile each scope -directly into the form expected by `OpAttrs` (that is, leaving -attribute names on the stack) before each value's position. - -C++ Nix's Implementation ------------------------- - -* [`ExprAttrs`](https://github.com/NixOS/nix/blob/2097c30b08af19a9b42705fbc07463bea60dfb5b/src/libexpr/nixexpr.hh#L241-L268) - (AST representation of attribute sets) -* [`ExprAttrs::eval`](https://github.com/NixOS/nix/blob/075bf6e5565aff9fba0ea02f3333c82adf4dccee/src/libexpr/eval.cc#L1333-L1414) -* [`addAttr`](https://github.com/NixOS/nix/blob/master/src/libexpr/parser.y#L98-L156) (`ExprAttrs` construction in the parser) diff --git a/tvix/eval/docs/vm-loop.md b/tvix/eval/docs/vm-loop.md deleted file mode 100644 index 6266d34709cb..000000000000 --- a/tvix/eval/docs/vm-loop.md +++ /dev/null @@ -1,315 +0,0 @@ -tvix-eval VM loop -================= - -This document describes the new tvix-eval VM execution loop implemented in the -chain focusing around cl/8104. - -## Background - -The VM loop implemented in Tvix prior to cl/8104 had several functions: - -1. Advancing the instruction pointer for a chunk of Tvix bytecode and - executing instructions in a loop until a result was yielded. - -2. Tracking Nix call frames as functions/thunks were entered/exited. - -3. Catching trampoline requests returned from instructions to force suspended - thunks without increasing stack size *where possible*. - -4. Handling trampolines through an inner trampoline loop, switching between a - code execution mode and execution of subsequent trampolines. - -This implementation of the trampoline logic was added on to the existing VM, -which previously always recursed for thunk forcing. There are some cases (for -example values that need to be forced *inside* of the execution of a builtin) -where trampolines could not previously be used, and the VM recursed anyways. - -As a result of this trampoline logic being added "on top" of the existing VM -loop the code became quite difficult to understand. This led to several bugs, -for example: b/251, b/246, b/245, and b/238. - -These bugs were tricky to deal with, as we had to try and make the VM do -things that are somewhat difficult to fit into its model. We could of course -keep extending the trampoline logic to accommodate all sorts of concepts (such -as finalisers), but that seems like it does not solve the root problem. - -## New VM loop - -In cl/8104, a unified new solution is implemented with which the VM is capable -of evaluating everything without increasing the call stack size. - -This is done by introducing a new frame stack in the VM, on which execution -frames are enqueued that are either: - -1. A bytecode frame, consisting of Tvix bytecode that evaluates compiled Nix - code. -2. A generator frame, consisting of some VM logic implemented in pure Rust - code that can be *suspended* when it hits a point where the VM would - previously need to recurse. - -We do this by making use of the `async` *keyword* in Rust, but notably -*without* introducing asynchronous I/O or concurrency in tvix-eval (the -complexity of which is currently undesirable for us). - -Specifically, when writing a Rust function that uses the `async` keyword, such -as: - -```rust -async fn some_builtin(input: Value) -> Result<Value, ErrorKind> { - let mut out = NixList::new(); - - for element in input.to_list()? { - let result = do_something_that_requires_the_vm(element).await; - out.push(result); - } - - Ok(out) -} -``` - -The compiler actually generates a state-machine under-the-hood which allows -the execution of that function to be *suspended* whenever it hits an `await`. - -We use the [`genawaiter`][] crate that gives us a data structure and simple -interface for getting instances of these state machines that can be stored in -a struct (in our case, a *generator frame*). - -The execution of the VM then becomes the execution of an *outer loop*, which -is responsible for selecting the next generator frame to execute, and two -*inner loops*, which drive the execution of a bytecode frame or generator -frame forward until it either yields a value or asks to be suspended in favour -of another frame. - -All "communication" between frames happens solely through values left on the -stack: Whenever a frame of either type runs to completion, it is expected to -leave a *single* value on the stack. It follows that the whole VM, upon -completion of the last (or initial, depending on your perspective) frame -yields its result as the return value. - -The core of the VM restructuring is cl/8104, unfortunately one of the largest -single commit changes we've had to make yet, as it touches pretty much all -areas of tvix-eval. The introduction of the generators and the -message/response system we built to request something from the VM, suspend a -generator, and wait for the return is in cl/8148. - -The next sections describe in detail how the three different loops work. - -### Outer VM loop - -The outer VM loop is responsible for selecting the next frame to run, and -dispatching it correctly to inner loops, as well as determining when to shut -down the VM and return the final result. - -``` - ╭──────────────────╮ - ╭────────┤ match frame kind ├──────╮ - │ ╰──────────────────╯ │ - │ │ - ┏━━━━━━━━━━━━┷━━━━━┓ ╭───────────┴───────────╮ -───►┃ frame_stack.pop()┃ ▼ ▼ - ┗━━━━━━━━━━━━━━━━━━┛ ┏━━━━━━━━━━━━━━━━┓ ┏━━━━━━━━━━━━━━━━━┓ - ▲ ┃ bytecode frame ┃ ┃ generator frame ┃ - │ ┗━━━━━━━━┯━━━━━━━┛ ┗━━━━━━━━┯━━━━━━━━┛ - │[yes, cont.] │ │ - │ ▼ ▼ - ┏━━━━━━━━┓ │ ╔════════════════╗ ╔═════════════════╗ -◄───┨ return ┃ │ ║ inner bytecode ║ ║ inner generator ║ - ┗━━━━━━━━┛ │ ║ loop ║ ║ loop ║ - ▲ │ ╚════════╤═══════╝ ╚════════╤════════╝ - │ ╭────┴─────╮ │ │ - │ │ has next │ ╰───────────┬───────────╯ - [no] ╰───┤ frame? │ │ - ╰────┬─────╯ ▼ - │ ┏━━━━━━━━━━━━━━━━━┓ - │ ┃ frame completed ┃ - ╰─────────────────────────┨ or suspended ┃ - ┗━━━━━━━━━━━━━━━━━┛ -``` - -Initially, the VM always pops a frame from the frame stack and then inspects -the type of frame it found. As a consequence the next frame to execute is -always the frame at the top of the stack, and setting up a VM initially for -code execution is done by leaving a bytecode frame with the code to execute on -the stack and passing control to the outer loop. - -Control is dispatched to either of the inner loops (depending on the type of -frame) and the cycle continues once they return. - -When an inner loop returns, it has either finished its execution (and left its -result value on the *value stack*), or its frame has requested to be -suspended. - -Frames request suspension by re-enqueueing *themselves* through VM helper -methods, and then leaving the frame they want to run *on top* of themselves in -the frame stack before yielding control back to the outer loop. - -The inner control loops inform the outer loops about whether the frame has -been *completed* or *suspended* by returning a boolean. - -### Inner bytecode loop - -The inner bytecode loop drives the execution of some Tvix bytecode by -continously looking at the next instruction to execute, and dispatching to the -instruction handler. - -``` - ┏━━━━━━━━━━━━━┓ -◄──┨ return true ┃ - ┗━━━━━━━━━━━━━┛ - ▲ - ╔════╧═════╗ - ║ OpReturn ║ - ╚══════════╝ - ▲ - ╰──┬────────────────────────────╮ - │ ▼ - │ ╔═════════════════════╗ - ┏━━━━━━━━┷━━━━━┓ ║ execute instruction ║ -───►┃ inspect next ┃ ╚══════════╤══════════╝ - ┃ instruction ┃ │ - ┗━━━━━━━━━━━━━━┛ │ - ▲ ╭─────┴─────╮ - ╰──────────────────────┤ suspends? │ - [no] ╰─────┬─────╯ - │ - │ - ┏━━━━━━━━━━━━━━┓ │ -◄──┨ return false ┃───────────────────────╯ - ┗━━━━━━━━━━━━━━┛ [yes] -``` - -With this refactoring, the compiler now emits a special `OpReturn` instruction -at the end of bytecode chunks. This is a signal to the runtime that the chunk -has completed and that its current value should be returned, without having to -perform instruction pointer arithmetic. - -When `OpReturn` is encountered, the inner bytecode loop returns control to the -outer loop and informs it (by returning `true`) that the bytecode frame has -completed. - -Any other instruction may also request a suspension of the bytecode frame (for -example, instructions that need to force a value). In this case the inner loop -is responsible for setting up the frame stack correctly, and returning `false` -to inform the outer loop of the suspension - -### Inner generator loop - -The inner generator loop is responsible for driving the execution of a -generator frame by continously calling [`Gen::resume`][] until it requests a -suspension (as a result of which control is returned to the outer loop), or -until the generator is done and yields a value. - -``` - ┏━━━━━━━━━━━━━┓ -◄──┨ return true ┃ ◄───────────────────╮ - ┗━━━━━━━━━━━━━┛ │ - │ - [Done] │ - ╭──────────────────┴─────────╮ - │ inspect generator response │◄────────────╮ - ╰──────────────────┬─────────╯ │ - [yielded] │ ┏━━━━━━━━┷━━━━━━━━┓ - │ ┃ gen.resume(msg) ┃◄── - ▼ ┗━━━━━━━━━━━━━━━━━┛ - ╭────────────╮ ▲ - │ same-frame │ │ - │ request? ├────────────────╯ - ╰─────┬──────╯ [yes] - ┏━━━━━━━━━━━━━━┓ │ -◄──┨ return false ┃ ◄──────────────────╯ - ┗━━━━━━━━━━━━━━┛ [no] -``` - -On each execution of a generator frame, `resume_with` is called with a -[`VMResponse`][] (i.e. a message *from* the VM *to* the generator). For a newly -created generator, the initial message is just `Empty`. - -A generator may then respond by signaling that it has finished execution -(`Done`), in which case the inner generator loop returns control to the outer -loop and informs it that this generator is done (by returning `true`). - -A generator may also respond by signaling that it needs some data from the VM. -This is implemented through a request-response pattern, in which the generator -returns a `Yielded` message containing a [`VMRequest`][]. These requests can be -very simple ("Tell me the current store path") or more complex ("Call this Nix -function with these values"). - -Requests are divided into two classes: Same-frame requests (requests that can be -responded to *without* returning control to the outer loop, i.e. without -executing a *different* frame), and multi-frame generator requests. Based on the -type of request, the inner generator loop will either handle it right away and -send the response in a new `resume_with` call, or return `false` to the outer -generator loop after setting up the frame stack. - -Most of this logic is implemented in cl/8148. - -[`Gen::resume`]: https://docs.rs/genawaiter/0.99.1/genawaiter/rc/struct.Gen.html#method.resume_with -[`VMRequest`]: https://cs.tvl.fyi/depot@2696839770c1ccb62929ff2575a633c07f5c9593/-/blob/tvix/eval/src/vm/generators.rs?L44 -[`VMResponse`]: https://cs.tvl.fyi/depot@2696839770c1ccb62929ff2575a633c07f5c9593/-/blob/tvix/eval/src/vm/generators.rs?L169 - -## Advantages & Disadvantages of the approach - -This approach has several advantages: - -* The execution model is much simpler than before, making it fairly - straightforward to build up a mental model of what the VM does. - -* All "out of band requests" inside the VM are handled through the same - abstraction (generators). - -* Implementation is not difficult, albeit a little verbose in some cases (we - can argue about whether or not to introduce macros for simplifying it). - -* Several parts of the VM execution are now much easier to document, - potentially letting us onboard tvix-eval contributors faster. - -* The linear VM execution itself is much easier to trace now, with for example - the `RuntimeObserver` (and by extension `tvixbolt`) giving much clearer - output now. - -But it also comes with some disadvantages: - -* Even though we "only" use the `async` keyword without a full async-I/O - runtime, we still encounter many of the drawbacks of the fragmented Rust - async ecosystem. - - The biggest issue with this is that parts of the standard library become - unavailable to us, for example the built-in `Vec::sort_by` can no longer be - used for sorting in Nix because our comparators themselves are `async`. - - This led us to having to implement some logic on our own, as the design of - `async` in Rust even makes it difficult to provide usecase-generic - implementations of concepts like sorting. - -* We need to allocate quite a few new structures on the heap in order to drive - generators, as generators involve storing `Future` types (with unknown - sizes) inside of structs. - - In initial testing this seems to make no significant difference in - performance (our performance in an actual nixpkgs-eval is still bottlenecked - by I/O concerns and reference scanning), but is something to keep in mind - later on when we start optimising more after the low-hanging fruits have - been reaped. - -## Alternatives considered - -1. Tacking on more functionality onto the existing VM loop - implementation to accomodate problems as they show up. This is not - preferred as the code is already getting messy. - -2. Making tvix-eval a fully `async` project, pulling in something like Tokio - or `async-std` as a runtime. This is not preferred due to the massively - increased complexity of those solutions, and all the known issues of fully - buying in to the async ecosystem. - - tvix-eval fundamentally should work for use-cases besides building Nix - packages (e.g. for `//tvix/serde`), and its profile should be as slim as - possible. - -3. Convincing the Rust developers that Rust needs a way to guarantee - constant-stack-depth tail calls through something like a `tailcall` - keyword. - -4. ... ? - -[`genawaiter`]: https://docs.rs/genawaiter/ |