about summary refs log tree commit diff
path: root/tvix/eval/docs/vm-loop.md
diff options
context:
space:
mode:
Diffstat (limited to 'tvix/eval/docs/vm-loop.md')
-rw-r--r--tvix/eval/docs/vm-loop.md315
1 files changed, 0 insertions, 315 deletions
diff --git a/tvix/eval/docs/vm-loop.md b/tvix/eval/docs/vm-loop.md
deleted file mode 100644
index 6266d34709cb..000000000000
--- a/tvix/eval/docs/vm-loop.md
+++ /dev/null
@@ -1,315 +0,0 @@
-tvix-eval VM loop
-=================
-
-This document describes the new tvix-eval VM execution loop implemented in the
-chain focusing around cl/8104.
-
-## Background
-
-The VM loop implemented in Tvix prior to cl/8104 had several functions:
-
-1. Advancing the instruction pointer for a chunk of Tvix bytecode and
-   executing instructions in a loop until a result was yielded.
-
-2. Tracking Nix call frames as functions/thunks were entered/exited.
-
-3. Catching trampoline requests returned from instructions to force suspended
-   thunks without increasing stack size *where possible*.
-
-4. Handling trampolines through an inner trampoline loop, switching between a
-   code execution mode and execution of subsequent trampolines.
-
-This implementation of the trampoline logic was added on to the existing VM,
-which previously always recursed for thunk forcing. There are some cases (for
-example values that need to be forced *inside* of the execution of a builtin)
-where trampolines could not previously be used, and the VM recursed anyways.
-
-As a result of this trampoline logic being added "on top" of the existing VM
-loop the code became quite difficult to understand. This led to several bugs,
-for example: b/251, b/246, b/245, and b/238.
-
-These bugs were tricky to deal with, as we had to try and make the VM do
-things that are somewhat difficult to fit into its model. We could of course
-keep extending the trampoline logic to accommodate all sorts of concepts (such
-as finalisers), but that seems like it does not solve the root problem.
-
-## New VM loop
-
-In cl/8104, a unified new solution is implemented with which the VM is capable
-of evaluating everything without increasing the call stack size.
-
-This is done by introducing a new frame stack in the VM, on which execution
-frames are enqueued that are either:
-
-1. A bytecode frame, consisting of Tvix bytecode that evaluates compiled Nix
-   code.
-2. A generator frame, consisting of some VM logic implemented in pure Rust
-   code that can be *suspended* when it hits a point where the VM would
-   previously need to recurse.
-
-We do this by making use of the `async` *keyword* in Rust, but notably
-*without* introducing asynchronous I/O or concurrency in tvix-eval (the
-complexity of which is currently undesirable for us).
-
-Specifically, when writing a Rust function that uses the `async` keyword, such
-as:
-
-```rust
-async fn some_builtin(input: Value) -> Result<Value, ErrorKind> {
-  let mut out = NixList::new();
-
-  for element in input.to_list()? {
-    let result = do_something_that_requires_the_vm(element).await;
-    out.push(result);
-  }
-
-  Ok(out)
-}
-```
-
-The compiler actually generates a state-machine under-the-hood which allows
-the execution of that function to be *suspended* whenever it hits an `await`.
-
-We use the [`genawaiter`][] crate that gives us a data structure and simple
-interface for getting instances of these state machines that can be stored in
-a struct (in our case, a *generator frame*).
-
-The execution of the VM then becomes the execution of an *outer loop*, which
-is responsible for selecting the next generator frame to execute, and two
-*inner loops*, which drive the execution of a bytecode frame or generator
-frame forward until it either yields a value or asks to be suspended in favour
-of another frame.
-
-All "communication" between frames happens solely through values left on the
-stack: Whenever a frame of either type runs to completion, it is expected to
-leave a *single* value on the stack. It follows that the whole VM, upon
-completion of the last (or initial, depending on your perspective) frame
-yields its result as the return value.
-
-The core of the VM restructuring is cl/8104, unfortunately one of the largest
-single commit changes we've had to make yet, as it touches pretty much all
-areas of tvix-eval. The introduction of the generators and the
-message/response system we built to request something from the VM, suspend a
-generator, and wait for the return is in cl/8148.
-
-The next sections describe in detail how the three different loops work.
-
-### Outer VM loop
-
-The outer VM loop is responsible for selecting the next frame to run, and
-dispatching it correctly to inner loops, as well as determining when to shut
-down the VM and return the final result.
-
-```
-                          ╭──────────────────╮
-                 ╭────────┤ match frame kind ├──────╮
-                 │        ╰──────────────────╯      │
-                 │                                  │
-    ┏━━━━━━━━━━━━┷━━━━━┓                ╭───────────┴───────────╮
-───►┃ frame_stack.pop()┃                ▼                       ▼
-    ┗━━━━━━━━━━━━━━━━━━┛       ┏━━━━━━━━━━━━━━━━┓      ┏━━━━━━━━━━━━━━━━━┓
-                 ▲             ┃ bytecode frame ┃      ┃ generator frame ┃
-                 │             ┗━━━━━━━━┯━━━━━━━┛      ┗━━━━━━━━┯━━━━━━━━┛
-                 │[yes, cont.]          │                       │
-                 │                      ▼                       ▼
-    ┏━━━━━━━━┓   │             ╔════════════════╗      ╔═════════════════╗
-◄───┨ return ┃   │             ║ inner bytecode ║      ║ inner generator ║
-    ┗━━━━━━━━┛   │             ║      loop      ║      ║      loop       ║
-        ▲        │             ╚════════╤═══════╝      ╚════════╤════════╝
-        │   ╭────┴─────╮                │                       │
-        │   │ has next │                ╰───────────┬───────────╯
-   [no] ╰───┤  frame?  │                            │
-            ╰────┬─────╯                            ▼
-                 │                         ┏━━━━━━━━━━━━━━━━━┓
-                 │                         ┃ frame completed ┃
-                 ╰─────────────────────────┨  or suspended   ┃
-                                           ┗━━━━━━━━━━━━━━━━━┛
-```
-
-Initially, the VM always pops a frame from the frame stack and then inspects
-the type of frame it found. As a consequence the next frame to execute is
-always the frame at the top of the stack, and setting up a VM initially for
-code execution is done by leaving a bytecode frame with the code to execute on
-the stack and passing control to the outer loop.
-
-Control is dispatched to either of the inner loops (depending on the type of
-frame) and the cycle continues once they return.
-
-When an inner loop returns, it has either finished its execution (and left its
-result value on the *value stack*), or its frame has requested to be
-suspended.
-
-Frames request suspension by re-enqueueing *themselves* through VM helper
-methods, and then leaving the frame they want to run *on top* of themselves in
-the frame stack before yielding control back to the outer loop.
-
-The inner control loops inform the outer loops about whether the frame has
-been *completed* or *suspended* by returning a boolean.
-
-### Inner bytecode loop
-
-The inner bytecode loop drives the execution of some Tvix bytecode by
-continously looking at the next instruction to execute, and dispatching to the
-instruction handler.
-
-```
-   ┏━━━━━━━━━━━━━┓
-◄──┨ return true ┃
-   ┗━━━━━━━━━━━━━┛
-          ▲
-     ╔════╧═════╗
-     ║ OpReturn ║
-     ╚══════════╝
-          ▲
-          ╰──┬────────────────────────────╮
-             │                            ▼
-             │                 ╔═════════════════════╗
-    ┏━━━━━━━━┷━━━━━┓           ║ execute instruction ║
-───►┃ inspect next ┃           ╚══════════╤══════════╝
-    ┃  instruction ┃                      │
-    ┗━━━━━━━━━━━━━━┛                      │
-             ▲                      ╭─────┴─────╮
-             ╰──────────────────────┤ suspends? │
-                       [no]         ╰─────┬─────╯
-                                          │
-                                          │
-   ┏━━━━━━━━━━━━━━┓                       │
-◄──┨ return false ┃───────────────────────╯
-   ┗━━━━━━━━━━━━━━┛              [yes]
-```
-
-With this refactoring, the compiler now emits a special `OpReturn` instruction
-at the end of bytecode chunks. This is a signal to the runtime that the chunk
-has completed and that its current value should be returned, without having to
-perform instruction pointer arithmetic.
-
-When `OpReturn` is encountered, the inner bytecode loop returns control to the
-outer loop and informs it (by returning `true`) that the bytecode frame has
-completed.
-
-Any other instruction may also request a suspension of the bytecode frame (for
-example, instructions that need to force a value). In this case the inner loop
-is responsible for setting up the frame stack correctly, and returning `false`
-to inform the outer loop of the suspension
-
-### Inner generator loop
-
-The inner generator loop is responsible for driving the execution of a
-generator frame by continously calling [`Gen::resume`][] until it requests a
-suspension (as a result of which control is returned to the outer loop), or
-until the generator is done and yields a value.
-
-```
-   ┏━━━━━━━━━━━━━┓
-◄──┨ return true ┃ ◄───────────────────╮
-   ┗━━━━━━━━━━━━━┛                     │
-                                       │
-                               [Done]  │
-                    ╭──────────────────┴─────────╮
-                    │ inspect generator response │◄────────────╮
-                    ╰──────────────────┬─────────╯             │
-                            [yielded]  │              ┏━━━━━━━━┷━━━━━━━━┓
-                                       │              ┃ gen.resume(msg) ┃◄──
-                                       ▼              ┗━━━━━━━━━━━━━━━━━┛
-                                 ╭────────────╮                ▲
-                                 │ same-frame │                │
-                                 │  request?  ├────────────────╯
-                                 ╰─────┬──────╯      [yes]
-   ┏━━━━━━━━━━━━━━┓                    │
-◄──┨ return false ┃ ◄──────────────────╯
-   ┗━━━━━━━━━━━━━━┛                [no]
-```
-
-On each execution of a generator frame, `resume_with` is called with a
-[`VMResponse`][] (i.e. a message *from* the VM *to* the generator). For a newly
-created generator, the initial message is just `Empty`.
-
-A generator may then respond by signaling that it has finished execution
-(`Done`), in which case the inner generator loop returns control to the outer
-loop and informs it that this generator is done (by returning `true`).
-
-A generator may also respond by signaling that it needs some data from the VM.
-This is implemented through a request-response pattern, in which the generator
-returns a `Yielded` message containing a [`VMRequest`][]. These requests can be
-very simple ("Tell me the current store path") or more complex ("Call this Nix
-function with these values").
-
-Requests are divided into two classes: Same-frame requests (requests that can be
-responded to *without* returning control to the outer loop, i.e. without
-executing a *different* frame), and multi-frame generator requests. Based on the
-type of request, the inner generator loop will either handle it right away and
-send the response in a new `resume_with` call, or return `false` to the outer
-generator loop after setting up the frame stack.
-
-Most of this logic is implemented in cl/8148.
-
-[`Gen::resume`]: https://docs.rs/genawaiter/0.99.1/genawaiter/rc/struct.Gen.html#method.resume_with
-[`VMRequest`]: https://cs.tvl.fyi/depot@2696839770c1ccb62929ff2575a633c07f5c9593/-/blob/tvix/eval/src/vm/generators.rs?L44
-[`VMResponse`]: https://cs.tvl.fyi/depot@2696839770c1ccb62929ff2575a633c07f5c9593/-/blob/tvix/eval/src/vm/generators.rs?L169
-
-## Advantages & Disadvantages of the approach
-
-This approach has several advantages:
-
-* The execution model is much simpler than before, making it fairly
-  straightforward to build up a mental model of what the VM does.
-
-* All "out of band requests" inside the VM are handled through the same
-  abstraction (generators).
-
-* Implementation is not difficult, albeit a little verbose in some cases (we
-  can argue about whether or not to introduce macros for simplifying it).
-
-* Several parts of the VM execution are now much easier to document,
-  potentially letting us onboard tvix-eval contributors faster.
-
-* The linear VM execution itself is much easier to trace now, with for example
-  the `RuntimeObserver` (and by extension `tvixbolt`) giving much clearer
-  output now.
-
-But it also comes with some disadvantages:
-
-* Even though we "only" use the `async` keyword without a full async-I/O
-  runtime, we still encounter many of the drawbacks of the fragmented Rust
-  async ecosystem.
-
-  The biggest issue with this is that parts of the standard library become
-  unavailable to us, for example the built-in `Vec::sort_by` can no longer be
-  used for sorting in Nix because our comparators themselves are `async`.
-
-  This led us to having to implement some logic on our own, as the design of
-  `async` in Rust even makes it difficult to provide usecase-generic
-  implementations of concepts like sorting.
-
-* We need to allocate quite a few new structures on the heap in order to drive
-  generators, as generators involve storing `Future` types (with unknown
-  sizes) inside of structs.
-
-  In initial testing this seems to make no significant difference in
-  performance (our performance in an actual nixpkgs-eval is still bottlenecked
-  by I/O concerns and reference scanning), but is something to keep in mind
-  later on when we start optimising more after the low-hanging fruits have
-  been reaped.
-
-## Alternatives considered
-
-1. Tacking on more functionality onto the existing VM loop
-   implementation to accomodate problems as they show up. This is not
-   preferred as the code is already getting messy.
-
-2. Making tvix-eval a fully `async` project, pulling in something like Tokio
-   or `async-std` as a runtime. This is not preferred due to the massively
-   increased complexity of those solutions, and all the known issues of fully
-   buying in to the async ecosystem.
-
-   tvix-eval fundamentally should work for use-cases besides building Nix
-   packages (e.g. for `//tvix/serde`), and its profile should be as slim as
-   possible.
-
-3. Convincing the Rust developers that Rust needs a way to guarantee
-   constant-stack-depth tail calls through something like a `tailcall`
-   keyword.
-
-4. ... ?
-
-[`genawaiter`]: https://docs.rs/genawaiter/