diff options
author | Adam Joseph <adam@westernsemico.com> | 2022-11-05T19·10-0700 |
---|---|---|
committer | clbot <clbot@tvl.fyi> | 2022-11-08T08·41+0000 |
commit | e816d3a9dce30dc49ca830d705ccb6e5bfd569e8 (patch) | |
tree | 216bda7f2ecee01d8a35f92fbed3b65d01e2a781 /tvix | |
parent | 83dabf8955bae6d8dce4a06745d72f661fb5a0e0 (diff) |
docs(tvix/eval): document abandoned thread-local vm r/5261
This commit adds a markdown document which explains how the thread-local VM infrastructure works, in case it is useful in the future. Change-Id: Id10e32a9e3c5fa38a15d4bec9800f7234c59234a Reviewed-on: https://cl.tvl.fyi/c/depot/+/7193 Tested-by: BuildkiteCI Autosubmit: Adam Joseph <adam@westernsemico.com> Reviewed-by: tazjin <tazjin@tvl.su>
Diffstat (limited to 'tvix')
-rw-r--r-- | tvix/eval/docs/abandoned/thread-local-vm.md | 233 |
1 files changed, 233 insertions, 0 deletions
diff --git a/tvix/eval/docs/abandoned/thread-local-vm.md b/tvix/eval/docs/abandoned/thread-local-vm.md new file mode 100644 index 000000000000..c6a2d5e07e5c --- /dev/null +++ b/tvix/eval/docs/abandoned/thread-local-vm.md @@ -0,0 +1,233 @@ +# We can't have nice things because IFD + +The thread-local VM work below was ultimately not merged because it +was decided that it would be harmful for `tvix::eval::Value` to +implement `Eq`, `Hash`, or any of the other `std` traits. + +Implementing `std` traits on `Value` was deemed harmful because IFD +can cause arbitrary amounts of compilation to occur, including +network transactions with builders. Obviously it would be +unexpected and error-prone to have a `PartialEq::eq()` which does +something like this. This problem does not manifest within the +"nixpkgs compatibility only" scope, or in any undeprecated language +feature other than IFD. Although IFD is outside the "nixpkgs +compatibility scope", it [has been added to the TVL compatibility +scope](https://cl.tvl.fyi/c/depot/+/7193/comment/3418997b_0dbd0b65/). + +This was the sole reason for not merging. + +The explanation below may be useful in case future circumstances +affect the relevance of the reasoning above. + +The implementation can be found in these CLs: + +- [refactor(tvix/eval): remove lifetime parameter from VM<'o>](https://cl.tvl.fyi/c/depot/+/7194) +- [feat(tvix/eval): [FOUNDLING] thread-local VM](https://cl.tvl.fyi/c/depot/+/7195) +- [feat(tvix/eval): [FOUNDLING] VM::vm_xxx convenience methods](https://cl.tvl.fyi/c/depot/+/7196) +- [refactor(tvix/eval): [FOUNDLING]: drop explicit `&mut vm` parameter](https://cl.tvl.fyi/c/depot/+/7197) + +# Thread-local storage for tvix::eval::vm::VM + +## The problem + +`Value::force()` takes a `&mut VM` argument, since forcing a value +requires executing opcodes. This means that `Value::nix_eq()` too +must take a `&mut VM`, since any sensible definition of equality +will have to force thunks. + +Unfortunately Rust's `PartialEq::eq()` function does not accept any +additional arguments like this, so `Value` cannot implement +`PartialEq`. Worse, structs which *contain* `Value`s can't +implement `PartialEq` either. This means `Value`, and anything +containing it, cannot be the key for a `BTreeMap` or `HashMap`. We +can't even insert `Value`s into a `HashSet`! + +There are other situations like this that don't involve `PartialEq`, +but it's the most glaring one. The main problem is that you need a +`VM` in order to force thunks, and thunks can be anywhere in a +`Value`. + +## Solving the problem with thread-locals + +We could avoid threading the `&mut VM` through the entire codebase +by making it a thread-local. + +To do this without a performance hit, we need to use LLVM +thread-locals, which are the same cost as references to `static`s +but load relative to +[`llvm.threadlocal.address`][threadlocal-intrinsic] instead of +relative to the data segment. Unfortunately `#[thread_local]` [is +unstable][thread-local-unstable] and [unsafe in +general][thread-local-unsafe] for most of the cases where we would +want to use it. There is one [exception][tls-const-init], however: +if a `!thread_local()` has a `const` initializer, the compiler will +insert a `#[thread_local]`; this special case is both safe and +stable. + +The difficult decision is what the type of the thread-local should +be. Since you can't get a mutable reference to a `thread_local!()` +it will have to be some interior-mutability-bestowing wrapper around +our current `struct VM`. Here are the choices: + +### `RefCell<VM>` + +This is the obvious first choice, since it lets you borrow a +`RefMut<Target=VM>`. The problem here is that we want to keep the +codebase written such that all the functions in `impl VM` still take +a `&mut self`. This means that there will be an active mutable +borrow for the duration of `VM::call_builtin()`. So if we implement +`PartialEq` by having `eq()` attempt a second mutable borrow from +the thread-local storage, it will fail since there is already an +active borrow. + +The problem here is that you can't "unborrow" a `RefMut` except by +dropping it. There's no way around this. + +#### Problem: Uglification + +The only solution here is to rewrite all the functions in `impl VM` +so they don't take any kind of `self` argument, and then have them +do a short-lived `.borrow_mut()` from the thread-local `RefCell` +*separately, each time* they want to modify one of the fields of +`VM` (currently `frames`, `stack`, `with_stack`, `warnings`). This +means that if you had a code sequence like this: + +``` +impl VM { + fn foo(&mut self, ...) { + ... + self.frame().ip += 1; + self.some_other_method(); + self.frame().ip += 1; +``` + +You would need to add *two separate `borrow_mut()`s*, one for each +of the `self.frame().ip+=1` statements. You can't just do one big +`borrow_mut()` because `some_other_method()` will call +`borrow_mut()` and panic. + +#### Problem: Performance + +The `RefCell<VM>` approach also has a fairly huge performance hit, +because every single modification to any part of `VM` will require a +reference count increment/decrement, and a conditional branch based +on the check (which will never fail) that the `RefCell` isn't +already mutably borrowed. It will also impede a lot of rustc's +optimizations. + +### `Cell<VM>` + +This is a non-starter because it means that in order to mutate any +field of `VM`, you have to move the entire `struct VM` out of the +`Cell`, mutate it, and move it back in. + +### `Cell<Box<VM>>` + +Now we're getting warmer. Here, we can move the `Box<VM>` out of +the cell with a single pointer-sized memory access. + +We don't want to do the "uglification" described in the previous +section. We are very fortunate that, sometime in mid-2019, the Rust +dieties [decreed by fiat][fiat-decree] that `&Cell<T>` and `&mut T` +are bit-for-bit identical, and even gave us mortals safe wrappers +[`from_mut()`][from_mut] and [`get_mut()`][get_mut] around +`mem::transmute()`. + +So now, when a `VM` method (which takes `&mut self`) calls out to +some external code (like a builtin), instead of passing the `&mut +self` to the external code it can call `Cell::from_mut(&mut self)`, +and then `Cell::swap()` that into the thread-local storage cell for +the duration of the external code. After the external code returns, +it can `Cell::swap()` it back. This whole dance gets wrapped in a +lexical block, and the borrow checker sees that the `&Cell<Box<VM>>` +returned by `Cell::from_mut()` lives only until the end of the +lexical block, *so we get the `&mut self` back after the close-brace +for that block*. NLL FTW. This sounds like a lot of work, but it +should compile down to two pointer-sized loads and two pointer-sized +stores, and it is incurred basically only for `OpBuiltin`. + +This all works, with only two issues: + +1. `vm.rs` needs to be very careful to do the thread-local cell swap + dance before calling anything that might call `PartialEq::eq()` + (or any other method that expects to be able to pull the `VM` out + of thread-local storage). There is no compile-time check that we + did the dance in all the right places. If we forget to do the + dance somewhere we'll get a runtime panic from `Option::expect()` + (see next section). + +2. Since we need to call `Cell::from_mut()` on a `Box<VM>` rather + than a bare `VM`, we still need to rewrite all of `vm.rs` so that + every function takes a `&mut Box<VM>` instead of a `&mut self`. + This creates a huge amount of "noise" in the code. + +Fortunately, it turns out that nearly all the "noise" that arises +from the second point can be eliminated by taking advantage of +[deref coercions][deref-coercions]! This was the last "shoe to +drop". + +There is still the issue of having to be careful about calls from +`vm.rs` to things outside that file, but it's manageable. + +### `Cell<Option<Box<VM>>>` + +In order to get the "safe and stable `#[thread_local]`" +[exception][tls-const-init] we need a `const` initializer, which +means we need to be able to put something into the `Cell` that isn't +a `VM`. So the type needs to be `Cell<Option<Box<VM>>>`. + +Recall that you can't turn an `Option<&T>` into an `&Option<T>`. +The latter type has the "is this a `Some` or `None`" bit immediately +adjacent to the bits representing `T`. So if I hand you a `t:&T` +and you wrap it as `Some(t)`, those bits aren't adjacent in memory. +This means that all the VM methods need to operate on an +`Option<Box<VM>>` -- we can't just wrap a `Some()` around `&mut +self` "at the last minute" before inserting it into the thread-local +storage cell. Fortunately deref coercions save the day here too -- +the coercion is inferred through both layers (`Box` and `Option`) of +wrapper, so there is no additional noise in the code. + +Note that Rust is clever and can find some sequence of bits that +aren't a valid `T`, so `sizeof(Option<T>)==sizeof(T)`. And in fact, +`Box<T>` is one of these cases (and this is guaranteed). So the +`Option` has no overhead. + +# Closing thoughts, language-level support + +This would have been easier with language-level support. + +## What wouldn't help + +Although it [it was decreed][fiat-decree] that `Cell<T>` and `&mut +T` are interchangeable, a `LocalKey<Cell<T>>` isn't quite the same +thing as a `Cell<T>`, so it wouldn't be safe for the standard +library to contain something like this: + +``` +impl<T> LocalKey<Cell<T>> { + fn get_mut(&self) -> &mut T { + unsafe { + // ... mem::transmute() voodoo goes here ... +``` + +The problem here is that you can call `LocalKey<Cell<T>>::get_mut()` twice and +end up with two `&mut T`s that point to the same thing (mutable aliasing) which +results in undefined behavior. + +## What would help + +The ideal solution is for Rust to let you call arbitrary methods +`T::foo(&mut self...)` on a `LocalKey<Cell<T>>`. This way you can +have one (and only one) `&mut T` at any syntactical point in your +program -- the `&mut self`. + + +[tls-const-init]: https://github.com/rust-lang/rust/pull/90774 +[thread-local-unstable]: https://github.com/rust-lang/rust/issues/29594 +[thread-local-unsafe-generally]: https://github.com/rust-lang/rust/issues/54366 +[fiat-decree]: https://github.com/rust-lang/rust/issues/43038 +[from_mut]: https://doc.rust-lang.org/stable/std/cell/struct.Cell.html#method.from_mut +[get_mut]: https://doc.rust-lang.org/stable/std/cell/struct.Cell.html#method.get_mut +[thread-local-unsafe]: [https://github.com/rust-lang/rust/issues/54366] +[deref-coercions]: https://doc.rust-lang.org/book/ch15-02-deref.html#implicit-deref-coercions-with-functions-and-methods +[threadlocal-intrinsic]: https://llvm.org/docs/LangRef.html#llvm-threadlocal-address-intrinsic |