depot/tvix/eval/src/opcode.rs, branch refs/r/8925

refactor(tvix/eval): ensure VM operations fit in a single byte

2024-08-19T11:02:50+00:00

This replaces the OpCode enum with a new Op enum which is guaranteed to fit in a single byte. Instead of carrying enum variants with data, every variant that has runtime data encodes it into the `Vec` that a `Chunk` now carries. This has several advantages: * Less stack space is required at runtime, and fewer allocations are required while compiling. * The OpCode doesn't need to carry "weird" special-cased data variants anymore. * It is faster (albeit, not by much). On my laptop, results consistently look approximately like this: Benchmark 1: ./before -E '(import {}).firefox.outPath' --log-level ERROR --no-warnings Time (mean ± σ): 8.224 s ± 0.272 s [User: 7.149 s, System: 0.688 s] Range (min … max): 7.759 s … 8.583 s 10 runs Benchmark 2: ./after -E '(import {}).firefox.outPath' --log-level ERROR --no-warnings Time (mean ± σ): 8.000 s ± 0.198 s [User: 7.036 s, System: 0.633 s] Range (min … max): 7.718 s … 8.334 s 10 runs See notes below for why the performance impact might be less than expected. * It is faster while at the same time dropping some optimisations we previously performed. This has several disadvantages: * The code is closer to how one would write it in C or Go. * Bit shifting! * There is (for now) slightly more code than before. On performance I have the following thoughts at the moment: In order to prepare for adding GC, there's a couple of places in Tvix where I'd like to fence off certain kinds of complexity (such as mutating bytecode, which, for various reaons, also has to be part of data that is subject to GC). With this change, we can drop optimisations like retroactively modifying existing bytecode and *still* achieve better performance than before. I believe that this is currently worth it to pave the way for changes that are more significant for performance. In general this also opens other avenues of optimisation: For example, we can profile which argument sizes actually exist and remove the copy overhead of varint decoding (which does show up in profiles) by using more adequately sized types for, e.g., constant indices. Known regressions: * Op::Constant is no longer printing its values in disassembly (this can be fixed, I just didn't get around to it, will do separately). Change-Id: Id9b3a4254623a45de03069dbdb70b8349e976743 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12191 Tested-by: BuildkiteCI Reviewed-by: flokli

refactor(tvix/eval): let OpCoerceToString select the CoercionKind

2023-12-29T21:34:45+00:00

Change-Id: I92d58ef216d7e0766af70f019b3dcd445284a95d Reviewed-on: https://cl.tvl.fyi/c/depot/+/10344 Reviewed-by: tazjin Tested-by: BuildkiteCI

fix(tvix/eval): remove incorrect imports when coercing

2023-12-14T13:15:23+00:00

The default behavior of string coercion in C++ Nix is to weakly coerce and import to store if necessary. There is a flag to make it strongly coerce (coerceMore) and a flag that controls whether path values have the corresponding file/directory imported into the store before returning the (store) path as a string (copyToStore). We need to implement our equivalent to the copyToStore (import_paths) flag for the benefit of weak coercions that don't import into the store (dirOf, baseNameOf, readFile, ...) and strong coercions that don't import into the store (toString). This makes coerce_to_string as well as CoercionKind weirder and more versatile, but prevents us from reimplementing parts of the coercion logic constantly as can be seen in the case of baseNameOf. Note that it is not possible to test this properly in //tvix/eval tests due to the lack of an appropriate EvalIO implementation being available. Tests should be added to //tvix/glue down the line. Change-Id: I8fb8ab99c7fe08e311d2ba1c36960746bf22f566 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10361 Autosubmit: sterni Tested-by: BuildkiteCI Reviewed-by: Adam Joseph

fix(tvix/eval): fix branching on catchable defaults (b/343)

2023-12-12T14:55:48+00:00

This commit adds Opcode::OpJumpIfCatchable, which can be inserted ahead of most VM operations which expect a boolean on the stack, in order to handle catchables in branching position properly. Other than remembering to patch the jump, no other changes should be required. This commit also fixes b/343 by emitting this new opcode when compiling if-then-else. There are probably other places where we need to do the same thing. Change-Id: I48de3010014c0bbeba15d34fc0d4800e0bb5a1ef Reviewed-on: https://cl.tvl.fyi/c/depot/+/10288 Tested-by: BuildkiteCI Reviewed-by: tazjin Autosubmit: Adam Joseph

docs(tvix/eval): document remaining opcodes

2023-09-08T21:38:42+00:00

Change-Id: Iad8a5f78930872719b6481ddf1bbad8532bfa888 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7981 Reviewed-by: flokli Tested-by: BuildkiteCI

docs(tvix/eval): document attribute set related opcodes

2023-09-08T21:38:42+00:00

Change-Id: Ib98a9fe8c9aa3f0e61c27d10285c0926cda7a969 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7979 Tested-by: BuildkiteCI Reviewed-by: flokli

docs(tvix/eval): add documentation strings for some OpCode variants

2023-09-08T21:38:42+00:00

Change-Id: I42e610740b3687e1fd897d36694cce425751a8bc Reviewed-on: https://cl.tvl.fyi/c/depot/+/7975 Tested-by: BuildkiteCI Autosubmit: tazjin Reviewed-by: flokli

fix(tvix/eval): only finalise formal arguments if defaulting

2023-06-20T10:07:44+00:00

When dealing with a formal argument in a function argument pattern that has a default expression, there are two different things that can happen at runtime: Either we select its value from the passed attribute successfully or we need to use the default expression. Both of these may be thunks and both of these may need finalisers. However, in the former case this is taken care of elsewhere, the value will always be finalised already if necessary. In the latter case we may need to finalise the thunk resulting from the default expression. However, the thunk corresponding to the expression may never end up in the local's stack slot. Since finalisation goes by stack slot (and not constants), we need to prevent a case where we don't fall back to the default expression, but finalise anyways. Previously, we worked around this by making `OpFinalise` ignore non-thunks. Since finalisation of already evaluated thunks still crashed, the faulty compilation of function pattern arguments could still cause a crash. As a new approach, we reinstate the old behavior of `OpFinalise` to crash whenever encountering something that is either not a thunk or doesn't need finalisation. This can also help catching (similar) miscompilations in the future. To then prevent the crash, we need to track whether we have fallen back or not at runtime. This is done using an additional phantom on the stack that holds a new `FinaliseRequest` value. When it comes to finalisation we check this value and conditionally execute `OpFinalise` based on its value. Resolves b/261 and b/265 (partially). Change-Id: Ic04fb80ec671a2ba11fa645090769c335fb7f58b Reviewed-on: https://cl.tvl.fyi/c/depot/+/8705 Reviewed-by: tazjin Tested-by: BuildkiteCI Autosubmit: sterni

fix(tvix/eval): type check function argument with set pattern

2023-06-07T15:17:20+00:00

C++ Nix forces and typechecks the passed argument even if it is not necessary in order to compute the return value of the function. I discovered this when I thought our formals miscompilation might be that we are too strict, but doesn't look like it in this case. Change-Id: Ifb3c92592293052c489d1e3ae8c7c54e4b6b4dc6 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8701 Tested-by: BuildkiteCI Autosubmit: sterni Reviewed-by: tazjin

refactor(tvix/eval): flatten call stack of VM using generators

2023-03-13T20:30:59+00:00

Warning: This is probably the biggest refactor in tvix-eval history, so far. This replaces all instances of trampolines and recursion during evaluation of the VM loop with generators. A generator is an asynchronous function that can be suspended to yield a message (in our case, vm::generators::GeneratorRequest) and receive a response (vm::generators::GeneratorResponsee). The `genawaiter` crate provides an interpreter for generators that can drive their execution and lets us move control flow between the VM and suspended generators. To do this, massive changes have occured basically everywhere in the code. On a high-level: 1. The VM is now organised around a frame stack. A frame is either a call frame (execution of Tvix bytecode) or a generator frame (a running or suspended generator). The VM has an outer loop that pops a frame off the frame stack, and then enters an inner loop either driving the execution of the bytecode or the execution of a generator. Both types of frames have several branches that can result in the frame re-enqueuing itself, and enqueuing some other work (in the form of a different frame) on top of itself. The VM will eventually resume the frame when everything "above" it has been suspended. In this way, the VM's new frame stack takes over much of the work that was previously achieved by recursion. 2. All methods previously taking a VM have been refactored into async functions that instead emit/receive generator messages for communication with the VM. Notably, this includes *all* builtins. This has had some other effects: - Some test have been removed or commented out, either because they tested code that was mostly already dead (nix_eq) or because they now require generator scaffolding which we do not have in place for tests (yet). - Because generator functions are technically async (though no async IO is involved), we lose the ability to use much of the Rust standard library e.g. in builtins. This has led to many algorithms being unrolled into iterative versions instead of iterator combinations, and things like sorting had to be implemented from scratch. - Many call sites that previously saw a `Result<..., ErrorKind>` bubble up now only see the result value, as the error handling is encapsulated within the generator loop. This reduces number of places inside of builtin implementations where error context can be attached to calls that can fail. Currently what we gain in this tradeoff is significantly more detailed span information (which we still need to bubble up, this commit does not change the error display). We'll need to do some analysis later of how useful the errors turn out to be and potentially introduce some methods for attaching context to a generator frame again. This change is very difficult to do in stages, as it is very much an "all or nothing" change that affects huge parts of the codebase. I've tried to isolate changes that can be isolated into the parent CLs of this one, but this change is still quite difficult to wrap one's mind and I'm available to discuss it and explain things to any reviewer. Fixes: b/238, b/237, b/251 and potentially others. Change-Id: I39244163ff5bbecd169fe7b274df19262b515699 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8104 Reviewed-by: raitobezarius Reviewed-by: Adam Joseph Tested-by: BuildkiteCI