about summary refs log tree commit diff
path: root/tvix/eval/src/compiler
AgeCommit message (Collapse)AuthorFilesLines
2024-08-19 r/8519 refactor(tvix/eval): ensure VM operations fit in a single byteVincent Ambo3-217/+193
This replaces the OpCode enum with a new Op enum which is guaranteed to fit in a single byte. Instead of carrying enum variants with data, every variant that has runtime data encodes it into the `Vec<u8>` that a `Chunk` now carries. This has several advantages: * Less stack space is required at runtime, and fewer allocations are required while compiling. * The OpCode doesn't need to carry "weird" special-cased data variants anymore. * It is faster (albeit, not by much). On my laptop, results consistently look approximately like this: Benchmark 1: ./before -E '(import <nixpkgs> {}).firefox.outPath' --log-level ERROR --no-warnings Time (mean ± σ): 8.224 s ± 0.272 s [User: 7.149 s, System: 0.688 s] Range (min … max): 7.759 s … 8.583 s 10 runs Benchmark 2: ./after -E '(import <nixpkgs> {}).firefox.outPath' --log-level ERROR --no-warnings Time (mean ± σ): 8.000 s ± 0.198 s [User: 7.036 s, System: 0.633 s] Range (min … max): 7.718 s … 8.334 s 10 runs See notes below for why the performance impact might be less than expected. * It is faster while at the same time dropping some optimisations we previously performed. This has several disadvantages: * The code is closer to how one would write it in C or Go. * Bit shifting! * There is (for now) slightly more code than before. On performance I have the following thoughts at the moment: In order to prepare for adding GC, there's a couple of places in Tvix where I'd like to fence off certain kinds of complexity (such as mutating bytecode, which, for various reaons, also has to be part of data that is subject to GC). With this change, we can drop optimisations like retroactively modifying existing bytecode and *still* achieve better performance than before. I believe that this is currently worth it to pave the way for changes that are more significant for performance. In general this also opens other avenues of optimisation: For example, we can profile which argument sizes actually exist and remove the copy overhead of varint decoding (which does show up in profiles) by using more adequately sized types for, e.g., constant indices. Known regressions: * Op::Constant is no longer printing its values in disassembly (this can be fixed, I just didn't get around to it, will do separately). Change-Id: Id9b3a4254623a45de03069dbdb70b8349e976743 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12191 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2024-08-07 r/8453 feat(tvix/eval): Forbid Hash{Map,Set}, use Fx insteadAspen Smith3-12/+11
Per https://nnethercote.github.io/perf-book/hashing.html, we have basically no reason to use the default hasher over a faster, non-DoS-resistant hasher. This gives a nice perf boost basically for free: hello outpath time: [704.76 ms 714.91 ms 725.63 ms] change: [-7.2391% -6.1018% -4.9189%] (p = 0.00 < 0.05) Performance has improved. Change-Id: If5587f444ed3af69f8af4eead6af3ea303b4ae68 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12046 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de> Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com> Autosubmit: aspen <root@gws.fyi>
2024-07-07 r/8355 fix(tvix/repl): Share globals and sourcemap across evaluationsAspen Smith1-1/+1
Now that we can bind (potentially lazy, potentially lambda-containing) values in the REPL and then reference them in subsequent evaluations, it's important that the values to which we construct shared references are shared across those subsequent evaluations - otherwise, we get panics due to unknown source map locations, or dropped weak references to globals. This change assigns both the globals and the source map as fields on the Repl after the first evaluation, and then passes those in (to the EvaluationBuilder) on subsequent evaluations. On the EvaluationBuilder side, there's some panicking introduced - this is intentional, as my intent is for the builder to be configured statically enough that panicking is the best way to report errors here (it's always a bug to misconfigure an Evaluation, and we'd never want to handle it dynamically). Change-Id: I37225697235c22b683ca48a17d30fa8fedd12d1b Reviewed-on: https://cl.tvl.fyi/c/depot/+/11960 Reviewed-by: flokli <flokli@flokli.de> Autosubmit: aspen <root@gws.fyi> Tested-by: BuildkiteCI
2024-07-06 r/8352 refactor(tvix/eval): Construct globals in EvaluationBuilder::buildAspen Smith1-6/+0
Construct the Rc<GlobalsMap> for the evaluation as part of EvaluiationBuilder::build, rather than deferring it until we actually compile. This changes nothing functionally, but gets us one step closer to sharing this globals map across evaluations. Change-Id: Id92e9fb88d974d763056d4f15ce61962ab776e84 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11957 Tested-by: BuildkiteCI Autosubmit: aspen <root@gws.fyi> Reviewed-by: flokli <flokli@flokli.de>
2024-07-05 r/8345 feat(tvix/eval): Allow passing in an env to evaluationAspen Smith4-10/+57
Allow passing in a top-level env, a map from name to value, to evaluation. The intent is to support bound identifiers in the REPL just like upstream nix does. Getting this working involves mucking around a bit with internals - most notably, locals now only optionally have a Span (since locals don't have an easy span we can use) - and getting that working requires propagating some minor hacks to places where we currently *need* a span (and which would require too much changing now to make spans optional; my guess is that that would essentially end up making spans optional throughout the codebase). Also, some extra care has to be taken to close out the scope in the case that we do pass in an env, to avoid breaking our assumptions about the size of the stack when we return from the toplevel Change-Id: Ie475b2d3dfc72ccbf298d2a3ea28c63ac877d653 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11953 Tested-by: BuildkiteCI Autosubmit: aspen <root@gws.fyi> Reviewed-by: flokli <flokli@flokli.de>
2024-07-05 r/8343 refactor(tvix/eval): Drop LightSpan entirelyAspen Smith1-6/+2
This was made unnecessary in c92d06271 (feat(tvix/eval): drop LightSpan::Delayed, 2023-12-08) because it didn't improve benchmarks as much as expected and has been vestigial since; this continues the cleanup by just removing it altogether Change-Id: I21ec7ae9b52a5cccd2092696a5a87f658194d672 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11949 Autosubmit: aspen <root@gws.fyi> Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2024-04-09 r/7882 fix(tvix): Avoid buffering file into memory in builtins.hashFileConnor Brewster1-4/+5
Right now `builtins.hashFile` always reads the entire file into memory before hashing, which is not ideal for large files. This replaces `read_to_string` with `open_file` which allows calculating the hash of the file without buffering it entirely into memory. Other callers can continue to buffer into memory if they choose, but they still use the `open_file` VM request and then call `read_to_string` or `read_to_end` on the `std::io::Reader`. Fixes b/380 Change-Id: Ifa1c8324bcee8f751604b0b449feab875c632fda Reviewed-on: https://cl.tvl.fyi/c/depot/+/11236 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-02-21 r/7591 feat(tvix/eval): Store string context alongside dataAspen Smith1-1/+1
Previously, Nix strings were represented as a Box (within Value) pointing to a tuple of an optional context, and another Box pointing to the actual string allocation itself. This is pretty inefficient, both in terms of memory usage (we use 48 whole bytes for a None context!) and in terms of the extra indirection required to get at the actual data. It was necessary, however, because with native Rust DSTs if we had something like `struct NixString(Option<NixContext>, BStr)` we could only pass around *fat* pointers to that value (with the length in the pointer) and that'd make Value need to be bigger (which is a waste of both memory and cache space, since that memory would be unused for all other Values). Instead, this commit implements *manual* allocation of a packed string representation, with the length *in the allocation* as a field past the context. This requires a big old pile of unsafe Rust, but the payoff is clear: hello outpath time: [882.18 ms 897.16 ms 911.23 ms] change: [-15.143% -13.819% -12.500%] (p = 0.00 < 0.05) Performance has improved. Fortunately this change can be localized entirely within value/string.rs, since we were abstracting things out nicely. Change-Id: Ibf56dd16c9c503884f64facbb7f0ac596463efb6 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10852 Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz> Autosubmit: aspen <root@gws.fyi>
2024-02-20 r/7571 refactor(tvix/eval): add SourceCode directly into error typesVincent Ambo3-32/+41
With this change it's no longer necessary to track the SourceCode struct separately from the evaluation for error reporting: It's just stored directly in the errors. This also ends up resolving an issue in compiler::bindings, where we cloned the Arc containing file references way too often. In fact those clones probably compensate for all additional SourceCode clones during error construction now. Change-Id: Ice93bf161e61f8ea3d48103435e20c53e6aa8c3a Reviewed-on: https://cl.tvl.fyi/c/depot/+/10986 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-02-13 r/7508 feat(tvix/eval): Box Value::CatchableAspen Smith2-3/+3
This is now the only enum variant for Value that is larger than 8 bytes (it's 16 bytes), so boxing it (especially since it's not perf-critical) allows us to get the Value size down to only 16 bytes! Change-Id: I98598e2b762944448bef982e8ff7da6d6683c4aa Reviewed-on: https://cl.tvl.fyi/c/depot/+/10798 Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz> Autosubmit: aspen <root@gws.fyi>
2024-02-13 r/7507 revert(tvix/eval): Don't double-box Path valuesAspen Smith1-6/+3
This reverts commit d3d41552cf1f6485f8ebc597a2128a0d15b030a5. This was well-intentioned, but now the boxed Path values are actually the *largest* Value enum variants, at 16 bytes (because they're fat-pointers, with a len) instead of 8 bytes like all the other values. Having the double reference is a reasonable price to pay (it seems; more benchmarks may end up disagreeing) for a smaller Value repr. Change-Id: I0d3e84f646c8f5ffd0b7259c4e456637eea360f7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10797 Tested-by: BuildkiteCI Autosubmit: aspen <root@gws.fyi> Reviewed-by: sterni <sternenseemann@systemli.org>
2024-02-02 r/7467 refactor(tvix/eval): Box Value::StringAspen Smith2-8/+8
NixString is *quite* large - like 80 bytes - because of the extra capacity value for BString and because of the context. We want to keep Value small since we're passing it around a lot, so let's box the NixString inside Value::String to save on some memory, and make cloning ostensibly a little cheaper Change-Id: I343c8b4e7f61dc3dcbbaba4382efb3b3e5bbabb2 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10729 Tested-by: BuildkiteCI Reviewed-by: sterni <sternenseemann@systemli.org>
2024-02-01 r/7463 feat(tvix/eval): Don't emit OpForce for non-thunk constantsAspen Smith1-0/+9
In the compiler, skip emitting an OpForce if the last op was an OpConstant for a non-thunk constant. This gives a small (~1% on my machine) perf boost, eg when evaluating hello.outPath: ❯ hyperfine \ "./before --no-warnings -E '(import <nixpkgs> {}).hello.outPath'" \ "./after --no-warnings -E '(import <nixpkgs> {}).hello.outPath'" Benchmark 1: ./before --no-warnings -E '(import <nixpkgs> {}).hello.outPath' Time (mean ± σ): 1.151 s ± 0.022 s [User: 1.003 s, System: 0.151 s] Range (min … max): 1.123 s … 1.184 s 10 runs Benchmark 2: ./after --no-warnings -E '(import <nixpkgs> {}).hello.outPath' Time (mean ± σ): 1.140 s ± 0.022 s [User: 0.989 s, System: 0.152 s] Range (min … max): 1.115 s … 1.175 s 10 runs Summary ./after --no-warnings -E '(import <nixpkgs> {}).hello.outPath' ran 1.01 ± 0.03 times faster than ./before --no-warnings -E '(import <nixpkgs> {}).hello.outPath' Change-Id: I2105fd431d4bad699087907e16c789418e9a4062 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10714 Reviewed-by: sterni <sternenseemann@systemli.org> Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2024-02-01 r/7461 refactor(tvix/eval): Don't double-box Path valuesAspen Smith1-3/+6
PathBuf internally contains a heap pointer (an OsString), so we were in effect double-boxing here. Removing the extra layer by making Tvix::Value represented by a Box<Path> rather than a Box<PathBuf> saves us an indirection, while still avoiding the extra memory overhead of the capacity which was the reason we were boxing PathBuf in the first place. Change-Id: I8c185b9d4646161d1921917f83e87421496a3e24 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10725 Reviewed-by: sterni <sternenseemann@systemli.org> Autosubmit: aspen <root@gws.fyi> Tested-by: BuildkiteCI
2024-01-31 r/7460 fix(tvix): Represent strings as byte arraysAspen Smith2-3/+4
C++ nix uses C-style zero-terminated char pointers to represent strings internally - however, up to this point, tvix has used Rust `String` and `str` for string values. Since those are required to be valid utf-8, we haven't been able to properly represent all the string values that Nix supports. To fix that, this change converts the internal representation of the NixString struct from `Box<str>` to `BString`, from the `bstr` crate - this is a wrapper around a `Vec<u8>` with extra functions for treating that byte vector as a "morally string-like" value, which is basically exactly what we need. Since this changes a pretty fundamental assumption about a pretty core type, there are a *lot* of changes in a lot of places to make this work, but I've tried to keep the general philosophy and intent of most of the code in most places intact. Most notably, there's nothing that's been done to make the derivation stuff in //tvix/glue work with non-utf8 strings everywhere, instead opting to just convert to String/str when passing things into that - there *might* be something to be done there, but I don't know what the rules should be and I don't want to figure them out in this change. To deal with OS-native paths in a way that also works in WASM for tvixbolt, this also adds a dependency on the "os_str_bytes" crate. Fixes: b/189 Fixes: b/337 Change-Id: I5e6eb29c62f47dd91af954f5e12bfc3d186f5526 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10200 Reviewed-by: tazjin <tazjin@tvl.su> Reviewed-by: flokli <flokli@flokli.de> Reviewed-by: sterni <sternenseemann@systemli.org> Autosubmit: aspen <root@gws.fyi> Tested-by: BuildkiteCI
2024-01-25 r/7448 feat(tvix/eval): track pattern binding namesFlorian Klink1-3/+11
These need to be preserved at least for builtins.toXML. Also, we incorrectly only wrote an <attrspat> in case ellipsis was true, but that's not the case. Change-Id: I6bff9c47c2922f878d5c43e48280cda9c9ddb692 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10686 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: aspen <root@gws.fyi>
2024-01-24 r/7447 fix(tvix/eval/value/function): use BTreeMap for function arg namesFlorian Klink1-2/+2
At least toXML wants to get these out in a sorted fashion. Change-Id: I6373d7488fff7c40dc2ddeeecd03ba537c92c4af Reviewed-on: https://cl.tvl.fyi/c/depot/+/10685 Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-12-29 r/7272 refactor(tvix/eval): let OpCoerceToString select the CoercionKindAdam Joseph1-1/+8
Change-Id: I92d58ef216d7e0766af70f019b3dcd445284a95d Reviewed-on: https://cl.tvl.fyi/c/depot/+/10344 Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-12-12 r/7193 fix(tvix/eval): add unimplemented __curPos and builtins.filterSourceAdam Joseph1-0/+1
This commit adds __curPos (to the global scope, yuck) and builtins.filterSource. These are not implemented; forcing them will produce the same result as `throw "message"`. Unfortunately these two post-2.3 features are used throughout nixpkgs. Since an unresolved indentifier is a catchable error, this breaks the entire release eval. With this commit, it simply causes those broken packages that use these features to appear as they are: broken. Change-Id: Ib43dea571f6a9fab4d54869349f80ee4ec5424c2 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10297 Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI Autosubmit: Adam Joseph <adam@westernsemico.com>
2023-12-12 r/7192 fix(tvix/eval): propagate catchables through `&&`Adam Joseph1-0/+2
Change-Id: I7bb5ac1ef47b41c47269e64cee0e90eb64c6fcce Reviewed-on: https://cl.tvl.fyi/c/depot/+/10322 Autosubmit: Adam Joseph <adam@westernsemico.com> Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2023-12-12 r/7191 fix(tvix/eval): make `||` propagate catchablesAdam Joseph1-0/+2
Change-Id: I42f994d7c9228368d5f6c30c4730c24666f7bc69 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10320 Autosubmit: Adam Joseph <adam@westernsemico.com> Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2023-12-12 r/7190 fix(tvix/eval): fix nested assertions b/340Adam Joseph1-0/+2
This commit fixes our handling of `throw` within an `assert` condition. Fixes: b/340 Change-Id: I40a383639ec266da50a853f16216b1b7868495da Reviewed-on: https://cl.tvl.fyi/c/depot/+/10318 Autosubmit: Adam Joseph <adam@westernsemico.com> Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2023-12-12 r/7186 fix(tvix/eval): fix catchables in named formalsAdam Joseph1-11/+27
Fixes b/348. Change-Id: I5e8d56b5fd26a19eac32ec5e11baf93765691dc8 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10296 Autosubmit: Adam Joseph <adam@westernsemico.com> Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-12-12 r/7180 fix(tvix/eval): fix recovering from throws in implicationsAdam Joseph1-0/+2
This fixes b/345. Change-Id: Ic0d3b6ffacd2a5e0050d22354d08320b69a4fe13 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10290 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su> Autosubmit: Adam Joseph <adam@westernsemico.com>
2023-12-12 r/7178 fix(tvix/eval): fix branching on catchable defaults (b/343)Adam Joseph1-0/+6
This commit adds Opcode::OpJumpIfCatchable, which can be inserted ahead of most VM operations which expect a boolean on the stack, in order to handle catchables in branching position properly. Other than remembering to patch the jump, no other changes should be required. This commit also fixes b/343 by emitting this new opcode when compiling if-then-else. There are probably other places where we need to do the same thing. Change-Id: I48de3010014c0bbeba15d34fc0d4800e0bb5a1ef Reviewed-on: https://cl.tvl.fyi/c/depot/+/10288 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su> Autosubmit: Adam Joseph <adam@westernsemico.com>
2023-11-05 r/6955 chore(tvix): fix trivial clippy lintsVincent Ambo1-4/+4
Relates to b/321. Change-Id: I37284f89b186e469eb432e2bbedb37aa125a6ad4 Reviewed-on: https://cl.tvl.fyi/c/depot/+/9961 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de> Autosubmit: tazjin <tazjin@tvl.su>
2023-09-24 r/6650 fix(tvix/eval): fix b/281 by adding Value::CatchableAdam Joseph2-9/+9
This commit makes catchable errors a variant of Value. The main downside of this approach is that we lose the ability to use Rust's `?` syntax for propagating catchable errors. Change-Id: Ibe89438d8a70dcec29e016df692b5bf88a5cad13 Reviewed-on: https://cl.tvl.fyi/c/depot/+/9289 Reviewed-by: tazjin <tazjin@tvl.su> Autosubmit: Adam Joseph <adam@westernsemico.com> Tested-by: BuildkiteCI
2023-09-24 r/6649 refactor(tvix/eval): factor CatchableErrorKind out of ErrorKindAdam Joseph1-2/+4
This commit creates a separate enum for "catchable" errors (the kind that `builtins.tryEval` can detect). Change-Id: Ie81d1112526d852255d9842f67045f88eab192af Reviewed-on: https://cl.tvl.fyi/c/depot/+/9287 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su> Autosubmit: Adam Joseph <adam@westernsemico.com>
2023-09-22 r/6624 docs(tvix/eval): fix some broken docstr referencesFlorian Klink1-1/+1
There's some more left, but they've been renamed/refactored out of sight. Change-Id: I41579dedc74342b4c5f8cb39d2995b5b0c90b0f4 Reviewed-on: https://cl.tvl.fyi/c/depot/+/9372 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com> Autosubmit: flokli <flokli@flokli.de>
2023-06-21 r/6341 fix(tvix/eval): use realpaths for import cachesterni1-0/+1
I've noticed this behavior when writing the admittedly cursed test case included in this CL. Alternatively we could use some sort of machinery using `builtins.trace`, but I don't think we capture stderr anywhere. I've elected to put this into the eval cache itself while C++ Nix does it in builtins.import already, namely via `realisePath`. We don't have an equivalent for this yet, since we don't support any kind of IfD, but we could revise that later. In any case, it seems good to encapsulate `ImportCache` in this way, as it'll also allow using file hashes as identifiers, for example. C++ Nix also does our equivalent of canon_path in `builtins.import` which we still don't, but I suspect it hardly makes a difference. Change-Id: I05004737ca2458a4c67359d9e7d9a2f2154a0a0f Reviewed-on: https://cl.tvl.fyi/c/depot/+/8839 Autosubmit: sterni <sternenseemann@systemli.org> Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-06-20 r/6336 fix(tvix/eval): only finalise formal arguments if defaultingsterni1-34/+165
When dealing with a formal argument in a function argument pattern that has a default expression, there are two different things that can happen at runtime: Either we select its value from the passed attribute successfully or we need to use the default expression. Both of these may be thunks and both of these may need finalisers. However, in the former case this is taken care of elsewhere, the value will always be finalised already if necessary. In the latter case we may need to finalise the thunk resulting from the default expression. However, the thunk corresponding to the expression may never end up in the local's stack slot. Since finalisation goes by stack slot (and not constants), we need to prevent a case where we don't fall back to the default expression, but finalise anyways. Previously, we worked around this by making `OpFinalise` ignore non-thunks. Since finalisation of already evaluated thunks still crashed, the faulty compilation of function pattern arguments could still cause a crash. As a new approach, we reinstate the old behavior of `OpFinalise` to crash whenever encountering something that is either not a thunk or doesn't need finalisation. This can also help catching (similar) miscompilations in the future. To then prevent the crash, we need to track whether we have fallen back or not at runtime. This is done using an additional phantom on the stack that holds a new `FinaliseRequest` value. When it comes to finalisation we check this value and conditionally execute `OpFinalise` based on its value. Resolves b/261 and b/265 (partially). Change-Id: Ic04fb80ec671a2ba11fa645090769c335fb7f58b Reviewed-on: https://cl.tvl.fyi/c/depot/+/8705 Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI Autosubmit: sterni <sternenseemann@systemli.org>
2023-06-14 r/6297 fix(tvix/eval): don't thunk home relative pathssterni1-11/+10
C++ Nix resolves home relative paths at [parse] time. This is not an option for us, since it prevents being able to separate the compilation and execution phase later (e.g. precompiled nix expressions). However, a practical consequence of this is that paths expressions are always literals (strict) and never thunks. [parse]: https://github.com/NixOS/nix/blob/7066d21a0ddb421967980094222c4bc1f5a0f45a/src/libexpr/parser.y#L518-L527 Change-Id: Ie4b9dc68f62c86d6c7fd5f1c9460c850d97ed1ca Reviewed-on: https://cl.tvl.fyi/c/depot/+/7041 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2023-06-07 r/6244 fix(tvix/eval): use normal thunking behavior for default in formalssterni1-9/+2
When comparing to C++ Nix, we notice that the thunking of default expressions in function formals corresponds to their normal thunking, e.g. literals are not thunked. This means that we can just invoke compile() without much of a care and trust that it will sort it out correctly. If function formals blow up as a result of this, it likely indicates that the expression is treated incorrectly by compile(), not compile_param_pattern(). Change-Id: I64acbff2f251423eb72ce43e56a0603379305e1d Reviewed-on: https://cl.tvl.fyi/c/depot/+/8704 Autosubmit: sterni <sternenseemann@systemli.org> Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2023-06-07 r/6243 fix(tvix/eval): type check function argument with set patternsterni1-0/+1
C++ Nix forces and typechecks the passed argument even if it is not necessary in order to compute the return value of the function. I discovered this when I thought our formals miscompilation might be that we are too strict, but doesn't look like it in this case. Change-Id: Ifb3c92592293052c489d1e3ae8c7c54e4b6b4dc6 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8701 Tested-by: BuildkiteCI Autosubmit: sterni <sternenseemann@systemli.org> Reviewed-by: tazjin <tazjin@tvl.su>
2023-06-07 r/6242 refactor(tvix/eval): don't track idx twice in compile_param_patternsterni1-9/+7
Change-Id: I27f9105ddb20d84342550b2a73b479a7764ee3fe Reviewed-on: https://cl.tvl.fyi/c/depot/+/8699 Reviewed-by: tazjin <tazjin@tvl.su> Autosubmit: sterni <sternenseemann@systemli.org> Tested-by: BuildkiteCI
2023-05-29 r/6217 fix(tvix/eval): thunk lambda expressionssterni1-5/+3
As cl/8658 and b/274 reveal, lambda expressions are also wrapped in thunks. Resolves b/274. Change-Id: I02fe5c8730ac76748d940e4f4427116587875275 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8662 Autosubmit: sterni <sternenseemann@systemli.org> Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2023-05-29 r/6216 fix(tvix/eval): thunk HasAttr expressionssterni1-1/+3
HasAttrs was weird because with longer attribute paths it would sometimes not turn out to be a thunk. If it was a thunk, it'd usually still do some eval strictly which we'll want to avoid. Verified against C++ Nix using a new test suite introduced in a later CL. Change-Id: I6d047ccc68d046bb268462f170a3c4f3c5ddeffe Reviewed-on: https://cl.tvl.fyi/c/depot/+/8656 Autosubmit: sterni <sternenseemann@systemli.org> Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2023-05-29 r/6215 fix(tvix/eval): thunk legacy let to match regular onesterni1-1/+3
Probably no real world code broken by this overzealous evaluation, but let's be thorough! Change-Id: Ib405a677182eab7940ace940c68e107573473a54 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8655 Reviewed-by: tazjin <tazjin@tvl.su> Autosubmit: sterni <sternenseemann@systemli.org> Tested-by: BuildkiteCI
2023-05-29 r/6214 fix(tvix/eval): thunk unary operator applicationssterni1-1/+1
Unary operator applications are thunked which can easily be observed by nix-instantiate --eval -E '[ (!true) (-1) ]' Unfortunately, there are few simple expressions where this makes a difference in the end result. Thus it only cropped up when using nixpkgs for cross compilation: Here we would compile the expression !(stdenv.cc.isGNU or false) to assemble python3Minimal's passthru attribute set (at least this seems to be the most likely explanation from the backtraces I've studied). This means that an unthunked <stdenv.cc.isGNU or false> OpForce OpInvert would be performed in order to assemble this attribute set, causing stdenv.cc to be evaluated too early, causing an infinite recursion. Resolves b/273. It seems that having a test suite that doesn't use --strict and relies on thunks rendered as <CODE> would be beneficial for catching such issues. I've not been able to find a test case with --strict that demonstrates the problem fixed in this CL. Change-Id: I640a5827b963f5b9d0f86fa2142e75e3a6bbee78 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8654 Tested-by: BuildkiteCI Autosubmit: sterni <sternenseemann@systemli.org> Reviewed-by: tazjin <tazjin@tvl.su>
2023-05-25 r/6205 feat(tvix/eval): unthunk empty lists and attribute setsVincent Ambo2-0/+8
Change-Id: Ie66cb1b163a544d45d113fd0f866286f230b0188 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7960 Tested-by: BuildkiteCI Reviewed-by: grfn <grfn@gws.fyi>
2023-05-25 r/6204 feat(tvix/eval): implement unthunking in compilerVincent Ambo1-1/+22
This feature allows the compiler to detect situations where the created thunk is useless and can be merged into the parent chunk instead. The only case where the compiler does this initially is when statically optimising a select expression. For example, previously the expression `builtins.length` compiled into two thunks: 1. An "inner" thunk which contained an `OpConstant` that had the optimised `length` builtin in it. 2. An "outer" thunk which contained an `OpConstant` to access the inner thunk, and the trailing OpForce of the top-level program. With this change, the inner thunk is skipped completely and the outer chunk directly contains the `length` builtin access. This can be applied in several situations, some easier than others, and we will add them in as we go along. Change-Id: Ie44521445fce1199f99b5b17712833faea9bc357 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7959 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2023-03-17 r/6027 fix(tvix/eval): use span of `set` for OpForce in attribute accessVincent Ambo1-2/+2
Emits the span of the `set` that is being accessed in the `force` operation of an attribute access. Looking at traces, it's a lot more useful to get information about *what* is being forced, as in cases like `foo.bar` it can be misleading to have an error highlight `bar`, when the error occured while forcing `foo` to be able to access `bar` in the first place. Change-Id: Id46ff28f20c67cb4971727ac52cc4811795cea2d Reviewed-on: https://cl.tvl.fyi/c/depot/+/8272 Reviewed-by: flokli <flokli@flokli.de> Autosubmit: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-03-17 r/6024 feat(tvix/eval): add generator "name" to NativeError kindVincent Ambo1-1/+4
This produces traces in which we can see what kind of native code was run. Note that these "names" are named after the generator message, so these aren't *really* intended for end-user consumption, but we can give them saner names later. Example: https://gist.github.com/tazjin/82b24e92ace8e821008954867ee05057 This already makes the traces a little easier to parse. Change-Id: Idcd601baf84f492211b732ea0f04b377112e10d0 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8268 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Autosubmit: tazjin <tazjin@tvl.su>
2023-03-17 r/6023 feat(tvix/eval): enrich errors with VM's frame stack informationVincent Ambo1-1/+2
When emitting an error at runtime, the VM will now use the new `NativeError` and `BytecodeError` error kinds (which just wrap inner errors) to create a set of diagnostics to emit. The primary diagnostic is emitted last, with `error` type (so it will be coloured red in terminals), the other ones will be emitted with `note` type, highlighting the causal chain. Example: https://gist.github.com/tazjin/25feba7d211702453c9ebd5f8fd378e4 This is currently quite verbose, and we can cut down on this further, but the purpose of this commit is to surface more information first of all before worrying about the exact display. Change-Id: I058104a178c37031c0db6b4b3e4f4170cf76087d Reviewed-on: https://cl.tvl.fyi/c/depot/+/8266 Autosubmit: tazjin <tazjin@tvl.su> Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2023-03-13 r/5983 fix(tvix/eval): emit warnings from builtins.import againVincent Ambo1-4/+3
Wires up generator logic to emit warnings that already have spans attached again. Change-Id: I9f878cec3b9d4f6f7819e7c71bab7ae70bd3f08b Reviewed-on: https://cl.tvl.fyi/c/depot/+/8224 Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-03-13 r/5970 fix(tvix/eval): correctly thunk deferred formals accessVincent Ambo1-1/+11
Formals can be initialised with deferred default values (see the test cases), in which case they need an extra thunk to have something that can be finalised appropriately when the setup is done. Fixes: b/255 Change-Id: I380e3770be68eaa83ace96d450c7cead32dacc9f Reviewed-on: https://cl.tvl.fyi/c/depot/+/8196 Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-03-13 r/5969 refactor(tvix/eval): box PathBufVincent Ambo1-3/+6
This shaves another 8 bytes off Value. How did that type get so big?! Change-Id: I65e9b59a1636bd57e3cc4aec5fea16887070b832 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8153 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
2023-03-13 r/5968 chore(tvix/eval): remove `From<SmolStr> for NixString` instanceVincent Ambo1-5/+5
No longer needed, and in some cases caused some extra work. Change-Id: I64e8e7292573bdc92a9c7a8e470e33f8c526f311 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8152 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
2023-03-13 r/5964 refactor(tvix/eval): flatten call stack of VM using generatorsVincent Ambo2-76/+95
Warning: This is probably the biggest refactor in tvix-eval history, so far. This replaces all instances of trampolines and recursion during evaluation of the VM loop with generators. A generator is an asynchronous function that can be suspended to yield a message (in our case, vm::generators::GeneratorRequest) and receive a response (vm::generators::GeneratorResponsee). The `genawaiter` crate provides an interpreter for generators that can drive their execution and lets us move control flow between the VM and suspended generators. To do this, massive changes have occured basically everywhere in the code. On a high-level: 1. The VM is now organised around a frame stack. A frame is either a call frame (execution of Tvix bytecode) or a generator frame (a running or suspended generator). The VM has an outer loop that pops a frame off the frame stack, and then enters an inner loop either driving the execution of the bytecode or the execution of a generator. Both types of frames have several branches that can result in the frame re-enqueuing itself, and enqueuing some other work (in the form of a different frame) on top of itself. The VM will eventually resume the frame when everything "above" it has been suspended. In this way, the VM's new frame stack takes over much of the work that was previously achieved by recursion. 2. All methods previously taking a VM have been refactored into async functions that instead emit/receive generator messages for communication with the VM. Notably, this includes *all* builtins. This has had some other effects: - Some test have been removed or commented out, either because they tested code that was mostly already dead (nix_eq) or because they now require generator scaffolding which we do not have in place for tests (yet). - Because generator functions are technically async (though no async IO is involved), we lose the ability to use much of the Rust standard library e.g. in builtins. This has led to many algorithms being unrolled into iterative versions instead of iterator combinations, and things like sorting had to be implemented from scratch. - Many call sites that previously saw a `Result<..., ErrorKind>` bubble up now only see the result value, as the error handling is encapsulated within the generator loop. This reduces number of places inside of builtin implementations where error context can be attached to calls that can fail. Currently what we gain in this tradeoff is significantly more detailed span information (which we still need to bubble up, this commit does not change the error display). We'll need to do some analysis later of how useful the errors turn out to be and potentially introduce some methods for attaching context to a generator frame again. This change is very difficult to do in stages, as it is very much an "all or nothing" change that affects huge parts of the codebase. I've tried to isolate changes that can be isolated into the parent CLs of this one, but this change is still quite difficult to wrap one's mind and I'm available to discuss it and explain things to any reviewer. Fixes: b/238, b/237, b/251 and potentially others. Change-Id: I39244163ff5bbecd169fe7b274df19262b515699 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8104 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Reviewed-by: Adam Joseph <adam@westernsemico.com> Tested-by: BuildkiteCI
2023-03-11 r/5954 feat(tvix/eval): don't warn twice about dead codeFlorian Klink1-1/+3
We currently send two warnings in case of detecting dead code - W008 inside compile_dead_code, and a more detailed warning in all places that invoke compile_dead_code: ``` warning[W007]: useless operation on boolean: this expression is always false --> /nix/store/qz3gjn95gazab4fkb7s8lm6hz17rdzzy-414z9nnj1wy66ymq6vgb693x9xjz6hf2-nixpkgs-src/pkgs/top-level/perl-packages.nix:12079:15 | 12079 | doCheck = false && !stdenv.isDarwin; | ^^^^^^^^^^^^^^^^^^^^^^^^^ warning[W008]: this code will never be executed --> /nix/store/qz3gjn95gazab4fkb7s8lm6hz17rdzzy-414z9nnj1wy66ymq6vgb693x9xjz6hf2-nixpkgs-src/pkgs/top-level/perl-packages.nix:12079:24 | 12079 | doCheck = false && !stdenv.isDarwin; | ^^^^^^^^^^^^^^^^ ``` The place invoking `compile_dead_code` has more context to why the code is unused, so it's error message is much more useful. Stop emitting the less informative warning inside compile_dead_code (W008), and update the comment that we expect the caller to emit a warning. I kept W008 itself still around, in case we end up having places this will get used again. Change-Id: I2c5d84fc0cb4035872cd4b71cc3e9e34e120eb37 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8024 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz>