path: root/tvix/cli/src/known_paths.rs
Age | Commit message | Author | Files | Lines
2023-02-02 | r/5827 fix(tvix/cli): keep tracking full paths in known_paths | Vincent Ambo | 1 | -22/+56

We need to distinguish explicitly between the paths used for the scanner, and the paths that populate the derivation inputs. The full paths must be accessible from the result of the refscanner to populate drv fields correctly.

This was previously hidden by debug changes that masked actual IO operations with no-ops.

Change-Id: I037af6e6bbe2b573034d695f8779bee1b56bc125
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8022
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
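The split between scanner paths and full paths described above could be sketched roughly as follows. This is only an illustration, not the code from the commit: `FullPathIndex`, `scanner_keys` and `resolve` are hypothetical names, and the sketch assumes the conventional `/nix/store/` prefix followed by a 32-character hash.

```rust
use std::collections::HashMap;

/// Length of "/nix/store/" (11 bytes) plus a 32-character hash part.
/// Assumption: the default store directory is in use.
const PREFIX_LEN: usize = 11 + 32;

/// Hypothetical index that keeps the full path next to its scanner key.
#[derive(Default)]
pub struct FullPathIndex {
    /// Truncated path (up to and including the hash) -> full store path.
    by_prefix: HashMap<String, String>,
}

impl FullPathIndex {
    /// Records a full store path. Returns `false` if the path is too
    /// short to contain a complete hash part.
    pub fn insert(&mut self, full_path: &str) -> bool {
        match full_path.get(..PREFIX_LEN) {
            Some(prefix) => {
                self.by_prefix
                    .insert(prefix.to_string(), full_path.to_string());
                true
            }
            None => false,
        }
    }

    /// The equal-length strings that would be handed to the scanner.
    pub fn scanner_keys(&self) -> Vec<&str> {
        self.by_prefix.keys().map(String::as_str).collect()
    }

    /// Maps a scanner hit back to the full path needed for drv fields.
    pub fn resolve(&self, key: &str) -> Option<&str> {
        self.by_prefix.get(key).map(String::as_str)
    }
}
```

The scanner only ever sees the truncated keys, while whatever populates the derivation inputs calls `resolve` on each hit to recover the full path.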
2023-02-02 | r/5825 refactor(tvix/cli): use Wu-Manber string scanning for drv references | Vincent Ambo | 1 | -2/+10

Switch out the string-scanning algorithm used in the reference scanner.

The construction of aho-corasick automata previously made up the vast majority of runtime when evaluating nixpkgs. While the actual scanning with a constructed automaton is relatively fast, we almost never scan for the same set of strings twice, so the construction cost is not worth it.

An algorithm that better matches our needs is the Wu-Manber multiple string match algorithm, which works efficiently on *long* and *random* strings of the *same length*, which describes store paths (up to their hash component).

This switches the refscanner crate to a Rust implementation[0][1] of this algorithm.

This has several implications:

1. This crate does not provide a way to scan streams. I'm not sure if this is an inherent problem with the algorithm (probably not, but it would need buffering). Either way, related functions and tests (which were actually unused) have been removed.

2. All strings need to be of the same length. For this reason, we truncate the known paths after their hash part (they are still unique, of course).

3. Passing an empty set of matches, or a match that is shorter than the length of a store path, causes the crate to panic. We safeguard against this by completely skipping the refscanning if there are no known paths (i.e. when evaluating the first derivation of an eval), and by bailing out of scanning a string that is shorter than a store path.

On the upside, this reduces overall runtime to less than 1/5 of what it was before when evaluating `pkgs.stdenv.drvPath`.

[0]: Frankly, it's a random, research-grade MIT-licensed crate that I found on GitHub: https://github.com/jneem/wu-manber
[1]: We probably want to rewrite or at least fork the above crate, and add things like a three-byte wide scanner. Evaluating large portions of nixpkgs can easily lead to more than 65k derivations being scanned for.

Change-Id: I08926778e1e5d5a87fc9ac26e0437aed8bbd9eb0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8017
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
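The truncation and the two safeguards from points 2 and 3 above can be illustrated with a small standard-library sketch. Note that this does not use the wu-manber crate and does not implement Wu-Manber itself: the matching step here is a naive fixed-width window lookup in a `HashSet`, standing in for the real scanner. The function names and the 43-byte prefix length (`/nix/store/` plus a 32-character hash) are assumptions made for the example.

```rust
use std::collections::HashSet;

/// "/nix/store/" (11 bytes) plus a 32-character hash part.
/// Assumption: the default store directory is in use.
const STORE_PREFIX_LEN: usize = 11 + 32;

/// Truncates a store path after its hash part, so that all candidates
/// have the same length. Returns `None` for paths that are too short.
pub fn truncate_to_hash(path: &str) -> Option<&str> {
    path.get(..STORE_PREFIX_LEN)
}

/// Hypothetical scanner entry point. `known` holds truncated paths.
pub fn scan_references<'a>(known: &'a HashSet<String>, haystack: &str) -> Vec<&'a str> {
    // Safeguard 1: skip scanning entirely if there are no known paths.
    if known.is_empty() {
        return Vec::new();
    }
    // Safeguard 2: a string shorter than a store path cannot contain one.
    if haystack.len() < STORE_PREFIX_LEN {
        return Vec::new();
    }

    // Naive stand-in for the multi-pattern matcher: check every
    // fixed-width window of the haystack against the candidate set.
    let mut found: Vec<&'a str> = Vec::new();
    for window in haystack.as_bytes().windows(STORE_PREFIX_LEN) {
        if let Ok(candidate) = std::str::from_utf8(window) {
            if let Some(hit) = known.get(candidate) {
                if !found.contains(&hit.as_str()) {
                    found.push(hit.as_str());
                }
            }
        }
    }
    found
}
```

Because every candidate has the same length, a single window width is enough. That equal-length property is also what the commit message gives as the reason Wu-Manber suits store paths.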
2023-01-17 | r/5677 refactor(tvix/cli): consistently assert type unity in known_paths | Vincent Ambo | 1 | -46/+41

No situation should be allowed in which a path is inserted into known_paths with different types twice, which we previously enforced only for some path types.

Change-Id: I8cb47d4b29c0aab3c58694f8b590e131deba7043
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7843
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
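One way to enforce this kind of invariant uniformly is to route every insertion through a single helper that asserts on conflict. The following is a minimal sketch under that assumption. The `PathKind` variants, the `KnownPaths` layout and the method names are illustrative and are not taken from the actual module.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical classification of a known path.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PathKind {
    Plain,
    Derivation,
    Output,
}

/// Hypothetical tracker keyed by store path.
#[derive(Default)]
pub struct KnownPaths {
    paths: HashMap<String, PathKind>,
}

impl KnownPaths {
    /// Single insertion point: re-inserting a path with the same kind
    /// is a no-op, re-inserting it with a different kind panics.
    pub fn insert(&mut self, path: String, kind: PathKind) {
        match self.paths.entry(path) {
            Entry::Vacant(slot) => {
                slot.insert(kind);
            }
            Entry::Occupied(slot) => {
                assert_eq!(
                    *slot.get(),
                    kind,
                    "path '{}' inserted twice with different types",
                    slot.key()
                );
            }
        }
    }

    /// Looks up the kind recorded for a path, if any.
    pub fn kind_of(&self, path: &str) -> Option<&PathKind> {
        self.paths.get(path)
    }
}
```

Funnelling all path kinds through one `insert` means the check cannot be forgotten for a particular kind, which is the inconsistency the commit message describes.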
2023-01-17 | r/5673 feat(tvix/cli): add replacement strings tracking to KnownPaths | Vincent Ambo | 1 | -0/+29

Replacement strings are some weird internal feature of Nix that is required for calculating derivation hashes. We need to track these like other paths, as they need to be re-used on builds with dependencies on values from previous builds.

Change-Id: Ie955b3fb5ae3685cfadfbe4d06ea6b5e219590c7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7828
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2023-01-17 | r/5672 feat(tvix/cli): track known plain paths in NixCompatIO | Vincent Ambo | 1 | -0/+1

When adding things to a C++ Nix store, ensure that the path is tracked in the tracker.

Since the mechanism for propagating the tracker instance isn't finalised yet, I've opted to take an Rc<RefCell> parameter for it. How exactly that ends up there is going to become clear in the next commits, but for now it's just instantiated in main with Default::default.

Change-Id: I90f0b44f2d4f292dedc98ff1aa39041d279b61fd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7833
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
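The `Rc<RefCell<…>>` hand-off described above might look roughly like the sketch below. Everything here is a stand-in: `PathTracker`, `StoreIo` and `import_path` are hypothetical names, and the "import" merely fabricates a placeholder store path instead of talking to a real Nix store.

```rust
use std::cell::RefCell;
use std::collections::HashSet;
use std::rc::Rc;

/// Hypothetical tracker for plain (non-derivation) store paths.
#[derive(Default)]
pub struct PathTracker {
    plain: HashSet<String>,
}

impl PathTracker {
    /// Marks a store path as a known plain path.
    pub fn plain(&mut self, path: &str) {
        self.plain.insert(path.to_string());
    }

    /// Reports whether a path has been recorded.
    pub fn contains(&self, path: &str) -> bool {
        self.plain.contains(path)
    }
}

/// Hypothetical IO layer that shares the tracker with its creator.
pub struct StoreIo {
    known_paths: Rc<RefCell<PathTracker>>,
}

impl StoreIo {
    pub fn new(known_paths: Rc<RefCell<PathTracker>>) -> Self {
        StoreIo { known_paths }
    }

    /// Stand-in for adding a file to the store: builds a placeholder
    /// store path (all-zero hash) and records it in the shared tracker.
    pub fn import_path(&self, source: &str) -> String {
        let name = source.rsplit('/').next().unwrap_or(source);
        let store_path = format!("/nix/store/{}-{}", "0".repeat(32), name);
        self.known_paths.borrow_mut().plain(&store_path);
        store_path
    }
}
```

A caller would create the tracker with `Default::default()`, pass `Rc::clone(&tracker)` to `StoreIo::new`, and keep its own handle to read the recorded paths later. The `RefCell` borrow is held only for the duration of the `plain` call, so other holders of the `Rc` can borrow the tracker afterwards.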
2023-01-17 | r/5671 refactor(tvix/cli): reference scanner owns all the strings | Vincent Ambo | 1 | -2/+2

This gets very complex very quickly otherwise, as all the construction paths for a reference scanner and all the access patterns for the KnownPaths structure are not yet fully understood.

Change-Id: Ibadf1f18b476695f3c286fc6896ae557760edf63
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7827
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2023-01-17 | r/5669 feat(tvix/cli): add known_paths module | Vincent Ambo | 1 | -0/+114

This module implements types used to track the set of known paths in the context of an evaluation. These are used to determine the build references of a derivation.

Change-Id: I81e15ae33632784e699128916485751613b231a3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7816
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>