about summary refs log tree commit diff
path: root/tvix/glue/src/tvix_store_io.rs
AgeCommit message (Collapse)AuthorFilesLines
2024-10-15 r/8811 refactor(tvix/glue/builtins/import): refactorFlorian Klink1-53/+3
This removes all the intermediate helper functions and reorganizes the import code to only do the calculations where/when needed, and hopefully makes things easier to understand as well. Change-Id: I7e4c89c742bf8569b45e303523f7f801da7127ea Reviewed-on: https://cl.tvl.fyi/c/depot/+/12627 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: Jörg Thalheim <joerg@thalheim.io> Reviewed-by: edef <edef@edef.eu>
2024-10-15 r/8808 fix(tvix/glue/import): builtins.storeDir fixesFlorian Klink1-10/+0
This didn't support store paths with a subpath joined to them, while Nix does. Use state.path_exists, which does. This also means we can drop the `store_path_exists` helper, which was only used here. Change-Id: I918ccb270f64acbdc41cb4d2a9c3c5871ce15002 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12618 Tested-by: BuildkiteCI Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com> Reviewed-by: Jörg Thalheim <joerg@thalheim.io> Autosubmit: flokli <flokli@flokli.de>
2024-10-15 r/8807 refactor(nix-compat/store_path): use Path in from_absolute_path_fullFlorian Klink1-12/+4
These are not necessarily strings, and making it paths allows us to stop converting them to lossy strings. Change-Id: I11366c721dc5da1778aafe89092a1966b5a43178 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12617 Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com> Reviewed-by: Jörg Thalheim <joerg@thalheim.io> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-10-11 r/8789 refactor(tvix/glue/register_in_path_info_service): return only PathInfoFlorian Klink1-18/+13
The store path is already contained in the PathInfo, and the ca bits is already passed into the function, so known to the caller - there's no need to duplicate this. We can also avoid having two separate block_on in our import builtin - we already know the content hash before constructing, as we pass it in via ca_hash. There's still some room to unclutter some more of the code around importing - we still do NAR calculation twice in some cases, and some of the code might be share-able from other places producing PathInfo too. Log a TODO for this cleanup. Change-Id: I6a5fc427d15bc9293a396310143c7694dd2996c0 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12592 Reviewed-by: Marijan Petričević <marijan.petricevic94@gmail.com> Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-10-11 r/8787 refactor(tvix/store): use strictly typed PathInfo structMarijan Petričević1-57/+31
This switches the PathInfoService trait from using the proto-derived PathInfo struct to a more restrictive struct, and updates all implementations to use it. It removes a lot of the previous conversion and checks, as invalid states became nonrepresentable, and validations are expressed on the type level. PathInfoService implementations consuming protobuf need to convert and do the verification internally, and can only return the strongly typed variant. The nix_compat::narinfo::NarInfo conversions for the proto PathInfo are removed, we only keep a version showing a NarInfo representation for the strong struct. Converting back to a PathInfo requires the root node now, but is otherwise trivial, so left to the users. Co-Authored-By: Florian Klink <flokli@flokli.de> Change-Id: I6fdfdb44063efebb44a8f0097b6b81a828717e03 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12588 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-10-01 r/8743 feat(tvix/glue): wire up nix refscanningYureka1-7/+47
After this, attempting to build the nixpkgs still fails in the same way, because the references are not yet properly used by the code at `tvix/glue/src/tvix_store_io.rs`. Change-Id: I8a59ef8ef3c9a6f6aa7b05106dd9eef2e9ac0d0f Reviewed-on: https://cl.tvl.fyi/c/depot/+/12532 Reviewed-by: Brian Olsen <me@griff.name> Autosubmit: yuka <yuka@yuka.dev> Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2024-08-20 r/8545 refactor(nix-compat/store_path): make StorePath generic on SFlorian Klink1-17/+13
Similar to how cl/12253 already did this for `Signature`, we apply the same logic to `StorePath`. `StorePathRef<'a>'` is now a `StorePath<&'a str>`, and there's less redundant code for the two different implementation. `.as_ref()` returns a `StorePathRef<'_>`, `.to_owned()` gives a `StorePath<String>` (for now). I briefly thought about only publicly exporting `StorePath<String>` as `StorePath`, but the diff is not too large and this will make it easier to gradually introduce more flexibility in which store paths to accept. Also, remove some silliness in `StorePath::from_absolute_path_full`, which now doesn't allocate anymore. Change-Id: Ife8843857a1a0a3a99177ca997649fd45b8198e6 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12258 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-08-17 r/8509 chore(tvix/glue): drop some explicit allow(clippy::mutable_key_type)Florian Klink1-2/+0
This is covered by clippy.toml these days. Change-Id: I2330af5781844d5f9d975793d770efcea48d371b Reviewed-on: https://cl.tvl.fyi/c/depot/+/12223 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-08-17 r/8507 refactor(tvix/castore): add into_nodes(), implement consuming proto convFlorian Klink1-8/+3
Provide a into_nodes() function on a Directory, which consumes self and returns owned PathComponent and Node. Use it to provide a proper conversion from Directory to the proto variant that doesn't clone. There's no need for the one taking only &Directory, we don't use it anywhere, and once someone needs that they might as well clone Directory before converting it. Update all other users of the `.nodes()` function to use `.into_nodes()` where applicable, and avoid some more cloning there. Change-Id: Id4577b9eb173c012e225337458898d3937112bcb Reviewed-on: https://cl.tvl.fyi/c/depot/+/12218 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-08-17 r/8506 refactor(tvix/castore): add PathComponent type for checked componentsFlorian Klink1-5/+11
This encodes a verified component on the type level. Internally, it contains a bytes::Bytes. The castore Path/PathBuf component() and file_name() methods now return this type, the old ones returning bytes were renamed to component_bytes() and component_file_name() respectively. We can drop the directory_reject_invalid_name test - it's not possible anymore to pass an invalid name to Directories::add. Invalid names in the Directory proto are still being tested to be rejected in the validate_invalid_names tests. Change-Id: Ide4d16415dfd50b7e2d7e0c36d42a3bbeeb9b6c5 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12217 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-08-17 r/8505 refactor(tvix/castore): drop {Directory,File,Symlink}NodeFlorian Klink1-23/+18
Add a `SymlinkTarget` type to represent validated symlink targets. With this, no invalid states are representable, so we can make `Node` be just an enum of all three kind of types, and allow access to these fields directly. Change-Id: I20bdd480c8d5e64a827649f303c97023b7e390f2 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12216 Reviewed-by: benjaminedwardwebb <benjaminedwardwebb@gmail.com> Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-08-17 r/8504 refactor(tvix/castore): remove `name` from NodesFlorian Klink1-40/+63
Nodes only have names if they're contained inside a Directory, or if they're a root node and have something else possibly giving them a name externally. This removes all `name` fields in the three different Nodes, and instead maintains it inside a BTreeMap inside the Directory. It also removes the NamedNode trait (they don't have a get_name()), as well as Node::rename(self, name), and all [Partial]Ord implementations for Node (as they don't have names to use for sorting). The `nodes()`, `directories()`, `files()` iterators inside a `Directory` now return a tuple of Name and Node, as does the RootNodesProvider. The different {Directory,File,Symlink}Node struct constructors got simpler, and the {Directory,File}Node ones became infallible - as there's no more possibility to represent invalid state. The proto structs stayed the same - there's now from_name_and_node and into_name_and_node to convert back and forth between the two `Node` structs. Some further cleanups: The error types for Node validation were renamed. Everything related to names is now in the DirectoryError (not yet happy about the naming) There's some leftover cleanups to do: - There should be a from_(sorted_)iter and into_iter in Directory, so we can construct and deconstruct in one go. That should also enable us to implement conversions from and to the proto representation that moves, rather than clones. - The BuildRequest and PathInfo structs are still proto-based, so we still do a bunch of conversions back and forth there (and have some ugly expect there). There's not much point for error handling here, this will be moved to stricter types in a followup CL. Change-Id: I7369a8e3a426f44419c349077cb4fcab2044ebb6 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12205 Tested-by: BuildkiteCI Reviewed-by: yuka <yuka@yuka.dev> Autosubmit: flokli <flokli@flokli.de> Reviewed-by: benjaminedwardwebb <benjaminedwardwebb@gmail.com> Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-08-13 r/8486 refactor(tvix/castore): move *Node and Directory to crate rootFlorian Klink1-1/+2
*Node and Directory are types of the tvix-castore model, not the tvix DirectoryService model. A DirectoryService only happens to send Directories. Move types into individual files in a nodes/ subdirectory, as it's gotten too cluttered in a single file, and (re-)export all types from the crate root. This has the effect that we now cannot poke at private fields directly from other files inside `crate::directoryservice` (as it's not all in the same file anymore), but that's a good thing, it now forces us to go through the proper accessors. For the same reasons, we currently also need to introduce the `rename` functions on each *Node directly. A followup is gonna move the names out of the individual enum kinds, so we can better represent "unnamed nodes". Change-Id: Icdb34dcfe454c41c94f2396e8e99973d27db8418 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12199 Reviewed-by: yuka <yuka@yuka.dev> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-08-13 r/8484 refactor(tvix/castore): use Directory struct separate from proto oneYureka1-42/+25
This uses our own data type to deal with Directories in the castore model. It makes some undesired states unrepresentable, removing the need for conversions and checking in various places: - In the protobuf, blake3 digests could have a wrong length, as proto doesn't know fixed-size fields. We now use `B3Digest`, which makes cloning cheaper, and removes the need to do size-checking everywhere. - In the protobuf, we had three different lists for `files`, `symlinks` and `directories`. This was mostly a protobuf size optimization, but made interacting with them a bit awkward. This has now been replaced with a list of enums, and convenience iterators to get various nodes, and add new ones. Change-Id: I7b92691bb06d77ff3f58a5ccea94a22c16f84f04 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12057 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2024-08-12 r/8482 feat(tvix/cli): Add derivation file dumping functionalityIlan Joselevich1-1/+1
Provides a derivation file dumping functionality for tvix-cli that can be used when passing the --drv-dumpdir CLI arg to tvix-cli. This will dump all the known derivation files into the specified directory, making it easier to debug derivation divergences between Tvix generated drvs and the drvs generated by Nix. Supersedes: https://cl.tvl.fyi/c/depot/+/11265 Change-Id: I0e10b26eba22032b84ac543af0d4150ad87aed3e Reviewed-on: https://cl.tvl.fyi/c/depot/+/12192 Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com> Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2024-07-22 r/8399 refactor(tvix): move service addrs into shared clap structYureka1-2/+6
Change-Id: I7cab29ecfa1823c2103b4c47b7d784bc31459d55 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12008 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de> Autosubmit: yuka <yuka@yuka.dev>
2024-07-20 r/8380 refactor(tvix/store): use composition in tvix_store crateYureka1-1/+1
Change-Id: Ie6290b296baba2b987f1a61c9bb4c78549ac11f1 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11983 Reviewed-by: flokli <flokli@flokli.de> Autosubmit: yuka <yuka@yuka.dev> Tested-by: BuildkiteCI
2024-07-06 r/8351 refactor(tvix/eval): Builderize EvaluationAspen Smith1-4/+6
Make constructing of a new Evaluation use the builder pattern rather than setting public mutable fields. This is currently a pure refactor (no functionality has changed) but has a few advantages: - We've encapsulated the internals of the fields in Evaluation, meaning we can change them without too much breakage of clients - We have type safety that prevents us from ever changing the fields of an Evaluation after it's built (which matters more in a world where we reuse Evaluations). More importantly, this paves the road for doing different things with the construction of an Evaluation - notably, sharing certain things like the GlobalsMap across subsequent evaluations in eg the REPL. Fixes: b/262 Change-Id: I4a27116faac14cdd144fc7c992d14ae095a1aca4 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11956 Tested-by: BuildkiteCI Autosubmit: aspen <root@gws.fyi> Reviewed-by: flokli <flokli@flokli.de>
2024-06-26 r/8306 refactor(tvix/glue): take &CAHash, not CAHashFlorian Klink1-3/+3
We use a bit less cloning that way. Change-Id: I28bf99577e4a481e35fbf99d0724adab5502a1bd Reviewed-on: https://cl.tvl.fyi/c/depot/+/11874 Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
2024-06-26 r/8305 feat(tvix/glue): handle regular file at builtins.path importFlorian Klink1-22/+0
If builtins.path is passed a regular file, no filtering is applied. We use the just-introduced file_type function in the EvalIO trait for that. This means, we don't need to pass through filtered_ingest, and can assemble the FileNode directly in that specific match case. This also means, we can explicitly calculate the sha256 flat digest, and avoid having to pipe through the file contents again (via blob_to_sha256_hash) to construct the sha256 digest. Change-Id: I500b19dd9e4b7cc897d88b44547e7851559e5a4e Reviewed-on: https://cl.tvl.fyi/c/depot/+/11872 Tested-by: BuildkiteCI Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com> Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-06-26 r/8304 feat(tvix/eval): add file_type methodFlorian Klink1-0/+22
This allows peeking of the type at a given path. It's necessary, as an open() might not fail until you try to read() from it, and generally, stat'ing can be faster in some cases. Change-Id: Ib002da3194a3546ca286de49aac8d1022ec5560f Reviewed-on: https://cl.tvl.fyi/c/depot/+/11871 Tested-by: BuildkiteCI Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com> Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-06-13 r/8265 fix(tvix/glue/tvix_store_io): also populate input sourcesFlorian Klink1-2/+23
These also need to be present in the input nodes of the BuildRequest. Change-Id: Ie9b957805e42f766002581adc6182a6543c5333b Reviewed-on: https://cl.tvl.fyi/c/depot/+/11802 Reviewed-by: Brian Olsen <me@griff.name> Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de>
2024-06-13 r/8263 fix(tvix/glue/tvix_store_io): disable concurrent fetches for nowFlorian Klink1-1/+3
We need some shared queue, preventing the same fetches/builds from getting triggered multiple times unnecessarily. Change-Id: I7c4a3c66db558f5cccd66865b170242b758e3e02 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11800 Reviewed-by: aspen <root@gws.fyi> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-06-13 r/8260 fix(tvix/glue/tvix_store_io): distinguish waiting and buildingFlorian Klink1-1/+3
We immediately reported "Building", even though then populated necessary inputs, which looked a bit odd. Make it clear we're still waiting, and update the spinner message once we have all inputs we were waiting for. In the future, we might want to have separate spans for this, so the timer gets reset, but that's something for later. Change-Id: Ic22c9a906d0e7e7179c5ee328162401261efc224 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11799 Reviewed-by: aspen <root@gws.fyi> Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de>
2024-06-13 r/8259 feat(tvix/glue): report progress on all fetches, use progress barsFlorian Klink1-4/+3
This should also report progress on fetches which we couldn't delay until actually having to IO into them, like `builtins.fetchurl` calls without a upfront-provided hash. While at it, upgrade the progress spinners to progress bars, which increment if we know the size of the fetch. Change-Id: Ic3f332286d8bc2177f3d994ba25b165728d4b702 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11797 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: aspen <root@gws.fyi>
2024-06-13 r/8258 fix(tvix/glue/tvix_store_io): use same case for progress messagesFlorian Klink1-1/+1
"Fetching" was uppercase, "building" was lowercase. Let's make this consistent. Change-Id: I11c16f1a7d2057ada4d057e553a4ceaa59597f26 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11796 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com> Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2024-06-12 r/8255 feat(tvix/glue/tvix_store_io): show progress infoFlorian Klink1-5/+8
In `store_path_to_node`, in case we need to build or fetch something, render a progress bar, using the spinner for now. We can upgrade this to a progress *bar* later. Change-Id: I4a7cf5ef8f639076f176af9b39d276be3f37c8ff Reviewed-on: https://cl.tvl.fyi/c/depot/+/11793 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-06-12 r/8254 feat(tvix/glue): support builtin:fetchurlFlorian Klink1-5/+5
nixpkgs calls <nix/fetchurl.nix> during nixpkgs bootstrap. This produces a fake derivation with system = builtin and builder = builtin:fetchurl, and needs to download files from the internet. At the end of the Derivation construction, if we have such a derivation, also synthesize a `Fetch` struct, which we add to the known fetch paths. This will then cause these fetches to be picked up like all other fetches in TvixStoreIO. Change-Id: I72cbca4f85da106b25eda97693a6a6e59911cd57 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10975 Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-06-05 r/8217 feat(tvix/glue): Implement builtins.storePathAspen Smith1-1/+1
This one's relatively simple - we just check if the store path exists, and if it does we make a new contextful string containing the store path as its only context element. Automatic testing seems tricky for this (I think?) so I tested it manually: tvix-repl> builtins.storePath /nix/store/yn46i4xx5alh7gs6fpkxk430i34rp2q9-hello-2.12.1 => "/nix/store/yn46i4xx5alh7gs6fpkxk430i34rp2q9-hello-2.12.1" :: string Change-Id: I8a0d9726e4102ab872c53c2419679c2c855a5a18 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11696 Tested-by: BuildkiteCI Autosubmit: aspen <root@gws.fyi> Reviewed-by: flokli <flokli@flokli.de>
2024-05-11 r/8103 refactor(tvix/store): drop calculate_nar from PathInfoServiceFlorian Klink1-24/+35
This shouldn't be part of the PathInfoService trait. Pretty much none of the PathInfoServices do implement it, and requiring them to implement it means they also cannot make use of this calculation already being done by other PathInfoServices. Move it out into its own NarCalculationService trait, defined somewhere at tvix_store::nar, and have everyone who wants to trigger nar calculation use nar_calculation_service directly, which now is an additional field in TvixStoreIO for example. It being moved outside the PathInfoService trait doesn't prohibit specific implementations to implement it (like the GRPC client for the `PathInfoService` does. This is currently wired together in a bit of a hacky fashion - as of now, everything uses the naive implementation that traverses blob and directoryservice, rather than composing it properly. I want to leave that up to a later CL, dealing with other parts of store composition too. Change-Id: I18d07ea4301d4a07651b8218bc5fe95e4e307208 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11619 Reviewed-by: Connor Brewster <cbrewster@hey.com> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-05-06 r/8082 refactor(tvix): remove usage of async-recursionConnor Brewster1-2/+0
Rust 1.77 supports async recursion as long as there is some form of indirection (ie. `Box::pin`). This removes the need to use the async-recursion crate. Change-Id: Ic9613ab7f32016f0103032a861edff92e2fb8b41 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11596 Reviewed-by: flokli <flokli@flokli.de> Autosubmit: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-05-02 r/8066 feat(tvix/castore/directory/traverse): use castore PathsFlorian Klink1-0/+3
This switches from using std::path::Path to using castore paths. We can drop some error handling in descend_to, as absolute (or redundant) paths are not representable. We however now need to convert from a std::path::Path to our representation, and decide to accept .. canonicalization, as paths in EvalIO might contain this. Dealing .. to hop into another store path, if we encounter this, should be dealt with in a previous step. Change-Id: I5e94693808420c5d56587c68731252b54755bf93 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11575 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-04-23 r/7998 fix(tvix/glue/tvix_store_io): remove early returnFlorian Klink1-157/+166
Doing the fetch comes up with the root node, but we still need to descend from there to the desired subpath. Move things around to ensure the fetch case also only sets root_node. This logic should probably be moved into smaller, easier to consume functions. Change-Id: I6ab9317df794f53d2504029bbc77859e89fef1ed Reviewed-on: https://cl.tvl.fyi/c/depot/+/11507 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-23 r/7994 feat(tvix/glue/store_io): have KnownPaths track fetches tooFlorian Klink1-3/+29
Have fetcher builtins call queue_fetch() whenever they don't need to fetch something immediately, and teach TvixStoreIO::store_path_to_node on how to look up (and call ingest_and persist on our Fetcher). Change-Id: Id4bd9d639fac9e4bee20c0b1c584148740b15c2f Reviewed-on: https://cl.tvl.fyi/c/depot/+/11501 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de>
2024-04-23 r/7993 refactor(tvix/glue): move Fetch[er] into its own types, fetch lazilyFlorian Klink1-138/+14
We actually want to delay fetching until we actually need the file. A simple evaluation asking for `.outPath` or `.drvPath` should work even in a pure offline environment. Before this CL, the fetching logic was quite distributed between tvix_store_io, and builtins/fetchers.rs. Rather than having various functions and conversions between structs, describe a Fetch as an enum type, with the fields describing the fetch. Define a store_path() function on top of `Fetch` which can be used to ask for the calculated store path (if the digest has been provided upfront). Have a `Fetcher` struct, and give it a `fetch_and_persist` function, taking a `Fetch` as well as a desired name, and have it deal with all the logic of persisting the PathInfos. It also returns a StorePathRef, similar to the `.store_path()` method on a `Fetch` struct. In a followup CL, we can extend KnownPaths to track fetches AND derivations, and then use `Fetcher` when we need to do IO into that store path. Change-Id: Ib39a96baeb661750a8706b461f8ba4abb342e777 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11500 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-04-20 r/7987 refactor(tvix/castore): ingest filesystem entries in parallelFlorian Klink1-23/+3
Rather than carrying around an Future in the IngestionEntry::Regular, simply carry the plain B3Digest. Code reading through a non-seekable data stream has no choice but to read and upload blobs immediately, and code seeking through something seekable (like a filesystem) probably knows better what concurrency to pick when ingesting, rather than the consuming side. (Our only) one of these seekable source implementations is now doing exactly that. We produce a stream of futures, and then use [StreamExt::buffered] to process more than one, concurrently. We still keep the same order, to avoid shuffling things and violating the stream order. This also cleans up walk_path_for_ingestion in castore/import, as well as ingest_dir_entries in glue/tvix_store_io. Change-Id: I5eb70f3e1e372c74bcbfcf6b6e2653eba36e151d Reviewed-on: https://cl.tvl.fyi/c/depot/+/11491 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-04-20 r/7986 fix(tvix): fix outdated comment and error in TvixStoreIO::openConnor Brewster1-3/+3
This function was originally called `read_to_string` but was changed to `open` to make it so that file contents aren't always held in memory. A comment and error message were not updated to reflect the new name of this method. Change-Id: I3d86e2f6d7006c2e1513121fc3c62efcb7e7b9bb Reviewed-on: https://cl.tvl.fyi/c/depot/+/11495 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-04-20 r/7983 feat(tvix/eval): Implement builtins.fetchTarballAspen Smith1-22/+71
Implement a first pass at the fetchTarball builtin. This uses much of the same machinery as fetchUrl, but has the extra complexity that tarballs have to be extracted and imported as store paths (into the directory- and blob-services) before hashing. That's reasonably involved due to the structure of those two services. This is (unfortunately) not easy to test in an automated way, but I've tested it manually for now and it seems to work: tvix-repl> (import ../. {}).third_party.nixpkgs.hello.outPath => "/nix/store/dbghhbq1x39yxgkv3vkgfwbxrmw9nfzi-hello-2.12.1" :: string Co-authored-by: Connor Brewster <cbrewster@hey.com> Change-Id: I57afc6b91bad617a608a35bb357861e782a864c8 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11020 Autosubmit: aspen <root@gws.fyi> Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-04-20 r/7981 refactor(tvix/castore/import): make module, split off fs and errorFlorian Klink1-1/+1
Move error types and filesystem-specific functions to a separate file, and keep the fs:: namespace in public exports. Change-Id: I5e9e83ad78d9aea38553fafc293d3e4f8c31a8c1 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11486 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com> Autosubmit: flokli <flokli@flokli.de>
2024-04-19 r/7979 refactor(tvix/castore): generalize store ingestion streamsConnor Brewster1-11/+12
Previously the store ingestion code was coupled to `walkdir::DirEntry`s produced by the `walkdir` crate which made it impossible to reuse ingesting from other sources like tarballs or NARs. This introduces a `IngestionEntry` which carries enough information for store ingestion and a future for computing the Blake3 digest of files. This allows the producer to perform file uploads in a way that makes sense for the source, ie. the filesystem upload could concurrently upload multiple files at the same time, while the NAR ingestor will need to ingest the entire blob before yielding the next blob in the stream. In the future we can buffer small blobs and upload them concurrently, but the full blob still needs to be read from the NAR before advancing. Change-Id: I6d144063e2ba5b05e765bac1f27d41b3c8e7b283 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11462 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-04-09 r/7882 fix(tvix): Avoid buffering file into memory in builtins.hashFileConnor Brewster1-24/+21
Right now `builtins.hashFile` always reads the entire file into memory before hashing, which is not ideal for large files. This replaces `read_to_string` with `open_file` which allows calculating the hash of the file without buffering it entirely into memory. Other callers can continue to buffer into memory if they choose, but they still use the `open_file` VM request and then call `read_to_string` or `read_to_end` on the `std::io::Reader`. Fixes b/380 Change-Id: Ifa1c8324bcee8f751604b0b449feab875c632fda Reviewed-on: https://cl.tvl.fyi/c/depot/+/11236 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-04-01 r/7840 feat(tvix/eval): implement `builtins.path`Ryan Lahfa1-0/+21
Now, it supports almost everything except `recursive = false;`, i.e. `flat`-ingestion because we have no knob exposed in the tvix store import side to do it. This has been tested to work. Change-Id: I2e9da10ceccdfbf45b43c532077ed45d6306aa98 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10597 Tested-by: BuildkiteCI Autosubmit: raitobezarius <tvl@lahfa.xyz> Reviewed-by: flokli <flokli@flokli.de>
2024-04-01 r/7839 refactor(tvix/store): generalize `PathInfo` constructorsRyan Lahfa1-10/+18
Instead of enforcing NAR SHA256 all the time, we generalize the `PathInfo` constructor to take a `CAHash` argument which can drive whether we are having a flat, NAR or text scheme. With this, it is now possible to implement flat schemes in our evaluation builtins, e.g. `builtins.path`. Change-Id: I15bfee0ef4f0f428bfbd2f30c57c012cdcf6a976 Signed-off-by: Ryan Lahfa <tvl@lahfa.xyz> Reviewed-on: https://cl.tvl.fyi/c/depot/+/11286 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-03-28 r/7800 refactor(tvix/glue): drop ingest_entries_syncFlorian Klink1-12/+11
Make this function async, and do the block_on on the (single) callsite. Change-Id: Ib8b0b54ab5370fe02ef95f38a45d8866868a9d60 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11285 Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-03-28 r/7799 refactor(tvix/glue): drop register_node_in_path_info_service_syncFlorian Klink1-12/+0
Replace the (single) callsite with some code interacting with the tokio runtime to block on the async version. Change-Id: I3976496ae77b2bb8734603f303655834265e3f0a Reviewed-on: https://cl.tvl.fyi/c/depot/+/11284 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-28 r/7798 feat(tvix/glue/tvix_store_io): drop store_path_to_node_syncFlorian Klink1-12/+10
Let's get rid of these sync helpers, they make this less understandable. Change-Id: I3c7294647849db2747762722247c65e4e2947757 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11283 Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-03-11 r/7678 feat(tvix/glue): Implement builtins.fetchurlAspen Smith1-4/+103
Implement the fetchurl builtin, and lay the groundwork for implementing the fetchTarball builtin (which works very similarly, and is implemented using almost the same code in C++ nix). An overview of how this works: 1. First, we check if the store path that *would* result from the download already exists in the store - if it does, we just return that 2. If we need to download the URL, TvixStoreIO has an `http_client: reqwest::Client` field now which we use to make the request 3. As we're downloading the blob, we hash the data incrementally into a SHA256 hasher 4. We compare the hash against the expected hash (if any) and bail out if it doesn't match 5. Finally, we put the blob in the store and return the store path Since the logic is very similar, this commit also implements a *chunk* of `fetchTarball` (though the actual implementation will likely include a refactor to some of the code reuse here). The main thing that's missing here is caching of downloaded blobs when fetchurl is called without a hash - I've opened b/381 to track the TODO there. Adding the `SSL_CERT_FILE` here is necessary to teach reqwest how to load it during tests - see 1c16dee20 (feat(tvix/store): use reqwests' rustls-native-roots feature, 2024-03-03) for more info. Change-Id: I83c4abbc7c0c3bfe92461917e23d6d3430fbf137 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11017 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de> Autosubmit: aspen <root@gws.fyi>
2024-02-21 r/7587 fix(tvix/eval): allow reading non-UTF8 filesFlorian Klink1-5/+5
With our values using bstr now, we're not restricted to only reading files that contain valid UTF-8. Update our `read_to_string` function to `read_to_end` (named like `std::io::Read::read_to_end`), and have it return a Vec<u8>. Change-Id: I87f0291dc855a132689576559c891d66c30ddf2b Reviewed-on: https://cl.tvl.fyi/c/depot/+/11003 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Pádraic Ó Mhuiris <patrick.morris.310@gmail.com> Reviewed-by: flokli <flokli@flokli.de>
2024-02-21 r/7585 feat(tvix/nix-compat): Use `StorePath` in `Output`Peter Kolloch1-11/+6
https: //b.tvl.fyi/issues/264 Change-Id: Icb09be9643245cc68d09f01d7723af2d44d6bd1a Reviewed-on: https://cl.tvl.fyi/c/depot/+/11001 Autosubmit: Peter Kolloch <info@eigenvalue.net> Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-02-21 r/7583 feat(tvix/nix-compat): input_derivations with StorePathsPeter Kolloch1-7/+1
...in `Derivation`. This is more type-safe and should consume less memory. This also removes some allocations in the potentially hot path of output hash calculation. https: //b.tvl.fyi/issues/264 Change-Id: I6ad7d3cb868dc9f750894d449a6065608ef06e8c Reviewed-on: https://cl.tvl.fyi/c/depot/+/10957 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de> Autosubmit: Peter Kolloch <info@eigenvalue.net> Reviewed-by: Peter Kolloch <info@eigenvalue.net>