path: root/tvix/castore/src/import.rs
Age | Commit message | Author | Files | Lines
2024-04-15 | r/7929 refactor(tvix/castore): relax trait bounds on BlobService | Florian Klink | 1 | -2/+2
We don't need to clone BlobService anymore. Change-Id: I2f3b9a595f604ec0f1e081f6e90cd8b67cbb8961 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11419 Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de>
2024-04-15 | r/7927 refactor(tvix/castore/import): restructure directory uploader a bit | Florian Klink | 1 | -17/+51
Have an Option<Box<dyn DirectoryPutter>>, which is lazily initialized whenever we first want to upload a directory. Have the loop explicitly break when it encounters the root_node, and deal with the flushing after the loop. Deal with the FUTUREWORK (assertion for root directory digest matching what the DirectoryPutter returns). Change-Id: Iefc4904d8b8387e868fb752d40e3e4e4218c7407 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11417 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com>
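The lazy initialization described in this commit can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from import.rs: the `DirectoryPutter` trait here is a simplified, synchronous stand-in, and `CountingPutter`, `Uploader`, `upload_directory` and `finish` are hypothetical names.

```rust
/// Hypothetical, synchronous stand-in for the real `DirectoryPutter` trait.
trait DirectoryPutter {
    fn put(&mut self, name: &str);
    /// Flushes and reports how many directories were uploaded.
    fn close(&mut self) -> usize;
}

/// Hypothetical putter that only counts uploads.
#[derive(Default)]
struct CountingPutter {
    uploaded: usize,
}

impl DirectoryPutter for CountingPutter {
    fn put(&mut self, _name: &str) {
        self.uploaded += 1;
    }

    fn close(&mut self) -> usize {
        self.uploaded
    }
}

/// Holds no putter until the first directory needs to be uploaded.
#[derive(Default)]
struct Uploader {
    putter: Option<Box<dyn DirectoryPutter>>,
}

impl Uploader {
    /// Creates the putter on first use, then reuses it.
    fn upload_directory(&mut self, name: &str) {
        self.putter
            .get_or_insert_with(|| Box::new(CountingPutter::default()) as Box<dyn DirectoryPutter>)
            .put(name);
    }

    /// Flushes after the loop; returns `None` if no directory was ever uploaded.
    fn finish(self) -> Option<usize> {
        self.putter.map(|mut putter| putter.close())
    }
}
```

One consequence of this shape is that an ingestion consisting only of a single file or symlink never has to create a putter at all.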
2024-04-15 | r/7926 refactor(tvix/castore/import): put invariant checker into a .inspect() | Florian Klink | 1 | -18/+18
Separate this a bit more strongly from the main application flow. Change-Id: I2e9bd3ec47cc6e37256ba6afc6e0586ddc9a051f Reviewed-on: https://cl.tvl.fyi/c/depot/+/11416 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: Brian Olsen <me@griff.name> Reviewed-by: Connor Brewster <cbrewster@hey.com>
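As an illustration of keeping an invariant check out of the main flow, here is a small sketch using `Iterator::inspect` from the standard library. The commit itself concerns a stream of entries, so this is an analogue rather than the actual code; the invariant shown (depths never increase in bottom-up order) and the name `names_bottom_up` are assumptions made for the example.

```rust
/// Turns `(depth, name)` pairs into names, checking in a side channel
/// that depths never increase (hypothetical invariant for this sketch).
fn names_bottom_up(entries: Vec<(usize, &str)>) -> Vec<String> {
    let mut last_depth = usize::MAX;
    entries
        .into_iter()
        // The check observes each entry without changing it.
        .inspect(|(depth, _)| {
            assert!(*depth <= last_depth, "entries must arrive deepest first");
            last_depth = *depth;
        })
        // The main flow only deals with the actual transformation.
        .map(|(_, name)| name.to_string())
        .collect()
}
```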
2024-04-15 | r/7925 refactor(tvix/*/import): rename direntry_stream, entries_per_depths | Florian Klink | 1 | -4/+5
Align these names and comments with the two users, to make it more obvious we're doing the same thing here, just use a different method to come up with entries_per_depths. Change-Id: I42058e397588b6b57a6299e87183bef27588b228 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11415 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 | r/7924 refactor(tvix/castore/import): inline process_entry | Florian Klink | 1 | -118/+65
This did very little, and especially the part of relying on the outside caller to pass in a Directory if the type is a directory required having per-entry-type specific logic anyways. It's cleaner to just inline it. Change-Id: I997a8513ee91c67b0a2443cb5cd9e8700f69211e Reviewed-on: https://cl.tvl.fyi/c/depot/+/11414 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-15 | r/7923 refactor(tvix/castore/import): move process_entry to the end of the file | Florian Klink | 1 | -95/+95
This makes it easier to understand the code. Change-Id: I0a9047433000551a6ba1f50a8c5c93527bc86216 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11413 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-04-13 | r/7910 feat(tvix/castore/import): remove copying in find_ancestor | Florian Klink | 1 | -8/+2
We don't need to copy if we explicitly say that the returned Option<Path> may hold onto bytes from the passed in &DirEntry. Change-Id: Ib46b6fd2f8f19a45f8bef79c4c1d2fa6b490cad7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11410 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
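The idea of returning a borrowed path instead of a copy can be sketched like this. The `DirEntry` struct, the `levels_up` parameter and the body of `find_ancestor` are hypothetical; only the lifetime relationship between the argument and the returned value is the point.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a directory entry that owns its path.
struct DirEntry {
    path: PathBuf,
}

/// Returns an ancestor that borrows from `entry`, so nothing is copied.
/// The lifetime `'a` states that the result may hold onto the entry's bytes.
fn find_ancestor<'a>(entry: &'a DirEntry, levels_up: usize) -> Option<&'a Path> {
    entry.path.ancestors().nth(levels_up)
}
```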
2024-04-13 | r/7902 refactor(tvix/castore/import): rename ingest_entries function arg | Florian Klink | 1 | -2/+2
This is a stream of DirEntry, so let's call it direntry_stream. Change-Id: I5b3cb4efba899d746393f75f6ece7eaa79424717 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11401 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-02-18 | r/7547 fix(tvix/castore): don't emit ret as INFO | Florian Klink | 1 | -1/+2
This otherwise gets a bit spammy. Change-Id: I288350a600d79a394c239f253424ad55bc3cefc5 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10954 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2024-01-22 | r/7437 feat(tvix/castore): `process_entry` cannot process unsupported nodes | Ryan Lahfa | 1 | -1/+13
In the past, we had a `todo!` on unsupported node types, this returns a proper error that can be caught by the caller. Change-Id: Icba4c1dab33c0d670a97f162c9b358d1ed5855cb Reviewed-on: https://cl.tvl.fyi/c/depot/+/10675 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
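The difference between a `todo!` and a proper error can be sketched as below. The enum names, the `UnsupportedFileType` variant and the simplified `process_entry` signature are assumptions for illustration and do not reproduce the crate's real types.

```rust
/// Hypothetical classification of a filesystem entry.
#[derive(Debug, PartialEq)]
enum EntryKind {
    Directory,
    File,
    Symlink,
    /// Sockets, FIFOs, devices and anything else not supported here.
    Other,
}

/// Hypothetical error type; the real one may be shaped differently.
#[derive(Debug, PartialEq)]
enum ImportError {
    UnsupportedFileType(String),
}

/// Returns an error for unsupported kinds instead of panicking via `todo!`,
/// so the caller can decide how to handle it.
fn process_entry(path: &str, kind: EntryKind) -> Result<&'static str, ImportError> {
    match kind {
        EntryKind::Directory => Ok("directory"),
        EntryKind::File => Ok("file"),
        EntryKind::Symlink => Ok("symlink"),
        EntryKind::Other => Err(ImportError::UnsupportedFileType(path.to_string())),
    }
}
```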
2024-01-20 | r/7431 refactor(tvix/castore): break down `ingest_path` | Ryan Lahfa | 1 | -70/+187
Into one function that does the heavy lifting, `ingest_entries`, and three additional helpers:

- `walk_path_for_ingestion`, which performs the tree walking in a very naive way and can be replaced by the user
- `leveled_entries_to_stream`, which transforms a list of lists of entries ordered by their depth in the tree into a stream of entries in bottom-to-top order (Merkle-compatible order I will say in the future).
- `ingest_path`, which calls the previous functions.

Change-Id: I724b972d3c5bffc033f03363255eae448f017cef Reviewed-on: https://cl.tvl.fyi/c/depot/+/10573 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de> Autosubmit: raitobezarius <tvl@lahfa.xyz>
2024-01-20 | r/7430 feat(tvix/castore): ingestion does DFS and invert it | Ryan Lahfa | 1 | -49/+69
To make use of the filtering feature, we need to revert the internal walker to a real DFS. We will therefore just invert the whole tree by storing all of its contents in a level-keyed vector.

This is horribly expensive in memory, this is a compromise between CPU and memory, here is the fundamental reason for why:

When you encounter a directory, it's either a leaf or not, i.e. it contains subdirectories or not. To know this fact, you can:

- wait until you notice subdirectories under it, i.e. you need to store any intermediate nodes you see in the meantime -> memory penalty.
- getdents or readdir on it to determine *NOW* its subdirectories -> CPU penalty and I/O penalty.

This is an implementation of the first proposal, we pay memory. In practice, we are paying O(#nb of nodes) in memory.

There's a smarter albeit much more complicated algorithm that pays only O(\sum_i #siblings(p_i)) nodes where (p_1, ..., p_n) is the path to a leaf, which means for:

      A
     / \
    B   C
   /   / \
  D   E   F

We would never store D, E, F but only E, F at a given time. But we would still store B, C no matter what.

Change-Id: I456ed1c3f0db493e018ba1182665d84bebe29c11 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10567 Tested-by: BuildkiteCI Autosubmit: raitobezarius <tvl@lahfa.xyz> Reviewed-by: flokli <flokli@flokli.de>
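A rough sketch of the first proposal (pay memory, invert afterwards), using the example tree from the message. The in-memory `Node` type and the function names are hypothetical stand-ins for a real filesystem walk; the sketch only shows the level-keyed vector and the inversion.

```rust
/// Hypothetical in-memory tree standing in for a filesystem walk.
struct Node {
    name: &'static str,
    children: Vec<Node>,
}

/// Depth-first walk that files every node under its depth.
/// Memory use is proportional to the total number of nodes.
fn collect_by_depth(node: &Node, depth: usize, levels: &mut Vec<Vec<&'static str>>) {
    if levels.len() <= depth {
        levels.resize_with(depth + 1, Vec::new);
    }
    levels[depth].push(node.name);
    for child in &node.children {
        collect_by_depth(child, depth + 1, levels);
    }
}

/// Emits the deepest level first, so children always precede their parents.
fn bottom_up(levels: Vec<Vec<&'static str>>) -> Vec<&'static str> {
    levels.into_iter().rev().flatten().collect()
}

/// The tree from the commit message: A has B and C; B has D; C has E and F.
fn example_tree() -> Node {
    let leaf = |name| Node { name, children: Vec::new() };
    Node {
        name: "A",
        children: vec![
            Node { name: "B", children: vec![leaf("D")] },
            Node { name: "C", children: vec![leaf("E"), leaf("F")] },
        ],
    }
}
```

Because every directory is emitted only after everything below it, a consumer can compute each directory's contents before it needs the directory itself.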
2024-01-18 | r/7412 feat(tvix/castore): convert import error to `std::io::Error` | Ryan Lahfa | 1 | -0/+6
So that we can just `map_err` easily in functions returning `std::io::Error` but calling functions returning `castore::import::Error`. Change-Id: Id181b95e8431c69e95f3a8cd569ca10306656e1d Reviewed-on: https://cl.tvl.fyi/c/depot/+/10572 Autosubmit: raitobezarius <tvl@lahfa.xyz> Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
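A conversion of this kind might look like the following sketch. The `ImportError` type, its single variant, and the choice of `io::ErrorKind::Other` are assumptions for illustration; the commit message does not say how the real conversion maps errors.

```rust
use std::fmt;
use std::io;

/// Hypothetical import error; the real `castore::import::Error` differs.
#[derive(Debug)]
enum ImportError {
    UnableToOpen(String),
}

impl fmt::Display for ImportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ImportError::UnableToOpen(path) => write!(f, "unable to open {}", path),
        }
    }
}

impl std::error::Error for ImportError {}

/// The conversion that makes `map_err(io::Error::from)` possible.
/// Using `ErrorKind::Other` is an assumption of this sketch.
impl From<ImportError> for io::Error {
    fn from(err: ImportError) -> Self {
        io::Error::new(io::ErrorKind::Other, err)
    }
}

/// Hypothetical function returning the import error type.
fn ingest(path: &str) -> Result<u64, ImportError> {
    if path.is_empty() {
        Err(ImportError::UnableToOpen(path.to_string()))
    } else {
        Ok(path.len() as u64)
    }
}

/// Caller that returns `io::Result` and converts with a plain `map_err`.
fn ingest_io(path: &str) -> io::Result<u64> {
    ingest(path).map_err(io::Error::from)
}
```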
2024-01-09 | r/7359 refactor(tvix): use AsRef<dyn …> instead of Deref<Target= …> | Florian Klink | 1 | -6/+5
Removes some more needs for Arcs. Change-Id: I9a9f4b81641c271de260e9ffa98313a32944d760 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10578 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-06 | r/7354 chore(tvix/castore): fix the docstring for `process_entry` | Ryan Lahfa | 1 | -15/+15
It was a `//` not a `///`. Change-Id: Iee3e8c116d73b5dd8a41c027153714415a66695f Reviewed-on: https://cl.tvl.fyi/c/depot/+/10566 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2024-01-05 | r/7346 refactor(tvix/castore): relax trait bounds for DS | Florian Klink | 1 | -2/+2
Make this an `AsRef<dyn DirectoryService>`. This helps dropping some Clone requirements. Unfortunately, we can't thread this through to TvixStoreIO just yet. Change-Id: I3f07eb28d6c793d3313fe21506ada84d5a8aa3ac Reviewed-on: https://cl.tvl.fyi/c/depot/+/10533 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
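To show why an `AsRef<dyn DirectoryService>` bound avoids Clone requirements, here is a small sketch. The trait is reduced to a single hypothetical method, and `MemoryService`, `shared_service` and `describe` are illustrative names, not part of the crate.

```rust
use std::sync::Arc;

/// Hypothetical, minimal stand-in for the real `DirectoryService` trait.
trait DirectoryService {
    fn name(&self) -> String;
}

struct MemoryService;

impl DirectoryService for MemoryService {
    fn name(&self) -> String {
        "memory".to_string()
    }
}

/// Helper producing a shared trait object, as callers commonly hold one.
fn shared_service() -> Arc<dyn DirectoryService> {
    Arc::new(MemoryService)
}

/// Accepts anything that can hand out a `&dyn DirectoryService`:
/// an `Arc`, a `Box`, or a reference to either. No `Clone` is needed.
fn describe<DS: AsRef<dyn DirectoryService>>(directory_service: DS) -> String {
    directory_service.as_ref().name()
}
```

Passing `&shared_service()` works because the standard library implements `AsRef<U>` for `&T` whenever `T: AsRef<U>`, so the caller keeps ownership of its `Arc`.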
2024-01-01 | r/7300 feat(tvix/castore/import): generalize ingest_path | Florian Klink | 1 | -8/+16
We don't actually care if it's an Arc<dyn BlobService>, or something else, as long as we can Deref to a BlobService and clone. Change-Id: I0852aaf723f51c5e6b820be8db1199d17309ab08 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10510 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2023-12-12 | r/7209 fix(tvix/castore/import): don't unwrap entry | Florian Klink | 1 | -2/+8
If the path specified doesn't exist, construct a proper error instead of panicking. Part of b/344. Change-Id: Id5c6a91248b0a387f3e8f138f8e686e402009e8f Reviewed-on: https://cl.tvl.fyi/c/depot/+/10330 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
2023-12-12 | r/7208 feat(tvix/castore/import): log returned errors | Florian Klink | 1 | -1/+1
This will emit a log event / trace in case this function returns an error-y type. Change-Id: I48db6807f3e42304357c422a2b6e177cb8b95228 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10329 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
2023-12-12 | r/7207 refactor(tvix/castore/blobservice): use io::Result in trait | Florian Klink | 1 | -1/+4
For all these calls, the caller has enough context about what it did, so it should be fine to use io::Result here. We pretty much only constructed crate::Error::StorageError before anyways, so this conveys *more* information. Change-Id: I5cabb3769c9c2314bab926d34dda748fda9d3ccc Reviewed-on: https://cl.tvl.fyi/c/depot/+/10328 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de>
2023-11-24 | r/7053 fix(tvix/castore): correctly flag unreachable code | sterni | 1 | -1/+1
Change-Id: Id09afa4b77c3c70fb5695f253f6df4aa88b61e19 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10113 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2023-11-05 | r/6946 feat(tvix/castore): bump [Directory,File]Node size to u64 | Florian Klink | 1 | -1/+1
Having more than 4GiB files is quite possible (think about the NixOS graphical installer, and an uncompressed iso of it). No wire format changes. Change-Id: Ia78a07e4c554e91b93c5b9f8533266e4bd7f22b6 Reviewed-on: https://cl.tvl.fyi/c/depot/+/9950 Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
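The arithmetic behind the bump is short: a `u32` tops out one byte below 4 GiB. The helper below is a hypothetical illustration and not part of the crate.

```rust
/// One gibibyte in bytes.
const GIB: u64 = 1024 * 1024 * 1024;

/// Reports whether a size in bytes could still be stored in a `u32`.
/// `u32::MAX` is 4 GiB minus one byte, so a 4 GiB file no longer fits.
fn fits_in_u32(size: u64) -> bool {
    size <= u64::from(u32::MAX)
}
```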
2023-10-17 | r/6842 fix(tvix/castore): Fix race when ingesting into castore | Connor Brewster | 1 | -0/+4
After finishing the ingestion, the directory putter was not being closed. This caused a race where the root directory node was accessed before the directory node had been flushed to the server. This patch makes it so we close the putter before returning the root node which should ensure that the root node exists on the directory service server before the `ingest_path` function returns. Fixes b/326 Change-Id: Id16cf46bc48962121dde76d3c9c23a845d87d0f1 Reviewed-on: https://cl.tvl.fyi/c/depot/+/9761 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2023-10-13 | r/6796 docs(tvix/castore): point out use of contents_first | Linus Heckemann | 1 | -0/+5
Change-Id: I7620d2abe01675ea7028a478d4f8447e36d5768b Reviewed-on: https://cl.tvl.fyi/c/depot/+/9605 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2023-10-08 | r/6729 docs(tvix/castore): remove TODO | Florian Klink | 1 | -1/+0
This probably was about passing around directory_putter at some point, which we do, so whatever this meant, it's not actionable anymore. Change-Id: I1b4e0cdd2119bf2b2a9cf06d186a3b476b0ff367 Reviewed-on: https://cl.tvl.fyi/c/depot/+/9573 Reviewed-by: Linus Heckemann <git@sphalerite.org> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2023-10-04 | r/6689 fix(tvix/castore): explicitly name lifetimes in process_entry | edef | 1 | -3/+3
Otherwise this produces absolutely inscrutable errors: note: hidden type `[async fn body@castore/src/import.rs:63:1: 63:94]` captures lifetime '_#24r Change-Id: If5d9626c9edf400de5bcec038bcaa5a3117561f0 Reviewed-on: https://cl.tvl.fyi/c/depot/+/9506 Tested-by: BuildkiteCI Autosubmit: edef <edef@edef.eu> Reviewed-by: flokli <flokli@flokli.de>
2023-09-22 | r/6629 refactor(tvix): move castore into tvix-castore crate | Florian Klink | 1 | -0/+205
This splits the pure content-addressed layers from tvix-store into a `castore` crate, and only leaves PathInfo related things, as well as the CLI entrypoint in the tvix-store crate.

Notable changes:

- `fixtures` and `utils` had to be moved out of the `test` cfg, so they can be imported from tvix-store.
- Some ad-hoc fixtures in the test were moved to proper fixtures in the same step.
- The protos are now created by a (more static) recipe in the protos/ directory.

The (now two) golang targets are commented out, as it's not possible to update them properly in the same CL. This will be done by a followup CL once this is merged (and whitby deployed)

Bug: https://b.tvl.fyi/issues/301 Change-Id: I8d675d4bf1fb697eb7d479747c1b1e3635718107 Reviewed-on: https://cl.tvl.fyi/c/depot/+/9370 Reviewed-by: tazjin <tazjin@tvl.su> Reviewed-by: flokli <flokli@flokli.de> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>