about summary refs log tree commit diff
path: root/tvix/castore (follow)
AgeCommit message (Collapse)AuthorFilesLines
2024-03-20 r/7757 feat(tvix/castore): derive Eq and Hash automaticallyFlorian Klink2-3/+2
This allows these messages to be put in HashSets. Change-Id: Ia58094cafe53eb624578821d3d8d969c5d21a1d7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11219 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com> Autosubmit: flokli <flokli@flokli.de>
2024-03-20 r/7756 refactor(tvix/castore): instrument DirectoryPutter impls consistentlyFlorian Klink2-4/+5
Log the entire span with "trace" level, not just its `ret` level. The level of the error value event defaults to ERROR, so we don't loose these. B3Digest implements Debug and Display the same way, so we can omit the `(Display)` part in `ret(Display)` for them. Change-Id: Id00d123a5798e5bdc9820dd97ae2b4d4eb5455f0 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11218 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-20 r/7755 refactor(tvix/castore/directory): remove GRPCPutter::newFlorian Klink1-10/+3
This is no public API to construct this, there's exactly one caller, and it's perfectly fine to directly populate the struct there. Change-Id: Idae43a0162ee9bc687d21c550e0c9df33f12d263 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11217 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-20 r/7753 feat(tvix/castore): record errors for some failures in SimplePutterFlorian Klink1-0/+4
This makes it easier to see what's going wrong when uploading multiple Directories. Change-Id: Ieb71424b9761777c5f719b2f365962644de82baf Reviewed-on: https://cl.tvl.fyi/c/depot/+/11209 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
2024-03-20 r/7751 feat(tvix/castore/blob): document missing objectstore+*:// URLFlorian Klink1-0/+1
Change-Id: I3cbc75e636efd467ddda7fc3c3c8f19ad42204ee Reviewed-on: https://cl.tvl.fyi/c/depot/+/11206 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-20 r/7750 refactor(tvix/castore/blob): drop simplefsFlorian Klink3-208/+1
This functionality is provided by the object store backend too (using `objectstore+file://$some_path`). This backend also supports content-defined chunking and compresses chunks with zstd. Change-Id: I5968c713112c400d23897c59db06b6c713c9d8cb Reviewed-on: https://cl.tvl.fyi/c/depot/+/11205 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-20 r/7749 refactor(tvix/castore): introduce "cloud" feature flagFlorian Klink2-6/+23
This controls whether tvix-castore has support for various cloud backends or not. Use this to control the set of feature flags for the object_store backend, and only enable the aws, azure and gcp ones if it's set. In the future this can be used to enable/disable other cloud backends too. Without feature flags, `object_store` already supports the `InMemory` and `LocalFilesystem` backends, and we also want to unconditionally enable the `http` one. Make sure at least the construction of these services is covered in the tests. Similarly, the tvix-store crate, which provides the tvix-store CLI has a `cloud` feature flag too (defaulting to enabled). Change-Id: I9fb9c87b740e7dc83f8ff7a0862905d036d513f2 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11204 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
2024-03-20 r/7748 docs(tvix/castore/directory): update docstring for get_recursiveFlorian Klink1-1/+4
The rust trait was missing to document the order of the elements in the stream. Document that, and also the reasoning behind this. Change-Id: I27ef0b2020082783fc41c2015233175e2b8e716d Reviewed-on: https://cl.tvl.fyi/c/depot/+/11203 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-03-20 r/7746 refactor(tvix/castore/directory/from_addr): use match guardsFlorian Klink1-39/+42
This will allow feature-flagging some of the backends. Change-Id: Iddbdb89d3cf9c966a2c25b06b03e6917b284cae5 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11201 Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz> Autosubmit: flokli <flokli@flokli.de>
2024-03-20 r/7745 refactor(tvix/castore/blob/from_addr): use match guardsFlorian Klink1-53/+59
This will allow feature-flagging some of the backends. Change-Id: Idffbf8b3fd154f5a3d938225c3871feffea8ff8c Reviewed-on: https://cl.tvl.fyi/c/depot/+/11200 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-03-18 r/7730 refactor(tvix/castore/blobsvc): use B3Digest Display implFlorian Klink2-10/+2
We don't need to use BASE64 here on our own, B3Digest has a Display impl. This will also make sure the `b3:` digest is present in field values. Change-Id: I0ce6ee0f7e7e99fb9b16872953a1b742e99be291 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11192 Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de>
2024-03-18 r/7724 feat(tvix/blobservice/object_store) more loggingFlorian Klink1-1/+3
Have derive_{blob,chunk}_path emit trace-level events for both the values they're called with, as well as the return value. With RUST_LOG in place, it doesn't get lost in other unrelated noise. Change-Id: Id2451e3657324eff482841eb26a22d19e22bde30 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11136 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-03-18 r/7722 feat(tvix/castore/blobsvc/grpc): read data in chunksFlorian Klink1-26/+54
Whenever this encounters an open_read(), it'll first check for more granular chunking. If there's more granular chunking data available, a ChunkedReader is constructed (which supports seeking backwards). This currently is still a bit stupid, and doesn't compose, as `ChunkedReader` uses `self` as the `BlobService` to ask for the individual chunks. In store composition future, we might want to compose this differently, essentially constructing `ChunkedReader` with another `BlobService` representing the entire hierarchy, so there's a chance to locally cache things, and do less requests. Change-Id: I22e0df4d6245f666d083b4f0b7114d3ac41d1dce Reviewed-on: https://cl.tvl.fyi/c/depot/+/11185 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-18 r/7720 refactor(tvix/castore/src/blobservice): remove useless else caseFlorian Klink1-4/+3
Change-Id: I09000371a1d8ff212ab46050d1a480509c6ffe70 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11183 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-03-17 r/7719 feat(tvix/castore): impl Debug for B3DigestFlorian Klink1-1/+7
Use the same format as Display, b3: followed by the base64 representation. This makes the debug implementation of everything containing a b3 digest much nicer to read. Change-Id: I3ca3154d0b6fb07781c8f9c83ece3ff1a6957902 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11181 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-03-16 r/7705 chore(tvix): bump tonic to 0.11.0Florian Klink1-3/+3
This bumps tonic and surrounding crates to 0.11.x. We added support for tonic 0.11.x into tokio-listener (https://github.com/vi/tokio-listener/pull/4), so that's bumped as well. Change-Id: Icfade5894403228299836fefb21b2f9ae59dbebb Reviewed-on: https://cl.tvl.fyi/c/depot/+/11156 Reviewed-by: Connor Brewster <cbrewster@hey.com> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-03-16 r/7701 fix(tvix): don't emit rerun-if-changedFlorian Klink1-2/+2
`build.rs` emits rerun-if-changed statements for all proto files, as well as all include paths we pass it. Unfortunately, due to protobufs include path rules, we need to specify the path to the depot root itself as an include path, at least when building impurely with `cargo`. This causes cargo to essentially always rebuild, as it also puts its own temporary files in there. Unfortunately, tonic-build does not chase down to individual .proto files that are included. Disable emitting these `rerun-if-changed` statements for now. This could cause cargo to not rebuild protos every time, causing stale data until the next local `cargo clean`, but considering the protos change not that frequently, and it'll immediately surface if trying to build via Nix (either locally or in CI), it's a good-enough compromise. Change-Id: Ifd279a2216222ef3fc0e70c5a2fe6f87997f562e Reviewed-on: https://cl.tvl.fyi/c/depot/+/11157 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: sterni <sternenseemann@systemli.org> Tested-by: BuildkiteCI
2024-03-11 r/7685 feat(tvix/castore/blobsvc/from_addr): support object_storeFlorian Klink1-1/+21
The object_store crate supports a ton of different stores, with different schemes. For now, use a objectstore+ scheme prefix to enable these. Change-Id: I946f76e32a0fb0867ef59060217894cda5b959b9 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11080 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com> Autosubmit: flokli <flokli@flokli.de>
2024-03-11 r/7684 feat(tvix/castore/blobsvc): add object storage implementationFlorian Klink4-0/+564
This uses the `object_store` crate to expose a tvix-castore BlobService backed by object storage. It's using FastCDC to chunk blobs into smaller chunks when writing to it. These are exposed at the .chunks() method. Change-Id: I2858c403d4d6490cdca73ebef03c26290b2b3c8e Reviewed-on: https://cl.tvl.fyi/c/depot/+/11076 Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI Reviewed-by: Brian Olsen <me@griff.name>
2024-03-09 r/7660 fix(tvix/castore/grpc/directory): skip_all fields in instrumentFlorian Klink1-15/+21
This only contains the outer metadata wrapping, and that's not too interesting: > Request { metadata: MetadataMap { headers: {"content-type": > "application/grpc", "user-agent": "grpc-go/1.60.1", "te": "trailers", > "grpc-accept-encoding": "gzip"} }, message: Streaming, extensions: > Extensions } Drop these fields for now, and rely on the underlying implementations to add instrumentation for the application-specific fields. Clean up the error logging a bit. Change-Id: Ife1090ed411766a61e1fa60fd4c9570f38de1e98 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11102 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-09 r/7659 fix(tvix/castore/grpc/blob): skip_all fields in instrumentFlorian Klink1-5/+11
This only contains the outer metadata wrapping, and that's not too interesting: > Request { metadata: MetadataMap { headers: {"content-type": > "application/grpc", "user-agent": "grpc-go/1.60.1", "te": "trailers", > "grpc-accept-encoding": "gzip"} }, message: Streaming, extensions: > Extensions } Drop these fields for now, and rely on the underlying implementations to add instrumentation for the application-specific fields. Log errors in some places where we didn't so far. Change-Id: Ia68d6c526987d3716be62a0809195401cf28512b Reviewed-on: https://cl.tvl.fyi/c/depot/+/11101 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-03-03 r/7650 fix(tvix/castore): also set SSL_CERT_FILE for tests thereFlorian Klink1-0/+3
For everything using reqwest here during test cases, we also need to set SSL_CERT_FILE. Change-Id: If8aeda65f3d75cb9ac5c9bc64e37a0cb7dffc17c Reviewed-on: https://cl.tvl.fyi/c/depot/+/11092 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
2024-03-03 r/7648 refactor(tvix/*/from_addr): improve test debuggabilityFlorian Klink2-4/+12
If there's an unexpected test failure, print it out, rather than just saying something is false even though it should be true. Use .expect() for this, which displays the error if it failed. We can't use expect_err(), as our stores are not display'able, so use an assertion with a message there. Change-Id: I2d88861d979d107edc0717fbdb3cdac9a6bfc5e4 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11091 Tested-by: BuildkiteCI Reviewed-by: Brian Olsen <me@griff.name> Reviewed-by: flokli <flokli@flokli.de>
2024-03-03 r/7641 feat(tvix/castore): add HashingReader, B3HashingReaderFlorian Klink2-0/+90
HashingReader wraps an existing AsyncRead, and allows querying for the digest of all data read "through" it. The hash function is configurable by type parameter, and we define B3HashingReader. Change-Id: Ic08142077566fc08836662218f5ec8c3aff80be5 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11087 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
2024-03-03 r/7640 feat(tvix/castore/digests): impl From digest::Output<_> for B3DigestFlorian Klink2-3/+11
This allows calling .into() to get a B3Digest. Change-Id: I6e63b496413cd00d84acfcd15c7de0f64c79721f Reviewed-on: https://cl.tvl.fyi/c/depot/+/11086 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
2024-03-03 r/7631 refactor(tvix/castore/blobsvc/chunked_reader): refactor, documentFlorian Klink1-131/+139
The public-consumable thing here is ChunkedReader, not ChunkedBlob. ChunkedBlob is a helper that can be used to get a new AsyncRead, but not AsyncSeek. It is used internally by ChunkedReader whenever the client seeks. Make this more obvious, by extending the documentation, and putting ChunkedReader at the top of this file. Also make ChunkedBlob and its methods private, and give ChunkedReader a more useful constructor (from_chunks, instead of from_chunked_blob). Change-Id: I2399867591df923faa73927b924e7c116ad98dc0 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11079 Tested-by: BuildkiteCI Reviewed-by: Brian Olsen <me@griff.name> Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-02 r/7627 feat(tvix/castore/blobsvc): BlobReader for more trivial typesFlorian Klink1-0/+2
Change-Id: I80e4f26c41a504fa4c6a013c2a1e76de613ba294 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11078 Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-03-02 r/7626 fix(tvix/castore/blobwriter): don't require Sync + 'staticFlorian Klink1-1/+1
There's no reason for these two. Change-Id: Ie6f238bbb0b17971c9877b11b61ea7ebca573c13 Reviewed-on: https://cl.tvl.fyi/c/depot/+/11075 Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de>
2024-02-20 r/7570 docs(tvix/castore/directorysvc): K/V is not necessarily flatFlorian Klink2-0/+10
Some implementations of DirectoryService might not allow retrieval of intermediate Directory nodes, that are not at the "root". Think about an object store implementation. The client is doing a get_recursive anyways to reduce the number of roundtrips. By documenting the fact we don't need to support looking up intermediate Directory messages, we can just batch all directories into the same object, keyed by the root. Change-Id: I019d720186d03c4125cec9191e93d20586a20963 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10988 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2024-02-19 r/7558 feat(tvix/castore): Compile fix for DarwinPeter Kolloch1-0/+3
Towards https://b.tvl.fyi/issues/264 Change-Id: If8fa912ae3fb2987b761f649ab738529ebf3b2e8 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10970 Autosubmit: Peter Kolloch <info@eigenvalue.net> Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-02-18 r/7547 fix(tvix/castore): don't emit ret as INFOFlorian Klink1-1/+2
This otherwise gets a bit spammy. Change-Id: I288350a600d79a394c239f253424ad55bc3cefc5 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10954 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2024-02-17 r/7535 feat(tvix/castore/fs): make allow_other configurableFlorian Klink2-3/+8
Also add a cli argument to the `tvix-store` binary. Change-Id: Id07d7fedb60d6060543b195f3a810a46137f9ad5 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10945 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: tazjin <tazjin@tvl.su>
2024-02-10 r/7494 feat(tvix/castore/blobsvc): add Chunked{Blob,Reader}Florian Klink2-0/+489
These provide seekable access into a Blob for which we have more granular chunking information. There's no support for verified streaming in here yet, this simply produces a stream of readers for each chunk, skipping irrelevant chunks and data from the first chunk at the beginning. A seek simply does produce a new reader using the same process. Change-Id: I37f76b752adce027586770475435f3990a6dee0b Reviewed-on: https://cl.tvl.fyi/c/depot/+/10731 Reviewed-by: Connor Brewster <cbrewster@hey.com> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-02-06 r/7479 docs(tvix/castore/blobstore): reorganize docsFlorian Klink3-121/+251
docs/verified-streaming.md explained how CDC and verified streaming can work together, but didn't really highlight enough how chunking in general also helps with seeking. In addition, a lot of the thoughts w.r.t. the BlobStore protocol, both gRPC and Rust traits, as well as why there's no support for seeking directly in gRPC, as well as how clients should behave w.r.t. chunked fetching was missing, or mixed together with the verified streaming bits. While there is no verified streaming version yet, a chunked one is coming soon, and documenting this a bit better is gonna make it easier to understand, as well as provide some lookout on where this is heading. Change-Id: Ib11b8ccf2ef82f9f3a43b36103df0ad64a9b68ce Reviewed-on: https://cl.tvl.fyi/c/depot/+/10733 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-02-02 r/7471 fix(tvix/castore/grpc/svc_wrapper): expose chunks() over gRPCFlorian Klink1-3/+6
The Stat() method was just always signalling no granular chunks are available. However, as we now have a .chunks() method, we can expose it over gRPC. Change-Id: I74f0890ae083f301bb0cec62f1ea4a95463ac590 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10736 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-02-02 r/7470 feat(tvix/castore/blobsvc): validate StatBlobResponseFlorian Klink2-0/+29
All chunks must have valid blake3 digests. It is allowed to send an empty list, if no more granular chunking is available. Change-Id: I7ecb53579cdf40fd938bb68a85685751b4d3626f Reviewed-on: https://cl.tvl.fyi/c/depot/+/10726 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com> Autosubmit: flokli <flokli@flokli.de>
2024-02-02 r/7469 refactor(tvix/castore/grpc/blobsvc): inline stream_mapperFlorian Klink1-12/+3
This can be written without the additional function. Change-Id: Ib11c5d5254d3e44c8fa9661414835b0622eb1ac4 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10735 Reviewed-by: Connor Brewster <cbrewster@hey.com> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-02-02 r/7468 docs(tvix/castore/blobsvc): fix doc comments on traitFlorian Klink1-10/+14
The readers implement AsyncRead/AsyncSeek, not their sync counterparts. Also update expectations around chunks. Change-Id: Ic266688039d80d16d33f651b96ce2bcdedecfa00 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10734 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-02-02 r/7466 feat(tvix/castore/docs/verified-streaming): clarify replyFlorian Klink1-1/+1
"given chunksize" is misleading here. It's up to the backend to decide if it does chunking at all, and how it chunks. Change-Id: I4f130ca9ac34db79f18ef1d6475295806ac7f9a4 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10728 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2024-02-02 r/7465 refactor(tvix/castore/blobsvc/combinator): compact trait boundsFlorian Klink1-1/+2
BlobService already implies Send and Sync, we don't need to explicitly list it here. Change-Id: I58a4c5912be61a60acd961565979aa01d94ee0f7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10727 Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de>
2024-01-22 r/7437 feat(tvix/castore): `process_entry` cannot process unsupported nodesRyan Lahfa1-1/+13
In the past, we had a `todo!` on unsupported node types, this returns a proper error that can be caught by the caller. Change-Id: Icba4c1dab33c0d670a97f162c9b358d1ed5855cb Reviewed-on: https://cl.tvl.fyi/c/depot/+/10675 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2024-01-21 r/7435 chore(tvix/store): Use BoxStream type aliasConnor Brewster7-24/+17
The BoxStream type alias is a more concise and easier to read than the full `Pin<Box<dyn Stream<Item = ...> + Send + ...>>` type. Change-Id: I5b7bccfd066ded5557e01f7895f4cf5c4a33bd44 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10677 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Autosubmit: Connor Brewster <cbrewster@hey.com>
2024-01-20 r/7431 refactor(tvix/castore): break down `ingest_path`Ryan Lahfa1-70/+187
In one function that does the heavy lifting: `ingest_entries`, and three additional helpers: - `walk_path_for_ingestion` which perform the tree walking in a very naive way and can be replaced by the user - `leveled_entries_to_stream` which transforms a list of a list of entries ordered by their depth in the tree to a stream of entries in the bottom to top order (Merkle-compatible order I will say in the future). - `ingest_path` which calls the previous functions. Change-Id: I724b972d3c5bffc033f03363255eae448f017cef Reviewed-on: https://cl.tvl.fyi/c/depot/+/10573 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de> Autosubmit: raitobezarius <tvl@lahfa.xyz>
2024-01-20 r/7430 feat(tvix/castore): ingestion does DFS and invert itRyan Lahfa1-49/+69
To make use of the filtering feature, we need to revert the internal walker to a real DFS. We will therefore just invert the whole tree by storing all of its contents in a level-keyed vector. This is horribly expensive in memory, this is a compromise between CPU and memory, here is the fundamental reason for why: When you encounter a directory, it's either a leaf or not, i.e. it contains subdirectories or not. To know this fact, you can: - wait until you notice subdirectories under it, i.e. you need to store any intermediate nodes you see in the meantime -> memory penalty. - getdents or readdir on it to determine *NOW* its subdirectories -> CPU penalty and I/O penalty. This is an implementation of the first proposal, we pay memory. In practice, we are paying O(#nb of nodes) in memory. There's a smarter albeit much more complicated algorithm that pays only O(\sum_i #siblings(p_i)) nodes where (p_1, ..., p_n) is the path to a leaf. which means for: A / \ B C / / \ D E F We would never store D, E, F but only E, F at a given time. But we would still store B, C no matter what. Change-Id: I456ed1c3f0db493e018ba1182665d84bebe29c11 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10567 Tested-by: BuildkiteCI Autosubmit: raitobezarius <tvl@lahfa.xyz> Reviewed-by: flokli <flokli@flokli.de>
2024-01-19 r/7422 chore(3p/sources): Bump channels & overlayssterni1-3/+1
- Adjust to ecl 23.9.9 release - Regenerate go protos after protoc-gen-go update - Drop dhall fork which hasn't kept up with 1.42.* - Address new clippy warnings: - Variant naming of Error::ValidationError - Simplify .try_into().unwrap() - Drop unnecessary identity function - Test module must be last in file - Drop unused `pub use` - Update agenix to 0.15.0. Current master has a installCheckPhase that doesn't work with C++ Nix 2.3.*: https://github.com/ryantm/agenix/commit/a23aa271bec82d3e962bafb994595c1c4a62b133#commitcomment-137185861 Change-Id: Ic29eef20d6fd1362ce1031364a5ca6b4edf195bd Reviewed-on: https://cl.tvl.fyi/c/depot/+/10615 Reviewed-by: aspen <root@gws.fyi> Tested-by: BuildkiteCI Autosubmit: sterni <sternenseemann@systemli.org>
2024-01-18 r/7412 feat(tvix/castore): convert import error to `std::io::Error`Ryan Lahfa1-0/+6
So that we can just `map_err` easily in functions returning `std::io::Error` but calling functions returning `castore::import::Error`. Change-Id: Id181b95e8431c69e95f3a8cd569ca10306656e1d Reviewed-on: https://cl.tvl.fyi/c/depot/+/10572 Autosubmit: raitobezarius <tvl@lahfa.xyz> Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2024-01-15 r/7380 feat(tvix/castore): implement Ord for node::NodeFlorian Klink1-0/+14
This allows assembling BTreeSets of node::Node. Change-Id: I97b83be5ffc3e891307a8ef2b5fc31e38b747a62 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10625 Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz> Autosubmit: flokli <flokli@flokli.de>
2024-01-09 r/7366 feat(tvix/castore): implement CombinedBlobServiceFlorian Klink2-0/+137
First attempt on composition of BlobServices. Change-Id: I6e70248007edfd322a503fd40c1c4b4300cbc30c Reviewed-on: https://cl.tvl.fyi/c/depot/+/10587 Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz> Autosubmit: flokli <flokli@flokli.de>
2024-01-09 r/7365 feat(tvix/castore/blobsvc): add chunks methodFlorian Klink2-2/+47
This adds support to retrieve a list of chunks for a given blob to the BlobService interface. While theoretically all chunk-awareness could be kept private inside each BlobService reader, we'd not be able to resolve individual chunks from different Blobservices - and due to this, not able to substitute chunks we already have in a more local store. This function allows asking a BlobService for the list of chunks, leaving any actual fetching up to the caller (be it through individual calls to open_read), or asking another store for it. Change-Id: I1d33c591195ed494be3aec71a8c804743cbe0dca Reviewed-on: https://cl.tvl.fyi/c/depot/+/10586 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
2024-01-09 r/7364 refactor(tvix/castore): do clone inside a scopeFlorian Klink1-6/+5
Make it clear this is only used inside the scope. Change-Id: Ie94f88d7f0fb58cd4bf9c2f1176000b272e6f2e6 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10585 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>