path: root/tvix/store/src/proto/grpc_blobservice_wrapper.rs
Age | Commit message | Author | Files | Lines
2023-09-18 | r/6606 refactor(tvix/store/blobsvc): make BlobStore async | Florian Klink | 1 | -32/+21

We previously kept the BlobService trait sync. This however had some annoying consequences:

- It became more and more complicated to track whether we're in a context with an async runtime or not, producing bugs like https://b.tvl.fyi/issues/304
- The sync trait shielded async clients from async workloads, requiring manual block_on code inside the gRPC client code, and spawn_blocking calls in consumers of the trait, even if they were async (like the gRPC server)
- We had to write our own custom glue code (SyncReadIntoAsyncRead) to convert a sync io::Read into a tokio::io::AsyncRead, which already existed in tokio internally, but upstream is hesitant to expose.

This now makes the BlobService trait async (via the async_trait macro, like we already do in various gRPC parts), and replaces the sync readers and writers with their async counterparts.

Tests interacting with a BlobService now need to have an async runtime available. The easiest way to get one is to mark the test functions with the tokio::test macro, allowing us to directly .await in the test function.

In places where we don't have an async runtime available from context (like tvix-cli), we can pass one down explicitly.

Now that we don't provide a sync interface anymore, the (sync) FUSE library holds a pointer to a tokio runtime handle, and needs to have at least 2 threads available when talking to a blob service (which is why some of the tests now use the multi_thread flavor).

The FUSE tests got a bit more verbose, as we couldn't use the setup_and_mount function accepting a callback anymore. We can hopefully move some of the test fixture setup to rstest in the future to make this less repetitive.

Co-Authored-By: Connor Brewster <cbrewster@hey.com>
Change-Id: Ia0501b606e32c852d0108de9c9016b21c94a3c05
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9329
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-07-22 | r/6439 feat(tvix/store/proto): use Bytes instead of Vec<u8> | Florian Klink | 1 | -5/+5

Makes use of https://github.com/tokio-rs/prost/pull/341, which makes our bytes field cheaper to clone.

It's a bit annoying to configure due to https://github.com/hyperium/tonic/issues/908, but the workaround does get the job done.

Change-Id: I25714600b041bb5432d3adf5859b151e72b12778
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8975
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
2023-07-21 | r/6437 feat(tvix/store/digests): use bytes::Bytes instead of Vec<u8> | Florian Klink | 1 | -5/+7

This will save us some copies, because a clone will simply create an additional pointer to the same data.

Change-Id: I017a5d6b4c85a861b5541ebad2858ad4fbf8e8fa
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8978
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
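The "additional pointer to the same data" behaviour can be sketched with the standard library alone. This is only an illustration: `SharedDigest` is a hypothetical type, and `Arc<[u8]>` stands in for `bytes::Bytes`, which the commit actually uses.

```rust
use std::sync::Arc;

/// Hypothetical digest type backed by shared, immutable bytes.
/// `Arc<[u8]>` is a std-only stand-in for `bytes::Bytes`.
#[derive(Clone, Debug, PartialEq)]
pub struct SharedDigest(Arc<[u8]>);

impl SharedDigest {
    /// Copies `data` once into a reference-counted allocation.
    pub fn new(data: &[u8]) -> Self {
        SharedDigest(Arc::from(data))
    }

    /// Returns true if both values point at the same allocation.
    pub fn shares_storage_with(&self, other: &SharedDigest) -> bool {
        Arc::ptr_eq(&self.0, &other.0)
    }

    /// Borrows the underlying bytes.
    pub fn as_slice(&self) -> &[u8] {
        &self.0
    }
}
```

Cloning a `SharedDigest` only bumps a reference count, whereas cloning a `Vec<u8>` allocates and copies every byte.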
2023-06-12 | r/6278 refactor(tvix/store/blobsvc): drop Result<_,_> around open_write | Florian Klink | 1 | -4/+1

We never returned Err here anyways, and we can still return an error during the first (or subsequent) write(s).

Change-Id: I4b4cd3d35f6ea008e9ffe2f7b71bfc9187309e2f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8750
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
2023-06-12 | r/6276 feat(tvix/store): increase blob chunk size | Ryan Lahfa | 1 | -2/+69

From 64 bytes to 100 KBytes. We need to provide a custom wrapper with a different Default instance.

Change-Id: Id7c6c437b8183b355a9e388f98cef1622b363f64
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8748
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
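As a rough illustration of the "wrapper with a different Default instance" pattern, here is a std-only sketch. The names `ChunkSize` and `split_into_chunks` are hypothetical and not taken from the commit, and reading "100 KBytes" as 100 * 1024 bytes is an assumption.

```rust
/// Hypothetical newtype whose only job is to carry a different default.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ChunkSize(pub usize);

impl Default for ChunkSize {
    fn default() -> Self {
        // Assumption: "100 KBytes" is interpreted as 100 * 1024 bytes.
        ChunkSize(100 * 1024)
    }
}

/// Splits `data` into pieces of at most `size` bytes each.
/// Panics if `size.0` is zero, as `slice::chunks` does.
pub fn split_into_chunks(data: &[u8], size: ChunkSize) -> Vec<&[u8]> {
    data.chunks(size.0).collect()
}
```

Callers that are happy with the default write `ChunkSize::default()`, and the wrapped value stays overridable for tests.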
2023-06-12 | r/6273 refactor(tvix/store): use Arc instead of Box | Florian Klink | 1 | -4/+4

This allows us to blob services without closing them before putting them in a box.

We currently need to use Arc<_>, not Rc<_>, because the GRPC wrappers require Sync.

Change-Id: I679c5f06b62304f5b0456cfefe25a0a881de7c84
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8738
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
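A minimal sketch of why `Arc<_>` rather than `Rc<_>` is needed once a service handle must be shared across threads. The trait below is a simplified, hypothetical stand-in for the real BlobService trait, and `EmptyBlobService` exists only for illustration.

```rust
use std::sync::Arc;
use std::thread;

/// Simplified stand-in for the BlobService trait.
/// `Send + Sync` is what lets `Arc<dyn BlobService>` cross thread boundaries.
pub trait BlobService: Send + Sync {
    fn has(&self, digest: &[u8; 32]) -> bool;
}

/// Hypothetical implementation that never contains any blob.
pub struct EmptyBlobService;

impl BlobService for EmptyBlobService {
    fn has(&self, _digest: &[u8; 32]) -> bool {
        false
    }
}

/// Queries the same service from the current thread and from a worker thread.
pub fn query_from_two_threads(svc: Arc<dyn BlobService>) -> (bool, bool) {
    // Cloning the Arc only bumps a reference count; the service is shared.
    let svc_for_worker = Arc::clone(&svc);
    let worker = thread::spawn(move || svc_for_worker.has(&[0u8; 32]));
    let here = svc.has(&[1u8; 32]);
    (here, worker.join().expect("worker thread panicked"))
}
```

The same function would not compile with `Rc<dyn BlobService>`, since `Rc` is neither `Send` nor `Sync`.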
2023-06-12 | r/6269 feat(tvix/store): eliminate generics in BlobStore | Florian Klink | 1 | -10/+6

To construct various stores at runtime, we need to eliminate associated types from the BlobService trait, and return Box<dyn …> instead of specific types.

This also means we can't consume self in the close() method, so everything we write to is put in an Option<>, and during the first close we take from there.

Change-Id: Ia523b6ab2f2a5276f51cb5d17e81a5925bce69b6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8647
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
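The Option<>-and-take pattern for a `close()` that cannot consume `self` might look roughly like this. It is a sketch under assumptions: `BlobWriter` and `MemoryBlobWriter` here are simplified, hypothetical versions, and `close()` returns the number of bytes written where the real trait would return a digest.

```rust
use std::io::{self, Write};

/// Object-safe writer trait: `close` takes `&mut self` so it can be
/// called through a `Box<dyn BlobWriter>`.
pub trait BlobWriter: Write {
    /// Finishes the blob. Returns the byte count as a stand-in for a digest.
    fn close(&mut self) -> io::Result<usize>;
}

/// Hypothetical in-memory writer; the buffer lives in an Option so that
/// the first `close` can take ownership of it.
pub struct MemoryBlobWriter {
    buf: Option<Vec<u8>>,
}

impl MemoryBlobWriter {
    pub fn new() -> Self {
        MemoryBlobWriter { buf: Some(Vec::new()) }
    }
}

fn already_closed() -> io::Error {
    io::Error::new(io::ErrorKind::BrokenPipe, "already closed")
}

impl Write for MemoryBlobWriter {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        let buf = self.buf.as_mut().ok_or_else(already_closed)?;
        buf.extend_from_slice(data);
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

impl BlobWriter for MemoryBlobWriter {
    fn close(&mut self) -> io::Result<usize> {
        // `take` leaves None behind, so a second close (or write) fails.
        let buf = self.buf.take().ok_or_else(already_closed)?;
        Ok(buf.len())
    }
}

/// Returns a trait object instead of a concrete type, so the store
/// implementation can be chosen at runtime.
pub fn open_write() -> Box<dyn BlobWriter> {
    Box::new(MemoryBlobWriter::new())
}
```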
2023-05-23 | r/6178 refactor(tvix/store/blobsvc): move from Vec<u8> to B3Digest | Florian Klink | 1 | -18/+6

Change-Id: I809bab75221f81b6023cfe75c2fe9e589c1e9192
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8605
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
2023-05-11 | r/6133 refactor(tvix/store): remove ChunkService | Florian Klink | 1 | -158/+84

Whether chunking is involved or not is an implementation detail of each Blobstore. Consumers of a whole blob shouldn't need to worry about that. It currently is not visible in the gRPC interface either. It shouldn't bleed into everything.

Let the BlobService trait provide `open_read` and `open_write` methods, which return handles providing io::Read or io::Write, and leave the details up to the implementation.

This means our custom BlobReader module can go away, and all the chunking bits in there, too.

In the future, we might still want to add more chunking-aware syncing, but as a syncing strategy some stores can expose, not as a fundamental protocol component.

This currently needs "SyncReadIntoAsyncRead", taken and vendored in from https://github.com/tokio-rs/tokio/pull/5669. It provides an AsyncRead for a sync Read, which is necessary to connect our (sync) BlobReader interface to a GRPC server implementation.

As an alternative, we could also make the BlobReader itself async, and let consumers of the trait (EvalIO) deal with the async-ness, but this is less of a change for now.

In terms of vendoring, I initially tried to move our tokio crate to these commits, but ended up in version incompatibilities, so let's vendor it in for now.

Change-Id: I5969ebbc4c0e1ceece47981be3b9e7cfb3f59ad0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8551
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
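The read side of such an interface can be sketched as follows. Everything here is an assumption made for illustration: the trait is reduced to a single `open_read` method, its exact signature is a guess, and `MemoryBlobService` is a hypothetical in-memory store keyed by a raw 32-byte digest.

```rust
use std::collections::HashMap;
use std::io::{self, Cursor, Read};

/// Simplified stand-in for the BlobService trait: consumers only see a
/// reader, never chunks.
pub trait BlobService {
    /// Returns a reader for the blob, or `None` if the digest is unknown.
    fn open_read(&self, digest: &[u8; 32]) -> io::Result<Option<Box<dyn Read>>>;
}

/// Hypothetical store keeping whole blobs in memory. A chunked store could
/// implement the same trait without its consumers noticing.
pub struct MemoryBlobService {
    blobs: HashMap<[u8; 32], Vec<u8>>,
}

impl MemoryBlobService {
    pub fn new() -> Self {
        MemoryBlobService { blobs: HashMap::new() }
    }

    pub fn insert(&mut self, digest: [u8; 32], data: Vec<u8>) {
        self.blobs.insert(digest, data);
    }
}

impl BlobService for MemoryBlobService {
    fn open_read(&self, digest: &[u8; 32]) -> io::Result<Option<Box<dyn Read>>> {
        Ok(self
            .blobs
            .get(digest)
            .map(|data| Box::new(Cursor::new(data.clone())) as Box<dyn Read>))
    }
}

/// Consumer code: reads a whole blob without knowing how it is stored.
pub fn read_whole_blob(
    svc: &dyn BlobService,
    digest: &[u8; 32],
) -> io::Result<Option<Vec<u8>>> {
    match svc.open_read(digest)? {
        Some(mut reader) => {
            let mut out = Vec::new();
            reader.read_to_end(&mut out)?;
            Ok(Some(out))
        }
        None => Ok(None),
    }
}
```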
2023-03-27 | r/6041 docs(tvix/store): fix typo in comment | Florian Klink | 1 | -1/+1

We query the blob service for detailed blob info, not the chunk service.

Change-Id: I85a6a57b1dae74a950f734be7d4455c5c35ae355
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8348
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
2023-03-17 | r/6018 chore(tvix/store/grpcblobsvc): clippy lint | Florian Klink | 1 | -1/+1

Change-Id: Idb78e0417a962599cdfdef5e7346f7fa41e3fa1b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8320
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
2023-03-16 | r/6015 refactor(tvix/store/chunksvc): use [u8; 32] instead of Vec<u8> | Florian Klink | 1 | -9/+16

Change-Id: Ie2b94aa5d69ff2c61fb77e13ae844f81f6270273
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8314
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
2023-03-13 | r/5959 refactor(tvix/store): use read_all_and_chunk in gRPC blobservice | Florian Klink | 1 | -35/+38

This was the last piece of code using BlobWriter. We can also use `read_all_and_chunk`, it just requires a bit more plumbing:

- The data coming from the client (stream) needs to be mapped (we extract the .data field).
- The stream needs to be turned into an (async) reader.
- The reader needs to be made sync, and the code using the sync reader needs to be in a `task::spawn_blocking`.

Change-Id: I4e374e1a9f47d5a0933f59a8f5c121185a5f3e95
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8260
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
2023-03-11 | r/5952 refactor(tvix/store): factor out hash update into function | Florian Klink | 1 | -6/+6

We're using this in a bunch of places. Let's move it into a helper function.

Change-Id: I118fba35f6d343704520ba37280e4ca52a61da44
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8251
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-03-10 | r/5933 fix(tvix/store/proto/grpc_blobservice_wrapper): buffer recv data | Florian Klink | 1 | -7/+14

While we don't want to keep all of the data in memory, we want to feed a reasonably large buffer to the chunking function, to avoid needlessly trying to chunk over and over again.

Change-Id: I5bbe2d55e8c1c63f8f7ce343889d374b528b559e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8160
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-03-10 | r/5927 refactor(tvix/store): move blob splitting into a BlobWriter struct | Florian Klink | 1 | -76/+11

This moves the chunking-as-we-receive logic that so far only lived in grpc_blobservice_wrapper.rs into a generic BlobWriter.

Change-Id: Ief7d1bda3c6280129f7139de3f6c4174be2ca6ea
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8154
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-03-10 | r/5926 feat(tvix/store): do not buffer blob data | Florian Klink | 1 | -73/+106

Use the FastCDC::cut function to ask fastcdc for cutting points as we receive the data. Make sure to keep the last chunk in the temporary buffer, as we might not actually cut at the end.

Also, use rayon to calculate the blake3 hash if the input data is > 128KiB.

Change-Id: I6195f3b74eac5516965cb12d8d026aa720c8b891
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8135
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
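The "cut as we receive, keep the tail in a buffer" idea can be sketched without the fastcdc crate. Note the swap: `next_cut` below is a hypothetical fixed-size stand-in, not FastCDC's content-defined cut, and `StreamingChunker` is an illustrative name that does not appear in the commit.

```rust
/// Stand-in for a content-defined chunker: returns the offset of the next
/// cut point in `buf`, or None if more data is needed first.
/// Assumption: a fixed-size cut replaces the real FastCDC logic.
fn next_cut(buf: &[u8], chunk_size: usize) -> Option<usize> {
    // Only cut when bytes remain after the cut, so the tail stays buffered
    // until we know whether the stream continues.
    if buf.len() > chunk_size {
        Some(chunk_size)
    } else {
        None
    }
}

/// Accumulates incoming data and emits chunks as soon as they are complete.
pub struct StreamingChunker {
    buf: Vec<u8>,
    chunk_size: usize,
}

impl StreamingChunker {
    pub fn new(chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk_size must be non-zero");
        StreamingChunker { buf: Vec::new(), chunk_size }
    }

    /// Feeds newly received data and returns every chunk that is now complete.
    pub fn push(&mut self, data: &[u8]) -> Vec<Vec<u8>> {
        self.buf.extend_from_slice(data);
        let mut chunks = Vec::new();
        while let Some(cut) = next_cut(&self.buf, self.chunk_size) {
            // Keep everything after the cut point in the buffer.
            let rest = self.buf.split_off(cut);
            chunks.push(std::mem::replace(&mut self.buf, rest));
        }
        chunks
    }

    /// End of stream: whatever is still buffered becomes the last chunk.
    pub fn finish(self) -> Option<Vec<u8>> {
        if self.buf.is_empty() {
            None
        } else {
            Some(self.buf)
        }
    }
}
```

Only the not-yet-cut tail is ever held in memory, which is the property the commit is after.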
2023-03-10 | r/5925 feat(tvix/store): bump fastcdc, use v2020 version | Florian Klink | 1 | -1/+1

This switches away from the less canonical "ronomon" version to the implementation as described in the [paper](https://ieeexplore.ieee.org/document/9055082) by Wen Xia, et al., in 2020.

That version uses 64-bit hash values and tends to be faster than both the ronomon and v2016 versions, and produces the same chunking as the 2016 version.

As per https://docs.rs/fastcdc/latest/fastcdc/#implementations-1, it's the recommended choice.

The crate also gained support for streaming versions of the chunkers: https://docs.rs/fastcdc/latest/fastcdc/#large-data, which might be useful.

Change-Id: Ieabec3da54eb2b73c045cb54e51f7a216f63641e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8134
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2023-03-10 | r/5910 feat(tvix/store/blobservice): add GRPCBlobServiceWrapper | Florian Klink | 1 | -0/+231

This takes a BlobService and ChunkService in the constructor, and provides a [proto::blob_service_server::BlobService] trait for it.

Implementing proto::blob_service_server::BlobService is a lot of surface to cover, and providing this wrapper makes individual implementations, which only need to take care of how to store chunks or chunking information, much simpler.

Change-Id: Ia7b46484fb3ac9104354d496ff2922dca96ff7b9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8092
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI