depot - monorepo for the virus lounge

Age	Commit message	Author	Files	Lines
2023-05-23	r/6178 refactor(tvix/store/blobsvc): move from Vec<u8> to B3Digest	Florian Klink	1	-18/+6
	Change-Id: I809bab75221f81b6023cfe75c2fe9e589c1e9192 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8605 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-05-11	r/6133 refactor(tvix/store): remove ChunkService	Florian Klink	1	-158/+84
	Whether chunking is involved or not is an implementation detail of each Blobstore. Consumers of a whole blob shouldn't need to worry about that. It currently is not visible in the gRPC interface either. It shouldn't bleed into everything. Let the BlobService trait provide `open_read` and `open_write` methods, which return handles providing io::Read or io::Write, and leave the details up to the implementation. This means our custom BlobReader module can go away, and all the chunking bits in there, too. In the future, we might still want to add more chunking-aware syncing, but as a syncing strategy some stores can expose, not as a fundamental protocol component. This currently needs "SyncReadIntoAsyncRead", taken and vendored in from https://github.com/tokio-rs/tokio/pull/5669. It provides an AsyncRead for a sync Read, which is necessary to connect our (sync) BlobReader interface to a gRPC server implementation. As an alternative, we could also make the BlobReader itself async, and let consumers of the trait (EvalIO) deal with the async-ness, but this is less of a change for now. In terms of vendoring, I initially tried to move our tokio crate to these commits, but ended up in version incompatibilities, so let's vendor it in for now. Change-Id: I5969ebbc4c0e1ceece47981be3b9e7cfb3f59ad0 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8551 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
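The interface described in r/6133 might be sketched as follows. This is a minimal, std-only illustration, not the actual tvix-store code: the trait shape, the `BlobWriter::close` method, `MemoryBlobService`, and the `u64` stand-in digest (the real store uses 32-byte BLAKE3 digests) are all assumptions made for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;
use std::io::{self, Cursor, Read, Write};
use std::sync::{Arc, Mutex};

/// Stand-in digest type (assumption); the real store uses BLAKE3.
pub type Digest = u64;

/// Hypothetical trait shape: consumers only ever see readers and writers,
/// never chunks.
pub trait BlobService {
    type Reader: Read;
    type Writer: BlobWriter;

    /// Returns a reader for the blob, or None if it is not stored.
    fn open_read(&self, digest: &Digest) -> io::Result<Option<Self::Reader>>;
    /// Returns a writer for a new blob.
    fn open_write(&self) -> io::Result<Self::Writer>;
}

/// A writer that hands back the blob's digest once writing is done.
pub trait BlobWriter: Write {
    fn close(self) -> io::Result<Digest>;
}

/// In-memory implementation; how (or whether) it chunks is its own business.
#[derive(Clone, Default)]
pub struct MemoryBlobService {
    db: Arc<Mutex<HashMap<Digest, Vec<u8>>>>,
}

pub struct MemoryBlobWriter {
    db: Arc<Mutex<HashMap<Digest, Vec<u8>>>>,
    buf: Vec<u8>,
}

impl Write for MemoryBlobWriter {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        self.buf.extend_from_slice(data);
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

impl BlobWriter for MemoryBlobWriter {
    fn close(self) -> io::Result<Digest> {
        // Hash the complete contents and store them under that digest.
        let mut hasher = DefaultHasher::new();
        hasher.write(&self.buf);
        let digest = hasher.finish();
        self.db.lock().unwrap().insert(digest, self.buf);
        Ok(digest)
    }
}

impl BlobService for MemoryBlobService {
    type Reader = Cursor<Vec<u8>>;
    type Writer = MemoryBlobWriter;

    fn open_read(&self, digest: &Digest) -> io::Result<Option<Self::Reader>> {
        Ok(self.db.lock().unwrap().get(digest).cloned().map(Cursor::new))
    }

    fn open_write(&self) -> io::Result<Self::Writer> {
        Ok(MemoryBlobWriter { db: self.db.clone(), buf: Vec::new() })
    }
}
```

A caller would obtain a writer from `open_write`, feed it with `write_all`, call `close` to get the digest, and later pass that digest to `open_read`. A chunking backend could implement the same trait without any change on the consumer side.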
2023-03-27	r/6041 docs(tvix/store): fix typo in comment	Florian Klink	1	-1/+1
	We query the blob service for detailed blob info, not the chunk service. Change-Id: I85a6a57b1dae74a950f734be7d4455c5c35ae355 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8348 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: tazjin <tazjin@tvl.su>
2023-03-17	r/6018 chore(tvix/store/grpcblobsvc): clippy lint	Florian Klink	1	-1/+1
	Change-Id: Idb78e0417a962599cdfdef5e7346f7fa41e3fa1b Reviewed-on: https://cl.tvl.fyi/c/depot/+/8320 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-03-16	r/6015 refactor(tvix/store/chunksvc): use [u8; 32] instead of Vec<u8>	Florian Klink	1	-9/+16
	Change-Id: Ie2b94aa5d69ff2c61fb77e13ae844f81f6270273 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8314 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su> Autosubmit: flokli <flokli@flokli.de>
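The move from `Vec<u8>` to `[u8; 32]` in r/6015 can be illustrated with a small sketch. The `Digest32` newtype and its error type are hypothetical names chosen for this example; they are not claimed to match the types in the depot.

```rust
use std::convert::TryFrom;

pub const DIGEST_LEN: usize = 32;

/// Hypothetical fixed-size digest: the length is checked once, at
/// construction, instead of at every use site.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Digest32([u8; DIGEST_LEN]);

impl Digest32 {
    pub fn as_bytes(&self) -> &[u8; DIGEST_LEN] {
        &self.0
    }
}

impl TryFrom<Vec<u8>> for Digest32 {
    /// Length of the rejected input (assumption made for this sketch).
    type Error = usize;

    fn try_from(value: Vec<u8>) -> Result<Self, Self::Error> {
        let len = value.len();
        // std provides TryFrom<Vec<T>> for [T; N]; it fails on a length mismatch.
        <[u8; DIGEST_LEN]>::try_from(value)
            .map(Digest32)
            .map_err(|_| len)
    }
}
```

With a type like this, bytes arriving over the wire as `Vec<u8>` are validated once at the boundary, and every function that takes a `Digest32` can rely on the length.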
2023-03-13	r/5959 refactor(tvix/store): use read_all_and_chunk in gRPC blobservice	Florian Klink	1	-35/+38
	This was the last piece of code using BlobWriter. We can also use `read_all_and_chunk`, it just requires a bit more plumbing: - The data coming from the client (stream) needs to be mapped (we extract the .data field). - The stream needs to be turned into an (async) reader - The reader needs to be made sync, and that code using the sync reader needs to be in a `task::spawn_blocking`. Change-Id: I4e374e1a9f47d5a0933f59a8f5c121185a5f3e95 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8260 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2023-03-11	r/5952 refactor(tvix/store): factor out hash update into function	Florian Klink	1	-6/+6
	We're using this in a bunch of places. Let's move it into a helper function. Change-Id: I118fba35f6d343704520ba37280e4ca52a61da44 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8251 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-03-10	r/5933 fix(tvix/store/proto/grpc_blobservice_wrapper): buffer recv data	Florian Klink	1	-7/+14
	While we don't want to keep all of the data in memory, we want to feed a reasonably large buffer to the chunking function, to prevent unnecessarily trying to chunk over and over again. Change-Id: I5bbe2d55e8c1c63f8f7ce343889d374b528b559e Reviewed-on: https://cl.tvl.fyi/c/depot/+/8160 Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
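The buffering idea in r/5933 might look roughly like this. `BufferedFeeder` and its `min_fill` threshold are hypothetical names for this sketch; the actual wrapper code and its buffer size are not reproduced here.

```rust
/// Collects small received pieces and only hands data on to the (expensive)
/// chunking step once at least `min_fill` bytes have accumulated.
pub struct BufferedFeeder {
    buf: Vec<u8>,
    min_fill: usize,
}

impl BufferedFeeder {
    pub fn new(min_fill: usize) -> Self {
        BufferedFeeder { buf: Vec::new(), min_fill }
    }

    /// Appends `data`. Returns the accumulated bytes once the threshold is
    /// reached, otherwise None (meaning: do not bother chunking yet).
    pub fn push(&mut self, data: &[u8]) -> Option<Vec<u8>> {
        self.buf.extend_from_slice(data);
        if self.buf.len() >= self.min_fill {
            Some(std::mem::take(&mut self.buf))
        } else {
            None
        }
    }

    /// Returns whatever is left once the stream has ended.
    pub fn finish(self) -> Vec<u8> {
        self.buf
    }
}
```

Memory use stays bounded by roughly `min_fill` plus one received piece, while the chunking function is called once per filled buffer rather than once per received message.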
2023-03-10	r/5927 refactor(tvix/store): move blob splitting into a BlobWriter struct	Florian Klink	1	-76/+11
	This moves the chunking-as-we-receive logic that so far only lived in grpc_blobservice_wrapper.rs into a generic BlobWriter. Change-Id: Ief7d1bda3c6280129f7139de3f6c4174be2ca6ea Reviewed-on: https://cl.tvl.fyi/c/depot/+/8154 Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-03-10	r/5926 feat(tvix/store): do not buffer blob data	Florian Klink	1	-73/+106
	Use the FastCDC::cut function to ask fastcdc for cutting points as we receive the data. Make sure to keep the last chunk in the temporary buffer, as we might not actually cut at the end. Also, use rayon to calculate the blake3 hash if the input data is > 128KiB. Change-Id: I6195f3b74eac5516965cb12d8d026aa720c8b891 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8135 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
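Why the last chunk has to stay in the buffer can be shown with a std-only sketch. Note that `toy_cut` is a deliberately simple stand-in invented for this example (it cuts after a newline or at a maximum size); it is not FastCDC and does not use the fastcdc crate's API. `StreamChunker` is likewise a hypothetical name.

```rust
/// Toy stand-in for a content-defined cut function (NOT FastCDC).
/// Returns the length of the first chunk in `data`: it cuts after the first
/// newline at or beyond MIN bytes, else at MAX bytes, else at the end of `data`.
fn toy_cut(data: &[u8]) -> usize {
    const MIN: usize = 4;
    const MAX: usize = 16;
    let limit = data.len().min(MAX);
    for i in (MIN - 1)..limit {
        if data[i] == b'\n' {
            return i + 1;
        }
    }
    limit
}

/// Chunks data as it arrives, without holding the whole blob in memory.
#[derive(Default)]
pub struct StreamChunker {
    buf: Vec<u8>,
}

impl StreamChunker {
    /// Appends `data` and returns every chunk whose end is already certain.
    pub fn push(&mut self, data: &[u8]) -> Vec<Vec<u8>> {
        self.buf.extend_from_slice(data);
        let mut chunks = Vec::new();
        let mut start = 0;
        loop {
            let len = toy_cut(&self.buf[start..]);
            // A cut that coincides with the end of the buffer may only exist
            // because we ran out of data, so that last chunk is held back.
            if start + len >= self.buf.len() {
                break;
            }
            chunks.push(self.buf[start..start + len].to_vec());
            start += len;
        }
        self.buf.drain(..start);
        chunks
    }

    /// Called at end of input: whatever is left is the final chunk.
    pub fn finish(self) -> Option<Vec<u8>> {
        if self.buf.is_empty() {
            None
        } else {
            Some(self.buf)
        }
    }
}
```

Because the held-back tail is re-examined on the next `push`, the chunk boundaries do not depend on how the input happened to be split into received pieces.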
2023-03-10	r/5925 feat(tvix/store): bump fastcdc, use v2020 version	Florian Klink	1	-1/+1
	This switches away from the less canonical "ronomon" version to the implementation as described in the [paper](https://ieeexplore.ieee.org/document/9055082) by Wen Xia, et al., in 2020. That version uses 64-bit hash values and tends to be faster than both the ronomon and v2016 versions, and produces the same chunking as the 2016 version. As per https://docs.rs/fastcdc/latest/fastcdc/#implementations-1, it's the recommended choice. The crate also gained support for streaming versions of the chunkers: https://docs.rs/fastcdc/latest/fastcdc/#large-data, which might be useful. Change-Id: Ieabec3da54eb2b73c045cb54e51f7a216f63641e Reviewed-on: https://cl.tvl.fyi/c/depot/+/8134 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
2023-03-10	r/5910 feat(tvix/store/blobservice): add GRPCBlobServiceWrapper	Florian Klink	1	-0/+231
	This takes a BlobService and ChunkService in the constructor, and provides a [proto::blob_service_server::BlobService] trait for it. Implementing proto::blob_service_server::BlobService is a lot of surface to cover, and providing this wrapper will make individual implementations taking care of how to store chunks or chunking information much simpler. Change-Id: Ia7b46484fb3ac9104354d496ff2922dca96ff7b9 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8092 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI