Age | Commit message (Collapse) | Author | Files | Lines |
|
This splits the pure content-addressed layers from tvix-store into a
`castore` crate, and only leaves PathInfo related things, as well as the
CLI entrypoint in the tvix-store crate.
Notable changes:
- `fixtures` and `utils` had to be moved out of the `test` cfg, so they
can be imported from tvix-store.
- Some ad-hoc fixtures in the test were moved to proper fixtures in the
same step.
- The protos are now created by a (more static) recipe in the protos/
directory.
The (now two) golang targets are commented out, as it's not possible to
update them properly in the same CL. This will be done by a followup CL
once this is merged (and whitby deployed)
Bug: https://b.tvl.fyi/issues/301
Change-Id: I8d675d4bf1fb697eb7d479747c1b1e3635718107
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9370
Reviewed-by: tazjin <tazjin@tvl.su>
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
|
|
This wasn't removed yet, and no code is using/populating it so far.
It's confusing, let's update it to the state of things now, and re-
introduce it once we get there.
Change-Id: I68f5ba17a8eee604d8ccd82749da7c8be094cb99
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9351
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
|
|
We previously kept the trait of a BlobService sync.
This however had some annoying consequences:
- It became more and more complicated to track when we're in a context
with an async runtime in the context or not, producing bugs like
https://b.tvl.fyi/issues/304
- The sync trait shielded away async clients from async worloads,
requiring manual block_on code inside the gRPC client code, and
spawn_blocking calls in consumers of the trait, even if they were
async (like the gRPC server)
- We had to write our own custom glue code (SyncReadIntoAsyncRead)
to convert a sync io::Read into a tokio::io::AsyncRead, which already
existed in tokio internally, but upstream ia hesitant to expose.
This now makes the BlobService trait async (via the async_trait macro,
like we already do in various gRPC parts), and replaces the sync readers
and writers with their async counterparts.
Tests interacting with a BlobService now need to have an async runtime
available, the easiest way for this is to mark the test functions
with the tokio::test macro, allowing us to directly .await in the test
function.
In places where we don't have an async runtime available from context
(like tvix-cli), we can pass one down explicitly.
Now that we don't provide a sync interface anymore, the (sync) FUSE
library now holds a pointer to a tokio runtime handle, and needs to at
least have 2 threads available when talking to a blob service (which is
why some of the tests now use the multi_thread flavor).
The FUSE tests got a bit more verbose, as we couldn't use the
setup_and_mount function accepting a callback anymore. We can hopefully
move some of the test fixture setup to rstest in the future to make this
less repetitive.
Co-Authored-By: Connor Brewster <cbrewster@hey.com>
Change-Id: Ia0501b606e32c852d0108de9c9016b21c94a3c05
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9329
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
|
|
Makes use of https://github.com/tokio-rs/prost/pull/341, which makes our
bytes field cheaper to clone.
It's a bit annoying to configure due to
https://github.com/hyperium/tonic/issues/908, but the workaround does
get the job done.
Change-Id: I25714600b041bb5432d3adf5859b151e72b12778
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8975
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
|
|
This will save us some copies, because a clone will simply create an
additional pointer to the same data.
Change-Id: I017a5d6b4c85a861b5541ebad2858ad4fbf8e8fa
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8978
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
|
|
We never returned Err here anyways, and we can still return an error
during the first (or subsequent) write(s).
Change-Id: I4b4cd3d35f6ea008e9ffe2f7b71bfc9187309e2f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8750
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
|
|
From 64 bytes to 100 KBytes.
We need to provide a custom wrapper with a different Default instance.
Change-Id: Id7c6c437b8183b355a9e388f98cef1622b363f64
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8748
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
|
|
This allows us to blob services without closing them before putting them
in a box.
We currently need to use Arc<_>, not Rc<_>, because the GRPC wrappers
require Sync.
Change-Id: I679c5f06b62304f5b0456cfefe25a0a881de7c84
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8738
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
|
|
To construct various stores at runtime, we need to eliminate associated
types from the BlobService trait, and return Box<dyn …> instead of
specific types.
This also means we can't consume self in the close() method, so
everything we write to is put in an Option<>, and during the first close
we take from there.
Change-Id: Ia523b6ab2f2a5276f51cb5d17e81a5925bce69b6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8647
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
|
|
Change-Id: I809bab75221f81b6023cfe75c2fe9e589c1e9192
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8605
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
|
|
Whether chunking is involved or not, is an implementation detail of each
Blobstore. Consumers of a whole blob shouldn't need to worry about that.
It currently is not visible in the gRPC interface either. It
shouldn't bleed into everything.
Let the BlobService trait provide `open_read` and `open_write` methods,
which return handles providing io::Read or io::Write, and leave the
details up to the implementation.
This means, our custom BlobReader module can go away, and all the
chunking bits in there, too.
In the future, we might still want to add more chunking-aware syncing,
but as a syncing strategy some stores can expose, not as a fundamental
protocol component.
This currently needs "SyncReadIntoAsyncRead", taken and vendored in from
https://github.com/tokio-rs/tokio/pull/5669.
It provides a AsyncRead for a sync Read, which is necessary to connect
our (sync) BlobReader interface to a GRPC server implementation.
As an alternative, we could also make the BlobReader itself async, and
let consumers of the trait (EvalIO) deal with the async-ness, but this
is less of a change for now.
In terms of vendoring, I initially tried to move our tokio crate to
these commits, but ended up in version incompatibilities, so let's
vendor it in for now.
Change-Id: I5969ebbc4c0e1ceece47981be3b9e7cfb3f59ad0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8551
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
|
|
We query the blob service for detailled blob info, not the chunk
service.
Change-Id: I85a6a57b1dae74a950f734be7d4455c5c35ae355
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8348
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
|
|
Change-Id: Idb78e0417a962599cdfdef5e7346f7fa41e3fa1b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8320
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
|
|
Change-Id: Ie2b94aa5d69ff2c61fb77e13ae844f81f6270273
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8314
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
Autosubmit: flokli <flokli@flokli.de>
|
|
This was the last piece of code using BlobWriter.
We can also use `read_all_and_chunk`, it's just requires a bit more
plumbing:
- The data coming from the client (stream) needs to be mapped (we
extract the .data field).
- The stream needs to be turned into an (async) reader
- The reader needs to be made sync, and that code using the sync reader
needs to be in a `task::spawn_blocking`.
Change-Id: I4e374e1a9f47d5a0933f59a8f5c121185a5f3e95
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8260
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
|
|
We're using this in a bunch of places. Let's move it into a helper
function.
Change-Id: I118fba35f6d343704520ba37280e4ca52a61da44
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8251
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
|
|
While we don't want to keep all of the data in memory, we want to
feed a reasonably-enough buffer to the chunking function, to prevent
unnecessarily trying to chunk over and over again.
Change-Id: I5bbe2d55e8c1c63f8f7ce343889d374b528b559e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8160
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
|
|
This will moves the chunking-as-we-receive logic that so far only lived
in grpc_blobservice_wrapper.rs into a generic BlobWriter.
Change-Id: Ief7d1bda3c6280129f7139de3f6c4174be2ca6ea
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8154
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
|
|
Use the FastCDC::cut function to ask fastcd for cutting points as we
receive the data. Make sure to keep the last chunk in the temporary
buffer, as we might not actually cut at the end.
Also, use rayon to calculate the blake3 hash if the input data is
> 128KiB.
Change-Id: I6195f3b74eac5516965cb12d8d026aa720c8b891
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8135
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
|
|
This switches away from the less canonical "ronomon" version to the
implementation as described in the
[paper](https://ieeexplore.ieee.org/document/9055082) by Wen Xia, et
al., in 2020.
That version uses 64-bit hash values and tends to be faster than both
the ronomon and v2016 versions, and produces the same chunking as the
2016 version.
As per https://docs.rs/fastcdc/latest/fastcdc/#implementations-1, it's
the recommended choice.
The crate also gained support for streaming version of chunkers:
https://docs.rs/fastcdc/latest/fastcdc/#large-data, which might be
useful.
Change-Id: Ieabec3da54eb2b73c045cb54e51f7a216f63641e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8134
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
|
|
This takes a BlobService and ChunkService in the constructor, and
provides a [proto::blob_service_server::BlobService] trait for it.
Implementing proto::blob_service_server::BlobService is a lot of surface
to cover, and providing this wrapper will make individual
implementations taking care of how to store chunks or chunking
information much simpler.
Change-Id: Ia7b46484fb3ac9104354d496ff2922dca96ff7b9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8092
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
|