-rw-r--r-- | tvix/docs/src/TODO.md | 17 |
1 files changed, 16 insertions, 1 deletions
diff --git a/tvix/docs/src/TODO.md b/tvix/docs/src/TODO.md
index 97f5d8b7d67d..f07bfa122a6d 100644
--- a/tvix/docs/src/TODO.md
+++ b/tvix/docs/src/TODO.md
@@ -178,7 +178,22 @@ logs etc, but this is something requiring a lot of designing.
 ### BlobService
 - On the trait side, currently there's no way to distinguish reading a
   known-chunk vs blob, so we might be calling `.chunks()` unnecessarily often.
-  At least for the `object_store` backend, this might be a problem.
+  At least for the `object_store` backend, this might be a problem, causing a
+  lot of round-trips. It also doesn't compose well - every implementation of
+  `BlobService` needs to both solve the "holding metadata about chunking info"
+  as well as "storing chunks" questions.
+  Design idea (@flokli): split these two concerns into two separate traits:
+  - a `ChunkService` dealing with retrieving individual chunks, by their
+    content digests. Chunks are small enough to keep around in contiguous
+    memory.
+  - a `BlobService` storing metadata about blobs.
+
+  Individual stores would not need to implement `BlobReader` anymore, but that
+  could be a global thing with access to the whole store composition layer,
+  which should make it easier to reuse chunks from other backends. Unclear
+  if the write path should be structured the same way. At least for some
+  backends, we want the remote end to be able to decide about chunking.
+
 - While `object_store` recently got support for `Content-Type`
   (https://github.com/apache/arrow-rs/pull/5650), there's no support on the
   local filesystem yet. We'd need to add support to this (through xattrs).
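
Note on the design idea in this change: the proposed split could look roughly like the sketch below. This is only an illustration under assumptions, not the actual tvix API. The method names (`get_chunk`, `put_chunk`, `put_blob_meta`), the in-memory backends, the free function `read_blob` and the `Digest` alias are all hypothetical, the traits are synchronous for brevity, and digests are opaque placeholders rather than computed content hashes. Only the trait names `ChunkService` and `BlobService` and the `.chunks()` call come from the TODO text.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a content digest; not the real tvix digest type.
type Digest = [u8; 32];

/// Retrieves and stores individual chunks by their content digest.
/// Chunks are assumed small enough to hold in contiguous memory.
trait ChunkService {
    fn get_chunk(&self, digest: &Digest) -> Option<Vec<u8>>;
    fn put_chunk(&mut self, digest: Digest, data: Vec<u8>);
}

/// Stores only metadata about blobs: which chunks make up a blob, in order.
trait BlobService {
    fn chunks(&self, blob_digest: &Digest) -> Option<Vec<Digest>>;
    fn put_blob_meta(&mut self, blob_digest: Digest, chunks: Vec<Digest>);
}

/// Hypothetical in-memory chunk backend.
#[derive(Default)]
struct MemoryChunkService {
    chunks: HashMap<Digest, Vec<u8>>,
}

impl ChunkService for MemoryChunkService {
    fn get_chunk(&self, digest: &Digest) -> Option<Vec<u8>> {
        self.chunks.get(digest).cloned()
    }
    fn put_chunk(&mut self, digest: Digest, data: Vec<u8>) {
        self.chunks.insert(digest, data);
    }
}

/// Hypothetical in-memory blob metadata backend.
#[derive(Default)]
struct MemoryBlobService {
    blobs: HashMap<Digest, Vec<Digest>>,
}

impl BlobService for MemoryBlobService {
    fn chunks(&self, blob_digest: &Digest) -> Option<Vec<Digest>> {
        self.blobs.get(blob_digest).cloned()
    }
    fn put_blob_meta(&mut self, blob_digest: Digest, chunks: Vec<Digest>) {
        self.blobs.insert(blob_digest, chunks);
    }
}

/// A "global" read path: it is not implemented per store, it only needs
/// some `BlobService` for the chunk list and some `ChunkService` for data.
/// Returns `None` if the blob or any of its chunks is unknown.
fn read_blob(
    blobs: &dyn BlobService,
    chunks: &dyn ChunkService,
    blob_digest: &Digest,
) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for chunk_digest in blobs.chunks(blob_digest)? {
        out.extend(chunks.get_chunk(&chunk_digest)?);
    }
    Some(out)
}
```

Because `read_blob` takes the two services independently, the chunk list could come from one backend and the chunk data from another, which is the chunk-reuse property the TODO entry hopes for. The write path is deliberately left out, since the entry itself leaves its structure open.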