author Florian Klink <flokli@flokli.de> 2024-06-05T20·38+0300
committer clbot <clbot@tvl.fyi> 2024-06-11T11·59+0000
commit 154e0d71e0712d2e354c6f795f71b25bf0949a72
tree 7870d4ed2b6c63770c9c74e6c88540f91b4274bb
parent c4d4cce6579fb5790e974161d664bac061aeef98
docs(tvix/docs/TODO): document ChunkService split idea r/8246
Change-Id: Ie9c88b0d14902c642e2d3d6603265688eef0e10d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11755
Reviewed-by: yuka <yuka@yuka.dev>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
-rw-r--r-- tvix/docs/src/TODO.md 17
1 files changed, 16 insertions, 1 deletions
diff --git a/tvix/docs/src/TODO.md b/tvix/docs/src/TODO.md
index 97f5d8b7d67d..f07bfa122a6d 100644
--- a/tvix/docs/src/TODO.md
+++ b/tvix/docs/src/TODO.md
@@ -178,7 +178,22 @@ logs etc, but this is something requiring a lot of designing.
 ### BlobService
  - On the trait side, currently there's no way to distinguish reading a
    known-chunk vs blob, so we might be calling `.chunks()` unnecessarily often.
-   At least for the `object_store` backend, this might be a problem.
+   At least for the `object_store` backend, this might be a problem, causing a
+   lot of round-trips. It also doesn't compose well - every implementation of
+   `BlobService` needs to both solve the "holding metadata about chunking info"
+   as well as "storing chunks" questions.
+   Design idea (@flokli): split these two concerns into two separate traits:
+    - a `ChunkService` dealing with retrieving individual chunks, by their
+      content digests. Chunks are small enough to keep around in contiguous
+      memory.
+    - a `BlobService` storing metadata about blobs.
+
+   Individual stores would not need to implement `BlobReader` anymore, but that
+   could be a global thing with access to the whole store composition layer,
+   which should make it easier to reuse chunks from other backends. Unclear
+   if the write path should be structured the same way. At least for some
+   backends, we want the remote end to be able to decide about chunking.
+
  - While `object_store` recently got support for `Content-Type`
    (https://github.com/apache/arrow-rs/pull/5650), there's no support on the
    local filesystem yet. We'd need to add support to this (through xattrs).
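
The split proposed in the added TODO text could be sketched roughly as follows. This is only an illustration, not the Tvix API: the trait methods (`get_chunk`, `chunks`), the in-memory stores, the free-standing `read_blob` function and the `Digest` alias are all hypothetical names, the code is synchronous for brevity, and digests are hand-picked byte arrays rather than real content hashes. It only covers the read path, since the TODO leaves the write path open.

```rust
use std::collections::HashMap;

/// Stand-in for a content digest (assumption: a real store would hash the data).
type Digest = [u8; 32];

/// Hypothetical trait: retrieves individual chunks by their content digest.
/// Chunks are assumed small enough to be returned as one contiguous buffer.
trait ChunkService {
    fn get_chunk(&self, digest: &Digest) -> Option<Vec<u8>>;
}

/// Hypothetical trait: stores only metadata, i.e. which chunks make up a blob.
trait BlobService {
    fn chunks(&self, blob_digest: &Digest) -> Option<Vec<Digest>>;
}

/// Illustrative in-memory chunk store.
#[derive(Default)]
struct MemoryChunkService {
    chunks: HashMap<Digest, Vec<u8>>,
}

impl ChunkService for MemoryChunkService {
    fn get_chunk(&self, digest: &Digest) -> Option<Vec<u8>> {
        self.chunks.get(digest).cloned()
    }
}

/// Illustrative in-memory blob metadata store.
#[derive(Default)]
struct MemoryBlobService {
    blobs: HashMap<Digest, Vec<Digest>>,
}

impl BlobService for MemoryBlobService {
    fn chunks(&self, blob_digest: &Digest) -> Option<Vec<Digest>> {
        self.blobs.get(blob_digest).cloned()
    }
}

/// A "global" reader that is not tied to any single store: it looks up the
/// chunk list once, then asks each chunk service in turn for every chunk.
/// Returns None if the blob is unknown or any chunk cannot be found.
fn read_blob(
    blobs: &dyn BlobService,
    chunk_services: &[&dyn ChunkService],
    blob_digest: &Digest,
) -> Option<Vec<u8>> {
    let mut data = Vec::new();
    for chunk_digest in blobs.chunks(blob_digest)? {
        let chunk = chunk_services
            .iter()
            .find_map(|svc| svc.get_chunk(&chunk_digest))?;
        data.extend_from_slice(&chunk);
    }
    Some(data)
}

/// Usage: one blob whose two chunks live in two different chunk stores.
fn demo() -> Option<Vec<u8>> {
    let (c1, c2, blob) = ([1u8; 32], [2u8; 32], [9u8; 32]);
    let mut a = MemoryChunkService::default();
    let mut b = MemoryChunkService::default();
    a.chunks.insert(c1, b"hello ".to_vec());
    b.chunks.insert(c2, b"world".to_vec());
    let mut blobs = MemoryBlobService::default();
    blobs.blobs.insert(blob, vec![c1, c2]);
    read_blob(&blobs, &[&a, &b], &blob)
}
```

In this sketch the reader never calls back into a per-store `BlobReader`, which is the point of the proposal: chunks from any backend in the composition can satisfy a read, and the metadata lookup happens once per blob.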