36 files changed, 5284 insertions, 0 deletions
diff --git a/tvix/docs/src/SUMMARY.md b/tvix/docs/src/SUMMARY.md
new file mode 100644
index 000000000000..cce6d8966df6
--- /dev/null
+++ b/tvix/docs/src/SUMMARY.md
@@ -0,0 +1,49 @@
+# Summary
+
+# Welcome
+* [Introduction](./introduction.md)
+* [Community](./community.md)
+* [Getting Started](./getting-started.md)
+
+# Contributing
+* [Gerrit](./contributing/gerrit.md)
+* [Email](./contributing/email.md)
+* [Code & Commits](./contributing/code-&-commits.md)
+
+# Tvix
+- [Architecture & data flow](./architecture.md)
+- [TODOs](./TODO.md)
+
+# Evaluator
+- [Compilation of Bindings](./eval/bindings.md)
+- [Builtins](./eval/builtins.md)
+- [Build References](./eval/build-references.md)
+- [Catchable Errors](./eval/catchable-errors.md)
+- [Known Optimisation Potential](./eval/known-optimisation-potential.md)
+- [Langugage Issues](./eval/language-issues.md)
+- [Attrset Opcodes](./eval/opcodes-attrsets.md)
+- [Recursive attribute sets](./eval/recursive-attrs.md)
+- [VM Loop](./eval/vm-loop.md)
+- [Abandoned](./eval/abandoned/index.md)
+  - [Thread-local VM](./eval/abandoned/thread-local-vm.md)
+
+# Store
+- [Store API](./store/api.md)
+- [BlobStore Chunking](./castore/blobstore-chunking.md)
+- [BlobStore Protocol](./castore/blobstore-protocol.md)
+- [Data Model](./castore/data-model.md)
+- [Why not git trees?](./castore/why-not-git-trees.md)
+
+# Builder
+- [Build API](./build/index.md)
+
+# Nix
+- [Specification of the Nix Language](./language-spec.md)
+- [Nix language version history](./lang-version.md)
+- [Value Pointer Equality](./value-pointer-equality.md)
+- [Daemon Protocol](./nix-daemon/index.md)
+  - [Handshake](./nix-daemon/handshake.md)
+  - [Logging](./nix-daemon/logging.md)
+  - [Operations](./nix-daemon/operations.md)
+  - [Serialization](./nix-daemon/serialization.md)
+  - [Changelog](./nix-daemon/changelog.md)
diff --git a/tvix/docs/src/TODO.md b/tvix/docs/src/TODO.md
new file mode 100644
index 000000000000..af558c9580fe
--- /dev/null
+++ b/tvix/docs/src/TODO.md
@@ -0,0 +1,227 @@
+# TODO
+
+This contains a rough collection of ideas on the TODO list, trying to keep track
+of it somewhere.
+
+Of course, there's no guarantee these things will get addressed, but it helps
+dumping the backlog somewhere.
+
+Feel free to add new ideas. Before picking something, ask in `#tvix-dev` to make
+sure noone is working on this, or has some specific design in mind already.
+
+## Cleanups
+### Nix language test suite
+ - Think about how to merge, but "categorize" `tvix_tests` in `glue` and `eval`.
+   We currently only have this split as they need a different feature set /
+   builtins.
+ - move some of the rstest cases in `tvix-glue` to the `.nix`/`.exp` mechanism.
+   Some of them need test fixtures, which cannot be represented in git (special
+   file types in the import tests for example). Needs some support from the test
+   suite to create these fixtures on demand.
+ - extend `verify-lang-tests/default.nix` mechanism to validate `tvix-eval` and
+   `tvix-glue` test cases (or the common structure above).
+ - absorb `eval/tests/nix_oracle.rs` into `tvix_tests`, or figure out why it's
+   not possible (and document) it. It looks like it's only as nix is invoked
+   with a different level of `--strict`, but the toplevel doc-comment suggests
+   its generic?
+
+### Correctness > Performance
+A lot of the Nix behaviour isn't well documented out, and before going too deep
+into performance optimizations, we need to ensure we properly grasped all hidden
+features. This is to avoid doing a lot of "overall architecture perf-related
+work" and increased code complexity based on a mental model that might get
+disproved later on, as we work towards correctness.
+
+We do this by evaluating more and more parts of the official Nix test suite, as
+well as our own Tvix test suite, and compare it with Nix' output.
+
+Additionally, we evaluate attributes from nixpkgs, compare calculated output
+paths (to determine equivalence of evaluated A-Terms) and fix differences as we
+encounter them.
+
+This currently is a very manual and time-consuming process, both in terms of
+setup, as well as spotting the source of the differences (and "compensating" for
+the resulting diff noise on resulting mismtaches).
+
+ - We could use some better tooling that periodically evaluates nixpkgs, and
+   compares the output paths with the ones produced by Nix
+ - We could use some better tooling that can spot the (real) differences between
+   two (graphs of) derivations, while removing all resulting noise from the diff
+in resulting store paths.
+
+
+### Performance
+Even while keeping in mind some of the above caveats, there's some obvious
+low-langing fruits that could have a good impact on performance, with somewhat
+limited risk of becoming obsolete in case of behaviorial changes due to
+correctness:
+
+ - String Contexts currently do a lot of indirections (edef)
+   (NixString -> NixStringInner -> HashSet[element] -> NixContextElement -> String -> data)
+   to get to the actual data. We should improve this. There's various ideas, one
+   of it is globally interning all Nix context elements, and only keeping
+   indices into that. We might need to have different representations for small
+   amount of context elements or larger ones, and need tooling to reason about
+   the amount of contexts we have.
+ - To calculate NAR size and digest (used for output path calculation of FODs),
+   our current `SimpleRenderer` `NarCalculationService` sequentially asks for
+   one blob after another (and internally these might consists out of multiple
+   chunks too).
+   That's a lot of roundtrips, adding up to a lot of useless waiting.
+   While we cannot avoid having to feed all bytes sequentially through sha256,
+   we already know what blobs to fetch and in which order.
+   There should be a way to buffer some "amount of upcoming bytes" in memory,
+   and not requesting these seqentially.
+   This is somewhat the "spiritual counterpart" to our sequential ingestion
+   code (`ConcurrentBlobUploader`, used by `ingest_nar`), which keeps
+   "some amount of outgoing bytes" in memory.
+   This is somewhat blocked until the {Chunk/Blob}Service split is done, as then
+   prefetching would only be a matter of adding it into the one `BlobReader`.
+
+### Error cleanup
+ - Currently, all services use tvix_castore::Error, which only has two kinds
+   (invalid request, storage error), containing an (owned) string.
+   This is quite primitive. We should have individual error types for BS, DS, PS.
+   Maybe these should have some generics to still be able to carry errors from
+   the underlying backend, similar to `IngestionError`.
+   There was an attempt to give PS separate error types (cl/11695), but this
+   ended up very verbose.
+   Every error had to be boxed, and a possible additional message be added. Some
+   errors that didn't wrap another underlying errors were hard to construct, too
+   (requiring the addition of errors). All of this without even having added
+   proper backtrace support, which would be quite helpful in store hierarchies.
+   `anyhow`'s `.context()` gives us most of this out of the box. Maybe we can
+   use that, using enums rather than `&'static str` as context in some cases?
+
+## Fixes towards correctness
+ - `rnix` only supports string source files, but `NixString` uses bytes (and Nix
+   source code might be no valid UTF-8).
+
+## Documentation
+Extend the other pages in here. Some ideas on what should be tackled:
+ - Document what Tvix is, and what it is not yet. What it is now, what it is not
+   (yet), explaining some of the architectural choices (castore, more hermetic
+   `Build` repr), while still being compatible. Explain how it's possible to
+   plug in other frontends, and use `tvix-{[ca]store,build}` without Nixlang even.
+   And how `nix-compat` is a useful crate for all sorts of formats and data
+   types of Nix.
+ - Update the Architecture diagram to model the current state of things.
+   There's no gRPC between Coordinator and Evaluator.
+ - Add a dedicated section/page explaining the separation between tvix-glue and
+   tvix-eval, and how more annoying builtins get injected into tvix-eval through
+   tvix-glue.
+   Maybe restructure to only explain the component structure potentially
+   crossing process boundaries (those with gRPC), and make the rest more crate
+   and trait-focused?
+ - Restructure docs on castore vs store, this seems to be duplicated a bit and
+   is probably still not too clear.
+ - Describe store composition(s) in more detail. There's some notes on granular
+   fetching which probably can be repurposed.
+ - Absorb the rest of //tvix/website into this.
+
+## Features
+
+### Fetchers
+Some more fetcher-related builtins need work:
+ - `fetchGit`
+ - `fetchTree` (hairy, seems there's no proper spec and the URL syntax seems
+   subject to change/underdocumented)
+
+### Derivation -> Build
+While we have some support for `structuredAttrs` and `fetchClosure` (at least
+enough to calculate output hashes, aka produce identical ATerm), the code
+populating the `Build` struct doesn't exist it yet.
+
+Similarly, we also don't properly populate the build environment for
+`fetchClosure` yet. (Note there already is `ExportedPathInfo`, so once
+`structuredAttrs` is there this should be easy.
+
+### Builders
+Once builds are proven to work with real-world builds, and the corner cases
+there are ruled out, adding other types of builders might be interesting.
+
+ - bwrap
+ - gVisor
+ - Cloud Hypervisor (using similar technique as `//tvix//boot`).
+
+Long-term, we want to extend traits and gRPC protocol.
+This requires some more designing. Some goals:
+
+ - use stricter castore types (and maybe stricter build types) instead of
+   proto types, add conversion code where necessary
+ - (more granular) control while a build is happening
+ - expose more telemetry and logs
+
+
+### Store composition
+ - Combinators: list-by-priority, first-come-first-serve, cache
+ - Store composition hierarchies (@yuka).
+   - URL format too one-dimensional.
+   - We want to have nice and simple user-facing substituter config, including
+     sensible default wrappers for caching, retries, fallbacks, as well as
+     granular control for power-users.
+   - Current design idea:
+     - Have a concept similar to rclone config (map with store aliases as
+       keys, allowing to refer to stores by their alias from other parts of
+       the config).
+       It allows both referring to by name, as well as ad-hoc definition:
+       https://rclone.org/docs/#syntax-of-remote-paths
+     - Each store needs to be aware of its "instance name", so it can be
+       included in logs, metrics, …
+     - Have a "instantiation function" traversing such a config data structure,
+       creating store instances and plugging them together, ultimately returning
+       a dyn …Service interface.
+     - No reconfiguration/reconcilation for now
+     - Making URLs the primary data format would get ugly quite easily (hello
+       multiple layers of escaping!), so best to convert the existing URL
+       syntax to our new config format on the fly and then use one codepath
+       to instantiate/assemble. Similarly, something like the "user-facing
+       substituter config" mentioned above could aalso be converted to such a
+       config format under the hood.
+     - Maybe add a ?cache=$other_url parameter support to the URL syntax, to
+       easily wrap a store with a caching frontend, using $other_url as the
+      "near" store URL.
+
+### Store Config
+   There's already serde for some store options (bigtable uses `serde_qs`).
+   We might also have common options global over all backends, like chunking
+   parameters for chunking blobservices. Think where this would fit in.
+ - Rework the URL syntax for object_store. We should support the default s3/gcs
+   URLs at least.
+
+### BlobService
+ - On the trait side, currently there's no way to distinguish reading a
+   known-chunk vs blob, so we might be calling `.chunks()` unnecessarily often.
+   At least for the `object_store` backend, this might be a problem, causing a
+   lot of round-trips. It also doesn't compose well - every implementation of
+   `BlobService` needs to both solve the "holding metadata about chunking info"
+   as well as "storing chunks" questions.
+   Design idea (@flokli): split these two concerns into two separate traits:
+    - a `ChunkService` dealing with retrieving individual chunks, by their
+      content digests. Chunks are small enough to keep around in contiguous
+      memory.
+    - a `BlobService` storing metadata about blobs.
+
+   Individual stores would not need to implement `BlobReader` anymore, but that
+   could be a global thing with access to the whole store composition layer,
+   which should make it easier to reuse chunks from other backends. Unclear
+   if the write path should be structured the same way. At least for some
+   backends, we want the remote end to be able to decide about chunking.
+
+ - While `object_store` recently got support for `Content-Type`
+   (https://github.com/apache/arrow-rs/pull/5650), there's no support on the
+   local filesystem yet. We'd need to add support to this (through xattrs).
+
+### PathInfoService
+ - sqlite backend (different schema than the Nix one, we need the root nodes data!)
+
+### Nix Daemon protocol
+- Some work ongoing on the worker operation parsing (griff, picnoir)
+
+### O11Y
+ - Maybe drop `--log-level` entirely, and only use `RUST_LOG` env exclusively?
+   `debug`,`trace` level across all crates is a bit useless, and `RUST_LOG` can
+   be much more granular…
+ - Trace propagation for object_store once they support a way to register a
+   middleware, so we can use that to register a tracing middleware.
+   https://github.com/apache/arrow-rs/issues/5990
diff --git a/tvix/docs/src/architecture.md b/tvix/docs/src/architecture.md
new file mode 100644
index 000000000000..02ffdfdcd2b0
--- /dev/null
+++ b/tvix/docs/src/architecture.md
@@ -0,0 +1,156 @@
+# Tvix - Architecture & data flow
+
+## Background
+
+We intend for Tvix tooling to be more decoupled than the existing,
+monolithic Nix implementation. In practice, we expect to gain several
+benefits from this, such as:
+
+- Ability to use different builders
+- Ability to use different store implementations
+- No monopolisation of the implementation, allowing users to replace
+  components that they are unhappy with (up to and including the
+  language evaluator)
+- Less hidden intra-dependencies between tools due to explicit RPC/IPC
+  boundaries
+
+Communication between different components of the system will use
+gRPC. The rest of this document outlines the components.
+
+## Components
+
+### Coordinator
+
+```admonish warning
+Currently there's no separate coordinator. Most of the interaction between
+store, builder and evaluator is done by library code living in the `tvix-glue`
+crate (and `tvix-cli` is a user of it).
+
+Keep in mind some of the statements below are outdated and neither reflect
+reality nor desired design anymore.
+```
+
+*Purpose:* The coordinator (in the simplest case, the Tvix CLI tool)
+oversees the flow of a build process and delegates tasks to the right
+subcomponents. For example, if a user runs the equivalent of
+`nix-build` in a folder containing a `default.nix` file, the
+coordinator will invoke the evaluator, pass the resulting derivations
+to the builder and coordinate any necessary store interactions (for
+substitution and other purposes).
+
+While many users are likely to use the CLI tool as their primary
+method of interacting with Tvix, it is not unlikely that alternative
+coordinators (e.g. for a distributed, "Nix-native" CI system) would be
+implemented. To facilitate this, we are considering implementing the
+coordinator on top of a state-machine model that would make it
+possible to reuse the FSM logic without tying it to any particular
+kind of application.
+
+### Evaluator
+
+*Purpose:* Eval takes care of evaluating Nix code. In a typical build
+flow it would be responsible for producing derivations. It can also be
+used as a standalone tool, for example, in use-cases where Nix is used
+to generate configuration without any build or store involvement.
+
+*Requirements:* For now, it will run on the machine invoking the build
+command itself. We give it filesystem access to handle things like
+imports or `builtins.readFile`.
+
+To support IFD, the Evaluator also needs access to store paths. This
+could be implemented by having the coordinator provide an interface to retrieve
+files from a store path, or by ensuring a "realized version of the store" is
+accessible by the evaluator (this could be a FUSE filesystem, or the "real"
+/nix/store on disk.
+
+We might be okay with running the evaluator with filesystem access for now and
+can extend the interface if the need arises.
+
+### Builder
+
+*Purpose:* A builder receives derivations from the coordinator and
+builds them.
+
+By making builder a standardised interface it's possible to make the
+sandboxing mechanism used by the build process pluggable.
+
+Nix is currently using a hard-coded
+[libseccomp](https://github.com/seccomp/libseccomp) based sandboxing
+mechanism and another one based on
+[sandboxd](https://www.unix.com/man-page/mojave/8/sandboxd/) on macOS.
+These are only separated by [compiler preprocessor
+macros](https://gcc.gnu.org/onlinedocs/cpp/Ifdef.html) within the same
+source files despite having very little in common with each other.
+
+This makes experimentation with alternative backends difficult and
+porting Nix to other platforms harder than it has to be. We want to
+write a new Linux builder which uses
+[OCI](https://github.com/opencontainers/runtime-spec), the current
+dominant Linux containerisation technology, by default.
+
+With a well-defined builder abstraction, it's also easy to imagine
+other backends such as a Kubernetes-based one in the future.
+
+The environment in which builds happen is currently very Nix-specific. We might
+want to avoid having to maintain all the intricacies of a Nix-specific
+sandboxing environment in every builder, and instead only provide a more
+generic interface, receiving build requests (and have the coordinator translate
+derivations to that format). [^1]
+
+To build, the builder needs to be able to mount all build inputs into the build
+environment. For this, it needs the store to expose a filesystem interface.
+
+### Store
+
+*Purpose:* Store takes care of storing build results. It provides a
+unified interface to get store paths and upload new ones, as well as querying
+for the existence of a store path and its metadata (references, signatures, …).
+
+Tvix natively uses an improved store protocol. Instead of transferring around
+NAR files, which don't provide an index and don't allow seekable access, a
+concept similar to git tree hashing is used.
+
+This allows more granular substitution, chunk reusage and parallel download of
+individual files, reducing bandwidth usage.
+As these chunks are content-addressed, it opens up the potential for
+peer-to-peer trustless substitution of most of the data, as long as we sign the
+root of the index.
+
+Tvix still keeps the old-style signatures, NAR hashes and NAR size around. In
+the case of NAR hash / NAR size, this data is strictly required in some cases.
+The old-style signatures are valuable for communication with existing
+implementations.
+
+Old-style binary caches (like cache.nixos.org) can still be exposed via the new
+interface, by doing on-the-fly (re)chunking/ingestion.
+
+Most likely, there will be multiple implementations of store, some storing
+things locally, some exposing a "remote view".
+
+A few possible ones that come to mind are:
+
+- Local store
+- SFTP/ GCP / S3 / HTTP
+- NAR/NARInfo protocol: HTTP, S3
+
+A remote Tvix store can be connected by simply connecting to its gRPC
+interface, possibly using SSH tunneling, but there doesn't need to be an
+additional "wire format" like the Nix `ssh(+ng)://` protocol.
+
+Settling on one interface allows composition of stores, meaning it becomes
+possible to express substitution from remote caches as a proxy layer.
+
+It'd also be possible to write a FUSE implementation on top of the RPC
+interface, exposing a lazily-substituting /nix/store mountpoint. Using this in
+remote build context dramatically reduces the amount of data transferred to a
+builder, as only the files really accessed during the build are substituted.
+
+## Figures
+
+```plantuml,format=svg
+{{#include figures/component-flow.puml}}
+```
+
+[^1]: There have already been some discussions in the Nix community, to switch
+  to REAPI:
+  https://discourse.nixos.org/t/a-proposal-for-replacing-the-nix-worker-protocol/20926/22
diff --git a/tvix/docs/src/build/index.md b/tvix/docs/src/build/index.md
new file mode 100644
index 000000000000..cf9580c98a82
--- /dev/null
+++ b/tvix/docs/src/build/index.md
@@ -0,0 +1,59 @@
+# Builder Protocol
+
+The builder protocol is used by tvix-glue to trigger builds.
+
+One goal of the protocol is to not be too tied to the Nix implementation itself,
+allowing it to be used for other builds/workloads in the future.
+
+This means the builder protocol is versatile enough to express the environment a
+Nix build expects, while not being aware of "what any of this means".
+
+For example, it is not aware of how certain environment variables are set in a
+nix build, but allows specifying environent variables that should be set.
+
+It's also not aware of what nix store paths are. Instead, it allows:
+
+ - specifying a list of paths expected to be produced during the build
+ - specifying a list of castore root nodes to be present in a specified
+   `inputs_dir`.
+ - specifying which paths are write-able during build.
+
+In case all specified paths are produced, and the command specified in
+`command_args` succeeds, the build is considered to be successful.
+
+This happens to be sufficient to *also* express how Nix builds works.
+
+Check `build/protos/build.proto` for a detailed description of the individual
+fields, and the tests in `glue/src/tvix_build.rs` for some examples.
+
+The following sections describe some aspects of Nix builds, and how this is
+(planned to be) implemented with the Tvix Build protocol.
+
+## Reference scanning
+At the end of a build, Nix does scan a store path for references to other store
+paths (*out of the set of all store paths present during the build*).
+It does do this by (only) looking for a list of nixbase32-encoded hashes in
+filenames (?), symlink targets and blob contents.
+
+While we could do this entirely outside the builder protocol, it'd mean a build
+client would be required to download the produced outputs locally, and do the
+refscan there. This is undesireable, as the builder already has all produced
+outputs locally, and it'd make more sense for it do do it.
+
+Instead, we want to describe reference scanning in a generic fashion.
+
+One proposed way to do this is to add an additional field `refscan_needles` to
+the `BuildRequest` message.
+If this is an non-empty list, all paths in `outputs` are scanned for these.
+
+The `Build` response message would then be extended with an `outputs_needles`
+field, containing the same number of elements as the existing `outputs` field.
+In there, we'd have a list of numbers, indexing into `refscan_needles`
+originally specified.
+
+For Nix, `refscan_needles` would be populated with the nixbase32 hash parts of
+every input store path and output store path. The latter is necessary to scan
+for references between multi-output derivations.
+
+This is sufficient to construct the referred store paths in each build output on
+the build client.
diff --git a/tvix/docs/src/castore/blobstore-chunking.md b/tvix/docs/src/castore/blobstore-chunking.md
new file mode 100644
index 000000000000..d8c3d54b52f0
--- /dev/null
+++ b/tvix/docs/src/castore/blobstore-chunking.md
@@ -0,0 +1,147 @@
+# BlobStore: Chunking & Verified Streaming
+
+`tvix-castore`'s BlobStore is a content-addressed storage system, using [blake3]
+as hash function.
+
+Returned data is fetched by using the digest as lookup key, and can be verified
+to be correct by feeding the received data through the hash function and
+ensuring it matches the digest initially used for the lookup.
+
+This means, data can be downloaded by any untrusted third-party as well, as the
+received data is validated to match the digest it was originally requested with.
+
+However, for larger blobs of data, having to download the entire blob at once is
+wasteful, if we only care about a part of the blob. Think about mounting a
+seekable data structure, like loop-mounting an .iso file, or doing partial reads
+in a large Parquet file, a column-oriented data format.
+
+> We want to have the possibility to *seek* into a larger file.
+
+This however shouldn't compromise on data integrity properties - we should not
+need to trust a peer we're downloading from to be "honest" about the partial
+data we're reading. We should be able to verify smaller reads.
+
+Especially when substituting from an untrusted third-party, we want to be able
+to detect quickly if that third-party is sending us wrong data, and terminate
+the connection early.
+
+## Chunking
+In content-addressed systems, this problem has historically been solved by
+breaking larger blobs into smaller chunks, which can be fetched individually,
+and making a hash of *this listing* the blob digest/identifier.
+
+ - BitTorrent for example breaks files up into smaller chunks, and maintains
+   a list of sha1 digests for each of these chunks. Magnet links contain a
+   digest over this listing as an identifier. (See [bittorrent-v2][here for
+   more details]).
+   With the identifier, a client can fetch the entire list, and then recursively
+   "unpack the graph" of nodes, until it ends up with a list of individual small
+   chunks, which can be fetched individually.
+ - Similarly, IPFS with its IPLD model builds up a Merkle DAG, and uses the
+   digest of the root node as an identitier.
+
+These approaches solve the problem of being able to fetch smaller chunks in a
+trusted fashion. They can also do some deduplication, in case there's the same
+leaf nodes same leaf nodes in multiple places.
+
+However, they also have a big disadvantage. The chunking parameters, and the
+"topology" of the graph structure itself "bleed" into the root hash of the
+entire data structure itself.
+
+Depending on the chunking parameters used, there's different representations for
+the same data, causing less data sharing/reuse in the overall system, in terms of how
+many chunks need to be downloaded vs. are already available locally, as well as
+how compact data is stored on-disk.
+
+This can be workarounded by agreeing on only a single way of chunking, but it's
+not pretty and misses a lot of deduplication potential.
+
+### Chunking in Tvix' Blobstore
+tvix-castore's BlobStore uses a hybrid approach to eliminate some of the
+disadvantages, while still being content-addressed internally, with the
+highlighted benefits.
+
+It uses [blake3] as hash function, and the blake3 digest of **the raw data
+itself** as an identifier (rather than some application-specific Merkle DAG that
+also embeds some chunking information).
+
+BLAKE3 is a tree hash where all left nodes fully populated, contrary to
+conventional serial hash functions. To be able to validate the hash of a node,
+one only needs the hash of the (2) children [^1], if any.
+
+This means one only needs to the root digest to validate a constructions, and these
+constructions can be sent [separately][bao-spec].
+
+This relieves us from the need of having to encode more granular chunking into
+our data model / identifier upfront, but can make this mostly a transport/
+storage concern.
+
+For some more description on the (remote) protocol, check
+[BlobStore Protocol](./blobstore-protocol.md).
+
+#### Logical vs. physical chunking
+
+Due to the properties of the BLAKE3 hash function, we have logical blocks of
+1KiB, but this doesn't necessarily imply we need to restrict ourselves to these
+chunk sizes w.r.t. what "physical chunks" are sent over the wire between peers,
+or are stored on-disk.
+
+The only thing we need to be able to read and verify an arbitrary byte range is
+having the covering range of aligned 1K blocks, and a construction from the root
+digest to the 1K block.
+
+Note the intermediate hash tree can be further trimmed, [omitting][bao-tree]
+lower parts of the tree while still providing verified streaming - at the cost
+of having to fetch larger covering ranges of aligned blocks.
+
+Let's pick an example. We identify each KiB by a number here for illustrational
+purposes.
+
+Assuming we omit the last two layers of the hash tree, we end up with logical
+4KiB leaf chunks (`bao_shift` of `2`).
+
+For a blob of 14 KiB total size, we could fetch logical blocks `[0..=3]`,
+`[4..=7]`, `[8..=11]` and `[12..=13]` in an authenticated fashion:
+
+`[ 0 1 2 3 ] [ 4 5 6 7 ] [ 8 9 10 11 ] [ 12 13 ]`
+
+Assuming the server now informs us about the following physical chunking:
+
+```
+[ 0 1 ] [ 2 3 4 5 ] [ 6 ] [ 7 8 ] [ 9 10 11 12 13 14 15 ]`
+```
+
+If our application now wants to arbitrarily read from 0 until 4 (inclusive):
+
+```
+[ 0 1 ] [ 2 3 4 5 ] [ 6 ] [ 7 8 ] [ 9 10 11 12 13 14 15 ]
+ |-------------|
+
+```
+
+…we need to fetch physical chunks `[ 0 1 ]`, `[ 2 3 4 5 ]` and `[ 6 ] [ 7 8 ]`.
+
+
+`[ 0 1 ]` and `[ 2 3 4 5 ]` are obvious, they contain the data we're
+interested in.
+
+We however also need to fetch the physical chunks `[ 6 ]` and `[ 7 8 ]`, so we
+can assemble `[ 4 5 6 7 ]` to verify both logical chunks:
+
+```
+[ 0 1 ] [ 2 3 4 5 ] [ 6 ] [ 7 8 ] [ 9 10 11 12 13 14 15 ]
+^       ^           ^     ^
+|----4KiB----|------4KiB-----|
+```
+
+Each physical chunk fetched can be validated to have the blake3 digest that was
+communicated upfront, and can be stored in a client-side cache/storage, so
+subsequent / other requests for the same data will be fast(er).
+
+---
+
+[^1]: and the surrounding context, aka position inside the whole blob, which is available while verifying the tree
+[bittorrent-v2]: https://blog.libtorrent.org/2020/09/bittorrent-v2/
+[blake3]: https://github.com/BLAKE3-team/BLAKE3
+[bao-spec]: https://github.com/oconnor663/bao/blob/master/docs/spec.md
+[bao-tree]: https://github.com/n0-computer/bao-tree
diff --git a/tvix/docs/src/castore/blobstore-protocol.md b/tvix/docs/src/castore/blobstore-protocol.md
new file mode 100644
index 000000000000..0dff787ccb00
--- /dev/null
+++ b/tvix/docs/src/castore/blobstore-protocol.md
@@ -0,0 +1,104 @@
+# BlobStore: Protocol / Composition
+
+This documents describes the protocol that BlobStore uses to substitute blobs
+other ("remote") BlobStores.
+
+How to come up with the blake3 digest of the blob to fetch is left to another
+layer in the stack.
+
+To put this into the context of Tvix as a Nix alternative, a blob represents an
+individual file inside a StorePath.
+In the Tvix Data Model, this is accomplished by having a `FileNode` (either the
+`root_node` in a `PathInfo` message, or a individual file inside a `Directory`
+message) encode a BLAKE3 digest.
+
+However, the whole infrastructure can be applied for other usecases requiring
+exchange/storage or access into data of which the blake3 digest is known.
+
+## Protocol and Interfaces
+As an RPC protocol, BlobStore currently uses gRPC.
+
+On the Rust side of things, every blob service implements the
+[`BlobService`](../src/blobservice/mod.rs) async trait, which isn't
+gRPC-specific.
+
+This `BlobService` trait provides functionality to check for existence of Blobs,
+read from blobs, and write new blobs.
+It also provides a method to ask for more granular chunks if they are available.
+
+In addition to some in-memory, on-disk and (soon) object-storage-based
+implementations, we also have a `BlobService` implementation that talks to a
+gRPC server, as well as a gRPC server wrapper component, which provides a gRPC
+service for anything implementing the `BlobService` trait.
+
+This makes it very easy to talk to a remote `BlobService`, which does not even
+need to be written in the same language, as long it speaks the same gRPC
+protocol.
+
+It also puts very little requirements on someone implementing a new
+`BlobService`, and how its internal storage or chunking algorithm looks like.
+
+The gRPC protocol is documented in `../protos/rpc_blobstore.proto`.
+Contrary to the `BlobService` trait, it does not have any options for seeking/
+ranging, as it's more desirable to provide this through chunking (see also
+[BlobStore Chunking](./blobstore-chunking.md).
+
+## Composition
+Different `BlobStore` are supposed to be "composed"/"layered" to express
+caching, multiple local and remote sources.
+
+The fronting interface can be the same, it'd just be multiple "tiers" that can
+respond to requests, depending on where the data resides. [^1]
+
+This makes it very simple for consumers, as they don't need to be aware of the
+entire substitutor config.
+
+The flexibility of this doesn't need to be exposed to the user in the default
+case; in most cases we should be fine with some form of on-disk storage and a
+bunch of substituters with different priorities.
+
+### gRPC Clients
+Clients are encouraged to always read blobs in a chunked fashion (asking for a
+list of chunks for a blob via `BlobService.Stat()`, then fetching chunks via
+`BlobService.Read()` as needed), instead of directly reading the entire blob via
+`BlobService.Read()`.
+
+In a composition setting, this provides opportunity for caching, and avoids
+downloading some chunks if they're already present locally (for example, because
+they were already downloaded by reading from a similar blob earlier).
+
+It also removes the need for seeking to be a part of the gRPC protocol
+alltogether, as chunks are supposed to be "reasonably small" [^2].
+
+There's some further optimization potential, a `BlobService.Stat()` request
+could tell the server it's happy with very small blobs just being inlined in
+an additional additional field in the response, which would allow clients to
+populate their local chunk store in a single roundtrip.
+
+## Verified Streaming
+As already described in [BlobStore Chunking](./blobstore-chunking.md), the physical chunk
+information sent in a `BlobService.Stat()` response is still sufficient to fetch
+in an authenticated fashion.
+
+The exact protocol and formats are still a bit in flux, but here's some notes:
+
+ - `BlobService.Stat()` request gets a `send_bao` field (bool), signalling a
+   [BAO][bao-spec] should be sent. Could also be `bao_shift` integer, signalling
+   how detailed (down to the leaf chunks) it should go.
+   The exact format (and request fields) still need to be defined, edef has some
+   ideas around omitting some intermediate hash nodes over the wire and
+   recomputing them, reducing size by another ~50% over [bao-tree].
+ - `BlobService.Stat()` response gets some bao-related fields (`bao_shift`
+   field, signalling the actual format/shift level the server replies with, the
+   actual bao, and maybe some format specifier).
+   It would be nice to also be compatible with the baos used by [iroh], so we
+   can provide an implementation using it too.
+
+---
+
+[^1]: We might want to have some backchannel, so it becomes possible to provide
+      feedback to the user that something is downloaded.
+[^2]: Something between 512K-4M, TBD.
+[bao-spec]: https://github.com/oconnor663/bao/blob/master/docs/spec.md
+[bao-tree]: https://github.com/n0-computer/bao-tree
+[iroh]: https://github.com/n0-computer/iroh
diff --git a/tvix/docs/src/castore/data-model.md b/tvix/docs/src/castore/data-model.md
new file mode 100644
index 000000000000..7f7e396a2267
--- /dev/null
+++ b/tvix/docs/src/castore/data-model.md
@@ -0,0 +1,50 @@
+# Data model
+
+This provides some more notes on the fields used in castore.proto.
+
+See [Store API](../store/api.md) for the full context.
+
+## Directory message
+`Directory` messages use the blake3 hash of their canonical protobuf
+serialization as its identifier.
+
+A `Directory` message contains three lists, `directories`, `files` and
+`symlinks`, holding `DirectoryNode`, `FileNode` and `SymlinkNode` messages
+respectively. They describe all the direct child elements that are contained in
+a directory.
+
+All three message types have a `name` field, specifying the (base)name of the
+element (which MUST not contain slashes or null bytes, and MUST not be '.' or '..').
+For reproducibility reasons, the lists MUST be sorted by that name and the
+name MUST be unique across all three lists.
+
+In addition to the `name` field, the various *Node messages have the following
+fields:
+
+## DirectoryNode
+A `DirectoryNode` message represents a child directory.
+
+It has a `digest` field, which points to the identifier of another `Directory`
+message, making a `Directory` a merkle tree (or strictly speaking, a graph, as
+two elements pointing to a child directory with the same contents would point
+to the same `Directory` message).
+
+There's also a `size` field, containing the (total) number of all child
+elements in the referenced `Directory`, which helps for inode calculation.
+
+## FileNode
+A `FileNode` message represents a child (regular) file.
+
+Its `digest` field contains the blake3 hash of the file contents. It can be
+looked up in the `BlobService`.
+
+The `size` field contains the size of the blob the `digest` field refers to.
+
+The `executable` field specifies whether the file should be marked as
+executable or not.
+
+## SymlinkNode
+A `SymlinkNode` message represents a child symlink.
+
+In addition to the `name` field, the only additional field is the `target`,
+which is a string containing the target of the symlink.
diff --git a/tvix/docs/src/castore/why-not-git-trees.md b/tvix/docs/src/castore/why-not-git-trees.md
new file mode 100644
index 000000000000..4a12b4ef5554
--- /dev/null
+++ b/tvix/docs/src/castore/why-not-git-trees.md
@@ -0,0 +1,57 @@
+## Why not git tree objects?
+
+We've been experimenting with (some variations of) the git tree and object
+format, and ultimately decided against using it as an internal format, and
+instead adapted the one documented in the other documents here.
+
+While the tvix-store API protocol shares some similarities with the format used
+in git for trees and objects, the git one has shown some significant
+disadvantages:
+
+### The binary encoding itself
+
+#### trees
+The git tree object format is a very binary, error-prone and
+"made-to-be-read-and-written-from-C" format.
+
+Tree objects are a combination of null-terminated strings, and fields of known
+length. References to other tree objects use the literal sha1 hash of another
+tree object in this encoding.
+Extensions of the format/changes are very hard to do right, because parsers are
+not aware they might be parsing something different.
+
+The tvix-store protocol uses a canonical protobuf serialization, and uses
+the [blake3][blake3] hash of that serialization to point to other `Directory`
+messages.
+It's both compact and with a wide range of libraries for encoders and decoders
+in many programming languages.
+The choice of protobuf makes it easy to add new fields, and make old clients
+aware of some unknown fields being detected [^adding-fields].
+
+#### blob
+On disk, git blob objects start with a "blob" prefix, then the size of the
+payload, and then the data itself. The hash of a blob is the literal sha1sum
+over all of this - which makes it something very git specific to request for.
+
+tvix-store simply uses the [blake3][blake3] hash of the literal contents
+when referring to a file/blob, which makes it very easy to ask other data
+sources for the same data, as no git-specific payload is included in the hash.
+This also plays very well together with things like [iroh][iroh-discussion],
+which plans to provide a way to substitute (large)blobs by their blake3 hash
+over the IPFS network.
+
+In addition to that, [blake3][blake3] makes it possible to do
+[verified streaming][bao], as already described in other parts of the
+documentation.
+
+The git tree object format uses sha1 both for references to other trees and
+hashes of blobs, which isn't really a hash function to fundamentally base
+everything on in 2023.
+The [migration to sha256][git-sha256] also has been dead for some years now,
+and it's unclear what a "blake3" version of this would even look like.
+
+[bao]: https://github.com/oconnor663/bao
+[blake3]: https://github.com/BLAKE3-team/BLAKE3
+[git-sha256]: https://git-scm.com/docs/hash-function-transition/
+[iroh-discussion]: https://github.com/n0-computer/iroh/discussions/707#discussioncomment-5070197
+[^adding-fields]: Obviously, adding new fields will change hashes, but it's something that's easy to detect.
\ No newline at end of file
diff --git a/tvix/docs/src/community.md b/tvix/docs/src/community.md
new file mode 100644
index 000000000000..9ab117c552f3
--- /dev/null
+++ b/tvix/docs/src/community.md
@@ -0,0 +1,23 @@
+# Community
+
+## Chatroom
+
+Tvix development discussions happen on IRC. We use the [hackint][] IRC network
+where you should be able to join using your favorite client/protocol.
+
+* [IRC][] / [Webchat][]
+* [Matrix][]
+* [XMPP][]
+
+## Mailing list
+
+Discussions on larger architectural problems and thoughts occasionally happen
+on the [TVL Public Inbox][public inbox] mailing list.
+
+[hackint]: https://hackint.org/
+[IRC]: ircs://irc.hackint.org:6697/#tvix-dev
+[Webchat]: https://webirc.hackint.org/#ircs://irc.hackint.org/#tvix-dev
+[Matrix]: matrix:r/tvix-dev:hackint.org?action=join
+[XMPP]: xmpp:#tvix-dev@irc.hackint.org?join
+[depot@tvl.su]: mailto:depot@tvl.su
+[public inbox]: https://inbox.tvl.su/depot/
diff --git a/tvix/docs/src/contributing/code-&-commits.md b/tvix/docs/src/contributing/code-&-commits.md
new file mode 100644
index 000000000000..628c124bf12f
--- /dev/null
+++ b/tvix/docs/src/contributing/code-&-commits.md
@@ -0,0 +1,76 @@
+# Code & Commits
+
+## Code quality
+
+This one should go without saying — but please ensure that your code quality
+does not fall below the rest of the project. This is of course very subjective,
+but as an example if you place code that throws away errors into a block in
+which errors are handled properly your change will be rejected.
+
+
+```admonish hint
+Usually there is a strong correlation between the visual appearance of a code
+block and its quality. This is a simple way to sanity-check your work while
+squinting and keeping some distance from your screen ;-)
+```
+
+
+## Commit messages
+
+The [Angular Conventional Commits][angular] style is the general commit style
+used in the Tvix project. Commit messages should be structured like this:
+
+```admonish example
+    type(scope): Subject line with at most a 72 character length
+
+    Body of the commit message with an empty line between subject and
+    body. This text should explain what the change does and why it has
+    been made, *especially* if it introduces a new feature.
+
+    Relevant issues should be mentioned if they exist.
+```
+
+Where `type` can be one of:
+
+* `feat`: A new feature has been introduced
+* `fix`: An issue of some kind has been fixed
+* `docs`: Documentation or comments have been updated
+* `style`: Formatting changes only
+* `refactor`: Hopefully self-explanatory!
+* `test`: Added missing tests / fixed tests
+* `chore`: Maintenance work
+* `subtree`: Operations involving `git subtree`
+
+And `scope` should refer to some kind of logical grouping inside of the
+project.
+
+It does not make sense to include the full path unless it aids in
+disambiguating. For example, when changing the struct fields in
+`tvix/glue/src/builtins/fetchers.rs` it is enough to write
+`refactor(tvix/glue): …`.
+
+Please take a look at the existing commit log for examples.
+
+
+## Commit content
+
+Multiple changes should be divided into multiple git commits whenever possible.
+Common sense applies.
+
+The fix for a single-line whitespace issue is fine to include in a different
+commit. Introducing a new feature and refactoring (unrelated) code in the same
+commit is not fine.
+
+`git commit -a` is generally **taboo**, whereas on the command line you should
+be preferring `git commit -p`.
+
+
+```admonish tip
+Tooling can really help this process. The [lazygit][] TUI or [magit][] for
+Emacs are worth looking into.
+```
+
+
+[angular]: https://www.conventionalcommits.org/en/
+[lazygit]: https://github.com/jesseduffield/lazygit
+[magit]: https://magit.vc
diff --git a/tvix/docs/src/contributing/email.md b/tvix/docs/src/contributing/email.md
new file mode 100644
index 000000000000..238ff388f595
--- /dev/null
+++ b/tvix/docs/src/contributing/email.md
@@ -0,0 +1,33 @@
+# Submitting changes via email
+
+With SSO & local accounts, hopefully Tvix provides you a low-friction or
+privacy-respecting way to make contributions by means of
+[TVL’s self-hosted Gerrit][gerrit]. However, if you still decide differently,
+you may submit a patch via email to `depot@tvl.su` where it will be added to
+Gerrit by a contributor.
+
+Please keep in mind this process is more complicated requiring extra work from
+both us & you:
+
+* You will need to manually check the Gerrit website for updates & someone will
+  need to relay potential comments to/from Gerrit to you as you won’t get
+  emails from Gerrit.
+* New revisions need to be stewarded by someone uploading changes to Gerrit
+  on your behalf.
+* As CLs cannot change owners, if you decide to get a Gerrit account later on
+  existing CLs need to be abandoned then recreated. This introduces more churn
+  to the review process since prior discussion are disconnected.
+
+Create an appropriate commit locally then send it us using either of these
+options:
+
+* `git format-patch`: This will create a `*.patch` file which you should email to
+  us.
+* `git send-email`: If configured on your system, this will take care of the
+  whole emailing process for you.
+
+The email address is a [public inbox][].
+
+
+[gerrit]: ../contributing/gerrit.html
+[public inbox]: https://inbox.tvl.su/depot/
diff --git a/tvix/docs/src/contributing/gerrit.md b/tvix/docs/src/contributing/gerrit.md
new file mode 100644
index 000000000000..3644e8cb0200
--- /dev/null
+++ b/tvix/docs/src/contributing/gerrit.md
@@ -0,0 +1,110 @@
+# Contributing to Tvix
+
+## Registration
+
+Self-hosted [Gerrit](https://www.gerritcodereview.com) & changelists (CLs) are
+the preferred method of contributions & review.
+
+TVL’s Gerrit supports single sign-on (SSO) using a GitHub, GitLab, or
+StackOverflow account.
+
+Additionally if you would prefer not to use an SSO option or wish to have a
+backup authentication strategy in the event of downed server or otherwise, we
+recommend setting up a TVL-specific LDAP account.
+
+You can create such an account by following these instructions:
+
+1. Checkout [TVL’s monorepo][check-out-monorepo] if you haven’t already
+2. Be a member of `#tvix-dev` (and/or `#tvl`) on [hackint][], a communication
+   network.
+3. Generate a user entry using [//web/pwcrypt](https://signup.tvl.fyi/).
+4. Commit that generated user entry to our LDAP server configuration in
+   [ops/users][ops-users] (for an example, see:
+   [CL/2671](https://cl.tvl.fyi/c/depot/+/2671))
+5. If only using LDAP, submit the patch via email (see [<cite>Submitting
+   changes via email</cite>][email])
+
+
+## Gerrit setup
+
+Gerrit uses the concept of change IDs to track commits across rebases and other
+operations that might change their hashes, and link them to unique changes in
+Gerrit.
+
+First, [upload your public SSH keys to Gerrit][Gerrit SSH]. Then change your
+remote to point to your newly-registered user over SSH. Then follow up with Git
+config by setting the default push URLs for & installing commit hooks for a
+smoother Gerrit experience.
+
+```console
+$ cd depot
+$ git remote set-url origin "ssh://$USER@code.tvl.fyi:29418/depot"
+$ git config remote.origin.url "ssh://$USER@code.tvl.fyi:29418/depot"
+$ git config remote.origin.push "HEAD:refs/for/canon"
+$ curl -L --compressed https://cl.tvl.fyi/tools/hooks/commit-msg | tee .git/hooks/commit-msg
+…
+if ! mv "${dest}" "$1" ; then
+  echo "cannot mv ${dest} to $1"
+  exit 1
+fi
+$ chmod +x .git/hooks/commit-msg
+```
+
+## Gerrit workflow
+
+The workflow on Gerrit is quite different than the pull request (PR) model that
+many developers are more likely to be accustomed to. Instead of pushing changes
+to remote branches, all changes have to be pushed to `refs/for/canon`. For each
+commit that is pushed there, a change request is created automatically
+
+Every time you create a new commit the change hook will insert a unique
+`Change-Id` tag into the commit message. Once you are satisfied with the state
+of your commit and want to submit it for review, you push it to a Git `ref`
+called `refs/for/canon`. This designates the commits as changelists (CLs)
+targeted for the `canon` branch.
+
+After you feel satisfied with your changes changes, push to the default:
+
+```console
+$ git commit -m 'docs(REVIEWS): Fixed all the errors in the reviews docs'
+$ git push origin
+```
+
+Or to a special target, such as a work-in-progress CL:
+
+```console
+$ git push origin HEAD:refs/for/canon%wip
+```
+
+During the review process, the reviewer(s) might ask you to make changes. You
+can simply amend[^amend] your commit(s) then push to the same ref (`--force*`
+flags not needed). Gerrit will automatically update your changes.
+
+```admonish caution
+Every individual commit will become a separate change. We do *not* squash
+related commits, but instead submit them one by one. Be aware that if you are
+expecting a different behavior such as attempt something like an unsquashed
+subtree merge, you will produce a *lot* of CLs. This is strongly discouraged.
+```
+
+```admonish tip
+If do not have experience with the Gerrit model, consider reading the
+[<cite>Working with Gerrit: An example</cite>][Gerrit Walkthrough] or
+[<cite>Basic Gerrit Walkthrough — For GitHub Users</cite>][github-diff].
+
+It will also be important to read about [attention sets][] to understand how
+your ‘turn’ works, how notifications will be distributed to users through the
+system, as well as the other [attention set rules][attention-set-rules].
+```
+
+
+[check-out-monorepo]: ./getting-started#tvl-monorepo
+[email]: ../contributing/email.html
+[Gerrit SSH]: https://cl.tvl.fyi/settings/#SSHKeys
+[Gerrit walkthrough]: https://gerrit-review.googlesource.com/Documentation/intro-gerrit-walkthrough.html
+[ops-users]: https://code.tvl.fyi/tree/ops/users/default.nix
+[hackint]: https://hackint.org
+[github-diff]: https://gerrit.wikimedia.org/r/Documentation/intro-gerrit-walkthrough-github.html
+[attention sets]: https://gerrit-review.googlesource.com/Documentation/user-attention-set.html
+[attention-set-rules]: https://gerrit-review.googlesource.com/Documentation/user-attention-set.html#_rules
+[^keycloak]: [^amend]: `git commit --amend`
diff --git a/tvix/docs/src/eval/abandoned/index.md b/tvix/docs/src/eval/abandoned/index.md
new file mode 100644
index 000000000000..1cef704d08d7
--- /dev/null
+++ b/tvix/docs/src/eval/abandoned/index.md
@@ -0,0 +1,3 @@
+# Abandoned ideas
+
+This chapter keeps track of abandoned ideas, and why they were abandoned.
diff --git a/tvix/docs/src/eval/abandoned/thread-local-vm.md b/tvix/docs/src/eval/abandoned/thread-local-vm.md
new file mode 100644
index 000000000000..c6a2d5e07e5c
--- /dev/null
+++ b/tvix/docs/src/eval/abandoned/thread-local-vm.md
@@ -0,0 +1,233 @@
+# We can't have nice things because IFD
+
+The thread-local VM work below was ultimately not merged because it
+was decided that it would be harmful for `tvix::eval::Value` to
+implement `Eq`, `Hash`, or any of the other `std` traits.
+
+Implementing `std` traits on `Value` was deemed harmful because IFD
+can cause arbitrary amounts of compilation to occur, including
+network transactions with builders.  Obviously it would be
+unexpected and error-prone to have a `PartialEq::eq()` which does
+something like this.  This problem does not manifest within the
+"nixpkgs compatibility only" scope, or in any undeprecated language
+feature other than IFD.  Although IFD is outside the "nixpkgs
+compatibility scope", it [has been added to the TVL compatibility
+scope](https://cl.tvl.fyi/c/depot/+/7193/comment/3418997b_0dbd0b65/).
+
+This was the sole reason for not merging.
+
+The explanation below may be useful in case future circumstances
+affect the relevance of the reasoning above.
+
+The implementation can be found in these CLs:
+
+- [refactor(tvix/eval): remove lifetime parameter from VM<'o>](https://cl.tvl.fyi/c/depot/+/7194)
+- [feat(tvix/eval): [FOUNDLING] thread-local VM](https://cl.tvl.fyi/c/depot/+/7195)
+- [feat(tvix/eval): [FOUNDLING] VM::vm_xxx convenience methods](https://cl.tvl.fyi/c/depot/+/7196)
+- [refactor(tvix/eval): [FOUNDLING]: drop explicit `&mut vm` parameter](https://cl.tvl.fyi/c/depot/+/7197)
+
+# Thread-local storage for tvix::eval::vm::VM
+
+## The problem
+
+`Value::force()` takes a `&mut VM` argument, since forcing a value
+requires executing opcodes.  This means that `Value::nix_eq()` too
+must take a `&mut VM`, since any sensible definition of equality
+will have to force thunks.
+
+Unfortunately Rust's `PartialEq::eq()` function does not accept any
+additional arguments like this, so `Value` cannot implement
+`PartialEq`.  Worse, structs which *contain* `Value`s can't
+implement `PartialEq` either.  This means `Value`, and anything
+containing it, cannot be the key for a `BTreeMap` or `HashMap`.  We
+can't even insert `Value`s into a `HashSet`!
+
+There are other situations like this that don't involve `PartialEq`,
+but it's the most glaring one.  The main problem is that you need a
+`VM` in order to force thunks, and thunks can be anywhere in a
+`Value`.
+
+## Solving the problem with thread-locals
+
+We could avoid threading the `&mut VM` through the entire codebase
+by making it a thread-local.
+
+To do this without a performance hit, we need to use LLVM
+thread-locals, which are the same cost as references to `static`s
+but load relative to
+[`llvm.threadlocal.address`][threadlocal-intrinsic] instead of
+relative to the data segment.  Unfortunately `#[thread_local]` [is
+unstable][thread-local-unstable] and [unsafe in
+general][thread-local-unsafe] for most of the cases where we would
+want to use it.  There is one [exception][tls-const-init], however:
+if a `!thread_local()` has a `const` initializer, the compiler will
+insert a `#[thread_local]`; this special case is both safe and
+stable.
+
+The difficult decision is what the type of the thread-local should
+be.  Since you can't get a mutable reference to a `thread_local!()`
+it will have to be some interior-mutability-bestowing wrapper around
+our current `struct VM`.  Here are the choices:
+
+### `RefCell<VM>`
+
+This is the obvious first choice, since it lets you borrow a
+`RefMut<Target=VM>`.  The problem here is that we want to keep the
+codebase written such that all the functions in `impl VM` still take
+a `&mut self`.  This means that there will be an active mutable
+borrow for the duration of `VM::call_builtin()`.  So if we implement
+`PartialEq` by having `eq()` attempt a second mutable borrow from
+the thread-local storage, it will fail since there is already an
+active borrow.
+
+The problem here is that you can't "unborrow" a `RefMut` except by
+dropping it.  There's no way around this.
+
+#### Problem: Uglification
+
+The only solution here is to rewrite all the functions in `impl VM`
+so they don't take any kind of `self` argument, and then have them
+do a short-lived `.borrow_mut()` from the thread-local `RefCell`
+*separately, each time* they want to modify one of the fields of
+`VM` (currently `frames`, `stack`, `with_stack`, `warnings`).  This
+means that if you had a code sequence like this:
+
+```
+impl VM {
+  fn foo(&mut self, ...) {
+    ...
+    self.frame().ip += 1;
+    self.some_other_method();
+    self.frame().ip += 1;
+```
+
+You would need to add *two separate `borrow_mut()`s*, one for each
+of the `self.frame().ip+=1` statements.  You can't just do one big
+`borrow_mut()` because `some_other_method()` will call
+`borrow_mut()` and panic.
+
+#### Problem: Performance
+
+The `RefCell<VM>` approach also has a fairly huge performance hit,
+because every single modification to any part of `VM` will require a
+reference count increment/decrement, and a conditional branch based
+on the check (which will never fail) that the `RefCell` isn't
+already mutably borrowed.  It will also impede a lot of rustc's
+optimizations.
+
+### `Cell<VM>`
+
+This is a non-starter because it means that in order to mutate any
+field of `VM`, you have to move the entire `struct VM` out of the
+`Cell`, mutate it, and move it back in.
+
+### `Cell<Box<VM>>`
+
+Now we're getting warmer.  Here, we can move the `Box<VM>` out of
+the cell with a single pointer-sized memory access.
+
+We don't want to do the "uglification" described in the previous
+section.  We are very fortunate that, sometime in mid-2019, the Rust
+dieties [decreed by fiat][fiat-decree] that `&Cell<T>` and `&mut T`
+are bit-for-bit identical, and even gave us mortals safe wrappers
+[`from_mut()`][from_mut] and [`get_mut()`][get_mut] around
+`mem::transmute()`.
+
+So now, when a `VM` method (which takes `&mut self`) calls out to
+some external code (like a builtin), instead of passing the `&mut
+self` to the external code it can call `Cell::from_mut(&mut self)`,
+and then `Cell::swap()` that into the thread-local storage cell for
+the duration of the external code.  After the external code returns,
+it can `Cell::swap()` it back.  This whole dance gets wrapped in a
+lexical block, and the borrow checker sees that the `&Cell<Box<VM>>`
+returned by `Cell::from_mut()` lives only until the end of the
+lexical block, *so we get the `&mut self` back after the close-brace
+for that block*.  NLL FTW.  This sounds like a lot of work, but it
+should compile down to two pointer-sized loads and two pointer-sized
+stores, and it is incurred basically only for `OpBuiltin`.
+
+This all works, with only two issues:
+
+1. `vm.rs` needs to be very careful to do the thread-local cell swap
+   dance before calling anything that might call `PartialEq::eq()`
+   (or any other method that expects to be able to pull the `VM` out
+   of thread-local storage).  There is no compile-time check that we
+   did the dance in all the right places.  If we forget to do the
+   dance somewhere we'll get a runtime panic from `Option::expect()`
+   (see next section).
+
+2. Since we need to call `Cell::from_mut()` on a `Box<VM>` rather
+   than a bare `VM`, we still need to rewrite all of `vm.rs` so that
+   every function takes a `&mut Box<VM>` instead of a `&mut self`.
+   This creates a huge amount of "noise" in the code.
+
+Fortunately, it turns out that nearly all the "noise" that arises
+from the second point can be eliminated by taking advantage of
+[deref coercions][deref-coercions]!  This was the last "shoe to
+drop".
+
+There is still the issue of having to be careful about calls from
+`vm.rs` to things outside that file, but it's manageable.
+
+### `Cell<Option<Box<VM>>>`
+
+In order to get the "safe and stable `#[thread_local]`"
+[exception][tls-const-init] we need a `const` initializer, which
+means we need to be able to put something into the `Cell` that isn't
+a `VM`.  So the type needs to be `Cell<Option<Box<VM>>>`.
+
+Recall that you can't turn an `Option<&T>` into an `&Option<T>`.
+The latter type has the "is this a `Some` or `None`" bit immediately
+adjacent to the bits representing `T`.  So if I hand you a `t:&T`
+and you wrap it as `Some(t)`, those bits aren't adjacent in memory.
+This means that all the VM methods need to operate on an
+`Option<Box<VM>>` -- we can't just wrap a `Some()` around `&mut
+self` "at the last minute" before inserting it into the thread-local
+storage cell.  Fortunately deref coercions save the day here too --
+the coercion is inferred through both layers (`Box` and `Option`) of
+wrapper, so there is no additional noise in the code.
+
+Note that Rust is clever and can find some sequence of bits that
+aren't a valid `T`, so `sizeof(Option<T>)==sizeof(T)`.  And in fact,
+`Box<T>` is one of these cases (and this is guaranteed).  So the
+`Option` has no overhead.
+
+# Closing thoughts, language-level support
+
+This would have been easier with language-level support.
+
+## What wouldn't help
+
+Although it [it was decreed][fiat-decree] that `Cell<T>` and `&mut
+T` are interchangeable, a `LocalKey<Cell<T>>` isn't quite the same
+thing as a `Cell<T>`, so it wouldn't be safe for the standard
+library to contain something like this:
+
+```
+impl<T> LocalKey<Cell<T>> {
+  fn get_mut(&self) -> &mut T {
+    unsafe {
+      // ... mem::transmute() voodoo goes here ...
+```
+
+The problem here is that you can call `LocalKey<Cell<T>>::get_mut()` twice and
+end up with two `&mut T`s that point to the same thing (mutable aliasing) which
+results in undefined behavior.
+
+## What would help
+
+The ideal solution is for Rust to let you call arbitrary methods
+`T::foo(&mut self...)` on a `LocalKey<Cell<T>>`.  This way you can
+have one (and only one) `&mut T` at any syntactical point in your
+program -- the `&mut self`.
+
+
+[tls-const-init]: https://github.com/rust-lang/rust/pull/90774
+[thread-local-unstable]: https://github.com/rust-lang/rust/issues/29594
+[thread-local-unsafe-generally]: https://github.com/rust-lang/rust/issues/54366
+[fiat-decree]: https://github.com/rust-lang/rust/issues/43038
+[from_mut]: https://doc.rust-lang.org/stable/std/cell/struct.Cell.html#method.from_mut
+[get_mut]: https://doc.rust-lang.org/stable/std/cell/struct.Cell.html#method.get_mut
+[thread-local-unsafe]: [https://github.com/rust-lang/rust/issues/54366]
+[deref-coercions]: https://doc.rust-lang.org/book/ch15-02-deref.html#implicit-deref-coercions-with-functions-and-methods
+[threadlocal-intrinsic]: https://llvm.org/docs/LangRef.html#llvm-threadlocal-address-intrinsic
diff --git a/tvix/docs/src/eval/bindings.md b/tvix/docs/src/eval/bindings.md
new file mode 100644
index 000000000000..4fb35b623580
--- /dev/null
+++ b/tvix/docs/src/eval/bindings.md
@@ -0,0 +1,134 @@
+# Compilation of bindings
+
+Compilation of Nix bindings is one of the most mind-bending parts of Nix
+evaluation. The implementation of just the compilation is currently almost 1000
+lines of code, excluding the various insane test cases we dreamt up for it.
+
+## What is a binding?
+
+In short, any attribute set or `let`-expression. Tvix currently does not treat
+formals in function parameters (e.g. `{ name ? "fred" }: ...`) the same as these
+bindings.
+
+They have two very difficult features:
+
+1. Keys can mutually refer to each other in `rec` sets or `let`-bindings,
+   including out of definition order.
+2. Attribute sets can be nested, and parts of one attribute set can be defined
+   in multiple separate bindings.
+
+Tvix resolves as much of this logic statically (i.e. at compile-time) as
+possible, but the procedure is quite complicated.
+
+## High-level concept
+
+The idea behind the way we compile bindings is to fully resolve nesting
+statically, and use the usual mechanisms (i.e. recursion/thunking/value
+capturing) for resolving dynamic values.
+
+This is done by compiling bindings in several phases:
+
+1. An initial compilation phase *only* for plain inherit statements (i.e.
+   `inherit name;`), *not* for namespaced inherits (i.e. `inherit (from)
+   name;`).
+
+2. A declaration-only phase, in which we use the compiler's scope tracking logic
+   to calculate the physical runtime stack indices (further referred to as
+   "stack slots" or just "slots") that all values will end up in.
+
+   In this phase, whenever we encounter a nested attribute set, it is merged
+   into a custom data structure that acts like a synthetic AST node.
+
+   This can be imagined similar to a rewrite like this:
+
+   ```nix
+   # initial code:
+   {
+       a.b = 1;
+       a.c = 2;
+   }
+
+   # rewritten form:
+   {
+       a = {
+           b = 1;
+           c = 2;
+       };
+   }
+   ```
+
+   The rewrite applies to attribute sets and `let`-bindings alike.
+
+   At the end of this phase, we know the stack slots of all namespaces for
+   inheriting from, all values inherited from them, and all values (and
+   optionally keys) of bindings at the current level.
+
+   Only statically known keys are actually merged, so any dynamic keys that
+   conflict will lead to a "key already defined" error at runtime.
+
+3. A compilation phase, in which all values (and, when necessary, keys) are
+   actually compiled. In this phase the custom data structure used for merging
+   is encountered when compiling values.
+
+   As this data structure acts like an AST node, the process begins recursively
+   for each nested attribute set.
+
+At the end of this process we have bytecode that leaves the required values (and
+optionally keys) on the stack. In the case of attribute sets, a final operation
+is emitted that constructs the actual attribute set structure at runtime. For
+`let`-bindings a final operation is emitted that removes these locals from the
+stack when the scope ends.
+
+## Moving parts
+
+```admonish caution
+This documents the *current* implementation. If you only care about the
+conceptual aspects, see above.
+```
+
+There's a few types involved:
+
+* `PeekableAttrs`: peekable iterator over an attribute path (e.g. `a.b.c`)
+* `BindingsKind`: enum defining the kind of bindings (attrs/recattrs/let)
+* `AttributeSet`: struct holding the bindings kind, the AST nodes with inherits
+  (both namespaced and not), and an internal representation of bindings
+  (essentially a vector of tuples of the peekable attrs and the expression to
+  compile for the value).
+* `Binding`: enum describing the kind of binding (namespaced inherit, attribute
+  set, plain binding of *any other value type*)
+* `KeySlot`: enum describing the location in which a key slot is placed at
+  runtime (nowhere, statically known value in a slot, dynamic value in a slot)
+* `TrackedBinding`: struct representing statically known information about a
+  single binding (its key slot, value slot and `Binding`)
+* `TrackedBindings`: vector of tracked bindings, which implements logic for
+  merging attribute sets together
+
+And quite a few methods on `Compiler`:
+
+* `compile_bindings`: entry point for compiling anything that looks like a
+  binding, this calls out to the functions below.
+* `compile_plain_inherits`: takes all inherits of a bindings node and compiles
+  the ones that are trivial to compile (i.e. just plain inherits without a
+  namespace). The `rnix` parser does not represent namespaced/plain inherits in
+  different nodes, so this function also aggregates the namespaced inherits and
+  returns them for further use
+* `declare_namespaced_inherits`: passes over all namespaced inherits and
+  declares them on the locals stack, as well as inserts them into the provided
+  `TrackedBindings`
+* `declare_bindings`: declares all regular key/value bindings in a bindings
+  scope, but without actually compiling their keys or values.
+
+  There's a lot of heavy lifting going on here:
+
+  1. It invokes the various pieces of logic responsible for merging nested
+     attribute sets together, creating intermediate data structures in the value
+     slots of bindings that can be recursively processed the same way.
+  2. It decides on the key slots of expressions based on the kind of bindings,
+     and the type of expression providing the key.
+* `bind_values`: runs the actual compilation of values. Notably this function is
+  responsible for recursively compiling merged attribute sets when it encounters
+  a `Binding::Set` (on which it invokes `compile_bindings` itself).
+
+In addition to these several methods (such as `compile_attr_set`,
+`compile_let_in`, ...) invoke the binding-kind specific logic and then call out
+to the functions above.
diff --git a/tvix/docs/src/eval/build-references.md b/tvix/docs/src/eval/build-references.md
new file mode 100644
index 000000000000..dd53f65d83aa
--- /dev/null
+++ b/tvix/docs/src/eval/build-references.md
@@ -0,0 +1,259 @@
+# Build references in derivations
+
+This document describes how build references are calculated in Tvix. Build
+references are used to determine which store paths should be available to a
+builder during the execution of a build (i.e. the full build closure of a
+derivation).
+
+## String contexts in C++ Nix
+
+In C++ Nix, each string value in the evaluator carries an optional so-called
+"string context".
+
+These contexts are themselves a list of strings that take one of the following
+formats:
+
+1. `!<output_name>!<drv_path>`
+
+   This format describes a build reference to a specific output of a derivation.
+
+2. `=<drv_path>`
+
+   This format is used for a special case where a derivation attribute directly
+   refers to a derivation path (e.g. by accessing `.drvPath` on a derivation).
+
+   ```admonish note
+   In C++ Nix this case is quite special and actually requires a store-database
+   query during evaluation.
+   ```
+
+3. `<path>` - a non-descript store path input, usually a plain source file (e.g.
+   from something like `src = ./.` or `src = ./foo.txt`).
+
+   In the case of `unsafeDiscardOutputDependency` this is used to pass a raw
+   derivation file, but *not* pull in its outputs.
+
+Lets introduce names for these (in the same order) to make them easier to
+reference below:
+
+```rust
+enum BuildReference {
+    /// !<output_name>!<drv_path>
+    SingleOutput(OutputName, DrvPath),
+
+    /// =<drv_path>
+    DrvClosure(DrvPath),
+
+    /// <path>
+    Path(StorePath),
+}
+```
+
+String contexts are, broadly speaking, created whenever a string is the result
+of a computation (e.g. string interpolation) that used a *computed* path or
+derivation in any way.
+
+Note: This explicitly does *not* include simply writing a literal string
+containing a store path (whether valid or not). That is only permitted through
+the `storePath` builtin.
+
+## Derivation inputs
+
+Based on the data above, the fields `inputDrvs` and `inputSrcs` of derivations
+are populated in `builtins.derivationStrict` (the function which
+`builtins.derivation`, which isn't actually a builtin, wraps).
+
+`inputDrvs` is represented by a map of derivation paths to the set of their
+outputs that were referenced by the context.
+
+TODO: What happens if the set is empty? Somebody claimed this means all outputs.
+
+`inputSrcs` is represented by a set of paths.
+
+These are populated by the above references as follows:
+
+* `SingleOutput` entries are merged into `inputDrvs`
+* `Path` entries are inserted into `inputSrcs`
+* `DrvClosure` leads to a special store computation (`computeFSClosure`), which
+  finds all paths referenced by the derivation and then inserts all of them into
+  the fields as above (derivations with _all_ their outputs)
+
+This is then serialised in the derivation and passed down the pipe.
+
+## Builtins interfacing with contexts
+
+C++ Nix has several builtins that interface directly with string contexts:
+
+* `unsafeDiscardStringContext`: throws away a string's string context (if
+  present)
+* `hasContext`: returns `true`/`false` depending on whether the string has
+  context
+* `unsafeDiscardOutputDependency`: drops dependencies on the *outputs* of a
+  `.drv` in the context, passing only the literal `.drv` itself
+
+  ```admonish note
+  This is only used for special test-cases in nixpkgs, and deprecated Nix
+  commands like `nix-push`.
+  ```
+* `getContext`: returns the string context in serialised form as a Nix attribute
+  set
+* `appendContext`: adds a given string context to the string in the same format
+  as returned by `getContext`
+
+Most of the string manipulation operations will propagate the context to the
+result based on their parameters' contexts.
+
+## Placeholders
+
+C++ Nix has `builtins.placeholder`, which given the name of an output (e.g.
+`out`) creates a hashed string representation of that output name. If that
+string is used anywhere in input attributes, the builder will replace it with
+the actual name of the corresponding output of the current derivation.
+
+C++ Nix does not use contexts for this, it blindly creates a rewrite map of
+these placeholder strings to the names of all outputs, and runs the output
+replacement logic on all environment variables it creates, attribute files it
+passes etc.
+
+## Tvix & string contexts
+
+In the past, Tvix did not track string contexts in its evaluator at all, see
+the historical section for more information about that.
+
+Tvix tracks string contexts in every `NixString` structure via a
+`HashSet<BuildReference>` and offers an API to combine the references while
+keeping the exact internal structure of that data private.
+
+## Historical attempt: Persistent reference tracking
+
+We were investigating implementing a system which allows us to drop string
+contexts in favour of reference scanning derivation attributes.
+
+This means that instead of maintaining and passing around a string context data
+structure in eval, we maintain a data structure of *known paths* from the same
+evaluation elsewhere in Tvix, and scan each derivation attribute against this
+set of known paths when instantiating derivations.
+
+We believed we could take the stance that the system of string contexts as
+implemented in C++ Nix is likely an implementation detail that should not be
+leaking to the language surface as it does now.
+
+### Tracking "known paths"
+
+Every time a Tvix evaluation does something that causes a store interaction, a
+"known path" is created. On the language surface, this is the result of one of:
+
+1. Path literals (e.g. `src = ./.`).
+2. Calls to `builtins.derivationStrict` yielding a derivation and its output
+   paths.
+3. Calls to `builtins.path`.
+
+Whenever one of these occurs, some metadata that persists for the duration of
+one evaluation should be created in Nix. This metadata needs to be available in
+`builtins.derivationStrict`, and should be able to respond to these queries:
+
+1. What is the set of all known paths? (used for e.g. instantiating an
+   Aho-Corasick type string searcher)
+2. What is the _type_ of a path? (derivation path, derivation output, source
+   file)
+3. What are the outputs of a derivation?
+4. What is the derivation of an output?
+
+These queries will need to be asked of the metadata when populating the
+derivation fields.
+
+```admonish note
+Depending on how we implement `builtins.placeholder`, it might be useful
+to track created placeholders in this metadata, too.
+```
+
+### Context builtins
+
+Context-reading builtins can be implemented in Tvix by adding `hasContext` and
+`getContext` with the appropriate reference-scanning logic. However, we should
+evaluate how these are used in nixpkgs and whether their uses can be removed.
+
+Context-mutating builtins can be implemented by tracking their effects in the
+value representation of Tvix, however we should consider not doing this at all.
+
+`unsafeDiscardOutputDependency` should probably never be used and we should warn
+or error on it.
+
+`unsafeDiscardStringContext` is often used as a workaround for avoiding IFD in
+inconvenient places (e.g. in the TVL depot pipeline generation). This is
+unnecessary in Tvix. We should evaluate which other uses exist, and act on them
+appropriately.
+
+The initial danger with diverging here is that we might cause derivation hash
+discrepancies between Tvix and C++ Nix, which can make initial comparisons of
+derivations generated by the two systems difficult. If this occurs we need to
+discuss how to approach it, but initially we will implement the mutating
+builtins as no-ops.
+
+### Why this did not work for us?
+
+Nix has a feature to perform environmental checks of your derivation, e.g.
+"these derivation outputs should not be referenced in this derivation", this was
+introduced in Nix 2.2 by
+https://github.com/NixOS/nix/commit/3cd15c5b1f5a8e6de87d5b7e8cc2f1326b420c88.
+
+Unfortunately, this feature introduced a very unfortunate and critical bug: all
+usage of this feature with contextful strings will actually force the
+derivation to depend at least at build time on those specific paths, see
+https://github.com/NixOS/nix/issues/4629.
+
+For example, if you wanted to `disallowedReferences` to a package and you used a
+derivation as a path, you would actually register that derivation as a input
+derivation of that derivation.
+
+This bug is still unfixed in Nix and it seems that fixing it would require
+introducing different ways to evaluate Nix derivations to preserve the
+output path calculation for Nix expressions so far.
+
+All of this would be fine if the bug behavior was uniform in the sense that no
+one tried to force-workaround it. Since Nixpkgs 23.05, due to
+https://github.com/NixOS/nixpkgs/pull/211783 this is not true anymore.
+
+If you let nixpkgs be the disjoint union of bootstrapping derivations $A$ and
+`stdenv.mkDerivation`-built derivations $B$.
+
+$A$ suffers from the bug and $B$ doesn't by the forced usage of
+`unsafeDiscardStringContext` on those special checking fields.
+
+This means that to build hash-compatible $A$ **and** $B$, we need to
+distinguish $A$ and $B$. A lot of hacks could be imagined to support this
+problem.
+
+Let's assume we have a solution to that problem, it means that we are able to
+detect implicitly when a set of specific fields are
+`unsafeDiscardStringContext`-ed.
+
+Thus, we could use that same trick to implement `unsafeDiscardStringContext`
+entirely for all fields actually.
+
+Now, to implement `unsafeDiscardStringContext` in the persistent reference
+tracking model, you will need to store a disallowed list of strings that should
+not trigger a reference when we are scanning a derivation parameters.
+
+But assume you have something like:
+
+```nix
+derivation {
+   buildInputs = [
+     stdenv.cc
+   ];
+
+   disallowedReferences = [ stdenv.cc ];
+}
+```
+
+If you unregister naively the `stdenv.cc` reference, it will silence the fact
+that it is part of the `buildInputs`, so you will observe that Nix will fail
+the derivation during environmental check, but Tvix would silently force remove
+that reference.
+
+Until proven otherwise, it seems highly difficult to have the fine-grained
+information to prevent reference tracking of those specific fields. It is not a
+failure of the persistent reference tracking, it is an unresolved critical bug
+of Nix that only nixpkgs really workarounded for `stdenv.mkDerivation`-based
+derivations.
diff --git a/tvix/docs/src/eval/builtins.md b/tvix/docs/src/eval/builtins.md
new file mode 100644
index 000000000000..d9fcd72ccab5
--- /dev/null
+++ b/tvix/docs/src/eval/builtins.md
@@ -0,0 +1,137 @@
+# Nix builtins
+
+Nix has a lot of built-in functions, some of which are accessible in
+the global scope, and some of which are only accessible through the
+global `builtins` attribute set.
+
+This document is an attempt to track all of these builtins, but
+without documenting their functionality.
+
+See also https://nixos.org/manual/nix/stable/expressions/builtins.html
+
+The `impl` column indicates implementation status in tvix:
+- implemented: "" (empty cell)
+- not yet implemented, but not blocked: `todo`
+- not yet implemented, but blocked by other prerequisites:
+  - `store`: awaiting eval<->store api(s)
+  - `context`: awaiting support for string contexts
+
+| name                          | global | arity | pure  | impl    |
+|-------------------------------|--------|-------|-------|---------|
+| abort                         | true   | 1     |       |         |
+| add                           | false  | 2     | true  |         |
+| addErrorContext               | false  | ?     |       | context |
+| all                           | false  | 2     | true  |         |
+| any                           | false  | 2     | true  |         |
+| appendContext                 | false  | ?     |       |         |
+| attrNames                     | false  | 1     | true  |         |
+| attrValues                    | false  |       | true  |         |
+| baseNameOf                    | true   |       |       |         |
+| bitAnd                        | false  |       |       |         |
+| bitOr                         | false  |       |       |         |
+| bitXor                        | false  |       |       |         |
+| builtins                      | true   |       |       |         |
+| catAttrs                      | false  |       |       |         |
+| compareVersions               | false  |       |       |         |
+| concatLists                   | false  |       |       |         |
+| concatMap                     | false  |       |       |         |
+| concatStringsSep              | false  |       |       |         |
+| currentSystem                 | false  |       |       |         |
+| currentTime                   | false  |       | false |         |
+| deepSeq                       | false  |       |       |         |
+| derivation                    | true   |       |       | store   |
+| derivationStrict              | true   |       |       | store   |
+| dirOf                         | true   |       |       |         |
+| div                           | false  |       |       |         |
+| elem                          | false  |       |       |         |
+| elemAt                        | false  |       |       |         |
+| false                         | true   |       |       |         |
+| fetchGit                      | true   |       |       | store   |
+| fetchMercurial                | true   |       |       | store   |
+| fetchTarball                  | true   |       |       | store   |
+| fetchurl                      | false  |       |       | store   |
+| filter                        | false  |       |       |         |
+| filterSource                  | false  |       |       | store   |
+| findFile                      | false  |       | false | todo    |
+| foldl'                        | false  |       |       |         |
+| fromJSON                      | false  |       |       |         |
+| fromTOML                      | true   |       |       |         |
+| functionArgs                  | false  |       |       |         |
+| genList                       | false  |       |       |         |
+| genericClosure                | false  |       |       | todo    |
+| getAttr                       | false  |       |       |         |
+| getContext                    | false  |       |       |         |
+| getEnv                        | false  |       | false |         |
+| hasAttr                       | false  |       |       |         |
+| hasContext                    | false  |       |       |         |
+| hashFile                      | false  |       | false |         |
+| hashString                    | false  |       |       |         |
+| head                          | false  |       |       |         |
+| import                        | true   |       |       |         |
+| intersectAttrs                | false  |       |       |         |
+| isAttrs                       | false  |       |       |         |
+| isBool                        | false  |       |       |         |
+| isFloat                       | false  |       |       |         |
+| isFunction                    | false  |       |       |         |
+| isInt                         | false  |       |       |         |
+| isList                        | false  |       |       |         |
+| isNull                        | true   |       |       |         |
+| isPath                        | false  |       |       |         |
+| isString                      | false  |       |       |         |
+| langVersion                   | false  |       |       |         |
+| length                        | false  |       |       |         |
+| lessThan                      | false  |       |       |         |
+| listToAttrs                   | false  |       |       |         |
+| map                           | true   |       |       |         |
+| mapAttrs                      | false  |       |       |         |
+| match                         | false  |       |       |         |
+| mul                           | false  |       |       |         |
+| nixPath                       | false  |       |       | todo    |
+| nixVersion                    | false  |       |       | todo    |
+| null                          | true   |       |       |         |
+| parseDrvName                  | false  |       |       |         |
+| partition                     | false  |       |       |         |
+| path                          | false  |       | sometimes | store |
+| pathExists                    | false  |       | false |         |
+| placeholder                   | true   |       |       | context |
+| readDir                       | false  |       | false |         |
+| readFile                      | false  |       | false |         |
+| removeAttrs                   | true   |       |       |         |
+| replaceStrings                | false  |       |       |         |
+| scopedImport                  | true   |       |       |         |
+| seq                           | false  |       |       |         |
+| sort                          | false  |       |       |         |
+| split                         | false  |       |       |         |
+| splitVersion                  | false  |       |       |         |
+| storeDir                      | false  |       |       | store   |
+| storePath                     | false  |       |       | store   |
+| stringLength                  | false  |       |       |         |
+| sub                           | false  |       |       |         |
+| substring                     | false  |       |       |         |
+| tail                          | false  |       |       |         |
+| throw                         | true   |       |       |         |
+| toFile                        | false  |       |       | store   |
+| toJSON                        | false  |       |       |         |
+| toPath                        | false  |       |       |         |
+| toString                      | true   |       |       |         |
+| toXML                         | true   |       |       |         |
+| trace                         | false  |       |       |         |
+| true                          | true   |       |       |         |
+| tryEval                       | false  |       |       |         |
+| typeOf                        | false  |       |       |         |
+| unsafeDiscardOutputDependency | false  |       |       |         |
+| unsafeDiscardStringContext    | false  |       |       |         |
+| unsafeGetAttrPos              | false  |       |       | todo    |
+| valueSize                     | false  |       |       | todo    |
+
+## Added after C++ Nix 2.3 (without Flakes enabled)
+
+| name          | global | arity | pure  | impl  |
+|---------------|--------|-------|-------|-------|
+| break         | false  | 1     |       | todo  |
+| ceil          | false  | 1     | true  |       |
+| fetchTree     | true   | 1     |       | todo  |
+| floor         | false  | 1     | true  |       |
+| groupBy       | false  | 2     | true  |       |
+| traceVerbose  | false  | 2     |       | todo  |
+| zipAttrsWith  | false  | 2     | true  | todo  |
diff --git a/tvix/docs/src/eval/catchable-errors.md b/tvix/docs/src/eval/catchable-errors.md
new file mode 100644
index 000000000000..ce320a921777
--- /dev/null
+++ b/tvix/docs/src/eval/catchable-errors.md
@@ -0,0 +1,131 @@
+# (Possible) Implementation(s) of Catchable Errors for `builtins.tryEval`
+
+## Terminology
+
+Talking about “catchable errors” in Nix in general is a bit precarious since
+there is no properly established terminology. Also, the existing terms are less
+than apt. The reason for this lies in the fact that catchable errors (or
+whatever you want to call them) don't properly _exist_ in the language: While
+Nix's `builtins.tryEval` is (originally) based on the C++ exception system,
+it specifically lacks the ability of such systems to have an exception _value_
+whilst handling it. Consequently, these errors don't have an obvious name
+as they never appear _in_ the Nix language. They just have to be named in the
+respective Nix implementation:
+
+- In C++ Nix the only term for such errors is `AssertionError` which is the
+  name of the (C++) exception used in the implementation internally. This
+  term isn't great, though, as `AssertionError`s can not only be generated
+  using `assert`, but also using `throw` and failed `NIX_PATH` resolutions.
+  Were this terminology to be used in documentation addressing Nix language
+  users, it would probably only serve confusion.
+
+- Tvix currently (as of r/7573) uses the term catchable errors. This term
+  relates to nothing in the language as such: Errors are not caught, we rather
+  try to evaluate an expression. Catching also sort of implies that a value
+  representation of the error is attainable (like in an exception system) which
+  is untrue.
+
+In light of this I (sterni) would like to suggest “tryable errors” as an
+alternative term going forward which isn't inaccurate and relates to terms
+already established by language internal naming.
+
+However, this document will continue using the term catchable error until the
+naming is adjusted in Tvix itself.
+
+## Implementation
+
+Below we discuss different implementation approaches in Tvix in order to arrive
+at a proposal for the new one. The historical discussion is intended as a basis
+for discussing the proposal: Are we committing to an old or current mistake? Are
+we solving all problems that cropped up or were solved at any given point in
+time?
+
+### Original
+
+The original implementation of `tryEval` in cl/6924 was quite straightforward:
+It would simply interrupt the propagation of a potential catchable error to the
+top level (which usually happened using the `?` operator) in the builtin and
+construct the appropriate representation of an unsuccessful evaluation if the
+error was deemed catchable. It had, however, multiple problems:
+
+- The VM was originally written without `tryEval` in mind, i.e. it largely
+  assumed that an error would always cause execution to be terminated. This
+  problem was later solved (cl/6940).
+- Thunks could not be `tryEval`-ed multiple times (b/281). This was another
+  consequence of VM architecture at the time: Thunks would be blackholed
+  before evaluation was started and the error could occur. Due to the
+  interaction of the generator-based VM code and `Value::force` the part
+  of the code altering the thunk state would never be informed about the
+  evaluation result in case of a failure, so the thunk would remain
+  blackholed leading to a crash if the same thunk was `tryEval`-ed or
+  forced again. To solve this issue, amjoseph completely overhauled
+  the implementation.
+
+One key point about this implementation is that it is based on the assumption
+that catchable errors can only be generated in thunks, i.e. expressions causing
+them are never evaluated strictly. This can be illustrated using C++ Nix:
+
+```console
+> nix-instantiate --eval -E '[ (assert false; true) (builtins.throw "") <nixpkgs> ]'
+[ <CODE> <CODE> <CODE> ]
+```
+
+If this wasn't the case, the VM could encounter the error in a situation where
+the error would not have needed to pass through the `tryEval` builtin, causing
+evaluation to abort.
+
+### Present
+
+The current system (mostly implemented in cl/9289) uses a very different
+approach: Instead of relying on the thunk boundary, catchable errors are no
+longer errors, but special values. They are created at the relevant points (e.g.
+`builtins.throw`) and propagated whenever they are encountered by VM ops or
+builtins. Finally, they either encounter `builtins.tryEval` (and are converted to
+an ordinary value again) or the top level where they become a normal error again.
+
+The problems with this mostly stem from the confusion between values and errors
+that it necessitates:
+
+- In most circumstances, catchable errors end up being errors again, as `tryEval`
+  is not used a lot. So `throw`s usually end up causing evaluation to abort.
+  Consequently, not only `Value::Catchable` is necessary, but also a corresponding
+  error variant that is _only_ created if a catchable value remains at the end of
+  evaluation. A requirement that was missed until cl/10991 (!) which illustrate
+  how strange that architecture is. A consequence of this is that catchable
+  errors have no location information at all.
+- `Value::Catchable` is similar to other internal values in Tvix, but is much
+  more problematic. Aside from thunks, internal values only exist for a brief
+  amount of time on the stack and it is very clear what parts of the VM or
+  builtins need to handle them. This means that the rest of the implementation
+  need to consider them, keeping the complexity caused by the internal value
+  low. `Value::Catchable`, on the other hand, may exist anywhere and be passed
+  to any VM op or builtin, so it needs to be correctly propagated _everywhere_.
+  This causes a lot of noise in the code as well as a big potential for bugs.
+  Essentially, catchable errors require as much attention by the Tvix developer
+  as laziness. This doesn't really correlate to the importance of the two
+  features to the Nix language.
+
+### Future?
+
+The core assumption of the original solution does offer a path forward: After
+cl/9289 we should be in a better position to introspect an error occurring from
+within the VM code, but we need a better way of storing such an error to prevent
+another b/281. If catchable errors can only be generated in thunks, we can just
+use the thunk representation for this. This would mean that `Thunk::force_`
+would need to check if evaluation was successful and (in case of failure)
+change the thunk representation
+
+- either to the original `ThunkRepr::Suspended` which would be simple, but of
+  course mean duplicated evaluation work in some expressions. In fact, this
+  would probably leave a lot of easy performance on the table for use cases we
+  would like to support, e.g. tree walkers for nixpkgs.
+- or to a new `ThunkRepr` variant that stores the kind of the error and all
+  necessary location info so stack traces can work properly. This of course
+  reintroduces some of the difficulty of having two kinds of errors, but it is
+  hopefully less problematic, as the thunk boundary (i.e. `Thunk::force`) is
+  where errors would usually occur.
+
+Besides the question whether this proposal can actually be implemented, another
+consideration is whether the underlying assumption will hold in the future, i.e.
+can we implement optimizations for thunk elimination in a way that thunks that
+generate catchable errors are never eliminated?
diff --git a/tvix/docs/src/eval/known-optimisation-potential.md b/tvix/docs/src/eval/known-optimisation-potential.md
new file mode 100644
index 000000000000..11babcb59ac1
--- /dev/null
+++ b/tvix/docs/src/eval/known-optimisation-potential.md
@@ -0,0 +1,161 @@
+# Known Optimisation Potential
+
+There are several areas of the Tvix evaluator code base where
+potentially large performance gains can be achieved through
+optimisations that we are already aware of.
+
+The shape of most optimisations is that of moving more work into the
+compiler to simplify the runtime execution of Nix code. This leads, in
+some cases, to drastically higher complexity in both the compiler
+itself and in invariants that need to be guaranteed between the
+runtime and the compiler.
+
+For this reason, and because we lack the infrastructure to adequately
+track their impact (WIP), we have not yet implemented these
+optimisations, but note the most important ones here.
+
+* Use "open upvalues" [hard]
+
+  Right now, Tvix will immediately close over all upvalues that are
+  created and clone them into the `Closure::upvalues` array.
+
+  Instead of doing this, we can statically determine most locals that
+  are closed over *and escape their scope* (similar to how the
+  `compiler::scope::Scope` struct currently tracks whether locals are
+  used at all).
+
+  If we implement the machinery to track this, we can implement some
+  upvalues at runtime by simply sticking stack indices in the upvalue
+  array and only copy the values where we know that they escape.
+
+* Avoid `with` value duplication [easy]
+
+  If a `with` makes use of a local identifier in a scope that can not
+  close before the with (e.g. not across `LambdaCtx` boundaries), we
+  can avoid the allocation of the phantom value and duplication of the
+  `NixAttrs` value on the stack. In this case we simply push the stack
+  index of the known local.
+
+* Multiple attribute selection [medium]
+
+  An instruction could be introduced that avoids repeatedly pushing an
+  attribute set to/from the stack if multiple keys are being selected
+  from it. This occurs, for example, when inheriting from an attribute
+  set or when binding function formals.
+
+* Split closure/function representation [easy]
+
+  Functions have fewer fields that need to be populated at runtime and
+  can directly use the `value::function::Lambda` representation where
+  possible.
+
+* Apply `compiler::optimise_select` to other set operations [medium]
+
+  In addition to selects, statically known attribute resolution could
+  also be used for things like `?` or `with`. The latter might be a
+  little more complicated but is worth investigating.
+
+* Inline fully applied builtins with equivalent operators [medium]
+
+  Some `builtins` have equivalent operators, e.g. `builtins.sub`
+  corresponds to the `-` operator, `builtins.hasAttr` to the `?`
+  operator etc. These operators additionally compile to a primitive
+  VM opcode, so they should be just as cheap (if not cheaper) as
+  a builtin application.
+
+  In case the compiler encounters a fully applied builtin (i.e.
+  no currying is occurring) and the `builtins` global is unshadowed,
+  it could compile the equivalent operator bytecode instead: For
+  example, `builtins.sub 20 22` would be compiled as `20 - 22`.
+  This would ensure that equivalent `builtins` can also benefit
+  from special optimisations we may implement for certain operators
+  (in the absence of currying). E.g. we could optimise access
+  to the `builtins` attribute set which a call to
+  `builtins.getAttr "foo" builtins` should also profit from.
+
+* Avoid nested `VM::run` calls [hard]
+
+  Currently when encountering Nix-native callables (thunks, closures)
+  the VM's run loop will nest and return the value of the nested call
+  frame one level up. This makes the Rust call stack almost mirror the
+  Nix call stack, which is usually undesirable.
+
+  It is possible to detect situations where this is avoidable and
+  instead set up the VM in such a way that it continues and produces
+  the desired result in the same run loop, but this is kind of tricky
+  to get right - especially while other parts are still in flux.
+
+  For details consult the commit with Gerrit change ID
+  `I96828ab6a628136e0bac1bf03555faa4e6b74ece`, in which the initial
+  attempt at doing this was reverted.
+
+* Avoid thunks if only identifier closing is required [medium]
+
+  Some constructs, like `with`, mostly do not change runtime behaviour
+  if thunked. However, they are wrapped in thunks to ensure that
+  deferred identifiers are resolved correctly.
+
+  This can be avoided, as we statically analyse the scope and should
+  be able to tell whether any such logic was required.
+
+* Intern literals [easy]
+
+  Currently, the compiler emits a separate entry in the constant
+  table for each literal.  So the program `1 + 1 + 1` will have
+  three entries in its `Chunk::constants` instead of only one.
+
+* Do some list and attribute set operations in place [hard]
+
+  Algorithms that can not do a lot of work inside `builtins` like `map`,
+  `filter` or `foldl'` usually perform terribly if they use data structures like
+  lists and attribute sets.
+
+  `builtins` can do work in place on a copy of a `Value`, but naïvely expressed
+  recursive algorithms will usually use `//` and `++` to do a single change to a
+  `Value` at a time, requiring a full copy of the data structure each time.
+  It would be a big improvement if we could do some of these operations in place
+  without requiring a new copy.
+
+  There are probably two approaches: We could determine statically if a value is
+  reachable from elsewhere and emit a special in place instruction if not. An
+  easier alternative is probably to rely on reference counting at runtime: If no
+  other reference to a value exists, we can extend the list or update the
+  attribute set in place.
+
+  An **alternative** to this is using [persistent data
+  structures](https://en.wikipedia.org/wiki/Persistent_data_structure) or at the
+  very least [immutable data structures](https://docs.rs/im/latest/im/) that can
+  be copied more efficiently than the stock structures we are using at the
+  moment.
+
+* Skip finalising unfinalised thunks or non-thunks instead of crashing [easy]
+
+  Currently `OpFinalise` crashes the VM if it is called on values that don't
+  need to be finalised. This helps catching miscompilations where `OpFinalise`
+  operates on the wrong `StackIdx`. In the case of function argument patterns,
+  however, this means extra VM stack and instruction overhead for dynamically
+  determining if finalisation is necessary or not. This wouldn't be necessary
+  if `OpFinalise` would just noop on any values that don't need to be finalised
+  (anymore).
+
+* Phantom binding for from expression of inherits [easy]
+
+  The from expression of an inherit is reevaluated for each inherit. This can
+  be demonstrated using the following Nix expression which, counter-intuitively,
+  will print “plonk” twice.
+
+  ```nix
+  let
+    inherit (builtins.trace "plonk" { a = null; b = null; }) a b;
+  in
+  builtins.seq a (builtins.seq b null)
+  ```
+
+  In most Nix code, the from expression is just an identifier, so it is not
+  terribly inefficient, but in some cases a more expensive expression may
+  be used. We should create a phantom binding for the from expression that
+  is reused in the inherits, so only a single thunk is created for the from
+  expression.
+
+  Since we discovered this, C++ Nix has implemented a similar optimization:
+  <https://github.com/NixOS/nix/pull/9847>.
diff --git a/tvix/docs/src/eval/language-issues.md b/tvix/docs/src/eval/language-issues.md
new file mode 100644
index 000000000000..152e6594a1d0
--- /dev/null
+++ b/tvix/docs/src/eval/language-issues.md
@@ -0,0 +1,46 @@
+# Nix language issues
+
+In the absence of a language standard, what Nix (the language) is, is prescribed
+by the behavior of the C++ Nix implementation. Still, there are reasons not to
+accept some behavior:
+
+* Tvix aims for nixpkgs compatibility only. This means we can ignore behavior in
+  edge cases nixpkgs doesn't trigger as well as obscure features it doesn't use
+  (e.g. `__overrides`).
+* Some behavior of the Nix evaluator seems to be unintentional or an
+  implementation detail leaking out into language behavior.
+
+Especially in the latter case, it makes sense to raise the respective issue and
+maybe to get rid of the behavior in all implementations for good. Below is an
+(incomplete) list of such issues:
+
+* [Behaviour of nested attribute sets depends on definition order][i7111]
+* [Partially constructed attribute sets are observable during dynamic attr names construction][i7012]
+* [Nix parsers merges multiple attribute set literals for the same key incorrectly depending on definition order][i7115]
+
+On the other hand, there is behavior that seems to violate one's expectation
+about the language at first, but has good enough reasons from an implementor's
+perspective to keep them:
+
+* Dynamic keys are forbidden in `let` and `inherit`. This makes sure that we
+  only need to do runtime identifier lookups for `with`. More dynamic (i.e.
+  runtime) lookups would make the scoping system even more complicated as well
+  as hurt performance.
+* Dynamic attributes of `rec` sets are not added to its scope. This makes sense
+  for the same reason.
+* Dynamic and nested attributes in attribute sets don't get merged. This is a
+  tricky one, but avoids doing runtime (recursive) merges of attribute sets.
+  Instead all necessary merging can be inferred statically, i.e. the C++ Nix
+  implementation already merges at parse time, making nested attribute keys
+  syntactic sugar effectively.
+
+Other behavior is just odd, surprising or underdocumented:
+
+* `builtins.foldl'` doesn't force the initial accumulator (but all other
+  intermediate accumulator values), differing from e.g. Haskell, see
+  the [relevant PR discussion][p7158].
+
+[i7111]: https://github.com/NixOS/nix/issues/7111
+[i7012]: https://github.com/NixOS/nix/issues/7012
+[i7115]: https://github.com/NixOS/nix/issues/7115
+[p7158]: https://github.com/NixOS/nix/pull/7158
diff --git a/tvix/docs/src/eval/opcodes-attrsets.md b/tvix/docs/src/eval/opcodes-attrsets.md
new file mode 100644
index 000000000000..7026f3319dda
--- /dev/null
+++ b/tvix/docs/src/eval/opcodes-attrsets.md
@@ -0,0 +1,122 @@
+# attrset-opcodes
+
+The problem with attrset literals is twofold:
+
+1. The keys of attribute sets may be dynamically evaluated.
+
+   Access:
+
+   ```nix
+   let
+     k = "foo";
+     attrs = { /* etc. */ };
+   in attrs."${k}"
+   ```
+
+   Literal:
+   ```nix
+   let
+     k = "foo";
+   in {
+     "${k}" = 42;
+   }
+   ```
+
+   The problem with this is that the attribute set key is not known at
+   compile time, and needs to be dynamically evaluated by the VM as an
+   expression.
+
+   For the most part this should be pretty simple, assuming a
+   theoretical instruction set:
+
+   ```
+   0000  OP_CONSTANT(0) # key "foo"
+   0001  OP_CONSTANT(1) # value 42
+   0002  OP_ATTR_SET(1) # construct attrset from 2 stack values
+   ```
+
+   The operation pushing the key needs to be replaced with one that
+   leaves a single value (the key) on the stack, i.e. the code for the
+   expression, e.g.:
+
+   ```
+   0000..000n <operations leaving a string value on the stack>
+   000n+1     OP_CONSTANT(1) # value 42
+   000n+2     OP_ATTR_SET(1) # construct attrset from 2 stack values
+   ```
+
+   This is fairly easy to do by simply recursing in the compiler when
+   the key expression is encountered.
+
+2. The keys of attribute sets may be nested.
+
+   This is the non-trivial part of dealing with attribute set
+   literals. Specifically, the nesting can be arbitrarily deep and the
+   AST does not guarantee that related set keys are located
+   adjacently.
+
+   Furthermore, this frequently occurs in practice in Nix. We need a
+   bytecode representation that makes it possible to construct nested
+   attribute sets at runtime.
+
+   Proposal: AttrPath values
+
+   If we can leave a value representing an attribute path on the
+   stack, we can offload the construction of nested attribute sets to
+   the `OpAttrSet` operation.
+
+   Under the hood, OpAttrSet in practice constructs a `Map<NixString,
+   Value>` attribute set in most cases. This means it expects to pop
+   the value of the key of the stack, but is otherwise free to do
+   whatever it wants with the underlying map.
+
+   In a simple example, we could have code like this:
+
+   ```nix
+   {
+     a.b = 15;
+   }
+   ```
+
+   This would be compiled to a new `OpAttrPath` instruction that
+   constructs and pushes an attribute path from a given number of
+   fragments (which are popped off the stack).
+
+   For example,
+
+   ```
+   0000 OP_CONSTANT(0)  # key "a"
+   0001 OP_CONSTANT(1)  # key "b"
+   0002 OP_ATTR_PATH(2) # construct attrpath from 2 fragments
+   0003 OP_CONSTANT(2)  # value 42
+   0004 OP_ATTRS(1)     # construct attrset from one pair
+   ```
+
+   Right before `0004` the stack would be left like this:
+
+   [ AttrPath[a,b], 42 ]
+
+   Inside of the `OP_ATTRS` instruction we could then begin
+   construction of the map and insert the nested attribute sets as
+   required, as well as validate that there are no duplicate keys.
+
+3. Both of these cases can occur simultaneously, but this is not a
+   problem as the opcodes combine perfectly fine, e.g.:
+
+   ```nix
+   let
+     k = "a";
+   in {
+     "${k}".b = 42;
+   }
+   ```
+
+   results in
+
+   ```
+   0000..000n <operations leaving a string value on the stack>
+   000n+1     OP_CONSTANT(1)  # key "b"
+   000n+2     OP_ATTR_PATH(2) # construct attrpath from 2 fragments
+   000n+3     OP_CONSTANT(2)  # value 42
+   000n+4     OP_ATTR_SET(1)  # construct attrset from 2 stack values
+   ```
diff --git a/tvix/docs/src/eval/recursive-attrs.md b/tvix/docs/src/eval/recursive-attrs.md
new file mode 100644
index 000000000000..5ce1cb2b64ff
--- /dev/null
+++ b/tvix/docs/src/eval/recursive-attrs.md
@@ -0,0 +1,67 @@
+# Recursive attribute sets
+
+The construction behaviour of recursive attribute sets is very
+specific, and a bit peculiar.
+
+In essence, there are multiple "phases" of scoping that take place
+during attribute set construction:
+
+1. Every inherited value without an explicit source is inherited only
+   from the **outer** scope in which the attribute set is enclosed.
+
+2. A new scope is opened in which all recursive keys are evaluated.
+   This only considers **statically known keys**, attributes can
+   **not** recurse into dynamic keys in `self`!
+
+   For example, this code is invalid in C++ Nix:
+
+   ```
+   nix-repl> rec { ${"a"+""} = 2; b = a * 10; }
+   error: undefined variable 'a' at (string):1:26
+   ```
+
+3. Finally, a third scope is opened in which dynamic keys are
+   evaluated.
+
+This behaviour, while possibly a bit strange and unexpected, actually
+simplifies the implementation of recursive attribute sets in Tvix as
+well.
+
+Essentially, a recursive attribute set like this:
+
+```nix
+rec {
+  inherit a;
+  b = a * 10;
+  ${"c" + ""} = b * 2;
+}
+```
+
+Can be compiled like the following expression:
+
+```nix
+let
+  inherit a;
+in let
+  b = a * 10;
+  in {
+    inherit a b;
+    ${"c" + ""} = b * 2;
+  }
+```
+
+Completely deferring the resolution of recursive identifiers to the
+existing handling of recursive scopes (i.e. deferred access) in let
+bindings.
+
+In practice, we can further specialise this and compile each scope
+directly into the form expected by `OpAttrs` (that is, leaving
+attribute names on the stack) before each value's position.
+
+C++ Nix's Implementation
+------------------------
+
+* [`ExprAttrs`](https://github.com/NixOS/nix/blob/2097c30b08af19a9b42705fbc07463bea60dfb5b/src/libexpr/nixexpr.hh#L241-L268)
+  (AST representation of attribute sets)
+* [`ExprAttrs::eval`](https://github.com/NixOS/nix/blob/075bf6e5565aff9fba0ea02f3333c82adf4dccee/src/libexpr/eval.cc#L1333-L1414)
+* [`addAttr`](https://github.com/NixOS/nix/blob/master/src/libexpr/parser.y#L98-L156) (`ExprAttrs` construction in the parser)
diff --git a/tvix/docs/src/eval/vm-loop.md b/tvix/docs/src/eval/vm-loop.md
new file mode 100644
index 000000000000..a75c7eec31df
--- /dev/null
+++ b/tvix/docs/src/eval/vm-loop.md
@@ -0,0 +1,314 @@
+# tvix-eval VM loop
+
+This document describes the new tvix-eval VM execution loop implemented in the
+chain focusing around cl/8104.
+
+## Background
+
+The VM loop implemented in Tvix prior to cl/8104 had several functions:
+
+1. Advancing the instruction pointer for a chunk of Tvix bytecode and
+   executing instructions in a loop until a result was yielded.
+
+2. Tracking Nix call frames as functions/thunks were entered/exited.
+
+3. Catching trampoline requests returned from instructions to force suspended
+   thunks without increasing stack size *where possible*.
+
+4. Handling trampolines through an inner trampoline loop, switching between a
+   code execution mode and execution of subsequent trampolines.
+
+This implementation of the trampoline logic was added on to the existing VM,
+which previously always recursed for thunk forcing. There are some cases (for
+example values that need to be forced *inside* of the execution of a builtin)
+where trampolines could not previously be used, and the VM recursed anyways.
+
+As a result of this trampoline logic being added "on top" of the existing VM
+loop the code became quite difficult to understand. This led to several bugs,
+for example: b/251, b/246, b/245, and b/238.
+
+These bugs were tricky to deal with, as we had to try and make the VM do
+things that are somewhat difficult to fit into its model. We could of course
+keep extending the trampoline logic to accommodate all sorts of concepts (such
+as finalisers), but that seems like it does not solve the root problem.
+
+## New VM loop
+
+In cl/8104, a unified new solution is implemented with which the VM is capable
+of evaluating everything without increasing the call stack size.
+
+This is done by introducing a new frame stack in the VM, on which execution
+frames are enqueued that are either:
+
+1. A bytecode frame, consisting of Tvix bytecode that evaluates compiled Nix
+   code.
+2. A generator frame, consisting of some VM logic implemented in pure Rust
+   code that can be *suspended* when it hits a point where the VM would
+   previously need to recurse.
+
+We do this by making use of the `async` *keyword* in Rust, but notably
+*without* introducing asynchronous I/O or concurrency in tvix-eval (the
+complexity of which is currently undesirable for us).
+
+Specifically, when writing a Rust function that uses the `async` keyword, such
+as:
+
+```rust
+async fn some_builtin(input: Value) -> Result<Value, ErrorKind> {
+  let mut out = NixList::new();
+
+  for element in input.to_list()? {
+    let result = do_something_that_requires_the_vm(element).await;
+    out.push(result);
+  }
+
+  Ok(out)
+}
+```
+
+The compiler actually generates a state-machine under-the-hood which allows
+the execution of that function to be *suspended* whenever it hits an `await`.
+
+We use the [`genawaiter`][] crate that gives us a data structure and simple
+interface for getting instances of these state machines that can be stored in
+a struct (in our case, a *generator frame*).
+
+The execution of the VM then becomes the execution of an *outer loop*, which
+is responsible for selecting the next generator frame to execute, and two
+*inner loops*, which drive the execution of a bytecode frame or generator
+frame forward until it either yields a value or asks to be suspended in favour
+of another frame.
+
+All "communication" between frames happens solely through values left on the
+stack: Whenever a frame of either type runs to completion, it is expected to
+leave a *single* value on the stack. It follows that the whole VM, upon
+completion of the last (or initial, depending on your perspective) frame
+yields its result as the return value.
+
+The core of the VM restructuring is cl/8104, unfortunately one of the largest
+single commit changes we've had to make yet, as it touches pretty much all
+areas of tvix-eval. The introduction of the generators and the
+message/response system we built to request something from the VM, suspend a
+generator, and wait for the return is in cl/8148.
+
+The next sections describe in detail how the three different loops work.
+
+### Outer VM loop
+
+The outer VM loop is responsible for selecting the next frame to run, and
+dispatching it correctly to inner loops, as well as determining when to shut
+down the VM and return the final result.
+
+```
+                          ╭──────────────────╮
+                 ╭────────┤ match frame kind ├──────╮
+                 │        ╰──────────────────╯      │
+                 │                                  │
+    ┏━━━━━━━━━━━━┷━━━━━┓                ╭───────────┴───────────╮
+───►┃ frame_stack.pop()┃                ▼                       ▼
+    ┗━━━━━━━━━━━━━━━━━━┛       ┏━━━━━━━━━━━━━━━━┓      ┏━━━━━━━━━━━━━━━━━┓
+                 ▲             ┃ bytecode frame ┃      ┃ generator frame ┃
+                 │             ┗━━━━━━━━┯━━━━━━━┛      ┗━━━━━━━━┯━━━━━━━━┛
+                 │[yes, cont.]          │                       │
+                 │                      ▼                       ▼
+    ┏━━━━━━━━┓   │             ╔════════════════╗      ╔═════════════════╗
+◄───┨ return ┃   │             ║ inner bytecode ║      ║ inner generator ║
+    ┗━━━━━━━━┛   │             ║      loop      ║      ║      loop       ║
+        ▲        │             ╚════════╤═══════╝      ╚════════╤════════╝
+        │   ╭────┴─────╮                │                       │
+        │   │ has next │                ╰───────────┬───────────╯
+   [no] ╰───┤  frame?  │                            │
+            ╰────┬─────╯                            ▼
+                 │                         ┏━━━━━━━━━━━━━━━━━┓
+                 │                         ┃ frame completed ┃
+                 ╰─────────────────────────┨  or suspended   ┃
+                                           ┗━━━━━━━━━━━━━━━━━┛
+```
+
+Initially, the VM always pops a frame from the frame stack and then inspects
+the type of frame it found. As a consequence the next frame to execute is
+always the frame at the top of the stack, and setting up a VM initially for
+code execution is done by leaving a bytecode frame with the code to execute on
+the stack and passing control to the outer loop.
+
+Control is dispatched to either of the inner loops (depending on the type of
+frame) and the cycle continues once they return.
+
+When an inner loop returns, it has either finished its execution (and left its
+result value on the *value stack*), or its frame has requested to be
+suspended.
+
+Frames request suspension by re-enqueueing *themselves* through VM helper
+methods, and then leaving the frame they want to run *on top* of themselves in
+the frame stack before yielding control back to the outer loop.
+
+The inner control loops inform the outer loops about whether the frame has
+been *completed* or *suspended* by returning a boolean.
+
+### Inner bytecode loop
+
+The inner bytecode loop drives the execution of some Tvix bytecode by
+continously looking at the next instruction to execute, and dispatching to the
+instruction handler.
+
+```
+   ┏━━━━━━━━━━━━━┓
+◄──┨ return true ┃
+   ┗━━━━━━━━━━━━━┛
+          ▲
+     ╔════╧═════╗
+     ║ OpReturn ║
+     ╚══════════╝
+          ▲
+          ╰──┬────────────────────────────╮
+             │                            ▼
+             │                 ╔═════════════════════╗
+    ┏━━━━━━━━┷━━━━━┓           ║ execute instruction ║
+───►┃ inspect next ┃           ╚══════════╤══════════╝
+    ┃  instruction ┃                      │
+    ┗━━━━━━━━━━━━━━┛                      │
+             ▲                      ╭─────┴─────╮
+             ╰──────────────────────┤ suspends? │
+                       [no]         ╰─────┬─────╯
+                                          │
+                                          │
+   ┏━━━━━━━━━━━━━━┓                       │
+◄──┨ return false ┃───────────────────────╯
+   ┗━━━━━━━━━━━━━━┛              [yes]
+```
+
+With this refactoring, the compiler now emits a special `OpReturn` instruction
+at the end of bytecode chunks. This is a signal to the runtime that the chunk
+has completed and that its current value should be returned, without having to
+perform instruction pointer arithmetic.
+
+When `OpReturn` is encountered, the inner bytecode loop returns control to the
+outer loop and informs it (by returning `true`) that the bytecode frame has
+completed.
+
+Any other instruction may also request a suspension of the bytecode frame (for
+example, instructions that need to force a value). In this case the inner loop
+is responsible for setting up the frame stack correctly, and returning `false`
+to inform the outer loop of the suspension
+
+### Inner generator loop
+
+The inner generator loop is responsible for driving the execution of a
+generator frame by continously calling [`Gen::resume`][] until it requests a
+suspension (as a result of which control is returned to the outer loop), or
+until the generator is done and yields a value.
+
+```
+   ┏━━━━━━━━━━━━━┓
+◄──┨ return true ┃ ◄───────────────────╮
+   ┗━━━━━━━━━━━━━┛                     │
+                                       │
+                               [Done]  │
+                    ╭──────────────────┴─────────╮
+                    │ inspect generator response │◄────────────╮
+                    ╰──────────────────┬─────────╯             │
+                            [yielded]  │              ┏━━━━━━━━┷━━━━━━━━┓
+                                       │              ┃ gen.resume(msg) ┃◄──
+                                       ▼              ┗━━━━━━━━━━━━━━━━━┛
+                                 ╭────────────╮                ▲
+                                 │ same-frame │                │
+                                 │  request?  ├────────────────╯
+                                 ╰─────┬──────╯      [yes]
+   ┏━━━━━━━━━━━━━━┓                    │
+◄──┨ return false ┃ ◄──────────────────╯
+   ┗━━━━━━━━━━━━━━┛                [no]
+```
+
+On each execution of a generator frame, `resume_with` is called with a
+[`VMResponse`][] (i.e. a message *from* the VM *to* the generator). For a newly
+created generator, the initial message is just `Empty`.
+
+A generator may then respond by signaling that it has finished execution
+(`Done`), in which case the inner generator loop returns control to the outer
+loop and informs it that this generator is done (by returning `true`).
+
+A generator may also respond by signaling that it needs some data from the VM.
+This is implemented through a request-response pattern, in which the generator
+returns a `Yielded` message containing a [`VMRequest`][]. These requests can be
+very simple ("Tell me the current store path") or more complex ("Call this Nix
+function with these values").
+
+Requests are divided into two classes: Same-frame requests (requests that can be
+responded to *without* returning control to the outer loop, i.e. without
+executing a *different* frame), and multi-frame generator requests. Based on the
+type of request, the inner generator loop will either handle it right away and
+send the response in a new `resume_with` call, or return `false` to the outer
+generator loop after setting up the frame stack.
+
+Most of this logic is implemented in cl/8148.
+
+[`Gen::resume`]: https://docs.rs/genawaiter/0.99.1/genawaiter/rc/struct.Gen.html#method.resume_with
+[`VMRequest`]: https://cs.tvl.fyi/depot@2696839770c1ccb62929ff2575a633c07f5c9593/-/blob/tvix/eval/src/vm/generators.rs?L44
+[`VMResponse`]: https://cs.tvl.fyi/depot@2696839770c1ccb62929ff2575a633c07f5c9593/-/blob/tvix/eval/src/vm/generators.rs?L169
+
+## Advantages & Disadvantages of the approach
+
+This approach has several advantages:
+
+* The execution model is much simpler than before, making it fairly
+  straightforward to build up a mental model of what the VM does.
+
+* All "out of band requests" inside the VM are handled through the same
+  abstraction (generators).
+
+* Implementation is not difficult, albeit a little verbose in some cases (we
+  can argue about whether or not to introduce macros for simplifying it).
+
+* Several parts of the VM execution are now much easier to document,
+  potentially letting us onboard tvix-eval contributors faster.
+
+* The linear VM execution itself is much easier to trace now, with for example
+  the `RuntimeObserver` (and by extension `tvixbolt`) giving much clearer
+  output now.
+
+But it also comes with some disadvantages:
+
+* Even though we "only" use the `async` keyword without a full async-I/O
+  runtime, we still encounter many of the drawbacks of the fragmented Rust
+  async ecosystem.
+
+  The biggest issue with this is that parts of the standard library become
+  unavailable to us, for example the built-in `Vec::sort_by` can no longer be
+  used for sorting in Nix because our comparators themselves are `async`.
+
+  This led us to having to implement some logic on our own, as the design of
+  `async` in Rust even makes it difficult to provide usecase-generic
+  implementations of concepts like sorting.
+
+* We need to allocate quite a few new structures on the heap in order to drive
+  generators, as generators involve storing `Future` types (with unknown
+  sizes) inside of structs.
+
+  In initial testing this seems to make no significant difference in
+  performance (our performance in an actual nixpkgs-eval is still bottlenecked
+  by I/O concerns and reference scanning), but is something to keep in mind
+  later on when we start optimising more after the low-hanging fruits have
+  been reaped.
+
+## Alternatives considered
+
+1. Tacking on more functionality onto the existing VM loop
+   implementation to accomodate problems as they show up. This is not
+   preferred as the code is already getting messy.
+
+2. Making tvix-eval a fully `async` project, pulling in something like Tokio
+   or `async-std` as a runtime. This is not preferred due to the massively
+   increased complexity of those solutions, and all the known issues of fully
+   buying in to the async ecosystem.
+
+   tvix-eval fundamentally should work for use-cases besides building Nix
+   packages (e.g. for `//tvix/serde`), and its profile should be as slim as
+   possible.
+
+3. Convincing the Rust developers that Rust needs a way to guarantee
+   constant-stack-depth tail calls through something like a `tailcall`
+   keyword.
+
+4. ... ?
+
+[`genawaiter`]: https://docs.rs/genawaiter/
diff --git a/tvix/docs/src/figures/component-flow.puml b/tvix/docs/src/figures/component-flow.puml
new file mode 100644
index 000000000000..5b6d79b82313
--- /dev/null
+++ b/tvix/docs/src/figures/component-flow.puml
@@ -0,0 +1,60 @@
+@startuml
+
+title Tvix build flow
+
+actor User
+participant CLI
+participant "Coordinator" as Coord
+participant "Evaluator" as Eval
+database Store
+participant "Builder" as Build
+
+note over CLI,Eval
+    Typically runs locally on the invoking machine
+end note
+/ note over Store, Build
+    Can be either local or remote
+end note
+
+User-->CLI: User initiates build of `hello` (analogous to `nix-build -f '<nixpkgs>' -A hello`)
+
+CLI-->Coord: CLI invokes coordinator
+
+Coord-->Eval: Sends message to start evaluation of `<nixpkgs>` (path lookup) with attribute `hello`
+note right: The paths to the evaluator are local file system paths
+
+Coord<--Eval: Yields derivations to be built
+note right
+    Immediately starts streaming derivations as they are instantiated across
+    the dependency graph so they can be built while the evaluation is still running.
+
+    There are two types of build requests: One for regular "fire and forget" builds,
+    and another for IFD (import from derivation).
+
+    These are distinct because IFD needs to be fed back into the evaluator for
+    further processing while a regular build does not.
+end note
+
+loop while has more derivations
+
+    Coord-->Store: Check if desired paths are in store
+    alt Store has path
+        Coord<--Store: Success response
+    else Store does not have path
+        Coord-->Build: Request derivation to be built
+
+        alt Build failure
+            Coord<--Build: Fail response
+            note left: It's up to the coordinator whether to exit on build failure
+        else Build success
+            Build-->Store: Push outputs to store
+            Build<--Coord: Send success & pushed response
+        end
+
+    end
+end
+
+CLI<--Coord: Respond success/fail
+User<--CLI: Exit success/fail
+
+@enduml
diff --git a/tvix/docs/src/getting-started.md b/tvix/docs/src/getting-started.md
new file mode 100644
index 000000000000..1cbb6de7d4f7
--- /dev/null
+++ b/tvix/docs/src/getting-started.md
@@ -0,0 +1,59 @@
+# Getting Started
+
+## Getting the code, a developer shell, & building the CLI
+
+Tvix can be built with the Rust standard `cargo build`. A Nix shell is provided
+with the correctly-versioned tooling to build.
+
+### TVL monorepo
+
+```console
+$ git clone https://code.tvl.fyi/depot.git
+$ cd depot
+```
+
+[Direnv][] is highly recommended in order to enable [`mg`][mg], a tool for
+workflows in monorepos. Follow the [Direnv installation
+instructions][direnv-inst], then after it’s set up continue with:
+
+```console
+$ direnv allow
+$ mg shell //tvix:shell
+$ cd tvix
+$ cargo build
+```
+
+### Or just Tvix
+
+At present, this option isn’t suitable for contributions & lacks the tooling of
+the monorepo, but still provides a `shell.nix` which can be used for building
+the Tvix project.
+
+```console
+$ git clone https://code.tvl.fyi/depot.git:workspace=views/tvix.git
+$ cd tvix
+$ nix-shell
+$ cargo build
+```
+
+
+# Builds & tests
+
+All projects are built using [Nix][] to avoid ‘build pollution’ via the user’s
+local environment.
+
+If you have Nix installed and are contributing to a project tracked in this
+repository, you can usually build the project by calling `nix-build -A
+path.to.project`.
+
+For example, to build a project located at `//tools/foo` you would call
+`nix-build -A tools.foo`
+
+If the project has tests, check that they still work before submitting your
+change.
+
+
+[Direnv]: https://direnv.net
+[direnv-inst]: https://direnv.net/docs/installation.html
+[Nix]: https://nixos.org/nix/
+[mg]: https://code.tvl.fyi/tree/tools/magrathea
diff --git a/tvix/docs/src/introduction.md b/tvix/docs/src/introduction.md
new file mode 100644
index 000000000000..744fbeec9fbe
--- /dev/null
+++ b/tvix/docs/src/introduction.md
@@ -0,0 +1,23 @@
+# Introduction
+
+Tvix  (\[tvɪks\], [🔈][pronunciation]) is a new Rust implementation of the
+components of the [Nix package manager][Nix].
+
+Tvix’s modularity & composability allows recombining its parts in novel ways.
+It also provides library access to Nix data formats and concepts. In the
+long-run, Tvix aims to produce a Nixpkgs-compatible alternative to NixCpp
+with respects to evaluation and building Nix expressions & systems.
+
+Tvix still is in its early stages of development, **you cannot yet use it as a
+Nix replacement**. However, if you willing to roll up your sleeves and pipe
+together some existing functionality, it may already provide most of what is
+needed for your usecase! [Get in touch](./community.md) if you want to 
+collaborate or contribute.
+
+Tvix is developed as a GPLv3-licensed free software project with
+source code available in the [TVL monorepo][].
+
+[Nix]: https://nixos.org
+[TVL]: https://tvl.fyi
+[TVL monorepo]: https://cs.tvl.fyi/depot/-/tree/tvix
+[pronunciation]: data:audio/mpeg;base64,SUQzBAAAAAAAI1RTU0UAAAAPAAADTGF2ZjU4Ljc2LjEwMAAAAAAAAAAAAAAA//NgxAAdEr31SGDGIQAoBRVe/7Yuv8SvC9C3dwMwOCEAE6YG/Cc/fQkSu5PE0Jzr6cRKhH6IEIIiIhOyU5Xdxbk5/0+vpvppXc/hQkrn7iCrn6e/XPield3Mq5oXh3cOLjgB3/+Zl4HAMuH/mAZpM8w8PDw9wAIk//5//gAODz/D1QNAKDLoeuUxZd2VeX0qV80WaVrSQech1Qpo//NixBofrA4YAVQoAYBzKPexmkFBrpGj2IrFxAh2NqwmjI9Bz+zGah1nVTsQQYlvF69rIzoTfLO6sqxMenRnYhhQe0yrQUNoTIrIymeh9/WtTnRjsk4wUEzE/10PE3ibzlRnQc6/o9BICCbzEVRN/E4CAAcgABBIgEtvMot7RXUwz1qnlmvx3zszWyTQ3PnVX6qRmWjvMuWn9CQGyP/zYsQrJuN+NYuYWAEASB1virzQzUvD2SBsJ598MYyYWUpZ6qA6mEkHbSSIDhm8/H9plcHHrYez50ll+Wj//uv7ZT1GfzU3fbbr+n7LPuVZHL9SPn+nww5M3d//3WdmV5O//9HYa2v9/MO7/iPY+tbPP3mqOf7M9YkkMSav2ub42nG43G5HIvvggKFUSYCII1JJlw5Z6Zl7q1gf4Yn/82LEHyfjCo29jHgCUQzqTIkkWtXFjjE6fsipzrFmZQMjI5ItTsH9c63TUWysQ5U0gKuGfMHW9xc1on58bw/fZTZkXamGN4tIM0bGPq0SlY8sJwn//v8/eNU1//j5zvGMb1u8e9/hz3nUPeb4v////r5+a1/9//97/3nesw9/3pKBi////wi4LtEAQQBKQdcrtttdriljiVQZDRBu//NgxA8jssa+X4xAAIyZIQuhQmsVXnSZQfa38AQAguDwKgVRGIoN7QuWGlkB4o01SRTOVz0IMe20lLfhYMST5BsMuNazInz4SXvROUIqeXKGlRF+tFu/ufB9OrGVnjnFYHB/ZL/3//yicvMVv8JzcxLo90MU0OFzaD9BIV4nxgAM8o54spq1vWJ3J1euBsqXgjzNIQwkDXFa+z0M//NixA8jY36IAZlAAE0Rhou7HpU1l3MchEFXAVQ+LzxKHghCoadED8LrAiMaa3xRZJAknkov/0FXWIS2baSv/8BUUcW+QUiR1evWF///Bt0o6RW5m5hYJUYsXF8p////0IR5Zrg8JXOOYVF8gZJHCk1zqTKx/Hf/NfEKJk//Q4OOvjk2lgr2KwVRuroANCW22220WC22WxxSJFJw///zYsQRIPpawl+PQAJIok7ku2H1FcavOpWYUpCxe5s1Q+UMGBAYIxQMDgDA7pVwZaCAHim/+9y90RRhX/+fFaUgdCMJty+f/76mfduCSBt2zTfz///0ic173oKSszA1maHKny9l0yXDjzAfxax9QnSpfGXFwuopIpFDapMcuhPXAR3d0afa5G6KRfINNDCox6CHM0gRpAHhcjpg5S//82LEHSehUqZ/2MAAstHIU01fRJTCUN0W5ONikDty+mkOcMdZggMiyaKrk5FopDHCDIREY5AUvBrQy6mEtXwj5KF2NnjqmcNPxI1hFYIw4cDPo7jsORGJyxhvu7VJ9JSU9PfHgh+s+kasgvIDmlycQOTKARwslggLg44MRq0e9zQPs94laXjb3mzJ4UOxRZAAVyUswcgDasxCuqXL//NgxA4fmbakXspMYGEihpTRnTNbuO2ScS7G8HnUYI0VUba51KySxTFPhAGIBWhZaUlpDwIKEUEFhRRIzMs0un31sV8pgYkDIo/////NZ7kCXBR5z/SmcfCtV7CBYROck2wt1PNAIWBpH20Ch06Mi7nAk6+0XTcL0j1jHvMKBjRANkjutdRwilbDIuehzRxbsWWi8O2LtKEAnFGg//NixB4Y2VKpHsDEnBAxM9VjUzUTxwpSxo3AKtq31VWBlKwUfHv8MKcPFQF6xg8NDlvb/jYab0ka3Q02RWr6nnoat/kxFMsfSd9N1V70VMQEQlUAAWVIF9yCKxm5BjQQyFCjvEAhZkFiwpgEl+ZTMQ9VBAo8lZIkcRIsmUt2JOCoEsO02W3zMnWjMttVnOJbLY1fZOAgf61FKRAJy//zYsRKHYIKclTJhJz/EmyjLKhmM5jG/m/MKFyxPj5WLwlshMRM9RJbjSEpKJ2XajtyqPrsVoMplgElrTccgAtkE8gNDDUgCYAXcJ+KZwji3XvujuyuqqkxkbfGZ1DKtgZj2KNQokgJo0LNS43+WVCkYca9L7GVPh3PpOFRqX6l/5/3Rur//7Lw/2/gXzNXUmb/8+r+cqrG1/jHV1L/82LEZB9r2lmdUxgBY12NW/8yra/VWCY2onY2OqTCqWoDlFYuNJIHSADU0zPoXEANHC/CJQ9VXi2VkPWYu/o3ey675hhm83g4Gol1zFaZnNPdWtnvpzz3pm23+qfMRrP3/6/r/X0+hh85+nmX/Xfm9z9jxwgQSQmIOEPp/VLtr+7q30XaQOBi7SH6gvr/6+/759gUGYNgwPy9etHb//NgxHYY48Zc8484AJ+xN8+cpj5c1N6jT6efgxULyUJDRo8HjxIoqkgnU6rJKkyo1MmEVdlbq8/tio2Xzz+92+/hh7Y2PfDnnZb////tivb//ppv///3GtRNroKtdtiYhOyKSA54dcWIJUKiA8CqGhH/EQqaTS22Kh0VuREVT2/1jkddqAETOd02plawNh7D0gUlIdr7skBZn36Z//NixKEgqp6NlYxYAILyK/Me3NLnji/AGNiP/ppZtSKjlJ0l2MObV3mZmlKRXWLGCQYQn9E46Jzv2Dzpxf/ysWLHKsCQYdizplszP2DBZSnMOXSKJbJb9FksHm6vsrKhodn6+PayULvD6VXqeH9uhIKMilwo0yP2CBnvLSy5ndd/2V52KZe9tOpbuyd50yIzNGgAOzgxhghLnrj73v/zYsSuHxo60l/PYAIQvV0+Y8c5Eve9aCLA7KyvyIpHHTD8kJXBY/XQ6tJix57v/z7aytZXaxOQCgTtIiBnYRmvAn2ZvJfF28zb+Z/Wrwcqb/5/Lgd8RmsQIfWbvKmJZVsISTlOJrSMeiTp8eUdLeIww1K0tPobChnijzJqdpCJGzAMMm+137+u1ZW2uNLKYbv3Gp79FOb9/28bG1f/82LEwR+bMspeEZOsn/dzWfGbF0+9GL7VV/TnVRMHHkLI1ESlvkmsXHOKdIxhqJB5JP0v3//d3ec/7ZetjPpTRMCX/o6Cn9pNqMqqp20AIAUhbGQ2DwQQQajwGmTvpPPSJnvJq0tNMO2ctFOry7oUse5nGs7ig8jCIYQ3qdHYjFUUgVysGNUxCFFnQIsEwuheniLPf/0BXDnTKQBN//NgxNIfOz7PHjBM8CgRwalCwMUpnK7E5qBQEbDK2TUjYjWH2k3dej3zKqqtTHLqs9vnyUABQryVhSrrkyQmUKEyNw88ZKsHE3objjf+74httv+/7btZmU0T2JS6eKit8fLYwttZwfiZViggsacmKkdh5ktgIc8eqhp1WK2SNdYROKUApEUkD5b///7JlyI0S////2igt0ZUAoGR//NixOQd0rLHHlmEPAgysKqZnSLShsPjwV5YYAMPciIi7Ci6nWZJTz0MYSRsSxCiSQ4VJOUHQ7EQ9wkfzy/kPTrqVTK7NYDABwwlZGkyjLS80AYda6qhOy6TSym7epLitq8UJx4QIyIy1Hp4tNC4hSzspu9rPQ/P/9ZVIn0qoAIfdBUESW1UKGLSz4/pEzRC1EQ6kweN/JKHRUpUe//zYsT8KGruolJL04QXyh0uNArGozLD+BhSxsBVGq6QCMKqsGYMBMrUgqki2HXQme1PcskWvW74a8ra7QoAigIobgqjopUJPqbNBY8/DxS92FTwxupQx5wodcqHbcTeCAXqmqkinH6iNJHaKmXLGKEJhKncFgtEGSwkgSSOXMhMm5/3nEpmStNNRqu3p21/LaxJL1RjKUv/Khuu8yP/82DE6iJa2pZMSUfE0e5SmVHKxvSa6AMK0cpeIjjGM6sxUOVjKIh0ylZSpV8xqCQc3A1lRkqr88y7rO1aKlwMtBNoJmhGGCDxgh6UJfVM2PXVKmLM5lkah6mdmczj0nZS6sSjUtgKHoakUZGigFACs7Sl4SZKW8wDBTbOU6IMuq/75Tvn5yU53/OJJUcSx5kjM+fsYU+Z//USxW//82LE7yQTFmmCwYswDAQ86RpBWGiqw1Dp3DqwVATjoifDskVQVcoKB0NHutyw0Gmekq5p7JVuUeqqTEFNRaqqUJagECihZAmgmpE40ouLi2v/mRkfRrLJUcZlDBQoYGCDo5GyhgcdDIjWWWW//////mrKGBgg7oZMoYGiORqGCggYMFUcjL/7LmasoYGCDhI6GTLLZZZZb5GrWSx0//NixO4ikg4wCtGFFDJlWVDJlYMFBAwjoZFLP/8mVrLY5GrWSo5MrUKCBwxKtUxBTUUzLjEwMFVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVf/zYsTtH1vZLBQwRq1VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVU=
diff --git a/tvix/docs/src/lang-version.md b/tvix/docs/src/lang-version.md
new file mode 100644
index 000000000000..c288274c9105
--- /dev/null
+++ b/tvix/docs/src/lang-version.md
@@ -0,0 +1,62 @@
+# Nix language version history
+
+The Nix language (“Nix”) has its own versioning mechanism independent from its
+most popular implementation (“C++ Nix”): `builtins.langVersion`. It has been
+increased whenever the language has changed syntactically or semantically in a
+way that would not be introspectable otherwise. In particular, this does not
+include addition (or removal) of `builtins`, as this can be introspected using
+standard attribute set operations.
+
+Changes to `builtins.langVersion` are best found by viewing the git history of
+C++ Nix using `git log -G 'mkInt\\(v, [0-9]\\)'` for `builtins.langVersion` < 7.
+After that point `git log -G 'v\\.mkInt\\([0-9]+\\)'` should work. To reduce the
+amount of false positives, specify the version number you are interested in
+explicitly.
+
+## 1
+
+The first version of the Nix language is its state at the point when
+`builtins.langVersion` was added in [8b8ee53] which was first released
+as part of C++ Nix 1.2.
+
+## 2
+
+Nix version 2 changed the behavior of `builtins.storePath`: It would now [try to
+substitute the given path if missing][storePath-substitute], instead of creating
+an evaluation failure. `builtins.langVersion` was increased in [e36229d].
+
+## 3
+
+Nix version 3 changed the behavior of the `==` behavior. Strings would now be
+considered [equal even if they had differing string context][equal-no-ctx].
+
+## 4
+
+Nix version 4 [added the float type][float] to the language.
+
+## 5
+
+The [increase of `builtins.langVersion` to 5][langVersion-5] did not signify a
+language change, but added support for structured attributes to the Nix daemon.
+Eelco Dolstra writes as to what changed:
+
+> The structured attributes support. Unfortunately that's not so much a language
+> change as a build.cc (i.e. daemon) change, but we don't really have a way to
+> express that...
+
+Maybe `builtins.nixVersion` (which was added in version 1) should have been
+used instead. In any case, the [only `langVersion` check][nixpkgs-langVersion-5]
+in nixpkgs verifies a lower bound of 5.
+
+## 6
+
+Nix version 6 added support for [comparing two lists][list-comparison].
+
+[8b8ee53]: https://github.com/nixos/nix/commit/8b8ee53bc73769bb25d967ba259dabc9b23e2e6f
+[storePath-substitute]: https://github.com/nixos/nix/commit/22d665019a3770148929b7504c73bcdbe025ec12
+[e36229d]: https://github.com/nixos/nix/commit/e36229d27f9ab508e0abf1892f3e8c263d2f8c58
+[equal-no-ctx]: https://github.com/nixos/nix/commit/ee7fe64c0ac00f2be11604a2a6509eb86dc19f0a
+[float]: https://github.com/nixos/nix/commit/14ebde52893263930cdcde1406cc91cc5c42556f
+[langVersion-5]: https://github.com/nixos/nix/commit/8191992c83bf4387b03c5fdaba818dc2b520462d
+[list-comparison]: https://github.com/nixos/nix/commit/09471d2680292af48b2788108de56a8da755d661
+[nixpkgs-langVersion-5]: https://github.com/NixOS/nixpkgs/blob/d7ac3423d321b8b145ccdd1aed9dfdb280f5e391/pkgs/build-support/closure-info.nix#L11
diff --git a/tvix/docs/src/language-spec.md b/tvix/docs/src/language-spec.md
new file mode 100644
index 000000000000..b3908b2cf48a
--- /dev/null
+++ b/tvix/docs/src/language-spec.md
@@ -0,0 +1,69 @@
+# Specification of the Nix Language
+
+```admonish attention
+This document is a work in progress. Please keep an eye on
+[`topic:nix-spec`](https://cl.tvl.fyi/q/topic:nix-spec) for ongoing
+CLs.
+```
+
+Nix is a general-purpose, functional programming language which this
+document aims to describe.
+
+## Background
+
+Nix was designed and implemented as part of the [Nix package
+manager](https://nixos.org/nix). It is primarily used for generating
+so-called [*derivations*](#derivations), which are data structures
+describing how to build a package.
+
+The language has been described in the
+[thesis](https://edolstra.github.io/pubs/phd-thesis.pdf) introducing
+the package manager, but only on a high-level. At the time of writing,
+Nix is informally specified (via its only complete implementation in
+the package manager) and there is no complete overview over its -
+sometimes surprising - semantics.
+
+The primary project written in Nix is
+[nixpkgs](https://github.com/NixOS/nixpkgs/). Uncertainties in the
+process of writing this specification are resolved by investigating
+patterns in nixpkgs, which we consider canonical. The code in nixpkgs
+uses a reasonable subset of the features exposed by the current
+implementation, some of which are *accidental*, and is thus more
+useful for specifying how the language should work.
+
+## Introduction to Nix
+
+Nix is a general-purpose, partially lazy, functional programming
+language which provides higher-order functions, type reflection,
+primitive data types such as integers, strings and floats, and
+compound data structures such as lists and attribute sets.
+
+Nix has syntactic sugar for common operations, such as those for
+attribute sets, and also provides a wide range of built-in functions
+which have organically accumulated over time.
+
+Nix has a variety of legacy features that are not in practical use,
+but are documented in sections of this specification for the sake of
+completeness.
+
+This document describes the syntax and abstract semantics of the Nix
+language, but leaves out implementation details about how Nix can be
+interpreted/compiled/analysed etc.
+
+### Program structure
+
+This section describes the semantic structure of Nix, and how it
+relates to the rest of the specification.
+
+Each Nix program is a single [*expression*](#expressions) denoting a
+[*value*](#values) (commonly a [*function*](#functions)). Each value
+has a [*type*](#types), however this type is not statically known.
+
+Nix code is modularised through the use of the
+[*import*](#builtins-import) built-in function. No separate module
+system exists.
+
+In addition to chapters describing the building blocks mentioned
+above, this specificiation also describes the [*syntax*](#syntax), the
+available [built-in functions](#builtins), [*error handling*](#errors)
+and known [*deficiencies*](#deficiencies) in the language.
diff --git a/tvix/docs/src/nix-daemon/changelog.md b/tvix/docs/src/nix-daemon/changelog.md
new file mode 100644
index 000000000000..41c168374c50
--- /dev/null
+++ b/tvix/docs/src/nix-daemon/changelog.md
@@ -0,0 +1,202 @@
+
+
+## Nix version protocol
+
+| Nix version     | Protocol |
+| --------------- | -------- |
+| 0.11            | 1.02     |
+| 0.12            | 1.04     |
+| 0.13            | 1.05     |
+| 0.14            | 1.05     |
+| 0.15            | 1.05     |
+| 0.16            | 1.06     |
+| 1.0             | 1.10     |
+| 1.1             | 1.11     |
+| 1.2             | 1.12     |
+| 1.3 - 1.5.3     | 1.13     |
+| 1.6 - 1.10      | 1.14     |
+| 1.11 - 1.11.16  | 1.15     |
+| 2.0 - 2.0.4     | 1.20     |
+| 2.1 - 2.3.18    | 1.21     |
+| 2.4 - 2.6.1     | 1.32     |
+| 2.7.0           | 1.33     |
+| 2.8.0 - 2.14.1  | 1.34     |
+| 2.15.0 - 2.19.4 | 1.35     |
+| 2.20.0 - 2.22.0 | 1.37     |
+
+In commit [be64fbb501][be64fbb501] support was droped for protocol versions older than 1.10.
+This happened when the protocol was between 1.17 and 1.18 and was released with Nix 2.0.
+So this means that any version of Nix 2.x can't talk to Nix 0.x.
+
+## Operation History
+
+| Op              | Id | Commit         | Protocol | Nix Version | Notes |
+| --------------- | -- | -------------- | -------- | ----------- | ----- |
+| *Quit           | 0  | [a711689368][a711689368] || 0.11 | Became dead code in [7951c3c54][7951c3c54] (Nix 0.11) and removed in [d3c61d83b][d3c61d83b] (Nix 1.8) |
+| IsValidPath     | 1  | [a711689368][a711689368] || 0.11 ||
+| HasSubstitutes  | 3  | [0565b5f2b3][0565b5f2b3] || 0.11 | Obsolete [09a6321aeb][09a6321aeb]<br>Nix 1.2 Protocol 1.12 |
+| QueryPathHash   | 4  | [0565b5f2b3][0565b5f2b3] || 0.11 | Obsolete [e0204f8d46][e0204f8d46]<br>Nix 2.0 Protocol 1.16 |
+| QueryReferences | 5  | [0565b5f2b3][0565b5f2b3] || 0.11 | Obsolete [e0204f8d46][e0204f8d46]<br>Nix 2.0 Protocol 1.16 |
+| QueryReferrers  | 6  | [0565b5f2b3][0565b5f2b3] || 0.11 ||
+| AddToStore      | 7  | [0263279071][0263279071] || 0.11 ||
+| AddTextToStore  | 8  | [0263279071][0263279071] || 0.11 | Obsolete [c602ebfb34][c602ebfb34]<br>Nix 2.4 Protocol 1.25 |
+| BuildPaths      | 9  | [0565b5f2b3][0565b5f2b3] || 0.11 ||
+| EnsurePath      | 10 | [0565b5f2b3][0565b5f2b3] || 0.11 ||
+| AddTempRoot     | 11 | [e25fad691a][e25fad691a] || 0.11 ||
+| AddIndirectRoot | 12 | [74033a844f][74033a844f] || 0.11 ||
+| SyncWithGC      | 13 | [e25fad691a][e25fad691a] || 0.11 | Obsolete [9947f1646a][9947f1646a]<br> Nix 2.5.0 Protocol 1.32 |
+| FindRoots       | 14 | [29cf434a35][29cf434a35] || 0.11 ||
+| *CollectGarbage | 15 | [a9c4f66cfb][a9c4f66cfb] || 0.11 | Removed [a72709afd8][a72709afd8]<br>Nix 0.12 Protocol 1.02 |
+| ExportPath      | 16 | [0f5da8a83c][0f5da8a83c] || 0.11 | Obsolete [538a64e8c3][538a64e8c3]<br>Nix 2.0 Protocol 1.17 |
+| *ImportPath     | 17 | [0f5da8a83c][0f5da8a83c] || 0.11 | Removed [273b288a7e][273b288a7e]<br>Nix 1.0 Protocol 1.09 |
+| QueryDeriver    | 18 | [6d1a1191b0][6d1a1191b0] || 0.11 | Obsolete [e0204f8d46][e0204f8d46]<br>Nix 2.0 Protocol 1.16 |
+| SetOptions      | 19 | [f3441e6122][f3441e6122] || 0.11 ||
+| CollectGarbage              | 20 | [a72709afd8][a72709afd8] | 1.02  | 0.12   ||
+| QuerySubstitutablePathInfo  | 21 | [03427e76f1][03427e76f1] | 1.02  | 0.12   ||
+| QueryDerivationOutputs      | 22 | [e42401ee7b][e42401ee7b] | 1.05  | 1.0    | Obsolete [d38f860c3e][d38f860c3e]<br>Nix 2.4 Protocol 1.22* |
+| QueryAllValidPaths          | 23 | [24035b98b1][24035b98b1] | 1.05  | 1.0    ||
+| *QueryFailedPaths            | 24 | [f92c9a0ac5][f92c9a0ac5] | 1.05  | 1.0    | Removed [8cffec848][8cffec848]<br>Nix 2.0 Protocol 1.16 |
+| *ClearFailedPaths            | 25 | [f92c9a0ac5][f92c9a0ac5] | 1.05  | 1.0    | Removed [8cffec848][8cffec848]<br>Nix 2.0 Protocol 1.16 |
+| QueryPathInfo               | 26 | [1db6259076][1db6259076] | 1.06  | 1.0    ||
+| ImportPaths                 | 27 | [273b288a7e][273b288a7e] | 1.09  | 1.0    | Obsolete [538a64e8c3][538a64e8c3]<br>Nix 2.0 Protocol 1.17 |
+| QueryDerivationOutputNames  | 28 | [af2e53fd48][af2e53fd48]<br>([194d21f9f6][194d21f9f6]) | 1.08      | 1.0 | Obsolete<br>[045b07200c][045b07200c]<br>Nix 2.4 Protocol 1.21 |
+| QueryPathFromHashPart       | 29 | [ccc52adfb2][ccc52adfb2] | 1.11  | 1.1    ||
+| QuerySubstitutablePathInfos | 30 | [eb3036da87][eb3036da87] | 1.12* | 1.2    ||
+| QueryValidPaths             | 31 | [58ef4d9a95][58ef4d9a95] | 1.12  | 1.2    ||
+| QuerySubstitutablePaths     | 32 | [09a6321aeb][09a6321aeb] | 1.12  | 1.2    ||
+| QueryValidDerivers          | 33 | [2754a07ead][2754a07ead] | 1.13* | 1.3    ||
+| OptimiseStore               | 34 | [8fb8c26b6d][2754a07ead] | 1.14  | 1.8    ||
+| VerifyStore                 | 35 | [b755752f76][b755752f76] | 1.14  | 1.9    ||
+| BuildDerivation             | 36 | [71a5161365][71a5161365] | 1.14  | 1.10   ||
+| AddSignatures               | 37 | [d0f5719c2a][d0f5719c2a] | 1.16  | 2.0    ||
+| NarFromPath                 | 38 | [b4b5e9ce2f][b4b5e9ce2f] | 1.17  | 2.0    ||
+| AddToStoreNar               | 39 | [584f8a62de][584f8a62de] | 1.17  | 2.0    ||
+| QueryMissing                | 40 | [ba20730b3f][ba20730b3f] | 1.19* | 2.0    ||
+| QueryDerivationOutputMap    | 41 | [d38f860c3e][d38f860c3e] | 1.22* | 2.4    ||
+| RegisterDrvOutput           | 42 | [58cdab64ac][58cdab64ac] | 1.27  | 2.4    ||
+| QueryRealisation            | 43 | [58cdab64ac][58cdab64ac] | 1.27  | 2.4    ||
+| AddMultipleToStore          | 44 | [fe1f34fa60][fe1f34fa60] | 1.32* | 2.4    ||
+| AddBuildLog                 | 45 | [4dda1f92aa][4dda1f92aa] | 1.32  | 2.6.0  ||
+| BuildPathsWithResults       | 46 | [a4604f1928][a4604f1928] | 1.34* | 2.8.0  ||
+| AddPermRoot                 | 47 | [226b0f3956][226b0f3956] | 1.36* | 2.20.0 ||
+
+Notes: Ops that start with * have been removed.
+Protocol version that ends with * was bumped while adding that operation. Otherwise protocol version referes to the protocol version at the time the operation was added (so only at the next protocol version can you assume the operation is present/removed/obsolete since it was added/removed/obsoleted between protocol versions).
+
+## Protocol version change log
+
+- 1.01 [f3441e6122][f3441e6122] Initial Version
+- 1.02 [c370755583][c370755583] Use build hook
+- 1.03 [db4f4a8425][db4f4a8425] Backward compatibility check
+- 1.04 [96598e7b06][96598e7b06] SetOptions buildVerbosity
+- 1.05 [60ec75048a][60ec75048a] SetOptions useAtime & maxAtime
+- 1.06 [6846ed8b44][6846ed8b44] SetOptions buildCores
+- 1.07 [bdf089f463][bdf089f463] QuerySubstitutablePathInfo narSize
+- 1.08 [b1eb252172][b1eb252172] STDERR_ERROR exit status
+- 1.09 [e0bd307802][e0bd307802] ImportPath not supported on versions older than 1.09
+- 1.10 [db5b86ef13][db5b86ef13] SetOptions build-use-substitutess
+- 1.11 [4bc4da331a][4bc4da331a] open connection reserveSpace
+- 1.12 [eb3036da87][eb3036da87] Implement QuerySubstitutablePathInfos
+- 1.13 [2754a07ead][2754a07ead] Implement QueryValidDerivers
+- 1.14 [a583a2bc59][a583a2bc59] open connection cpu affinity
+- 1.15 [d1e3bf01bc][d1e3bf01bc] BuildPaths buildMode
+- 1.16 [9cee600c88][9cee600c88] QueryPathInfo ultimate & sigs
+- 1.17 [ddea253ff8][ddea253ff8] QueryPathInfo returns valid bool
+- 1.18 [4b8f1b0ec0][4b8f1b0ec0] Select between AddToStoreNar and ImportPaths
+- 1.19 [ba20730b3f][ba20730b3f] Implement QueryMissing
+- 1.20 [cfc8132391][cfc8132391] Don't send activity and result logs to old clients
+- 1.21 [6185d25e52][6185d25e52] AddToStoreNar uses TunnelLogger for data
+- 1.22 [d38f860c3e][d38f860c3e] Implement QueryDerivationOutputMap and obsolete QueryDerivationOutputs
+- 1.23 [4c0077a07d][4c0077a07d] AddToStoreNar uses FramedSink/-Source for data
+- 1.24 [5ccd94501d][5ccd94501d] Allow trustless building of CA derivations
+- 1.25 [e34fe47d0c][e34fe47d0c] New implementation of AddToStore
+- 1.26 [c43e882f54][c43e882f54] STDERR_ERROR serialize exception
+- 1.27 [3a63fc6cd5][3a63fc6cd5] QueryValidPaths substitute flag
+- 1.28 [27b5747ca7][27b5747ca7] BuildDerivation returns builtOutputs
+- 1.29 [9d309de0de][9d309de0de] BuildDerivation returns timesBuilt, isNonDeterministic, startTime & stopTime
+- 1.30 [e5951a6b2f][e5951a6b2f] Bump version number for DerivedPath changes
+- 1.31 [a8416866cf][a8416866cf] RegisterDrvOutput & QueryRealisation send realisations as JSON
+- 1.32 [fe1f34fa60][fe1f34fa60] Implement AddMultipleToStore
+- 1.33 [35dbdbedd4][35dbdbedd4] open connection sends nix version
+- 1.34 [a4604f1928][a4604f1928] Implement BuildPathsWithResults
+- 1.35 [9207f94582][9207f94582] open connection sends trusted option
+- 1.36 [226b0f3956][226b0f3956] Implement AddPermRoot
+- 1.37 [1e3d811840][1e3d811840] Serialize BuildResult send cpuUser & cpuSystem
+
+
+
+[0263279071]: https://github.com/NixOS/nix/commit/0263279071
+[03427e76f1]: https://github.com/NixOS/nix/commit/03427e76f1
+[045b07200c]: https://github.com/NixOS/nix/commit/045b07200c
+[0565b5f2b3]: https://github.com/NixOS/nix/commit/0565b5f2b3
+[09a6321aeb]: https://github.com/NixOS/nix/commit/09a6321aeb
+[0f5da8a83c]: https://github.com/NixOS/nix/commit/0f5da8a83c
+[194d21f9f6]: https://github.com/NixOS/nix/commit/194d21f9f6
+[1db6259076]: https://github.com/NixOS/nix/commit/1db6259076
+[1e3d811840]: https://github.com/NixOS/nix/commit/1e3d811840
+[24035b98b1]: https://github.com/NixOS/nix/commit/24035b98b1
+[226b0f3956]: https://github.com/NixOS/nix/commit/226b0f3956
+[273b288a7e]: https://github.com/NixOS/nix/commit/273b288a7e
+[2754a07ead]: https://github.com/NixOS/nix/commit/2754a07ead
+[27b5747ca7]: https://github.com/NixOS/nix/commit/27b5747ca7
+[29cf434a35]: https://github.com/NixOS/nix/commit/29cf434a35
+[35dbdbedd4]: https://github.com/NixOS/nix/commit/35dbdbedd4
+[3a63fc6cd5]: https://github.com/NixOS/nix/commit/3a63fc6cd5
+[4b8f1b0ec0]: https://github.com/NixOS/nix/commit/4b8f1b0ec0
+[4bc4da331a]: https://github.com/NixOS/nix/commit/4bc4da331a
+[4c0077a07d]: https://github.com/NixOS/nix/commit/4c0077a07d
+[4dda1f92aa]: https://github.com/NixOS/nix/commit/4dda1f92aa
+[538a64e8c3]: https://github.com/NixOS/nix/commit/538a64e8c3
+[584f8a62de]: https://github.com/NixOS/nix/commit/584f8a62de
+[58cdab64ac]: https://github.com/NixOS/nix/commit/58cdab64ac
+[58ef4d9a95]: https://github.com/NixOS/nix/commit/58ef4d9a95
+[5ccd94501d]: https://github.com/NixOS/nix/commit/5ccd94501d
+[60ec75048a]: https://github.com/NixOS/nix/commit/60ec75048a
+[6185d25e52]: https://github.com/NixOS/nix/commit/6185d25e52
+[6846ed8b44]: https://github.com/NixOS/nix/commit/6846ed8b44
+[6d1a1191b0]: https://github.com/NixOS/nix/commit/6d1a1191b0
+[71a5161365]: https://github.com/NixOS/nix/commit/71a5161365
+[74033a844f]: https://github.com/NixOS/nix/commit/74033a844f
+[7951c3c54]: https://github.com/NixOS/nix/commit/7951c3c54
+[8cffec848]: https://github.com/NixOS/nix/commit/8cffec848
+[8fb8c26b6d]: https://github.com/NixOS/nix/commit/8fb8c26b6d
+[9207f94582]: https://github.com/NixOS/nix/commit/9207f94582
+[96598e7b06]: https://github.com/NixOS/nix/commit/96598e7b06
+[9947f1646a]: https://github.com/NixOS/nix/commit/9947f1646a
+[9cee600c88]: https://github.com/NixOS/nix/commit/9cee600c88
+[9d309de0de]: https://github.com/NixOS/nix/commit/9d309de0de
+[a4604f1928]: https://github.com/NixOS/nix/commit/a4604f1928
+[a583a2bc59]: https://github.com/NixOS/nix/commit/a583a2bc59
+[a711689368]: https://github.com/NixOS/nix/commit/a711689368
+[a72709afd8]: https://github.com/NixOS/nix/commit/a72709afd8
+[a8416866cf]: https://github.com/NixOS/nix/commit/a8416866cf
+[a9c4f66cfb]: https://github.com/NixOS/nix/commit/a9c4f66cfb
+[af2e53fd48]: https://github.com/NixOS/nix/commit/af2e53fd48
+[b1eb252172]: https://github.com/NixOS/nix/commit/b1eb252172
+[b4b5e9ce2f]: https://github.com/NixOS/nix/commit/b4b5e9ce2f
+[b755752f76]: https://github.com/NixOS/nix/commit/b755752f76
+[ba20730b3f]: https://github.com/NixOS/nix/commit/ba20730b3f
+[bdf089f463]: https://github.com/NixOS/nix/commit/bdf089f463
+[be64fbb501]: https://github.com/NixOS/nix/commit/be64fbb501
+[c370755583]: https://github.com/NixOS/nix/commit/c370755583
+[c43e882f54]: https://github.com/NixOS/nix/commit/c43e882f54
+[c602ebfb34]: https://github.com/NixOS/nix/commit/c602ebfb34
+[ccc52adfb2]: https://github.com/NixOS/nix/commit/ccc52adfb2
+[cfc8132391]: https://github.com/NixOS/nix/commit/cfc8132391
+[d0f5719c2a]: https://github.com/NixOS/nix/commit/d0f5719c2a
+[d1e3bf01bc]: https://github.com/NixOS/nix/commit/d1e3bf01bc
+[d38f860c3e]: https://github.com/NixOS/nix/commit/d38f860c3e
+[d3c61d83b]: https://github.com/NixOS/nix/commit/d3c61d83b
+[db4f4a8425]: https://github.com/NixOS/nix/commit/db4f4a8425
+[db5b86ef13]: https://github.com/NixOS/nix/commit/db5b86ef13
+[ddea253ff8]: https://github.com/NixOS/nix/commit/ddea253ff8
+[e0204f8d46]: https://github.com/NixOS/nix/commit/e0204f8d46
+[e0bd307802]: https://github.com/NixOS/nix/commit/e0bd307802
+[e25fad691a]: https://github.com/NixOS/nix/commit/e25fad691a
+[e34fe47d0c]: https://github.com/NixOS/nix/commit/e34fe47d0c
+[e42401ee7b]: https://github.com/NixOS/nix/commit/e42401ee7b
+[e5951a6b2f]: https://github.com/NixOS/nix/commit/e5951a6b2f
+[eb3036da87]: https://github.com/NixOS/nix/commit/eb3036da87
+[f3441e6122]: https://github.com/NixOS/nix/commit/f3441e6122
+[f92c9a0ac5]: https://github.com/NixOS/nix/commit/f92c9a0ac5
+[fe1f34fa60]: https://github.com/NixOS/nix/commit/fe1f34fa60
diff --git a/tvix/docs/src/nix-daemon/handshake.md b/tvix/docs/src/nix-daemon/handshake.md
new file mode 100644
index 000000000000..0a436372b3ff
--- /dev/null
+++ b/tvix/docs/src/nix-daemon/handshake.md
@@ -0,0 +1,32 @@
+
+
+## client -> server
+- 0x6e697863 :: [Int](#int) (hardcoded, 'nixc' in ASCII)
+
+## server -> client
+- 0x6478696f :: [Int](#int) (hardcoded, 'dxio' in ASCII)
+- protocolVersion :: [Int](#int)
+
+## client -> server
+- clientVersion :: [Int](#int)
+
+### If clientVersion is 1.14 or later
+- sendCpu :: [Bool](#bool) (hardcoded to false in client)
+#### If sendCpu is true
+- cpuAffinity :: [Int](#int) (obsolete and ignored)
+
+### If clientVersion is 1.11 or later
+- reserveSpace :: [Bool](#bool) (obsolete, ignored and set to false)
+
+
+## server -> client
+
+### If clientVersion is 1.33 or later
+- nixVersion :: String
+
+### If clientVersion is 1.35 or later
+- trusted :: OptTrusted
+
+## server -> client
+- send logs
+- [operation](./operations.md) :: Int
\ No newline at end of file
diff --git a/tvix/docs/src/nix-daemon/index.md b/tvix/docs/src/nix-daemon/index.md
new file mode 100644
index 000000000000..e47c20151e0d
--- /dev/null
+++ b/tvix/docs/src/nix-daemon/index.md
@@ -0,0 +1,15 @@
+# Nix Daemon Protocol
+
+The Nix Daemon protocol is what's used to communicate with the `nix-daemon`,
+either on the local system (in which case the communication happens via a Unix
+domain socket), or with a remote Nix (in which this is tunneled over SSH).
+
+It uses a custom binary format which isn't too documented. The subpages here
+collect serve as an in-depth detail about some of the inner workings, data types
+etc.
+
+A first implementation of this exists in
+[griff/Nix.rs](https://github.com/griff/Nix.rs/tree/main).
+
+Work is underway to port / factor this out into reusable building blocks into
+the [nix-compat] crate.
diff --git a/tvix/docs/src/nix-daemon/logging.md b/tvix/docs/src/nix-daemon/logging.md
new file mode 100644
index 000000000000..c2828b13c21a
--- /dev/null
+++ b/tvix/docs/src/nix-daemon/logging.md
@@ -0,0 +1,124 @@
+# Logging
+
+Because the daemon protocol only has one sender stream and one receiver stream
+logging messages need to be carefully interleaved with requests and responses.
+Usually this means that after the operation and all of its inputs (the request)
+has been read logging hijacks the sender stream (in the server case) and uses
+it to send typed logging messages while the request is being processed. When
+the response has been generated it will send `STDERR_LAST` to mark that what
+follows is the response data to the request. If the request failed a
+`STDERR_ERROR` message is sent with the error and no response is sent.
+
+While not in this state between request reading and response sending all
+messages and activities are buffered until next time the logger can send data.
+
+The logging messages supported are:
+- [`STDERR_LAST`](#stderr_last)
+- [`STDERR_ERROR`](#stderr_error)
+- [`STDERR_NEXT`](#stderr_next)
+- [`STDERR_READ`](#stderr_read)
+- [`STDERR_WRITE`](#stderr_write)
+- [`STDERR_START_ACTIVITY`](#stderr_start_activity)
+- [`STDERR_STOP_ACTIVITY`](#stderr_stop_activity)
+- [`STDERR_RESULT`](#stderr_result)
+
+
+### `STDERR_LAST`
+Marks the end of the logs, normal processing can resume.
+
+- 0x616c7473 :: [UInt64][se-UInt64] (hardcoded, 'alts' in ASCII)
+
+
+### `STDERR_ERROR`
+This also marks the end of this log "session" and so it
+has the same effect as `STDERR_LAST`.
+On the client the error is thrown as an exception and no response is read.
+
+#### If protocol version is 1.26 or newer
+- 0x63787470 :: [UInt64][se-UInt64] (hardcoded, 'cxtp' in ASCII)
+- error :: [Error][se-Error]
+
+#### If protocol version is older than 1.26
+- 0x63787470 :: [UInt64][se-UInt64] (hardcoded, 'cxtp' in ASCII)
+- msg :: [String][se-String] (If logger is JSON, invalid UTF-8 is replaced with U+FFFD)
+- exitStatus :: [Int][se-Int]
+
+
+### `STDERR_NEXT`
+Normal string log message.
+
+- 0x6f6c6d67 :: [UInt64][se-UInt64] (hardcoded, 'olmg' in ASCII)
+- msg :: [String][se-String] (If logger is JSON, invalid UTF-8 is replaced with U+FFFD)
+
+
+### `STDERR_READ`
+Reader interface used by ImportsPaths and AddToStoreNar (between 1.21 and 1.23).
+It works by sending a desired buffer length and then on the receiver stream it
+reads bytes buffer of that length. If it receives 0 bytes it sees this as an
+unexpected EOF.
+
+- 0x64617461 :: [UInt64][se-UInt64] (hardcoded, 'data' in ASCII)
+- desiredLen :: [Size][se-Size]
+
+
+### `STDERR_WRITE`
+Writer interface used by ExportPath. Simply writes a buffer.
+
+- 0x64617416 :: [UInt64][se-UInt64] (hardcoded)
+- buffer :: [Bytes][se-Bytes]
+
+
+### `STDERR_START_ACTIVITY`
+Begins an activity. In other tracing frameworks this would be called a span.
+
+Implemented in protocol 1.20. To achieve backwards compatible with older
+versions of the protocol instead of sending an `STDERR_START_ACTIVITY`
+the level is checked against enabled logging level and the text field is
+sent as a simple log message with `STDERR_NEXT`.
+
+- 0x53545254 :: [UInt64][se-UInt64] (hardcoded, 'STRT' in ASCII)
+- act :: [UInt64][se-UInt64]
+- level :: [Verbosity][se-Verbosity]
+- type :: [ActivityType][se-ActivityType]
+- text :: [String][se-String] (If logger is JSON, invalid UTF-8 is replaced with U+FFFD)
+- fields :: [List][se-List] of [Field][se-Field]
+- parent :: [UInt64][se-UInt64]
+
+
+act is atomic (nextId++ + (getPid() << 32))
+
+
+### `STDERR_STOP_ACTIVITY`
+Stops the given activity. The activity id should not send any more results.
+Just sends `ActivityId`.
+
+Implemented in protocol 1.20. When backwards compatible with older versions of
+the protocol and this message would have been sent it is instead ignored.
+
+- 0x53544f50 :: [UInt64][se-UInt64] (hardcoded, 'STOP' in ASCII)
+
+
+### `STDERR_RESULT`
+Sends results for a given activity.
+
+Implemented in protocol 1.20. When backwards compatible with older versions of
+the protocol and this message would have been sent it is instead ignored.
+
+- 0x52534c54 :: [UInt64][se-UInt64] (hardcoded, 'RSLT' in ASCII)
+- act :: [UInt64][se-UInt64]
+- type :: [ResultType][se-ResultType]
+- fields :: [List][se-List] of [Field][se-Field]
+
+
+
+[se-UInt64]: ./serialization.md#uint64
+[se-Int]: ./serialization.md#int
+[se-Size]: ./serialization.md#size
+[se-Verbosity]: ./serialization.md#verbosity
+[se-ActivityType]: ./serialization.md#activitytype
+[se-ResultType]: ./serialization.md#resulttype
+[se-Bytes]: ./serialization.md#bytes
+[se-String]: ./serialization.md#string
+[se-List]: ./serialization.md#list-of-x
+[se-Error]: ./serialization.md#error
+[se-Field]: ./serialization.md#field
\ No newline at end of file
diff --git a/tvix/docs/src/nix-daemon/operations.md b/tvix/docs/src/nix-daemon/operations.md
new file mode 100644
index 000000000000..80708c9104b5
--- /dev/null
+++ b/tvix/docs/src/nix-daemon/operations.md
@@ -0,0 +1,904 @@
+
+# TOC
+
+| Operation                                                   | Id |
+| ----------------------------------------------------------- | -- |
+| [IsValidPath](#isvalidpath)                                 | 1  |
+| [HasSubstitutes](#hassubstitutes)                           | 3  |
+| [QueryReferrers](#queryreferrers)                           | 6  |
+| [AddToStore](#addtostore)                                   | 7  |
+| [BuildPaths](#buildpaths)                                   | 9  |
+| [EnsurePath](#ensurepath)                                   | 10 |
+| [AddTempRoot](#addtemproot)                                 | 11 |
+| [AddIndirectRoot](#addindirectroot)                         | 12 |
+| [FindRoots](#findroots)                                     | 14 |
+| [SetOptions](#setoptions)                                   | 19 |
+| [CollectGarbage](#collectgarbage)                           | 20 |
+| [QueryAllValidPaths](#queryallvalidpaths)                   | 23 |
+| [QueryPathInfo](#querypathinfo)                             | 26 |
+| [QueryPathFromHashPart](#querypathfromhashpart)             | 29 |
+| [QueryValidPaths](#queryvalidpaths)                         | 31 |
+| [QuerySubstitutablePaths](#querysubstitutablepaths)         | 32 |
+| [QueryValidDerivers](#queryvalidderivers)                   | 33 |
+| [OptimiseStore](#optimisestore)                             | 34 |
+| [VerifyStore](#verifystore)                                 | 35 |
+| [BuildDerivation](#buildderivation)                         | 36 |
+| [AddSignatures](#addsignatures)                             | 37 |
+| [NarFromPath](#narfrompath)                                 | 38 |
+| [AddToStoreNar](#addtostore)                                | 39 |
+| [QueryMissing](#querymissing)                               | 40 |
+| [QueryDerivationOutputMap](#queryderivationoutputmap)       | 41 |
+| [RegisterDrvOutput](#registerdrvoutput)                     | 42 |
+| [QueryRealisation](#queryrealisation)                       | 43 |
+| [AddMultipleToStore](#addmultipletostore)                   | 44 |
+| [AddBuildLog](#addbuildlog)                                 | 45 |
+| [BuildPathsWithResults](#buildpathswithresults)             | 46 |
+| [AddPermRoot](#addpermroot)                                 | 47 |
+
+
+## Obsolete operations
+
+| Operation                                                   | Id |
+| ----------------------------------------------------------- | -- |
+| [QueryPathHash](#querypathhash)                             | 4  |
+| [QueryReferences](#queryreferences)                         | 5  |
+| [AddTextToStore](#addtexttostore)                           | 8  |
+| [SyncWithGC](#syncwithgc)                                   | 13 |
+| [ExportPath](#exportpath)                                   | 16 |
+| [QueryDeriver](#queryderiver)                               | 18 |
+| [QuerySubstitutablePathInfo](#querysubstitutablepathinfo)   | 21 |
+| [QueryDerivationOutputs](#queryderivationoutputs)           | 22 |
+| [ImportPaths](#importpaths)                                 | 27 |
+| [QueryDerivationOutputNames](#queryderivationoutputnames)   | 28 |
+| [QuerySubstitutablePathInfos](#querysubstitutablepathinfos) | 30 |
+
+
+## Removed operations
+
+| Operation                                         | Id |
+| ------------------------------------------------- | -- |
+| [Quit](#quit-removed)                             | 0  |
+| [ImportPath](#importpath-removed)                 | 17 |
+| [Old CollectGarbage](#old-collectgarbage-removed) | 15 |
+| [QueryFailedPaths](#queryfailedpaths)             | 24 |
+| [ClearFailedPaths](#clearfailedpaths)             | 25 |
+
+
+
+## Quit (removed)
+
+**Id:** 0<br>
+**Introduced:** Nix 0.11<br>
+**Removed:** Became dead code in Nix 0.11 and removed in Nix 1.8
+
+
+## IsValidPath
+
+**Id:** 1<br>
+**Introduced:** Nix 0.11<br>
+
+As the name says checks that a store path is valid i.e. in the store.
+
+This is a pretty core operation used everywhere.
+
+
+### Inputs
+path :: [StorePath][se-StorePath]
+
+### Outputs
+isValid :: [Bool][se-Bool]
+
+
+## HasSubstitutes
+
+**Id:** 3<br>
+**Introduced:** Nix 0.11<br>
+**Obsolete** Protocol 1.12, Nix 1.2<br>
+
+Replaced by QuerySubstitutablePaths.
+
+Checks if we can substitute the input path from a substituter. Uses
+QuerySubstitutablePaths under the hood :/
+
+### Inputs
+path :: [StorePath][se-StorePath]
+
+### Outputs
+hasSubstitutes :: [Bool][se-Bool]
+
+
+## QueryPathHash
+
+**Id:** 4<br>
+**Introduced:** Nix 0.11<br>
+**Obsolete:** Protocol 1.16, Nix 2.0<br>
+
+Retrieves the base16 NAR hash of a given store path.
+
+### Inputs
+path :: [StorePath][se-StorePath]
+
+### Outputs
+hash :: [NARHash][se-NARHash]
+
+
+## QueryReferences
+
+**Id:** 5<br>
+**Introduced:** Nix 0.11<br>
+**Obsolete:** Protocol 1.16, Nix 2.0<br>
+
+Retrieves the references of a given path
+
+### Inputs
+path :: [StorePath][se-StorePath]
+
+### Outputs
+references :: [Set][se-Set] of [StorePath][se-StorePath]
+
+
+## QueryReferrers
+
+**Id:** 6<br>
+**Introduced:** Nix 0.11<br>
+
+Retrieves the referrers of a given path.
+
+### Inputs
+path :: [StorePath][se-StorePath]
+
+### Outputs
+referrers :: [Set][se-Set] of [StorePath][se-StorePath]
+
+
+## AddToStore
+
+**Id:** 7<br>
+**Introduced:** Nix 0.11<br>
+
+Add a new path to the store.
+
+### Before protocol version 1.25
+#### Inputs
+- baseName :: [StorePathName][se-StorePathName]
+- fixed :: [Bool64][se-Bool64]
+- recursive :: [FileIngestionMethod][se-FileIngestionMethod]
+- hashAlgo :: [HashAlgorithm][se-HashAlgorithm]
+- NAR dump
+
+If fixed is `true`, hashAlgo is forced to `sha256` and recursive is forced to
+`NixArchive`.
+
+Only `Flat` and `NixArchive` values are supported for the recursive input
+parameter.
+
+#### Outputs
+path :: [StorePath][se-StorePath]
+
+### Protocol version 1.25 or newer
+#### Inputs
+- name :: [StorePathName][se-StorePathName]
+- camStr :: [ContentAddressMethodWithAlgo][se-ContentAddressMethodWithAlgo]
+- refs :: [Set][se-Set] of [StorePath][se-StorePath]
+- repairBool :: [Bool64][se-Bool64]
+- [Framed][se-Framed] NAR dump
+
+#### Outputs
+info :: [ValidPathInfo][se-ValidPathInfo]
+
+
+## AddTextToStore
+
+**Id:** 8<br>
+**Introduced:** Nix 0.11<br>
+**Obsolete:** Protocol 1.25, Nix 2.4
+
+Add a text file as a store path.
+
+This was obsoleted by adding the functionality implemented by this operation
+to [AddToStore](#addtostore). And so this corresponds to calling
+[AddToStore](#addtostore) with `camStr` set to `text:sha256` and `text`
+wrapped as a NAR.
+
+### Inputs
+- suffix :: [StorePathName][se-StorePathName]
+- text :: [Bytes][se-Bytes]
+- refs :: [Set][se-Set] of [StorePath][se-StorePath]
+
+### Outpus
+path :: [StorePath][se-StorePath]
+
+
+## BuildPaths
+
+**Id:** 9<br>
+****Introduced:**** Nix 0.11<br>
+
+Build (or substitute) a list of derivations.
+
+### Inputs
+paths :: [Set][se-Set] of [DerivedPath][se-DerivedPath]
+
+#### Protocol 1.15 or newer
+mode :: [BuildMode][se-BuildMode] (defaults to Normal)
+
+Check that connection is trusted before allowing Repair mode.
+
+### Outputs
+1 :: [Int][se-Int] (hardcoded and ignored by client)
+
+
+## EnsurePath
+
+**Id:** 10<br>
+**Introduced:** Nix 0.11<br>
+
+Checks if a path is valid. Note: it may be made valid by running a substitute.
+
+### Inputs
+path :: [StorePath][se-StorePath]
+
+### Outputs
+1 :: [Int][se-Int] (hardcoded and ignored by client)
+
+
+## AddTempRoot
+
+**Id:** 11<br>
+**Introduced:** Nix 0.11<br>
+
+Creates a temporary GC root for the given store path.
+
+Temporary GC roots are valid only for the life of the connection and are used
+primarily to prevent the GC from pulling the rug out from under the client and
+deleting store paths that the client is actively doing something with.
+
+### Inputs
+path :: [StorePath][se-StorePath]
+
+### Outputs
+1 :: [Int][se-Int] (hardcoded and ignored by client)
+
+
+## AddIndirectRoot
+
+**Id:** 12<br>
+**Introduced:** Nix 0.11<br>
+
+Add an indirect root, which is a weak reference to the user-facing symlink
+created by [AddPermRoot](#addpermroot).
+
+Only ever sent on the local unix socket nix daemon.
+
+### Inputs
+path :: [Path][se-Path]
+
+### Outputs
+1 :: [Int][se-Int] (hardcoded and ignored by client)
+
+
+## SyncWithGC
+
+**Id:** 13<br>
+**Introduced:** Nix 0.11<br>
+**Obsolete:** Protocol 1.32, Nix 2.5.0
+
+Acquire the global GC lock, then immediately release it.  This function must be
+called after registering a new permanent root, but before exiting.  Otherwise,
+it is possible that a running garbage collector doesn't see the new root and
+deletes the stuff we've just built.  By acquiring the lock briefly, we ensure
+that either:
+
+- The collector is already running, and so we block until the
+    collector is finished.  The collector will know about our
+    *temporary* locks, which should include whatever it is we
+    want to register as a permanent lock.
+- The collector isn't running, or it's just started but hasn't
+    acquired the GC lock yet.  In that case we get and release
+    the lock right away, then exit.  The collector scans the
+    permanent root and sees ours.
+
+In either case the permanent root is seen by the collector.
+
+Was made obsolete by using [AddTempRoot](#addtemproot) to accomplish the same
+thing.
+
+
+## FindRoots
+
+**Id:** 14<br>
+**Introduced:** Nix 0.11<br>
+
+Find the GC roots.
+
+### Outputs
+roots :: [Map][se-Map] of [Path][se-Path] to [StorePath][se-StorePath]
+
+The key is the link pointing to the given store path.
+
+
+## Old CollectGarbage (removed)
+
+**Id:** 15<br>
+**Introduced:** Nix 0.11<br>
+**Removed:** Protocol 1.02, Nix 0.12<br>
+
+
+## ExportPath
+
+**Id:** 16<br>
+**Introduced:** Nix 0.11<br>
+**Obsolete:** Protocol 1.17, Nix 2.0<br>
+
+Export a store path in the binary format nix-store --import expects. See implementation there https://github.com/NixOS/nix/blob/db3bf180a569cb20db42c5e4669d2277be6f46b6/src/libstore/export-import.cc#L29 for more details.
+
+### Inputs
+- path :: [StorePath][se-StorePath]
+- sign :: [Int][se-Int] (ignored and hardcoded to 0 in client)
+
+### Outputs
+Uses [`STDERR_WRITE`](./logging.md#stderr_write) to send dump in
+[export format][se-ExportFormat]
+
+After dump it outputs.
+
+1 :: [Int][se-Int] (hardcoded)
+
+
+## ImportPath (removed)
+
+**Id:** 17<br>
+**Introduced:** Nix 0.11<br>
+**Removed:** Protocol 1.09, Nix 1.0<br>
+
+
+## QueryDeriver
+
+**Id:** 18<br>
+**Introduced:** Nix 0.11<br>
+**Obsolete:** Protocol 1.16, Nix 2.0<br>
+
+Returns the store path of the derivation for a given store path.
+
+### Inputs
+path :: [StorePath][se-StorePath]
+
+### Outputs
+deriver :: [OptStorePath][se-OptStorePath]
+
+
+## SetOptions
+
+**Id:** 19<br>
+**Introduced:** Nix 0.11<br>
+
+Sends client options to the remote side.
+
+Only ever used right after the handshake.
+
+### Inputs
+
+- keepFailed :: [Bool][se-Bool]
+- keepGoing :: [Bool][se-Bool]
+- tryFallback :: [Bool][se-Bool]
+- verbosity :: [Verbosity][se-Verbosity]
+- maxBuildJobs :: [Int][se-Int]
+- maxSilentTime :: [Time][se-Time]
+- useBuildHook :: [Bool][se-Bool] (ignored and hardcoded to true in client)
+- verboseBuild :: [Verbosity][se-Verbosity]
+- logType :: [Int][se-Int] (ignored and hardcoded to 0 in client)
+- printBuildTrace :: [Int][se-Int] (ignored and hardcoded to 0 in client)
+- buildCores :: [Int][se-Int]
+- useSubstitutes :: [Bool][se-Bool]
+
+### Protocol 1.12 or newer
+otherSettings :: [Map][se-Map] of [String][se-String] to [String][se-String]
+
+
+## CollectGarbage
+
+**Id:** 20<br>
+**Introduced:** Protocol 1.02, Nix 0.12<br>
+
+Find the GC roots.
+
+### Inputs
+- action :: [GCAction][se-GCAction]
+- pathsToDelete :: [Set][se-Set] of [StorePath][se-StorePath]
+- ignoreLiveness :: [Bool64][se-Bool64]
+- maxFreed :: [UInt64][se-UInt64]
+- removed :: [Int][se-Int] (ignored and hardcoded to 0 in client)
+- removed :: [Int][se-Int] (ignored and hardcoded to 0 in client)
+- removed :: [Int][se-Int] (ignored and hardcoded to 0 in client)
+
+### Outputs
+- pathsDeleted :: [Set][se-Set] of [Path][se-Path]
+- bytesFreed :: [UInt64][se-UInt64]
+- 0 :: [UInt64][se-UInt64] (hardcoded, obsolete and ignored by client)
+
+Depending on the value of the action input the value of output pathsDeleted
+is either, the GC roots, or the paths that would be or have been deleted.
+
+
+## QuerySubstitutablePathInfo
+
+**Id:** 21<br>
+**Introduced:** Protocol 1.02, Nix 0.12<br>
+**Obsolete:** Protocol 1.12, Nix 1.2<br>
+
+Retrieves the various substitutable paths infos for a given path.
+
+### Inputs
+path :: [StorePath][se-StorePath]
+
+### Outputs
+found :: [Bool][se-Bool]
+
+#### If found is true
+- info :: [SubstitutablePathInfo][se-SubstitutablePathInfo]
+
+
+## QueryDerivationOutputs
+
+**Id:** 22<br>
+**Introduced:** Protocol 1.05, Nix 1.0<br>
+**Obsolete:** Protocol 1.22*, Nix 2.4<br>
+
+Retrieves all the outputs paths of a given derivation.
+
+### Inputs
+path :: [StorePath][se-StorePath] (must point to a derivation)
+
+### Outputs
+derivationOutputs :: [Set][se-Set] of [StorePath][se-StorePath]
+
+
+## QueryAllValidPaths
+
+**Id:** 23<br>
+**Introduced:** Protocol 1.05, Nix 1.0<br>
+
+Retrieves all the valid paths contained in the store.
+
+### Outputs
+paths :: [Set][se-Set] of [StorePath][se-StorePath]
+
+
+## QueryFailedPaths (removed)
+
+**Id:** 24<br>
+**Introduced:** Protocol 1.05, Nix 1.0<br>
+**Removed:** Protocol 1.16, Nix 2.0<br>
+
+Failed build caching API only ever used by Hydra.
+
+
+## ClearFailedPaths (removed)
+
+**Id:** 25<br>
+**Introduced:** Protocol 1.05, Nix 1.0<br>
+**Removed:** Protocol 1.16, Nix 2.0<br>
+
+Failed build caching API only ever used by Hydra.
+
+
+## QueryPathInfo
+
+**Id:** 26<br>
+**Introduced:** Protocol 1.06, Nix 1.0<br>
+
+Retrieves the pathInfo for a given path.
+
+### Inputs
+path :: [StorePath][se-StorePath]
+
+### Outputs
+
+#### If protocol version is 1.17 or newer
+success :: [Bool64][se-Bool64]
+
+##### If success is true
+pathInfo :: [UnkeyedValidPathInfo][se-UnkeyedValidPathInfo]
+
+#### If protocol version is older than 1.17
+If info not found return error with [`STDERR_ERROR`](./logging.md#stderr_error)
+
+pathInfo :: [UnkeyedValidPathInfo][se-UnkeyedValidPathInfo]
+
+
+## ImportPaths
+
+**Id:** 27<br>
+**Introduced:** Protocol 1.09, Nix 1.0<br>
+**Obsolete:** Protocol 1.17, Nix 2.0<br>
+
+Older way of adding a store path to the remote store.
+
+It was obsoleted and replaced by AddToStoreNar because it sends the NAR
+before the metadata about the store path and so you would typically have
+to store the NAR in memory or temporarily on disk before processing it.
+
+### Inputs
+[List of NAR dumps][se-ImportPaths] coming from one or more ExportPath operations.
+
+### Outputs
+importedPaths :: [List][se-List] of [StorePath][se-StorePath]
+
+
+## QueryDerivationOutputNames
+
+**Id:** 28<br>
+**Introduced:** Protocol 1.08, Nix 1.0<br>
+**Obsolete:** Protocol 1.21, Nix 2.4<br>
+
+Retrieves the name of the outputs of a given derivation. EG. out, dev, etc.
+
+### Inputs
+path :: [StorePath][se-StorePath] (must be a derivation path)
+
+### Outputs
+names :: [Set][se-Set] of [OutputName][se-OutputName]
+
+
+## QueryPathFromHashPart
+
+**Id:** 29<br>
+**Introduced:** Protocol 1.11, Nix 1.1<br>
+
+Retrieves a store path from a nixbase32 (input) hash.
+
+### Inputs
+hashPart :: [StorePathHash][se-StorePathHash]
+
+### Outputs
+path :: [OptStorePath][se-OptStorePath]
+
+
+## QuerySubstitutablePathInfos
+
+**Id:** 30<br>
+**Introduced:** Protocol 1.12*, Nix 1.2<br>
+**Obsolete:** Protocol 1.19*, Nix 2.0<br>
+
+Retrieves the various substitutable paths infos for set of store paths.
+
+Only ever used in the fallback for QueryMissing which means that if protocol is 1.19 or later
+it is never sent and is therefore obsolete after that.
+
+### Inputs
+#### If protocol version is 1.22 or newer
+paths :: [Map][se-Map] of [StorePath][se-StorePath] to [OptContentAddress][se-OptContentAddress] 
+
+#### If protocol version older than 1.22
+paths :: [Set][se-Set] of [StorePath][se-StorePath]
+
+### Outputs
+infos :: [Map][se-Map] of [StorePath][se-StorePath] to [SubstitutablePathInfo][se-SubstitutablePathInfo]
+
+
+## QueryValidPaths
+
+**Id:** 31<br>
+**Introduced:** Protocol 1.12, Nix 1.2<br>
+
+Takes a list of store paths and returns a new list only containing the valid store paths
+
+## Inputs
+paths :: [Set][se-Set] of [StorePath][se-StorePath]
+
+### If protocol version is 1.27 or newer
+substitute :: [Bool][se-Bool] (defaults to false if not sent)
+
+## Outputs
+paths :: [Set][se-Set] of [StorePath][se-StorePath]
+
+
+## QuerySubstitutablePaths
+
+**Id:** 32<br>
+**Introduced:** Protocol 1.12, Nix 1.2<br>
+
+Takes a set of store path, returns a filtered new set of paths that can be
+substituted.
+
+In versions of the protocol prior to 1.12 [HasSubstitutes](#hassubstitutes)
+is used to implement the functionality that this operation provides.
+
+### Inputs
+paths :: [Set][se-Set] of [StorePath][se-StorePath]
+
+### Outputs
+paths :: [Set][se-Set] of [StorePath][se-StorePath]
+
+
+## QueryValidDerivers
+
+**Id:** 33<br>
+**Introduced:** Protocol 1.13*, Nix 1.3<br>
+
+Retrieves the derivers of a given path.
+
+### Inputs
+path :: [StorePath][se-StorePath]
+
+### Outputs
+derivers :: [Set][se-Set] of [StorePath][se-StorePath]
+
+
+## OptimiseStore
+
+**Id:** 34<br>
+**Introduced:** Protocol 1.14, Nix 1.8<br>
+
+Optimise store by hardlinking files with the same content.
+
+### Outputs
+1 :: [Int][se-Int] (hardcoded and ignored by client)
+
+
+## VerifyStore
+
+**Id:** 35<br>
+**Introduced:** Protocol 1.14, Nix 1.9<br>
+
+Verify store either only db and existence of path or entire contents of store
+paths against the NAR hash. 
+
+### Inputs
+- checkContents :: [Bool64][se-Bool64]
+- repair :: [Bool64][se-Bool64]
+
+### Outputs
+errors :: [Bool][se-Bool]
+
+
+## BuildDerivation
+
+**Id:** 36<br>
+**Introduced:** Protocol 1.14, Nix 1.10<br>
+
+Main build operation used when remote building.
+
+When functioning as a remote builder this operation is used instead of
+BuildPaths so that it doesn't have to send the entire tree of derivations
+to the remote side first before it can start building. What this does
+instead is have a reduced version of the derivation to be built sent as
+part of its input and then only building that derivation.
+
+The paths required by the build need to be part of the remote store
+(by copying with AddToStoreNar or substituting) before this operation is
+called.
+
+### Inputs
+- drvPath :: [StorePath][se-StorePath]
+- drv :: [BasicDerivation][se-BasicDerivation]
+- buildMode :: [BuildMode][se-BuildMode]
+
+### Outputs
+buildResult :: [BuildResult][se-BuildResult]
+
+
+## AddSignatures
+
+**Id:** 37<br>
+**Introduced:** Protocol 1.16, Nix 2.0<br>
+
+Add the signatures associated to a given path. Used by `nix store copy-sigs` and `nix store sign`.
+
+### Inputs
+- path :: [StorePath][se-StorePath]
+- signatures :: [Set][se-Set] of [Signature][se-Signature]
+
+### Outputs
+1 :: [Int][se-Int] (hardcoded and ignored by client)
+
+
+## NarFromPath
+
+**Id:** 38<br>
+**Introduced:** Protocol 1.17, Nix 2.0<br>
+
+Main way of getting the contents of a store path to the client.
+
+As the name suggests this is done by sending a NAR file.
+
+It replaced the now obsolete ExportPath operation and is used by newer clients to
+implement the export functionality for cli. It is also used when remote building
+to transfer build results from remote builder to client.
+
+### Inputs
+path :: [StorePath][se-StorePath]
+
+### Outputs
+NAR dumped straight to the stream.
+
+
+## AddToStoreNar
+
+**Id:** 39<br>
+**Introduced:** Protocol 1.17, Nix 2.0<br>
+
+Dumps a path as a NAR
+
+### Inputs
+- path :: [StorePath][se-StorePath]
+- deriver :: [OptStorePath][se-OptStorePath]
+- narHash :: [NARHash][se-NARHash]
+- references :: [Set][se-Set] of [StorePath][se-StorePath]
+- registrationTime :: [Time][se-Time]
+- narSize :: [UInt64][se-UInt64]
+- ultimate :: [Bool64][se-Bool64]
+- signatures :: [Set][se-Set] of [Signature][se-Signature]
+- ca :: [OptContentAddress][se-OptContentAddress]
+- repair :: [Bool64][se-Bool64]
+- dontCheckSigs :: [Bool64][se-Bool64]
+
+#### If protocol version is 1.23 or newer
+[Framed][se-Framed] NAR dump
+
+#### If protocol version is between 1.21 and 1.23
+NAR dump sent using [`STDERR_READ`](./logging.md#stderr_read)
+
+#### If protocol version is older than 1.21
+NAR dump sent raw on stream
+
+
+## QueryMissing
+
+**Id:** 40<br>
+**Introduced:** Protocol 1.19*, Nix 2.0<br>
+
+### Inputs
+targets :: [List][se-List] of [DerivedPath][se-DerivedPath]
+
+### Outputs
+- willBuild :: [Set][se-Set] of [StorePath][se-StorePath]
+- willSubstitute :: [Set][se-Set] of [StorePath][se-StorePath]
+- unknown :: [Set][se-Set] of [StorePath][se-StorePath]
+- downloadSize :: [UInt64][se-UInt64]
+- narSize :: [UInt64][se-UInt64]
+
+
+## QueryDerivationOutputMap
+
+**Id:** 41<br>
+**Introduced:** Protocol 1.22*, Nix 2.4<br>
+
+Retrieves an associative map outputName -> storePath for a given derivation.
+
+### Inputs
+path :: [StorePath][se-StorePath]  (must be a derivation path)
+
+### Outputs
+outputs :: [Map][se-Map] of [OutputName][se-OutputName] to [OptStorePath][se-OptStorePath]
+
+
+## RegisterDrvOutput
+
+**Id:** 42<br>
+**Introduced:** Protocol 1.27, Nix 2.4<br>
+
+Registers a DRV output
+
+### Inputs
+#### If protocol is 1.31 or newer
+realisation :: [Realisation][se-Realisation]
+
+#### If protocol is older than 1.31
+- outputId :: [DrvOutput][se-DrvOutput]
+- outputPath :: [StorePath][se-StorePath]
+
+
+## QueryRealisation
+
+**Id:** 43<br>
+**Introduced:** Protocol 1.27, Nix 2.4<br>
+
+Retrieves the realisations attached to a drv output id realisations.
+
+### Inputs
+outputId :: [DrvOutput][se-DrvOutput]
+
+### Outputs
+#### If protocol is 1.31 or newer
+realisations :: [Set][se-Set] of [Realisation][se-Realisation]
+
+#### If protocol is older than 1.31
+outPaths :: [Set][se-Set] of [StorePath][se-StorePath]
+
+
+## AddMultipleToStore
+
+**Id:** 44<br>
+**Introduced:** Protocol 1.32*, Nix 2.4<br>
+
+A pipelined version of [AddToStoreNar](#addtostorenar) where you can add
+multiple paths in one go.
+
+Added because the protocol doesn't support pipelining and so on a low latency
+connection waiting for the request/response of [AddToStoreNar](#addtostorenar)
+for each small NAR was costly.
+
+### Inputs
+- repair :: [Bool64][se-Bool64]
+- dontCheckSigs :: [Bool64][se-Bool64]
+- [Framed][se-Framed] stream of [add multiple NAR dump][se-AddMultipleToStore]
+
+
+## AddBuildLog
+
+**Id:** 45<br>
+**Introduced:** Protocol 1.32, Nix 2.6.0<br>
+
+Attach some build logs to a given build.
+
+### Inputs
+- path :: [BaseStorePath][se-BaseStorePath]
+- [Framed][se-Framed] stream of log lines
+
+### Outputs
+1 :: [Int][se-Int] (hardcoded and ignored by client)
+
+
+## BuildPathsWithResults
+
+**Id:** 46<br>
+**Introduced:** Protocol 1.34*, Nix 2.8.0<br>
+
+Build (or substitute) a list of derivations and returns a list of results.
+
+### Inputs
+- drvs :: [List][se-List] of [DerivedPath][se-DerivedPath]
+- mode :: [BuildMode][se-BuildMode]
+
+### Outputs
+results :: [List][se-List] of [KeyedBuildResult][se-KeyedBuildResult]
+
+
+## AddPermRoot
+
+**Id:** 47<br>
+**Introduced:** Protocol 1.36*, Nix 2.20.0<br>
+
+### Inputs
+- storePath :: [StorePath][se-StorePath]
+- gcRoot :: [Path][se-Path]
+
+### Outputs
+gcRoot :: [Path][se-Path]
+
+
+
+[se-Int]: ./serialization.md#int
+[se-UInt8]: ./serialization.md#uint8
+[se-UInt64]: ./serialization.md#uint64
+[se-Bool]: ./serialization.md#bool
+[se-Bool64]: ./serialization.md#bool64
+[se-Time]: ./serialization.md#time
+[se-FileIngestionMethod]: ./serialization.md#fileingestionmethod
+[se-BuildMode]: ./serialization.md#buildmode
+[se-Verbosity]: ./serialization.md#verbosity
+[se-GCAction]: ./serialization.md#gcaction
+[se-Bytes]: ./serialization.md#bytes
+[se-String]: ./serialization.md#string
+[se-StorePath]: ./serialization.md#storepath
+[se-BaseStorePath]: ./serialization.md#basestorepath
+[se-OptStorePath]: ./serialization.md#optstorepath
+[se-ContentAddressMethodWithAlgo]: ./serialization.md#contentaddressmethodwithalgo
+[se-OptContentAddress]: ./serialization.md#optcontentaddress
+[se-DerivedPath]: ./serialization.md#derivedpath
+[se-DrvOutput]: ./serialization.md#drvoutput
+[se-Realisation]: ./serialization.md#realisation
+[se-List]: ./serialization.md#list-of-x
+[se-Set]: ./serialization.md#set-of-x
+[se-Map]: ./serialization.md#map-of-x-to-y
+[se-SubstitutablePathInfo]: ./serialization.md#substitutablepathinfo
+[se-ValidPathInfo]: ./serialization.md#validpathinfo
+[se-UnkeyedValidPathInfo]: ./serialization.md#unkeyedvalidpathinfo
+[se-BuildResult]: ./serialization.md#buildmode
+[se-KeyedBuildResult]: ./serialization.md#keyedbuildresult
+[se-BasicDerivation]: ./serialization.md#basicderivation
+[se-Framed]: ./serialization.md#framed
+[se-AddMultipleToStore]: ./serialization.md#addmultipletostore-format
+[se-ExportFormat]: ./serialization.md#export-path-format
+[se-ImportPaths]: ./serialization.md#import-paths-format
\ No newline at end of file
diff --git a/tvix/docs/src/nix-daemon/serialization.md b/tvix/docs/src/nix-daemon/serialization.md
new file mode 100644
index 000000000000..1042c956ba71
--- /dev/null
+++ b/tvix/docs/src/nix-daemon/serialization.md
@@ -0,0 +1,409 @@
+
+### UInt64
+Little endian byte order
+
+### Bytes
+
+- len :: [UInt64](#uint64)
+- len bytes of content
+- padding with zeros to ensure 64 bit alignment of content + padding
+
+
+## Int serializers
+
+### Int
+[UInt64](#uint64) cast to C `unsigned int` with upper bounds checking.
+
+### Int64
+[UInt64](#uint64) cast to C `int64_t` with upper bounds checking.
+This means that negative numbers can be written but not read.
+Since this is only used for cpuSystem and cpuUser it is fine that
+negative numbers aren't supported.
+
+### UInt8
+[UInt64](#uint64) cast to C `uint8_t` with upper bounds checking.
+
+### Size
+[UInt64](#uint64) cast to C `size_t` with upper bounds checking.
+
+### Time
+[UInt64](#uint64) cast to C `time_t` with upper bounds checking.
+This means that negative numbers can be written but not read.
+
+### Bool
+Sent as an [Int](#int) where 0 is false and everything else is true.
+
+### Bool64
+Sent as an [UInt64](#uint64) where 0 is false and everything else is true.
+
+### FileIngestionMethod
+An [UInt8](#uint8) enum with the following possible values:
+
+| Name       | Int |
+| ---------- | --- |
+| Flat       |  0  |
+| NixArchive |  1  |
+
+### BuildMode
+An [Int](#int) enum with the following possible values:
+
+| Name   | Int |
+| ------ | --- |
+| Normal |  0  |
+| Repair |  1  |
+| Check  |  2  |
+
+### Verbosity
+An [Int](#int) enum with the following possible values:
+
+| Name      | Int |
+| --------- | --- |
+| Error     |  0  |
+| Warn      |  1  |
+| Notice    |  2  |
+| Info      |  3  |
+| Talkative |  4  |
+| Chatty    |  5  |
+| Debug     |  6  |
+| Vomit     |  7  |
+
+### GCAction
+An [Int](#int) enum with the following possible values:
+
+| Name           | Int |
+| -------------- | --- |
+| ReturnLive     |  0  |
+| ReturnDead     |  1  |
+| DeleteDead     |  2  |
+| DeleteSpecific |  3  |
+
+### BuildStatus
+An [Int](#int) enum with the following possible values:
+
+| Name                   | Int |
+| ---------------------- | --- |
+| Built                  |  0  |
+| Substituted            |  1  |
+| AlreadyValid           |  2  |
+| PermanentFailure       |  3  |
+| InputRejected          |  4  |
+| OutputRejected         |  5  |
+| TransientFailure       |  6  |
+| CachedFailure          |  7  |
+| TimedOut               |  8  |
+| MiscFailure            |  9  |
+| DependencyFailed       | 10  |
+| LogLimitExceeded       | 11  |
+| NotDeterministic       | 12  |
+| ResolvesToAlreadyValid | 13  |
+| NoSubstituters         | 14  |
+
+### ActivityType
+An [Int](#int) enum with the following possible values:
+
+| Name          | Int |
+| ------------- | --- |
+| Unknown       |   0 |
+| CopyPath      | 100 |
+| FileTransfer  | 101 |
+| Realise       | 102 |
+| CopyPaths     | 103 |
+| Builds        | 104 |
+| Build         | 105 |
+| OptimiseStore | 106 |
+| VerifyPaths   | 107 |
+| Substitute    | 108 |
+| QueryPathInfo | 109 |
+| PostBuildHook | 110 |
+| BuildWaiting  | 111 |
+| FetchTree     | 112 |
+
+### ResultType
+An [Int](#int) enum with the following possible values:
+
+| Name             | Int |
+| ---------------- | --- |
+| FileLinked       | 100 |
+| BuildLogLine     | 101 |
+| UntrustedPath    | 102 |
+| CorruptedPath    | 103 |
+| SetPhase         | 104 |
+| Progress         | 105 |
+| SetExpected      | 106 |
+| PostBuildLogLine | 107 |
+| FetchStatus      | 108 |
+
+### FieldType
+An [Int](#int) enum with the following possible values:
+
+| Name   | Int |
+| ------ | --- |
+| Int    |  0  |
+| String |  1  |
+
+### OptTrusted
+An [UInt8](#uint8) optional enum with the following possible values:
+
+| Name             | Int |
+| ---------------- | --- |
+| None             |  0  |
+| Some(Trusted)    |  1  |
+| Some(NotTrusted) |  2  |
+
+
+## Bytes serializers
+
+### String
+Simply a [Bytes](#bytes) that has some UTF-8 string like semantics sometimes.
+
+### StorePath
+[String](#string) representation of a full store path including the store directory.
+
+### BaseStorePath
+[String](#string) representation of the basename of a store path. That is the store path
+without the /nix/store prefix.
+
+### StorePathName
+[String](#string) representation of the name part of a base store path. This is the part
+of the store path after the nixbase32 hash and '-'
+
+It must have the following format:
+- Deny ".", "..", or those strings followed by '-'
+- Otherwise check that each character is 0-9, a-z, A-Z or one of +-._?=
+
+### StorePathHash
+[String](#string) representation of the hash part of a base store path. This is the part
+of the store path at the beginning and before the '-' and is in nixbase32 format.
+
+
+### OutputName
+[String](#string) representation of the name of a derivation output.
+This is usually combined with the name in the derivation to form the store path name for the
+store path with this output.
+
+Since output name is usually combined to form a store path name its format must follow the
+same rules as [StorePathName](#storepathname):
+- Deny ".", "..", or those strings followed by '-'
+- Otherwise check that each character is 0-9, a-z, A-Z or one of +-._?=
+
+
+### OptStorePath
+Optional store path.
+
+If no store path this is serialized as the empty string otherwise it is the same as
+[StorePath](#storepath).
+
+### Path
+[String](#string) representation of an absolute path.
+
+### NARHash
+[String](#string) base16-encoded NAR SHA256 hash without algorithm prefix.
+
+### Signature
+[String](#string) with a signature for the given store path or realisation. This should be
+in the format `name`:`base 64 encoded signature` but this is not enforced in the protocol.
+
+### HashAlgorithm
+[String](#string) with one of the following values:
+- md5
+- sha1
+- sha256
+- sha512
+
+### HashDigest
+[String](#string) with a hash digest in any encoding
+
+### OptHashDigest
+Optional version of [HashDigest](#hashdigest) where empty string means
+no value.
+
+
+### ContentAddressMethodWithAlgo
+[String](#string) with one of the following formats:
+- text:[HashAlgorithm](#hashalgorithm)
+- fixed:r:[HashAlgorithm](#hashalgorithm)
+- fixed:[HashAlgorithm](#hashalgorithm)
+
+### OptContentAddressMethodWithAlgo
+Optional version of [ContentAddressMethodWithAlgo](#contentaddressmethodwithalgo)
+where empty string means no value.
+
+### ContentAddress
+[String](#string) with the format:
+- [ContentAddressMethodWithAlgo](#contentaddressmethodwithalgo):[HashDigest](#hashdigest)
+
+### OptContentAddress
+Optional version of [ContentAddress](#contentaddress) where empty string means
+no content address.
+
+### DerivedPath
+#### If protocol is 1.30 or newer
+output-names = [OutputName](#outputname), { "," [OutputName](#outputname) }<br>
+output-spec = "*" | output-names<br>
+derived-path = [StorePath](#storepath), [ "!", output-spec ]<br>
+
+#### If protocol is older than 1.30
+[StorePath](#storepath), [ "!", [OutputName](#outputname), { "," [OutputName](#outputname) } ]
+
+### DrvOutput
+[String](#string) with format:
+- `hash with any prefix` "!" [OutputName](#outputname)
+
+### Realisation
+A JSON object sent as a [String](#string).
+
+The JSON object has the following keys:
+| Key                   | Value                            |
+| --------------------- | -------------------------------- |
+| id                    | [DrvOutput](#drvoutput)          |
+| outPath               | [StorePath](#storepath)          |
+| signatures            | Array of [Signature](#signature) |
+| dependentRealisations | Object where key is [DrvOutput](#drvoutput) and value is [StorePath](#storepath) |
+
+
+## Complex serializers
+
+### List of x
+A list is encoded as a [Size](#size) length n followed by n encodings of x
+
+### Map of x to y
+A map is encoded as a [Size](#size) length n followed by n encodings of pairs of x and y
+
+### Set of x
+A set is encoded as a [Size](#size) length n followed by n encodings of x
+
+### BuildResult
+- status :: [BuildStatus](#buildstatus)
+- errorMsg :: [String](#string)
+
+#### Protocol 1.29 or newer
+- timesBuilt :: [Int](#int) (defaults to 0)
+- isNonDeterministic :: [Bool64](#bool64) (defaults to false)
+- startTime :: [Time](#time) (defaults to 0)
+- stopTime :: [Time](#time) (defaults to 0)
+
+#### Protocol 1.37 or newer
+- cpuUser :: [OptMicroseconds](#optmicroseconds) (defaults to none)
+- cpuSystem :: [OptMicroseconds](#optmicroseconds) (defaults to none)
+
+#### Protocol 1.28 or newer
+builtOutputs ::  [Map](#map-of-x-to-y) of [DrvOutput](#drvoutput) to [Realisation](#realisations)
+
+### KeyedBuildResult
+- path :: [DerivedPath](#derivedpath)
+- result :: [BuildResult](#buildresult)
+
+### OptMicroseconds
+Optional microseconds.
+
+- tag :: [UInt8](#uint8)
+
+#### If tag is 1
+- seconds :: [Int64](#int64)
+
+
+### SubstitutablePathInfo
+- deriver :: [OptStorePath](#optstorepath)
+- references :: [Set][#set-of-x] of [StorePath](#storepath)
+- downloadSize :: [UInt64](#uint64)
+- narSize :: [UInt64](#uint64)
+
+
+### UnkeyedValidPathInfo
+- deriver :: [OptStorePath](#optstorepath)
+- narHash :: [NARHash](#narhash)
+- references :: [Set](#set-of-x) of [StorePath](#storepath)
+- registrationTime :: [Time](#time)
+- narSize :: [UInt64](#uint64)
+
+#### If protocol version is 1.16 or above
+- ultimate :: [Bool64](#bool64) (defaults to false)
+- signatures :: [Set](#set-of-x) of [Signature](#signature)
+- ca :: [OptContentAddress](#optcontentaddress)
+
+
+### ValidPathInfo
+- path :: [StorePath](#storepath)
+- info :: [UnkeyedValidPathInfo](#unkeyedvalidpathinfo)
+
+### DerivationOutput
+- path :: [OptStorePath](#optstorepath)
+- hashAlgo :: [OptContentAddressMethodWithAlgo](#optcontentaddressmethodwithalgo)
+- hash :: [OptHashDigest](#opthashdigest)
+
+### BasicDerivation
+- outputs :: [Map](#map-of-x-to-y) of [OutputName](#outputname) to [DerivationOutput](#derivationoutput)
+- inputSrcs :: [Set](#set-of-x) of [StorePath](#storepath)
+- platform :: [String](#string)
+- builder :: [String](#string)
+- args :: [List](#list-of-x) of [String](#string)
+- env :: [Map](#map-of-x-to-y) of [String](#string) to [String](#string)
+
+### TraceLine
+- havePos :: [Size](#size) (hardcoded to 0)
+- hint :: [String](#string) (If logger is JSON, invalid UTF-8 is replaced with U+FFFD)
+
+### Error
+- type :: [String](#string) (hardcoded to `Error`)
+- level :: [Verbosity](#verbosity)
+- name :: [String](#string) (removed and hardcoded to `Error`)
+- msg :: [String](#string) (If logger is JSON, invalid UTF-8 is replaced with U+FFFD)
+- havePos :: [Size](#size) (hardcoded to 0)
+- traces :: [List](#list-of-x) of [TraceLine](#traceline)
+
+## Field
+- type :: [FieldType](#fieldtype)
+
+### If type is Int
+- value :: [UInt64](#uint64)
+
+### If type is String
+- value :: [String](#string)
+
+
+## Framed
+
+At protocol 1.23 [AddToStoreNar](./operations.md#addtostorenar) introduced a
+framed streaming for sending the NAR dump and later versions of the protocol
+also used this framing for other operations.
+
+At its core the framed streaming is just a series of [UInt64](#uint64) `size`
+followed by `size` bytes. The stream is terminated when `size` is zero.
+
+Unlike [Bytes](#bytes), frames are *NOT* padded.
+
+This method of sending data has the advantage of not having to parse the data
+to find where it ends. Older versions of the protocol would potentially parse
+the NAR twice.
+
+
+## AddMultipleToStore format
+
+Paths must be topologically sorted.
+
+- expected :: [UInt64](#uint64)
+
+### Repeated expected times
+- info :: [ValidPathInfo](#validpathinfo)
+- NAR dump
+
+
+## Export path format
+- NAR dump
+- 0x4558494es :: [Int](#int) (hardcoded, 'EXIN' in ASCII)
+- path :: [StorePath](#storepath)
+- references :: [Set](#set-of-x) of [StorePath](#storepath)
+- deriver :: [OptStorePath](#optstorepath)
+- hasSignature :: [Int](#int) (hardcoded to 0 in newer versions)
+
+#### If hasSignature is 1
+- signature :: [String](#string) (ignored)
+
+
+## Import paths format
+
+- hasNext :: [UInt64](#uint64)
+
+### While hasNext is 1
+- [Export path format](#export-path-format)
+- hasNext :: [UInt64](#uint64)
diff --git a/tvix/docs/src/store/api.md b/tvix/docs/src/store/api.md
new file mode 100644
index 000000000000..89495a0d1c32
--- /dev/null
+++ b/tvix/docs/src/store/api.md
@@ -0,0 +1,287 @@
+# tvix-[ca]store API
+
+This document outlines the design of the API exposed by tvix-castore and tvix-
+store, as well as other implementations of this store protocol.
+
+This document is meant to be read side-by-side with
+[Data Model](../castore/data-model.md) which describes the data model
+in more detail.
+
+The store API has four main consumers:
+
+1. The evaluator (or more correctly, the CLI/coordinator, in the Tvix
+   case) communicates with the store to:
+
+   * Upload files and directories (e.g. from `builtins.path`, or `src = ./path`
+     Nix expressions).
+   * Read files from the store where necessary (e.g. when `nixpkgs` is
+     located in the store, or for IFD).
+
+2. The builder communicates with the store to:
+
+   * Upload files and directories after a build, to persist build artifacts in
+     the store.
+
+3. Tvix clients (such as users that have Tvix installed, or, depending
+   on perspective, builder environments) expect the store to
+   "materialise" on disk to provide a directory layout with store
+   paths.
+
+4. Stores may communicate with other stores, to substitute already built store
+   paths, i.e. a store acts as a binary cache for other stores.
+
+The store API attempts to reuse parts of its API between these three
+consumers by making similarities explicit in the protocol. This leads
+to a protocol that is slightly more complex than a simple "file
+upload/download"-system, but at significantly greater efficiency, both in terms
+of deduplication opportunities as well as granularity.
+
+## The Store model
+
+Contents inside a tvix-store can be grouped into three different message types:
+
+ * Blobs
+ * Directories
+ * PathInfo (see further down)
+
+(check `castore.md` for more detailed field descriptions)
+
+### Blobs
+A blob object contains the literal file contents of regular (or executable)
+files.
+
+### Directory
+A directory object describes the direct children of a directory.
+
+It contains:
+ - name of child (regular or executable) files, and their [blake3][blake3] hash.
+ - name of child symlinks, and their target (as string)
+ - name of child directories, and their [blake3][blake3] hash (forming a Merkle DAG)
+
+### Content-addressed Store Model
+For example, lets consider a directory layout like this, with some
+imaginary hashes of file contents:
+
+```
+.
+├── file-1.txt        hash: 5891b5b522d5df086d0ff0b110fb
+└── nested
+    └── file-2.txt    hash: abc6fd595fc079d3114d4b71a4d8
+```
+
+A hash for the *directory* `nested` can be created by creating the `Directory`
+object:
+
+```json
+{
+  "directories": [],
+  "files": [{
+    "name": "file-2.txt",
+    "digest": "abc6fd595fc079d3114d4b71a4d8",
+    "size": 123,
+  }],
+  "symlink": [],
+}
+```
+
+And then hashing a serialised form of that data structure. We use the blake3
+hash of the canonical protobuf representation. Let's assume the hash was
+`ff0029485729bcde993720749232`.
+
+To create the directory object one layer up, we now refer to our `nested`
+directory object in `directories`, and to `file-1.txt` in `files`:
+
+```json
+{
+  "directories": [{
+    "name": "nested",
+    "digest": "ff0029485729bcde993720749232",
+    "size": 1,
+  }],
+  "files": [{
+    "name": "file-1.txt",
+    "digest": "5891b5b522d5df086d0ff0b110fb",
+    "size": 124,
+  }]
+}
+```
+
+This Merkle DAG of Directory objects, and flat store of blobs can be used to
+describe any file/directory/symlink inside a store path. Due to its content-
+addressed nature, it'll automatically deduplicate (re-)used (sub)directories,
+and allow substitution from any (untrusted) source.
+
+The thing that's now only missing is the metadata to map/"mount" from the
+content-addressed world to a physical path.
+
+### PathInfo
+As most paths in the Nix store currently are input-addressed [^input-addressed],
+and the `tvix-castore` data model is also not intrinsically using NAR hashes,
+we need something mapping from an input-addressed "output path hash" (or a Nix-
+specific content-addressed path) to the contents in the `tvix-castore` world.
+
+That's what `PathInfo` provides. It embeds the root node (Directory, File or
+Symlink) at a given store path.
+
+The root nodes' `name` field is populated with the (base)name inside
+`/nix/store`, so `xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-pname-1.2.3`.
+
+The `PathInfo` message also stores references to other store paths, and some
+more NARInfo-specific metadata (signatures, narhash, narsize).
+
+
+## API overview
+
+There's three different services:
+
+### BlobService
+`BlobService` can be used to store and retrieve blobs of data, used to host
+regular file contents.
+
+It is content-addressed, using [blake3][blake3]
+as a hashing function.
+
+As blake3 is a tree hash, there's an opportunity to do
+[verified streaming][bao] of parts of the file,
+which doesn't need to trust any more information than the root hash itself.
+Future extensions of the `BlobService` protocol will enable this.
+
+### DirectoryService
+`DirectoryService` allows lookups (and uploads) of `Directory` messages, and
+whole reference graphs of them.
+
+
+### PathInfoService
+The PathInfo service provides lookups from a store path hash to a `PathInfo`
+message.
+
+## Example flows
+
+Below there are some common use cases of tvix-store, and how the different
+services are used.
+
+###  Upload files and directories
+This is needed for `builtins.path` or `src = ./path` in Nix expressions (A), as
+well as for uploading build artifacts to a store (B).
+
+The path specified needs to be (recursively, BFS-style) traversed.
+ * All file contents need to be hashed with blake3, and submitted to the
+   *BlobService* if not already present.
+   A reference to them needs to be added to the parent Directory object that's
+   constructed.
+ * All symlinks need to be added to the parent directory they reside in.
+ * Whenever a Directory has been fully traversed, it needs to be uploaded to
+   the *DirectoryService* and a reference to it needs to be added to the parent
+   Directory object.
+
+Most of the hashing / directory traversal/uploading can happen in parallel,
+as long as Directory objects only refer to Directory objects and Blobs that
+have already been uploaded.
+
+When reaching the root, a `PathInfo` object needs to be constructed.
+
+ * In the case of content-addressed paths (A), the name of the root node is
+   based on the NAR representation of the contents.
+   It might make sense to be able to offload the NAR calculation to the store,
+   which can cache it.
+ * In the case of build artifacts (B), the output path is input-addressed and
+   known upfront.
+
+Contrary to Nix, this has the advantage of not having to upload a lot of things
+to the store that didn't change.
+
+### Reading files from the store from the evaluator
+This is the case when `nixpkgs` is located in the store, or IFD in general.
+
+The store client asks the `PathInfoService` for the `PathInfo` of the output
+path in the request, and looks at the root node.
+
+If something other than the root of the store path is requested, like for
+example `maintainers/maintainer-list.nix`, the root_node Directory is inspected
+and potentially a chain of `Directory` objects requested from
+*DirectoryService*. [^n+1query].
+
+When the desired file is reached, the *BlobService* can be used to read the
+contents of this file, and return it back to the evaluator.
+
+FUTUREWORK: Define how importing from symlinks should/does work.
+
+Contrary to Nix, this has the advantage of not having to copy all of the
+contents of a store path to the evaluating machine, but really only fetching
+the files the evaluator currently cares about.
+
+### Materializing store paths on disk
+This is useful for people running a Tvix-only system, or running builds on a
+"Tvix remote builder" in its own mount namespace.
+
+In a system with Nix installed, we can't simply manually "extract" things to
+`/nix/store`, as Nix assumes to own all writes to this location.
+In these use cases, we're probably better off exposing a tvix-store as a local
+binary cache (that's what `//tvix/nar-bridge` does).
+
+Assuming we are in an environment where we control `/nix/store` exclusively, a
+"realize to disk" would either "extract" things from the `tvix-store` to a
+filesystem, or expose a `FUSE`/`virtio-fs` filesystem.
+
+The latter is already implemented, and particularly interesting for (remote)
+build workloads, as build inputs can be realized on-demand, which saves copying
+around a lot of never- accessed files.
+
+In both cases, the API interactions are similar.
+ * The *PathInfoService* is asked for the `PathInfo` of the requested store path.
+ * If everything should be "extracted", the *DirectoryService* is asked for all
+   `Directory` objects in the closure, the file structure is created, all Blobs
+   are downloaded and placed in their corresponding location and all symlinks
+   are created accordingly.
+ * If this is a FUSE filesystem, we can decide to only request a subset,
+   similar to the "Reading files from the store from the evaluator" use case,
+   even though it might make sense to keep all Directory objects around.
+   (See the caveat in "Trust model" though!)
+
+### Stores communicating with other stores
+The gRPC API exposed by the tvix-store allows composing multiple stores, and
+implementing some caching strategies, that store clients don't need to be aware
+of.
+
+ * For example, a caching strategy could have a fast local tvix-store, that's
+   asked first and filled with data from a slower remote tvix-store.
+
+ * Multiple stores could be asked for the same data, and whatever store returns
+   the right data first wins.
+
+
+## Trust model / Distribution
+As already described above, the only non-content-addressed service is the
+`PathInfo` service.
+
+This means, all other messages (such as `Blob` and `Directory` messages) can be
+substituted from many different, untrusted sources/mirrors, which will make
+plugging in additional substitution strategies like IPFS, local network
+neighbors super simple. That's also why it's living in the `tvix-castore` crate.
+
+As for `PathInfo`, we don't specify an additional signature mechanism yet, but
+carry the NAR-based signatures from Nix along.
+
+This means, if we don't trust a remote `PathInfo` object, we currently need to
+"stream" the NAR representation to validate these signatures.
+
+However, the slow part is downloading of NAR files, and considering we have
+more granularity available, we might only need to download some small blobs,
+rather than a whole NAR file.
+
+A future signature mechanism, that is only signing (parts of) the `PathInfo`
+message, which only points to content-addressed data will enable verified
+partial access into a store path, opening up opportunities for lazy filesystem
+access etc.
+
+
+
+[blake3]: https://github.com/BLAKE3-team/BLAKE3
+[bao]: https://github.com/oconnor663/bao
+[^input-addressed]: Nix hashes the A-Term representation of a .drv, after doing
+                    some replacements on refered Input Derivations to calculate
+                    output paths.
+[^n+1query]: This would expose an N+1 query problem. However it's not a problem
+             in practice, as there's usually always a "local" caching store in
+             the loop, and *DirectoryService* supports a recursive lookup for
+             all `Directory` children of a `Directory`
diff --git a/tvix/docs/src/value-pointer-equality.md b/tvix/docs/src/value-pointer-equality.md
new file mode 100644
index 000000000000..a4539513ef73
--- /dev/null
+++ b/tvix/docs/src/value-pointer-equality.md
@@ -0,0 +1,340 @@
+# Value Pointer Equality in Nix
+
+## Introduction
+
+It is a piece of semi-obscure Nix trivia that while functions are generally not
+comparable, they can be compared in certain situations. This is actually quite an
+important fact, as it is essential for the evaluation of nixpkgs: The attribute sets
+used to represent platforms in nixpkgs, like `stdenv.buildPlatform`, contain functions,
+such as `stdenv.buildPlatform.canExecute`. When writing cross logic, one invariably
+ends up writing expressions that compare these sets, e.g. `stdenv.buildPlatform !=
+stdenv.hostPlatform`. Since attribute set equality is the equality of their attribute
+names and values, we also end up comparing the functions within them.  We can summarize
+the relevant part of this behavior for platform comparisons in the following (true)
+Nix expressions:
+
+* `stdenv.hostPlatform.canExecute != stdenv.hostPlatform.canExecute`
+* `stdenv.hostPlatform == stdenv.hostPlatform`
+
+This fact is commonly referred to as pointer equality of functions (or function pointer
+equality) which is not an entirely accurate name, as we'll see. This account of the
+behavior states that, while functions are incomparable in general, they are comparable
+insofar, as they occupy the same spot in an attribute set.
+
+However, [a maybe lesser known trick][puck-issue] is to write a function such as the
+following to allow comparing functions:
+
+```nix
+let
+  pointerEqual = lhs: rhs: { x = lhs; } == { x = rhs; };
+
+  f = name: "Hello, my name is ${name}";
+  g = name: "Hello, my name is ${name}";
+in
+[
+  (pointerEqual f f) # => true
+  (pointerEqual f g) # => false
+]
+```
+
+Here, clearly, the function is not contained at the same position in one and the same
+attribute set, but at the same position in two entirely different attribute sets. We can
+also see that we are not comparing the functions themselves (e.g. their AST), but
+rather if they are the same individual value (i.e. pointer equal).
+
+To figure out the _actual_ semantics, we'll first have a look at how value (pointer) equality
+works in C++ Nix, the only production ready Nix implementation currently available.
+
+## Nix (Pointer) Equality in C++ Nix
+
+```admonish info
+The summary presented here is up-to-date as of 2023-06-27 and was tested with
+Nix 2.3, 2.11 and 2.15.
+```
+
+### `EvalState::eqValues` and `ExprOpEq::eval`
+
+The function implementing equality in C++ Nix is `EvalState::eqValues` which starts with
+[the following bit of code][eqValues-pointer-eq]:
+
+```cpp
+bool EvalState::eqValues(Value & v1, Value & v2)
+{
+    forceValue(v1);
+    forceValue(v2);
+
+    /* !!! Hack to support some old broken code that relies on pointer
+       equality tests between sets.  (Specifically, builderDefs calls
+       uniqList on a list of sets.)  Will remove this eventually. */
+    if (&v1 == &v2) return true;
+```
+
+So this immediately looks more like pointer equality of arbitrary *values* instead of functions. In fact
+there is [no special code facilitating function equality][eqValues-function-eq]:
+
+```cpp
+        /* Functions are incomparable. */
+        case nFunction:
+            return false;
+```
+
+So one takeaway of this is that pointer equality is neither dependent on functions nor attribute sets.
+In fact, we can also write our `pointerEqual` function as:
+
+```nix
+lhs: rhs: [ lhs ] == [ rhs ]
+```
+
+It's interesting that `EvalState::eqValues` forces the left and right-hand value before trying pointer
+equality. It explains that `let x = throw ""; in x == x` does not evaluate successfully, but it is puzzling why
+`let f = x: x; in f == f` does not return `true`. In fact, why do we need to wrap the values in a list or
+attribute set at all for our `pointerEqual` function to work?
+
+The answer lies in [the code that evaluates `ExprOpEq`][ExprOpEq],
+i.e. an expression involving the `==` operator:
+
+```cpp
+void ExprOpEq::eval(EvalState & state, Env & env, Value & v)
+{
+    Value v1; e1->eval(state, env, v1);
+    Value v2; e2->eval(state, env, v2);
+    v.mkBool(state.eqValues(v1, v2));
+}
+```
+
+As you can see, two _distinct_ `Value` structs are created, so they can never be pointer equal even
+if the `union` inside points to the same bit of memory. We can thus understand what actually happens
+when we check the equality of an attribute set (or list), by looking at the following expression:
+
+```nix
+let
+  x = { name = throw "nameless"; };
+in
+
+x == x # => causes an evaluation error
+```
+
+Because `x` can't be pointer equal, as it'll end up in the distinct structs `v1` and `v2`, it needs to be compared
+by value. For this reason, the `name` attribute will be forced and an evaluation error caused.
+If we rewrite the expression to use…
+
+```nix
+{ inherit x; } == { inherit x; } # => true
+```
+
+…, it'll work: The two attribute sets are compared by value, but their `x` attribute turns out to be pointer
+equal _after_ forcing it. This does not throw, since forcing an attribute set does not force its attributes'
+values (as forcing a list doesn't force its elements).
+
+As we have seen, pointer equality can not only be used to compare function values, but also other
+otherwise incomparable values, such as lists and attribute sets that would cause an evaluation
+error if they were forced recursively. We can even switch out the `throw` for an `abort`. The limitation is
+of course that we need to use a value that behaves differently depending on whether it is forced
+“normally” (think `builtins.seq`) or recursively (think `builtins.deepSeq`), so thunks will generally be
+evaluated before pointer equality can kick into effect.
+
+### Other Comparisons
+
+The `!=` operator uses `EvalState::eqValues` internally as well, so it behaves exactly as `!(a == b)`.
+
+The `>`, `<`, `>=` and `<=` operators all desugar to [CompareValues][]
+eventually which generally looks at the value type before comparing. It does,
+however, rely on `EvalState::eqValues` for list comparisons
+([introduced in Nix 2.5][nix-2.5-changelog]), so it is possible to compare lists
+with e.g. functions in them, as long as they are equal by pointer:
+
+```nix
+let
+  f = x: x + 42;
+in
+
+[
+  ([ f 2 ] > [ f 1 ]) # => true
+  ([ f 2 ] > [ (x: x) 1]) # => error: cannot compare a function with a function
+  ([ f ] > [ f ]) # => false
+]
+```
+
+Finally, since `builtins.elem` relies on `EvalState::eqValues`, you can check for
+a function by pointer equality:
+
+```nix
+let
+  f = x: f x;
+in
+builtins.elem f [ f 2 3 ] # => true
+```
+
+### Pointer Equality Preserving Nix Operations
+
+We have seen that pointer equality is established by comparing the memory
+location of two C++ `Value` structs. But how does this _representation_ relate
+to Nix values _themselves_ (in the sense of a platonic ideal if you will)? In
+Nix, values have no identity (ignoring `unsafeGetAttrPos`) or memory location.
+
+Since Nix is purely functional, values can't be mutated, so they need to be
+copied frequently. With Nix being garbage collected, there is no strong
+expectation when a copy is made, we probably just hope it is done as seldomly as
+possible to save on memory. With pointer equality leaking the memory location of
+the `Value` structs to an extent, it is now suddenly our business to know
+exactly _when_ a copy of a value is made.
+
+Evaluation in C++ Nix mainly proceeds along the following [two
+functions][eval-maybeThunk].
+
+```cpp
+struct Expr
+{
+    /* … */
+    virtual void eval(EvalState & state, Env & env, Value & v);
+    virtual Value * maybeThunk(EvalState & state, Env & env);
+    /* … */
+};
+```
+
+As you can see, `Expr::eval` always takes a reference to a struct _allocated by
+the caller_ to place the evaluation result in. Anything that is processed using
+`Expr::eval` will be a copy of the `Value` struct even if the value before and
+after are the same.
+
+`Expr::maybeThunk`, on the other hand, returns a pointer to a `Value` which may
+already exist or be newly allocated. So, if evaluation passes through `maybeThunk`,
+Nix values _can_ retain their pointer equality. Since Nix is lazy, a lot of
+evaluation needs to be thunked and pass through `maybeThunk`—knowing under what
+circumstances `maybeThunk` will return a pointer to an already existing `Value`
+struct thus means knowing the circumstances under which pointer equality of a
+Nix value will be preserved in C++ Nix.
+
+The [default case][maybeThunk-default] of `Expr::maybeThunk` allocates a new
+`Value` which holds the delayed computation of the `Expr` as a thunk:
+
+```cpp
+
+Value * Expr::maybeThunk(EvalState & state, Env & env)
+{
+    Value * v = state.allocValue();
+    mkThunk(*v, env, this);
+    return v;
+}
+```
+
+Consequently, only special cased expressions could preserve pointer equality.
+These are `ExprInt`, `ExprFloat`, `ExprString`, `ExprPath`—all of which relate
+to creating new values—and [finally, `ExprVar`][maybeThunk-ExprVar]:
+
+```cpp
+Value * ExprVar::maybeThunk(EvalState & state, Env & env)
+{
+    Value * v = state.lookupVar(&env, *this, true);
+    /* The value might not be initialised in the environment yet.
+       In that case, ignore it. */
+    if (v) { state.nrAvoided++; return v; }
+    return Expr::maybeThunk(state, env);
+}
+```
+
+Here we may actually return an already existing `Value` struct. Consequently,
+accessing a value from the scope is the only thing you can do with a value in
+C++ Nix that preserves its pointer equality, as the following example shows:
+For example, using the select operator to get a value from an attribute set
+or even passing a value trough the identity function invalidates its pointer
+equality to itself (or rather, its former self).
+
+```nix
+let
+  pointerEqual = a: b: [ a ] == [ b ];
+  id = x: x;
+
+  f = _: null;
+  x = { inherit f; };
+  y = { inherit f; };
+in
+
+[
+  (pointerEqual f f)      # => true
+
+  (pointerEqual f (id f)) # => false
+
+  (pointerEqual x.f y.f)  # => false
+  (pointerEqual x.f x.f)  # => false
+
+  (pointerEqual x x)      # => true
+  (pointerEqual x y)      # => true
+]
+```
+
+In the last two cases, the example also shows that there is another way to
+preserve pointer equality: Storing a value in an attribute set (or list)
+preserves its pointer equality even if the structure holding it is modified in
+some way (as long as the value we care about is left untouched). The catch is,
+of course, that there is no way to get the value out of the structure while
+preserving pointer equality (which requires using the select operator or a call
+to `builtins.elemAt`).
+
+We initially illustrated the issue of pointer equality using the following
+true expressions:
+
+* `stdenv.hostPlatform.canExecute != stdenv.hostPlatform.canExecute`
+* `stdenv.hostPlatform == stdenv.hostPlatform`
+
+We can now add a third one, illustrating that pointer equality is invalidated
+by select operations:
+
+* `[ stdenv.hostPlatform.canExecute ] != [ stdenv.hostPlatform.canExecute ]`
+
+To summarize, pointer equality is established on the memory location of the
+`Value` struct in C++ Nix. Except for simple values (`int`, `bool`, …),
+the `Value` struct only consists of a pointer to the actual representation
+of the value (attribute set, list, function, …) and is thus cheap to copy.
+In practice, this happens when a value passes through the evaluation of
+almost any Nix expression. Only in the select cases described above
+a value preserves its pointer equality despite being unchanged by an
+expression. We can call this behavior *exterior pointer equality*.
+
+## Summary
+
+When comparing two Nix values, we must force both of them (non-recursively!), but are
+allowed to short-circuit the comparison based on pointer equality, i.e. if they are at
+the same exact value in memory, they are deemed equal immediately. This is completely
+independent of what type of value they are. If they are not pointer equal, they are
+(recursively) compared by value as expected.
+
+However, when evaluating the Nix expression `a == b`, we *must* invoke our implementation's
+value equality function in a way that `a` and `b` themselves can never be deemed pointer equal.
+Any values we encounter while recursing during the equality check must be compared by
+pointer as described above, though.
+
+## Stability of the Feature
+
+Keen readers will have noticed the following comment in the C++ Nix source code,
+indicating that pointer comparison may be removed in the future.
+
+```cpp
+    /* !!! Hack to support some old broken code that relies on pointer
+       equality tests between sets.  (Specifically, builderDefs calls
+       uniqList on a list of sets.)  Will remove this eventually. */
+```
+
+Now, I can't speak for the upstream C++ Nix developers, but sure can speculate.
+As already pointed out, this feature is currently needed for evaluating nixpkgs.
+While its use could realistically be eliminated (only bothersome spot is probably
+the `emulator` function, but that should also be doable), removing the feature
+would seriously compromise C++ Nix's ability to evaluate historical nixpkgs
+revision which is arguably a strength of the system.
+
+Another indication that it is likely here to stay is that it has already
+[outlived builderDefs][], even though
+it was (apparently) reintroduced just for this use case. More research into
+the history of this feature would still be prudent, especially the reason for
+its original introduction (maybe performance?).
+
+[puck-issue]: https://github.com/NixOS/nix/issues/3371
+[eqValues-pointer-eq]: https://github.com/NixOS/nix/blob/3c618c43c6044eda184df235c193877529e951cb/src/libexpr/eval.cc#L2401-L2404
+[eqValues-function-eq]: https://github.com/NixOS/nix/blob/3c618c43c6044eda184df235c193877529e951cb/src/libexpr/eval.cc#L2458-L2460
+[ExprOpEq]: https://github.com/NixOS/nix/blob/3c618c43c6044eda184df235c193877529e951cb/src/libexpr/eval.cc#L1822-L1827
+[outlived builderDefs]: https://github.com/NixOS/nixpkgs/issues/4210
+[CompareValues]: https://github.com/NixOS/nix/blob/3c618c43c6044eda184df235c193877529e951cb/src/libexpr/primops.cc#L569-L610
+[nix-2.5-changelog]: https://nixos.org/manual/nix/stable/release-notes/rl-2.5.html
+[eval-maybeThunk]: https://github.com/NixOS/nix/blob/3c618c43c6044eda184df235c193877529e951cb/src/libexpr/nixexpr.hh#L161-L162
+[maybeThunk-default]: https://github.com/NixOS/nix/blob/8e770dac9f68162cfbb368e53f928df491babff3/src/libexpr/eval.cc#L1076-L1081
+[maybeThunk-ExprVar]: https://github.com/NixOS/nix/blob/8e770dac9f68162cfbb368e53f928df491babff3/src/libexpr/eval.cc#L1084-L1091