diff options
author | Vincent Ambo <mail@tazj.in> | 2022-08-09T13·56+0300 |
---|---|---|
committer | tazjin <tazjin@tvl.su> | 2022-08-13T15·31+0000 |
commit | 6f13c16f28de01209c92e9b9415b68b00b913894 (patch) | |
tree | 0150aa77147ed385432873be481ba3e30a86a91f /tvix | |
parent | 57a723aaa991b91f58a9a2b64c641d60bdbee2bb (diff) |
docs(tvix/eval): add design documentation for attrset opcodes r/4428
Change-Id: I303b57e035543f4597c6247983d1d533e4014638 Reviewed-on: https://cl.tvl.fyi/c/depot/+/6092 Tested-by: BuildkiteCI Reviewed-by: grfn <grfn@gws.fyi>
Diffstat (limited to 'tvix')
-rw-r--r-- | tvix/eval/docs/opcodes-attrsets.md | 122 |
1 files changed, 122 insertions, 0 deletions
diff --git a/tvix/eval/docs/opcodes-attrsets.md b/tvix/eval/docs/opcodes-attrsets.md new file mode 100644 index 000000000000..7026f3319dda --- /dev/null +++ b/tvix/eval/docs/opcodes-attrsets.md @@ -0,0 +1,122 @@ +# attrset-opcodes + +The problem with attrset literals is twofold: + +1. The keys of attribute sets may be dynamically evaluated. + + Access: + + ```nix + let + k = "foo"; + attrs = { /* etc. */ }; + in attrs."${k}" + ``` + + Literal: + ```nix + let + k = "foo"; + in { + "${k}" = 42; + } + ``` + + The problem with this is that the attribute set key is not known at + compile time, and needs to be dynamically evaluated by the VM as an + expression. + + For the most part this should be pretty simple, assuming a + theoretical instruction set: + + ``` + 0000 OP_CONSTANT(0) # key "foo" + 0001 OP_CONSTANT(1) # value 42 + 0002 OP_ATTR_SET(1) # construct attrset from 2 stack values + ``` + + The operation pushing the key needs to be replaced with one that + leaves a single value (the key) on the stack, i.e. the code for the + expression, e.g.: + + ``` + 0000..000n <operations leaving a string value on the stack> + 000n+1 OP_CONSTANT(1) # value 42 + 000n+2 OP_ATTR_SET(1) # construct attrset from 2 stack values + ``` + + This is fairly easy to do by simply recursing in the compiler when + the key expression is encountered. + +2. The keys of attribute sets may be nested. + + This is the non-trivial part of dealing with attribute set + literals. Specifically, the nesting can be arbitrarily deep and the + AST does not guarantee that related set keys are located + adjacently. + + Furthermore, this frequently occurs in practice in Nix. We need a + bytecode representation that makes it possible to construct nested + attribute sets at runtime. + + Proposal: AttrPath values + + If we can leave a value representing an attribute path on the + stack, we can offload the construction of nested attribute sets to + the `OpAttrSet` operation. + + Under the hood, OpAttrSet in practice constructs a `Map<NixString, + Value>` attribute set in most cases. This means it expects to pop + the value of the key of the stack, but is otherwise free to do + whatever it wants with the underlying map. + + In a simple example, we could have code like this: + + ```nix + { + a.b = 15; + } + ``` + + This would be compiled to a new `OpAttrPath` instruction that + constructs and pushes an attribute path from a given number of + fragments (which are popped off the stack). + + For example, + + ``` + 0000 OP_CONSTANT(0) # key "a" + 0001 OP_CONSTANT(1) # key "b" + 0002 OP_ATTR_PATH(2) # construct attrpath from 2 fragments + 0003 OP_CONSTANT(2) # value 42 + 0004 OP_ATTRS(1) # construct attrset from one pair + ``` + + Right before `0004` the stack would be left like this: + + [ AttrPath[a,b], 42 ] + + Inside of the `OP_ATTRS` instruction we could then begin + construction of the map and insert the nested attribute sets as + required, as well as validate that there are no duplicate keys. + +3. Both of these cases can occur simultaneously, but this is not a + problem as the opcodes combine perfectly fine, e.g.: + + ```nix + let + k = "a"; + in { + "${k}".b = 42; + } + ``` + + results in + + ``` + 0000..000n <operations leaving a string value on the stack> + 000n+1 OP_CONSTANT(1) # key "b" + 000n+2 OP_ATTR_PATH(2) # construct attrpath from 2 fragments + 000n+3 OP_CONSTANT(2) # value 42 + 000n+4 OP_ATTR_SET(1) # construct attrset from 2 stack values + ``` |