about summary refs log tree commit diff
path: root/tvix/eval/docs/opcodes-attrsets.md

attrset-opcodes

The problem with attrset literals is twofold:

  1. The keys of attribute sets may be dynamically evaluated.

    Access:

    let
      k = "foo";
      attrs = { /* etc. */ };
    in attrs."${k}"
    

    Literal:

    let
      k = "foo";
    in {
      "${k}" = 42;
    }
    

    The problem with this is that the attribute set key is not known at compile time, and needs to be dynamically evaluated by the VM as an expression.

    For the most part this should be pretty simple, assuming a theoretical instruction set:

    0000  OP_CONSTANT(0) # key "foo"
    0001  OP_CONSTANT(1) # value 42
    0002  OP_ATTR_SET(1) # construct attrset from 2 stack values
    

    The operation pushing the key needs to be replaced with one that leaves a single value (the key) on the stack, i.e. the code for the expression, e.g.:

    0000..000n <operations leaving a string value on the stack>
    000n+1     OP_CONSTANT(1) # value 42
    000n+2     OP_ATTR_SET(1) # construct attrset from 2 stack values
    

    This is fairly easy to do by simply recursing in the compiler when the key expression is encountered.

  2. The keys of attribute sets may be nested.

    This is the non-trivial part of dealing with attribute set literals. Specifically, the nesting can be arbitrarily deep and the AST does not guarantee that related set keys are located adjacently.

    Furthermore, this frequently occurs in practice in Nix. We need a bytecode representation that makes it possible to construct nested attribute sets at runtime.

    Proposal: AttrPath values

    If we can leave a value representing an attribute path on the stack, we can offload the construction of nested attribute sets to the OpAttrSet operation.

    Under the hood, OpAttrSet in practice constructs a Map<NixString, Value> attribute set in most cases. This means it expects to pop the value of the key of the stack, but is otherwise free to do whatever it wants with the underlying map.

    In a simple example, we could have code like this:

    {
      a.b = 15;
    }
    

    This would be compiled to a new OpAttrPath instruction that constructs and pushes an attribute path from a given number of fragments (which are popped off the stack).

    For example,

    0000 OP_CONSTANT(0)  # key "a"
    0001 OP_CONSTANT(1)  # key "b"
    0002 OP_ATTR_PATH(2) # construct attrpath from 2 fragments
    0003 OP_CONSTANT(2)  # value 42
    0004 OP_ATTRS(1)     # construct attrset from one pair
    

    Right before 0004 the stack would be left like this:

    [ AttrPath[a,b], 42 ]

    Inside of the OP_ATTRS instruction we could then begin construction of the map and insert the nested attribute sets as required, as well as validate that there are no duplicate keys.

  3. Both of these cases can occur simultaneously, but this is not a problem as the opcodes combine perfectly fine, e.g.:

    let
      k = "a";
    in {
      "${k}".b = 42;
    }
    

    results in

    0000..000n <operations leaving a string value on the stack>
    000n+1     OP_CONSTANT(1)  # key "b"
    000n+2     OP_ATTR_PATH(2) # construct attrpath from 2 fragments
    000n+3     OP_CONSTANT(2)  # value 42
    000n+4     OP_ATTR_SET(1)  # construct attrset from 2 stack values