|
This fixes a very complicated bug (b/246). Evaluation progresses
*much* further after this, which will likely uncover several less
complicated bugs.
What was the problem?
=====================
Previously, when evaluating a thunk, we had a code path that looked
like this:
    match *thunk {
        ThunkRepr::Evaluated(Value::Thunk(ref inner_thunk)) => {
            let inner_repr = inner_thunk.0.borrow().clone();
            drop(thunk);
            self.0.replace(inner_repr);
        }
        /* ... */
    }
This code path created a copy of the inner `ThunkRepr` of a nested
thunk, and moved that copy into the `ThunkRepr` of the parent.
The effect of this was that the original `ThunkRepr` (unforced!) lived
on in the original thunk, without the memoization of the subsequent
forcing applying to it.
As a result, Tvix would repeatedly evaluate these thunks without ever
memoizing them if they occurred repeatedly as shared inner thunks.
Most notably, this would *always* occur when `builtins.import` was
used.
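A minimal sketch of this failure mode, using simplified stand-in types.
`Repr`, `Thunk`, `force_buggy` and `force_shared` are illustrative names
and not the actual tvix-eval definitions; the evaluation counter only
exists to make the repeated work visible.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

// Simplified stand-in for ThunkRepr: either not yet computed, or memoized.
#[derive(Clone)]
enum Repr {
    Suspended,
    Evaluated(i64),
}

// A thunk is a shared, mutable cell holding its representation.
#[derive(Clone)]
struct Thunk(Rc<RefCell<Repr>>);

impl Thunk {
    fn new() -> Self {
        Thunk(Rc::new(RefCell::new(Repr::Suspended)))
    }

    // Forces the thunk in place; `evals` counts how often real work happens.
    fn force(&self, evals: &Cell<u32>) -> i64 {
        let current = self.0.borrow().clone();
        match current {
            Repr::Evaluated(v) => v,
            Repr::Suspended => {
                evals.set(evals.get() + 1);
                *self.0.borrow_mut() = Repr::Evaluated(42);
                42
            }
        }
    }
}

// Buggy pattern: copy the inner repr into a fresh parent and force the copy.
// The memoized result lands in the copy; `inner` itself stays suspended.
fn force_buggy(inner: &Thunk, evals: &Cell<u32>) -> i64 {
    let parent = Thunk(Rc::new(RefCell::new(inner.0.borrow().clone())));
    parent.force(evals)
}

// Intended behaviour: force the shared inner thunk itself, so that every
// holder of the same Rc observes the memoized value.
fn force_shared(inner: &Thunk, evals: &Cell<u32>) -> i64 {
    inner.force(evals)
}
```

Calling `force_buggy` twice on the same inner thunk evaluates it twice,
while `force_shared` evaluates it once and reuses the result.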
What's the solution?
====================
I have completely rewritten `Thunk::force_trampoline_self` to make all
flows that can occur in it explicit. I have also removed the outer
loop inside of that function, and resorted to more use of trampolining
instead.
The function is now well-commented and it should be possible to read
it from top-to-bottom and get a general sense of what is going on,
though the trampolining itself (which is implemented in the VM) needs
to be at least partially understood for this.
What are the new problems?
==========================
One new (known) problem is that we have to construct `Error` instances
in all error types here, but we do not have spans available in some
thunk-related situations. Due to b/238 we cannot ask the VM for an
arbitrary span from the callsite leading to the force. This means that
there are now code paths where, under certain conditions, causing an
evaluation error during thunk forcing will panic.
To fix this we will need to investigate and fix b/238, and/or add a
span tracking mechanism to thunks themselves.
What other impacts does this have?
==================================
With this commit, eval of nixpkgs mostly succeeds (things like stdenv
evaluate to the same hashes for us and C++ Nix, meaning we now
construct identical derivations without eval breaking).
Due to this we progress much further into nixpkgs, which lets us
uncover additional bugs. For example, after this commit we can
quickly see that cl/7949 introduces some kind of behavioural issue and
should not be merged as-is (this was not apparent before).
Additionally, tvix-eval is now seemingly very fast. When doing
performance analysis of a nixpkgs eval, we now mostly see the code
path for shelling out to C++ Nix to add things to the store in there.
We still need those code paths, so we cannot (yet) do a performance
analysis beyond that.
Change-Id: I738525bad8bc5ede5d8c737f023b14b8f4160612
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8012
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
|
|
Previously, all invocations of the builtin macro had to go through
the `builtin_tuple` function, but it's more sensible to directly
return these from the macro.
Change-Id: I45600ba84d56c9528d3e92570461c319eea595ce
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7825
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
|
|
This is unnecessary; `Rc` already provides all the boxing we need.
Change-Id: I08cf0939c48da43f04c847526c7e5dae5336d528
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7749
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
Reviewed-by: sterni <sternenseemann@systemli.org>
|
|
Previously the construction of globals (a compiler-only concept) and
builtins (a (now) user-facing API) was intermingled between multiple
different modules, and kind of difficult to understand.
The complexity of this had grown in large part due to the
implementation of `builtins.import`, which required the notorious
"knot-tying" trick using Rc::new_cyclic (see cl/7097) for constructing
the set of globals.
As part of the new `Evaluation` API users should have the ability to
bring their own builtins, and control explicitly whether or not impure
builtins are available (regardless of whether they're compiled in or
not).
To streamline the construction and allow the new API features to work,
this commit restructures things by making these changes:
1. The `tvix_eval::builtins` module is now only responsible for
   exporting sets of builtins. It no longer has any knowledge of
   whether or not certain sets (e.g. only pure, or pure+impure) are
   enabled, and it has no control over which builtins are globally
   available (this is now handled in the compiler).
2. The compiler module is now responsible for both constructing the
   final attribute set of builtins from the set of builtins supplied
   by a user, as well as for populating its globals (that is,
   identifiers which are available at the top-level scope).
3. The `Evaluation` API now carries a `builtins` field which is
   populated with the pure builtins by default, and can be extended by
   users.
4. The `import` feature has been moved into the compiler, as a
   special case. In general, builtins no longer have the ability to
   reference the "fix point" of the globals set.
This should not change any functionality, and in fact preserves minor
differences between Tvix/Nix that we already had (such as
`builtins.builtins` not existing).
Change-Id: Icdf5dd50eb81eb9260d89269d6e08b1e67811a2c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7738
Reviewed-by: sterni <sternenseemann@systemli.org>
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
|
|
Change-Id: I05732073155b430575babb6f076bf465aef98857
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7581
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
|
|
Returns the store directory through EvalIO::store_dir.
Note that this is _optional_ in Tvix, as an evaluation can occur in a
context where there simply is no store directory. In those contexts,
`builtins.storeDir` returns `null` in Tvix.
This would only happen in contexts like Tvixbolt (or completely
unrelated use-cases) in practice.
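A minimal sketch of how an optional store directory could map to
`null`. The trait here is a trimmed-down, hypothetical version of
`EvalIO`; the exact signature of `store_dir`, the `Value` enum and the
`NoStoreIO`/`NixStoreIO` implementations are assumptions made for
illustration.

```rust
// Hypothetical, trimmed-down trait; the real EvalIO has more methods and
// its `store_dir` signature may differ.
trait EvalIO {
    // Default: this context has no store directory.
    fn store_dir(&self) -> Option<String> {
        None
    }
}

// An I/O implementation without a store, e.g. for a sandboxed playground.
struct NoStoreIO;
impl EvalIO for NoStoreIO {}

// An I/O implementation that knows about a conventional store location.
struct NixStoreIO;
impl EvalIO for NixStoreIO {
    fn store_dir(&self) -> Option<String> {
        Some("/nix/store".to_string())
    }
}

// Stand-in for a Nix value; only the variants needed here.
#[derive(Debug, PartialEq)]
enum Value {
    Null,
    String(String),
}

// `builtins.storeDir` yields the directory if present, `null` otherwise.
fn builtin_store_dir(io: &dyn EvalIO) -> Value {
    match io.store_dir() {
        Some(dir) => Value::String(dir),
        None => Value::Null,
    }
}
```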
Co-Authored-By: Vincent Ambo <tazjin@tvl.su>
Change-Id: I5a752c7e89b2f75bd7efb082dbfa5b25e3b1ff3b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7452
Autosubmit: Adam Joseph <adam@westernsemico.com>
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
|
|
Change-Id: I6d782c07166f51587d2f1d06607823268debb5d5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7574
Reviewed-by: grfn <grfn@gws.fyi>
Tested-by: BuildkiteCI
|
|
Change-Id: I49822ce30137777865e7370ee86666636e277b35
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7573
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
|
|
With this change, the behaviour of reading a string from a file path
is controlled by the provided `EvalIO` structure.
This is a huge step towards abstracting away I/O behaviour correctly.
Change-Id: Ifde8e46cd863b16e0301dca45a434ad27560399f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7567
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
|
|
This type carries the information required for calculating a
span (i.e. the chunk and offset), instead of the span itself. The span
is then only calculated in cases where it is required (when throwing
errors).
This reduces the eval time for
`builtins.length (builtins.attrNames (import <nixpkgs> {}))` by *one
third*!
The data structure in chunks that carries span information reduces
in-memory size by trading off the speed of retrieving span
information. This is because the span information is only actually
required when throwing errors (or emitting warnings).
However, somewhere along the way we grew a dependency on carrying span
information in thunks (for correctly reporting error chains). Hitting
the code paths for span retrieval was expensive, and carrying the
spans in a different way would still be less cache-efficient. This
change is the best tradeoff I could come up with.
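A rough sketch of the idea, not the actual implementation: `Span`,
`Chunk` and `LightSpan` below are hypothetical stand-ins, and the
run-length encoded span table is an assumption about how a chunk might
trade lookup speed for memory.

```rust
use std::rc::Rc;

// Hypothetical stand-in for a source span.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: u32,
    end: u32,
}

// Assumed layout: spans are stored as runs of (instruction count, span),
// which is compact but requires a walk to find the span for an offset.
struct Chunk {
    spans: Vec<(usize, Span)>,
}

impl Chunk {
    // Comparatively slow lookup: walks the runs until the offset is covered.
    fn span_at(&self, offset: usize) -> Option<Span> {
        let mut covered = 0;
        for (count, span) in &self.spans {
            covered += count;
            if offset < covered {
                return Some(*span);
            }
        }
        None
    }
}

// Cheap to create and carry around: a reference-counted chunk and an offset.
struct LightSpan {
    chunk: Rc<Chunk>,
    offset: usize,
}

impl LightSpan {
    // Only called when an error or warning actually needs the span.
    fn resolve(&self) -> Option<Span> {
        self.chunk.span_at(self.offset)
    }
}
```

Creating a `LightSpan` costs one reference count increment and one
integer copy; the walk over the span table is deferred to `resolve`.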
Refs: b/229.
Change-Id: I27d4c4b5c5f9be90ac47f2db61941e123a78a77b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7558
Reviewed-by: grfn <grfn@gws.fyi>
Tested-by: BuildkiteCI
|
|
Fixes b/212. Based on feedback in https://cl.tvl.fyi/c/depot/+/7492, all
uses of `NixAttrs::from_map` have been removed. Only `from_iter` and
`from_kv` remain.
Change-Id: I52e25f73018c2aa1843197427516b7a852503e2c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7500
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Autosubmit: IslandUsurper <lyle@menteeth.us>
|
|
Before this, tvix was spending most of its time furiously re-parsing
and re-compiling nixpkgs, each time hoping to get a different result...
Change-Id: I1c0cfbf9af622c276275b1f2fb8d4e976f1b5533
Signed-off-by: Adam Joseph <adam@westernsemico.com>
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7361
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
|
|
Add a new `documentation: Option<&'static str>` field to Builtin, and
populate it in the `#[builtins]` macro with the docstring of the builtin
function, if any.
Change-Id: Ic68fdf9b314d15a780731974234e2ae43f6a44b0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7205
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
|
|
Refactor the arguments of a Builtin to be a vec of a new BuiltinArgument
struct, which contains the old strictness boolean and also a static
`name` str - this is automatically determined via the ident for the
corresponding function argument in the proc-macro case, and passed in in
the cases where we're still manually calling Builtin::new.
Currently this name is unused, but in the future this can be used as
part of a documentation system for builtins.
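A sketch of what such a documentation use could look like. The struct
layouts are guesses based on the description above (only the `name`
field is named in this message), and `signature` is a hypothetical
helper.

```rust
// Hypothetical shape: the old strictness boolean plus a static name.
struct BuiltinArgument {
    strict: bool,
    name: &'static str,
}

// Hypothetical, reduced Builtin carrying only what the sketch needs.
struct Builtin {
    name: &'static str,
    arguments: Vec<BuiltinArgument>,
}

impl Builtin {
    // Renders a signature line from the builtin name and argument names.
    fn signature(&self) -> String {
        let mut out = String::from(self.name);
        for arg in &self.arguments {
            out.push(' ');
            out.push_str(arg.name);
        }
        out
    }
}
```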
Change-Id: Ib9dadb15b69bf8c9ea1983a4f4f197294a2394a6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7204
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
|
|
Similar to what we did with pure builtins, define the impure builtins
within a module at the top-level using the new `#[builtins]` attribute
macro.
Change-Id: Ie5d5135d00bb65e651531df6eadba642cd4eb08e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7202
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
|
|
CL/6867 added support for builtins.import, which required a cyclic
reference import->globals->builtins->import. This was implemented
using a RefCell, which makes it possible to mutate the builtins
during evaluation. The commit message for CL/6867 expressed a
desire to eliminate this possibility:
    This opens up a potentially dangerous footgun in which we could
    mutate the builtins at runtime leading to different compiler
    invocations seeing different builtins, so it'd be nice to have
    some kind of "finalised" status for them or some such, but I'm not
    sure how to represent that atm.
This CL replaces the RefCell with Rc::new_cyclic(), making the
globals/builtins immutable once again. At VM runtime (once opcodes
start executing) everything is the same as before this CL, except
that the Rc<RefCell<>> introduced by CL/6867 is turned into an
rc::Weak<>.
The function passed to Rc::new_cyclic works very similarly to
overlays in nixpkgs: a function takes its own result as an argument.
However instead of laziness "breaking the cycle", Rust's
Rc::new_cyclic() instead uses an rc::Weak. This is done to prevent
memory leaks rather than divergence.
This CL also resolves the following TODO from CL/6867:
    // TODO: encapsulate this import weirdness in builtins
The main disadvantage of this CL is the fact that the VM now must
ensure that it holds a strong reference to the globals while a
program is executing; failure to do so will cause a panic when the
weak reference in the builtins is upgrade()d.
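The knot-tying and the strong-reference requirement can be sketched as
follows. `Globals`, `Import` and `build_globals` are hypothetical
stand-ins, and this sketch returns `None` on a failed upgrade instead
of panicking.

```rust
use std::rc::{Rc, Weak};

// Hypothetical stand-in for a builtin that needs to see the globals.
struct Import {
    globals: Weak<Globals>,
}

// Hypothetical stand-in for the set of globals, which contains `import`.
struct Globals {
    names: Vec<&'static str>,
    import: Import,
}

impl Import {
    // Upgrading fails once nobody holds a strong reference to the globals.
    fn visible_globals(&self) -> Option<usize> {
        self.globals.upgrade().map(|g| g.names.len())
    }
}

fn build_globals() -> Rc<Globals> {
    // The closure receives a weak pointer to the value under construction,
    // so the result can refer to itself without a RefCell.
    Rc::new_cyclic(|weak| Globals {
        names: vec!["import", "builtins"],
        import: Import {
            globals: weak.clone(),
        },
    })
}
```

As long as the caller keeps the returned `Rc<Globals>` alive, the weak
reference inside `Import` can be upgraded; once it is dropped, the
upgrade fails.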
In theory it should be possible to create strong reference cycles
the same way Rc::new_cyclic() creates weak cycles, but these cycles
would cause a permanent memory leak -- without either an rc::Weak or
RefCell there is no way to break the cycle. At some point we will
have to implement some form of cycle collection; whatever library we
choose for that purpose is likely to provide an "immutable strong
reference cycle" primitive similar to Rc::new_cyclic(), and we
should be able to simply drop it in.
Signed-off-by: Adam Joseph <adam@westernsemico.com>
Change-Id: I34bb5821628eb97e426bdb880b02e2097402adb7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7097
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
|
|
This commit deduplicates the Thunk-like functionality from Closure
and unifies it with Thunk.
Specifically, we now have one and only one way of breaking reference
cycles in the Value-graph: Thunk. No other variant contains a
RefCell. This should make it easier to reason about the behavior of
the VM. InnerClosure and UpvaluesCarrier are no longer necessary.
This refactoring allowed an improvement in code generation:
`Rc<RefCell<>>`s are now created only for closures which have
self-references or deferred upvalues, instead of for all closures.
OpClosure has been split into two separate opcodes:
- OpClosure creates non-recursive closures with no deferred
  upvalues. The VM will not create an `Rc<RefCell<>>` when executing
  this instruction.
- OpThunkClosure is used for closures with self-references or
  deferred upvalues. The VM will create a Thunk when executing this
  opcode, but the Thunk will start out already in the
  `ThunkRepr::Evaluated` state, rather than in the
  `ThunkRepr::Suspended` state.
To avoid confusion, OpThunk has been renamed OpThunkSuspended.
Thanks to @sterni for suggesting that all this could be done without
adding an additional variant to ThunkRepr. This does however mean
that there will be mutating accesses to `ThunkRepr::Evaluated`,
which was not previously the case. The field `is_finalised: bool`
has been added to `Closure` to ensure that these mutating accesses
are performed only on finalised Closures. Both the check and the
field are present only if `#[cfg(debug_assertions)]`.
Change-Id: I04131501029772f30e28da8281d864427685097f
Signed-off-by: Adam Joseph <adam@westernsemico.com>
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7019
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
|
|
Change-Id: I8aa878dee009901feb453c489ce37c12fa3a31a8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7026
Autosubmit: sterni <sternenseemann@systemli.org>
Reviewed-by: Adam Joseph <adam@westernsemico.com>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
|
|
Currently, the span on *all* thunk force errors is the span at which the
thunk is forced, which for recursive thunk forcing ends up just being
the same span over and over again. This changes the span on thunk force
errors to be the span at which point the thunk is *created*, which is a
bit more helpful (though the printing atm is a little... crowded). To
make this work, we have to thread through the span at which a thunk is
created into a field on the thunk itself.
Change-Id: I81474810a763046e2eb3a8f07acf7d8ec708824a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6932
Autosubmit: grfn <grfn@gws.fyi>
Reviewed-by: Adam Joseph <adam@westernsemico.com>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
|
|
Change-Id: I09f512a60989a37184e73e521d4a3aa23f33a1a8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6922
Autosubmit: grfn <grfn@gws.fyi>
Tested-by: BuildkiteCI
Reviewed-by: Adam Joseph <adam@westernsemico.com>
Reviewed-by: kanepyork <rikingcoding@gmail.com>
|
|
Change-Id: If3fd0b087009a2bfbad8bb7aca0aa20de906eb12
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6921
Tested-by: BuildkiteCI
Reviewed-by: Adam Joseph <adam@westernsemico.com>
Reviewed-by: kanepyork <rikingcoding@gmail.com>
Autosubmit: grfn <grfn@gws.fyi>
Reviewed-by: tazjin <tazjin@tvl.su>
|
|
Change-Id: Ife8a690e9036868964771893ab29a9ae3a2d2365
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6919
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
|
|
Co-authored-by: Griffin Smith <root@gws.fyi>
Change-Id: I5ff19efbe87d8f571f22ab0480500505afa624c5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6552
Autosubmit: wpcarro <wpcarro@gmail.com>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
|
|
This requires actually passing the source directory into `interpret` in
the eval tests, but otherwise this is fairly straightforward - if we're
trying to import a directory, just push `default.nix` onto it and import
that instead.
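A sketch of that path handling. `resolve_import_path` is a hypothetical
helper; the directory check is passed in as a flag so the example does
not touch the file system.

```rust
use std::path::{Path, PathBuf};

// Returns the file that should actually be imported for `path`.
// `is_dir` stands in for a file system check on `path`.
fn resolve_import_path(path: &Path, is_dir: bool) -> PathBuf {
    let mut resolved = path.to_path_buf();
    if is_dir {
        // Importing a directory means importing its default.nix.
        resolved.push("default.nix");
    }
    resolved
}
```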
Change-Id: I0b7d4234f81977e78d14dfa651bf0cf9721017e5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6893
Autosubmit: grfn <grfn@gws.fyi>
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
|
|
This change is quite verbose, so a little bit of explaining:
1. To correctly format parse errors, errors must be able to return
   more than one annotated span (the parser returns a list of errors
   for each span).

   To accomplish this, the structure of how the `Diagnostic` struct
   which formats an error is constructed has changed to delegate the
   creation of the `SpanLabel` vector to the kind of error.
2. The rnix structures don't have human-readable output formats by
   default, so some verbose methods for formatting them in
   human-readable ways have been added in the errors module. We might
   want to move these out into a submodule.
3. In many cases, the errors returned by rnix are a bit strange - so
   while we format them with all information that is easily available
   they may look weird or not necessarily help users. Consider this CL
   only a first step in the right direction.
Change-Id: Ie7dd74751af9e7ecb35d751f8b087aae5ae6e2e8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6871
Reviewed-by: sterni <sternenseemann@systemli.org>
Autosubmit: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
|
|
This enables the use of string paths (and, in the future,
derivations), as long as their string values represent an absolute
path.
Change-Id: I4b198efeb70415ed52f58bd1da6fa79a24dad14c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6866
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
|
|
This lets the VM emit warnings when it encounters situations that
should only be warned about at runtime.
For starters, this is used to pass through compilation warnings that
come up when `import` is used.
Change-Id: I0c4bc8c534d699999887c430d93629fadfa662c4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6868
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
|
|
Adding `import` to builtins causes a bootstrap cycle because
the `import` builtin needs to be initialised with the set of globals
before being inserted into the globals, which also must contain
itself.
To break out of the cycle this hack wraps the builtins passed to the
compiler in an `Rc` (probably sensible anyways, as they will end up
getting cloned a bunch), containing a RefCell which gives us mutable
access to the builtins.
This opens up a potentially dangerous footgun in which we could mutate
the builtins at runtime leading to different compiler invocations
seeing different builtins, so it'd be nice to have some kind of
"finalised" status for them or some such, but I'm not sure how to
represent that atm.
Change-Id: I25f8d4d2a7e8472d401c8ba2f4bbf9d86ab2abcb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6867
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
|
|
This adds an initial working version of builtins.import which
encapsulates the entire functionality of `import` within the builtin
itself, without requiring any changes in the compiler or VM.
The key insight that enables this is that we can simply return a Thunk
from `import` that is constructed from the output of running the
compiler and - ta-da! - no other component needs to know about it.
A couple of notes:
* builtins.import needs to capture variables like the SourceCode
  structure. This means it cannot currently be constructed the same
  way as other builtins and has special handling, which leaks out to
  `eval.rs`. I have postponed dealing with that until we have this
  working a bit more.
* the `globals` are not yet passed through
* the error representation for the new variants is absolutely not done
  yet, we probably want to switch to something that supports
  cause-chaining now (like miette)
* there is no mechanism for emitting warnings at runtime; we need to
  add that
Change-Id: I3117a7ae3ff2432bf44f5ff05ad35f47faca31d5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6857
Reviewed-by: sterni <sternenseemann@systemli.org>
Reviewed-by: wpcarro <wpcarro@gmail.com>
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
|
|
Returns time since epoch in seconds.
This has a slight behaviour difference from Nix, in that we don't pin
the time between REPL entries (Nix pins it for the program lifetime),
but this is probably inconsequential as long as it is pinned during an
evaluation.
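A sketch of reading the time once and pinning it for one evaluation.
The `Evaluation` struct here is a hypothetical stand-in, and whether
tvix-eval pins the value in this way is an assumption.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Seconds since the Unix epoch; falls back to 0 if the clock is before it.
fn current_time_secs() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0)
}

// Hypothetical evaluation context that reads the clock exactly once.
struct Evaluation {
    pinned_time: u64,
}

impl Evaluation {
    fn new() -> Self {
        Evaluation {
            pinned_time: current_time_secs(),
        }
    }

    // Every use of the builtin within this evaluation sees the same value.
    fn current_time(&self) -> u64 {
        self.pinned_time
    }
}
```

Each REPL entry would construct a new `Evaluation` and therefore see a
fresh timestamp, which matches the behaviour difference described
above.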
Change-Id: I010c02e93097a209d8ad69e278397c7e30e54c86
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6846
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
Reviewed-by: wpcarro <wpcarro@gmail.com>
|
|
Allows impure builtins that have a different shape than a Rust
function pointer; specifically this is required for
builtins.currentTime which does not work in WASM.
Change-Id: I1362d8eeafe770ce4d1c5ebe4d119aeb0abb5c9b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6849
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
Reviewed-by: wpcarro <wpcarro@gmail.com>
|
|
Sketch out a new set of "impure" builtins, which supplement the existing
set of "pure" builtins but are gated behind a feature flag, which allows
them to be omitted by crates depending on tvix-eval that only want pure
evaluation, such as tvixbolt.
Change-Id: I2736017b5c9b4776bbba8758e108ec84887abd66
Reviewed-on: https://cl.tvl.fyi/c/depot/+/6655
Reviewed-by: wpcarro <wpcarro@gmail.com>
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
Reviewed-by: tazjin <tazjin@tvl.su>
|