about summary refs log tree commit diff
path: root/corp/russian/data-import
diff options
context:
space:
mode:
authorRyan Lahfa <tvl@lahfa.xyz>2023-08-19T16·36+0200
committerraitobezarius <tvl@lahfa.xyz>2023-08-20T19·34+0000
commitd504b440c2ac420b6618b5caa5b08684575cf2ba (patch)
treed40014f01b2010f26312610dfbaba6cbe5e4fdb2 /corp/russian/data-import
parentbb48d8c61bed61e3c9318e214c838df74b60ee8e (diff)
feat(tvix/nix-compat): don't swallow hash validation errors r/6494
Previously, Output deserialization would silence validation errors and
provide `None` for `hash_with_mode` as soon as a validation error would
happen inside of the `NixHashWithMode` deserialization, e.g. invalid
hash length would not provide an validation error but a silent `None`
value.

This is problematic, we workaround a serde limitation here by writing
our own Deserializer.

As you can see, we write some boilerplate code unfortunately, as, for
example:

- `#[serde(fail_if_unparsed_as="Option::is_none")]` is not a thing,
  otherwise, we could have been able to just bubble up errors in case of
  "not fully parsed" (and not missing) values.

- `From<&serde_json::Value> for serde::de::Unexpected` is not a thing,
  otherwise, we could just map invalid type errors and reuse the
  existing types instead of doing extremely bizarre things with
  `serde::de::Unexpected::Other`, note: this is a not problem for
  expected, we know what we expect, we don't know what we received in
  practice.

I decided to write a `NixHashWithMode::from_map` which will eat a map
deserialized via `serde_json`, so our serde magic is not totally "data
model" agnostic.

I wanted to go for data model agnosticity and enable maximal
performance, e.g. building the structure as the map values are streamed
in the Visitor, this is needlessly painful because `Output` and
`NixHashWithMode` are in different files and this really makes sense
only if we write the full implementation in one file, indeed, otherwise,
you end up duplicating the work or having strange coupling.

So, for now, we will allocate a full map of the fields inside the
`Output`, i.e. if any "unknown field" is in that map, we will
deserialize it for no reason.

Doing it properly, as I repeat it in the code and to flokli at C3Camp
2023, requires to patch serde upstream IMHO.

Change-Id: I46fe6ccb8c390c48d6934fd3e3f02a0dfe59557b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9107
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Diffstat (limited to 'corp/russian/data-import')
0 files changed, 0 insertions, 0 deletions