Age | Commit message (Collapse) | Author | Files | Lines |
|
This adds a `parse-bucket-logs.{service,timer}`, running once every
night at 3AM UTC, figuring out the last time it was run and parsing
bucket logs for all previous days.
It invokes the `archeology-parse-bucket-logs` script to produce
a .parquet file with the bucket logs in `s3://nix-cache-log/log/` for
that day (inside a temporary directory), then on success uploads the
produced parquet file to
`s3://nix-archeologist/nix-cache-bucket-logs/yyyy-mm-dd.parquet`.
Change-Id: Ia75ca8c43f8074fbaa34537ffdba68350c504e52
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10011
Reviewed-by: edef <edef@edef.eu>
Tested-by: BuildkiteCI
|
|
Clickhouse also has column compression, configurable with the
output_format_parquet_compression_method setting.
It defaults to lz4, and the previous setting got a a zstd-compressed
parquet file with lz4 data.
Set output_format_parquet_compression_method to zstd instead, and sort
by timestamp before assembling the parquet file.
The existing files were updated to the same format with the following query:
```
SELECT * FROM file('bucket_logs_2023-11-11*.pq', 'Parquet', 'auto') ORDER BY timestamp ASC INTO OUTFILE 'bucket_logs_2023-11-11.parquet' SETTINGS output_format_parquet_compression_method = 'zstd'
```
Change-Id: Id63b14c82e7bf4b9907a500528b569a51e277751
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10008
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
|
|
This adds a `archeology-parse-bucket-logs` CLI tool to `$PATH`.
It can be invoked like this:
```
archeology-parse-bucket-logs http://nix-cache-log.s3.amazonaws.com/log/2023-11-10-00-* bucket_logs_2023-11-10-00.pq.zstd
````
… and will produce a zstd-compressed Parquet file for (roughly) that
time range.
As the EC2 instance credentials don't give access to the logs bucket
(yet), other AWS credentials need to be provided.
This can be accomplished by using "AWS_ACCESS_KEY_ID",
"AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN" from
"Option 2: Manually add a profile to your AWS credentials file (Short-
term credentials)" in AWS IAM Identity Center.
Processing logs for a one-hour range takes a minute or two, the
resulting zstd-compressed Parquet file is around 40-80M in size.
Processing logs for a whole day takes some 25mins, due to the sheer
amount of data (12 GB of raw log data, distributed among 450k individual
files, 20Mio log lines), but at least clickhouse isn't able to parse the
resulting parquet file back in:
> Code: 36. DB::Exception: IOError: Couldn't deserialize thrift: MaxMessageSize reached
For future automation tasks, it's probably better to run this once an
hour, and further join the data later on.
Change-Id: I6c8108c0ec17dc8d4e2dbe923175553325210a5c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10007
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
|
|
Rather than picking up from clickhouse-specific config files, this gets
it to pick up from the ambient environment, which is closer to (but not
the same as) the AWS default credentials chain.
Change-Id: I9c498c231974ed345c3e3d354ec230052b4d0ff2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10006
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
|
|
This behaviour might change (or not), see https://github.com/ClickHouse/
ClickHouse/pull/42003, but as of now, a `--progress` will provide some
progress.
Change-Id: I4891b6e2f96f2656858e71f88a226d24f0d45dc3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10005
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
|
|
Change-Id: Ic34fefdaac82fd1e23d248f2e5fec282384b8fc0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9984
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
|
|
Change-Id: I096b6fed8c73ddd5a417f5183cc113356ffd98c9
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9983
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
|
|
Expose `deps` separately, add a direnv with PATH_add for it to bring
tooling into $PATH.
Change-Id: I432cd2b082cad89e08bef78dc4653e10e137cd6b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9842
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
|
|
Avoid having to re-enter the shell whenever the config is changed.
Change-Id: Ib9f6bb4075e29acaeb4863d64c017695ca85b60b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9841
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
|
|
Change-Id: Ib7ae1871a5d0b16a68c79b68e7e79fd302da79bd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9840
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
|
|
nix copy seems to stall on the EC2 box for unknown reasons.
Change-Id: I30639a52758814968d3b54d716522fb88db80cfe
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9839
Reviewed-by: flokli <flokli@flokli.de>
Reviewed-by: edef <edef@edef.eu>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
|
|
Change-Id: I3ea801b47267f4c985c2ab5cb1b79b2659894307
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9838
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
|
|
Change-Id: I8470c0a2416c0c397e009affb44f8c7a852cd526
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9837
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
|
|
This add the EC2 box config to the repo.
Change-Id: Id7a888a2cfbf1454cd9f9465018df377e14b4e9f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9836
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
|
|
This adds a deploy-archeology script.
I tried getting morph to work first, but passing it a
depot.ops.nixos.nixosFor seems to be very hard - the NixOS module system
doesn't like the arguments it's called with.
Replace morph with a 3 line bash script, which assumes your ssh_config
contains config for an `archeology` host.
Change-Id: I2bf694c60ded39c201efbbb899f3b5512aa4d0f2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9835
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: edef <edef@edef.eu>
|
|
Change-Id: Ifdcac829ede4ec469a7ce1b608e78bae11f2766b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9834
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: edef <edef@edef.eu>
|
|
Change-Id: Ic31cb8030179ff37b1cc3d3d9241e2582cfe3e5e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9833
Tested-by: BuildkiteCI
Reviewed-by: edef <edef@edef.eu>
Autosubmit: flokli <flokli@flokli.de>
|
|
This adds the materials for the lightning talk held at All Systems Go
Conference 2023, Berlin.
Talk lives at https://media.ccc.de/v/all-systems-go-2023-245-tvix-store
Change-Id: I114b1aec9f1953c148dd29ca88888c16b9fc655d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9751
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
|
|
This adds the materials for the talk held at NixCon 2023, Darmstadt.
It seems it's not really possible to export a PDF out of this,
at least I couldn't get reveal-md to do it properly.
The video recordings live on
https://media.ccc.de/v/nixcon-2023-35254-tvix and
https://www.youtube.com/watch?v=j67prAPYScY .
Change-Id: Id27dc8c8637ffd455981141383c54d8d1484679e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9750
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
|
|
Change-Id: Iacc521dfdd4b4a2d5cef3920cf8189bcce35a488
|
|
Change-Id: Ibdb5b498f8bbc837fffdb38cdf95499b279773aa
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2683
Reviewed-by: lukegb <lukegb@tvl.fyi>
Tested-by: BuildkiteCI
|
|
Change-Id: Id19eafb3f2cc7dfa1ec8c47cbe9c5766ac491516
Reviewed-on: https://cl.tvl.fyi/c/depot/+/2682
Reviewed-by: lukegb <lukegb@tvl.fyi>
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
|