Age | Commit message | Author | Files | Lines
2023-11-19 | r/7039 refactor(tvix/nix-compat): move narinfo into separate mod | Florian Klink | 1 | -2/+4
Change-Id: Id85f979e46946da0345483cbbc6de3dd29c94c63 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10077 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-11-19 | r/7038 feat(tvix/store/pathinfoservice): implement NixHTTPPathInfoService | Florian Klink | 6 | -10/+1043
NixHTTPPathInfoService acts as a bridge between the Nix HTTP Binary cache protocol provided by Nix binary caches such as cache.nixos.org, and the Tvix Store Model.

It implements the [PathInfoService] trait in an interesting way: Every [PathInfoService::get] fetches the .narinfo and referred NAR file, inserting components into a [BlobService] and [DirectoryService], then returning a [PathInfo] struct with the root.

Because this is quite a costly operation, clients are expected to layer this service with store composition, so that paths are only ingested once.

The client is expected to be (indirectly) using the same [BlobService] and [DirectoryService], so it is able to fetch referred Directories and Blobs.

[PathInfoService::put] and [PathInfoService::nar] are not implemented and return an error if called.

This behaves very similarly to the nar-bridge-pathinfo code in nar-bridge, except it's now in Rust.

Change-Id: Ia03d4fed9d0657965d100299af97cd917a03f2f0 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10069 Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz>
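The layering mentioned in this message can be pictured with a small cache-in-front sketch. Everything here is hypothetical and simplified for illustration: the `PathInfoLookup` trait, the `PathInfo` struct and the `ExpensiveUpstream` type are stand-ins, not the actual Tvix types, and the real trait is richer than this.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified stand-in for a path info record.
#[derive(Clone, Debug, PartialEq)]
struct PathInfo {
    root_digest: String,
}

/// Hypothetical synchronous lookup trait, used only for this sketch.
trait PathInfoLookup {
    fn get(&mut self, digest: &str) -> Option<PathInfo>;
}

/// Pretends to be the costly HTTP-backed service and counts its fetches.
struct ExpensiveUpstream {
    fetches: usize,
}

impl PathInfoLookup for ExpensiveUpstream {
    fn get(&mut self, digest: &str) -> Option<PathInfo> {
        self.fetches += 1;
        Some(PathInfo {
            root_digest: format!("root-of-{digest}"),
        })
    }
}

/// Answers from a local map first and only asks upstream on a miss.
struct Cached<U> {
    local: HashMap<String, PathInfo>,
    upstream: U,
}

impl<U: PathInfoLookup> PathInfoLookup for Cached<U> {
    fn get(&mut self, digest: &str) -> Option<PathInfo> {
        if let Some(hit) = self.local.get(digest) {
            return Some(hit.clone());
        }
        // Miss: pay the upstream cost once, then remember the result.
        let fetched = self.upstream.get(digest)?;
        self.local.insert(digest.to_string(), fetched.clone());
        Some(fetched)
    }
}

fn main() {
    let mut svc = Cached {
        local: HashMap::new(),
        upstream: ExpensiveUpstream { fetches: 0 },
    };
    let first = svc.get("abc");
    let second = svc.get("abc");
    assert_eq!(first, second);
    println!("upstream fetches: {}", svc.upstream.fetches);
}
```

The second lookup is served from the local map, so the upstream counter stays at one regardless of how often the same path is requested.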
2023-11-19 | r/7037 refactor(tvix/castore/blobservice): rm AsyncBufRead from BlobReader | Florian Klink | 1 | -4/+1
There's no need to already require this to be buffered here. Change-Id: Ib9a11b194e0754d87ab8d2ef0b8cb0f4edc01229 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10074 Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-11-19 | r/7036 feat(users/flokli/keyboard): align 3rd layer with thinkpad | Florian Klink | 1 | -1/+2
Make these keys behave a bit more like the Fn+F* keys on the thinkpad keyboards. Seems there's currently no trivial way to get mic mute, keyboard mute and wifi toggle sent out, but considering the thinkpad usb keyboard is able to, this should be possible somehow here too - but not today, left for a followup. Change-Id: I529a958c78116dd9f7250c938e2e7989b296d6c6 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10076 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2023-11-19 | r/7035 feat(nix-compat/nar/reader): provide passthrough buffered I/O | edef | 2 | -18/+80
Allow taking advantage of the buffer of the underlying reader to avoid unnecessary copies of file data. We can't easily implement the methods of BufRead directly, since we have some extra I/O to perform in the final consume() invocation. That could be resolved at the cost of additional bookkeeping, but this will suffice for now. Change-Id: I8100cf0abd79e7469670b8596bd989be5db44a91 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10089 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
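As a rough illustration of what borrowing the underlying reader's buffer looks like, here is a minimal sketch built on the standard `BufRead::fill_buf` and `BufRead::consume` methods. The `pass_through` helper is hypothetical and is not the NAR reader's API; it only shows the general pattern of handing out slices of the existing buffer instead of copying into a second one.

```rust
use std::io::{self, BufRead, Cursor, Read};

/// Feeds up to `limit` bytes from `reader` to `sink`, lending out slices of
/// the reader's own buffer. Returns the number of bytes delivered.
fn pass_through<R: BufRead>(
    reader: &mut R,
    mut limit: usize,
    mut sink: impl FnMut(&[u8]),
) -> io::Result<usize> {
    let mut delivered = 0;
    while limit > 0 {
        let buf = reader.fill_buf()?;
        if buf.is_empty() {
            break; // end of input
        }
        let n = buf.len().min(limit);
        sink(&buf[..n]);
        // Tell the reader how much of its buffer has been used up.
        reader.consume(n);
        delivered += n;
        limit -= n;
    }
    Ok(delivered)
}

fn main() -> io::Result<()> {
    let mut reader = Cursor::new(b"hello world".to_vec());
    let mut seen = Vec::new();
    let n = pass_through(&mut reader, 5, |chunk| seen.extend_from_slice(chunk))?;
    println!("delivered {n} bytes: {:?}", String::from_utf8_lossy(&seen));

    // The remainder is still available to ordinary reads.
    let mut rest = String::new();
    reader.read_to_string(&mut rest)?;
    println!("rest: {rest:?}");
    Ok(())
}
```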
2023-11-19 | r/7034 fix(nix-compat/nar/reader): require BufRead | edef | 2 | -4/+4
We rely on being able to make small reads cheaply, so this was already an implicit practical requirement. Requiring it explicitly removes a performance footgun, and makes further optimisations possible. Change-Id: I7f65880a41b1d6b5e6bf2e52dfe47d4c49b34bcd Reviewed-on: https://cl.tvl.fyi/c/depot/+/10088 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2023-11-19 | r/7033 fix(nix-compat/store_path): valid names ⊊ UTF-8 | edef | 1 | -1/+2
We don't need to validate UTF-8 separately, since valid names are a strict subset of ASCII, and therefore a strict subset of UTF-8. Change-Id: I3261bf0efe3480b5b315074efafcf5e47a6c5a65 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10087 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de> Reviewed-by: tazjin <tazjin@tvl.su>
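The argument can be sketched as follows. The character set used by `is_valid_name` below is an assumption made for illustration and may not match the exact rules in nix-compat; both function names are hypothetical. The point is only that once every byte has been checked against an ASCII-only alphabet, the UTF-8 conversion can no longer fail.

```rust
/// Returns true if `name` is non-empty and every byte is in the assumed
/// alphabet (ASCII letters, digits, and a few punctuation characters).
/// The exact set is an assumption for this sketch.
fn is_valid_name(name: &[u8]) -> bool {
    !name.is_empty()
        && name
            .iter()
            .all(|b| b.is_ascii_alphanumeric() || b"+-._?=".contains(b))
}

/// Converts a validated name to `&str` without a separate UTF-8 check
/// being able to reject it.
fn name_as_str(name: &[u8]) -> Option<&str> {
    if !is_valid_name(name) {
        return None;
    }
    // Every accepted byte is ASCII, and ASCII is a subset of UTF-8,
    // so this conversion cannot fail for validated input.
    Some(std::str::from_utf8(name).expect("ASCII is always valid UTF-8"))
}

fn main() {
    println!("{:?}", name_as_str(b"hello-2.12.1"));
    println!("{:?}", name_as_str("h\u{e9}llo".as_bytes()));
}
```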
2023-11-19 | r/7032 fix(tvix): patch futures::AsyncBufReadExt::fill_buf | edef | 5 | -24/+84
This fixes EOF handling for buffered readers. Link: https://github.com/rust-lang/futures-rs/pull/2801 Change-Id: Ie98ca6a3e1de38500b0195e9b62511501acb1d2c Reviewed-on: https://cl.tvl.fyi/c/depot/+/10086 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2023-11-19 | r/7031 chore(tvix): upgrade futures to 0.3.29 | edef | 2 | -36/+36
Change-Id: I8fd63be3cbec8766fd6d72cd9271989a19774816 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10085 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2023-11-18 | r/7030 refactor(tvix/store/fs): simplify read | Florian Klink | 1 | -19/+3
We can just use take(size) to restrict reading to that as a max. Change-Id: I0fbda74e4fb98ffeababae86a325233416029acf Reviewed-on: https://cl.tvl.fyi/c/depot/+/10072 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
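A minimal sketch of the idea, using the standard `Read::take` adapter on a synchronous reader. The `read_at_most` helper is hypothetical and is not the code in tvix/store/fs, which may use an async equivalent.

```rust
use std::io::{self, Cursor, Read};

/// Reads at most `size` bytes from `reader`. Fewer bytes are returned if the
/// reader reaches end of input first.
fn read_at_most<R: Read>(reader: R, size: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // `take` caps the number of bytes the inner reader will ever hand out,
    // so no manual length bookkeeping is needed.
    reader.take(size).read_to_end(&mut buf)?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    let data = Cursor::new(b"0123456789".to_vec());
    let chunk = read_at_most(data, 4)?;
    println!("{:?}", String::from_utf8_lossy(&chunk));
    Ok(())
}
```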
2023-11-18 | r/7029 feat(tvix/store): From<&nix_compat::...::NarInfo<'_>> for PathInfo | Florian Klink | 2 | -2/+176
This allows converting from the NarInfo falling out of the NarInfo parser (which is a bit annoying to handle due to lifetimes) to the PathInfo proto struct. The narinfo field, containing most of the data from the original NARInfo file, as well as the references (bytes) are populated. The node field is not populated, because it requires ingesting the NAR itself to describe the root node. Change-Id: I9c04dd6ad4cae556b455188a4255e34b4f6443c5 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10067 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI Autosubmit: flokli <flokli@flokli.de>
2023-11-18 | r/7028 refactor(tvix/nix-compat): no impl <StorePathRef<'_>> for StorePath | Florian Klink | 1 | -10/+8
This suggests it's cheap to convert around, but name actually does allocate. Move to a `to_owned(&self) -> StorePath`, to better signal that this does allocate. Change-Id: Ifaf7c21599e2a467d06e2b4ae1364228370275db Reviewed-on: https://cl.tvl.fyi/c/depot/+/10066 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
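The design choice can be illustrated with a simplified pair of types. The field layout below is an assumption made for this sketch and is not taken from nix-compat; only the type names and the `to_owned(&self) -> StorePath` signature come from the message above.

```rust
/// Simplified owned store path: the name lives in its own heap allocation.
#[derive(Debug, PartialEq)]
struct StorePath {
    digest: [u8; 20],
    name: String,
}

/// Simplified borrowed counterpart: the name points into someone else's data.
#[derive(Debug, PartialEq)]
struct StorePathRef<'a> {
    digest: [u8; 20],
    name: &'a str,
}

impl StorePathRef<'_> {
    /// Explicit conversion: the method name makes the allocation visible at
    /// the call site, unlike an implicit `.into()`.
    fn to_owned(&self) -> StorePath {
        StorePath {
            digest: self.digest,
            name: self.name.to_owned(), // allocates a new String
        }
    }
}

fn main() {
    let borrowed = StorePathRef {
        digest: [0u8; 20],
        name: "hello-2.12.1",
    };
    let owned = borrowed.to_owned();
    println!("{} -> {}", borrowed.name, owned.name);
}
```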
2023-11-18 | r/7027 feat(tvix/castore): fix tracing instrument in `MemoryBlobService` | Ryan Lahfa | 1 | -2/+3
Change-Id: Iedba57e8b3e1a44f14f5baa1e981275d4b02eb56 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10070 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2023-11-18 | r/7026 feat(tvix/castore): impl From<std::io::Error> for Error | Florian Klink | 1 | -0/+10
Make it less annoying to convert from io::Error to this. We already have one direction, doesn't hurt to have the other too. Change-Id: I9fe2c6da608c9d54910ee8c397572aadb1d90d99 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10068 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Reviewed-by: flokli <flokli@flokli.de> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
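What such a conversion buys in practice is that the `?` operator can turn an `io::Error` into the crate's own error type without an explicit `map_err`. The sketch below uses a hypothetical `Error` enum with a single `StorageError` variant; the real castore error type and its variants may differ. `FailingReader` and `count_bytes` exist only to exercise the conversion.

```rust
use std::io::{self, Read};

/// Hypothetical error type standing in for the crate's own `Error`.
#[derive(Debug)]
enum Error {
    StorageError(String),
}

impl From<io::Error> for Error {
    fn from(e: io::Error) -> Self {
        Error::StorageError(e.to_string())
    }
}

/// A reader that always fails, used to demonstrate the error path.
struct FailingReader;

impl Read for FailingReader {
    fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::Other, "boom"))
    }
}

/// Counts the bytes in `reader`. The `?` below converts `io::Error` into
/// `Error` through the `From` impl, with no `map_err` needed.
fn count_bytes(mut reader: impl Read) -> Result<usize, Error> {
    let mut buf = Vec::new();
    let n = reader.read_to_end(&mut buf)?;
    Ok(n)
}

fn main() {
    println!("{:?}", count_bytes(io::Cursor::new(vec![1u8, 2, 3])));
    println!("{:?}", count_bytes(FailingReader));
}
```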
2023-11-17 | r/7025 refactor(tvix/castore/tonic): use match in channel_from_url | Florian Klink | 1 | -45/+48
Having random if blocks and returning from them is error-prone. Also, turns out we only need the unprefixed scheme in the fallback case, so move it down to there. Change-Id: Ifcb09279c963f8a39e0dbabe145990263f3d7cf9 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10041 Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
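To show why a single `match` is less error-prone than a chain of early-returning `if` blocks, here is a small sketch. The scheme names, the `Target` enum and the `classify` function are all assumptions for illustration; they are not the actual `channel_from_url` code.

```rust
/// Hypothetical connection targets for this sketch.
#[derive(Debug, PartialEq)]
enum Target<'a> {
    UnixSocket(&'a str),
    Http(&'a str),
}

/// Classifies a URL by scheme. Each scheme is handled in exactly one arm,
/// and the fallback arm is the only place that needs the raw scheme string.
fn classify(url: &str) -> Result<Target<'_>, String> {
    let (scheme, rest) = url
        .split_once("://")
        .ok_or_else(|| format!("missing scheme in {url}"))?;
    match scheme {
        "grpc+unix" => Ok(Target::UnixSocket(rest)),
        "grpc+http" | "grpc+https" => Ok(Target::Http(rest)),
        other => Err(format!("unsupported scheme: {other}")),
    }
}

fn main() {
    println!("{:?}", classify("grpc+unix:///run/store.sock"));
    println!("{:?}", classify("grpc+http://localhost:8000"));
    println!("{:?}", classify("ftp://example.invalid"));
}
```

With a `match`, forgetting to return from one branch is not possible, because every arm must evaluate to the function's result type.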
2023-11-16 | r/7024 docs(tvix/glue): fix doc-comment reference | Florian Klink | 1 | -1/+1
This has been renamed to descend_to in cl/9373. Change-Id: Ia6201fb81c7d4fa953d311451cfff95373549a50 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10045 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: edef <edef@edef.eu> Tested-by: BuildkiteCI
2023-11-15 | r/7023 refactor(tvix/castore/utils): drop unused DuplexStreamWrapper | Florian Klink | 1 | -13/+1
This wasn't used at all, let's remove it. Change-Id: I426e3d93c32ebe65247ae5cf8d05b5bf686be2d6 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10044 Tested-by: BuildkiteCI Reviewed-by: edef <edef@edef.eu>
2023-11-15 | r/7022 refactor(tvix/castore/tonic): make async, support wait-connect=? | Florian Klink | 11 | -182/+170
This moves the sync `channel::from_url` to an async `tonic::channel_from_url`. It now allows connecting non-lazily if `wait-connect=1` is set in the URL params. Also, make the pingpong tests for blobsvc and directorysvc use the wait-connect=1 codepath. Change-Id: Ibeea33117c8121814627e7f6aba0e943ae2e92ca Reviewed-on: https://cl.tvl.fyi/c/depot/+/10030 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
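How a URL parameter like this could select between lazy and eager connection is sketched below using plain string handling. The `wants_eager_connect` function and the example URLs are hypothetical; the real code presumably uses a proper URL parser and handles more cases.

```rust
/// Returns true if the URL's query string contains `wait-connect=1`.
/// This is a deliberately minimal parser: it ignores fragments and
/// percent-encoding.
fn wants_eager_connect(url: &str) -> bool {
    match url.split_once('?') {
        Some((_, query)) => query.split('&').any(|pair| pair == "wait-connect=1"),
        None => false,
    }
}

fn main() {
    for url in [
        "grpc+http://localhost:8000",
        "grpc+http://localhost:8000?wait-connect=1",
        "grpc+http://localhost:8000?wait-connect=0",
    ] {
        let mode = if wants_eager_connect(url) { "eager" } else { "lazy" };
        println!("{url}: {mode}");
    }
}
```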
2023-11-15 | r/7021 refactor(tvix/castore): remove DirectoryService::from_url | Florian Klink | 5 | -187/+98
Make directoryservice::from_addr use the more specific constructors. Change-Id: I9fee2afed77692505988d631d9fe246d9843d25a Reviewed-on: https://cl.tvl.fyi/c/depot/+/10029 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2023-11-15 | r/7020 refactor(tvix/castore/blobsvc): remove BlobService::from_url | Florian Klink | 5 | -241/+98
Make blobservice::from_addr use the more specific constructors. Change-Id: Id9637e279d6910ce6d92ff0086a984be5c65a8c8 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10028 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2023-11-15 | r/7019 refactor(tvix/store/pathinfosvc/from_addr): use test_case | Florian Klink | 1 | -93/+46
All we do is construct some strings and check whether from_addr succeeds or not. This can be written in a much more concise way using test_case. Use lazy_static to provide temporary directories. Also add some more grpc-related test cases. Change-Id: Ia310dd01f617f7628f1e7e21304ac70da2ab3534 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10027 Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2023-11-15 | r/7018 refactor(tvix/store/pathinfosvc): inline SledPathInfoSvc::from_url | Florian Klink | 2 | -136/+81
Change-Id: I0d905228df086a422bb30322add7236ca41e807b Reviewed-on: https://cl.tvl.fyi/c/depot/+/10026 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2023-11-15 | r/7017 refactor(tvix/store/pathinfosvc): inline GRPCPathInfoSvc::from_url | Florian Klink | 2 | -28/+27
Change-Id: Ib53b5525ae13c276e61b7f564673b7c6144ffc0e Reviewed-on: https://cl.tvl.fyi/c/depot/+/10025 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2023-11-15 | r/7016 feat(tvix/castore/src/channel): move from_url tests | Florian Klink | 2 | -73/+61
These gRPC PathInfoService tests were actually not too useful in here, what we're mostly testing is the channel construction, so move it to there. Change-Id: Ic8c07558a1b28b46f863d5c39bcaa3a79cea007a Reviewed-on: https://cl.tvl.fyi/c/depot/+/10024 Reviewed-by: Connor Brewster <cbrewster@hey.com> Tested-by: BuildkiteCI
2023-11-15 | r/7015 refactor(tvix/store/pathinfosvc): inline MemoryPathInfoSvc::from_url | Florian Klink | 2 | -88/+47
Change-Id: If27eb518d372f4004b7b38fc765a42957f2a6b50 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10023 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2023-11-15 | r/7014 refactor(tvix/store): remove from_url from PathInfoService trait | Florian Klink | 4 | -31/+13
We don't gain much from making this part of the trait, it's still up to `tvix_store::pathinfoservice::from_addr` to do most of the construction. Move it out of the trait and into the specific *Service impls directly. This allows further refactorings in followup CLs. Change-Id: I99b93ef4acd83637a2f4888a1e586f1ca96390dc Reviewed-on: https://cl.tvl.fyi/c/depot/+/10022 Tested-by: BuildkiteCI Reviewed-by: Connor Brewster <cbrewster@hey.com>
2023-11-14 | r/7013 fix(users/flokli/archeology/parse_bucket_logs): fix regex and skip | Florian Klink | 1 | -1/+2
It seems the regex is not perfect; it choked on a single log line:

```
Nov 13 03:10:19 archeology-ec2 59nkrwmih3ywaxrgxqj79pn395fs6m17-parse-bucket-logs-continuously[11105]: Code: 117. DB::Exception: Line "d57bd890fbd1ae16625bdb8168064125e013198099b7e1b3c24878a4d03c3ab8 nix-cache [12/Nov/2023:09:13:02 +0000] xxx.xx.xxx.xxx - VB7SJVZ108DSSN67 REST.POST.OBJECT index.html "POST /index.html HTTP/1.1" 405 MethodNotAllowed 348 - 4 - "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36" - 0bFdGKbi0n9JHXU1a2hijcJwmYdc6lG2xgbdozc3wS6mlUkBE7ssrQCHIDdOLebo78o2cGbhivY= - ECDHE-RSA-AES128-GCM-SHA256 - nix-cache.s3.amazonaws.com TLSv1.2 - -" doesn't match the regexp.: (in file/uri log/2023-11-12-10-19-50-80805A702ECF65EB): (at row 5)
```

This was due to the user-agent field. The regex is now fixed.

The request itself is fun (someone trying to POST an index.html to the bucket), and we should probably filter this on the Fastly side already, not via IAM.

In any case, there's no point failing to parse if a single line doesn't match the regex - we can just skip such lines.

For the sake of completeness, logs for that day have been reprocessed and reuploaded.

Change-Id: Id98a7167a381cda06d150ad5118ee9e70ead277e Reviewed-on: https://cl.tvl.fyi/c/depot/+/10034 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
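The skip-on-mismatch policy can be sketched independently of the actual tooling. The commit itself concerns a regex used by the log parsing script; the Rust below swaps in a trivial whitespace-based parser purely to show the policy, and the `parse_status` and `parse_all` functions as well as the three-field line format are hypothetical.

```rust
/// Hypothetical, much-simplified parser: treats the third whitespace-separated
/// field of a line as an HTTP status code.
fn parse_status(line: &str) -> Option<u16> {
    line.split_whitespace().nth(2)?.parse().ok()
}

/// Parses every line it can and counts the ones it had to skip, instead of
/// aborting the whole run on the first mismatch.
fn parse_all(lines: &[&str]) -> (Vec<u16>, usize) {
    let mut parsed = Vec::new();
    let mut skipped = 0;
    for line in lines {
        match parse_status(line) {
            Some(status) => parsed.push(status),
            None => skipped += 1,
        }
    }
    (parsed, skipped)
}

fn main() {
    let lines = ["GET /nar/a.nar.xz 200", "garbage", "POST /index.html 405"];
    let (parsed, skipped) = parse_all(&lines);
    println!("parsed {parsed:?}, skipped {skipped} line(s)");
}
```

Counting the skipped lines keeps the failure visible without letting one malformed record block a whole day's worth of logs.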
2023-11-14 | r/7012 feat(users/flokli/archeology): install parquet-tools | edef | 1 | -0/+1
Change-Id: I64cd83fbce920eeabace5b49ef623c033d98a8be Reviewed-on: https://cl.tvl.fyi/c/depot/+/10000 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2023-11-14 | r/7011 feat(users/flokli/archeology): install DuckDB | edef | 1 | -0/+1
Change-Id: I76bc20711c7e59d184659db134ba224cfcd7f6cb Reviewed-on: https://cl.tvl.fyi/c/depot/+/9999 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2023-11-13 | r/7010 feat(users/grfn/web): Update work section | Aspen Smith | 1 | -4/+8
I no longer work at ReadySet. Change-Id: Idc19e2d68846551b6cd94f84594712692ebe35a9 Reviewed-on: https://cl.tvl.fyi/c/depot/+/9976 Tested-by: BuildkiteCI Autosubmit: grfn <grfn@gws.fyi> Reviewed-by: grfn <grfn@gws.fyi>
2023-11-13 | r/7009 feat(users/flokli/archeology): turn on task_delayacct | edef | 1 | -0/+6
More ClickHouse perf stats ^_^ Change-Id: I4f6882b1a6c1ebfed9a430e62ca634a141cd1cf1 Reviewed-on: https://cl.tvl.fyi/c/depot/+/9998 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2023-11-13 | r/7008 feat(users/flokli/keyboard): add missing KC_PSCR | Florian Klink | 1 | -1/+1
Change-Id: I005defb868151ecec95e710523db3d23c859e489 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10021 Reviewed-by: flokli <flokli@flokli.de> Autosubmit: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2023-11-13 | r/7007 feat(users/flokli/keyboard): init | Florian Klink | 3 | -0/+116
This packages up my keyboard firmware used for the Keychron K6 Pro. We add a custom keymap to the `keyboards/keychron/k6_pro/ansi/rgb/keymaps` directory, a copy from the `default` one (with a modified `keymap.c`), and then build that as a makefile target. `via` is *disabled*, as their keybindings take priority over keymap.c. Luckily, only `qmk` seems to be sufficient to build it. A simple `:flash` target/script is provided as well, it relies on some udev rules set in the global system (`hardware.keyboard.qmk.enable = true`). Change-Id: I9f7a7a992e13516c32033127f94e37aec62d6b67 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10020 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2023-11-12 | r/7006 chore(ops/journaldriver): bump cargo dependencies | Vincent Ambo | 1 | -132/+138
Fixes:

* RUSTSEC-2023-0022
* RUSTSEC-2023-0044
* RUSTSEC-2023-0023
* RUSTSEC-2023-0024

Change-Id: Ib2813cf7a7a38fd50a1695de7b380cef4299a0c3 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10019 Tested-by: BuildkiteCI Autosubmit: tazjin <tazjin@tvl.su> Reviewed-by: flokli <flokli@flokli.de>
2023-11-12 | r/7005 chore(fun/paroxysm): bump cargo dependencies | Vincent Ambo | 1 | -229/+289
Fixes:

* RUSTSEC-2023-0022
* RUSTSEC-2023-0044
* RUSTSEC-2023-0023
* RUSTSEC-2023-0024
* RUSTSEC-2023-0018
* RUSTSEC-2020-0071

There's a remaining issue in tokio, which did not get upgraded by a simple `cargo update`.

Change-Id: I1459678a9d706af684620ee4c07eeace3955ce80 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10018 Autosubmit: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de>
2023-11-12 | r/7004 chore(web/atward): bump cargo dependencies | Vincent Ambo | 1 | -206/+273
Fixes: * RUSTSEC-2023-0018 Change-Id: I1484649b495f7a9b0a9627e129f2bad4ff436a07 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10017 Reviewed-by: flokli <flokli@flokli.de> Autosubmit: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-11-12 | r/7003 feat(users/flokli/archeology): add AWS config to shell | Florian Klink | 1 | -1/+14
This allows using awscli inside a shell. Clickhouse AWS SSO integration still seems broken unfortunately, even with https://github.com/ClickHouse/ClickHouse/pull/54347 included in our bump - it seems it's coming up with another token file path than the AWS SDK: > SSOCredentialsProvider: Unable to open token file on path: /home/flokli/.aws/sso/cache/da39a3ee5e6b4b0d3255bfef95601890afd80709.json This is the sha1sum of the sso_start_url, not the sha1sum of the session-name (nixos / f2f059b8b7298f1ad52636d67cef8b719aa83bf5). Change-Id: Ia1bdec03c4f269a7415c42c90c1f4fd3d928f770 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10012 Reviewed-by: edef <edef@edef.eu> Tested-by: BuildkiteCI
2023-11-12 | r/7002 chore(tools/cheddar): bump cargo dependencies | Vincent Ambo | 1 | -429/+372
Fixes: * RUSTSEC-2023-0018 Change-Id: If4b5ea9edacc6f1e8664387e96e7abc24618b1a1 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10016 Tested-by: BuildkiteCI Reviewed-by: flokli <flokli@flokli.de> Autosubmit: tazjin <tazjin@tvl.su>
2023-11-12 | r/7001 chore(net/alcoholic_jwt): bump cargo dependencies | Vincent Ambo | 1 | -46/+45
Fixes:

* RUSTSEC-2023-0022
* RUSTSEC-2023-0044
* RUSTSEC-2023-0023
* RUSTSEC-2023-0024

Change-Id: I6eb9d1041e6b4ce4665e9829ad4aad5385990724 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10015 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI Autosubmit: tazjin <tazjin@tvl.su>
2023-11-12 | r/7000 chore(tazjin/tgsa): bump cargo dependencies | Vincent Ambo | 1 | -226/+294
Fixes:

- RUSTSEC-2023-0044
- RUSTSEC-2023-0018

Change-Id: Ifc1acce5696f9ec584ac7790d3a99f8ad7d28707 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10014 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2023-11-12 | r/6999 chore(tazjin/yddns): bump cargo dependencies | Vincent Ambo | 1 | -360/+312
Fixes RUSTSEC-2023-0053. Change-Id: I6b9fc31dad405b7f9fc21c27fc7beee3687a4572 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10013 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
2023-11-12 | r/6998 chore(3p/sources): bump nixpkgs & channels (2023-11-12) | Vincent Ambo | 11 | -72/+68
* update wasm-bindgen in all Rust-wasm projects
* remove stable overlays that work again in unstable
* add texlive to stable overlays (see linked nixpkgs PR)
* bump tdlib to 1.8.18, new minimum for telega.el

Change-Id: Ib8e202de7dfbc35115fda31d0a98b6314b2adf17 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10010 Tested-by: BuildkiteCI Autosubmit: tazjin <tazjin@tvl.su> Reviewed-by: flokli <flokli@flokli.de>
2023-11-12 | r/6997 feat(tazjin/blog): import blog post on emacs buffer switching thing | Vincent Ambo | 2 | -0/+24
This was previously only in my Telegram channel, but it might as well be on the blog itself. Change-Id: I301ebeaa4dd1875f3858cee5259a5c689b950790 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10009 Reviewed-by: tazjin <tazjin@tvl.su> Autosubmit: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2023-11-12 | r/6996 feat(users/flokli/nixos/archeology-ec2): automate bucket log parsing | Florian Klink | 2 | -0/+80
This adds a `parse-bucket-logs.{service,timer}`, running once every night at 3AM UTC, figuring out the last time it was run and parsing bucket logs for all previous days. It invokes the `archeology-parse-bucket-logs` script to produce a .parquet file with the bucket logs in `s3://nix-cache-log/log/` for that day (inside a temporary directory), then on success uploads the produced parquet file to `s3://nix-archeologist/nix-cache-bucket-logs/yyyy-mm-dd.parquet`. Change-Id: Ia75ca8c43f8074fbaa34537ffdba68350c504e52 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10011 Reviewed-by: edef <edef@edef.eu> Tested-by: BuildkiteCI
2023-11-12 | r/6995 chore(3p/nixpkgs/clickhouse): 23.3.13.6 -> 23.10.3.5 | edef | 4 | -26/+234
Change-Id: I3e4c43690fcaf50965152bf40e1ca2b027010fcf Reviewed-on: https://cl.tvl.fyi/c/depot/+/9997 Reviewed-by: flokli <flokli@flokli.de> Tested-by: BuildkiteCI
2023-11-11 | r/6994 fix(users/flokli/archaeology): don't use file but column compression | Florian Klink | 1 | -2/+5
Clickhouse also has column compression, configurable with the output_format_parquet_compression_method setting. It defaults to lz4, so the previous setup produced a zstd-compressed parquet file containing lz4-compressed data.

Set output_format_parquet_compression_method to zstd instead, and sort by timestamp before assembling the parquet file.

The existing files were updated to the same format with the following query:

```
SELECT *
FROM file('bucket_logs_2023-11-11*.pq', 'Parquet', 'auto')
ORDER BY timestamp ASC
INTO OUTFILE 'bucket_logs_2023-11-11.parquet'
SETTINGS output_format_parquet_compression_method = 'zstd'
```

Change-Id: Id63b14c82e7bf4b9907a500528b569a51e277751 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10008 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
2023-11-11 | r/6993 feat(users/flokli/nixos/archeology-ec2): add parse-bucket-logs | Florian Klink | 1 | -0/+4
This adds an `archeology-parse-bucket-logs` CLI tool to `$PATH`.

It can be invoked like this:

```
archeology-parse-bucket-logs http://nix-cache-log.s3.amazonaws.com/log/2023-11-10-00-* bucket_logs_2023-11-10-00.pq.zstd
```

… and will produce a zstd-compressed Parquet file for (roughly) that time range.

As the EC2 instance credentials don't give access to the logs bucket (yet), other AWS credentials need to be provided. This can be accomplished by using "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN" from "Option 2: Manually add a profile to your AWS credentials file (Short-term credentials)" in AWS IAM Identity Center.

Processing logs for a one-hour range takes a minute or two; the resulting zstd-compressed Parquet file is around 40-80M in size.

Processing logs for a whole day takes some 25 minutes, due to the sheer amount of data (12 GB of raw log data, distributed among 450k individual files, 20 million log lines), but clickhouse at least isn't able to parse the resulting parquet file back in:

> Code: 36. DB::Exception: IOError: Couldn't deserialize thrift: MaxMessageSize reached

For future automation tasks, it's probably better to run this once an hour, and further join the data later on.

Change-Id: I6c8108c0ec17dc8d4e2dbe923175553325210a5c Reviewed-on: https://cl.tvl.fyi/c/depot/+/10007 Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-11-11 | r/6992 fix(users/flokli/archeology): make clickhouse use ambient AWS creds | Florian Klink | 1 | -1/+22
Rather than picking up from clickhouse-specific config files, this gets it to pick up from the ambient environment, which is closer to (but not the same as) the AWS default credentials chain. Change-Id: I9c498c231974ed345c3e3d354ec230052b4d0ff2 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10006 Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-11-11 | r/6991 feat(users/flokli/archeology): show clickhouse-local progress | Florian Klink | 1 | -1/+2
This behaviour might change (or not), see https://github.com/ClickHouse/ClickHouse/pull/42003, but as of now, a `--progress` will provide some progress. Change-Id: I4891b6e2f96f2656858e71f88a226d24f0d45dc3 Reviewed-on: https://cl.tvl.fyi/c/depot/+/10005 Reviewed-by: raitobezarius <tvl@lahfa.xyz> Tested-by: BuildkiteCI
2023-11-11 | r/6990 feat(users/flokli/archeology): add shell | Florian Klink | 1 | -0/+5
Change-Id: Ic34fefdaac82fd1e23d248f2e5fec282384b8fc0 Reviewed-on: https://cl.tvl.fyi/c/depot/+/9984 Tested-by: BuildkiteCI Reviewed-by: raitobezarius <tvl@lahfa.xyz>