path: root/ops/pipelines/static-pipeline.yaml
Age | Commit message | Author | Files | Lines
2024-08-19 | r/8517 feat(ops/pipelines): support buildkite retries | Florian Klink | 1 | -4/+6
cl/12228 did enable automatic retries for some flaky tests, which generally did work, as can be seen in https://buildkite.com/tvl/depot/builds/35893 However, ":duck:" still reports as failing, because we check the number of steps to be nonzero, which is not the case if retries have happened. We cannot check for the overall status of the build, as it's still "RUNNING", but instead of counting all failed steps so far, we can query all failed jobs and then filter out the ones that were already retried. Change-Id: Ib9d27587c8a8ba7970850812c4302fecdc4482e7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/12233 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su>
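The filtering described above can be sketched as follows. This is a minimal illustration, not the actual pipeline code: the job records and the field names `passed` and `retried` are hypothetical stand-ins for whatever the Buildkite query returns.

```python
from typing import Iterable, Mapping


def count_unretried_failures(jobs: Iterable[Mapping[str, object]]) -> int:
    """Count jobs that failed and were not superseded by a retry.

    The keys "passed" and "retried" are assumptions made for this sketch.
    """
    return sum(
        1
        for job in jobs
        if job.get("passed") is False and not job.get("retried", False)
    )


# Hypothetical job list: one flaky test that was retried, one real failure.
jobs = [
    {"label": "test-a", "passed": False, "retried": True},   # flaky first attempt
    {"label": "test-a", "passed": True, "retried": False},   # successful retry
    {"label": "test-b", "passed": False, "retried": False},  # genuine failure
]
```

With this input only `test-b` counts as a failure, so a flaky step that eventually passed no longer fails the overall status.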
2023-07-05 | r/6392 feat(tools/git-r): git subcommand to display r/numbers for commits | sterni | 1 | -0/+5
Sadly, this can't quite be an alias (which would be difficult to automatically set up anyways), since we want to check if an r/number is part of the (upstream) canon branch. The test script for the subcommand doubles up as a soundness check for our pipelines ref creation. Change-Id: I840af6556e50187c69490668bd8a18dd7dc25a86 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8844 Tested-by: BuildkiteCI Autosubmit: sterni <sternenseemann@systemli.org> Reviewed-by: flokli <flokli@flokli.de>
2023-02-01 | r/5815 feat(ops/pipelines): trigger tvix buildkite pipeline | Florian Klink | 1 | -0/+10
Change-Id: I4e81694b9686f977a6590c5e1703a4ef413b0cf4 Reviewed-on: https://cl.tvl.fyi/c/depot/+/8003 Autosubmit: flokli <flokli@flokli.de> Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2022-12-28 | r/5529 fix(ops/pipelines): explicitly set contexts for annotations | Vincent Ambo | 1 | -1/+1
I think what might be going on with b/231 is that the annotations somehow started conflicting because they don't have contexts set. Let's try setting a context and see if it changes anything ... Change-Id: I62ed57f9e24f08e4e7215f05d35cfa769e2e2c24 Reviewed-on: https://cl.tvl.fyi/c/depot/+/7640 Reviewed-by: sterni <sternenseemann@systemli.org> Autosubmit: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2022-12-03 | r/5380 fix(ops/pipelines): limit concurrency of :llama: | Vincent Ambo | 1 | -0/+2
When pushing a large chain of CLs, builds can fail with OOM issues as many Nix evaluations of the depot are happening simultaneously. To work around this, we limit the concurrency of simultaneous Nix evaluations (i.e. the `:llama:` step). This can slow down the start of builds in a large chain of small changes, but that is a better tradeoff than failing the builds entirely and making people click buttons. Change-Id: If351aaad22d52e2bcf871377f22ab1df594c518d Reviewed-on: https://cl.tvl.fyi/c/depot/+/7501 Reviewed-by: sterni <sternenseemann@systemli.org> Autosubmit: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI
2022-10-08 | r/5059 feat(ops/pipelines): allow accessing the nix store | sterni | 1 | -1/+2
This is already allowed de facto, since there seems to be a special exception for reading from derivation outputs. What is forbidden is access to files imported to the store (even via builtins.toFile) and derivation files. The latter is required for doing dependency analysis on arbitrary derivations, unfortunately. Access to the store allows kind of evil things, but it should be (hopefully) hard to do this by accident, and accessing derivation files is not impure, though it relies on store implementation internals so to speak. Change-Id: I33a7de83ef0ee20a7076690329d62f6caffffe5f Reviewed-on: https://cl.tvl.fyi/c/depot/+/6835 Reviewed-by: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI Reviewed-by: grfn <grfn@gws.fyi>
2022-06-03 | r/4202 refactor(nix/buildkite): Rename "post" steps to "release" steps | Vincent Ambo | 1 | -3/+3
This is in preparation for a subsequent CL that will do much more significant changes in //nix/buildkite. Change-Id: I80a8d67d3a7d593854c8d711572483c2581e7881 Reviewed-on: https://cl.tvl.fyi/c/depot/+/5824 Reviewed-by: ezemtsov <eugene.zemtsov@gmail.com> Tested-by: BuildkiteCI
2022-05-26 | r/4144 feat(ops/pipelines): Evaluate depot pipeline in restricted-eval mode | Vincent Ambo | 1 | -1/+4
Change-Id: Ic5b98a0777860b68dabb9a9b59e8c682236a71c7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/4884 Tested-by: BuildkiteCI Reviewed-by: grfn <grfn@gws.fyi>
2022-03-30 | r/3924 refactor(ops/pipelines): Configurable GraphQL token location | Vincent Ambo | 1 | -1/+3
For external users of the pipeline construction, the token might be in a different path than `/run/agenix/buildkite-graphql-token`. It is made configurable through the BUILDKITE_TOKEN_PATH environment variable. This should be configured on the pipeline level to apply to all steps. Change-Id: I23c52e2d705e4134b8b013f8603f92e5533a6e44 Reviewed-on: https://cl.tvl.fyi/c/depot/+/5424 Autosubmit: tazjin <tazjin@tvl.su> Tested-by: BuildkiteCI Reviewed-by: asmundo <asmundo@gmail.com>
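The lookup described above amounts to an environment variable with a default. A minimal sketch, assuming the fallback is the TVL path named in the message; the function name `token_path` is hypothetical, and treating an empty value as unset is a choice made for this example only.

```python
import os
from typing import Mapping

# Default location named in the commit message above.
DEFAULT_TOKEN_PATH = "/run/agenix/buildkite-graphql-token"


def token_path(env: Mapping[str, str] = os.environ) -> str:
    """Return the token file path, preferring BUILDKITE_TOKEN_PATH if set.

    An empty value is treated like an unset variable (an assumption of
    this sketch, not something the commit message specifies).
    """
    return env.get("BUILDKITE_TOKEN_PATH") or DEFAULT_TOKEN_PATH
```

Passing the environment as a parameter keeps the function easy to exercise without modifying the real process environment.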
2022-01-22 | r/3659 refactor(ops/pipelines): Move :anchor: into postBuildSteps | Vincent Ambo | 1 | -18/+0
There is no need for this step to be part of the static pipeline (it should not run if the build fails anyways). Change-Id: I71400a452d6f8f4708d146b346eaffda5da2f766 Reviewed-on: https://cl.tvl.fyi/c/depot/+/5049 Tested-by: BuildkiteCI Autosubmit: tazjin <tazjin@tvl.su> Reviewed-by: ezemtsov <eugene.zemtsov@gmail.com>
2022-01-22 | r/3658 feat(ops/pipelines): Upload post-build steps in static pipeline | Vincent Ambo | 1 | -0/+15
Change-Id: I5ce6d51837c734951fe10c4f21806cf0fc57ed23 Reviewed-on: https://cl.tvl.fyi/c/depot/+/5048 Tested-by: BuildkiteCI Autosubmit: tazjin <tazjin@tvl.su> Reviewed-by: ezemtsov <eugene.zemtsov@gmail.com>
2022-01-22 | r/3657 refactor(ops/pipelines): Split build/post steps into separate chunks | Vincent Ambo | 1 | -1/+1
This will create `build-chunk-$n.json` files for steps that should run _before_ duck, and `post-chunk-$n.json` files for steps that should run after duck. The post steps are not yet uploaded to Buildkite, but we also don't have any right now. Change-Id: I7e1b59cf55a8bf1d97266f6e988aa496959077bf Reviewed-on: https://cl.tvl.fyi/c/depot/+/5047 Tested-by: BuildkiteCI Reviewed-by: ezemtsov <eugene.zemtsov@gmail.com> Autosubmit: tazjin <tazjin@tvl.su>
2022-01-22 | r/3656 refactor(ops/pipelines): Use branches filter for canon-only steps | Vincent Ambo | 1 | -2/+2
Using this instead of a conditional leads to nicer output in the UI, but has no semantic difference. Change-Id: I5b368d663f417d256e4792d2d46b84fc50d42d0e Reviewed-on: https://cl.tvl.fyi/c/depot/+/5045 Reviewed-by: ezemtsov <eugene.zemtsov@gmail.com> Tested-by: BuildkiteCI Autosubmit: tazjin <tazjin@tvl.su>
2022-01-22 | r/3655 refactor(ops/pipelines): Move :git: step up in the pipeline | Vincent Ambo | 1 | -14/+15
This step is independent of the build result and can be scheduled at the beginning while pipeline eval is still in progress. Change-Id: I2ee268e4c333efa654dcb12c0b1562b43231d241 Reviewed-on: https://cl.tvl.fyi/c/depot/+/5044 Tested-by: BuildkiteCI Autosubmit: tazjin <tazjin@tvl.su> Reviewed-by: ezemtsov <eugene.zemtsov@gmail.com>
2022-01-22 | r/3654 feat(ops/pipelines): Always upload entire pipeline output | Vincent Ambo | 1 | -1/+1
Previously we only stored the drvmap, but we will also need the build chunks to refactor the generation of dynamic post-steps. Change-Id: I256fffe13af8f8c4521835257f5d87dda323b248 Reviewed-on: https://cl.tvl.fyi/c/depot/+/5043 Tested-by: BuildkiteCI Autosubmit: tazjin <tazjin@tvl.su> Reviewed-by: ezemtsov <eugene.zemtsov@gmail.com>
2022-01-20 | r/3650 feat(ops/pipelines): Trigger pipeline for tvl-kit through canon | Vincent Ambo | 1 | -0/+10
This CI pipeline in Buildkite verifies the external (josh-provided) view of the depot at //views/kit. See https://buildkite.com/tvl/tvl-kit Note that this always triggers a build of HEAD. This is because we don't know the transformed commit ID, and we currently have no way to pass a ref through. The pipeline is configured to skip intermediate builds. I asked Buildkite for some ideas on how to improve this, let's see. Change-Id: I6c60fb1ea7606c1c90219ef04fd7bada64661529 Reviewed-on: https://cl.tvl.fyi/c/depot/+/5010 Tested-by: BuildkiteCI Autosubmit: tazjin <tazjin@tvl.su> Reviewed-by: sterni <sternenseemann@systemli.org>
2022-01-19 | r/3637 refactor(nix/buildkite): Move fetch-parent-targets script here | Vincent Ambo | 1 | -1/+1
This is no longer TVL-specific and should live here with the other generalised stuff. Change-Id: I95a1b4c0321f34812162d6fd40568269abf639dd Reviewed-on: https://cl.tvl.fyi/c/depot/+/5006 Tested-by: BuildkiteCI Autosubmit: tazjin <tazjin@tvl.su> Reviewed-by: ezemtsov <eugene.zemtsov@gmail.com>
2022-01-19 | r/3636 refactor(ops/pipelines): Generalise fetch-parent-targets script | Vincent Ambo | 1 | -2/+2
Removes all TVL-specific values in favour of environment variables supplied by Buildkite. This makes it possible to reuse this script outside of TVL. Change-Id: Ic543bc41e4c81e65ee349ad241c515231e97ab30 Reviewed-on: https://cl.tvl.fyi/c/depot/+/5005 Tested-by: BuildkiteCI Autosubmit: tazjin <tazjin@tvl.su> Reviewed-by: ezemtsov <eugene.zemtsov@gmail.com>
2022-01-17 | r/3603 feat(ops/pipelines): Fetch parent target map for pipeline generation | Vincent Ambo | 1 | -1/+11
Change-Id: I1c7d48fc0974549d67146a15f79ddb0b6ddfe805 Reviewed-on: https://cl.tvl.fyi/c/depot/+/4947 Tested-by: BuildkiteCI Reviewed-by: sterni <sternenseemann@systemli.org>
2022-01-17 | r/3601 feat(ops/pipelines): Create drvmap structure for each commit | Vincent Ambo | 1 | -0/+3
Always create a structure that maps all targets to derivations, and persist it as a JSON file. This relates to some of the ideas expressed in: https://docs.google.com/document/d/16A0a5oUxH1VoiSM8hyFyLW0WiUYpNo2e2D6FTW4BlH8/edit The file is always uploaded to Buildkite as an artifact. This allows for retrieving it based on the commit ID in a Buildkite GraphQL query. By default, Buildkite stores artefacts for 6 months. Storage location can be overridden (with custom retention) through some environment variables, but for now at TVL the Buildkite-managed storage is fine. See also: https://buildkite.com/docs/pipelines/artifacts In the subsequent filtering implementation, when diffing commits across a time-range that exceeds artefact retention time, we should simply default to building everything. Change-Id: I6d808461cd1c1fdd6983ba8c8ef075736d42caa7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/3662 Tested-by: BuildkiteCI Reviewed-by: sterni <sternenseemann@systemli.org>
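The drvmap idea and its fallback can be illustrated with a small sketch. It assumes a drvmap is a plain mapping from target name to derivation path; the function name, the target names and the derivation paths in the usage note are all hypothetical, and the real filtering logic may differ.

```python
from typing import Dict, List, Optional


def targets_to_build(
    current: Dict[str, str], parent: Optional[Dict[str, str]]
) -> List[str]:
    """Return targets whose derivation differs from the parent commit's.

    If no parent drvmap is available (for example because the artifact has
    expired), fall back to building every target.
    """
    if parent is None:
        return sorted(current)
    return sorted(
        target for target, drv in current.items() if parent.get(target) != drv
    )
```

For example, with `parent = {"a": "/nix/store/old-a.drv", "b": "/nix/store/b.drv"}` and `current = {"a": "/nix/store/new-a.drv", "b": "/nix/store/b.drv", "c": "/nix/store/c.drv"}`, the changed target `a` and the new target `c` are selected, while the unchanged `b` is skipped.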
2022-01-07 | r/3524 revert: "fix(ops/pipelines): Remove duplicated wait step" | tazjin | 1 | -0/+4
This reverts commit 5e036ed9fc579d14353eb7da4af4b426c99f96e6. Reason for revert: This introduced a logic error since the remaining step runs at the wrong point in the pipeline. Temporarily reverting to having duplicated waits in order to clean up later. Change-Id: Ifa6ece50dd22924f02efd7b790a5863ca1189af7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/4841 Tested-by: BuildkiteCI Reviewed-by: tazjin <tazjin@tvl.su> Autosubmit: tazjin <tazjin@tvl.su>
2022-01-02 | r/3512 fix(ops/pipelines): Realise anchor derivation for rooting | Vincent Ambo | 1 | -1/+1
Turns the anchor derivation into something that can actually be built (a call creating a propagated build inputs file), and builds it. This should fix the anchoring logic we have on canon. Change-Id: If6a7662b82e2e396388980f65e332cf67a45b46e Reviewed-on: https://cl.tvl.fyi/c/depot/+/4763 Tested-by: BuildkiteCI Autosubmit: tazjin <mail@tazj.in> Reviewed-by: sterni <sternenseemann@systemli.org>
2022-01-02 | r/3510 fix(ops/pipelines): Remove duplicated wait step | Vincent Ambo | 1 | -4/+0
This now happens in //nix/buildkite instead Change-Id: Ie9e239ee4f28ac34aa4d3279dac55d70a2cb9d86 Reviewed-on: https://cl.tvl.fyi/c/depot/+/4764 Tested-by: BuildkiteCI Reviewed-by: sterni <sternenseemann@systemli.org>
2021-12-19 | r/3313 feat(ops/pipelines): annotate patchset builds with Gerrit URLs | Vincent Ambo | 1 | -0/+6
If available, provide a link back to Gerrit on the overview page of a build. Uses the default style (i.e. style unset), which makes it non-intrusive visually. Change-Id: I4271d589d548015b75762fd0584f3958bfcc53e5 Reviewed-on: https://cl.tvl.fyi/c/depot/+/4442 Tested-by: BuildkiteCI Reviewed-by: grfn <grfn@gws.fyi>
2021-12-15 | r/3248 fix(ops/pipelines): Chunk build pipeline into multiple uploads | Vincent Ambo | 1 | -2/+7
The number of jobs in the depot pipeline is reaching the limits of the Buildkite backend's ability for a single pipeline upload. Based on a conversation with their support my understanding is that this has to do with internal locking mechanisms at Buildkite. To work around this, we can instead chunk the pipeline into several smaller chunks that are uploaded serially. This commit introduces logic to chunk the pipeline accordingly. The chunk size chosen is 256 for now (a multiple of our number of agents, which is useful if we can get builds from the first chunk to start before the next ones are uploaded). Note that this chunk size is significantly below even the current number of targets (~460 as of this commit), but choosing a lower chunk size might alleviate problems we've been seeing with timeouts during pipeline uploads. Change-Id: I77030aaf8b874c330218b78c77d15216e13b9af7 Reviewed-on: https://cl.tvl.fyi/c/depot/+/4332 Tested-by: BuildkiteCI Reviewed-by: wpcarro <wpcarro@gmail.com> Autosubmit: tazjin <mail@tazj.in>
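The chunking described above can be sketched like this. It is an illustration only: the helper name `chunk` and the generated step names are hypothetical, and the real implementation lives in the depot's Nix code rather than in Python.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def chunk(items: Sequence[T], size: int = 256) -> List[List[T]]:
    """Split items into consecutive chunks of at most `size` elements."""
    if size <= 0:
        raise ValueError("chunk size must be positive")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


# Roughly the number of targets mentioned in the commit message.
steps = [f"target-{n}" for n in range(460)]
chunks = chunk(steps)
```

With 460 steps and a chunk size of 256 this yields two uploads, of 256 and 204 steps, which would then be uploaded one after the other.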
2021-12-10 | r/3202 refactor(ops/pipelines): Use agenix-deployed besadii secrets | Vincent Ambo | 1 | -2/+2
I *think* this is the final step for b/161 Change-Id: Ie7a2198a045f2f1866a245884ab0f5414e205327
2021-12-10 | r/3178 fix(ops/pipelines): Move :anchor: to static pipeline | Vincent Ambo | 1 | -0/+18
This step would get inserted at the wrong point in the build pipeline otherwise, causing a dependency cycle and causing the pipeline to fail. Change-Id: I534568eec77f74ae6c47276820f8a9e99493a3ea
2021-12-10 | r/3177 refactor(ops/pipelines): Move :duck: logic into static pipeline | Vincent Ambo | 1 | -7/+37
This simplifies the fallback logic used in case of Nix evaluation failure and makes it so that the evaluation step itself is the one that is marked as failed in Buildkite. This is possible because the pipeline upload command will insert new steps at the point where it runs in the pipeline, and not later. Change-Id: I870534c004ebc457a1602623c4e5f9c0c68e28fc
2021-11-29 | r/3116 refactor(ops/pipelines): Query build status from Buildkite API | Vincent Ambo | 1 | -1/+0
Instead of manually tracking the build status through Buildkite metadata, use the Buildkite GraphQL API in the `:duck:` build step (i.e. the one that determines the status of the entire pipeline to be reported back to Gerrit) to fetch the number of failed jobs. This way we have less manual state accounting in the pipeline. The downside is that the GraphQL query embedded here is a little hard to read. Notes: * This needs an access token for Buildkite. We already have one for besadii which is also run by the agents, so I've given it GraphQL permissions and reused it. * I almost introduced a very rare bug here: My initial intuition was to simply `exit $FAILED_JOBS` - in the extremely rare case where `$FAILED_JOBS % 256 = 0` this would mean we would ... fail to fail the build :) Change-Id: I61976b11b591d722494d3010a362b544efe2cb25
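The near-miss mentioned in the last note comes from process exit statuses being truncated to 8 bits. The sketch below models that arithmetic in Python; both function names are hypothetical, and the actual step is a shell script.

```python
def naive_exit_status(failed_jobs: int) -> int:
    """Status the parent process would see after `exit $FAILED_JOBS`.

    Only the low 8 bits of an exit status survive, so the value wraps
    modulo 256.
    """
    return failed_jobs % 256


def safe_exit_status(failed_jobs: int) -> int:
    """Collapse any positive failure count to a fixed non-zero status."""
    return 1 if failed_jobs > 0 else 0
```

With exactly 256 failed jobs the naive variant reports success (status 0), whereas the safe variant still reports failure.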
2021-11-06 | r/3007 fix(ops/pipelines): Fix tagging of commit revisions | Vincent Ambo | 1 | -5/+1
It seems that shell variables don't work as expected inside the Buildkite pipeline, so usage of variables has been removed. We also don't echo the revision anymore because of that, but it does still appear in the log of `git push`. Change-Id: I124e3b09af896da898f2a78715ed371651a1c5f8 Reviewed-on: https://cl.tvl.fyi/c/depot/+/3780 Tested-by: BuildkiteCI Reviewed-by: grfn <grfn@gws.fyi>
2021-11-05 | r/3005 refactor(ops/pipelines): Move revision tagging into static pipeline | Vincent Ambo | 1 | -0/+18
This makes the revision number available much earlier (before the rest of the pipeline runs, while Nix eval is happening) which should only be a few seconds after a commit to canon. It is also more readable in this shape. Change-Id: Iccbb17dfef6afe68f54fda41e8d10c4dc52b08c2 Reviewed-on: https://cl.tvl.fyi/c/depot/+/3775 Tested-by: BuildkiteCI Reviewed-by: grfn <grfn@gws.fyi>
2021-08-29 | r/2799 refactor(ops/pipelines): Move failure status zeroing to setup | Vincent Ambo | 1 | -3/+6
We changed the configured pipeline in Buildkite to upload `static-pipeline.yaml` instead of containing the steps of that pipeline itself. This makes it easier to test changes to builds and such, but adds another build step with scheduling overhead etc. However - we can work around this by killing one of the existing build steps. There's no reason the failure status zeroing (required for status reporting) shouldn't be part of the pipeline setup, so I've moved it there instead and nuked that step. This should mean that the pipeline is configurable from within the repo, but without slowing anything down. Change-Id: I206ecc02647de42a461e33c02879ab84daf5ed2b Reviewed-on: https://cl.tvl.fyi/c/depot/+/3461 Tested-by: BuildkiteCI Reviewed-by: sterni <sternenseemann@systemli.org>
2021-01-30 | r/2160 fix(ops/piplines/static-pipeline): add --show-trace to nix-build | Profpatsch | 1 | -1/+1
Change-Id: Ib0473f916b1436934844e620ce981f52d11e8512 Reviewed-on: https://cl.tvl.fyi/c/depot/+/2467 Tested-by: BuildkiteCI Reviewed-by: tazjin <mail@tazj.in>
2020-11-17 | r/1882 feat(ops/pipelines): Check in the static pipeline | Vincent Ambo | 1 | -0/+15
This file represents the static pipeline which is configured in the Buildkite web UI. Updates to this file should be applied in the admin interface. These steps are responsible for launching the dynamic pipeline evaluation, or falling back to the fallback pipeline if evaluation fails. Change-Id: I6d7dd623cde65e8c69faea729f737c9bba00c2fb Reviewed-on: https://cl.tvl.fyi/c/depot/+/2103 Tested-by: BuildkiteCI Reviewed-by: glittershark <grfn@gws.fyi>