[00:12:19] (PS3) Thcipriani: DNM: Backport/Train schedule update [tools/release] - https://gerrit.wikimedia.org/r/756695
[00:12:37] !log deployment-prep cherry-picked gerrit 758584 to beta puppetmaster T300591
[00:12:38] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[00:12:39] T300591: Beta cluster MediaWiki code not updating - https://phabricator.wikimedia.org/T300591
[00:13:40] (PS4) Thcipriani: Backport/Train schedule update [tools/release] - https://gerrit.wikimedia.org/r/756695
[00:13:50] (CR) Thcipriani: [C: +2] Backport/Train schedule update [tools/release] - https://gerrit.wikimedia.org/r/756695 (owner: Thcipriani)
[00:14:30] (Merged) jenkins-bot: Backport/Train schedule update [tools/release] - https://gerrit.wikimedia.org/r/756695 (owner: Thcipriani)
[00:18:51] Beta-Cluster-Infrastructure, Infrastructure-Foundations, Beta-Cluster-reproducible, Patch-For-Review, Puppet: Beta cluster MediaWiki code not updating - https://phabricator.wikimedia.org/T300591 (Tgr)
[00:20:09] RECOVERY - Check systemd state on doc1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:21:05] Beta-Cluster-Infrastructure, Infrastructure-Foundations, Beta-Cluster-reproducible, Patch-For-Review, Puppet: Beta cluster MediaWiki code not updating - https://phabricator.wikimedia.org/T300591 (Tgr) That seemed to fix it for now. jenkins scap [[https://integration.wikimedia.org/ci/view/Beta...
[00:34:40] !log deployment-pre un-cherry-picked gerrit 758584 from beta puppetmaster, patch is now merged T300591
[00:34:41] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[00:34:42] T300591: Beta cluster MediaWiki code not updating - https://phabricator.wikimedia.org/T300591
[00:37:39] Beta-Cluster-Infrastructure, Infrastructure-Foundations, Beta-Cluster-reproducible, Patch-For-Review, Puppet: Beta cluster MediaWiki code not updating - https://phabricator.wikimedia.org/T300591 (Tgr) @ema seems fixed but I'm not sure if this is the direction we want to go in (as opposed to f...
[02:46:31] Project-Admins: Mark the #Contributors-Team group as inactive - https://phabricator.wikimedia.org/T300558 (Tgr) Should the "Contributors" entries also be removed from [[https://www.mediawiki.org/wiki/Developers/Maintainers|Developers/Maintainers]]? The same can probably be asked about "Reading" as well (no...
[07:18:59] Beta-Cluster-Infrastructure, Release-Engineering-Team, Beta-Cluster-reproducible: Beta cluster down: Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T300525 (AlexisJazz) Broken again. ` Error Our servers are currently under maintenance or experiencing a technical problem....
[07:23:21] Beta-Cluster-Infrastructure, Infrastructure-Foundations, Beta-Cluster-reproducible, Puppet: Beta cluster MediaWiki code not updating - https://phabricator.wikimedia.org/T300591 (AlexisJazz) Beta cluster is down again. (see T300525#7666400)
[07:51:39] Continuous-Integration-Infrastructure: Stop using integration/composer and then archive the repo - https://phabricator.wikimedia.org/T249949 (Legoktm)
[08:55:18] Project-Admins: Mark the #Contributors-Team group as inactive - https://phabricator.wikimedia.org/T300558 (kostajh) I went through https://phabricator.wikimedia.org/maniphest/?ids=167899,115598,115597,112984,104863,90030,89576,87598,86196,88688#R and untagged Contributors, adding other teams where needed....
[10:15:59] Continuous-Integration-Config, Wikidata, Wikidata-Campsite, wdwb-tech, User-Ladsgroup: Run CI tests daily on master for ungated extensions - https://phabricator.wikimedia.org/T285049 (Addshore)
[11:19:29] (CR) Thiemo Kreuz (WMDE): [C: +1] Remove outdated beta feature dependency for FileExporter [integration/config] - https://gerrit.wikimedia.org/r/756949 (https://phabricator.wikimedia.org/T259690) (owner: Awight)
[11:38:22] (CR) Nikerabbit: "This seems to break cxserver pipeline builds:" [blubber] - https://gerrit.wikimedia.org/r/749569 (https://phabricator.wikimedia.org/T296046) (owner: BryanDavis)
[11:39:22] anyone around? Above issue is high priority blocker for the Language team
[11:42:32] hashar: are you around and is that anything you know about?
[11:43:12] Nikerabbit: yeah I am there
[11:43:25] no I don't know anything about the breakage since CI does thousands of build per day :D
[11:44:10] I don't know much either, but I suspect change in blubber broke cxserver pipeline builds: https://integration.wikimedia.org/ci/job/trigger-service-pipeline-test/11098/console
[11:44:51] ...which blocks our deployment because we need a code fix, which we cannot deploy without a build
[11:45:12] that would be one of https://gerrit.wikimedia.org/r/q/project:blubber I guess
[11:49:50] bah
[11:51:25] Nikerabbit: I don't even know how to revert and deploy the revert :/
[11:51:38] or how to get the blubber service to rollback to the previous image
[11:51:39] :-\
[11:52:45] I just joined here @Nikerabbit What wrong thing we did? :)
[11:53:00] hmm
[11:53:01] https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/758575
[11:53:04] should be that one
[11:54:51] Seems like that. @hashar Possible to fix/revert this today?
[11:56:16] https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/758805 Revert "blubberoid: pipeline bot promote"
[11:57:40] kart_: Nikerabbit: ^ here is the rollback change
[11:57:48] but I have absolutely no idea how to get that one rolled
[11:57:57] going to ask in the security channel
[11:58:57] thanks a lot for looking into this
[12:00:02] yeah well rollback is easy :D
[12:00:11] then I don't know how to deploy a helm chart hehe
[12:00:18] but I have asked in #mediawiki_security
[12:01:45] I am pretty sure it is a walk in the park for anyone that is used to deploy with helm
[12:04:29] kart_: Nikerabbit: so essentially gotta wait for someone to deploy that helm change
[12:05:09] there must be a detailed ton of doc somewhere but I know exactly nothing about it so I would rather not take the risk of breaking stuff :|
[12:05:29] hashar: okay. Will you pass the baton or should I create a Phab task to track this?
[12:05:34] Yep. I do deploy with helm, but this can break more stuffs.
[12:05:47] given the feature got added yesterday
[12:06:03] that might break one or two other repositories that might have adopted the new feature already
[12:06:14] then if it breaks all the other repos, I think it is fine to rollback
[12:06:28] Alexandros +1 ed the patch but well hasn't replied yet ;]
[12:06:39] :)
[12:07:17] Last 3 changes in deployment-charts repo:
[12:07:21] and there is zero chance I run `helmfile` on the deployment server since I am 100% I will make the world expose
[12:07:21] 4ce3a546 Add affinity to SSD nodes to termbox
[12:07:21] 0d5f05a4 Deploy Flores MT
[12:07:21] 29d8d7c3 blubberoid: pipeline bot promote
[12:07:30] oh
[12:07:42] well hopefully they got deployed
[12:07:54] Looks like safe rollback.
[12:08:27] if you are familiar with helm deployment maybe you have enough permissions to update the blubberoid service?
[12:10:29] so essentially it got to be rolled back and I guess everyone is having lunch right now
[12:28:36] :)
[12:29:21] I'm familiar, but then it was risky to deploy without getting +1s from others, I was about to ask on -operations (was debugging code itself)
[12:43:40] dduvall: ^ we ended up rolling back blubberoid due to T296046 :]
[12:43:40] T296046: Allow build time control of effective UID/GID for runtime in Blubber generated Dockerfile - https://phabricator.wikimedia.org/T296046
[12:43:54] I trust Dan in following up about it later today
[13:40:15] Release-Engineering-Team: Requesting membership of the analytics group in gerrit for 'btullis' - https://phabricator.wikimedia.org/T300631 (BTullis)
[13:40:43] Release-Engineering-Team, Data-Engineering-Radar: Requesting membership of the analytics group in gerrit for 'btullis' - https://phabricator.wikimedia.org/T300631 (BTullis)
[14:08:15] Release-Engineering-Team, Data-Engineering-Radar, Gerrit-Privilege-Requests: Requesting membership of the analytics group in gerrit for 'btullis' - https://phabricator.wikimedia.org/T300631 (Zabe)
[14:50:50] Beta-Cluster-Infrastructure, Release-Engineering-Team, Beta-Cluster-reproducible: Beta cluster down: Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T300525 (AlexisJazz) And down again.
[15:17:50] Release-Engineering-Team, Scap, serviceops: Deploy Scap version 4.2.2 - https://phabricator.wikimedia.org/T300392 (Jelto) Open→Resolved scap `4.2.2` is deployed on all machines. I'm closing this task.
[15:35:22] * bd808 tries to understand how his blubber patch broke cxserver
[15:44:50] (PS1) QChris: Allow “Gerrit Managers” to import history [extensions/PageProperties] (refs/meta/config) - https://gerrit.wikimedia.org/r/758876
[15:44:52] (CR) QChris: [V: +2 C: +2] Allow “Gerrit Managers” to import history [extensions/PageProperties] (refs/meta/config) - https://gerrit.wikimedia.org/r/758876 (owner: QChris)
[15:45:25] (PS1) QChris: Import done. Revoke import grants [extensions/PageProperties] (refs/meta/config) - https://gerrit.wikimedia.org/r/758877
[15:45:27] (CR) QChris: [V: +2 C: +2] Import done. Revoke import grants [extensions/PageProperties] (refs/meta/config) - https://gerrit.wikimedia.org/r/758877 (owner: QChris)
[15:58:54] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.20 deployment blockers - https://phabricator.wikimedia.org/T293961 (dancy) Thanks @Jdlrobson
[16:14:44] (CR) Ahmon Dancy: [C: +2] sync-world: Change handling of wikiversions.php [tools/scap] - https://gerrit.wikimedia.org/r/757759 (owner: Ahmon Dancy)
[16:15:44] Scap: Make 'scap update-interwiki-cache' less scary - https://phabricator.wikimedia.org/T247107 (dancy) Open→Resolved
[16:17:54] (Merged) jenkins-bot: sync-world: Change handling of wikiversions.php [tools/scap] - https://gerrit.wikimedia.org/r/757759 (owner: Ahmon Dancy)
[16:31:44] GitLab (Project Migration), Release-Engineering-Team (Doing), User-brennen: Create new GitLab project group: Generated Data Platform - https://phabricator.wikimedia.org/T296381 (Eevans) Thinking out loud: how easy is it to move/rename and/or merge these groups ~~if~~ when the corresponding teams chan...
[16:46:07] GitLab (Infrastructure), serviceops, Patch-For-Review: Migrate gitlab-test instance to puppet - https://phabricator.wikimedia.org/T297411 (Dzahn) tested applying the puppet role after floating IP was added. we will need "profile::gitlab::monitoring_whitelist" in Hiera next
[16:55:00] GitLab (Infrastructure), serviceops, Patch-For-Review: Migrate gitlab-test instance to puppet - https://phabricator.wikimedia.org/T297411 (Dzahn) The merge above fixed: did not find a value for the name 'profile::gitlab::monitoring_whitelist' next issue is: parameter 'exporters' expects a Hash v...
[16:57:49] Release-Engineering-Team (Next), Release, Train Deployments, User-brennen: 1.38.0-wmf.20 deployment blockers - https://phabricator.wikimedia.org/T293961 (brennen)
[16:58:28] GitLab (CI & Job Runners), Release-Engineering-Team (Radar), Security-Team, serviceops, and 2 others: Setup GitLab Runner in trusted environment - https://phabricator.wikimedia.org/T295481 (Jelto)
[17:02:44] GitLab (Infrastructure), serviceops, Patch-For-Review: Migrate gitlab-test instance to puppet - https://phabricator.wikimedia.org/T297411 (Dzahn)
[17:18:43] Yippee, build fixed!
[17:18:43] Project mediawiki-core-doxygen-docker build #31561: FIXED in 14 min: https://integration.wikimedia.org/ci/job/mediawiki-core-doxygen-docker/31561/
[17:21:55] (PS2) Lucas Werkmeister (WMDE): Zuul: [mediawiki/extensions/WikibaseLexeme] Add mwext-doxygen-publish [integration/config] - https://gerrit.wikimedia.org/r/734654
[17:23:39] (PS3) Addshore: Zuul: [mediawiki/extensions/WikibaseLexeme] Add mwext-doxygen-publish [integration/config] - https://gerrit.wikimedia.org/r/734654 (owner: Lucas Werkmeister (WMDE))
[17:24:06] (CR) Addshore: "Removed the depends on so that I can merge and deploy this in preparation for the Lexeme patch landing and thus the most merge job running" [integration/config] - https://gerrit.wikimedia.org/r/734654 (owner: Lucas Werkmeister (WMDE))
[17:24:09] (CR) Addshore: [C: +2] Zuul: [mediawiki/extensions/WikibaseLexeme] Add mwext-doxygen-publish [integration/config] - https://gerrit.wikimedia.org/r/734654 (owner: Lucas Werkmeister (WMDE))
[17:26:34] (Merged) jenkins-bot: Zuul: [mediawiki/extensions/WikibaseLexeme] Add mwext-doxygen-publish [integration/config] - https://gerrit.wikimedia.org/r/734654 (owner: Lucas Werkmeister (WMDE))
[17:27:24] !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/734654
[17:27:25] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:33:53] Release-Engineering-Team (Next), Patch-For-Review, Release, Train Deployments: 1.38.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T293958 (thcipriani) Open→Resolved
[18:04:41] Release-Engineering-Team (Radar), SRE-Access-Requests, User-brennen: Requesting access to deploy-phabricator for brennen - https://phabricator.wikimedia.org/T300658 (brennen)
[18:09:18] twentyafterfour: are you leaving the foundation?
[18:21:10] Release-Engineering-Team (Radar), SRE-Access-Requests, User-brennen: Requesting access to deploy-phabricator for brennen - https://phabricator.wikimedia.org/T300658 (thcipriani)
[18:22:11] Release-Engineering-Team (Radar), SRE-Access-Requests, User-brennen: Requesting access to deploy-phabricator for brennen - https://phabricator.wikimedia.org/T300658 (thcipriani) Approved as sponsor/manager. There is no official approver in data.yaml; however, I probably am the right person to be th...
[19:32:36] (Queue (Jenkins jobs + Zuul functions) alert) firing: (2) Queue (Jenkins jobs + Zuul functions) alert - https://alerts.wikimedia.org
[19:37:36] (Queue (Jenkins jobs + Zuul functions) alert) firing: (3) Queue (Jenkins jobs + Zuul functions) alert - https://alerts.wikimedia.org
[19:52:36] (Queue (Jenkins jobs + Zuul functions) alert) firing: (2) Queue (Jenkins jobs + Zuul functions) alert - https://alerts.wikimedia.org
[19:57:36] (Queue (Jenkins jobs + Zuul functions) alert) resolved: Queue (Jenkins jobs + Zuul functions) alert - https://alerts.wikimedia.org
[20:29:27] (PS1) BryanDavis: Revert "feature: build-time arguments for lives & runs user config" [blubber] - https://gerrit.wikimedia.org/r/758907
[20:30:21] (PS2) BryanDavis: Revert "feature: build-time arguments for lives & runs user config" [blubber] - https://gerrit.wikimedia.org/r/758907
[20:30:53] (PS3) BryanDavis: Revert "feature: build-time arguments for lives & runs user config" [blubber] - https://gerrit.wikimedia.org/r/758907 (https://phabricator.wikimedia.org/T296046)
[21:19:00] Release-Engineering-Team (Radar), SRE, SRE-Access-Requests, Patch-For-Review, and 2 others: Requesting access to deploy-phabricator for brennen - https://phabricator.wikimedia.org/T300658 (Ladsgroup) Open→Resolved a:Ladsgroup
[22:24:42] Release-Engineering-Team (Radar), SRE, SRE-Access-Requests, User-Ladsgroup, User-brennen: Requesting access to deploy-phabricator for brennen - https://phabricator.wikimedia.org/T300658 (Dzahn) >>! In T300658#7668604, @thcipriani wrote: > There is no official approver in data.yaml; however, I...
[22:26:26] bd808: thanks for handling that blubberoid rollback. that's a bit disappointing to have to do that
[22:28:15] dduvall: thanks for the empathy. :) Old software versions strike again, but there is hope in a Bullseye future I guess.
[22:28:39] h.ashar did the real needful before I woke up.
[22:29:18] do we really need to wait for bullseye or can we get a deb backported?
[22:29:34] dduvall: you should land https://gerrit.wikimedia.org/r/c/blubber/+/758907 soon so that further blubber fixes are not blocked.
[22:30:12] (CR) Dduvall: [C: +2] Revert "feature: build-time arguments for lives & runs user config" [blubber] - https://gerrit.wikimedia.org/r/758907 (https://phabricator.wikimedia.org/T296046) (owner: BryanDavis)
[22:30:19] dduvall: not sure about backporting the docker.io package. I have a hunch it would be more than just rebuilding the package, but maybe not.
[22:30:21] bloop
[22:31:07] i really want to revive the blubber buildkit frontend. this kind of problem would go away with that
[22:31:32] what magic is that?
[22:31:50] i.e. the version of buildkit that's built against would be part of the blubber-frontend container image
[22:32:20] so buildkit, and by extension docker, have this concept now of different format "frontends"
[22:32:21] ah. but we need newer Docker just to get buildkit in the first place.
[22:32:29] and a frontend is packaged as an image container
[22:33:49] (Merged) jenkins-bot: Revert "feature: build-time arguments for lives & runs user config" [blubber] - https://gerrit.wikimedia.org/r/758907 (https://phabricator.wikimedia.org/T296046) (owner: BryanDavis)
[22:34:10] CI has Docker 18.09.1 and 19.03 is the first buildkit capable Docker
[22:34:13] you can then put `# syntax=foo/builder` lines at the top of your config and have your own frontend (e.g. foo/builder) process the configuration along with the build context, and spit out LLB instructions
[22:34:39] (low level build instructions)
[22:35:17] If we had 19.03 then this could be fixed with a `# syntax=docker/dockerfile:1.1.0` comment
[22:35:28] since the frontend is built against its own version of buildkit, you can patch it without having to rely on a newer docker
[22:35:36] right. i see
[22:36:04] so once we have a new enough docker, we can use that approach
[22:36:15] so making a buildkit frontend out of Blubber would be cool, but blocked by the same stale infrastructure
[22:37:38] yep :/
[22:39:22] making Blubber a buildkit frontend is deep magic too though, since the fe is what is responsible for the processing the Dockerfile that would me you either recreate all of that logic from upstream or you fork into a unique syntax entirely
[22:39:35] *would mean
[22:39:39] darn, i really thought we had a new enough docker already but i didn't realize contint1001 was so far behind
[22:40:12] GitLab (Infrastructure), serviceops, Patch-For-Review: Migrate gitlab-test instance to puppet - https://phabricator.wikimedia.org/T297411 (Dzahn) After small fixes above the "role::gitlab" (except same thing as prod) class is now applied on gitlab-prod-1001.devtools in cloud VPS and DOES NOT FAIL anym...
[22:41:00] bd808: it doesn't really have to recreate that magic. it just reuses the same code in docker since they're both go projects
[22:41:09] there's a patch somewhere
[22:41:36] but what i had envisioned a while back was just using buildkit instructions internally so there'd be no intermediary dockerfile text
[22:41:44] bd808: dduvall: I have rebuilt the CI WMCS instances last week to Bullseye so they have docker 20.10.5
[22:42:09] dduvall: ah, so you end up extending/reusing moby/buildkit/frontend somehow?
[22:42:16] hashar: nice, i just saw that. how far out are we from upgrading contint*?
[22:42:22] hashar: yeah, but contint1001 is the needful here
[22:42:42] bd808: yeah
[22:42:45] and I once had the upstream docker deb packages imported in our apt under thirdparty/ci so potentially we could reuse the upstream debian packages on contint1001 and contint2001. They currently are running Buster and borrow the stock debian package
[22:43:02] bd808: https://gerrit.wikimedia.org/r/c/blubber/+/504651
[22:43:02] contint* is the Docker that builds pipelinelib things
[22:43:34] in puppet we have modules/aptrepo/files/updates which is used by reprepro to import deb package from remote repos
[22:44:09] there is one bit to import docker in our thirdparty/ci which we used until we have upgraded to Buster
[22:44:25] so that can be revived I guess and we can get a newer docker on the contint* prod machines
[22:44:30] so did we actually regress in our docker version on contint* then?
[22:44:45] we had stretch before I think
[22:45:17] with the cherry picked docker update from upstream
[22:45:36] the exact version must be somewhere in the puppet history
[22:45:57] modules/profile/manifests/ci/docker.pp
[22:46:10] $docker_version = $::lsbdistcodename ? {
[22:46:10] 'stretch' => '5:19.03.5~3-0~debian-stretch',
[22:46:10] # Docker version is ignored starting with Buster
[22:46:10] default => '',
[22:46:15] it does look like the version regressed. the puppet manifest looks like 19.03.5 was available on stretch contint servers
[22:46:22] :(
[22:46:46] so those bits can be added back
[22:46:56] get docker upstream package imported for buster thirdparty/ci
[22:46:58] buster and bullseye switched to the "docker.io" package
[22:47:00] is there a package for buster already?
[22:47:05] adjust puppet so that we get that version installed on buster machines
[22:47:10] or do we need to tweak the reprepro stuff?
[22:47:13] upgrade docker on contint host
[22:47:13] GitLab (Infrastructure), serviceops, Patch-For-Review: Migrate gitlab-test instance to puppet - https://phabricator.wikimedia.org/T297411 (Dzahn) [] installed gitlab-ce package post-installation script subprocess returned error exit status 1 [] nginx initial setup needs race condition? [] Checking i...
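Editorial note: the regression spotted at 22:46:15 comes down to comparing Docker version strings. The following is a minimal Python sketch of that comparison, using only the versions quoted in the conversation. The helper names (`upstream_version`, `is_buildkit_capable`, `is_regression`) are hypothetical, the parsing is a simplification of Debian version syntax, and the 19.03 threshold is simply the figure quoted at 22:34:10, taken as given rather than checked against Docker's release notes.

```python
import re
from typing import Tuple

# Threshold as stated in the log ("19.03 is the first buildkit capable Docker").
# Assumption: taken at face value from the conversation, not verified upstream.
BUILDKIT_MIN = (19, 3)


def upstream_version(deb_version: str) -> Tuple[int, ...]:
    """Return the dotted numeric upstream part of a Debian-style version.

    An optional epoch ("5:") is dropped, and so is everything after the
    leading dotted number (for example "~3-0~debian-stretch").
    """
    without_epoch = deb_version.split(":", 1)[-1]
    match = re.match(r"\d+(?:\.\d+)*", without_epoch)
    if match is None:
        raise ValueError(f"no numeric version found in {deb_version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_buildkit_capable(deb_version: str) -> bool:
    """Compare major.minor against the threshold quoted in the log."""
    return upstream_version(deb_version)[:2] >= BUILDKIT_MIN


def is_regression(old: str, new: str) -> bool:
    """True when the new version sorts below the old one."""
    return upstream_version(new) < upstream_version(old)


if __name__ == "__main__":
    stretch_pin = "5:19.03.5~3-0~debian-stretch"  # from the puppet selector
    contint_now = "18.09.1"                       # version reported for CI
    wmcs_bullseye = "20.10.5"                     # rebuilt WMCS instances

    print(is_regression(stretch_pin, contint_now))
    for version in (stretch_pin, contint_now, wmcs_bullseye):
        print(version, is_buildkit_capable(version))
```

Comparing integer tuples rather than raw strings avoids lexical pitfalls such as "9" sorting after "10". It does not implement the full Debian ordering rules (tilde handling, revisions), which `dpkg --compare-versions` covers on a real host.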
[22:47:33] reprepro might need a tweak indeed
[22:48:06] do file a task, you can cc moritz on it he has all the experience about the reprepro mechanism and greatly assisted me in the past on that topic
[22:48:36] but the bits are present for sure
[22:49:35] iirc docker apt repo has a suite per debian distro so maybe some rule has to be adjusted in the modules/aptrepo/files/updates
[22:49:54] then roots have to run a reprepro command to get the package imported (and gpg validated)
[22:50:31] contint* hosts already have http://apt.wikimedia.org/wikimedia buster-wikimedia/thirdparty/ci
[22:51:01] so it is then all about adding the imported docker version to get it used for buster at modules/profile/manifests/ci/docker.pp
[22:51:20] which would have no effect on WMCS instances which are running Bullseye
[22:51:26] that was my brain dump :]
[22:51:57] I am off, it is late!
[22:52:04] oh
[22:52:16] thanks, hashar :)
[22:52:20] goodnight
[22:52:23] do file a task and subscribe Moritz on it. He has been of great help on that front ;]
[22:52:39] and has all the knowledge about reprepro and thirdparty/ci suite
[22:53:09] T226236 was a prior iteration of this stuff where the component got created
[22:53:09] T226236: Upload docker-ce 18.06.3 upstream package for Stretch - https://phabricator.wikimedia.org/T226236
[22:53:27] yea
[22:53:36] do cc me on the new task, I can probably brain dump a bit more on it
[22:53:48] no promise as to get it done this week. I am a little packed already :-\
[22:54:02] and out next week. But SRE can handle that for sure
[22:54:04] i've dealt with reprepro before. should be pretty straightforward
[22:54:12] yeah :]]
[22:54:26] thanks for the spelunking and the rollback!
[22:54:40] the docker downgraded, I have no idea
[22:54:58] I guess it is a shortcoming of the os upgrade of the contint hosts unfortunately :\
[22:55:21] anyway it is midnight!!! happy buildkit hacking
[22:57:12] dduvall: The component is only for stretch -- https://github.com/wikimedia/puppet/blob/production/modules/aptrepo/files/updates#L204-L211 -- so if contint* have moved on to Buster maybe it's as simple as having that suite updated?
[22:58:08] bd808: i think there's that and then the hacky stuff in profile::ci::docker that needs updating
[22:58:46] yeah, to switch package names & versions
[22:59:47] I suppose upgrading the contint* servers to Bullseye is more than folk would want to do?
[23:00:12] there's a whole pile of software on those servers right?
[23:00:27] quite a pile, yes
[23:00:36] and... zuul v2
[23:00:40] yeah, role::ci::master is a lot of stuff
[23:00:44] depending on python 2.7
[23:01:09] it's quite the cornucopia of cruft
[23:01:19] cornucruftia
[23:01:29] :sad trombone: zull is the original reason for gitlab as I recall
[23:01:33] *zuul
[23:02:38] zuul (v2) is a relic of flailing at CI for a few years
[23:03:00] there was a plan to move off of Zuul before there was a plan to move to GitLab
[23:03:32] but that plan was sort of superseded by the GitLab decision
[23:04:50] *nod* I thought I remembered gitlab's CI as the "winner" but maybe that was a retroactive thing in my brain to deal with the cognitive dissonance involved in all the gitlab madness
[23:05:43] no, we had Argo as the winner originally, and worked on that for a while, and then management stepped in and said GitLab was what we really needed :)
[23:05:59] sigh. right
[23:06:35] which to be fair was not a bad decision. i think GitLab is the right way to go, but we definitely lost traction
[23:06:39] * bd808 realizes he now has a "he who shall not be named" to go with "she who shall not be named"
[23:06:56] oh geez
[23:07:41] it was a forced decision still in search of a reason IMO, but opinions vary
[23:08:09] i think we could have moved to Argo and still used GitLab
[23:08:45] GitLab does many things and CI is just one. it's very possible to decouple CI from it
[23:09:21] but, going back to Argo would be backtracking at this point, and we'd lose traction again. so, onto GitLab!
[23:25:20] Continuous-Integration-Infrastructure, Release-Engineering-Team (Yak Shaving 🐃🪒): contint1001 and contint2001 need a newer version of Docker installed - https://phabricator.wikimedia.org/T300682 (dduvall)
[23:26:20] Continuous-Integration-Infrastructure, Release-Engineering-Team (Yak Shaving 🐃🪒): contint1001 and contint2001 need a newer version of Docker installed - https://phabricator.wikimedia.org/T300682 (dduvall) Note this issue is currently blocking {T296046}. See T296046#7668655
[23:48:44] duesen: dduvall: An old thing that may be colliding with SettingsLoader plans: per-extension config sources, https://phabricator.wikimedia.org/T249564 it's a no-op experiment adopted in a few extensions.
[23:49:06] I intend to rip that out and remove it to make your work easier, curious on your perspectives on that
[23:50:43] Project-Admins, Data-Engineering: Archive Analytics tag - https://phabricator.wikimedia.org/T298671 (odimitrijevic) Hi @Aklapper apologies for the very late response on this and thanks for the list above. I propose the following changes: * Can H126 be changed to add Data-Engineering instead of Analytics...