[00:36:22] Gerrit: Gerrit Reviewer Bot is unreliable - https://phabricator.wikimedia.org/T290905 (Tgr) Thanks @valhallasw! What's a reasonable time for the bot to process the queue? (Ie. when should I assume there was an error if I don't see the bot adding reviewers?)
[08:12:00] Gerrit: Gerrit Reviewer Bot is unreliable - https://phabricator.wikimedia.org/T290905 (kostajh) >>! In T290905#7364090, @valhallasw wrote: > https://gerrit-reviewer-bot.toolforge.org/ contains the last 50-or-so log lines; if you have access to tool labs I'm fairly certain you can just read the .out and .err...
[08:22:52] Continuous-Integration-Infrastructure, cloud-services-team (Kanban): integration-agent-qemu-1001 in project integration has corrupted disk / partition - https://phabricator.wikimedia.org/T290615 (hashar)
[08:35:19] Gerrit: Gerrit Reviewer Bot is unreliable - https://phabricator.wikimedia.org/T290905 (valhallasw) I think I know what's going on -- it's an issue I fixed earlier for ReleaseTaggerBot (which uses the same email based system), but forgot to also apply here: T284587: https://github.com/wikimedia/labs-tools-fo...
[08:47:47] Release-Engineering-Team, MW-on-K8s, Performance-Team, SRE, and 2 others: Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536 (Joe) I have some alternative ideas. Specifically, right now we have a limited number of different clusters, due to the complexity of corre...
[08:49:08] Release-Engineering-Team, MW-on-K8s, Performance-Team, SRE, and 2 others: Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536 (Joe) I forgot to add: offering the beta feature would be nice, and given it only regards logged-in users, it would not need a split of cac...
[10:01:30] PROBLEM - Work requests waiting in Zuul Gearman server on contint2001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [150.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[10:08:42] O_O what happened to the gearman job queue?
[10:08:53] the little diagram at the bottom of the zuul dashboard says it’s at 20k…
[10:09:35] some big spikes in https://grafana.wikimedia.org/d/000000322/zuul-gearman?orgId=1&from=now-7d&to=now too
[10:26:15] RECOVERY - Work requests waiting in Zuul Gearman server on contint2001 is OK: OK: Less than 100.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[11:32:51] Lucas_WMDE: will look at it
[11:32:55] something got stuck I guess
[11:34:12] seems to have resolved itself
[11:34:17] hmm something has sent a lot of patches around 9:45
[11:34:34] I guess a lot of merge jobs to try to merge the incoming patches against the tip of the target branch
[11:37:58] it was transient yes
[11:53:47] Continuous-Integration-Config, Wikidata, wdwb-tech, User-Ladsgroup, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Run CI tests daily on master for ungated extensions - https://phabricator.wikimedia.org/T285049 (Ladsgroup) a:Ladsgroup→None I don't think I can do much on this witho...
[12:41:26] PROBLEM - Work requests waiting in Zuul Gearman server on contint2001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [150.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[12:45:14] RECOVERY - Work requests waiting in Zuul Gearman server on contint2001 is OK: OK: Less than 100.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[14:03:19] o/ I wonder if we could get a wmde group on https://gitlab.wikimedia.org/explore/groups yet? :D
[14:04:23] addshore: File as task AIUI
[14:04:49] (but the answer should be yes)
[14:04:50] awesome! will do!
[14:05:24] * majavah complains about everyone not being able to use gitlab yet
[14:08:55] I think there's some scaling and stuff to do :P
[14:09:01] And not letting people assume it's production ready
[14:09:21] folks with access to it sure do treat it as production ready :P
[14:09:52] https://gitlab.wikimedia.org/Reedy
[14:09:58] yup, look at all that activity
[14:10:01] so much activity
[14:10:05] you won't believe how much activity there is
[14:10:13] *hides*
[14:12:33] https://gitlab.wikimedia.org/explore/projects
[14:14:05] Gerrit, Release-Engineering-Team (Radar), Infrastructure-Foundations, SRE, and 3 others: Add logout.d script for Gerrit - https://phabricator.wikimedia.org/T286905 (jbond) >>! In T286905#7342139, @MoritzMuehlenhoff wrote: > Adding this functionality goes a little beyond the scope of the logout.d...
[14:14:42] if we have projects deciding to make gerrit repos read-only in favor of gitlab repos, it's in production, ready or not
[14:15:18] Are releng aware of people doing that
[14:21:42] Continuous-Integration-Config, Release-Engineering-Team (Doing), BlueSpice: BlueSpice related tests fails on gate-and-submit-1.31 in quibble-composer tests - https://phabricator.wikimedia.org/T235807 (Osnard) @Jdforrester-WMF apparently you and @hashar agreed on merging this patch in T235807#5987659....
[14:33:01] yes, releng
[14:37:04] unless I'm missing something about releng/dev-images (I might be)
[14:38:45] mediawiki/tools/cli
[14:43:29] the “work in progress” warning also sounds more like “this isn’t done yet” than “this might disappear at any moment without backups” to me tbh
[14:46:02] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), Cloud-VPS, Patch-For-Review, cloud-services-team (Kanban): Support Cinder for CI docker workers - https://phabricator.wikimedia.org/T277078 (Bstorm) @Krinkle I left one step undone so that I didn't cause any potential bre...
[14:47:02] yeah, "work in progress" implies more on the "we haven't quite figured out CI" and "there's no phab integration" side
[14:48:15] not "we might accidentally test one of the benefits of distributed version control"
[15:14:01] (CR) Ahmon Dancy: Fixes to support docker-compose 1.21.0 (for Debian 10) (1 comment) [tools/train-dev] - https://gerrit.wikimedia.org/r/708598 (owner: Ahmon Dancy)
[15:18:27] this task is the one we'd like to resolve before opening the instance up to a wider pool of folks: https://phabricator.wikimedia.org/T288392
[15:33:22] Release-Engineering-Team (Doing), GitLab, Privacy Engineering, Security-Team: Add "Samuel (WMF)" account to Security Team group in gitlab.wikimedia.org - https://phabricator.wikimedia.org/T291094 (sguebo_WMF) Thanks for handling that, @thcipriani
[15:39:57] * bd808 was an early adopter of Diffusion and as a result will be a late adopter of Gitlab
[15:43:36] bd808: My sympathies for your struggles.
[15:44:20] I'm sure it will all work out in the long run, I'm just shy of the extra work that doesn't always pan out
[15:58:05] :(
[15:59:13] "To determine what is available in our free tier and what is available only in our paid tiers, we first assess who cares the most about the feature. Individual contributors rarely purchase The DevOps Platform, and thus, if the feature is something primarily individuals care about it will be open source. If the features are something primarily managers, directors, or executives care about then it will be source-available." from their IPO
[15:59:14] paperwork is also not really exciting to me.
[16:00:18] I'm still waiting to find an open core project that doesn't turn on its community
[16:07:10] I think it was shakespeare that said the course of open projects ne'er did run smooth.
[16:11:47] meh (the gitlab saga)
[16:11:55] Release-Engineering-Team (Radar), Quality-and-Test-Engineering-Team (QTE), serviceops-radar, CommRel-Specialists-Support (Jul-Sep-2021), and 2 others: Expand the list of group 1 wikis to contain at least one (preferably 2) smaller "top ten size" wikis - https://phabricator.wikimedia.org/T286664 (E...
[16:12:25] IPO!
[16:12:37] * ebernhardson can't figure out what the devops-platform is, it's more efficient faster and reduces risk. But i still don't know what it does
[16:14:11] whatever it does: it's faster and safer. Sounds like a win.
[16:15:53] graphs go up and to the right? I'm in.
[16:16:17] they did say it was for things management likes :)
[16:20:59] Release-Engineering-Team (Radar), Quality-and-Test-Engineering-Team (QTE), serviceops-radar, CommRel-Specialists-Support (Jul-Sep-2021), and 2 others: Expand the list of group 1 wikis to contain at least one (preferably 2) smaller "top ten size" wikis - https://phabricator.wikimedia.org/T286664 (o...
[16:21:55] (PS1) Ahmon Dancy: build-mv-image: Fix FORCE_FULL_BUILD logic [tools/release] - https://gerrit.wikimedia.org/r/722407
[16:24:58] (CR) Ahmon Dancy: [C: +2] build-mv-image: Fix FORCE_FULL_BUILD logic [tools/release] - https://gerrit.wikimedia.org/r/722407 (owner: Ahmon Dancy)
[16:26:10] (Merged) jenkins-bot: build-mv-image: Fix FORCE_FULL_BUILD logic [tools/release] - https://gerrit.wikimedia.org/r/722407 (owner: Ahmon Dancy)
[16:29:48] * AntiComposite would not be surprised if GitLab eventually ends up like OwnCloud
[16:44:27] Or we can add a GitHub skin to Gerrit
[16:51:29] let's just go straight to the worst of all possible worlds, throw up our hands in defeat on the notion of self-hosting, and mandate a browser extension to make github look like gerrit.
[16:51:58] nobody will be happy and i can go back to writing shell scripts to parse error logs or something.
[16:54:48] sounds awesome, notit
[18:22:11] Release-Engineering-Team (Yak Shaving 🐃🪒), mwcli: mwcli: Automate upload of new version to releases server - https://phabricator.wikimedia.org/T290335 (Addshore) I played around with releasing on gitlab today in my test repo https://gitlab.wikimedia.org/addshore/test/-/releases Made a release with artef...
[18:46:11] Release-Engineering-Team (Doing), Release, Train Deployments: 1.38.0-wmf.1 deployment blockers - https://phabricator.wikimedia.org/T281165 (Legoktm) ##### Risky Patches! 🚂🔥 * **Change**: https://gerrit.wikimedia.org/r/719651 (PagedTiffHandler) https://gerrit.wikimedia.org/r/717154 (PdfHandler) * *...
[18:51:02] (CR) Ahmon Dancy: "Please incorporate the build loop into train-dev start/stop." [tools/train-dev] - https://gerrit.wikimedia.org/r/716060 (https://phabricator.wikimedia.org/T287993) (owner: Jeena Huneidi)
[18:53:06] (CR) Jeena Huneidi: Build mediawiki image in train-dev (1 comment) [tools/train-dev] - https://gerrit.wikimedia.org/r/716060 (https://phabricator.wikimedia.org/T287993) (owner: Jeena Huneidi)
[19:04:18] Continuous-Integration-Config, Release-Engineering-Team: Rebuild CI images affected by OpenSSL compat issue with new Let's Encrypt issuance chain - https://phabricator.wikimedia.org/T291425 (hashar)
[19:09:47] Release-Engineering-Team (Yak Shaving 🐃🪒), mwcli: mwcli: Automate upload of new version to releases server - https://phabricator.wikimedia.org/T290335 (Addshore) And if anyone ever stumbles on this ticket for making such releases, here is an updated example that will dynamically include all the files in t...
[19:47:37] Release-Engineering-Team (Yak Shaving 🐃🪒), mwcli, User-Addshore: mwcli: Automate upload of new version to a releases server - https://phabricator.wikimedia.org/T290335 (Addshore) a:Addshore
[20:04:48] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), Cloud-VPS, Patch-For-Review, cloud-services-team (Kanban): Support Cinder for CI docker workers - https://phabricator.wikimedia.org/T277078 (Krinkle) @Bstorm OK. I've added `profile::wmcs::lvm` to qemu-agent-1003 in Horiz...
[20:29:06] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), Cloud-VPS, Patch-For-Review, cloud-services-team (Kanban): Support Cinder for CI docker workers - https://phabricator.wikimedia.org/T277078 (Bstorm) That's interesting. Thanks for leaving it. I'll take a look and try to f...
[20:29:54] Phabricator, Upstream: Some task notifications include a @mention entry in the X-Phabricator-Stamps mail header for no obvious reason - https://phabricator.wikimedia.org/T266328 (mmodell) I can't make much sense of what that specific event would trigger an @mention but there are other @mentions in the th...
[20:53:31] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), Cloud-VPS, Patch-For-Review, cloud-services-team (Kanban): Support Cinder for CI docker workers - https://phabricator.wikimedia.org/T277078 (Bstorm) @Krinkle I cannot find that error anywhere on qemu-agent-1003. It looks...
[20:54:52] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), Cloud-VPS, Patch-For-Review, cloud-services-team (Kanban): Support Cinder for CI docker workers - https://phabricator.wikimedia.org/T277078 (Bstorm) Ah, I know why I am not finding it :) I was trying to ssh to qemu-agent-...
[21:45:23] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), Cloud-VPS, Patch-For-Review, cloud-services-team (Kanban): Support Cinder for CI docker workers - https://phabricator.wikimedia.org/T277078 (Bstorm) That fixes the problem @Krinkle. It should now be a noop on old things a...
[21:51:33] Release-Engineering-Team (Doing), Release, Train Deployments: 1.38.0-wmf.1 deployment blockers - https://phabricator.wikimedia.org/T281165 (matmarex) ##### Risky Patch! 🚂🔥 * **Change**: https://gerrit.wikimedia.org/r/713681 "Always apply DiscussionTools page transformations" (T273072, T280599) * *...
[21:51:54] Continuous-Integration-Infrastructure, cloud-services-team (Kanban): integration-agent-qemu-1001 in project integration has corrupted disk / partition - https://phabricator.wikimedia.org/T290615 (Krinkle) For the record, re-creating this instance using a newer base image is happening as part of {T284774}.
[22:22:59] Release-Engineering-Team (Yak Shaving 🐃🪒), mwcli, User-Addshore: mwcli: Automate upload of new version to a releases server - https://phabricator.wikimedia.org/T290335 (Addshore) Open→Resolved Magic CI was magic https://gitlab.wikimedia.org/releng/cli/-/pipelines/535 https://gitlab.wikimedia...
[22:44:36] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), Cloud-VPS, Patch-For-Review, cloud-services-team (Kanban): Support Cinder for CI docker workers - https://phabricator.wikimedia.org/T277078 (Krinkle) Oops, yeah, I got those name parts backward. Sorry about that. The int...
[22:53:39] addshore: I see the magic phrase "-dind" in that build log.
[22:53:50] Phabricator, Release-Engineering-Team (Yak Shaving 🐃🪒), User-brennen: Dockerize our Phabricator development environment - https://phabricator.wikimedia.org/T245575 (brennen) > I see that the Phabricator development had stopped and also that we are moving towards GitLab. So is this task still needed?...
[22:54:41] I'm guessing that's how it manages to be so fast since it's not in another VM and thus gets to leverage docker cache. As I understand it, we considered that for CI and it was rejected for security reasons as we weren't comfortable exposing dind on a wmcs instance in the integration project directly.
[22:56:55] some light details at T250808
[22:56:55] T250808: Decide how to run a test involving docker inside WMF CI - https://phabricator.wikimedia.org/T250808
[22:58:36] probably not a big deal in the short-term if it can't reach other integration instances and if the job in question is only triggerable by you or other trusted people
[23:21:31] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), Cloud-VPS, Patch-For-Review, cloud-services-team (Kanban): Support Cinder for CI docker workers - https://phabricator.wikimedia.org/T277078 (Bstorm) That is the functional equivalent of applying profile::wmcs::lvm. You ca...
[23:45:07] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), Cloud-VPS, Patch-For-Review, cloud-services-team (Kanban): Support Cinder for CI docker workers - https://phabricator.wikimedia.org/T277078 (Krinkle) Ah okay, no problem. I've applied it now via the role as well (rather t...
[23:46:27] Continuous-Integration-Infrastructure, Performance-Team, Patch-For-Review: Provide one or more Qemu agents in CI that use a newer version than 2.x - https://phabricator.wikimedia.org/T284774 (Krinkle) OK. qemu-1003 is now up in the same shape as qemu-1001 and qemu-1002 were, although with a smaller e...