[00:00:35] (DatasourceError) firing: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[00:05:49] (DatasourceError) resolved: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[08:30:44] (CR) Hashar: "recheck to see whether "upstream" issue got resolved" [integration/quibble] - https://gerrit.wikimedia.org/r/952016 (owner: Krinkle)
[08:38:28] Continuous-Integration-Infrastructure, Cloud-VPS (Quota-requests): Request a flavor with elevated iops for integration cache storage - https://phabricator.wikimedia.org/T345925 (hashar) I have poked `#wikimedia-cloud-admin` to potentially get the flavor renamed to use a `4xiops` suffix rather than `integ...
[08:38:34] Continuous-Integration-Infrastructure, Release-Engineering-Team: Move castor instance to 4xiops disk flavor - https://phabricator.wikimedia.org/T345924 (hashar)
[08:38:36] Continuous-Integration-Infrastructure, Cloud-VPS (Quota-requests): Request a flavor with elevated iops for integration cache storage - https://phabricator.wikimedia.org/T345925 (hashar) Resolved→Open
[08:41:21] Continuous-Integration-Infrastructure, Release-Engineering-Team: Fix Puppet agent provisioning on Jenkins agent instances - https://phabricator.wikimedia.org/T341051 (hashar) Open→Declined I have removed the cherry pick from the integration Puppet master. I am bailing out on ensuring an initial p...
[08:41:26] Continuous-Integration-Infrastructure, Release-Engineering-Team, Cloud-VPS (Quota-requests): Rebuild WMCS integration instances to larger flavor - https://phabricator.wikimedia.org/T340070 (hashar)
[08:43:07] Continuous-Integration-Infrastructure, Patch-For-Review: Migrate all CI jobs from stretch to buster or later and drop stretch testing support - https://phabricator.wikimedia.org/T278203 (hashar)
[09:02:04] (CR) Hashar: [C: +2] "After rechecking I filed a build failure at T346604 only to find the job is now passing again. That is most probably T346253 fixed in Flo" [integration/quibble] - https://gerrit.wikimedia.org/r/952016 (owner: Krinkle)
[09:02:29] (CR) Hashar: [C: +2] "After rechecking I filed a build failure at T346604 only to find the job is now passing again. That is most probably T346253 fixed in Flo" [integration/quibble] - https://gerrit.wikimedia.org/r/903255 (owner: Hashar)
[09:18:20] GitLab, Release-Engineering-Team: GitLab email confirmation mail ends up in spam folder - https://phabricator.wikimedia.org/T346607 (taavi)
[09:18:24] GitLab (Pipeline Services Migration🐤), collaboration-services: move static-codereview.wikimedia.org to kubernetes - https://phabricator.wikimedia.org/T346309 (Jelto) Service is deployed to all wikikube clusters: ` curl -I --resolve static-codereview.wikimedia.org:30443:$(dig +short k8s-ingress-staging.d...
[09:18:37] GitLab (Pipeline Services Migration🐤), collaboration-services: move static-codereview.wikimedia.org to kubernetes - https://phabricator.wikimedia.org/T346309 (Jelto)
[09:19:35] (Merged) jenkins-bot: Remove MW_QUIBBLE_CI env variable [integration/quibble] - https://gerrit.wikimedia.org/r/952016 (owner: Krinkle)
[09:21:05] (Merged) jenkins-bot: Drop --run=all in favor of an empty list [integration/quibble] - https://gerrit.wikimedia.org/r/903255 (owner: Hashar)
[09:25:38] GitLab, Release-Engineering-Team: GitLab email confirmation mail ends up in spam folder - https://phabricator.wikimedia.org/T346607 (taavi)
[09:27:55] [Phab] FYI, I linked the Phabricator Logstash dashboard from https://wikitech.wikimedia.org/w/index.php?title=Phabricator&oldid=prev&diff=2112265
[09:29:41] Diffusion, Release-Engineering-Team, collaboration-services: Make https://git.wikimedia.org not redirect to Phabricator Diffusion - https://phabricator.wikimedia.org/T323073 (Aklapper) Given the situation I'd propose a disambiguation page (similar to https://issues.apache.org/ ) as there is no single...
[09:29:45] Diffusion, Release-Engineering-Team, collaboration-services: Make https://git.wikimedia.org not redirect to Phabricator Diffusion - https://phabricator.wikimedia.org/T323073 (Aklapper) Stalled→Open
[09:31:23] GitLab, Release-Engineering-Team, collaboration-services: GitLab email confirmation mail ends up in spam folder - https://phabricator.wikimedia.org/T346607 (Jelto) A bit more context: We had this issue some time ago and added SPF records in T328642.
[09:37:07] andre: +1 :)
[10:21:57] GitLab (Pipeline Services Migration🐤), Growth-Team, Image-Suggestion-API, Structured-Data-Backlog: Migrate image-suggestion-api to GitLab - https://phabricator.wikimedia.org/T344740 (Urbanecm_WMF) Moving to Triaged on our end; please do ping us if there is something we need to do to help you move...
[11:16:49] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, serviceops: Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536 (Clement_Goubert)
[11:28:49] Release-Engineering-Team (Priority Backlog 📥): Decrease number of open Phab tickets with assignee field set for more than two years (aka cookie licking) (Q3/2023 edition) - https://phabricator.wikimedia.org/T332676 (Aklapper)
[11:36:45] Beta-Cluster-Infrastructure, Citoid: citoid service doesn't work on beta - https://phabricator.wikimedia.org/T346624 (Mvolz)
[11:42:01] GitLab (Project Migration), collaboration-services: Migrate SRE repositories to GitLab - operations/software - https://phabricator.wikimedia.org/T341504 (LSobanski)
[12:05:15] Continuous-Integration-Config, translatewiki.net, ci-test-error (WMF-deployed Build Failure): GrowthExperiments CI fails on master: mwgate-node16-docker - https://phabricator.wikimedia.org/T346629 (Urbanecm_WMF)
[12:09:40] Continuous-Integration-Config, translatewiki.net, Patch-For-Review, ci-test-error (WMF-deployed Build Failure): GrowthExperiments CI fails on master: mwgate-node16-docker - https://phabricator.wikimedia.org/T346629 (Urbanecm_WMF) p:Triage→High This breaks merges to GrowthExperiments.
[12:11:56] Continuous-Integration-Config, GrowthExperiments-Mentorship, translatewiki.net, Growth-Team (Current Sprint), and 2 others: GrowthExperiments CI fails on master: mwgate-node16-docker - https://phabricator.wikimedia.org/T346629 (Urbanecm_WMF)
[12:12:12] Continuous-Integration-Config, GrowthExperiments-Mentorship, translatewiki.net, Growth-Team (Current Sprint), and 2 others: GrowthExperiments CI fails on master: mwgate-node16-docker - https://phabricator.wikimedia.org/T346629 (Urbanecm_WMF) https://translatewiki.net/w/i.php?title=MediaWiki:Growt...
[12:14:04] Continuous-Integration-Config, GrowthExperiments-Mentorship, translatewiki.net, Growth-Team (Current Sprint), and 2 others: GrowthExperiments CI fails on master: mwgate-node16-docker - https://phabricator.wikimedia.org/T346629 (Urbanecm_WMF) See https://gerrit.wikimedia.org/r/c/mediawiki/extensio...
[12:40:03] Phabricator, Patch-For-Review, User-Frostly, good first task: Wikimedia Phabricator emails display the "Phorge" name in the To field - https://phabricator.wikimedia.org/T345758 (MarcoAurelio) >>! In T345758#9169340, @Aklapper wrote: >>>! In T345758#9147321, @MarcoAurelio wrote: >> I can only find...
[13:18:12] Phabricator, Patch-For-Review, User-Frostly, good first task: Wikimedia Phabricator emails display the "Phorge" name in the To field - https://phabricator.wikimedia.org/T345758 (valerio.bozzolan) I've +1 the merge request, thanks (I have not any privilege to "Accept" or anything) >>! In T345758#...
[13:24:55] Phabricator, Patch-For-Review, User-Frostly, good first task: Wikimedia Phabricator emails display the "Phorge" name in the To field - https://phabricator.wikimedia.org/T345758 (valerio.bozzolan) On a side note, I'm sorry "we" (Phorge) had to change the default name, but Phabricator is a trademar...
[13:34:11] Continuous-Integration-Infrastructure, Cloud-VPS (Quota-requests): Request a flavor with elevated iops for integration cache storage - https://phabricator.wikimedia.org/T345925 (Andrew) The flavor name is created by a cookbook; I'm not sure that existing flavors can be easily renamed but I'll poke in the...
[13:40:47] Is there a GitLab equivalent to Gerrit's privilege policy? https://www.mediawiki.org/wiki/Gerrit/Privilege_policy
[13:42:35] One member of our team requested access to the Data Engineering group in GitLab, so I added them.
Another requested access to the analytics group in Gerrit, but I had to redirect him to the policy, which says that he must open a ticket and wait at least a week.
[13:43:39] I don't know if I've already violated an access policy for GitLab without knowing it, or whether I could have fast-tracked my colleague's Gerrit request to join our group.
[14:10:19] GitLab (Pipeline Services Migration🐤), collaboration-services, Patch-For-Review: move static-codereview.wikimedia.org to kubernetes - https://phabricator.wikimedia.org/T346309 (Jelto)
[14:16:25] GitLab (Pipeline Services Migration🐤), Metrics Platform Backlog, Data Products (Sprint 01), Patch-For-Review: Migrate metrics-platform repo to GitLab - https://phabricator.wikimedia.org/T344733 (CodeReviewBot) phuedx opened https://gitlab.wikimedia.org/repos/data-engineering/metrics-platform/-/me...
[14:23:30] Release-Engineering-Team, Scap, serviceops, Patch-For-Review: restbase deploys via scap lead to all hosts being disabled in conftool - https://phabricator.wikimedia.org/T346354 (CodeReviewBot) jnuche opened https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/187 deploy: backward comp...
[14:41:12] Phabricator, SRE, Security-Team, SecTeam-Processed, and 2 others: Require 2FA for members of acl*sre-team - https://phabricator.wikimedia.org/T328746 (sbassett) In progress→Resolved a:Reedy >>! In T328746#9171419, @RLazarus wrote: > I don't have edit access to #acl_security. Thanks f...
[14:41:56] Phabricator, SRE, Security-Team, SecTeam-Processed, and 2 others: Require 2FA for members of acl*sre-team - https://phabricator.wikimedia.org/T328746 (sbassett)
[14:51:26] Continuous-Integration-Config, GrowthExperiments-Mentorship, translatewiki.net, Growth-Team (Current Sprint), and 2 others: GrowthExperiments CI fails on master: mwgate-node16-docker - https://phabricator.wikimedia.org/T346629 (Amire80) That message has an invisible RLM character, and it should h...
[14:55:51] GitLab (Project Migration), collaboration-services: Migrate SRE repositories to GitLab - https://phabricator.wikimedia.org/T341468 (dancy)
[14:56:20] GitLab (Project Migration), Release-Engineering-Team (Escape Goats🐐), collaboration-services: Sync ldap/ops users to group on GitLab and define permissions for repos/sre - https://phabricator.wikimedia.org/T343035 (dancy) Open→Resolved
[15:06:05] Release-Engineering-Team (Seen), Quibble, Patch-For-Review: Update Quibble setuptools check to modern alternative - https://phabricator.wikimedia.org/T345093 (hashar) a:hashar
[15:37:10] GitLab, Release-Engineering-Team, collaboration-services: GitLab email confirmation mail ends up in spam folder - https://phabricator.wikimedia.org/T346607 (LSobanski) @jhathaway do we have a standard process for adding DKIM? Do you have any other thoughts on this?
[15:37:19] GitLab, Release-Engineering-Team, collaboration-services: GitLab email confirmation mail ends up in spam folder - https://phabricator.wikimedia.org/T346607 (LSobanski) p:Triage→Low
[15:41:19] Phabricator, Release-Engineering-Team, collaboration-services, Patch-For-Review, User-brennen: Schedule a routine Phabricator deployment window with downtimed alerting - https://phabricator.wikimedia.org/T346266 (LSobanski) p:Triage→Medium
[15:45:17] Phabricator, Release-Engineering-Team, collaboration-services, Patch-For-Review, User-brennen: Schedule a routine Phabricator deployment window with downtimed alerting - https://phabricator.wikimedia.org/T346266 (eoghan) @brennen While we figure out a more long-term way for you to silence ale...
[15:56:35] Phabricator: Phabricator Two-factor Authentication reset for Tommy_Kronkvist - https://phabricator.wikimedia.org/T346513 (Aklapper) Hi @Tommy_Kronkvist, I created a private paste at P52520 that you should be able to access.
[15:58:33] (Queue (Jenkins jobs + Zuul functions) alert) firing: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[16:00:03] GitLab (Pipeline Services Migration🐤), Growth-Team, Image-Suggestion-API: Migrate image-suggestion-api to GitLab - https://phabricator.wikimedia.org/T344740 (AUgolnikova-WMF)
[16:02:51] > firing:
[16:02:56] *mhm*
[16:03:17] (Zuul is very busy fwiw..)
[16:04:06] Release-Engineering-Team, Scap, serviceops, Patch-For-Review: restbase deploys via scap lead to all hosts being disabled in conftool - https://phabricator.wikimedia.org/T346354 (CodeReviewBot) jnuche merged https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/187 deploy: backward comp...
[16:05:53] PROBLEM - Work requests waiting in Zuul Gearman server on contint2002 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [400.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/d/000000322/zuul-gearman?orgId=1&viewPanel=10
[16:05:54] Phabricator, Release-Engineering-Team (Escape Goats🐐), Wikimedia-Phabricator-Extensions, User-brennen: Custom Phab Reports: "Unhandled Exception ("RuntimeException") Division by zero" - https://phabricator.wikimedia.org/T324319 (Aklapper)
[16:06:08] Phabricator, Release-Engineering-Team (Priority Backlog 📥), Wikimedia-Phabricator-Extensions, Patch-For-Review: Phabricator Project Reports have inaccurate counts for the age histogram - https://phabricator.wikimedia.org/T294998 (Aklapper)
[16:08:33] (Queue (Jenkins jobs + Zuul functions) alert) resolved: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[16:12:23] Gitlab-Application-Security-Pipeline, Security Team AppSec, Security-Team, SecTeam-Processed, Security: Implement a template for Shell scripting testing - https://phabricator.wikimedia.org/T346163 (sbassett)
[16:13:39] (Queue (Jenkins jobs + Zuul functions) alert) firing: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[16:21:01] Release-Engineering-Team, Scap, serviceops: restbase deploys via scap lead to all hosts being disabled in conftool - https://phabricator.wikimedia.org/T346354 (jnuche) Really sorry about this issue, I have just deployed a fix to production. @hnowlan, would it be possible for you to do another deploy...
[16:28:39] (Queue (Jenkins jobs + Zuul functions) alert) resolved: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[16:29:35] GitLab (Infrastructure), Release-Engineering-Team, collaboration-services: GitLab email confirmation mail ends up in spam folder - https://phabricator.wikimedia.org/T346607 (brennen)
[16:34:10] Phabricator, Release-Engineering-Team, collaboration-services, Patch-For-Review, User-brennen: Schedule a routine Phabricator deployment window with downtimed alerting - https://phabricator.wikimedia.org/T346266 (brennen) > @brennen While we figure out a more long-term way for you to silence...
[16:39:00] i am messing about on the beta cluster (trying to cherry-pick an unmerged patch), i hope this doesn't send alerts or anything
[16:42:09] RECOVERY - Work requests waiting in Zuul Gearman server on contint2002 is OK: OK: Less than 100.00% above the threshold [200.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/d/000000322/zuul-gearman?orgId=1&viewPanel=10
[17:12:42] Beta-Cluster-Infrastructure, Release-Engineering-Team (Priority Backlog 📥), Scap: Replace most of beta::autoupdater with scap prep auto - https://phabricator.wikimedia.org/T299163 (dancy) a:dancy→None
[17:13:20] Release-Engineering-Team (Priority Backlog 📥): Decrease number of open Phab tickets with assignee field set for more than two years (aka cookie licking) (Q3/2023 edition) - https://phabricator.wikimedia.org/T332676 (Aklapper) Open→Stalled
[17:13:49] GitLab (Project Migration), Release-Engineering-Team (Escape Goats🐐), Epic, User-brennen: Migrate mediawiki/ namespace from Gerrit to GitLab - https://phabricator.wikimedia.org/T335921 (dancy)
[17:14:35] GitLab (Auth & Access), Release-Engineering-Team, collaboration-services, Patch-For-Review, User-brennen: Create bot to sync LDAP groups with related GitLab groups - 
https://phabricator.wikimedia.org/T319211 (dancy) In progress→Resolved This is running in production.
[17:14:41] GitLab (Pipeline Services Migration🐤), Web-Team-Backlog, Wikimedia-Portals: Migrate Wikimedia Portals to GitLab - https://phabricator.wikimedia.org/T344743 (Jdlrobson) a:Jdrewniak
[17:14:45] Release-Engineering-Team (Bonus Level 🕹️), Scap, Sustainability (Incident Followup): Localisation cache must be purged after or during train deploy, not (just) before - https://phabricator.wikimedia.org/T263872 (dancy) a:dancy→None
[17:18:05] Release-Engineering-Team (Priority Backlog 📥): Decrease number of open Phab tickets with assignee field set for more than two years (aka cookie licking) (Q3/2023 edition) - https://phabricator.wikimedia.org/T332676 (Aklapper) First batch of emails sent to the email addresses registered in Phabricator of 186...
[17:21:06] Continuous-Integration-Infrastructure, Release-Engineering-Team (Yak Shaving 🐃🪒), SRE: Have linters/tests results show up as comments in files on gerrit - https://phabricator.wikimedia.org/T209149 (kostajh) a:kostajh→None
[17:21:58] Continuous-Integration-Infrastructure, Release-Engineering-Team (Yak Shaving 🐃🪒), SRE: Have linters/tests results show up as comments in files on gerrit - https://phabricator.wikimedia.org/T209149 (kostajh) https://wikitech.wikimedia.org/wiki/Tool:Fix_Suggester_Bot got some of the way there, but I do...
[18:12:52] Deployments, Release-Engineering-Team (Priority Backlog 📥): L10n cache files building up on backup deploy hosts - https://phabricator.wikimedia.org/T275826 (dancy) Open→Resolved I took a look at the deployment server today and I don't see any accumulation of l10n files so the weekly cleanup that...
[18:12:57] Deployments, Release-Engineering-Team (Radar), SRE-Sprint-Week-Sustainability-March2023, serviceops-radar, and 2 others: Remove provisioning for 'mwscript', 'foreachwikiindblist' etc from deployment host - https://phabricator.wikimedia.org/T253822 (dancy) a:dancy→None
[19:10:17] Release-Engineering-Team, API Platform, AQS2.0, Platform Engineering, and 4 others: Define a procedure/pattern to populate test environments - https://phabricator.wikimedia.org/T334851 (Jrbranaa)
[19:12:36] Release-Engineering-Team, API Platform, AQS2.0, Platform Engineering, and 4 others: Define a procedure/pattern to populate test environments - https://phabricator.wikimedia.org/T334851 (Jrbranaa) Added Catalyst tag for test environment need/requirement awareness.
[19:50:39] Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments, User-brennen: 1.41.0-wmf.27 deployment blockers - https://phabricator.wikimedia.org/T345888 (KSarabia-WMF)
[22:13:40] Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments, User-brennen: 1.41.0-wmf.27 deployment blockers - https://phabricator.wikimedia.org/T345888 (Jdlrobson)