[00:00:06] \o/ don't we all love beta cluster
[00:00:44] hahahahah no
[00:00:57] (PS4) BryanDavis: dockerfiles: update commit-message-validator [integration/config] - https://gerrit.wikimedia.org/r/823193 (https://phabricator.wikimedia.org/T315159)
[00:01:22] I do have feelings about beta cluster
[00:01:36] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, Wikidata, and 3 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (TheresNoTime) [2022-08-17T00:49:52Z, #wikimedia-releng, @thcipriani] ` what I've ma...
[00:01:46] (CR) BryanDavis: dockerfiles: update commit-message-validator (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/823193 (https://phabricator.wikimedia.org/T315159) (owner: BryanDavis)
[00:04:26] * TheresNoTime started https://wikitech.wikimedia.org/wiki/Incidents/2022-08-16_Beta_Cluster_502 just in case this became a whole *thing*
[00:04:42] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, Wikidata, and 3 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (TheresNoTime) p:Unbreak!→Medium
[00:05:03] maybe not needed now, but I guess worth it for fun (tm)
[00:06:11] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, Wikidata, and 3 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (ori) Follow-up items to get the Puppet repo on deployment-puppetmaster04 in good shape: * The tw...
[00:06:22] thcipriani: https://phabricator.wikimedia.org/T315350#8159977 (not it)
[00:06:32] kid's bedtime, bbl
[00:06:35] :)
[00:07:44] thanks for all the work zabe TheresNoTime and ori <3
[00:09:26] no worries, I just did a lot of pointing and waiting for smart people to figure it out
[00:11:47] that's like 90% of my job :)
[00:12:26] Beta-Cluster-Infrastructure: Remove two cherry-picked reverts from deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315394 (TheresNoTime)
[00:12:37] Beta-Cluster-Infrastructure: Remove two cherry-picked reverts from deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315394 (TheresNoTime)
[00:12:45] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, Wikidata, and 3 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (TheresNoTime)
[00:13:45] Beta-Cluster-Infrastructure: Rebase & merge or re-cherry-pick 668701 on deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315395 (TheresNoTime)
[00:13:56] Beta-Cluster-Infrastructure: Rebase & merge or re-cherry-pick 668701 on deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315395 (TheresNoTime)
[00:14:04] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, Wikidata, and 3 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (TheresNoTime)
[00:19:46] Beta-Cluster-Infrastructure: (Beta cluster) Running logspam-watch on deployment-mwlog01 gives repeated `Use of uninitialized value $host` errors - https://phabricator.wikimedia.org/T315379 (Zabe) Open→Resolved a:Zabe We got puppet to run again. And it seems like this got fixed with that.
[00:19:55] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, Wikidata, and 3 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (Zabe)
[01:01:38] Beta-Cluster-Infrastructure: Rebase & merge or re-cherry-pick 668701 on deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315395 (TheresNoTime) a:TheresNoTime
[01:51:08] (CR) Jforrester: [C: +1] "LGTM. Anything left, or should we merged/build-deploy/update?" [integration/config] - https://gerrit.wikimedia.org/r/823193 (https://phabricator.wikimedia.org/T315159) (owner: BryanDavis)
[06:27:13] (CR) Hashar: [C: -1] "We would probably want to make it "experimental" first then ensure it passes on every single extensions it is applied to. As Matej point" [integration/config] - https://gerrit.wikimedia.org/r/797319 (owner: Jforrester)
[07:09:20] Release-Engineering-Team, Gerrit (Gerrit 3.4): Upgrade Gerrit to 3.4.5 - https://phabricator.wikimedia.org/T315408 (hashar)
[07:09:39] Release-Engineering-Team, Gerrit (Gerrit 3.4): Upgrade Gerrit to 3.4.5 - https://phabricator.wikimedia.org/T315408 (hashar)
[07:13:14] (PS1) Hashar: Merge tag 'v3.4.5' into wmf/stable-3.4 [software/gerrit] (wmf/stable-3.4) - https://gerrit.wikimedia.org/r/824122
[07:13:33] (PS2) Hashar: Merge tag 'v3.4.5' into wmf/stable-3.4 [software/gerrit] (wmf/stable-3.4) - https://gerrit.wikimedia.org/r/824122 (https://phabricator.wikimedia.org/T315408)
[07:20:56] (PS1) Hashar: [WMF] update javamelody plugin [software/gerrit] (wmf/stable-3.4) - https://gerrit.wikimedia.org/r/824124
[07:23:21] (CR) Hashar: [C: +2] Merge tag 'v3.4.5' into wmf/stable-3.4 [software/gerrit] (wmf/stable-3.4) - https://gerrit.wikimedia.org/r/824122 (https://phabricator.wikimedia.org/T315408) (owner: Hashar)
[07:34:31] Beta-Cluster-Infrastructure, Epic: 502 errors on beta cluster - https://phabricator.wikimedia.org/T312253 (TheresNoTime) Stalled→Invalid I like tracking tasks (even if they are [[ https://www.mediawiki.org/wiki/Phabricator/Project_management/Tracking_tasks | somewhat frowned upon ]] now..), but t...
[07:34:33] (Merged) jenkins-bot: Merge tag 'v3.4.5' into wmf/stable-3.4 [software/gerrit] (wmf/stable-3.4) - https://gerrit.wikimedia.org/r/824122 (https://phabricator.wikimedia.org/T315408) (owner: Hashar)
[07:46:50] (CR) Hashar: [C: +2] [WMF] update javamelody plugin [software/gerrit] (wmf/stable-3.4) - https://gerrit.wikimedia.org/r/824124 (owner: Hashar)
[07:55:02] (Merged) jenkins-bot: [WMF] update javamelody plugin [software/gerrit] (wmf/stable-3.4) - https://gerrit.wikimedia.org/r/824124 (owner: Hashar)
[08:14:02] (PS1) Hashar: Gerrit v3.4.5 and rebuild plugins [software/gerrit] (deploy/wmf/stable-3.4) - https://gerrit.wikimedia.org/r/824134 (https://phabricator.wikimedia.org/T315408)
[08:14:22] (CR) CI reject: [V: -1] Gerrit v3.4.5 and rebuild plugins [software/gerrit] (deploy/wmf/stable-3.4) - https://gerrit.wikimedia.org/r/824134 (https://phabricator.wikimedia.org/T315408) (owner: Hashar)
[08:20:13] (CR) Hashar: "recheck" [software/gerrit] (deploy/wmf/stable-3.4) - https://gerrit.wikimedia.org/r/824134 (https://phabricator.wikimedia.org/T315408) (owner: Hashar)
[08:25:08] (PS1) Jaime Nuche: bootstrap: add script for master bootstrapping [tools/scap] - https://gerrit.wikimedia.org/r/824135
[08:28:00] (PS2) Jaime Nuche: bootstrap: add script for master bootstrapping [tools/scap] - https://gerrit.wikimedia.org/r/824135
[08:44:21] (CR) Jaime Nuche: [C: +2] bootstrap: add script for master bootstrapping [tools/scap] - https://gerrit.wikimedia.org/r/824135 (owner: Jaime Nuche)
[08:48:39] (Merged) jenkins-bot: bootstrap: add script for master bootstrapping [tools/scap] - https://gerrit.wikimedia.org/r/824135 (owner: Jaime Nuche)
[08:53:22] (PS1) Jaime Nuche: bootstrap: fix permissions of master bootstrap script [tools/scap] - https://gerrit.wikimedia.org/r/824141
[08:54:18] (CR) Hashar: [C: +2] Gerrit v3.4.5 and rebuild plugins [software/gerrit] (deploy/wmf/stable-3.4) - https://gerrit.wikimedia.org/r/824134 (https://phabricator.wikimedia.org/T315408) (owner: Hashar)
[08:54:40] (Merged) jenkins-bot: Gerrit v3.4.5 and rebuild plugins [software/gerrit] (deploy/wmf/stable-3.4) - https://gerrit.wikimedia.org/r/824134 (https://phabricator.wikimedia.org/T315408) (owner: Hashar)
[09:00:16] (CR) Jaime Nuche: [C: +2] bootstrap: fix permissions of master bootstrap script [tools/scap] - https://gerrit.wikimedia.org/r/824141 (owner: Jaime Nuche)
[09:07:06] (Merged) jenkins-bot: bootstrap: fix permissions of master bootstrap script [tools/scap] - https://gerrit.wikimedia.org/r/824141 (owner: Jaime Nuche)
[09:15:58] Release-Engineering-Team, Gerrit (Gerrit 3.4), Patch-For-Review: Upgrade Gerrit to 3.4.5 - https://phabricator.wikimedia.org/T315408 (hashar) Open→Resolved a:hashar I have updated both Gerrit including the plugins: ` Name Version Api-Version Status File...
[09:25:11] Beta-Cluster-Infrastructure, Patch-For-Review: Remove two cherry-picked reverts from deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315394 (jbond) > https://gerrit.wikimedia.org/r/c/operations/puppet/+/823638, What error do you see without this revert? as far as i can tell it should be a...
[09:32:10] (PS2) Hashar: zuul: Fix/remove links to non-existent Grafana graphs [integration/docroot] - https://gerrit.wikimedia.org/r/810979 (https://phabricator.wikimedia.org/T307405) (owner: Stang)
[09:34:20] (CR) Hashar: [C: +1] "Thanks for the links updates! I have amended the change following Timo suggestions:" [integration/docroot] - https://gerrit.wikimedia.org/r/810979 (https://phabricator.wikimedia.org/T307405) (owner: Stang)
[09:37:12] (CR) Hashar: dockerfiles: update commit-message-validator (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/823193 (https://phabricator.wikimedia.org/T315159) (owner: BryanDavis)
[09:37:36] (PS5) Hashar: dockerfiles: update commit-message-validator [integration/config] - https://gerrit.wikimedia.org/r/823193 (https://phabricator.wikimedia.org/T315159) (owner: BryanDavis)
[09:38:02] (CR) Hashar: [C: +2] dockerfiles: update commit-message-validator [integration/config] - https://gerrit.wikimedia.org/r/823193 (https://phabricator.wikimedia.org/T315159) (owner: BryanDavis)
[09:40:54] (Merged) jenkins-bot: dockerfiles: update commit-message-validator [integration/config] - https://gerrit.wikimedia.org/r/823193 (https://phabricator.wikimedia.org/T315159) (owner: BryanDavis)
[09:48:13] Beta-Cluster-Infrastructure, Patch-For-Review: Remove two cherry-picked reverts from deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315394 (jbond) >>! In T315394#8160811, @jbond wrote: >> https://gerrit.wikimedia.org/r/c/operations/puppet/+/823638, > What error do you see without this rev...
[09:55:58] Project-Admins: Create project tag for - https://phabricator.wikimedia.org/T315424 (hashar)
[09:56:12] Project-Admins: Create project tag for M1 mac support - https://phabricator.wikimedia.org/T315424 (hashar)
[10:06:03] Beta-Cluster-Infrastructure, Patch-For-Review: Rebase & merge or re-cherry-pick 668701 on deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315395 (jbond) p:Triage→Medium
[10:06:51] Continuous-Integration-Infrastructure, Release-Engineering-Team: Developer productivity: arm64 versions of CI docker images - https://phabricator.wikimedia.org/T315286 (hashar) This got previously requested via {T274140} and declined: >>! In T274140#6832335, `@MoritzMuehlenhoff` wrote: > This can't be e...
[10:07:30] Beta-Cluster-Infrastructure, Patch-For-Review: Rebase & merge or re-cherry-pick 668701 on deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315395 (TheresNoTime) a:TheresNoTime→jbond
[10:26:55] Release-Engineering-Team, docker-pkg: docker-pkg / docker downloads all versions of parent image upon building - https://phabricator.wikimedia.org/T310458 (hashar) Looks like contint hosts have docker-pkg 3.0.2, 3.0.3 is the latest: Giuseppe Lavagetto (3): * [66606ae] Builder: use the full image t...
[10:27:14] !log Built image docker-registry.discovery.wmnet/releng/commit-message-validator:1.0.0 # T315159
[10:27:16] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[10:27:16] T315159: Update CI for commit-message-validator 1.0.0 - https://phabricator.wikimedia.org/T315159
[10:28:40] (PS1) Hashar: jjb: update commit-message-validator to 1.0.0 [integration/config] - https://gerrit.wikimedia.org/r/824157 (https://phabricator.wikimedia.org/T315159)
[10:29:16] (CR) Hashar: [C: +2] "I have updated the job" [integration/config] - https://gerrit.wikimedia.org/r/824157 (https://phabricator.wikimedia.org/T315159) (owner: Hashar)
[10:29:49] Continuous-Integration-Config, Patch-For-Review, User-bd808: Update CI for commit-message-validator 1.0.0 - https://phabricator.wikimedia.org/T315159 (hashar) Open→Resolved Job updated, thank you @bd808!
[10:31:15] (Merged) jenkins-bot: jjb: update commit-message-validator to 1.0.0 [integration/config] - https://gerrit.wikimedia.org/r/824157 (https://phabricator.wikimedia.org/T315159) (owner: Hashar)
[10:43:57] Continuous-Integration-Infrastructure, Release-Engineering-Team: Developer productivity: arm64 versions of CI docker images - https://phabricator.wikimedia.org/T315286 (Clement_Goubert) Open→Declined Understood. I actually did get the testing suite installed locally, and the runtime is perfectly...
[10:51:24] Beta-Cluster-Infrastructure, Patch-For-Review: Rebase & merge or re-cherry-pick 668701 on deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315395 (jbond) >>! In T315395#8161054, @gerritbot wrote: > Change 824158 had a related patch set uploaded (by Jbond; author: jbond): > %%%[operations/pu...
[11:12:50] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, Wikidata, and 3 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (jnuche) Hi there. JFYI, after the changes early this morning in `deployment-puppetmaster04` Pupp...
[11:21:43] jnuche: hi
[11:22:08] RhinosF1: 👋
[11:22:26] jnuche: why wasn't that patch on? It's merged right?
[11:23:21] yeah, it's merged, not sure how we normally maintain that repo THB
[11:23:26] *TBH
[11:26:42] jnuche: it should show the same as puppet repo
[11:26:47] I hate beta after yesterday
[11:27:27] I'm going to look at TheresNoTime's IR because it took so much longer than it should because there were that many other errors
[11:28:17] s/IR/barely an IR draft
[11:29:27] RhinosF1: beta is a massive pain indeed
[11:29:33] TheresNoTime: yes I plan to make an IR
[11:30:32] btw. I never wrote it anywhere, but I fixed the original 'certificate verify failed (certificate revoked)' puppet failure by brutally generating new certificates for puppetmaster following https://wikitech.wikimedia.org/wiki/Help:Standalone_puppetmaster
[11:31:06] hope that was fine, but considering it worked it should not be a problem I guess
[11:48:14] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, Wikidata, and 3 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (jbond) >>! In T315350#8161180, @jnuche wrote: > Hi there. JFYI, after the changes early this mo...
[12:14:09] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, Wikidata, and 3 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (jnuche) > Just to confirm puppet was failing on deployment-deploy03 not on deployment-puppetmast...
[12:17:57] zabe: ye sure, can you stick a note on ^
[12:18:11] So I don't forget when I do an IR for TheresNoTime
[12:35:33] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, Wikidata, and 3 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (jbond) > Correct, deployment-deploy03 is the beta deployment server. The failures started right...
[12:39:03] (PS1) Hashar: Merge tag 'v3.5.2' into wmf/stable-3.5 [software/gerrit] (wmf/stable-3.5) - https://gerrit.wikimedia.org/r/824196 (https://phabricator.wikimedia.org/T307334)
[12:46:42] (CR) CI reject: [V: -1] Merge tag 'v3.5.2' into wmf/stable-3.5 [software/gerrit] (wmf/stable-3.5) - https://gerrit.wikimedia.org/r/824196 (https://phabricator.wikimedia.org/T307334) (owner: Hashar)
[12:51:12] (PS2) Hashar: Merge tag 'v3.5.2' into wmf/stable-3.5 [software/gerrit] (wmf/stable-3.5) - https://gerrit.wikimedia.org/r/824196 (https://phabricator.wikimedia.org/T307334)
[13:05:20] (PS1) Hashar: Gerrit v3.5.2 and rebuild plugins [software/gerrit] (deploy/wmf/stable-3.5) - https://gerrit.wikimedia.org/r/824200 (https://phabricator.wikimedia.org/T307334)
[13:09:34] (PS2) Hashar: Gerrit v3.5.2 and rebuild plugins [software/gerrit] (deploy/wmf/stable-3.5) - https://gerrit.wikimedia.org/r/824200 (https://phabricator.wikimedia.org/T307334)
[13:10:38] Release-Engineering-Team (Next), Gerrit (Gerrit 3.5), Patch-For-Review: Upgrade to Gerrit 3.5 - https://phabricator.wikimedia.org/T307334 (hashar) a:hashar
[13:13:05] RhinosF1, sure, wrote something up
[13:14:15] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, Wikidata, and 3 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (Zabe) Ok, lemme try to quickly summarize what happened and what was done. Some cloudvirts hosts...
[13:23:12] zabe: i've expanded https://wikitech.wikimedia.org/wiki/Incidents/2022-08-16_Beta_Cluster_502#Actionables
[13:23:28] i think it needs timeline and probably your comment copying across
[13:24:38] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, Wikidata, and 4 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (RhinosF1) https://wikitech.wikimedia.org/wiki/Incidents/2022-08-16_Beta_Cluster_502
[13:26:00] Beta-Cluster-Infrastructure, Infrastructure-Foundations, SRE, SRE-OnFire, and 3 others: Evaluation Error on deployment-cache-text06 puppet run - https://phabricator.wikimedia.org/T315351 (RhinosF1)
[13:27:02] Beta-Cluster-Infrastructure, SRE-OnFire, Sustainability (Incident Followup): (Beta cluster) Running logspam-watch on deployment-mwlog01 gives repeated `Use of uninitialized value $host` errors - https://phabricator.wikimedia.org/T315379 (RhinosF1)
[13:29:08] Beta-Cluster-Infrastructure, SRE-OnFire, Patch-For-Review, Sustainability (Incident Followup): Remove two cherry-picked reverts from deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315394 (RhinosF1)
[13:29:24] Beta-Cluster-Infrastructure, SRE-OnFire, Patch-For-Review, Sustainability (Incident Followup): Rebase & merge or re-cherry-pick 668701 on deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315395 (RhinosF1)
[13:30:19] Beta-Cluster-Infrastructure, serviceops: Serve beta cluster via PHP 7.4 by default - https://phabricator.wikimedia.org/T306042 (Joe) Open→In progress a:Joe
[13:30:41] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, SRE-OnFire, and 5 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (RhinosF1)
[13:32:16] Release-Engineering-Team (Next), Gerrit (Gerrit 3.5), Patch-For-Review: Upgrade to Gerrit 3.5 - https://phabricator.wikimedia.org/T307334 (hashar)
[13:33:11] Beta-Cluster-Infrastructure, Release-Engineering-Team (Radar), Code-Stewardship-Reviews: deployment-prep: Code stewardship request - https://phabricator.wikimedia.org/T215217 (RhinosF1) Beta was down for 8 hours yesterday due to the mess it is - see https://wikitech.wikimedia.org/wiki/Incidents/2022-...
[13:33:35] Project-Admins: Create project tag for - https://phabricator.wikimedia.org/T314406 (Aklapper) > In either case, using 'eventstreams' will be confusing. [off-topic] See also https://wikitech.wikimedia.org/wiki/Event* and https://www.mediawiki.org/wiki/Naming_things ;)
[13:33:48] Project-Admins: Create project tag for - https://phabricator.wikimedia.org/T314406 (Aklapper) >>! In T314406#8127937, @JArguello-WMF wrote: > I'll create a subproject in #Event-Platform with an adequate name to avoid confusion. @JArguello-WMF: https://phabricator.wikimedia.org/project/manage...
[13:34:43] Release-Engineering-Team (Next), Gerrit (Gerrit 3.5), Patch-For-Review: Upgrade to Gerrit 3.5 - https://phabricator.wikimedia.org/T307334 (hashar)
[13:35:29] Krinkle: can you help with https://wikitech.wikimedia.org/wiki/Incidents/2022-08-16_Beta_Cluster_502
[13:39:19] Release-Engineering-Team (Next), Gerrit (Gerrit 3.5), Patch-For-Review: Upgrade to Gerrit 3.5 - https://phabricator.wikimedia.org/T307334 (hashar)
[13:40:08] Release-Engineering-Team (Next), Gerrit (Gerrit 3.5), Patch-For-Review: Upgrade to Gerrit 3.5 - https://phabricator.wikimedia.org/T307334 (hashar)
[13:43:04] Scap: `scap backport` should include phabricator task in SAL messages - https://phabricator.wikimedia.org/T315444 (taavi)
[13:44:47] (CR) Hashar: "Booted it locally:" [software/gerrit] (deploy/wmf/stable-3.5) - https://gerrit.wikimedia.org/r/824200 (https://phabricator.wikimedia.org/T307334) (owner: Hashar)
[13:46:27] RhinosF1: what specifically would you like help with?
[13:47:16] Krinkle: if it looks right for the start of an IR
[13:47:32] i also don't want to copy too much across and sound too harsh of beta
[13:49:56] RhinosF1: hm.. there's several SREs involved with the puppet changes there, some of them might be able to help first. There's an incident meeting tomorrow I believe, perhaps ask lmata to take a look as well to help make sure the right people are aware and can help determine what happened.
[13:50:17] ok
[13:54:46] Release-Engineering-Team (Next), Gerrit (Gerrit 3.5): Update Gerrit CI result table CSS style - https://phabricator.wikimedia.org/T315445 (hashar)
[14:02:16] Beta-Cluster-Infrastructure, SRE-OnFire, Patch-For-Review, Sustainability (Incident Followup): Remove two cherry-picked reverts from deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315394 (jbond) > > https://gerrit.wikimedia.org/r/c/operations/puppet/+/823638 > https://gerrit.wi...
[14:07:49] zabe: ori: i just sent an update to https://phabricator.wikimedia.org/T315394#8161855. i think if we can resolve this rebase error everything will be working again on deployment-prep. i'll be around for a few hours so let me know if i can help
[14:10:57] Project-Admins: Create project tag for M1 mac support - https://phabricator.wikimedia.org/T315424 (Jdforrester-WMF) We need a better name than M1 given M2s are already out; `Apple Silicon support`?
[14:14:33] Project-Admins: Create project tag for M1 mac support - https://phabricator.wikimedia.org/T315424 (taavi) Or just generalize to something like `arm64 support`?
[14:16:01] Project-Admins: Create project tag for M1 mac support - https://phabricator.wikimedia.org/T315424 (hashar) Good point! I went with `M1` since that is what folks referred to and are most likely to look for when using tags autocompletion. Your suggestion is great for the tag name and I guess we can add aliases...
[14:17:32] taavi: Heh, snap. I'd half written a comment to that extent :)
[14:17:48] great minds think alike
[14:18:07] As arm in the datacentre is becoming more common
[14:18:12] rpi's are getting more powerful too
[14:18:28] !log fix merge conflicts in deployment-prep private repo # T315394
[14:18:30] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[14:18:30] T315394: Remove two cherry-picked reverts from deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315394
[14:19:54] jbond, git-sync-upstream seems to now be running without errors
[14:20:02] zabe: awesome :)
[14:20:47] thanks and let me know if there is anything else i can help with
[14:28:17] Beta-Cluster-Infrastructure, Infrastructure-Foundations, SRE, SRE-OnFire, and 3 others: Evaluation Error on deployment-cache-text06 puppet run - https://phabricator.wikimedia.org/T315351 (Zabe) Open→Resolved Some cherry-picks made by ori made puppet run again, see T315394 for follow-up.
[14:28:23] Nice work everyone
[14:28:27] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, SRE-OnFire, and 5 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (Zabe)
[14:35:15] GitLab (Infrastructure), Data-Persistence-Backup, serviceops, serviceops-collab, and 2 others: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (Jelto) #### New partman config The new partman config on `gitlab2003` increased the size of the backup volume: ` vg-root...
[14:38:31] Project-Admins: Create project tag for Apple Silicon support - https://phabricator.wikimedia.org/T315424 (hashar)
[14:41:27] hello, are you aware that integration-agent-docker-1036 and 1039 seem to be failing puppet according to our internal monitoring system?
[14:46:28] Beta-Cluster-Infrastructure, Release-Engineering-Team, Discovery-Search, SRE-OnFire, and 5 others: Known, Beta cluster Error: 502, Next Hop Connection Failed - https://phabricator.wikimedia.org/T315350 (jbond)
[14:46:37] Beta-Cluster-Infrastructure, SRE-OnFire, Sustainability (Incident Followup): Remove two cherry-picked reverts from deployment-puppetmaster04 - https://phabricator.wikimedia.org/T315394 (jbond) Open→Resolved a:jbond resolving as i think this is all resolved now but please reopen if not
[14:48:05] Beta-Cluster-Infrastructure: Spam/Spambot registration on deployment.beta is out of hand - https://phabricator.wikimedia.org/T187046 (TheresNoTime) Open→Declined Old domain, "Domain not configured"
[14:49:46] Beta-Cluster-Infrastructure: "The database is read-only until replication lag decreases" when saving preferences on beta - https://phabricator.wikimedia.org/T247617 (TheresNoTime) Open→Resolved
[14:51:14] taavi: I don't know if that is a known issue. Would you mind filing a phab task?
[14:53:15] Beta-Cluster-Infrastructure, MediaWiki-Authentication-and-authorization, MediaWiki-extensions-CentralAuth: "Loss of session data" on Beta Cluster - https://phabricator.wikimedia.org/T172560 (TheresNoTime)
[14:53:20] Beta-Cluster-Infrastructure, Beta-Cluster-reproducible: Login session bug on Beta Commons - https://phabricator.wikimedia.org/T186133 (TheresNoTime)
[15:02:22] Continuous-Integration-Infrastructure, Release-Engineering-Team, Patch-For-Review: Docker is not running on contint2001 - https://phabricator.wikimedia.org/T313119 (hashar) Open→Resolved Ran Puppet: ` Notice: /Stage[main]/Profile::Ci::Docker/Service[docker]/enable: enable changed 'false' to '...
[15:29:14] Release-Engineering-Team (Bonus Level 🕹️): Delete wmf branches from Gerrit repositories - https://phabricator.wikimedia.org/T303828 (thcipriani)
[15:29:17] Release-Engineering-Team (Bonus Level 🕹️), Scap: `scap backport` should include phabricator task in SAL messages - https://phabricator.wikimedia.org/T315444 (thcipriani)
[15:29:20] Release-Engineering-Team (Bonus Level 🕹️), Scap: Scap backport: Notify on irc when change has been deployed to mwdebug - https://phabricator.wikimedia.org/T314613 (thcipriani)
[15:29:34] Release-Engineering-Team (Bonus Level 🕹️), Scap: scap: add progress reporting to php-fpm-restarts - https://phabricator.wikimedia.org/T302631 (thcipriani)
[15:29:38] Release-Engineering-Team (Bonus Level 🕹️), Scap, MW-1.37-notes (1.37.0-wmf.3; 2021-04-27), Patch-For-Review, and 2 others: Localisation cache must be purged after or during train deploy, not (just) before - https://phabricator.wikimedia.org/T263872 (thcipriani)
[15:29:43] Release-Engineering-Team (Bonus Level 🕹️), Scap, SRE, Python3-Porting: git-fat needs to be ported to Python 3 - https://phabricator.wikimedia.org/T279509 (thcipriani)
[15:38:37] Release-Engineering-Team: Weekly train branch cut job should wait until Jenkins has merged the mediawiki/core branch commit - https://phabricator.wikimedia.org/T315452 (dancy)
[15:39:21] Release-Engineering-Team (Bonus Level 🕹️): Weekly train branch cut job should wait until Jenkins has merged the mediawiki/core branch commit - https://phabricator.wikimedia.org/T315452 (thcipriani)
[15:44:54] Project-Admins: Create project tag for Apple Silicon support - https://phabricator.wikimedia.org/T315424 (Jdforrester-WMF) >>! In T315424#8161902, @taavi wrote: > Or just generalize to something like `arm64 support`? Are we actually want to support the whole range of ARMv8 architectures? Will there be confl...
[16:20:15] Release-Engineering-Team (Bonus Level 🕹️), Scap: scap: add progress reporting to php-fpm-restarts - https://phabricator.wikimedia.org/T302631 (dancy) a:dancy→None
[16:20:35] Release-Engineering-Team (Bonus Level 🕹️), Scap, MW-1.37-notes (1.37.0-wmf.3; 2021-04-27), Patch-For-Review, and 2 others: Localisation cache must be purged after or during train deploy, not (just) before - https://phabricator.wikimedia.org/T263872 (dancy) a:dancy→None
[16:25:03] Release-Engineering-Team (Bonus Level 🕹️), Scap: scap: add progress reporting to php-fpm-restarts - https://phabricator.wikimedia.org/T302631 (dancy)
[16:36:17] that was a quick meeting :)
[16:39:14] The best kind
[16:45:44] sorry
[17:07:27] Project-Admins: Create project tag for Apple Silicon support - https://phabricator.wikimedia.org/T315424 (bd808) >>! In T315424#8162188, @Jdforrester-WMF wrote: > Are we actually want to support the whole range of ARMv8 architectures? Will there be conflicts between different use cases? There might be chip...
[18:20:04] (PS1) Ahmon Dancy: Add utils.subprocess_check_run_quietly_if_ok() [tools/scap] - https://gerrit.wikimedia.org/r/824255
[18:22:14] (PS2) Ahmon Dancy: Add utils.subprocess_check_run_quietly_if_ok() [tools/scap] - https://gerrit.wikimedia.org/r/824255
[18:36:36] (CR) Ahmon Dancy: [C: +2] Add utils.subprocess_check_run_quietly_if_ok() [tools/scap] - https://gerrit.wikimedia.org/r/824255 (owner: Ahmon Dancy)
[18:40:39] (PS1) Ahmon Dancy: Clearly distinguish the remaining and elapsed time values [integration/zuul] (patch-queue/debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/824259
[18:41:38] (Merged) jenkins-bot: Add utils.subprocess_check_run_quietly_if_ok() [tools/scap] - https://gerrit.wikimedia.org/r/824255 (owner: Ahmon Dancy)
[18:42:30] (PS1) Ahmon Dancy: Add utils.get_current_train_info() [tools/scap] - https://gerrit.wikimedia.org/r/824260
[18:59:38] Phabricator, Project-Admins, User-RhinosF1, affects-Miraheze: Split Miraheze-Linked column of #user-rhinosf1 to new affects-Miraheze project and update herald - https://phabricator.wikimedia.org/T315160 (Aklapper) Open→Resolved Created H406 called `Add #affects-miraheze [T315160; T240987]...
[19:00:29] Phabricator, Project-Admins, User-RhinosF1, affects-Miraheze: Split Miraheze-Linked column of #user-rhinosf1 to new affects-Miraheze project and update herald - https://phabricator.wikimedia.org/T315160 (RhinosF1) Thanks for the help!
[19:41:36] (CR) Jeena Huneidi: Add utils.get_current_train_info() (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/824260 (owner: Ahmon Dancy)
[19:46:11] (CR) Ahmon Dancy: Add utils.get_current_train_info() (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/824260 (owner: Ahmon Dancy)
[20:19:53] (CR) Jeena Huneidi: [C: +2] "LGTM, tested in train-dev" [tools/scap] - https://gerrit.wikimedia.org/r/824260 (owner: Ahmon Dancy)
[20:24:39] (Merged) jenkins-bot: Add utils.get_current_train_info() [tools/scap] - https://gerrit.wikimedia.org/r/824260 (owner: Ahmon Dancy)
[20:39:48] (PS1) Ahmon Dancy: Use train-blockers.toolforge for scap stage-train auto information [tools/scap] - https://gerrit.wikimedia.org/r/824288 (https://phabricator.wikimedia.org/T310395)
[22:13:47] (CR) Stang: zuul: Fix/remove links to non-existent Grafana graphs (1 comment) [integration/docroot] - https://gerrit.wikimedia.org/r/810979 (https://phabricator.wikimedia.org/T307405) (owner: Stang)
[22:26:44] (CR) Brennen Bearnes: [C: +1] Use train-blockers.toolforge for scap stage-train auto information [tools/scap] - https://gerrit.wikimedia.org/r/824288 (https://phabricator.wikimedia.org/T310395) (owner: Ahmon Dancy)
[22:43:48] Phabricator, Release-Engineering-Team (Bonus Level 🕹️), serviceops, serviceops-collab, Patch-For-Review: Setup rsync for phab data on disk - https://phabricator.wikimedia.org/T313360 (Dzahn) >>! In T313360#8159924, @Dzahn wrote: > regarding the UIDs.. user 'phd' has a reserved UID of 498. per...
[22:57:17] GitLab (Infrastructure), Data-Persistence-Backup, serviceops, serviceops-collab, and 2 others: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (thcipriani) @LSobanski and I talked about predicting storage space of GitLab as we grow. We've learned that storage growth is much differe...
[23:39:27] (PS2) Ahmon Dancy: Use train-blockers.toolforge for scap stage-train auto information [tools/scap] - https://gerrit.wikimedia.org/r/824288 (https://phabricator.wikimedia.org/T310395)
[23:56:06] Release-Engineering-Team (Next), tech-decision-forum, Code-Stewardship-Reviews, Documentation, User-AKlapper: Document checklist steps to undeploy / sunset a codebase on WMF servers (not: archiving) - https://phabricator.wikimedia.org/T294329 (LNguyen) >>! In T294329#8073743, @Aklapper wrote:...
[23:57:09] Release-Engineering-Team (Next), Code-Stewardship-Reviews, Documentation, User-AKlapper: Document checklist steps to undeploy / sunset a codebase on WMF servers (not: archiving) - https://phabricator.wikimedia.org/T294329 (LNguyen)