[10:58:34] serviceops, ops-codfw: Degraded RAID on mw2442 - https://phabricator.wikimedia.org/T357380#9575876 (JMeybohm) The new disk was not detected by the host, even after scsi scan (maybe that's not a thing anymore? ;)) Anyhow. I rebooted the node and it did not came back up. Powercycling again with console att...
[11:16:56] serviceops, ops-codfw: Degraded RAID on mw2442 - https://phabricator.wikimedia.org/T357380#9575944 (JMeybohm) @MatthewVernon pointed out (thanks) that this could have helped (if done before the reboot obviously): https://wikitech.wikimedia.org/wiki/Swift/How_To#Replacing_a_disk_without_touching_the_ring...
[11:19:11] serviceops, ops-codfw: Degraded RAID on mw2442 - https://phabricator.wikimedia.org/T357380#9575965 (MatthewVernon) After the reboot, you could still have made the new virtual drive with the last of those lines: ` megacli -CfgEachDskRaid0 WB RA Direct CachedBadBBU -a0 `
[11:20:12] serviceops, ops-codfw: Degraded RAID on mw2442 - https://phabricator.wikimedia.org/T357380#9575966 (JMeybohm) >>! In T357380#9575965, @MatthewVernon wrote: > After the reboot, you could still have made the new virtual drive with the last of those lines: > ` > megacli -CfgEachDskRaid0 WB RA Direct CachedB...
[12:05:13] serviceops, Similarusers: Remove similar-users service from k8s - https://phabricator.wikimedia.org/T345274#9576104 (Tchanders) @Joe I can approve an undeployment. I think this will need a fair amount of reworking if and when we pick it back up again, and it doesn't make sense to add to your maintenance...
[13:18:52] serviceops, ops-codfw: Degraded RAID on mw2442 - https://phabricator.wikimedia.org/T357380#9576213 (MoritzMuehlenhoff)
[13:20:02] serviceops, MW-on-K8s, SRE, Traffic, Release-Engineering-Team (Seen): Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536#9576217 (Clement_Goubert)
[13:20:12] serviceops, MW-on-K8s, Release-Engineering-Team, SRE, and 2 others: Move 50% of mediawiki external requests to mw on k8s - https://phabricator.wikimedia.org/T357507#9576216 (Clement_Goubert) Open→Resolved
[13:30:42] serviceops, MW-on-K8s, Release-Engineering-Team, SRE, Traffic: Move 50% of mediawiki external requests to mw on k8s - https://phabricator.wikimedia.org/T357507#9576241 (Clement_Goubert)
[13:31:09] serviceops, MW-on-K8s, Release-Engineering-Team, SRE, Traffic: Move 60% of mediawiki external requests to mw on k8s - https://phabricator.wikimedia.org/T357508#9576243 (Clement_Goubert) Stalled→In progress
[13:31:16] serviceops, MW-on-K8s, SRE, Scap, Release-Engineering-Team (Now this 🫠): Scap should check errors coming from mw-on-k8s canaries during deployments - https://phabricator.wikimedia.org/T357402#9576244 (Clement_Goubert)
[13:31:21] serviceops, MW-on-K8s, SRE, Traffic, Release-Engineering-Team (Seen): Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536#9576245 (Clement_Goubert)
[13:56:11] serviceops, DC-Ops, ops-codfw: mw2420-mw2451 do have unncecesarry raid controllers (configured - https://phabricator.wikimedia.org/T358489#9576345 (JMeybohm)
[13:57:25] serviceops, ops-codfw: Degraded RAID on mw2442 - https://phabricator.wikimedia.org/T357380#9576360 (JMeybohm) Open→Resolved a:JMeybohm T358489 as follow-up for the strange RAID config, resolving this one.
[13:58:07] serviceops, DC-Ops, ops-codfw: mw2420-mw2451 do have unncecesarry raid controllers (configured) - https://phabricator.wikimedia.org/T358489#9576368 (JMeybohm)
[14:00:48] serviceops, DC-Ops, ops-codfw: mw2420-mw2451 do have unncecesarry raid controllers (configured) - https://phabricator.wikimedia.org/T358489#9576376 (MatthewVernon) If you do decide you might want to reprovision these nodes as non-RAID, there is a [[ https://gerrit.wikimedia.org/r/plugins/gitiles/oper...
[14:09:43] serviceops, DC-Ops, ops-codfw: mw2420-mw2451 do have unncecesarry raid controllers (configured) - https://phabricator.wikimedia.org/T358489#9576400 (RobH) Moritz asked me about this, and I have some background. So orders placed in January 2023 via the dell portal for standard configs also included a...
[14:11:09] serviceops, DC-Ops, ops-codfw: mw2420-mw2451 do have unncecesarry raid controllers (configured) - https://phabricator.wikimedia.org/T358489#9576401 (RobH) I'm told there is a question on 'can we pull these raid controllers to use elsewhere' and the answer is 'no, or the host you remove it from has no...
[15:12:27] serviceops, DC-Ops, SRE, ops-codfw: mw2420-mw2451 do have unncecesarry raid controllers (configured) - https://phabricator.wikimedia.org/T358489#9576608 (JMeybohm) >>! In T358489#9576376, @MatthewVernon wrote: > If you do decide you might want to reprovision these nodes as non-RAID, there is a [[...
[15:14:42] serviceops, Patch-For-Review: Have internal MediaWiki to MediaWiki HTTP requests use an envoyproxy on appservers - https://phabricator.wikimedia.org/T298265#9576616 (Clement_Goubert) All wikis are now using the local envoy {F42149057} The Y-axis not starting at 0 is misleading, but the numbers show a qua...
[15:34:46] serviceops, DC-Ops, Data-Persistence, Traffic, and 4 others: ☂️ Northward Datacentre Switchover (March 2024) - https://phabricator.wikimedia.org/T357547#9576708 (joanna_borun)
[16:05:18] serviceops, Infrastructure-Foundations, Packaging, SRE: Package php-ast in {stretch,buster}-wikimedia/component - https://phabricator.wikimedia.org/T280210#9576817 (joanna_borun)
[16:05:26] serviceops, Infrastructure-Foundations, Packaging, SRE: Package php-ast in {stretch,buster}-wikimedia/component - https://phabricator.wikimedia.org/T280210#9576820 (joanna_borun) @Reedy is it still valid?
[16:25:28] serviceops, DC-Ops, SRE, ops-codfw: mw2420-mw2451 do have unncecesarry raid controllers (configured) - https://phabricator.wikimedia.org/T358489#9576996 (JMeybohm) >>! In T358489#9576608, @JMeybohm wrote: >>>! In T358489#9576376, @MatthewVernon wrote: >> If you do decide you might want to reprovi...
[16:48:57] serviceops: Have internal MediaWiki to MediaWiki HTTP requests use an envoyproxy on appservers - https://phabricator.wikimedia.org/T298265#9577116 (Clement_Goubert)
[16:49:08] serviceops, MW-on-K8s, SRE, Traffic, and 2 others: Migrate internal traffic to k8s - https://phabricator.wikimedia.org/T333120#9577115 (Clement_Goubert)
[16:50:32] serviceops, MW-on-K8s, SRE, Traffic, and 2 others: Migrate internal traffic to k8s - https://phabricator.wikimedia.org/T333120#8728141 (Clement_Goubert)
[16:50:43] serviceops: Have internal MediaWiki to MediaWiki HTTP requests use an envoyproxy on appservers - https://phabricator.wikimedia.org/T298265#9577145 (Clement_Goubert) Open→Resolved a:Clement_Goubert Not seeing obvious issues in logstash or graphs, considering resolved. Feel free to reopen if someth...
[17:32:12] serviceops, CommRel-Specialists-Support: CommRel support for Northward Datacentre Switchover (March 2024) - https://phabricator.wikimedia.org/T358233#9577389 (Trizek-WMF)
[17:32:34] serviceops, CommRel-Specialists-Support: CommRel support for Northward Datacentre Switchover (March 2024) - https://phabricator.wikimedia.org/T358233#9568155 (Trizek-WMF)
[17:37:44] serviceops, CommRel-Specialists-Support: CommRel support for Northward Datacentre Switchover (March 2024) - https://phabricator.wikimedia.org/T358233#9577423 (Trizek-WMF) Thank you @jijiki! We will start the process very soon.
[23:40:59] serviceops, Data-Persistence, Traffic, conftool: Switch conftool to use the version 3 etcd datastore - https://phabricator.wikimedia.org/T350565#9578631 (Scott_French) a:Scott_French