[01:45:38] serviceops, Datacenter-Switchover: Determine switchover changes for migration of video scaling to k8s - https://phabricator.wikimedia.org/T372849 (Scott_French) NEW
[01:47:05] serviceops, Datacenter-Switchover: Determine switchover changes for migration of video scaling to k8s - https://phabricator.wikimedia.org/T372849#10076128 (Scott_French)
[01:47:06] serviceops, Datacenter-Switchover: Southward Datacenter Switchover (September 2024) - https://phabricator.wikimedia.org/T370962#10076129 (Scott_French)
[08:37:04] serviceops, CX-cxserver, RESTBase Sunsetting, LPL Essential (LPL Essential 2024 Jul-Sep): Switchover plan from RESTBase to REST Gateway for cxserver - https://phabricator.wikimedia.org/T372753#10076461 (PWaigi-WMF)
[09:53:30] hnowlan i just checked and there is no mediawiki_http_requests_duration_count for mw-jobrunner
[09:53:32] so
[09:54:06] I guess we just don't produce that metric for them?
[10:00:24] huh, weird given that they're just appservers - but I would suppose those metrics don't get emitted by hitting rpc/RunSingleJob.php
[10:00:28] serviceops, Infrastructure-Foundations, netops, Traffic: weighted maglev viability for low-traffic services - https://phabricator.wikimedia.org/T368545#10076853 (Vgutierrez) A quick test using IPVS maglev implementation with mh-port flag enabled (to include the source port as part of the load bal...
[10:01:05] iirc they're computed from the log stream
[10:03:39] ah
[10:04:12] https://gerrit.wikimedia.org/g/operations/puppet/+/608b8b9ef76f270d26c615fc57b6cbbb0dc1a365/modules/profile/files/benthos/instances/mw_accesslog_metrics.yaml
[10:26:23] serviceops: Unexpected helmfile changes when attempting a k8s deployment for a miscweb site - https://phabricator.wikimedia.org/T372825#10076961 (Clement_Goubert) Changes to sidecar images are generally fine to deploy, if in doubt you can ask on IRC either in #wikimedia-operations or #wikimedia-serviceops an...
[11:22:37] serviceops, Infrastructure-Foundations, netops, SRE: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878 (Clement_Goubert) NEW
[11:25:26] serviceops, Infrastructure-Foundations, netops, SRE: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10077064 (Clement_Goubert) p:Triage→High
[11:35:45] serviceops, ChangeProp, MediaWiki-Core-HTTP-Cache, MediaWiki-Engineering, and 2 others: Reduce the number of resource_change and resource_purge events emitted due to template changes - https://phabricator.wikimedia.org/T369898#10077127 (daniel) Looking at metrics from LinksUpdate, it seems that w...
[11:41:51] serviceops, Infrastructure-Foundations, netops, SRE: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10077137 (Clement_Goubert) From what I can gather the automation is there with the `--move-vlan` option to the reimage cookbook, I th...
[12:20:11] serviceops, Infrastructure-Foundations, netops, SRE: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10077212 (ayounsi) > I need to check that the physical cabling changes are ok before we start Physical cabling is on the new switches...
[13:12:25] serviceops, DC-Ops, ops-eqiad, SRE: Q1:rack/setup/install wikikube-worker1240 to wikikube-worker1304 - https://phabricator.wikimedia.org/T369743#10077382 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1002 for host wikikube-worker1299.eqiad.wmnet with OS bullseye...
[13:12:47] serviceops, DC-Ops, ops-eqiad, SRE: Q1:rack/setup/install wikikube-worker1240 to wikikube-worker1304 - https://phabricator.wikimedia.org/T369743#10077383 (Jclark-ctr)
[14:11:58] serviceops, Content-Transform-Team-WIP, Page Content Service, Code-Health-Objective, Patch-For-Review: ptwiki: Use backing node service instead of RESTBase on pregeneration changeprop rules - https://phabricator.wikimedia.org/T372749#10077698 (Jgiannelos)
[14:12:27] serviceops, Content-Transform-Team-WIP, Page Content Service, RESTBase Sunsetting, and 2 others: ptwiki: Use backing node service instead of RESTBase on pregeneration changeprop rules - https://phabricator.wikimedia.org/T372749#10077701 (Jgiannelos)
[14:12:50] serviceops, Content-Transform-Team-WIP, Page Content Service, RESTBase Sunsetting, and 2 others: ptwiki: Use backing node service instead of RESTBase on pregeneration changeprop rules - https://phabricator.wikimedia.org/T372749#10077703 (Jgiannelos) a:Jgiannelos
[15:14:39] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10078004 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by cgoubert@cumin1002 from mw2291 to...
[15:15:17] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10078010 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host w...
[15:17:14] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10078014 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikik...
[15:28:38] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10078072 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host w...
[16:12:35] serviceops, Scap: RESTBase deployment fails with scap internal error - https://phabricator.wikimedia.org/T294936#10078303 (dancy) Resolved→Open
[16:16:06] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10078336 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikik...
[16:46:21] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T372916 (Clement_Goubert) NEW
[17:08:48] serviceops, Scap: RESTBase deployment fails with scap internal error - https://phabricator.wikimedia.org/T294936#10078634 (dancy) Open→Resolved
[18:23:48] serviceops, Deployments, Shellbox, Wikibase-Quality-Constraints, and 2 others: Burst of curlFactory cURL errors during mediawiki deployments - https://phabricator.wikimedia.org/T371633#10078899 (Krinkle) >>! @dancy wrote in the **task description**: > ` > CurlFactory:211 cURL error 56: Recv fail...
[18:23:49] serviceops, Deployments, Shellbox, Wikibase-Quality-Constraints, and 2 others: Burst of curlFactory cURL errors during mediawiki deployments - https://phabricator.wikimedia.org/T371633#10078904 (Krinkle)
[18:24:43] serviceops, Deployments, Shellbox, Wikibase-Quality-Constraints, and 2 others: Burst of curlFactory cURL errors during mediawiki deployments - https://phabricator.wikimedia.org/T371633#10078905 (Krinkle)
[18:25:20] serviceops, Deployments, Shellbox, Wikibase-Quality-Constraints, and 2 others: Burst of GuzzleHttp Exception for http://localhost:6025/call/constraint-regex-checker - https://phabricator.wikimedia.org/T371633#10078911 (Krinkle)
[20:01:32] huh, have I broken something?
[20:01:42] * kamila_ will look at benthos tomorrow
[20:43:49] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T372916#10079278 (Jhancock.wm) FYI, I'm getting an netbox reporting alert on mw2291. test_puppetdb_in_netbox 2024-08-20T20:40:02.169950+00:00 Failure — — expected...
[21:22:38] serviceops, DBA, SRE: In the aftermath of T370304: Brainstorming of short- and medium-term observability / quality-of-life production changes - https://phabricator.wikimedia.org/T372943 (CDanis) NEW
[21:22:54] serviceops, DBA, SRE: In the aftermath of T370304: Brainstorming of short- and medium-term observability / quality-of-life production changes - https://phabricator.wikimedia.org/T372943#10079364 (CDanis)