[10:52:55] Traffic: Review haproxy captured header lengths - https://phabricator.wikimedia.org/T360415 (Fabfur) NEW
[10:54:40] Traffic, Patch-For-Review: Benthos: add specific unit tests for different logs - https://phabricator.wikimedia.org/T359626#9641408 (Fabfur) Open→Resolved This is fixed
[10:54:44] Traffic, Data Products, Data-Engineering, Observability-Logging, Patch-For-Review: Move analytics log from Varnish to HAProxy - https://phabricator.wikimedia.org/T351117#9641410 (Fabfur)
[10:55:10] Traffic, Data Products, Data-Engineering, Observability-Logging, Patch-For-Review: Move analytics log from Varnish to HAProxy - https://phabricator.wikimedia.org/T351117#9641413 (Fabfur)
[10:56:17] Traffic, Data-Engineering, Observability-Logging, Patch-For-Review: Install new Benthos instance on cp hosts - https://phabricator.wikimedia.org/T358109#9641411 (Fabfur) Open→In progress p:Triage→Medium
[10:58:53] Traffic, Data-Engineering, Observability-Logging, Patch-For-Review: Benthos: better management for unparsable logs - https://phabricator.wikimedia.org/T359627#9641417 (Fabfur) p:Triage→Low Even without metrics generation, this has been fixed with a small processing on the input side. Lea...
[13:02:28] Traffic, collaboration-services, Data-Persistence, DC-Ops, and 5 others: ☂️ Northward Datacentre Switchover (March 2024) - https://phabricator.wikimedia.org/T357547#9641765 (jijiki)
[13:37:10] Traffic, DC-Ops, ops-esams: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430 (RobH) NEW
[13:37:22] Traffic, DC-Ops, ops-esams: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9641914 (RobH)
[13:39:05] Traffic, DC-Ops, ops-esams: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9641925 (RobH)
[13:57:06] netops, DC-Ops, Infrastructure-Foundations: Take advantage of 10Gb NICs in the new network stack - https://phabricator.wikimedia.org/T360297#9642004 (elukey) For the ML hosts - our K8s clusters don't currently require 10G bandwidth, and at the time we didn't want to "waste" 10G ports if not really ne...
[13:59:33] netops, DC-Ops, Infrastructure-Foundations: Take advantage of 10Gb NICs in the new network stack - https://phabricator.wikimedia.org/T360297#9642027 (ayounsi) {F42751975} {F42751976} Feel free to test it on Netbox next The steps to follow once this script is deployed : # (Optional) Upgrade idrac...
[14:04:30] netops, DC-Ops, Infrastructure-Foundations: Take advantage of 10Gb NICs in the new network stack - https://phabricator.wikimedia.org/T360297#9642060 (ayounsi) > For the ML hosts - our K8s clusters don't currently require 10G bandwidth, and at the time we didn't want to "waste" 10G ports if not really...
[14:20:37] Traffic, collaboration-services, Data-Persistence, DC-Ops, and 5 others: ☂️ Northward Datacentre Switchover (March 2024) - https://phabricator.wikimedia.org/T357547#9642157 (ops-monitoring-bot) jiji@cumin1002 - Cookbook cookbooks.sre.discovery.datacenter depool all services in codfw: Northward DC...
[14:25:18] Traffic, DC-Ops, ops-esams: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9642233 (RobH) a:RobH
[14:25:32] Traffic, DC-Ops, ops-esams: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9642243 (RobH) Chatted with @ssingh as I had neglected some items we had discussed previously: * Adjusted this from a single installation window to 2 windows, 1 week apart, falling on Wednesday. ** All...
[14:38:12] netops, Infrastructure-Foundations, ops-codfw, SRE: Decom asw-a-codfw switch stack - https://phabricator.wikimedia.org/T358244#9642302 (Papaul) Zeroize done on asw-a3 and asw-a4
[14:40:47] Traffic, collaboration-services, Data-Persistence, DC-Ops, and 4 others: ☂️ Northward Datacentre Switchover (March 2024) - https://phabricator.wikimedia.org/T357547#9642359 (ops-monitoring-bot) jiji@cumin1002 - Cookbook cookbooks.sre.discovery.datacenter depool all services in codfw: Northward DC...
[15:29:07] Traffic, DC-Ops, ops-esams, SRE: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9642678 (ssingh) Hi @RobH: Thanks for creating the task. In some further discussion with @BBlack today, we decided that we will do the following: - We have decided that we will depool esams p...
[15:32:09] Traffic, DC-Ops, ops-esams, SRE: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9642721 (RobH)
[15:33:45] Traffic, DC-Ops, ops-esams, SRE: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9642727 (RobH) Remote hands won't have any ability to power down a host other than by pressing the front power button. It would reduce potential complexity if we power down all the hosts for t...
[15:41:21] Traffic, DC-Ops, ops-esams, SRE: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9642791 (ssingh) >>! In T360430#9642721, @RobH wrote: > Remote hands won't have any ability to power down a host other than by pressing the front power button. It would reduce potential comple...
[15:47:01] netops, Infrastructure-Foundations, ops-codfw, SRE: Decom asw-a-codfw switch stack - https://phabricator.wikimedia.org/T358244#9642812 (Papaul) Zeroize done on all the old switches in role a
[15:51:42] Traffic, Data-Engineering, Observability-Logging: Add $schema key to Benthos payload - https://phabricator.wikimedia.org/T360450 (Fabfur) NEW
[15:56:05] Traffic, collaboration-services, Data-Persistence, DC-Ops, and 5 others: ☂️ Northward Datacentre Switchover (March 2024) - https://phabricator.wikimedia.org/T357547#9642861 (akosiaris) We had to repool kartotherian in codfw as we had a [CPU exhaustion event](https://grafana.wikimedia.org/d/000000...
[16:08:31] netops, Infrastructure-Foundations, ops-codfw, SRE, Patch-For-Review: Decom asw-a-codfw switch stack - https://phabricator.wikimedia.org/T358244#9642913 (Papaul)
[16:11:56] Traffic, collaboration-services, Data-Persistence, DC-Ops, and 5 others: ☂️ Northward Datacentre Switchover (March 2024) - https://phabricator.wikimedia.org/T357547#9642921 (Clement_Goubert) Some tweaking of replicas was needed on mw-on-k8s, which was expected as this is the first switchover wher...
[16:25:32] netops, Infrastructure-Foundations, ops-codfw, SRE, Patch-For-Review: Decom asw-a-codfw switch stack - https://phabricator.wikimedia.org/T358244#9642974 (Papaul)
[16:31:09] netops, Infrastructure-Foundations, ops-codfw, SRE: Decom asw-a-codfw switch stack - https://phabricator.wikimedia.org/T358244#9642996 (Papaul)
[16:33:59] Traffic, Data-Engineering, Observability-Logging, Patch-For-Review: Add $schema key to Benthos payload - https://phabricator.wikimedia.org/T360450#9643007 (gmodena) For context: this is the approach we follow with other producers, e.g. [Java](https://gerrit.wikimedia.org/r/plugins/gitiles/wikimed...
[16:37:03] Traffic, Data-Engineering, Observability-Logging, Event-Platform, Patch-For-Review: Add $schema key to Benthos payload - https://phabricator.wikimedia.org/T360450#9643030 (gmodena)
[16:41:18] netops, DC-Ops, Infrastructure-Foundations: Take advantage of 10Gb NICs in the new network stack - https://phabricator.wikimedia.org/T360297#9643042 (wiki_willy) Hi @elukey - do you want me to change the Lift Wing expansion requests for 16x servers in FY24-25 to 10g? Thanks, Willy
[17:06:27] Traffic, Data-Engineering, Observability-Logging, Patch-For-Review: Install new Benthos instance on cp hosts - https://phabricator.wikimedia.org/T358109#9643142 (Fabfur)
[17:07:51] Traffic, Data-Engineering, Observability-Logging, Event-Platform, Patch-For-Review: Add $schema key to Benthos payload - https://phabricator.wikimedia.org/T360450#9643140 (Fabfur) Open→Resolved p:Triage→Low
[17:09:54] Traffic, Data-Engineering, Observability-Logging: Better Benthos performances - https://phabricator.wikimedia.org/T360454 (Fabfur) NEW
[17:18:01] Traffic, Patch-For-Review: Simplify maintenance of DNS/NTP hosts to reduce toil around reboots, reimages, and other work - https://phabricator.wikimedia.org/T347054#9643222 (ssingh) >>! In T347054#9584958, @Volans wrote: > Even from itself? As in what happens if an operator runs `authdns-updat...
[17:30:01] Traffic, DC-Ops, ops-esams, SRE: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9643269 (ssingh) Rob, once the time/data is confirmed, please let me know here or on IRC and I will send an email to sre@. Thanks!
[20:47:19] netops, Infrastructure-Foundations, ops-codfw, SRE: Decom asw-a-codfw switch stack - https://phabricator.wikimedia.org/T358244#9644060 (Papaul)
[23:26:33] Traffic: Review haproxy captured header lengths - https://phabricator.wikimedia.org/T360415#9644363 (Fabfur) Open→Resolved p:Triage→Low
[23:26:36] Traffic, Data Products, Data-Engineering, Observability-Logging, Patch-For-Review: Move analytics log from Varnish to HAProxy - https://phabricator.wikimedia.org/T351117#9644365 (Fabfur)
[23:27:06] Traffic, Data-Engineering, Observability-Logging, Patch-For-Review: Benthos: better management for unparsable logs - https://phabricator.wikimedia.org/T359627#9644377 (Fabfur) In progress→Resolved
[23:27:18] Traffic, Data Products, Data-Engineering, Observability-Logging, Patch-For-Review: Move analytics log from Varnish to HAProxy - https://phabricator.wikimedia.org/T351117#9644378 (Fabfur)