[11:06:28] netops, Infrastructure-Foundations, serviceops, Kubernetes: Reimage one of the wikikube-worker1240 to wikikube-worker1304 node in eqiad as a replacement for wikikube-ctrl1001 - https://phabricator.wikimedia.org/T379790#10330555 (JMeybohm) Beware of {T380142}
[11:41:22] netops, Infrastructure-Foundations, serviceops, Kubernetes: Reimage one of the wikikube-worker1240 to wikikube-worker1304 node in eqiad as a replacement for wikikube-ctrl1001 - https://phabricator.wikimedia.org/T379790#10330660 (cmooney) >>! In T379790#10322697, @akosiaris wrote: > Cool, thanks....
[15:38:48] FIRING: PuppetZeroResources: Puppet has failed generate resources on cp7001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[15:39:00] sukhe: ^^ is that you?
[15:39:01] yesss
[15:39:04] yep :)
[15:39:09] intentionally firing
[15:39:10] col
[15:39:11] cool
[15:39:25] we got rid of those back in the day because they were kinda noisy :)
[15:39:41] are those alerting on a single occurrence or after a few failures in a row?
[15:39:53] vgutierrez: https://phabricator.wikimedia.org/T379807
[15:40:06] we need to alert on these in case of hardware failure checks
[15:40:20] and in general, since last time it was failing over 12 hours and no alert was sent out
[15:40:32] we can adjust the window
[15:41:54] Traffic, Observability-Alerting, SRE: PuppetFailure alert is not being fired for host(s) where agent has failed - https://phabricator.wikimedia.org/T379807#10331748 (ssingh) Open→Resolved a:ssingh ` 10:38:48 < jinxer-wm> FIRING: PuppetZeroResources: Puppet has failed generate resource...
[15:43:18] sukhe: cool
[15:43:44] netops, Infrastructure-Foundations, netbox, Patch-For-Review: Netbox: librenms report errors - https://phabricator.wikimedia.org/T379907#10331760 (joanna_borun) p:Triage→Medium
[15:48:51] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 2 others: Decom prod infra side of the ulsfo-office link - https://phabricator.wikimedia.org/T379778#10331780 (cmooney) p:Triage→Medium
[15:51:50] netops, Infrastructure-Foundations, serviceops, Kubernetes: Reimage one of the wikikube-worker1240 to wikikube-worker1304 node in eqiad as a replacement for wikikube-ctrl1001 - https://phabricator.wikimedia.org/T379790#10331792 (cmooney) p:Triage→Medium
[15:53:48] RESOLVED: PuppetZeroResources: Puppet has failed generate resources on cp7001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[16:03:49] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 2 others: Decom prod infra side of the ulsfo-office link - https://phabricator.wikimedia.org/T379778#10331908 (RobH)
[16:50:54] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 2 others: Decom prod infra side of the ulsfo-office link - https://phabricator.wikimedia.org/T379778#10332241 (RobH) a:RobH→None
[16:53:21] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 2 others: Decom prod infra side of the ulsfo-office link - https://phabricator.wikimedia.org/T379778#10332236 (RobH) Open→Resolved a:RobH @wiki_willy: I just wanted to notify you of this task's resolution and you'll see th...
[17:07:13] Traffic, serviceops: Handling inbound IPIP traffic on low traffic LVS k8s based realservers - https://phabricator.wikimedia.org/T352956#10332388 (JMeybohm) I see that I did not put this here, sorry.
In the IPIP mail thread we suggested to set a fixed, smaller MTU for all Pod traffic in order to not have...
[17:24:00] Traffic, serviceops: Handling inbound IPIP traffic on low traffic LVS k8s based realservers - https://phabricator.wikimedia.org/T352956#10332637 (Vgutierrez) as mentioned on the email thread that sounds like viable option for us
[17:28:41] Traffic, Prod-Kubernetes, serviceops, Kubernetes: Handling inbound IPIP traffic on low traffic LVS k8s based realservers - https://phabricator.wikimedia.org/T352956#10332691 (JMeybohm)
[17:50:28] Traffic, envoy, serviceops, SRE: Upgrade Envoy to >= 1.24 - https://phabricator.wikimedia.org/T380211 (JMeybohm) NEW
[17:50:42] Traffic, envoy, serviceops, SRE: Upgrade Envoy to >= 1.24 - https://phabricator.wikimedia.org/T380211#10332849 (JMeybohm)
[17:50:49] Traffic, envoy, serviceops, SRE, Patch-For-Review: Upgrade Envoy to supported version - https://phabricator.wikimedia.org/T300324#10332850 (JMeybohm)
[21:36:51] Traffic, Patch-For-Review: Upgrade Varnish from 6.0.11 to 6.0.13 - https://phabricator.wikimedia.org/T379699#10333861 (BCornwall) In progress→Declined We're skipping this version - the CVEs aren't relevant to our deployment and we'll instead work towards moving to 7.x.