[02:44:21] netops, Infrastructure-Foundations, observability, SRE: LibreNMS reporting no routes learnt from doh/durum Anycast peers at various POPs - https://phabricator.wikimedia.org/T384258#10502803 (andrea.denisse) Hi @cmooney, I was reviewing the [[ https://github.com/librenms/librenms/releases/tag/25....
[03:18:36] netops, Infrastructure-Foundations, observability, SRE, SRE Observability (FY2024/2025-Q3): LibreNMS reporting no routes learnt from doh/durum Anycast peers at various POPs - https://phabricator.wikimedia.org/T384258#10502812 (andrea.denisse) Open→Resolved After upgrading to v25.1...
[08:28:10] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (90.03%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[08:32:48] ^ that's temporary, will resolve later in the day or tomorrow (due to VM migrations for the bookworm updates)
[08:33:10] RESOLVED: GanetiMemoryPressure: Ganeti: High memory usage (90.03%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[08:56:23] k
[09:39:31] netops, Infrastructure-Foundations, observability, SRE, SRE Observability (FY2024/2025-Q3): LibreNMS reporting no routes learnt from doh/durum Anycast peers at various POPs - https://phabricator.wikimedia.org/T384258#10503188 (cmooney) >>! In T384258#10502812, @andrea.denisse wrote: > Aft...
[09:57:10] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (90.1%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[10:02:10] RESOLVED: GanetiMemoryPressure: Ganeti: High memory usage (90.1%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[10:53:18] netops, Infrastructure-Foundations, ops-magru: Jan 2025 - Magru core router connectivity blips - https://phabricator.wikimedia.org/T384774#10503562 (cmooney) Open→Resolved Gonna close this one, all is stable after ~24h.
[10:58:10] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (90.02%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[11:03:10] RESOLVED: GanetiMemoryPressure: Ganeti: High memory usage (90.02%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[11:27:10] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (90.11%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[11:32:10] RESOLVED: GanetiMemoryPressure: Ganeti: High memory usage (90.11%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[11:35:40] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Productionize gnmic network telemetry pipeline - https://phabricator.wikimedia.org/T369384#10503694 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=36d26c8a-4d30-4345-8682-54b6b4882e38) set by cmooney@cumin1002 for 3:00:...
[11:57:10] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (90.06%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[12:02:10] RESOLVED: GanetiMemoryPressure: Ganeti: High memory usage (90.06%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[12:27:10] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (90.01%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[12:32:10] RESOLVED: GanetiMemoryPressure: Ganeti: High memory usage (90.01%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[12:57:10] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (90.03%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[13:02:10] RESOLVED: GanetiMemoryPressure: Ganeti: High memory usage (90.08%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[13:14:51] Mail, collaboration-services, Infrastructure-Foundations, vrts: Mailserver refusing emails sent through VRTS due to too large headers - https://phabricator.wikimedia.org/T380696#10503987 (Aklapper) @jhathaway: Did you receive some reply from their postmaster?
[13:18:54] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Productionize gnmic network telemetry pipeline - https://phabricator.wikimedia.org/T369384#10504014 (cmooney) Moving to //event-value-tag-v2// has been pushed out to all our Netflow VMs and we've seen a nice reduction in CPU usage, plus a...
[13:27:10] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (90%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[13:32:10] RESOLVED: GanetiMemoryPressure: Ganeti: High memory usage (90.13%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[13:57:10] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (90.08%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[14:02:10] RESOLVED: GanetiMemoryPressure: Ganeti: High memory usage (90%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[14:10:10] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (90.05%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[15:02:14] Mail, collaboration-services, Infrastructure-Foundations, vrts: Mailserver refusing emails sent through VRTS due to too large headers - https://phabricator.wikimedia.org/T380696#10504785 (jhathaway) unfortunately not.
[15:11:31] netops, Infrastructure-Foundations, observability, Prod-Kubernetes, and 3 others: Prevent BGP alerts triggering when K8s host maintenance is being done - https://phabricator.wikimedia.org/T384731#10504843 (lmata)
[15:37:58] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: Dec 2024: cr3-ulsfo errors on et-0/0/0 link from cr4 - https://phabricator.wikimedia.org/T384288#10505023 (RobH)
[16:06:46] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: Dec 2024: cr3-ulsfo errors on et-0/0/0 link from cr4 - https://phabricator.wikimedia.org/T384288#10505148 (RobH) Picked this back up, it had gotten neglected due to not being assigned to me and not having the ops-ulsfo tag and I shou...
[16:06:54] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: Dec 2024: cr3-ulsfo errors on et-0/0/0 link from cr4 - https://phabricator.wikimedia.org/T384288#10505158 (RobH) a:cmooney→RobH
[17:00:37] Mail, Infrastructure-Foundations: Set up dual-stack ECDSA/RSA certificate support for Exim - https://phabricator.wikimedia.org/T385067 (BCornwall) NEW
[17:00:51] Mail, Infrastructure-Foundations: Set up dual-stack ECDSA/RSA certificate support for Exim - https://phabricator.wikimedia.org/T385067#10505414 (BCornwall)
[17:22:09] > Glad this was helpful! It's great to see you building in public—it makes it easier to offer support. Don't hesitate to come back if you encounter another bottleneck or need improvements.
[17:22:17] -- https://github.com/openconfig/gnmic/issues/588#issuecomment-2622127368
[17:48:27] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: Dec 2024: cr3-ulsfo errors on et-0/0/0 link from cr4 - https://phabricator.wikimedia.org/T384288#10505646 (RobH) Remote hands 01020815 scheduled for 2025-02-04 @ 0800 Pacific (1600 GMT).
[18:10:10] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (90.42%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[18:26:26] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: Check link from msw1-eqiad et-0/1/0 to msw2-eqiad et-0/1/0 - https://phabricator.wikimedia.org/T384708#10505845 (Papaul) Replaced the optic on the msw2 side
[19:17:55] FIRING: MaxConntrack: Max conntrack at 81.09% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[19:22:55] RESOLVED: MaxConntrack: Max conntrack at 82.31% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[19:39:12] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: Check link from msw1-eqiad et-0/1/0 to msw2-eqiad et-0/1/0 - https://phabricator.wikimedia.org/T384708#10506123 (cmooney) >>! In T384708#10505845, @Papaul wrote: > Replaced the optic on the msw2 side Cool, looks ok so far but will...
[20:36:01] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Extend sre.network.configure-switch-interfaces cookbook to add sflow and qos config - https://phabricator.wikimedia.org/T379549#10506260 (cmooney) The above patch I believe will do what we need. Needs some testing I will work with dc-ops...
[21:12:56] FIRING: MaxConntrack: Max conntrack at 82.28% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[21:17:55] RESOLVED: MaxConntrack: Max conntrack at 82.28% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[21:35:55] FIRING: MaxConntrack: Max conntrack at 81.97% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[21:40:55] RESOLVED: MaxConntrack: Max conntrack at 81.85% on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[22:10:10] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (90.43%) on ganeti2019:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[23:25:33] SRE-tools, Infrastructure-Foundations: Support creating phab tasks in wmflib.phabricator - https://phabricator.wikimedia.org/T366470#10506845 (Aklapper) > Unfortunately wmflib currently only supports creating comments. I guess this is about expanding the `transactions` handling for the `self._client.man...
[23:30:20] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Extend sre.network.configure-switch-interfaces cookbook to add sflow and qos config - https://phabricator.wikimedia.org/T379549#10506853 (cmooney) As a test I ran this for an existing host that had been configured with the current live co...