[04:26:07] netops, Infrastructure-Foundations, SRE, ops-codfw: Connect two hosts in codfw row A/B for switch migration testing - https://phabricator.wikimedia.org/T345803 (Papaul) @cmooney @Jhancock.wm checked the server, no IP address set on it and she did reset it but it didn't resolve the issue. I asked...
[07:31:29] netops, Infrastructure-Foundations, observability: librenms.syslog table size - https://phabricator.wikimedia.org/T349362 (Marostegui)
[07:31:36] netops, Infrastructure-Foundations, observability: librenms.syslog table size - https://phabricator.wikimedia.org/T349362 (Marostegui) p:Triage→High
[07:33:20] netops, Infrastructure-Foundations, observability: librenms.syslog table size - https://phabricator.wikimedia.org/T349362 (jcrespo) @andrea.denisse This is something I mentioned to you some time ago and promised to raise it with your team.
[08:03:11] netops, Infrastructure-Foundations, SRE, observability: librenms.syslog table size - https://phabricator.wikimedia.org/T349362 (ayounsi) Haha yeah indeed! In theory we should only keep 90 days of logs : https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modu...
[08:06:49] netops, Infrastructure-Foundations, SRE, observability: librenms.syslog table size - https://phabricator.wikimedia.org/T349362 (Marostegui) This is the oldest row: ` root@db1164.eqiad.wmnet[librenms]> select timestamp from syslog order by timestamp asc limit 1; +---------------------+ | timestamp...
[08:12:47] netops, Cloud-VPS, Infrastructure-Foundations, SRE, and 2 others: Change cloud-instance-transport vlan subnets from /30 to /29 - https://phabricator.wikimedia.org/T348140 (dcaro) Open→In progress
[08:17:48] netops, Infrastructure-Foundations, SRE, observability: librenms.syslog table size - https://phabricator.wikimedia.org/T349362 (jcrespo) Would it be possible to have it on filesystem/kibana only?
I don't mind backing it up for persistence, but on db there is extra cost that wouldn't be on filesys...
[09:38:47] Traffic, SRE, Patch-For-Review: Investigate IPVS IPIP encapsulation support - https://phabricator.wikimedia.org/T348837 (Vgutierrez) >>! In T348837#9253425, @cmooney wrote: > Regarding the UDP encapsulation it's an interesting idea, and is a reminder that currently our switches distribute flows based...
[09:54:22] netops, Infrastructure-Foundations, SRE, observability: librenms.syslog table size - https://phabricator.wikimedia.org/T349362 (ayounsi) We already have it in Kibana, but the LibreNMS UI is quite convenient and we send more verbose logs for alerting there. The solution is probably to reduce the r...
[10:07:47] (SLOMetricAbsent) firing: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[10:22:47] (SLOMetricAbsent) resolved: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[10:53:11] Traffic, SRE, Patch-For-Review: HAProxy should use a single backend for Vanish - https://phabricator.wikimedia.org/T349287 (Fabfur)
[12:59:48] Traffic, SRE, Patch-For-Review: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (Fabfur)
[14:09:14] Acme-chief, Toolforge, cloud-services-team: toolforge acme-chief: Failed to generate additional resources using 'eval_generate': Could not intern_multiple from application/json: 416: unexpected token at '{"checksum":{"type":"md5","val' - https://phabricator.wikimedia.org/T349384 (taavi)
[15:09:19] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Automate L3 Switch to Core Router BGP peerings (and remove OSPF on drmrs switches) - https://phabricator.wikimedia.org/T349125 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=b09e42f6-6ad2-4453-abab-27f0a3934508) set by...
[15:26:15] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Automate L3 Switch to Core Router BGP peerings (and remove OSPF on drmrs switches) - https://phabricator.wikimedia.org/T349125 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=6731cf5b-8a4f-4391-98fa-2900d5500bf5) set by...
[15:44:48] sukhe: I noticed doing this BGP change that we had a static route on the new esams switches
[15:44:50] for this IP:
[15:44:51] https://netbox.wikimedia.org/ipam/ip-addresses/13909/
[15:45:08] I think I probably set this up before we decided to make ns2 the anycast IP?
[15:45:15] topranks: looking
[15:45:32] assume we can probably remove it in netbox and delete the static?
[15:45:32] yep
[15:45:39] yeah, +1!
[15:45:44] cool
[15:45:53] thanks!
[15:45:57] ha I panicked, was a diff in my "before and after" BGP routes sent to core
[15:46:12] then noticed it was going to dns I freaked I'd broken things :P
[15:46:16] I think this was well before the ns2 anycast decision
[15:46:19] thanks
[15:46:49] yeah I assigned it in the public block when planning, we made new assignments for everything we had on the old ranges
[16:19:35] Acme-chief, Toolforge, cloud-services-team: toolforge acme-chief: Failed to generate additional resources using 'eval_generate': Could not intern_multiple from application/json: 416: unexpected token at '{"checksum":{"type":"md5","val' - https://phabricator.wikimedia.org/T349384 (Vgutierrez) acme-chi...
[16:24:32] Acme-chief, Toolforge, cloud-services-team: toolforge acme-chief: Failed to generate additional resources using 'eval_generate': Could not intern_multiple from application/json: 416: unexpected token at '{"checksum":{"type":"md5","val' - https://phabricator.wikimedia.org/T349384 (Vgutierrez) At the s...
[16:54:17] netops, Data-Engineering, Infrastructure-Foundations, SRE, and 2 others: [Maintenance] Netflow/pmacct: use forwardingStatus - https://phabricator.wikimedia.org/T331707 (Ahoelzl)