[06:21:44] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Allow managing drmrs DHCP settings with Homer - https://phabricator.wikimedia.org/T328737 (ayounsi) a:ayounsi→cmooney
[06:30:06] netops, Infrastructure-Foundations, SRE: Add generic mechanism to add static routes on switches - https://phabricator.wikimedia.org/T334281 (cmooney) Open→Resolved
[06:55:57] netops, Infrastructure-Foundations, SRE: Core routers: replace bootp with dhcp-relay - https://phabricator.wikimedia.org/T320508 (ayounsi) a:ayounsi→cmooney
[07:03:59] netops, DBA, Data-Engineering, Infrastructure-Foundations, and 8 others: eqiad row D switches upgrade - https://phabricator.wikimedia.org/T333377 (ayounsi) a:cmooney
[08:21:06] netbox, DC-Ops, Infrastructure-Foundations, Observability-Alerting, and 2 others: validate what we need from the check_eth check - https://phabricator.wikimedia.org/T333007 (fgiunchedi)
[10:57:48] (SystemdUnitFailed) firing: krb5-admin-server.service Failed on krb2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:22:36] netops, DBA, Data-Engineering, Infrastructure-Foundations, and 8 others: eqiad row D switches upgrade - https://phabricator.wikimedia.org/T333377 (Vgutierrez)
[11:51:28] Puppet, Infrastructure-Foundations: gitlab: test out gitlb actions with a stable puppet module - https://phabricator.wikimedia.org/T334723 (jbond) p:Triage→Medium
[13:37:20] SRE-tools, Infrastructure-Foundations, cloud-services-team (FY2022/2023-Q4): WMCS Cookbook Automation FY2022-23 Q2 tracking task - https://phabricator.wikimedia.org/T319401 (fnegri) a:fnegri
[14:59:17] (SystemdUnitFailed) firing: krb5-admin-server.service Failed on krb2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:55:45] netops, Infrastructure-Foundations, SRE, cloud-services-team: Configure cloudsw1-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (fnegri)
[18:06:38] puppet-compiler, Infrastructure-Foundations, SRE: PCC failing for an LVS host (false negative) even after manually updating facts - https://phabricator.wikimedia.org/T334680 (ssingh) Yet another data point if that helps: I am trying to merge the codfw LVS hiera definitions and ran into the following...
[18:59:17] (SystemdUnitFailed) firing: krb5-admin-server.service Failed on krb2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:32:13] (DiskSpace) firing: Disk space urldownloader1001:9100:/ 5.928% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=urldownloader1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[21:09:00] netops, DBA, Data-Engineering, Infrastructure-Foundations, and 8 others: eqiad row D switches upgrade - https://phabricator.wikimedia.org/T333377 (colewhite)
[22:59:17] (SystemdUnitFailed) firing: krb5-admin-server.service Failed on krb2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:32:28] (DiskSpace) firing: Disk space urldownloader1001:9100:/ 1.258% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=urldownloader1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace