[00:04:13] (DiskSpace) resolved: Disk space puppetmaster1001:9100:/ 5.411% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=puppetmaster1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[02:03:35] (SystemdUnitFailed) firing: docker-reporter-k8s-images.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:03:36] (SystemdUnitFailed) firing: docker-reporter-k8s-images.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:45:00] jobo, cdanis: this probably interests you https://github.blog/2023-10-16-measuring-git-performance-with-opentelemetry/
[08:50:40] Mail, Infrastructure-Foundations, Patch-For-Review: Add Auto-Submitted: auto-generated header to emails sent by scripts - https://phabricator.wikimedia.org/T347835 (ayounsi) Open→Resolved a:ayounsi All done, we already see a significant change in term of email noise. Last part is T347831.
[09:25:36] SRE-tools, Infrastructure-Foundations, Spicerack: Add a dependency on the opensearch-py client - https://phabricator.wikimedia.org/T345900 (dcaro) I have created a patch to replace elasticsearch with opensearchpy: https://gerrit.wikimedia.org/r/c/operations/software/spicerack/+/966492 Following @col...
[09:31:54] netops, Data-Engineering, Infrastructure-Foundations, SRE, and 2 others: Netflow/pmacct: use forwardingStatus - https://phabricator.wikimedia.org/T331707 (CodeReviewBot) joal opened https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/519 Update analytics druid netfl...
[09:35:01] SRE-tools, Infrastructure-Foundations, Spicerack, Patch-For-Review: spicerack: tox fails to install PyYAML using python 3.11 on bookworm - https://phabricator.wikimedia.org/T345337 (dcaro) Related task {T345900}
[10:03:36] (SystemdUnitFailed) firing: docker-reporter-k8s-images.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:15:58] netops, Cloud-VPS, Infrastructure-Foundations, SRE, and 2 others: Change cloud-instance-transport vlan subnets from /30 to /29 - https://phabricator.wikimedia.org/T348140 (cmooney)
[12:59:08] netops, Cloud-VPS, Infrastructure-Foundations, SRE, and 2 others: Change cloud-instance-transport vlan subnets from /30 to /29 - https://phabricator.wikimedia.org/T348140 (cmooney) Stab in the dark guessing what commands are needed in codfw, based on man page and some guides (including info Artur...
[13:21:43] hi, what's the status of puppetboard2003 ? I saw a puppet alert from 6 days ago and a textfile stale alert on puppetserver about puppetboard2003
[13:33:06] godog: ill take a look that should be in service
[13:36:00] cheers jbond
[13:36:38] netops, Data-Engineering, Infrastructure-Foundations, SRE, and 2 others: Netflow/pmacct: use forwardingStatus - https://phabricator.wikimedia.org/T331707 (CodeReviewBot) tchin merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/519 Update analytics druid netf...
[13:38:51] volans: ty for the link
[13:40:35] yw :)
[14:03:36] (SystemdUnitFailed) firing: docker-reporter-k8s-images.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:43:25] netops, Cloud-VPS, Infrastructure-Foundations, SRE, and 2 others: Change cloud-instance-transport vlan subnets from /30 to /29 - https://phabricator.wikimedia.org/T348140 (cmooney) Ok we seem to have muddled through, for the record commands needed as follows: ` wmcs-openstack port unset 1290224c-...
[14:58:27] netops, Cloud-VPS, Infrastructure-Foundations, SRE, and 2 others: Change cloud-instance-transport vlan subnets from /30 to /29 - https://phabricator.wikimedia.org/T348140 (dcaro) Note that we have to merge and deploy this first: https://gerrit.wikimedia.org/r/c/operations/puppet/+/965708
[15:56:28] netops, Infrastructure-Foundations, SRE: Improve network BGP group definition and automation templates - https://phabricator.wikimedia.org/T349116 (cmooney) p:Triage→Low
[15:58:19] netops, Infrastructure-Foundations, SRE: Improve network BGP group definition and automation templates - https://phabricator.wikimedia.org/T349116 (cmooney)
[15:58:53] netops, Infrastructure-Foundations, SRE: Improve Homer BGP group definition and automation templates - https://phabricator.wikimedia.org/T349116 (cmooney)
[17:12:28] netops, Infrastructure-Foundations, SRE: Tighter control on exported BGP routes from MRs - https://phabricator.wikimedia.org/T348739 (cmooney) Open→Resolved
[17:46:13] netops, Infrastructure-Foundations, SRE: Automate L3 Switch to Core Router BGP peerings (and remove OSPF on drmrs switches) - https://phabricator.wikimedia.org/T349125 (cmooney) p:Triage→Medium
[17:46:22] netops, Infrastructure-Foundations, SRE: Automate L3 Switch to Core Router BGP peerings (and remove OSPF on drmrs switches) - https://phabricator.wikimedia.org/T349125 (cmooney)
[17:46:30] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Consolidate Automation Templates for DC Switches - https://phabricator.wikimedia.org/T312635 (cmooney)
[18:03:36] (SystemdUnitFailed) firing: docker-reporter-k8s-images.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:03:36] (SystemdUnitFailed) firing: (2) docker-reporter-k8s-images.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[21:03:36] (SystemdUnitFailed) firing: (2) docker-reporter-k8s-images.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:07:13] (DiskSpace) firing: Disk space puppetmaster1001:9100:/ 5.932% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=puppetmaster1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace