[07:53:27] (SystemdUnitFailed) firing: netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:13:43] if nobody is working on it I'd like to take sretest1002 for a spin of reimages to debug something
[08:18:27] (SystemdUnitFailed) resolved: netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:20:58] go ahead
[08:32:15] thx
[10:06:12] SRE-tools, DBA, Infrastructure-Foundations: Create a cookbook for cloning a mariadb database into another - https://phabricator.wikimedia.org/T340048 (Ladsgroup) Open→Resolved
[10:12:40] CAS-SSO, Data-Platform-SRE, Infrastructure-Foundations: Switch DataHub authentication to OIDC - https://phabricator.wikimedia.org/T305874 (jbond) @Stevemunene >>! In T305874#9054412, @Stevemunene wrote: > - name: AUTH_OIDC_CLIENT_ID > value: "our-client-id" This is `datahub` > - na...
[11:59:04] netops, Infrastructure-Foundations, SRE: New IP and Vlan allocations for esams knams move - https://phabricator.wikimedia.org/T343214 (cmooney) p:Triage→Medium
[11:59:20] netops, Infrastructure-Foundations, SRE: New IP and Vlan allocations for esams knams move - https://phabricator.wikimedia.org/T343214 (cmooney)
[12:08:56] netops, Infrastructure-Foundations, SRE: Announce new public IPv6 prefix from Amsterdam for knams migration - https://phabricator.wikimedia.org/T343216 (cmooney) p:Triage→Medium
[12:09:05] netops, Infrastructure-Foundations, SRE: Announce new public IPv6 prefix from Amsterdam for knams migration - https://phabricator.wikimedia.org/T343216 (cmooney)
[12:09:13] netops, Infrastructure-Foundations, SRE: New IP and Vlan allocations for esams knams move - https://phabricator.wikimedia.org/T343214 (cmooney)
[12:16:35] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Announce new public IPv6 prefix from Amsterdam for knams migration - https://phabricator.wikimedia.org/T343216 (cmooney)
[12:18:41] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Announce new public IPv6 prefix from Amsterdam for knams migration - https://phabricator.wikimedia.org/T343216 (cmooney) IRR route6 object created: ` cathal@officepc:~$ whois -r -T route6 -h whois.ripe.net 2a02:ec80:300::/48 % This is the...
[12:55:22] CAS-SSO, Infrastructure-Foundations, SRE, collaboration-services, and 4 others: migrate gitlab away from the CAS protocol - https://phabricator.wikimedia.org/T320390 (Jelto) p:High→Medium The cas omniauth_provider was removed in the last merged patch. OIDC is the only login available in G...
[13:40:04] CAS-SSO, Data-Platform-SRE, Infrastructure-Foundations: Switch DataHub authentication to OIDC - https://phabricator.wikimedia.org/T305874 (Stevemunene) Thanks @jbond Adding a datahub_staging oidc entry with `service_id: 'https://datahub-frontend\.k8s-staging\.discovery\.wmnet(/.*)?'` which we access...
[14:08:04] FYI all, I'm out for a week from today, so ping me in the next few hours if there is something you would like me to look at before I go
[14:23:06] SRE-tools, Infrastructure-Foundations, cloud-services-team (FY2023/2024-Q1): tcpircbot: enable logging to #wikimedia-cloud-feed - https://phabricator.wikimedia.org/T342666 (fnegri) I grouped `logmsgbot_cloud` to the existing `logmsgbot` account: ` 16:16 identify logmsgbot {LOGMSGBO...
[14:29:41] SRE-tools, Infrastructure-Foundations, cloud-services-team (FY2023/2024-Q1): tcpircbot: enable logging to #wikimedia-cloud-feed - https://phabricator.wikimedia.org/T342666 (bd808) >>! In T342666#9059267, @fnegri wrote: > I grouped `logmsgbot_cloud` to the existing `logmsgbot` account: Thank you. :) `...
[14:36:37] SRE-tools, Infrastructure-Foundations, Patch-For-Review, cloud-services-team (FY2023/2024-Q1): Allow wmcs cookbooks running on cloudcuminXXXX to write to the SAL - https://phabricator.wikimedia.org/T325756 (fnegri)
[14:36:53] SRE-tools, Infrastructure-Foundations, cloud-services-team (FY2023/2024-Q1): tcpircbot: enable logging to #wikimedia-cloud-feed - https://phabricator.wikimedia.org/T342666 (fnegri) In progress→Resolved > [14:28] ChanServ sets mode +v logmsgbot_cloud Thanks, I was about to ask you! :)
[15:20:18] SRE-tools, Infrastructure-Foundations, Patch-For-Review, cloud-services-team (FY2023/2024-Q1): Allow wmcs cookbooks running on cloudcuminXXXX to write to the SAL - https://phabricator.wikimedia.org/T325756 (fnegri) > we can try to reuse the same logger, but configure the destination host (to be w...
[15:40:07] CAS-SSO, Data-Platform-SRE, Infrastructure-Foundations, Patch-For-Review: Switch DataHub authentication to OIDC - https://phabricator.wikimedia.org/T305874 (BTullis) Moving to blocked whilst we carry out {T343236}
[15:52:30] SRE-tools, Infrastructure-Foundations, Goal, cloud-services-team (FY2023/2024-Q1): cloudcumin: decide sudoers rules for users without global root - https://phabricator.wikimedia.org/T325067 (fnegri)
[16:00:25] SRE-tools, Infrastructure-Foundations, Spicerack, cloud-services-team (FY2023/2024-Q1): [spicerack] support including {project} in SAL messages - https://phabricator.wikimedia.org/T341793 (fnegri) In progress→Resolved
[16:00:30] SRE-tools, Infrastructure-Foundations, Patch-For-Review, cloud-services-team (FY2023/2024-Q1): Allow wmcs cookbooks running on cloudcuminXXXX to write to the SAL - https://phabricator.wikimedia.org/T325756 (fnegri)
[17:33:28] (SystemdUnitFailed) firing: (2) dump-conftool-pools.service Failed on config-master1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:38:28] (SystemdUnitFailed) firing: (3) dump-conftool-pools.service Failed on config-master1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:08:28] (SystemdUnitFailed) firing: (4) dump-conftool-pools.service Failed on config-master1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:18:07] I'll put in a silence for these config-master hosts
[18:18:19] jbond: also see -ops :)
[20:18:28] (SystemdUnitFailed) firing: confd_prometheus_metrics.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:23:28] (SystemdUnitFailed) resolved: confd_prometheus_metrics.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:36:22] (DiskSpace) firing: Disk space puppetmaster1001:9100:/ 5.945% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=puppetmaster1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace