[08:43:52] https://wikitech.wikimedia.org/wiki/Ganeti#Netbox_naming_disambiguation if like me you keep getting confused between Ganeti and Netbox terminology
[09:49:14] (SystemdUnitFailed) firing: netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:04:15] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 2 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (ABran-WMF) to sum it up as it's a bit confusing to re-read everything: | | puppet5 (db1139) | puppet 7 (db1133) | `mysql --ssl-ca wmf-c...
[10:15:21] (SystemdUnitFailed) resolved: netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:21:27] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 2 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (ABran-WMF) as for the certificates side: | | Puppet 7 ca.crt `puppet_rsa` | Puppet 5 ca.crt `palladium.eqiad.wmnet` | wmf-ca.crt `Wikim...
[10:22:48] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 2 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (Marostegui) Those are tests from the orchestrator server I assume?
[10:32:22] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 2 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (ABran-WMF) No, good catch! I forgot to add those results as well. Previous results were from the previously described tests. From orche...
[10:34:29] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 2 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (Marostegui) If db1133 gets fixed, that should mean that the new dbstores (1008, 1009) should pop up and get discovered automatically too.
[10:45:56] netops, Ganeti, Infrastructure-Foundations, SRE: Investigate Ganeti in routed mode - https://phabricator.wikimedia.org/T300152 (ayounsi)
[10:49:11] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 2 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (ABran-WMF) ` root@dborch1001:/etc/ssl/certs# grep -i ca-certificates /etc/orchestrator.conf.json "MySQLOrchestratorSSLCAFile": "/etc/...
[11:10:24] XioNoX: thx for the ganeti-netbox clarification, maybe you can link it from the netbox page too, just in case
[11:10:40] good idea
[11:24:55] that's useful, thanks!
[11:28:36] added to https://wikitech.wikimedia.org/wiki/Netbox#Ganeti_sync
[12:13:32] netops, Infrastructure-Foundations, SRE: Automate BGP peering on MR routers towards core - https://phabricator.wikimedia.org/T354809 (cmooney) Open→Resolved a:cmooney
[13:39:38] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[14:02:04] netbox, Infrastructure-Foundations, Patch-For-Review: Netbox: replace getstats.GetDeviceStats with ntc-netbox-plugin-metrics-ext - https://phabricator.wikimedia.org/T311052 (SLyngshede-WMF) I'm not really loving either of the metrics plugins. The [[ https://github.com/networktocode/ntc-netbox-plugin-...
[14:35:11] Can cumin show me all hosts that do NOT have a certain Puppet profile?
[14:46:49] sudo cumin 'not O:profilename' should work
[14:47:01] sudo cumin 'not P:profilename' should work
[14:48:15] That gives me the same result sudo cumin 'not P:firewall'
[14:48:41] Ah no, not the same, different
[14:52:52] slyngs: 'A:all and not P{P:firewall}'
[14:53:22] netbox, Infrastructure-Foundations, Patch-For-Review: Netbox: replace getstats.GetDeviceStats with ntc-netbox-plugin-metrics-ext - https://phabricator.wikimedia.org/T311052 (ayounsi) The getstats netbox script is a hack and we should at best replace it with a Netbox plugin. netbox-more-metrics also s...
[14:53:40] Thank you :-)
[14:54:08] btw are we having the meeting? joanna and the US are out
[14:54:43] I don't know... I love talking to you, but that may not be sufficient reason :-)
[14:55:12] I'm fine either way
[14:56:46] I don't have anything specific, but I'm partially out Wednesday and Thursday and completely gone Friday
[15:01:56] topranks and I are in
[15:02:21] joining
[15:45:30] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 2 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (ABran-WMF) Hm, I stumbled upon something unexpected: ` root@db1133:/etc/ssl/certs# mysql [snip] MariaDB [(none)]> select @@global.ssl_...
[16:29:12] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 2 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (ABran-WMF) I ran the following test: with a custom PKI, a server certificate generated with an intermediate CA and the CA bundle fed to...
[17:49:15] (SystemdUnitFailed) firing: netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:15:22] (SystemdUnitFailed) resolved: netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[21:19:15] (SystemdUnitFailed) firing: netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[21:45:22] (SystemdUnitFailed) resolved: netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed