[00:50:09] (SystemdUnitFailed) firing: netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:19:06] (SystemdUnitFailed) resolved: netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[02:36:46] (NTPNoSynced) firing: NTP not synced - https://wikitech.wikimedia.org/wiki/NTP - TODO - https://alerts.monitoring.wmflabs.org/?q=alertname%3DNTPNoSynced
[06:36:46] (NTPNoSynced) firing: NTP not synced - https://wikitech.wikimedia.org/wiki/NTP - TODO - https://alerts.monitoring.wmflabs.org/?q=alertname%3DNTPNoSynced
[07:02:32] Puppet, SRE, Patch-For-Review: Add humorous redirect for fox.wikimedia.org - https://phabricator.wikimedia.org/T352870 (Joe) Open→Declined Or not :)
[08:06:50] FYI; I'll switch the other cumin hosts over to Puppet 7 in about an hour
[08:31:00] and I'll be enabling requestctl-based blocks for nftables on sretest1001 now, things might be a little broken on it initially
[09:13:00] ack, as for the increase in QGET on etcd I guess this will not affect them because we either call them for iptables or nftables, but never both on the same host right?
[09:13:37] yeah, it doesn't change anything
[09:14:41] in fact the nftables solution will actually query less, the ferm one currently queries all below request-ipblocks/abuse while the nftables one will only specifically query the "blocked-nets" key it actually needs
[09:19:06] nice! I'll keep an eye on the etcd graphs once we start deploying i everywhere
[09:19:09] *it
[09:42:46] CFSSL-PKI, Ganeti, Infrastructure-Foundations, Patch-For-Review: Migrate Ganeti-rapi to use pki - https://phabricator.wikimedia.org/T350686 (MoritzMuehlenhoff)
[10:21:55] volans: I have the Debian packaging for Debmonitor sort of working (missing a bunch of details) but I'm struggling to get patches made against the debian branch accepted by Gerrit
[10:22:08] Is there some special rule?
[10:36:46] (NTPNoSynced) firing: NTP not synced - https://wikitech.wikimedia.org/wiki/NTP - TODO - https://alerts.monitoring.wmflabs.org/?q=alertname%3DNTPNoSynced
[10:43:43] if you switched to the Debian branch, then made your commit(s), you can run "git-review debian" and it will submit the patch to the debian branch, or do you mean something else?
[10:46:04] Aaaah, it's the "git-review debian" I'm missing
[10:48:39] the default error message is fairly obscure otherwise..
[11:14:52] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 3 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (Marostegui)
[11:14:59] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 3 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (Marostegui) p:Triage→Medium
[11:19:58] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[11:20:19] slyngs: also depends if your local branch is tracking the remote one
[11:20:36] I have
[11:20:37] debian 876f7d9 [origin/debian]
[11:20:37] * master 84e7da9 [origin/master]
[11:20:53] and IIRC git review is smart enough if you're tracking a remote one to use that one
[11:21:08] but if you're not, yes, you have to specify the branch manually when calling git review
[11:22:13] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[11:40:59] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 3 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (MoritzMuehlenhoff) cumin1001 has been reverted to Puppet 5, but cumin2002 is on Puppet 7 and can be used to reproduce.
[11:51:28] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 3 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (Marostegui) db1124 can be used for testing. It is a test host running puppet 7. It can be restarted, rebooted, reimaged, whatever is needed
[12:03:24] slyngs, volans: hmmh, maybe it actually makes more sense to split the server and client for debmonitor into separate source packages after all? since we need to build the for all supported distros we'd also build the server part (which needs more recent Django etc) for older distros, which would bring quite some complexity
[12:03:44] meant to write: "since we need to build the client for all supported distros"
[12:04:16] it just occurred to me when I saw the bump of the debhelper compat in Simon's interim patch
[12:04:34] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 3 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (Marostegui) Just took a quick look: ` # db-mysql db1133 ERROR 2026 (HY000): SSL connection error: self signed certificate in certificate...
[12:05:12] yes, we need to build the client for all OSes and the server only for the one we want to support (bookworm?)
[12:05:33] The compat bump is mostly to be able to build using pyproject.toml, because it's easier.
[12:05:36] if the easiest way is to split that into two, sure
[12:06:27] slyngs: have you seen the debianization of our existing python projects? like cumin (salsa) or spicerack (gerrit) for example?
[12:06:31] no need to re-invent the wheel :D
[12:06:54] No, but we also need to think about upgrading from setup.py
[12:07:13] pyproject is the new hotness.... until the next thing :-)
[12:07:43] *need*?
[12:07:53] https://salsa.debian.org/python-team/packages/cumin/-/tree/debian/debian
[12:08:26] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 3 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (Marostegui) This has more implications, as orchestrator cannot see these hosts (db1124, db1133) (with the changed cert). So this really...
[12:09:00] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 3 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (Marostegui) ` 15 dborch1001 orchestrator[425]: 2023-12-07 12:07:15 ERROR ReadTopologyInstance(db1124.eqiad.wmnet:3306) show global statu...
[12:10:18] volans, slyngs: yeah, given that build dependencies affect the whole source package let's actually proceed with a separate source package for debmonitor-server
[12:10:34] ack, makes sense
[12:10:36] also allows the two tools to change at a different pace which is also a win by itself
[12:10:43] probably easier to move the client
[12:10:46] as it's one file :D
[12:11:20] The build configuration is also a lot simpler :-)
[12:13:14] volans: Regarding upgrading/side-grading to pyproject, "need" is probably not the right word, but it is a little simpler for Django projects
[12:15:11] And: "SetuptoolsDeprecationWarning: setuptools.installer is deprecated. Requirements should be satisfied by a PEP 517 installer."
[12:17:43] depends on which command you are running, not necessarily on the presence of setup.py
[12:18:52] sorry, lunch ready, I have to step out for a bit
[12:26:09] Enjoy. I'll try to split the client out in the meantime :-)
[13:59:06] (SystemdUnitFailed) firing: upload_puppet_facts.service Failed on puppetmaster1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:36:46] (NTPNoSynced) firing: NTP not synced - https://wikitech.wikimedia.org/wiki/NTP - TODO - https://alerts.monitoring.wmflabs.org/?q=alertname%3DNTPNoSynced
[15:50:32] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 3 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (ABran-WMF) a:ABran-WMF
[15:51:19] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 3 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (ABran-WMF) Open→In progress
[15:51:31] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (ABran-WMF)
[15:59:54] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (dcaro)
[16:20:36] CAS-SSO, Infrastructure-Foundations, SRE, User-jbond: Thanos and Grafana lose the session after an hour - https://phabricator.wikimedia.org/T268233 (fgiunchedi) Untagging o11y here, since we moved thanos to oauth2-proxy I believe this should not apply to thanos anymore (though might still apply t...
[18:00:10] (SystemdUnitFailed) firing: upload_puppet_facts.service Failed on puppetmaster1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:36:46] (NTPNoSynced) firing: NTP not synced - https://wikitech.wikimedia.org/wiki/NTP - TODO - https://alerts.monitoring.wmflabs.org/?q=alertname%3DNTPNoSynced
[22:00:11] (SystemdUnitFailed) firing: upload_puppet_facts.service Failed on puppetmaster1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:36:46] (NTPNoSynced) firing: NTP not synced - https://wikitech.wikimedia.org/wiki/NTP - TODO - https://alerts.monitoring.wmflabs.org/?q=alertname%3DNTPNoSynced