[00:14:43] (SystemdUnitFailed) firing: (4) debmonitor-maintenance-gc.service Failed on debmonitor2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[02:00:31] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Q1:(Need By: TBD) rack/setup/install cloudswift100[12] - https://phabricator.wikimedia.org/T289882 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by pt1979@cumin2002 for host cloudswift1002.eqiad.wmnet with OS b...
[02:12:46] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Q1:(Need By: TBD) rack/setup/install cloudswift100[12] - https://phabricator.wikimedia.org/T289882 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by pt1979@cumin2002 for host cloudswift1002.eqiad.wmnet with OS bulls...
[02:21:54] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Q1:(Need By: TBD) rack/setup/install cloudswift100[12] - https://phabricator.wikimedia.org/T289882 (Papaul)
[02:22:36] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Q1:(Need By: TBD) rack/setup/install cloudswift100[12] - https://phabricator.wikimedia.org/T289882 (Papaul) Stalled→Resolved This is complete
[04:14:43] (SystemdUnitFailed) firing: (4) debmonitor-maintenance-gc.service Failed on debmonitor2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:34:43] (SystemdUnitFailed) firing: (5) debmonitor-maintenance-gc.service Failed on debmonitor2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:49:43] (SystemdUnitFailed) firing: (5) debmonitor-maintenance-gc.service Failed on debmonitor2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:14:43] (SystemdUnitFailed) firing: (4) debmonitor-maintenance-gc.service Failed on debmonitor2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:49:38] Puppet, Infrastructure-Foundations, SRE, Patch-For-Review: Next steps for Puppet 7 - https://phabricator.wikimedia.org/T330490 (ops-monitoring-bot) Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: puppetmaster1005.eqiad.wmnet
[06:49:48] Puppet, Infrastructure-Foundations, SRE, Patch-For-Review: Next steps for Puppet 7 - https://phabricator.wikimedia.org/T330490 (ops-monitoring-bot) Cookbook cookbooks.sre.debmonitor.remove-hosts run by jmm: for 1 hosts: puppetmaster2005.codfw.wmnet
[09:09:31] Puppet, Infrastructure-Foundations, SRE, Patch-For-Review, User-jbond: replace all puppet crons with systemd timers - https://phabricator.wikimedia.org/T273673 (SLyngshede-WMF) a:SLyngshede-WMF I think we're done, but just cleaning up a few comments and old naming.
[09:19:45] jbond: I hope you don't mind I copy pasted your code: https://gerrit.wikimedia.org/r/c/operations/software/homer/+/928795 but it's for the greater good :)
[09:35:19] XioNoX: nice :)
[10:53:11] Puppet, Infrastructure-Foundations, SRE, Patch-For-Review, User-jbond: replace all puppet crons with systemd timers - https://phabricator.wikimedia.org/T273673 (jbond) Open→Resolved i think you are right ` $ cumin R:cron...
[10:53:20] Puppet, Infrastructure-Foundations, SRE, Patch-For-Review, User-jbond: Work required to prepare for puppet 6 - https://phabricator.wikimedia.org/T265138 (jbond)
[11:07:58] Puppet, Infrastructure-Foundations, SRE, Patch-For-Review, User-jbond: replace all puppet crons with systemd timers - https://phabricator.wikimedia.org/T273673 (MoritzMuehlenhoff) > ill close this no need to keep it around to change comments, great work all It's not just comments, there's al...
[11:08:10] Hello. I've got a question about reprepro. Is there any way that we can get it to host identically named but different files for different distributions?
[11:12:19] The problem I'm facing relates to https://phabricator.wikimedia.org/T337465 and the various hadoop packages. I tried a workaround before (https://phabricator.wikimedia.org/T310643#8145985) to add a suffix to the package name, but I've found a subtle problem in the resulting packages.
[11:16:20] Can I use a named component, such as `thirdparty/bigtop15-bullseye` to get around this?
[11:17:14] you mean that the same version number is used for builds which are not identical?
[11:17:22] AFAICT reprepro doesn't support this, no
[11:17:51] we'd need to modify the version for the second builds by appending a +deb11u1 suffix e.g.
[11:18:40] ah, that's actually what you proposed at https://phabricator.wikimedia.org/T310643#8145985 , so +1 on that approach :-)
[11:27:50] moritzm: Thanks for that. Something about the approach that I tried didn't work though. It caused something to be skipped or to fail during the build: https://phabricator.wikimedia.org/T337465#8914820
[11:29:45] do we expect that this is a one off build? I mean this is only for the transition period where we have buster/bullseye in parallel, right
[11:30:07] one hack that we can do is to modify the version number outside of the package build
[11:31:07] You mean renaming the files? I'm not sure how long the transition period will be, or how it would affect rollback plans etc.
[11:31:28] what I mean is:
[11:31:43] if we can use the same deb on buster and bullseye (given it's java)
[11:32:21] under the hood a deb is an ar archive with two tarballs
[11:32:48] and we can modify the deb for bullseye to keep the actual package content unchanged from what's in buster
[11:33:21] and only edit the control tarball (which contains the meta data found e.g. in debian/control) to use the new version number
[11:33:27] but keep the rest of the deb the same
[11:33:39] and then recreate a deb out of it
[11:33:45] Oh right. OK, I'm not sure, there are a lot of debs and some of them have a lot of dependencies. I'll have a look at them and see.
[11:34:07] that changes the checksum of the deb and we can import it under the new version
[11:34:31] I was asking about a new named component because I was intrigued by the error message from here: https://phabricator.wikimedia.org/T310643#8137514
[11:34:41] `File "pool/thirdparty/bigtop15/b/bigtop-groovy/bigtop-groovy_2.5.4-1_all.deb" is already registered with different checksums!`
[11:35:21] yeah, the version number == checksum constraint applies across all components TTBOMK
[11:35:50] ah, I just remembered that there's a helper called dpkg-repack which simplifies what I mentioned before quite a bit
[11:36:19] ping me if you run into any issues with that, happy to help
[11:42:41] Thanks. If we think that the same debs would work, we could just copy from one distro to another like this: https://wikitech.wikimedia.org/wiki/Reprepro#Copying_between_distributions couldn't we?
[11:43:13] yes
[11:43:24] we have several packages which we use across multiple distros
[11:43:34] I'm just not sure that they're going to be compatible. We've got stuff like hdfs_fuse which is ELF.
[11:43:38] https://www.irccloud.com/pastebin/VyX9lzJz/
[11:43:45] e.g. various prometheus exporters written in Go (since that creates a static ELF)
[11:43:55] yeah, then it's unlikely to work
[11:44:01] but what we can do is:
[11:44:31] build one for buster and one for bullseye, but under the same name (given that bigtop makes a suffix tricky)
[11:44:47] and then only modify the version for the bullseye build with dpkg-repack
[11:45:53] Ah, OK. I like it. Thanks. I'll try to write it up in a ticket.
[11:47:05] sounds good :-)
[12:56:41] topranks: hi! quick question
[12:57:03] (if you are around that is!)
[12:57:06] I just ran sudo cookbook sre.network.configure-switch-interfaces lvs2014
[12:57:18] and it seems like I am getting a diff, but I think it's just the description
[12:57:24] but I wanted to be sure since it's an lvs
[12:57:52] [edit interfaces xe-2/0/44]
[12:57:53] - description lvs2014;
[12:57:53] + description "lvs2014 {#11995}";
[13:01:14] sukhe: yes that's just the description
[13:01:20] ok thank you!
[13:01:22] I will run it then
[13:01:31] someone has added a cable label in netbox, so the automation has picked that up and added to the desc
[13:01:32] I wanted to throw some live traffic at it but wanted to make sure all is OK :P
[13:01:38] but harmless either way from a technical point of view
[13:01:40] np!
[14:19:43] (SystemdUnitFailed) firing: check_netbox_uncommitted_dns_changes.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:29:43] (SystemdUnitFailed) resolved: check_netbox_uncommitted_dns_changes.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
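
Note on the reprepro discussion (11:31 to 11:44): the idea of changing only the version in a deb's control metadata, while leaving the package contents alone, might be sketched roughly as below. This is an illustrative sketch and not the procedure anyone in the channel actually ran. The function names `append_version_suffix` and `repack_with_suffix` are hypothetical, and unpacking and rebuilding with `dpkg-deb -R` / `dpkg-deb -b` is an assumption made here; the chat itself only mentions dpkg-repack as a helper.

```python
import re
import subprocess
from pathlib import Path


def append_version_suffix(control_text: str, suffix: str) -> str:
    """Return the control data with `suffix` appended to the Version field.

    Only the first line starting with "Version:" is touched, so fields such
    as "Standards-Version:" are left alone.
    """
    new_text, count = re.subn(
        r"^(Version:[ \t]*)(\S+)[ \t]*$",
        lambda m: f"{m.group(1)}{m.group(2)}{suffix}",
        control_text,
        count=1,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise ValueError("no Version field found in control data")
    return new_text


def repack_with_suffix(deb: Path, workdir: Path, out_deb: Path, suffix: str) -> None:
    """Hypothetical helper: unpack a deb, bump its version, rebuild it.

    Assumes dpkg-deb is installed. The payload files are not modified, only
    DEBIAN/control, so the rebuilt deb gets a new version and a new checksum.
    """
    subprocess.run(["dpkg-deb", "-R", str(deb), str(workdir)], check=True)
    control = workdir / "DEBIAN" / "control"
    control.write_text(append_version_suffix(control.read_text(), suffix))
    subprocess.run(["dpkg-deb", "-b", str(workdir), str(out_deb)], check=True)
```

For example, `append_version_suffix("Package: bigtop-groovy\nVersion: 2.5.4-1\n", "+deb11u1")` yields a control stanza whose version reads `2.5.4-1+deb11u1`, matching the suffix suggested at 11:17:51. Whether the rebuilt package is accepted by reprepro for the bullseye distribution would still need to be checked.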