[02:38:23] (SystemdUnitFailed) firing: production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:38:23] (SystemdUnitFailed) firing: production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:32:39] 10SRE-tools, 10Infrastructure-Foundations, 10Puppet-Core, 10SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (10MoritzMuehlenhoff)
[08:46:02] 10SRE-tools, 10Infrastructure-Foundations, 10Puppet-Core, 10SRE, and 3 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (10MoritzMuehlenhoff)
[09:13:00] 10SRE-tools, 10Infrastructure-Foundations, 10Puppet-Core, 10SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (10MoritzMuehlenhoff)
[09:43:21] 10SRE-tools, 10Infrastructure-Foundations, 10Puppet-Core, 10SRE, and 3 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (10elukey)
[09:59:50] (SystemdUnitFailed) firing: (2) production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:13:23] (SystemdUnitFailed) firing: (3) production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:34:32] 10SRE-tools, 10Infrastructure-Foundations, 10Puppet-Core, 10SRE, and 3 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (10MoritzMuehlenhoff)
[11:13:23] (SystemdUnitFailed) firing: (3) production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:46:04] 10SRE-tools, 10Infrastructure-Foundations, 10Puppet-Core, 10SRE, and 3 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (10MoritzMuehlenhoff)
[13:02:06] volans: I just ran a reimage after changing the role of a node from a puppet5 to a puppet7 one, which seems to make the migrate-host cookbook fail
[13:02:20] SERVER: Server Error: Evaluation Error: Error while evaluating a Function Call, puppet7 is only avalible for bullseye
[13:02:30] sudo cookbook sre.hosts.reimage -t T351074 --os bullseye -p 7 mw2420
[13:03:01] T351074: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074
[13:03:37] jbond: ^^^
[13:04:09] I'm at lunch, I can check in a bit, but I think we might call the migrate-host *before* the reimage right now
[13:04:38] and maybe we should move it to after d-i, before the first puppet run
[13:05:58] yeah, it's definitely running before
[13:06:56] and it does not know about the changed role, asking me to add a hieradata/hosts patch
[13:07:09] ofc
[13:07:30] probably an option to skip running it at all would be enough
[13:07:43] jayme: so to double check, were you trying to use the reimage cookbook to change from buster -> bullseye and migrate from puppet5 -> puppet7 in one go?
[13:08:11] jbond: and move from puppet role A (puppet5) to B (puppet7), yes
[13:08:30] yeah, this is for a reimage of mw/buster to k8s/bullseye (which has been enabled for P7 at the role level)
[13:10:22] oh wow, i didn't realise that mw servers can now also be kubernetes::worker's :/
[13:10:45] the missing rename part is just an implementation detail, we might fix it or not ;)
[13:11:05] but they are repurposed slowly from mw hosts to k8s to support mw-on-k8s
[13:11:33] sure sure, i don't think it makes much difference, it would have still failed with a rename i think
[13:11:45] yep
[13:12:50] so there are a few things here. first, it's saying to add the hiera even though it's already there. i don't think it's worth fixing this and tbh it's not an easy thing to fix
[13:13:50] jbond: I would assume this is kind of an edge-case, right?
[13:14:18] a bit, but not too edgy
[13:14:22] jayme: well this is definitely an edge case on a few points
[13:14:33] i think others will hit the hiera thing mentioned above
[13:14:46] but either way i still think it's not worth the fix
[13:15:20] i think the thing that makes it trickier is the changing roles
[13:15:23] but couldn't we just add a --skip-puppet-migration flag?
[13:15:57] and skip all the puppet version checks as well as running migrate-host
[13:16:15] jayme: i think to unblock you it would be better to reimage to puppet 5 & bullseye
[13:16:20] then run the migrate cookbook
[13:16:39] if you add force_puppet7: false to the host it should keep you on puppet5
[13:16:54] hm.. that's yet another step for 300+ hosts
[13:17:19] that's 2 more commits, not doable for the migration, but for a one-off test it might be
[13:17:19] like i said, just to unblock you, will think on a better solution
[13:17:58] I fail to understand why it would be required to run migrate-host in this case
[13:18:29] jayme: happy to review a patch
[13:18:34] as the target role is migrated, the node should be automatically migrated after the reimage - or am I wrong?
[13:18:39] but otherwise i'm looking at the code now
[13:18:58] we could also re-use the -p/--puppet: 5, 7, ignore
[13:20:23] to me it seems like the reimage cookbook should use the current puppet version to perform the pre-reimage cleanup, and the given version only when signing the certs for the new installation
[13:21:56] * jbond grabbing some food
[13:24:48] jbond: volans: I was thinking: https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/976732
[13:26:06] as-is would not work
[13:26:52] we need to detect the current puppet version to do the cleanup on the proper puppetmaster/server and know/detect the version of puppet after the reimage to know where to sign the CSR
[13:27:46] is that cleanup part of the migrate-puppet cookbook?
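A minimal sketch of the `-p/--puppet: 5, 7, ignore` idea floated above, where "ignore" would skip the version checks and the migrate-host call, as an alternative to a dedicated --skip-puppet-migration flag. The option name and choices come from the discussion; the parser structure, defaults, and other arguments are assumptions for illustration, not the actual sre.hosts.reimage code:

    import argparse

    def argument_parser() -> argparse.ArgumentParser:
        """Build an illustrative subset of the reimage cookbook's CLI (sketch only)."""
        parser = argparse.ArgumentParser(prog='sre.hosts.reimage (sketch)')
        parser.add_argument('-t', '--task-id', help='Phabricator task ID, e.g. T351074')
        parser.add_argument('--os', required=True, help='Debian codename to install, e.g. bullseye')
        # 'ignore' would skip the puppet version checks and the migrate-host call
        # entirely, as suggested above; '5' and '7' would behave as today.
        parser.add_argument('-p', '--puppet', choices=('5', '7', 'ignore'), default='5',
                            help='target Puppet infrastructure, or "ignore" to skip migration logic')
        parser.add_argument('host', help='short hostname to reimage, e.g. mw2420')
        return parser

    if __name__ == '__main__':
        args = argument_parser().parse_args(
            ['-t', 'T351074', '--os', 'bullseye', '-p', 'ignore', 'mw2420'])
        print(args.puppet)  # -> ignore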
[13:27:53] so it needs some refactoring
[13:28:13] no, also the reimage itself clears puppetdb and the cert at every reimage
[13:28:23] (SystemdUnitFailed) firing: (3) production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:29:03] but let me check the current code with the migrate logic integrated
[13:31:29] hmm, I did not see any other use of get_puppet_version() there
[13:31:50] no, but there are uses of self.puppet_server, which is currently either one or the other
[13:31:56] I'm doing a local refactor
[13:32:39] that's right... but I did not touch that
[13:33:15] if you run with -p5 you'll get puppet 5, if you run with -p7 you get puppet 7 (from _get_puppet_server())
[13:36:03] hmm volans i think that patch could work, as we do:
[13:36:04] self.puppet_server.delete(self.fqdn)
[13:36:04] >> if self.args.puppet_version == 7:  # Ensure we delete the old certificate from the Puppet 5 infra
[13:36:07] self.spicerack.puppet_master().delete(self.fqdn)
[13:36:20] if self.args.puppet_version == 7:
[13:36:25] self.spicerack.puppet_master().delete(self.fqdn)
[13:36:44] in fact i'm now wondering if we even need to call the migrate cookbook from reimage
[13:36:53] anyway i'll wait for your refactor
[13:37:35] ofc in this case we don't need to call the migrate-host
[13:37:54] and in the general case I think you did it to avoid duplication of code
[13:38:02] but we could also not do that
[13:38:43] yes, i think that is what i did. however i am now wondering if it is needed in the general case.
[13:39:03] ultimately all we need to do, unless i'm missing something, is delete the old cert and sign the new one.
[13:39:14] and i think the reimage cookbook already does that
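For context, a self-contained sketch of the logic being discussed: delete the host's certificate from its Puppet infrastructure before the reimage and, when the target is Puppet 7, also clear any leftover certificate on the Puppet 5 master. The attribute and method names (puppet_server, spicerack.puppet_master(), args.puppet_version, fqdn) follow the snippet pasted above; the class wrapper and the way puppet_server is chosen are assumptions, not the actual cookbook:

    class ReimageCertSketch:
        """Sketch of the cert handling split discussed above; not the real cookbook."""

        def __init__(self, spicerack, args, fqdn):
            self.spicerack = spicerack
            self.args = args
            self.fqdn = fqdn
            # Hypothetical: pick the Puppet infra API matching the *target* version.
            self.puppet_server = (spicerack.puppet_server() if args.puppet_version == 7
                                  else spicerack.puppet_master())

        def remove_old_cert(self):
            """Delete the host's old certificate before the reimage."""
            self.puppet_server.delete(self.fqdn)
            if self.args.puppet_version == 7:
                # Ensure we also delete the old certificate from the Puppet 5
                # infra, covering a host migrating from 5 to 7 (per the quoted patch).
                self.spicerack.puppet_master().delete(self.fqdn)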
[13:39:43] yes, the problem is the auto-discovery of the version IMHO
[13:39:54] if I forget to set -p in this case I end up with detecting puppet 5
[13:40:07] installing puppet5
[13:40:14] but then having hiera set for puppet 7
[13:40:30] and it's too easy to get into that situation and I'd like to avoid it
[13:40:43] but yes, i think we have that problem regardless of whether we call the migrate cookbook or not
[13:40:50] yes
[13:40:58] FYI this is one of the reasons i wanted to add the hiera_lookup method
[13:41:13] but somehow convinced myself i didn't need it
[13:41:21] lol
[13:41:25] we can re-add it
[13:41:33] oh yes, we could just look the same information up via puppetdb, so a puppetdb module would probably be better but more work
[13:41:45] puppetdb wouldn't work in this case :d
[13:41:55] it has puppet5 data and we need the info before the new puppet run
[13:42:38] from puppetdb we can see what the last force_puppet7 value was, which would give us the answer
[13:43:13] it would be pretty much the same result as using the hiera lookup
[13:43:15] I guess people merge hiera while reimaging, so puppet might not have run to populate puppetdb
[13:43:25] fyi the hiera lookup patch is ready to merge so we can do that: https://gerrit.wikimedia.org/r/c/operations/software/spicerack/+/972459
[13:43:50] volans: yes, that's fair
[13:45:01] yeah, it's probably the cleanest way as there is no way from the current cookbook to know if the host is changing puppet version
[13:45:01] volans: so yes, for that bit i think the best idea would be to update get_puppet_version to use puppet_server.hiera_lookup
[13:46:16] +1
[13:46:33] ok, i'll merge that above change but it will need a spicerack release
[13:47:02] sure
[13:48:32] lmk if I can be of any help :)
[13:49:03] jayme: not repurposing mw hosts into k8s ones :D
[13:49:14] no can do :)
[13:49:48] btw IMHO I think we should automate the renaming and rename those, it hurts me having a mwXXXX being a k8s host
[13:59:29] jbond: sorry, I cancelled your +2 to make sure it works as expected both on puppetmaster and puppetserver, as I'm failing to make it work on the hosts running the same command
[14:00:05] also that method is inherited by PuppetMaster() too, so it should either work there too or we should override it to raise
[14:01:19] volans: the command is working as expected. it should also work on puppetmaster
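A sketch of the version auto-detection being proposed: look up the host's force_puppet7 hiera key and fall back to Puppet 5 when it is unset, so the cookbook no longer has to guess. hiera_lookup is the spicerack method referenced in the CR above, but its exact signature and return format here are assumptions for illustration:

    def get_puppet_version(puppet_server, fqdn: str) -> int:
        """Detect a host's Puppet infra from its hiera data (illustrative sketch).

        Assumes a hiera_lookup(fqdn, key) method as in the spicerack change linked
        above; the real signature and return format may differ.
        """
        value = puppet_server.hiera_lookup(fqdn, 'force_puppet7')
        # A true-ish lookup result means "already on Puppet 7"; anything else
        # (false, unset) means the host is still on Puppet 5.
        return 7 if str(value).strip().lower() == 'true' else 5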
[14:02:00] I've put the error I'm getting on the CR (for the others)
[14:02:25] ftr /govol
[14:28:23] (SystemdUnitFailed) firing: (3) production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:29:50] (SystemdUnitFailed) firing: (7) production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:44:50] (SystemdUnitFailed) firing: (8) production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:49:50] (SystemdUnitFailed) firing: (7) production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:08:23] (SystemdUnitFailed) firing: (7) production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:34:50] (SystemdUnitFailed) firing: (3) update-ubuntu-mirror.service Failed on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:44:50] (SystemdUnitFailed) firing: (3) update-ubuntu-mirror.service Failed on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:45:53] 10SRE-tools, 10Infrastructure-Foundations, 10Puppet-Core, 10SRE, and 3 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (10jbond)
[15:49:40] 10SRE-tools, 10Infrastructure-Foundations, 10Puppet-Core, 10SRE, and 3 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (10MoritzMuehlenhoff)
[15:59:50] (SystemdUnitFailed) firing: (3) update-ubuntu-mirror.service Failed on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:06:33] 10SRE-tools, 10Infrastructure-Foundations, 10Puppet-Core, 10SRE, and 3 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (10jbond)
[16:16:29] jayme: to keep you updated, john and I have updated and released spicerack with the function that we can now use to make the reimage support your use case
[16:16:49] we now have to refactor it a bit to use that; not sure if it will be today, I've a meeting in 45 and still stuff to do
[16:17:59] cool, thanks. I'm not in a real rush. Tomorrow would be super fine
[16:23:59] (PuppetZeroResources) firing: Puppet has failed generate resources on aux-k8s-worker1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[16:25:45] 10SRE-tools, 10Infrastructure-Foundations, 10Puppet-Core, 10SRE, and 3 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (10jbond)
[16:38:59] (PuppetZeroResources) resolved: Puppet has failed generate resources on aux-k8s-worker1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[16:53:59] (PuppetFailure) firing: (2) Puppet has failed on apt-staging2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[17:29:13] (DiskSpace) firing: Disk space build2001:9100:/ 4.955% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=build2001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[17:38:15] moritzm: FYI ^^^ I'm having a look at the disk space
[17:46:10] the big chunk is pbuilder, I pinged the top /home users in -sre
[17:46:30] is there any procedure to clean up old cruft from pbuilder in a safe way?
[17:50:58] sorry, the actual top user is docker
[17:52:01] in particular /var/lib/docker/overlay2
[17:57:05] for example starting with "docker builder prune", it should be safe AFAICT
[17:59:13] (DiskSpace) resolved: Disk space build2001:9100:/ 4.001% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=build2001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[17:59:49] thanks to b.en we got some GB back, but we should still perform some cleanup (and probably put it in a timer)
[18:02:06] I see we have 2 timers, a daily one with "docker system prune --force" and a weekly one with "docker system prune --all --volumes --force"
[18:02:22] I wonder if we should add the builder prune too, or if that's included in system prune
[18:02:49] from the docs it is unclear if system prune removes the builder cache too
[18:03:44] as it's not critical I'm not doing anything for now, waiting for feedback
[18:17:14] (DiskSpace) firing: Disk space build2001:9100:/ 5.283% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=build2001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[18:33:04] I've run systemctl start docker-system-prune-dangling.service
[18:33:11] Total reclaimed space: 33.62GB
[18:33:21] it should be ok for a bit
[18:37:13] (DiskSpace) resolved: Disk space build2001:9100:/ 2.914% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=build2001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[18:48:23] (SystemdUnitFailed) firing: (2) netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
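A sketch of the cleanup being discussed: the daily/weekly system prune plus an explicit builder prune, since the docs leave it unclear whether "system prune" covers the build cache. The docker CLI flags are the ones quoted above; wrapping them in a Python helper (rather than the existing systemd timers) is purely illustrative:

    import subprocess

    def docker_cleanup(weekly: bool = False) -> None:
        """Reclaim docker disk space (illustrative; the real hosts use systemd timers)."""
        # Daily: remove stopped containers, dangling images, unused networks.
        cmd = ['docker', 'system', 'prune', '--force']
        if weekly:
            # Weekly: also remove all unused (not just dangling) images and
            # volumes, matching the weekly timer quoted above.
            cmd = ['docker', 'system', 'prune', '--all', '--volumes', '--force']
        subprocess.run(cmd, check=True)
        # Explicitly prune the build cache too, since it's unclear from the docs
        # whether "system prune" covers it on all docker versions.
        subprocess.run(['docker', 'builder', 'prune', '--force'], check=True)

    if __name__ == '__main__':
        docker_cleanup(weekly=False)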
[19:18:23] (SystemdUnitFailed) firing: (2) netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:48:23] (SystemdUnitFailed) firing: (2) netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:18:23] (SystemdUnitFailed) firing: (2) netbox_report_accounting_run.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:53:59] (PuppetFailure) firing: (2) Puppet has failed on apt-staging2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[22:08:59] (PuppetFailure) firing: (2) Puppet has failed on apt-staging2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[23:18:23] (SystemdUnitFailed) firing: (2) export_smart_data_dump.service Failed on bast2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:23:23] (SystemdUnitFailed) firing: (3) export_smart_data_dump.service Failed on bast2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:29:51] (SystemdUnitFailed) firing: (4) export_smart_data_dump.service Failed on bast2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:33:24] (SystemdUnitFailed) firing: (5) export_smart_data_dump.service Failed on bast2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:34:50] (SystemdUnitFailed) firing: (6) export_smart_data_dump.service Failed on bast2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:38:23] (SystemdUnitFailed) firing: (7) export_smart_data_dump.service Failed on bast2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:39:50] (SystemdUnitFailed) firing: (8) export_smart_data_dump.service Failed on bast2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:43:23] (SystemdUnitFailed) firing: (9) export_smart_data_dump.service Failed on bast2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:44:50] (SystemdUnitFailed) firing: (12) export_smart_data_dump.service Failed on bast2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:48:23] (SystemdUnitFailed) firing: (13) export_smart_data_dump.service Failed on bast2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:53:23] (SystemdUnitFailed) firing: (14) export_smart_data_dump.service Failed on bast2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed