[08:14:47] @marostegui I don't have Amir's script for upgrading pc hosts at the moment, is it ok if I start with the es* hosts?
[08:14:56] sure
[08:16:53] marostegui: can I help with downgrading clouddbs? T421826
[08:16:57] T421826: Downgrade clouddb* hosts to 10.11.13 - https://phabricator.wikimedia.org/T421826
[08:17:13] es backups are still running
[08:17:44] could you wait a few minutes for them to finish?
[08:19:31] sure jynus
[08:20:02] I think the script will be blocked anyway if they detect backups running (or you could check)
[08:20:16] dhinus: It is fine, thanks I think I will get it all fully done today
[08:20:20] I appreciate your help though!
[08:21:45] ack, thanks :)
[08:44:07] @marostegui ok I have a small tweak to the restart script at https://gitlab.wikimedia.org/repos/data_persistence/dbtools/scripts/-/merge_requests/8 - do we want to issue mysql_upgrade after every reboot just to be safe?
[08:45:14] sure, it doesn't hurt
[08:45:53] federico3: as we mentioned yesterday in the meeting, let's not include mariadb upgrades in this rolling restart
[08:46:58] yes, the MR is about adding a way to do only reboot without apt dist-upgrade (but it's running mysql_upgrade anyways just for safety)
[08:47:08] I've not checked it yet, I am quite busy at the moment
[08:47:13] I will take a look later
[08:47:54] ok
[10:03:36] I'm disabling Puppet on the Swift storage nodes to roll out https://gerrit.wikimedia.org/r/c/operations/puppet/+/1242430
[10:07:40] jynus: can I start the reboots?
[10:22:48] federico3: yes sorry, @ a meeting so I forgot to tell you
[10:22:52] they all finished
[10:22:59] no worries, thanks!
[10:33:38] r1242430 seems fine, I've re-enabled Puppet so that it get applied to the rest of the nodes
[11:08:33] moritzm: I don't know if you saw my question re "Ubuntu mirror in sync with upstream" alert. Do you think I can downtime or disable that, assuming it won't be looked at in the future?
[11:09:24] I assume I can based on feedback, but want to make sure you or your team is ok with that
[11:10:04] yeah, it's fine to downtime
[11:10:17] I will do a month
[11:10:21] we're sunsetting the mirror entirely over the coming weeks
[11:10:22] thanks
[11:10:30] yeah, that's why
[11:10:53] I normally don't touch it, but some those pollute my dashboards
[11:11:07] to know what's going on that's important
[13:48:32] marostegui: I see some alerts on clouddb1023, are you working on it?
[13:48:58] yes
[13:49:03] It will be recovered in a sec
[13:49:11] ok, thanks!
[14:28:04] I'm collecting steps for section updates at https://wikitech.wikimedia.org/wiki/MariaDB/Upgrading_a_section , also I added a chart for the kernel versions
[17:12:15] sretest2003 still has wmf-mariadb1011 installed, but TTBOMK that test is over? if that's the case, then I'd remove the package there to avoid confusion
[17:53:46] moritzm: sure! Thanks!
[18:09:01] k, done
[19:48:48] FIRING: PuppetFailure: Puppet has failed on restbase1031:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[19:51:48] FIRING: PuppetFailure: Puppet has failed on aqs1010:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[20:03:48] FIRING: [2x] PuppetFailure: Puppet has failed on restbase1031:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[20:11:48] FIRING: [2x] PuppetFailure: Puppet has failed on aqs1010:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[20:58:48] FIRING: [2x] PuppetFailure: Puppet has failed on restbase1031:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[21:01:48] FIRING: [2x] PuppetFailure: Puppet has failed on aqs1010:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[21:13:48] RESOLVED: [2x] PuppetFailure: Puppet has failed on restbase1031:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[21:16:48] RESOLVED: [2x] PuppetFailure: Puppet has failed on aqs1010:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure