[06:13:06] going to switchover pc2
[07:08:24] done and reverted once the master was reimaged to bookworm (only for codfw)
[08:18:24] Emperor: sorry, I was afk monday & tuesday
[08:19:22] brouberol: no worries, it's more a "I am confused by this thing" than anything urgent :)
[08:19:56] so, as to why apt_repo.yaml, it's because `profile::installserver::preseed` is included in the apt_repo role https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/production/modules/role/manifests/apt_repo.pp#12
[08:20:25] it did strike me as odd as well, but I went with it TBH
[08:21:18] marostegui: there is this thing, I wonder if it could have been affected by maintenance: "The following units failed: mediawiki_job_purge_parsercache_pc2.service"
[08:30:58] brouberol: huh, weird, so are we using the apt repo servers for preseed config still
[08:31:22] or maybe it's legacy
[08:32:31] does look like the installserver role itself doesn't contain profile::installserver::preseed (or at least, not directly)
[08:34:31] jynus: yeah, for sure
[08:34:36] I was expecting it to recover on its own
[08:34:45] But I can reload it anyway
[08:35:46] done!
[08:38:26] I wasn't as much worried about the failure, as maybe reporting to who implemented it to retry and reconnect after some time if ro is detected
[08:50:34] Emperor: I had the same thought as well. The install-server "responsibilities" are scattered between the install and apt hosts
[08:58:48] jynus: I think it is one of those scripts that should re-read the config
[08:59:02] indeed
[08:59:11] I know Amir1 did great work chasing all those, so I don't know if maybe that one isn't that easy or it was missed
[11:24:56] Hi. Sorry, this is deeply tedious, but would someone mind going through all 6 of https://gerrit.wikimedia.org/r/q/topic:swift_envoy_rollout please for +1s? They're pairwise hiera changes to do the remaining nginx->envoy rollouts for ms frontends.
It needs doing like this because each migration involves a reimage, so we can only do one per DC in any given change.
[11:27:03] done Emperor !
[11:27:09] Thank you <3
[11:27:24] loving the moss artifact :D
[11:27:36] 🙊
[11:27:47] some hills are worth dying on
[12:16:23] why i didn't get notification for the ping
[12:16:24] sorry
[12:17:53] marostegui: the pc config doesn't get reloaded via the logic I built because it's a completely different set of configuration :(
[12:18:03] if that's what you're implying
[12:18:39] but the cleaner doesn't matter, it'll get re-run next day and it's set to delete anything older than X so it should just take over from where it left off
[12:19:08] 218G pagelinks.ibd 😰
[14:44:23] PROBLEM - MariaDB sustained replica lag on s6 on db1213 is CRITICAL: 2.6 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1213&var-port=13316
[14:45:23] RECOVERY - MariaDB sustained replica lag on s6 on db1213 is OK: (C)2 ge (W)1 ge 0.6 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1213&var-port=13316
[15:26:29] jynus: any chance I can start migrating backup roles yet?
[15:26:54] yes, the only one I hadn't time to change is backupmon1001
[15:27:00] the others are good
[15:27:14] it got fixed yesterday night
[15:28:13] jynus: ack, I'll do them all excluding dbbackups::monitoring for now
[15:29:01] the fix is easy, but last time I did it in a rush, things broke
[15:29:13] and it is only 1 host
[15:29:28] jynus: ack happy to wait
[18:48:39] PROBLEM - MariaDB sustained replica lag on s6 on db1213 is CRITICAL: 5.2 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1213&var-port=13316
[18:50:01] RECOVERY - MariaDB sustained replica lag on s6 on db1213 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1213&var-port=13316
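
Reading note on the replica lag alerts above: "2.6 ge 2" appears to mean the measured lag (2.6) is greater than or equal to the critical threshold (2), and "(C)2 ge (W)1 ge 0.6" appears to mean the value (0.6) sits below both the critical (2) and warning (1) thresholds. The sketch below only illustrates that comparison. The function name `classify_lag` and its defaults are hypothetical, taken from the numbers in the alert text; it is not the actual check, and it does not model how "sustained" lag is measured over time.

```python
def classify_lag(lag: float, warning: float = 1.0, critical: float = 2.0) -> str:
    """Map a replica lag value to an alert state.

    Assumption: "ge" in the alert text means "greater than or equal to",
    and the thresholds are the (W)1 and (C)2 values shown in the log.
    """
    if lag >= critical:
        return "CRITICAL"
    if lag >= warning:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # Values taken from the four alert lines in the log.
    for value in (2.6, 0.6, 5.2, 0.0):
        print(value, classify_lag(value))
```

With these assumed thresholds, the two PROBLEM values (2.6 and 5.2) classify as CRITICAL and the two RECOVERY values (0.6 and 0) as OK, which matches the states reported in the log.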