[10:17:57] serviceops: wmf_auto_restart should not run for imagecatalog on the non-primary deployment server - https://phabricator.wikimedia.org/T305135 (Joe)
[10:18:05] serviceops: wmf_auto_restart should not run for imagecatalog on the non-primary deployment server - https://phabricator.wikimedia.org/T305135 (Joe) p:Triage→Medium
[10:26:39] serviceops: wmf_auto_restart should not run for imagecatalog on the non-primary deployment server - https://phabricator.wikimedia.org/T305135 (MoritzMuehlenhoff) a:MoritzMuehlenhoff
[10:27:35] <_joe_> moritzm: I assumed we'd have to manage that :)
[10:33:39] I broke it, I fix it :-)
[14:09:37] <_joe_> Since we're setting up a systemd timer to run the code sync for new mediawiki branches on the legacy systems
[14:10:08] <_joe_> dancy and I were thinking we could at the same time pre-pull the correct image on all kubernetes nodes
[14:10:25] <_joe_> so that we don't risk timeouts either, being this the image with the biggest delta
[14:10:48] <_joe_> (we cut a base when we promote a new wiki version)
[14:11:06] <_joe_> this would avoid risking timeouts
[14:11:28] <_joe_> akosiaris, jayme ^^ if you have more brilliant ideas, I'm all ears :)
[14:24:22] we're in a meeting
[14:39:53] _joe_: legacy systems being the appservers ? and code sync being ?
[14:40:32] <_joe_> akosiaris: yes and when we add a new version of mediawiki to /srv/mediawiki on the appservers
[14:40:43] <_joe_> but we still don't reference it in wikiversions.json
[14:40:52] serviceops, Generated Data Platform, SRE, Service-deployment-requests: Setup Initial Image Suggestion Service CI and k8s params/stubs - https://phabricator.wikimedia.org/T305154 (WDoranWMF)
[14:41:04] <_joe_> basically amhon wants to automate that step of the train deployment
[14:41:14] <_joe_> it takes 40 minutes and it's distributing dead code
[14:41:29] <_joe_> I thought it would be a good idea to pull the image at the same time
[14:43:59] serviceops, Generated Data Platform, SRE, Service-deployment-requests: Blubber setup for Image Suggestions Service - https://phabricator.wikimedia.org/T305155 (WDoranWMF)
[14:53:40] not in love with the idea much tbh. Asking that all kubernetes nodes get the new image, aside from the fact it will require ugly stuff in puppet that will expose the nodes to mw deployment specifics, will also create a lot of traffic in a pretty spiky manner.
[14:54:12] whereas a k8s deployment following the default deployment settings would actually do that way more staggered by design
[14:54:41] and only pull the images on the nodes that will actually run the pods (which for all intents and purposes is going to be always less than the total)
[14:58:18] I'd agree to that. Maybe we should check first if we actually run into issues with pulling during normal deployments before optimizing
[15:05:41] serviceops, Generated Data Platform, SRE, Service-deployment-requests: Setup Initial Image Suggestion Service CI and k8s params/stubs - https://phabricator.wikimedia.org/T305154 (herron) p:Triage→Medium
[15:06:00] serviceops, Generated Data Platform, SRE, Service-deployment-requests: Blubber setup for Image Suggestions Service - https://phabricator.wikimedia.org/T305155 (herron) p:Triage→Medium
[16:16:14] serviceops, Prod-Kubernetes, Patch-For-Review: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (Papaul)
[16:16:17] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Move kubernetes workers to bullseye and docker to overlayfs - https://phabricator.wikimedia.org/T300744 (Papaul)
[18:46:07] rebooting mwdebug1* now as well. did mwdebug2* ~ 2 days ago
[19:58:02] rebooting mw canary codfw (and a bunch of misc stuff like aphlict, phab2001, scandium-testreduce and whatnot... already done)
[20:43:29] serviceops, Generated Data Platform, SRE, Service-deployment-requests: Blubber setup for Image Suggestions Service - https://phabricator.wikimedia.org/T305155 (Dzahn) port reserved: 4017 https://wikitech.wikimedia.org/wiki/Kubernetes/Service_ports
[20:52:24] serviceops, Generated Data Platform, SRE, Service-deployment-requests: New Service Request Generated Datasets: Image Suggestions Service - https://phabricator.wikimedia.org/T304891 (Dzahn) found out the "add dummy tokens to labs/private" step is not needed anymore
[21:06:09] so..when adding a new k8s service... we do NOT have to add tokens to ci/master.yaml and deployment_server.yaml anymore? not in labs/private and neither in real private on the puppetmaster? nothing? so the first thing they need from us is that we create a namespace (and reserve a port). right?
[21:06:40] seems like it, already since last September. that's nice
[22:46:38] I am adding 2 new 100GB disks in ganeti.. to the gitlab hosts. we need them so we can have backups and hold longer than about 1.5 days
[23:51:57] serviceops, Data-Persistence-Backup, GitLab (Infrastructure), Patch-For-Review, User-brennen: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (Dzahn) @Jelto @Arnoldokoth See above. I added a new disk to gitlab2001 and gitlab1001. On gitlab2001 I have also done the other necessa...