[00:03:57] I haven't had the opportunity to use `scap backport` until today, and omfg it's so nice
[00:04:54] so thank you to everyone who made it happen
[04:17:32] Gerrit: Reviewer-bot option to be added as CC instead of reviewer - https://phabricator.wikimedia.org/T334118 (Aklapper) [@Tgr: Please use the Feature form for feature requests when possible - thanks!]
[04:18:25] Project-Admins: Add 'developer-experience' Phabricator tag - https://phabricator.wikimedia.org/T334126 (Aklapper) Open→Stalled
[06:17:44] Continuous-Integration-Infrastructure, SRE, serviceops-collab, Patch-For-Review: contint2002 service implementation tracking - https://phabricator.wikimedia.org/T324659 (hashar) I don't know what has happened over the night but the zuul-merger service started alarming over night: ` Notificatio...
[06:31:33] Gerrit: Reviewer-bot option to be added as CC instead of reviewer - https://phabricator.wikimedia.org/T334118 (hashar) The reviewer-bot is an external tool created by @valhallasw , source at https://github.com/valhallasw/gerrit-reviewer-bot. Specially the [[ https://github.com/valhallasw/gerrit-reviewer-bot/...
[06:39:48] Gerrit: Reviewer-bot option to be added as CC instead of reviewer - https://phabricator.wikimedia.org/T334118 (hashar) The equivalent built-in feature in Gerrit is the reviewer plugin https://gerrit.wikimedia.org/r/plugins/reviewers/Documentation/config.html It can be configured on any repository via the `r...
[06:52:00] Gerrit: Reviewer-bot option to be added as CC instead of reviewer - https://phabricator.wikimedia.org/T334118 (hashar) I have just found out adding a reviewer is exposed in the UI as a repository {nav Commands}: https://gerrit.wikimedia.org/r/admin/repos/operations/deployment-charts,commands On the side ba...
[08:09:37] Release-Engineering-Team, Wikimedia-Phabricator-Extensions, serviceops-collab: Disable "Browse Gerrit Projects" on https://phabricator.wikimedia.org/r/ - https://phabricator.wikimedia.org/T228507 (Aklapper) ...or basically, just porting https://phabricator.wikimedia.org/D1206 to Gitlab.
[08:15:40] o/ do we have examples of multi-arch builds with blubber & kokkuri, I naively tried BUILD_TARGET_PLATFORMS: linux/amd64,linux/arm64, but this is failing "standard_init_linux.go:219: exec user process caused: exec format error" (c.f. https://gitlab.wikimedia.org/repos/search-platform/cirrussearch-elasticsearch-image/-/jobs/89862)
[08:28:48] GitLab (Infrastructure), serviceops-collab, Patch-For-Review: Troubleshoot partman config for two additional disks on GitLab hosts - https://phabricator.wikimedia.org/T333674 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin2002 for host gitlab2003.wikimedia.org wit...
[08:40:45] make it work locally with buildx and tonistiigi/binfmt
[08:53:21] and seems like it's possible to build for arm64 in gitlab according to this build: https://gitlab.wikimedia.org/repos/releng/blubber/-/jobs/84041
[08:58:54] dcausse: hi, well that build is for a commit Revert "ci: Build for both linux/amd64 and linux/arm64"
[08:59:08] https://gitlab.wikimedia.org/repos/releng/blubber/-/commit/0734952947504560483c65edbc7f0da4634dfbdb
[08:59:22] which points to https://phabricator.wikimedia.org/T322453
[08:59:42] and I am not sure how it relates to arm :]
[09:00:42] hashar: ah! indeed, I might hit this error after, but here I think I'm missing a small thing to make it actually build the image for arm
[09:02:21] dcausse: I'd guess the relevant task is https://phabricator.wikimedia.org/T318866#8317918 but dduvall would know all the details
[09:02:31] in the blubber build I see docker build commands actually working, e.g.
#12 [linux/arm64 buildkit-prep 2/8] RUN apt-get update && apt-get install -y "gcc" "git" "make" && rm -rf /var/lib/apt/lists/*
[09:02:38] I don't know anything about the low level of blubber :-\
[09:03:05] hashar: thanks for all the pointers!
[09:04:58] dcausse: maybe you can try by updating the blubber/buildkit syntax to v0.18.0 if that is a thing. Honestly I have no idea :]
[09:05:07] or the base elastic.co image might not be an arm64 one
[09:05:34] #8 [linux/amd64 1/5] FROM docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2@sha256:2c257b68f3...
[09:06:00] then it probably builds both the arm64 and amd64 in parallel so ..
[09:06:02] the elastic image is multi-arch
[09:06:06] I know nothing clearly
[09:07:19] blubber 0.16 can build the arm64 image on my machine as long as I install tonistiigi/binfmt with: docker run --privileged --rm tonistiigi/binfmt --install all
[09:07:34] and use buildx
[09:07:51] well I guess your best bet is to file a task for dduvall to investigate
[09:08:06] will do, thanks! :)
[09:22:20] GitLab (Infrastructure), serviceops-collab: Troubleshoot partman config for two additional disks on GitLab hosts - https://phabricator.wikimedia.org/T333674 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin2002 for host gitlab2003.wikimedia.org with OS bullseye executed wit...
[09:36:41] GitLab (Infrastructure), serviceops-collab: Troubleshoot partman config for two additional disks on GitLab hosts - https://phabricator.wikimedia.org/T333674 (Jelto) The installer on `gitlab2003` is stuck on partman creation: ` Apr 6 08:33:08 debconf: --> GET partman-auto/expert_recipe Apr 6 08:33:08 d...
[09:52:08] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.41.0-wmf.3 deployment blockers - https://phabricator.wikimedia.org/T330209 (hashar) Not much happened after promoting to all wikis. We investigated some elevated traffic rates and latency increases that...
[10:47:15] Release-Engineering-Team, Wikimedia-Phabricator-Extensions, serviceops-collab, Patch-For-Review: Disable "Browse Gerrit Projects" on https://phabricator.wikimedia.org/r/ - https://phabricator.wikimedia.org/T228507 (Aklapper) Done in https://gitlab.wikimedia.org/repos/phabricator/extensions/-/merg...
[12:01:39] hashar: is the Zuul alert on contint2002 expected: https://alerts.wikimedia.org/?q=alertname%3DCheck%20systemd%20state&q=team%3Dsre&q=%40receiver%3Dirc-spam&q=zuul
[12:01:50] Looks like it was supposed to be disabled through https://gerrit.wikimedia.org/r/c/operations/puppet/+/906307/
[12:02:01] sobanski: somehow it started last night for no reason I could find
[12:02:25] the zuul-merger service it alerted for is masked on purpose and I have added a Puppet change this morning to remove that monitoring unless the service is explicitly enabled
[12:02:38] let me check Icinga
[12:02:46] oh
[12:02:57] no that is another service wmf-auto-restart
[12:03:09] I will ask sre-foundation
[12:03:31] It alerts via both Icinga and Prometheus
[12:03:37] yeah
[12:03:58] Icinga is the source of the alert and it is somehow bubbled up to Alert Manager which is the overall dashboard for all alerts
[12:04:01] Here's the other one: https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed&q=team%3Dserviceops-collab&q=%40receiver%3Dserviceops-collab-warning&q=zuul
[12:04:01] (as I understand it)
[12:05:41] I am asking Moritz in private ;)
[12:05:46] Thanks!
[12:11:51] sobanski: the wmf_auto_restart script needs a little fix up :] At least there is nothing to worry about, it is merely a glitch and that will be improved as a result. Thanks!
[12:19:34] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.41.0-wmf.3 deployment blockers - https://phabricator.wikimedia.org/T330209 (TheDJ) Note that the release notes are MIA https://www.mediawiki.org/wiki/MediaWiki_1.41/wmf.3
[12:26:03] sobanski: and there is the magic fix https://gerrit.wikimedia.org/r/c/operations/puppet/+/906564/2/modules/zuul/manifests/merger.pp :]
[12:26:37] 👍
[12:41:21] GitLab (Infrastructure), serviceops-collab: Troubleshoot partman config for two additional disks on GitLab hosts - https://phabricator.wikimedia.org/T333674 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin2002 for host gitlab2003.wikimedia.org with OS bullseye
[13:05:38] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 2 others: Migrate cxserver to mw-api-int - https://phabricator.wikimedia.org/T334204 (Clement_Goubert)
[13:08:47] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 3 others: Migrate internal traffic to k8s - https://phabricator.wikimedia.org/T333120 (Clement_Goubert)
[13:09:56] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 3 others: Migrate internal traffic to k8s - https://phabricator.wikimedia.org/T333120 (Clement_Goubert)
[13:34:47] GitLab (Infrastructure), serviceops-collab: Troubleshoot partman config for two additional disks on GitLab hosts - https://phabricator.wikimedia.org/T333674 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin2002 for host gitlab2003.wikimedia.org with OS bullseye executed wit...
[13:39:16] RECOVERY - Check systemd state on contint2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[13:40:38] GitLab (Infrastructure), serviceops-collab: Troubleshoot partman config for two additional disks on GitLab hosts - https://phabricator.wikimedia.org/T333674 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin2002 for host gitlab2003.wikimedia.org with OS bullseye
[13:43:43] Release-Engineering-Team: train-deploy-notes failed with "1.41.0-wmf.3 does not match the latest wmf branch" - https://phabricator.wikimedia.org/T334211 (TheresNoTime)
[13:56:48] GitLab (Infrastructure), serviceops-collab: Troubleshoot partman config for two additional disks on GitLab hosts - https://phabricator.wikimedia.org/T333674 (Jelto) Hard coding the raid sizes had a similar effect: ` Apr 6 13:44:38 partman-auto: Available disk space (960197) too small for expert recipe...
[13:58:49] Release-Engineering-Team: train-deploy-notes failed with "1.41.0-wmf.3 does not match the latest wmf branch" - https://phabricator.wikimedia.org/T334211 (hashar) Indeed, we had an extra branch created which I had to delete to resume the `scap stage-train` command: ` 2023-04-04 08:28 Deleting medi...
[13:59:36] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.41.0-wmf.3 deployment blockers - https://phabricator.wikimedia.org/T330209 (hashar) >>! In T330209#8753449, @hashar wrote: > The train script breaking on the branch is from https://gitlab.wikimedia.org/...
[14:00:39] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.41.0-wmf.3 deployment blockers - https://phabricator.wikimedia.org/T330209 (hashar) Open→Resolved I am claiming 1.41.0-wmf.3 to be a success.
[14:34:01] GitLab (Infrastructure), serviceops-collab: Troubleshoot partman config for two additional disks on GitLab hosts - https://phabricator.wikimedia.org/T333674 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin2002 for host gitlab2003.wikimedia.org with OS bullseye executed wit...
[14:51:45] Release-Engineering-Team (Radar), Developer-Advocacy, MediaWiki-Action-API, User-notice: Standardise procedures for deprecating public-facing code - https://phabricator.wikimedia.org/T114384 (Elitre) Resubscribing so I won't forget to check what the current standards are. Also FYI @VirginiaPounds...
[15:15:06] Continuous-Integration-Infrastructure, OOUI, Patch-For-Review, Regression: OOUI PHP demos page is broken (again) - https://phabricator.wikimedia.org/T322357 (Jdforrester-WMF)
[15:21:29] Release-Engineering-Team (Priority Backlog 📥), Structured Data Engineering, Structured-Data-Backlog (Current Work), Wiki-Setup (Delete / Redirect): Close and delete TestCommons from production - https://phabricator.wikimedia.org/T213295 (Jdforrester-WMF)
[15:49:31] Release-Engineering-Team, Wikimedia-Phabricator-Extensions, serviceops-collab, Patch-For-Review: Disable "Browse Gerrit Projects" on https://phabricator.wikimedia.org/r/ - https://phabricator.wikimedia.org/T228507 (brennen) a:Dzahn→brennen > Done in https://gitlab.wikimedia.org/repos/phab...
[16:05:41] GitLab (Infrastructure), serviceops-collab, Patch-For-Review: Troubleshoot partman config for two additional disks on GitLab hosts - https://phabricator.wikimedia.org/T333674 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin2002 for host gitlab2003.wikimedia.org wit...
[16:52:14] GitLab (CI & Job Runners), Release-Engineering-Team (GitLab V: Event Horizon 🌄): buildkitd: Require use of the blubber frontend when running on trusted runners. - https://phabricator.wikimedia.org/T329220 (thcipriani) a:demon
[16:59:03] GitLab (Infrastructure), serviceops-collab: Troubleshoot partman config for two additional disks on GitLab hosts - https://phabricator.wikimedia.org/T333674 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin2002 for host gitlab2003.wikimedia.org with OS bullseye executed wit...
[18:29:28] Release-Engineering-Team (Radar), serviceops-collab: sre-collab/releng: convert or remove all nrpe::monitor_service checks - https://phabricator.wikimedia.org/T334250 (Dzahn)
[18:41:17] (CR) Michael Große: Zuul: [mediawiki/extensions/EntitySchema] Add Wikibase dep (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/906041 (https://phabricator.wikimedia.org/T333661) (owner: Lucas Werkmeister (WMDE))
[18:53:36] Continuous-Integration-Config, Moderator-Tools-Team, PageTriage, Growth-Team (Current Sprint), Patch-For-Review: Add PageTriage to gated extensions - https://phabricator.wikimedia.org/T333534 (Tgr) >>! In T333534#8742978, @Novem_Linguae wrote: > For my own learning, is the "gate" documented a...
[19:35:26] GitLab, Release-Engineering-Team: Error when using mutli arch build on gitlab with blubber and kokkuri - https://phabricator.wikimedia.org/T334254 (dcausse)
[19:41:26] (PS3) Jforrester: Zuul: Add all @wikia/fandom domains to the CI allow list [integration/config] - https://gerrit.wikimedia.org/r/906608 (owner: Reedy)
[19:41:37] (CR) Jforrester: [C: +1] "LGTM; should I deploy?" [integration/config] - https://gerrit.wikimedia.org/r/906608 (owner: Reedy)
[19:42:18] (CR) Reedy: Zuul: Add all @wikia/fandom domains to the CI allow list (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/906608 (owner: Reedy)
[19:43:17] (CR) Jforrester: [C: +2] Zuul: Add all @wikia/fandom domains to the CI allow list [integration/config] - https://gerrit.wikimedia.org/r/906608 (owner: Reedy)
[19:44:27] (Merged) jenkins-bot: Zuul: Add all @wikia/fandom domains to the CI allow list [integration/config] - https://gerrit.wikimedia.org/r/906608 (owner: Reedy)
[19:45:04] Diffusion, Phabricator, serviceops-collab, Patch-For-Review: Redirect https://phabricator.wikimedia.org/r/ to https://gerrit.wikimedia.org/g/ - https://phabricator.wikimedia.org/T324311 (Dzahn) There is new activity on T228507. Let's see how this affects this ticket.
[19:47:38] Release-Engineering-Team, Wikimedia-Phabricator-Extensions, serviceops-collab, Patch-For-Review: Disable "Browse Gerrit Projects" on https://phabricator.wikimedia.org/r/ - https://phabricator.wikimedia.org/T228507 (Dzahn) Once we get this disabled it may or may not also close this other ticket: T...
[19:49:13] !log Zuul: Add all @wikia/fandom domains to the CI allow list
[19:49:15] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:49:24] Diffusion, Phabricator, serviceops-collab, Patch-For-Review: Redirect https://phabricator.wikimedia.org/r/ to https://gerrit.wikimedia.org/g/ - https://phabricator.wikimedia.org/T324311 (Dzahn) a:Dzahn→None
[19:49:31] Diffusion, Phabricator, serviceops-collab, Patch-For-Review: Redirect https://phabricator.wikimedia.org/r/ to https://gerrit.wikimedia.org/g/ - https://phabricator.wikimedia.org/T324311 (Dzahn) watching, just wanted to make clear it's up for grabs
[20:19:35] GitLab (Infrastructure), SRE, ops-eqiad, serviceops-collab: Install additional SSDs on gitlab1004.wikimedia.org (B1) - https://phabricator.wikimedia.org/T333997 (wiki_willy) a:Jclark-ctr
[20:19:58] GitLab (Infrastructure), SRE, ops-eqiad, serviceops-collab: Install additional SSDs on gitlab1003.wikimedia.org (A3) - https://phabricator.wikimedia.org/T333996 (wiki_willy) a:Jclark-ctr
[20:29:34] Gerrit: Reviewer-bot option to be added as CC instead of reviewer - https://phabricator.wikimedia.org/T334118 (valhallasw) Hiya, the reviewer bot is in very low maintenance mode a a) Gerrit has much more powerful features than when the bot was originally written, and b) we're moving over to Gitlab at... some...
[21:10:25] GitLab (Administration, Settings & Policy), Release-Engineering-Team (Priority Backlog 📥), Privacy Engineering, Product-Analytics, and 2 others: Request for Private repos to be enabled - https://phabricator.wikimedia.org/T305082 (Htriedman) Hi all! I've read this thread and I want to weigh in on...