[00:22:18] RECOVERY - SSH on contint1001.mgmt is OK: SSH OK - OpenSSH_6.6 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[00:32:46] Release-Engineering-Team (Done by Wed 24 Nov 🧟), Release, Train Deployments, User-Ladsgroup: 1.38.0-wmf.9 seems to have introduced a memory leak - https://phabricator.wikimedia.org/T296098 (tstarling) It tried to get a core dump: ` Dec 10 20:52:55 wtp1025 kernel: [21268770.342075] Core dump to |...
[06:12:41] (Queue (Jenkins jobs + Zuul functions) alert) firing: Queue (Jenkins jobs + Zuul functions) alert - https://alerts.wikimedia.org
[06:19:52] PROBLEM - Work requests waiting in Zuul Gearman server on contint2001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [400.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[06:27:41] (Queue (Jenkins jobs + Zuul functions) alert) firing: (2) Queue (Jenkins jobs + Zuul functions) alert - https://alerts.wikimedia.org
[06:47:41] (Queue (Jenkins jobs + Zuul functions) alert) resolved: Queue (Jenkins jobs + Zuul functions) alert - https://alerts.wikimedia.org
[06:47:59] RECOVERY - Work requests waiting in Zuul Gearman server on contint2001 is OK: OK: Less than 100.00% above the threshold [200.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[07:03:12] Release-Engineering-Team (Done by Wed 24 Nov 🧟), Release, Train Deployments, User-Ladsgroup: 1.38.0-wmf.9 seems to have introduced a memory leak - https://phabricator.wikimedia.org/T296098 (Ladsgroup) >>! In T296098#7565279, @tstarling wrote: > It tried to get a core dump: > > ` > Dec 10 20:52:5...
[07:18:55] Release-Engineering-Team, Scap: scap master rsyncd: Rename common to mediawiki-staging - https://phabricator.wikimedia.org/T297510 (hashar) `common` is indeed the legacy name it comes from `/home/wikipedia/common` we have used in the early days until the grand renaming of 2014 done by @ori with https://...
[07:40:34] Release-Engineering-Team (Done by Wed 24 Nov 🧟), Release, Train Deployments, User-Ladsgroup: 1.38.0-wmf.9 seems to have introduced a memory leak - https://phabricator.wikimedia.org/T296098 (tstarling) OK, I didn't realise mw1414 was depooled with high memory usage, that is useful. I looked at /pr...
[08:29:24] (CR) Hashar: [C: +2] "Kudos Apple for shipping a legacy bash (bash 4 got released in 2009!)" [tools/train-dev] - https://gerrit.wikimedia.org/r/745909 (owner: Ahmon Dancy)
[08:30:09] (Merged) jenkins-bot: mirror-repos.sh: Handle MacOS bash [tools/train-dev] - https://gerrit.wikimedia.org/r/745909 (owner: Ahmon Dancy)
[08:31:49] (CR) Hashar: [C: +2] Zuul: Add Jeffrey Wang to the CI trusted users list [integration/config] - https://gerrit.wikimedia.org/r/746010 (owner: Zoranzoki21)
[08:34:18] (Merged) jenkins-bot: Zuul: Add Jeffrey Wang to the CI trusted users list [integration/config] - https://gerrit.wikimedia.org/r/746010 (owner: Zoranzoki21)
[08:53:47] Beta-Cluster-Infrastructure, Wikidata, Wikidata-Query-Service, wdwb-tech: deployment-wdqs01 Puppet failure - https://phabricator.wikimedia.org/T296959 (MPhamWMF)
[08:55:34] (CR) Hashar: [C: +2] "Not sure why I have missed this one ;)" [integration/config] - https://gerrit.wikimedia.org/r/740916 (https://phabricator.wikimedia.org/T296287) (owner: Umherirrender)
[08:57:27] (Merged) jenkins-bot: Zuul: [extensions/WikiEditor] Add ConfirmEdit as phan dependency [integration/config] - https://gerrit.wikimedia.org/r/740916 (https://phabricator.wikimedia.org/T296287) (owner: Umherirrender)
[12:10:37] (Abandoned) Hashar: Add tox-bullseye container [integration/config] - https://gerrit.wikimedia.org/r/713972 (https://phabricator.wikimedia.org/T289222) (owner: Legoktm)
[12:20:57] (CR) Hashar: [C: +2] Update Quibble image node to version 14 [integration/quibble] - https://gerrit.wikimedia.org/r/745728 (https://phabricator.wikimedia.org/T294931) (owner: Kosta Harlan)
[12:40:35] (Merged) jenkins-bot: Update Quibble image node to version 14 [integration/quibble] - https://gerrit.wikimedia.org/r/745728 (https://phabricator.wikimedia.org/T294931) (owner: Kosta Harlan)
[13:53:56] (PS1) Giuseppe Lavagetto: make-container-image: add missing symlinks to the webserver image [tools/release] - https://gerrit.wikimedia.org/r/746870 (https://phabricator.wikimedia.org/T285232)
[14:48:19] GitLab (Infrastructure), serviceops, Security: GitLab Runner Critical Security Release: 14.5.2, 14.4.2, and 14.3.4 - https://phabricator.wikimedia.org/T297581 (sbassett)
[14:54:34] Project beta-scap-sync-world build #30985: ABORTED in 1 hr 0 min: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/30985/
[15:08:41] Release-Engineering-Team (Doing), MediaWiki-Core-Tests, MediaWiki-ResourceLoader, MW-1.38-notes (1.38.0-wmf.13; 2021-12-13), Performance-Team (Radar): Move bundlesize test from npm script to MediaWikiIntegrationTest - https://phabricator.wikimedia.org/T255149 (Func) Php code coverage test on...
[15:23:15] Beta-Cluster-Infrastructure, Maps, Product-Infrastructure-Team-Backlog: Puppet config is broken for the maps instance on deployment-prep - https://phabricator.wikimedia.org/T291624 (MSantos) Open→Resolved a:MSantos @Majavah for a while now we have been using the project `maps-experiments`...
[15:39:20] (PS1) Kormat: zuul: Add wmfdb [integration/config] - https://gerrit.wikimedia.org/r/746883 (https://phabricator.wikimedia.org/T297616)
[15:54:00] Continuous-Integration-Infrastructure, docker-pkg: docker-pkg 3.0.1 fails with: 404 Client Error: Not Found ("pull access denied for wikimedia-bullseye, - https://phabricator.wikimedia.org/T297619 (hashar)
[15:54:24] (CR) Hashar: dockerfiles: Use opcache optimizations with built-in PHP server (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/738281 (owner: Kosta Harlan)
[15:55:45] Continuous-Integration-Infrastructure, docker-pkg: docker-pkg 3.0.1 fails with: 404 Client Error: Not Found ("pull access denied for wikimedia-bullseye, - https://phabricator.wikimedia.org/T297619 (Majavah) > ` > name=config.yaml,lang=yaml > registry: docker-registry.wikimedia.org > namespace: releng > s...
[16:00:05] (CR) Ahmon Dancy: [C: +1] "Looks ok to me. See https://releases-jenkins.wikimedia.org/view/incremental%20image%20build/job/build-webserver-image/261/console for a r" [tools/release] - https://gerrit.wikimedia.org/r/746870 (https://phabricator.wikimedia.org/T285232) (owner: Giuseppe Lavagetto)
[16:07:55] Beta-Cluster-Infrastructure, Maps, Product-Infrastructure-Team-Backlog: Puppet config is broken for the maps instance on deployment-prep - https://phabricator.wikimedia.org/T291624 (dancy) Thanks @MSantos!
[16:40:53] Continuous-Integration-Config, Security-Team, SecTeam-Processed: Add CI for mediawiki/extensions/SecurityApi - https://phabricator.wikimedia.org/T297243 (Reedy) Open→Resolved
[17:02:51] (CR) Jforrester: "Oh, thanks for this!" [integration/quibble] - https://gerrit.wikimedia.org/r/745728 (https://phabricator.wikimedia.org/T294931) (owner: Kosta Harlan)
[17:04:57] Deployments, bacula, Sustainability (Incident Followup): Local private files on deployment host should be backed up somewhere - https://phabricator.wikimedia.org/T69818 (Krinkle) >>! In T69818#7427469, @jcrespo wrote: > I didn't modify the original file permissions. The original on `/srv/mediawiki-s...
[17:13:10] Deployments, bacula, Sustainability (Incident Followup): Local private files on deployment host should be backed up somewhere - https://phabricator.wikimedia.org/T69818 (jcrespo) Thanks, this was useful to detect something applicable for future automated recoveries. Bacula backups full paths, and kee...
[17:19:43] Deployments, bacula, Sustainability (Incident Followup): Local private files on deployment host should be backed up somewhere - https://phabricator.wikimedia.org/T69818 (Krinkle) I see the group of the parent dirs is now wikidev, but note that the issue was not with the parent dirs. I was able to ent...
[17:24:27] Project-Admins, Wikidata: Create project tag for Schematree_recommender - https://phabricator.wikimedia.org/T296599 (Reedy)
[17:28:16] Deployments, bacula, Sustainability (Incident Followup): Local private files on deployment host should be backed up somewhere - https://phabricator.wikimedia.org/T69818 (jcrespo) Actually, you may have found something that I am unable to answer you on: that directory seems to have the setuid bit on,...
[17:36:46] (PS1) Ahmon Dancy: auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903
[17:40:05] (CR) Ahmon Dancy: [C: -1] "didn't seem to work." [tools/release] - https://gerrit.wikimedia.org/r/746903 (owner: Ahmon Dancy)
[17:48:51] (PS2) Ahmon Dancy: auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903
[17:53:12] (PS3) Ahmon Dancy: auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903
[18:01:34] Continuous-Integration-Infrastructure, docker-pkg: docker-pkg 3.0.1 fails with: 404 Client Error: Not Found ("pull access denied for wikimedia-bullseye, - https://phabricator.wikimedia.org/T297619 (hashar) Thank you @Majavah ! docker-pkg still comes with `seed_image: wikimedia-stretch` which apparently n...
[18:09:06] GitLab (Infrastructure), serviceops: Migrate gitlab-test instance to puppet - https://phabricator.wikimedia.org/T297411 (Jelto) > I created a new instance called "runner-bullseye" with the idea to put the gitlab_runner puppet class on it and see how it goes and do so on bullseye. But I did not get to act...
[18:11:42] (PS4) Ahmon Dancy: auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903
[18:12:37] (CR) jerkins-bot: [V: -1] auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903 (owner: Ahmon Dancy)
[18:13:50] (PS5) Ahmon Dancy: auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903
[18:19:09] (PS6) Ahmon Dancy: auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903
[18:26:43] Release-Engineering-Team (Doing), Patch-For-Review, Release, Train Deployments, User-brennen: 1.38.0-wmf.12 deployment blockers - https://phabricator.wikimedia.org/T293953 (dancy) Rolled forward to wmf.12 after https://gerrit.wikimedia.org/r/746909 was deployed.
[18:51:41] (PS7) Ahmon Dancy: auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903
[18:51:48] (PS2) Jforrester: zuul: [operations/software/wmfdb] Add basic tox CI [integration/config] - https://gerrit.wikimedia.org/r/746883 (https://phabricator.wikimedia.org/T297616) (owner: Kormat)
[18:51:55] (CR) Jforrester: [C: +2] zuul: [operations/software/wmfdb] Add basic tox CI [integration/config] - https://gerrit.wikimedia.org/r/746883 (https://phabricator.wikimedia.org/T297616) (owner: Kormat)
[18:52:19] (CR) jerkins-bot: [V: -1] auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903 (owner: Ahmon Dancy)
[18:53:21] (PS8) Ahmon Dancy: auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903
[18:54:21] (Merged) jenkins-bot: zuul: [operations/software/wmfdb] Add basic tox CI [integration/config] - https://gerrit.wikimedia.org/r/746883 (https://phabricator.wikimedia.org/T297616) (owner: Kormat)
[18:55:28] !log Zuul: [operations/software/wmfdb] Add basic tox CI for T297616
[18:55:31] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:55:31] T297616: Run CI for wmfdb CRs - https://phabricator.wikimedia.org/T297616
[18:57:41] (PS9) Ahmon Dancy: auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903
[19:01:27] (PS10) Ahmon Dancy: auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903
[19:03:44] (PS11) Ahmon Dancy: auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903
[19:04:56] (CR) Ahmon Dancy: [C: +2] auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903 (owner: Ahmon Dancy)
[19:09:28] Deployments, bacula, Sustainability (Incident Followup): Local private files on deployment host should be backed up somewhere - https://phabricator.wikimedia.org/T69818 (jcrespo) Check now I've made another restore into /home/krinkle/restore2/ and I think you should be able to see it. I think I see t...
[19:09:56] (CR) Jforrester: [C: +2] Add 'parsoid' to 'InputBox' extension dependencies [integration/config] - https://gerrit.wikimedia.org/r/745929 (https://phabricator.wikimedia.org/T272943) (owner: C. Scott Ananian)
[19:10:12] (Merged) jenkins-bot: auto-stage: Prevent *.orig files from being created [tools/release] - https://gerrit.wikimedia.org/r/746903 (owner: Ahmon Dancy)
[19:12:49] (Merged) jenkins-bot: Add 'parsoid' to 'InputBox' extension dependencies [integration/config] - https://gerrit.wikimedia.org/r/745929 (https://phabricator.wikimedia.org/T272943) (owner: C. Scott Ananian)
[19:12:53] !log Zuul: Add 'parsoid' to 'InputBox' extension dependencies for T272943
[19:12:56] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:12:56] T272943: Make InputBox extension compatible with Parsoid - https://phabricator.wikimedia.org/T272943
[19:14:27] (PS1) Ahmon Dancy: auto-stage: Remove cleanup of patch backup files [tools/release] - https://gerrit.wikimedia.org/r/746940
[19:15:07] (PS2) Ahmon Dancy: auto-stage: Remove cleanup of patch backup files [tools/release] - https://gerrit.wikimedia.org/r/746940
[19:17:43] (CR) Ahmon Dancy: [C: +2] auto-stage: Remove cleanup of patch backup files [tools/release] - https://gerrit.wikimedia.org/r/746940 (owner: Ahmon Dancy)
[19:19:38] (Merged) jenkins-bot: auto-stage: Remove cleanup of patch backup files [tools/release] - https://gerrit.wikimedia.org/r/746940 (owner: Ahmon Dancy)
[19:22:31] (PS1) Ahmon Dancy: rsync_cdbs stuff [tools/scap] - https://gerrit.wikimedia.org/r/746941
[19:25:50] (CR) jerkins-bot: [V: -1] rsync_cdbs stuff [tools/scap] - https://gerrit.wikimedia.org/r/746941 (owner: Ahmon Dancy)
[19:37:09] (PS2) Ahmon Dancy: rsync_cdbs stuff [tools/scap] - https://gerrit.wikimedia.org/r/746941
[19:37:49] (CR) jerkins-bot: [V: -1] rsync_cdbs stuff [tools/scap] - https://gerrit.wikimedia.org/r/746941 (owner: Ahmon Dancy)
[19:39:59] (PS3) Ahmon Dancy: rsync_cdbs stuff [tools/scap] - https://gerrit.wikimedia.org/r/746941
[19:44:53] (PS4) Ahmon Dancy: rsync_cdbs stuff [tools/scap] - https://gerrit.wikimedia.org/r/746941
[19:57:21] (PS5) Ahmon Dancy: rsync_cdbs stuff [tools/scap] - https://gerrit.wikimedia.org/r/746941
[19:58:02] (PS6) Ahmon Dancy: rsync_cdbs stuff [tools/scap] - https://gerrit.wikimedia.org/r/746941 (https://phabricator.wikimedia.org/T297326)
[20:02:34] (PS7) Ahmon Dancy: Add rsync_cdbs configuration parameter [tools/scap] - https://gerrit.wikimedia.org/r/746941 (https://phabricator.wikimedia.org/T297326)
[20:29:25] (CR) Ahmon Dancy: [C: +1] "Tested in train-dev." [tools/scap] - https://gerrit.wikimedia.org/r/746941 (https://phabricator.wikimedia.org/T297326) (owner: Ahmon Dancy)
[20:51:30] (CR) 20after4: [C: +1] "Seems sensible." [tools/release] - https://gerrit.wikimedia.org/r/745937 (https://phabricator.wikimedia.org/T191743) (owner: Jdlrobson)
[20:55:37] (CR) 20after4: [C: +1] "So if I remember correctly, CDB files change radically even with a minor change in their content due to the way they are structured, ther" [tools/scap] - https://gerrit.wikimedia.org/r/746941 (https://phabricator.wikimedia.org/T297326) (owner: Ahmon Dancy)
[20:58:23] (CR) Ahmon Dancy: [C: +1] Add rsync_cdbs configuration parameter (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/746941 (https://phabricator.wikimedia.org/T297326) (owner: Ahmon Dancy)
[21:03:21] Phabricator, Design: Task symbol for "Open" suggests "alert" and is inconsistent in its placement when task types are also used - https://phabricator.wikimedia.org/T297249 (mmodell) Agreed that it's not the best icon choice or placement. The markup used in that UI makes it difficult to achieve a consist...
[21:14:35] Release-Engineering-Team (Next), Patch-For-Review, Release, Train Deployments: 1.38.0-wmf.13 deployment blockers - https://phabricator.wikimedia.org/T293954 (Arlolra) ##### Risky Patch! 🚂🔥 * **Change**: https://gerrit.wikimedia.org/r/c/mediawiki/vendor/+/746936 - Rollout of patches to implemen...
[21:21:00] Project-Admins, User-Luke081515: What Can I Do For Wikimedia - https://phabricator.wikimedia.org/T124814 (Aklapper) Resolved→Declined Boldly declining as the domain whatcanidoforwikimedia.org had been squatted and is now expired, and https://github.com/wikimedia-asknot is a 404.
[21:50:14] Release-Engineering-Team (Next), Patch-For-Review, Release, Train Deployments: 1.38.0-wmf.13 deployment blockers - https://phabricator.wikimedia.org/T293954 (Catrope)
[22:44:20] Release-Engineering-Team (Done by Wed 24 Nov 🧟), Release, Train Deployments, User-Ladsgroup: 1.38.0-wmf.9 seems to have introduced a memory leak - https://phabricator.wikimedia.org/T296098 (tstarling) I dumped some random parts of the heap of a php-fpm7.2 process on mw1414. It looks like DB query...
[23:16:54] GitLab, serviceops: upgrade gitlab-runners to bullseye - https://phabricator.wikimedia.org/T297659 (Dzahn)
[23:22:07] GitLab (Infrastructure), serviceops: Migrate gitlab-test instance to puppet - https://phabricator.wikimedia.org/T297411 (Dzahn) >>! In T297411#7567146, @Jelto wrote: > Thanks for thinking about moving the runners to bullseye. I'm not sure if this task has a lot of overlap with the migration of the Runner...