[00:41:21] Project-Admins: Create project tag for Header Tabs extension - https://phabricator.wikimedia.org/T299619 (Yaron_Koren)
[01:28:32] Phabricator, Security-Team, Security: Audit members of acl*security for more than x duration of no activity (Jan 2022) - https://phabricator.wikimedia.org/T299400 (Aklapper) @Dsharpe: I've pasted P18895.
[02:43:18] (CR) Krinkle: dockerfiles: Add php-excimer to quibble (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/748312 (https://phabricator.wikimedia.org/T225730) (owner: Ladsgroup)
[08:06:42] (PS1) Majavah: Stop branching UserMerge [tools/release] - https://gerrit.wikimedia.org/r/755535 (https://phabricator.wikimedia.org/T216089)
[10:48:12] Phabricator, Project-Admins: Unarchive WMUA-tech project and create a custom security policy for its members - https://phabricator.wikimedia.org/T286866 (Aklapper) @Base: Does that custom form help?
[10:49:03] GitLab (Support), Release-Engineering-Team (Doing), User-brennen: Couldn't fork a gitlab repository - https://phabricator.wikimedia.org/T295468 (Aklapper) >>! In T295468#7514003, @brennen wrote: > I've seen it enough that I'm not confident it's gone. Will monitor for a bit. @brennen: Two months late...
[10:52:20] Continuous-Integration-Infrastructure, Release-Engineering-Team (Doing), ci-test-error (WMF-deployed Build Failure): TAR_ENTRY_ERROR ENOSPC: no space left on device - https://phabricator.wikimedia.org/T292729 (hashar) Open→Resolved The immediate fixes have been to clean up the disk space. In...
[10:55:33] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen): Move all Wikimedia CI (WMCS integration project) instances from stretch to buster/bullseye - https://phabricator.wikimedia.org/T252071 (hashar) To summarize, we would need a flavor of instances that has more disks: * more disk, I ha...
[10:55:38] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen): Move all Wikimedia CI (WMCS integration project) instances from stretch to buster/bullseye - https://phabricator.wikimedia.org/T252071 (hashar)
[10:59:56] Phabricator: "Unhandled Exception: Call to a member function getAppliedTransactionPHIDs() on bool" when viewing old Herald Transcript - https://phabricator.wikimedia.org/T294860 (Aklapper) ` $:acko\> grep -r getAppliedTransactionPHIDs . ./phabricator/src/applications/herald/storage/transcript/HeraldObjectTra...
[11:32:56] hashar: should I try another gate-and-submit build of the Termbox change, now that docker layers are apparently cleaned up better?
[11:37:10] Lucas_WMDE: yes please!
[11:37:22] apparently contint1001 had toooo many layers and images and containers
[11:37:42] my assumption is that it caused IO to be crippled
[11:38:00] checking
[11:39:46] alright, +2ed it
[11:39:58] only one change at a time for now
[11:40:15] oof, Zuul is fairly full with core changes right now
[11:45:34] https://integration.wikimedia.org/ci/job/termbox-pipeline-rehearse/95/console is running on contint1001…
[11:49:19] we will see how it behaves
[11:49:36] at least it has least images now
[11:50:45] i will try to clean a few more
[11:51:41] !log Cleaning very old Docker images on contint1001.wikimedia.org
[11:51:42] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[11:53:38] there are plenty of old quibble images
[11:54:40] Phabricator, Project-Admins: Unarchive WMUA-tech project and create a custom security policy for its members - https://phabricator.wikimedia.org/T286866 (Base) @Aklapper , oh sorry, I have missed the comment from mmodell.
I will check
[11:56:45] Lucas_WMDE: looks like the build is well behaving so far https://integration.wikimedia.org/ci/blue/organizations/jenkins/termbox-pipeline-rehearse/detail/termbox-pipeline-rehearse/95/ :D
[11:57:05] yup, looking pretty good
[11:57:19] I think it’s mainly done with the risky stuff by now, it just takes a while to copy node_modules over
[11:59:20] success \o/
[11:59:49] the box has spinning hdd
[12:00:04] and I think the partition simply had wayy too many files to deal with
[12:00:50] (CR) Awight: [C: +1] "Looks right! A next step could be to wrap the parallel steps up as a ParallelNpmInstall command, to clean up cmd.py ." [integration/quibble] - https://gerrit.wikimedia.org/r/754866 (owner: Kosta Harlan)
[12:04:18] !log contint1001 `docker image prune`
[12:04:19] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[12:04:35] Total reclaimed space: 13.8GB
[12:04:51] nice
[12:07:05] !log contint1001 deleting all the Docker images (they will be pulled as needed)
[12:07:05] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[12:07:10] mass clean ;)
[12:08:40] Lucas_WMDE: and I should have looked at grafana
[12:08:52] we went from 958GB usage to 34G after the clean up https://grafana.wikimedia.org/d/000000377/host-overview?viewPanel=28&orgId=1&var-server=contint1001&var-datasource=eqiad%20prometheus%2Fops&var-cluster=ci&from=now-2d&to=now
[12:10:34] !log contint2001 : docker container prune && docker image prune
[12:10:35] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[12:12:34] !log contint2001 deleting all the Docker images (they will be pulled as needed)
[12:12:35] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[12:14:09] Lucas_WMDE: feel free to +2 a few more.
It should be fine now that the disk is almost empty
[12:14:34] I might try +2ing two at once and seeing if we’ll get a Node OOM again
[12:14:37] thanks
[13:02:52] is something stuck? https://integration.wikimedia.org/zuul/ usually gate-and-submit-l10n is very fast, but now waiting for an hour
[13:04:15] I think it’s just busy
[13:04:26] I’ve had two successful gate-and-submit-wmf’s just now, at least
[13:08:54] yeah, they all got processed just now
[13:09:11] ok
[13:20:24] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T293960 (Zabe)
[13:38:58] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T293960 (Lucas_Werkmeister_WMDE)
[15:13:11] Phabricator, WMDE-Technical-Wishes-Maintenance, Technical-Debt: Kill Phab's sprint.phragile-uri config setting - https://phabricator.wikimedia.org/T275188 (thiemowmde)
[15:22:40] Release-Engineering-Team, MW-on-K8s, SRE, serviceops: Make scap deploy to kubernetes together with the legacy systems - https://phabricator.wikimedia.org/T299648 (Joe)
[15:23:43] I am rebalancing partitions on some agents to give more disk space
[15:50:18] (PS1) Accraze: inference: add draftquality-transformer pipeline [integration/config] - https://gerrit.wikimedia.org/r/755721 (https://phabricator.wikimedia.org/T298989)
[16:11:50] Release-Engineering-Team (Next), Patch-For-Review, Release, Train Deployments: 1.38.0-wmf.18 deployment blockers - https://phabricator.wikimedia.org/T293959 (thcipriani)
[16:30:08] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T293960 (thcipriani)
[16:31:29] !log Rebalancing /var/lib/docker and /srv partitions on CI agents | https://gerrit.wikimedia.org/r/755713
[16:31:31] Logged the message at
https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:32:34] hey Antoine!
[16:39:53] Continuous-Integration-Infrastructure, Release-Engineering-Team (Doing), Release Pipeline: Pipeline lib still leaks containers on contint1001 / contint2001 - https://phabricator.wikimedia.org/T290608 (dancy) Resolved→Open
[16:40:33] Continuous-Integration-Infrastructure, Release-Engineering-Team (Doing), Release Pipeline: Pipeline lib still leaks containers on contint1001 / contint2001 - https://phabricator.wikimedia.org/T290608 (dancy) Reopened because containers are still accumulating on contint1001: ` dancy@contint1001:~$ doc...
[16:49:28] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T293960 (Majavah)
[17:05:19] Release-Engineering-Team (Next), Patch-For-Review, Release, Train Deployments: 1.38.0-wmf.18 deployment blockers - https://phabricator.wikimedia.org/T293959 (Majavah)
[17:16:58] (PS1) Robert Vogel: Add `BlueSpiceProDistributionConnector` [integration/config] - https://gerrit.wikimedia.org/r/755734
[17:31:17] Release-Engineering-Team (Next), Patch-For-Review, Release, Train Deployments: 1.38.0-wmf.18 deployment blockers - https://phabricator.wikimedia.org/T293959 (Jdlrobson)
[17:48:28] (PS1) Ahmon Dancy: scap prep auto: mods for master branch handling [tools/scap] - https://gerrit.wikimedia.org/r/755742
[17:49:12] (CR) jerkins-bot: [V: -1] scap prep auto: mods for master branch handling [tools/scap] - https://gerrit.wikimedia.org/r/755742 (owner: Ahmon Dancy)
[17:50:19] (PS2) Ahmon Dancy: scap prep auto: mods for master branch handling [tools/scap] - https://gerrit.wikimedia.org/r/755742
[17:52:57] Continuous-Integration-Infrastructure, Release-Engineering-Team (Doing), ci-test-error (WMF-deployed Build Failure): TAR_ENTRY_ERROR ENOSPC: no space left on device -
https://phabricator.wikimedia.org/T292729 (kostajh) Seen just now in https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php...
[17:54:02] (CR) Ahmon Dancy: [C: +2] scap prep auto: mods for master branch handling [tools/scap] - https://gerrit.wikimedia.org/r/755742 (owner: Ahmon Dancy)
[17:54:40] (Merged) jenkins-bot: scap prep auto: mods for master branch handling [tools/scap] - https://gerrit.wikimedia.org/r/755742 (owner: Ahmon Dancy)
[17:54:48] (CR) Dduvall: [C: +2] mirror-repos.sh: Get the name of the default branch [tools/train-dev] - https://gerrit.wikimedia.org/r/754072 (owner: Ahmon Dancy)
[17:54:57] Continuous-Integration-Infrastructure, Release-Engineering-Team (Doing), ci-test-error (WMF-deployed Build Failure): TAR_ENTRY_ERROR ENOSPC: no space left on device - https://phabricator.wikimedia.org/T292729 (hashar) Grblmbl. I have just completed the partition shuffle via https://gerrit.wikimedia.o...
[17:55:23] (CR) Ahmon Dancy: [C: +2] mirror-repos.sh: Move --prune into git_remote_update [tools/train-dev] - https://gerrit.wikimedia.org/r/754073 (owner: Ahmon Dancy)
[17:56:08] (Merged) jenkins-bot: mirror-repos.sh: Get the name of the default branch [tools/train-dev] - https://gerrit.wikimedia.org/r/754072 (owner: Ahmon Dancy)
[17:56:32] Continuous-Integration-Infrastructure, Release-Engineering-Team (Doing), ci-test-error (WMF-deployed Build Failure): TAR_ENTRY_ERROR ENOSPC: no space left on device - https://phabricator.wikimedia.org/T292729 (hashar) Resolved→Open
[17:56:35] (PS2) Bartosz Dziewoński: Zuul: [DiscussionTools] Add Gadgets as dependency for Phan jobs [integration/config] - https://gerrit.wikimedia.org/r/755502
[17:57:16] (Merged) jenkins-bot: mirror-repos.sh: Move --prune into git_remote_update [tools/train-dev] - https://gerrit.wikimedia.org/r/754073 (owner: Ahmon Dancy)
[17:58:10] Release-Engineering-Team (Next), Release, Train Deployments:
1.38.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T293960 (Ladsgroup)
[18:00:19] kostajh: sorry about the disk space issue :]
[18:00:34] I actually pushed a patch that shrinks the docker partition to have a larger /srv
[18:06:43] (PS1) Hashar: Revert "jjb: stop using host src for Quibble jobs" [integration/config] - https://gerrit.wikimedia.org/r/755743 (https://phabricator.wikimedia.org/T292729)
[18:07:44] !log Updating Quibble jobs to have MediaWiki files written on the host's /srv partition (38G) instead of inside the container which ends in /var/lib/docker (24G) https://gerrit.wikimedia.org/r/755743 # T292729
[18:07:46] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:07:47] T292729: TAR_ENTRY_ERROR ENOSPC: no space left on device - https://phabricator.wikimedia.org/T292729
[18:08:11] that should do it
[18:08:22] dinner time, be back in a couple hours or so
[18:08:29] the jobs are deploying
[18:15:11] (CR) Hashar: [C: +2] Revert "jjb: stop using host src for Quibble jobs" [integration/config] - https://gerrit.wikimedia.org/r/755743 (https://phabricator.wikimedia.org/T292729) (owner: Hashar)
[18:17:41] (Merged) jenkins-bot: Revert "jjb: stop using host src for Quibble jobs" [integration/config] - https://gerrit.wikimedia.org/r/755743 (https://phabricator.wikimedia.org/T292729) (owner: Hashar)
[18:18:36] Continuous-Integration-Infrastructure, Release-Engineering-Team (Doing), Patch-For-Review, ci-test-error (WMF-deployed Build Failure): TAR_ENTRY_ERROR ENOSPC: no space left on device - https://phabricator.wikimedia.org/T292729 (hashar) Open→Resolved Claiming it is finally solved. We might...
[18:30:26] Release-Engineering-Team (Next), Patch-For-Review, Release, Train Deployments: 1.38.0-wmf.18 deployment blockers - https://phabricator.wikimedia.org/T293959 (Majavah)
[19:10:36] !log Unpacking scap (4.1.1-1+0~20220120175448.144~1.gbp517f9d) over (4.1.1-1+0~20220113154148.133~1.gbp6e3a17) on deploy03
[19:10:38] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:19:53] !log Pausing beta Jenkins jobs to make a copy of /srv/mediawiki-staging in preparation for testing
[19:19:54] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:39:47] Release-Engineering-Team (Yak Shaving 🐃🪒), Scap, Patch-For-Review, Performance-Team (Radar), Performance-Team-publish: Stop trying to avoid rsyncing l10n CDB files - https://phabricator.wikimedia.org/T297326 (Krinkle)
[19:39:55] Release-Engineering-Team (Yak Shaving 🐃🪒), Scap, Patch-For-Review, Performance-Team (Radar), Performance-Team-publish: Stop trying to avoid rsyncing l10n CDB files - https://phabricator.wikimedia.org/T297326 (Krinkle)
[20:03:38] (PS1) Ahmon Dancy: beta-code-update-eqiad/beta-mediawiki-config-update-eqiad use scap prep auto [integration/config] - https://gerrit.wikimedia.org/r/755763 (https://phabricator.wikimedia.org/T299163)
[20:04:05] !log Jenkins beta jobs are back online, using scap prep auto now.
[20:04:06] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:06:59] congratulations dancy !
[20:10:46] Phabricator, I18n: Adding sicilian language (scn) - https://phabricator.wikimedia.org/T299694 (XANA000)
[20:18:46] (CR) Jforrester: [C: +2] Zuul: [DiscussionTools] Add Gadgets as dependency for Phan jobs [integration/config] - https://gerrit.wikimedia.org/r/755502 (owner: Bartosz Dziewoński)
[20:19:26] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen): Move all Wikimedia CI (WMCS integration project) instances from stretch to buster/bullseye - https://phabricator.wikimedia.org/T252071 (hashar)
[20:19:32] Continuous-Integration-Infrastructure, Quibble, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)), and 2 others: Terminating MySQL takes several minutes in (Wikibase?) CI jobs - https://phabricator.wikimedia.org/T265615 (hashar)
[20:20:46] (Merged) jenkins-bot: Zuul: [DiscussionTools] Add Gadgets as dependency for Phan jobs [integration/config] - https://gerrit.wikimedia.org/r/755502 (owner: Bartosz Dziewoński)
[20:22:04] (PS3) Jforrester: Zuul: [Kartographer] Add parsoid as dependency for CI jobs [integration/config] - https://gerrit.wikimedia.org/r/755487 (owner: MSantos)
[20:22:08] (PS4) Jforrester: Zuul: [Kartographer] Add parsoid as dependency for CI jobs [integration/config] - https://gerrit.wikimedia.org/r/755487 (owner: MSantos)
[20:22:23] (CR) Jforrester: [C: +2] Zuul: [Kartographer] Add parsoid as dependency for CI jobs [integration/config] - https://gerrit.wikimedia.org/r/755487 (owner: MSantos)
[20:22:42] !log Zuul: [DiscussionTools] Add Gadgets as dependency for Phan jobs
[20:22:43] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:24:20] (Merged) jenkins-bot: Zuul: [Kartographer] Add parsoid as dependency for CI jobs [integration/config] - https://gerrit.wikimedia.org/r/755487 (owner: MSantos)
[20:24:29] Continuous-Integration-Infrastructure: Create
first CI agent with the new disk system - https://phabricator.wikimedia.org/T290783 (hashar) ` class profile::ci::dockervolume { labs_lvm::volume { 'docker': size => '70%FREE', } ` With https://gerrit.wikimedia.org/r/c/operations/puppet/+/755...
[20:24:49] !log Zuul: [Kartographer] Add parsoid as dependency for CI jobs
[20:24:50] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:52:14] Continuous-Integration-Infrastructure: Create first CI agent with the new disk system - https://phabricator.wikimedia.org/T290783 (hashar)
[21:33:07] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T293960 (cscott) ##### Risky Patch! 🚂🔥 * **Change**: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Gadgets/+/755744 * **Summary**: ** Deprecating a particular c...
[21:48:28] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T293960 (jeena)
[21:52:37] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T293960 (jeena)
[21:59:07] (CR) Ladsgroup: dockerfiles: Add php-excimer to quibble (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/748312 (https://phabricator.wikimedia.org/T225730) (owner: Ladsgroup)
[22:29:14] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T293960 (jeena)
[22:32:04] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T293960 (jeena)
[22:57:00] (PS2) Jforrester: Stop branching UserMerge [tools/release] - https://gerrit.wikimedia.org/r/755535 (https://phabricator.wikimedia.org/T216089) (owner: Majavah)
[23:01:32] (PS1) Cwhite:
logstash-filter-verifier: upgrade logstash to 7.16 [integration/config] - https://gerrit.wikimedia.org/r/755816 (https://phabricator.wikimedia.org/T299431)
[23:03:14] (CR) Cwhite: [C: -1] "Staged patch. Not yet ready." [integration/config] - https://gerrit.wikimedia.org/r/755816 (https://phabricator.wikimedia.org/T299431) (owner: Cwhite)
[23:22:53] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T293960 (Zabe)
[23:23:10] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T293960 (Zabe)
[23:43:21] Beta-Cluster-Infrastructure, Release-Engineering-Team, Scap, Patch-For-Review: Replace most of beta::autoupdater with scap prep auto - https://phabricator.wikimedia.org/T299163 (dancy)