[01:34:43] Beta-Cluster-Infrastructure, Account-Vanishing: Define how vanishing requests are processed on Wikimedia beta cluster - https://phabricator.wikimedia.org/T383514#10534383 (Bugreporter) We can just maintain a bot to clear vanish requests that are opened for at least 24 hours.
[08:10:03] maintenance-disconnect-full-disks build 674730 integration-agent-docker-1048 (/: 27%, /srv: 96%, /var/lib/docker: 19%): OFFLINE due to disk space
[08:15:03] maintenance-disconnect-full-disks build 674731 integration-agent-docker-1047 (/: 25%, /srv: 98%, /var/lib/docker: 21%): OFFLINE due to disk space
[08:15:03] maintenance-disconnect-full-disks build 674731 integration-agent-docker-1048 (/: 27%, /srv: 46%, /var/lib/docker: 20%): RECOVERY disk space OK
[08:20:03] maintenance-disconnect-full-disks build 674732 integration-agent-docker-1047 (/: 25%, /srv: 43%, /var/lib/docker: 20%): RECOVERY disk space OK
[10:25:02] maintenance-disconnect-full-disks build 674757 integration-agent-docker-1056 (/: 27%, /srv: 100%, /var/lib/docker: 20%): OFFLINE due to disk space
[10:30:03] maintenance-disconnect-full-disks build 674758 integration-agent-docker-1056 (/: 27%, /srv: 45%, /var/lib/docker: 19%): RECOVERY disk space OK
[10:35:03] maintenance-disconnect-full-disks build 674759 integration-agent-docker-1041 (/: 28%, /srv: 95%, /var/lib/docker: 26%): OFFLINE due to disk space
[10:35:03] maintenance-disconnect-full-disks build 674759 integration-agent-docker-1042 (/: 27%, /srv: 95%, /var/lib/docker: 21%): OFFLINE due to disk space
[10:35:03] maintenance-disconnect-full-disks build 674759 integration-agent-docker-1048 (/: 27%, /srv: 98%, /var/lib/docker: 20%): OFFLINE due to disk space
[10:40:03] maintenance-disconnect-full-disks build 674760 integration-agent-docker-1041 (/: 28%, /srv: 58%, /var/lib/docker: 25%): RECOVERY disk space OK
[10:40:03] maintenance-disconnect-full-disks build 674760 integration-agent-docker-1042 (/: 27%, /srv: 46%, /var/lib/docker: 20%): RECOVERY disk space OK
[10:40:03] maintenance-disconnect-full-disks build 674760 integration-agent-docker-1048 (/: 27%, /srv: 87%, /var/lib/docker: 20%): RECOVERY disk space OK
[10:50:03] maintenance-disconnect-full-disks build 674762 integration-agent-docker-1041 (/: 27%, /srv: 96%, /var/lib/docker: 26%): OFFLINE due to disk space
[10:50:03] maintenance-disconnect-full-disks build 674762 integration-agent-docker-1055 (/: 27%, /srv: 98%, /var/lib/docker: 28%): OFFLINE due to disk space
[10:55:03] maintenance-disconnect-full-disks build 674763 integration-agent-docker-1041 (/: 27%, /srv: 50%, /var/lib/docker: 25%): RECOVERY disk space OK
[10:55:03] maintenance-disconnect-full-disks build 674763 integration-agent-docker-1055 (/: 27%, /srv: 16%, /var/lib/docker: 26%): RECOVERY disk space OK
[11:23:26] Continuous-Integration-Infrastructure, Jenkins, ci-test-error: quibble-vendor-mysql-php74-selenium fails: " no space left on device" - https://phabricator.wikimedia.org/T385987 (Physikerwelt) NEW
[11:25:13] Gerrit-Privilege-Requests, MathSearch: Request membership in extension-MathSearch group for HamidRahkooy - https://phabricator.wikimedia.org/T385602#10535101 (Physikerwelt) p:Triage→Medium
[11:31:28] To onboard my new collaborator for Math related changes I created a few changes over the weekend. Today, I discovered a problem in one of the first of those changes and rebased the rest. This created more jobs on https://integration.wikimedia.org/zuul/ than I would have anticipated. Now, I see that some disks seem to be full T385987. I hope there is no relation between both, but I wanted to share
[11:31:29] T385987: quibble-vendor-mysql-php74-selenium fails: " no space left on device" - https://phabricator.wikimedia.org/T385987
[11:31:29] this here
[11:47:51] Beta-Cluster-Infrastructure, Patch-For-Review: Determine usefulness of questionable hosts found in 2025-02-06 audit - https://phabricator.wikimedia.org/T385849#10535167 (rook) deployment-bastion.deployment-prep.eqiad1.wikimedia.cloud is not needed. The system has been shut down and can be removed at any...
[13:14:37] (CR) Bunnypranav: "Thank you DreamRimmer and Novem for the comments. I would appreciate getting added onto this list as I can test the patched in Patch demo," [integration/config] - https://gerrit.wikimedia.org/r/1118231 (owner: Dreamrimmer)
[13:50:03] maintenance-disconnect-full-disks build 674798 integration-agent-docker-1053 (/: 27%, /srv: 95%, /var/lib/docker: 22%): OFFLINE due to disk space
[13:55:03] maintenance-disconnect-full-disks build 674799 integration-agent-docker-1042 (/: 27%, /srv: 97%, /var/lib/docker: 19%): OFFLINE due to disk space
[13:55:03] maintenance-disconnect-full-disks build 674799 integration-agent-docker-1053 (/: 27%, /srv: 65%, /var/lib/docker: 23%): RECOVERY disk space OK
[14:00:03] maintenance-disconnect-full-disks build 674800 integration-agent-docker-1042 (/: 27%, /srv: 31%, /var/lib/docker: 19%): RECOVERY disk space OK
[14:00:03] maintenance-disconnect-full-disks build 674800 integration-agent-docker-1056 (/: 27%, /srv: 99%, /var/lib/docker: 20%): OFFLINE due to disk space
[14:00:18] GitLab (Pipeline Services Migration🐤), LibUp: Migrate LibUp repositories to Wikimedia GitLab - https://phabricator.wikimedia.org/T341417#10535450 (Jdforrester-WMF) Open→Resolved a:taavi
[14:03:58] Release-Engineering-Team, Scap: Remove hard dependencies on GitLab, Toolforge And Phorge/Phab - https://phabricator.wikimedia.org/T358876#10535483 (jnuche)
[14:04:04] Release-Engineering-Team, Scap, Patch-For-Review: scap deploy-promote depends on Toolforge service - https://phabricator.wikimedia.org/T333924#10535486 (jnuche) →Duplicate dup:T358876
[14:05:03] maintenance-disconnect-full-disks build 674801 integration-agent-docker-1056 (/: 27%, /srv: 28%, /var/lib/docker: 18%): RECOVERY disk space OK
[14:05:28] Release-Engineering-Team, Scap: Remove hard dependencies on GitLab, Toolforge And Phorge/Phab - https://phabricator.wikimedia.org/T358876#10535490 (jnuche)
[14:13:38] (update) jnuche: Handle inaccessible train-blockers.toolforge.org [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/662 (https://phabricator.wikimedia.org/T333924) (owner: dancy)
[14:14:01] (update) jnuche: Handle inaccessible train-blockers.toolforge.org [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/662 (owner: dancy)
[14:16:16] Release-Engineering-Team, Scap: Remove hard dependencies on GitLab, Toolforge And Phorge/Phab - https://phabricator.wikimedia.org/T358876#10535506 (jnuche) Related: https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/662
[14:24:16] (CR) Jforrester: [C:+2] zuul: Add Bunnypranav to CI allowlist [integration/config] - https://gerrit.wikimedia.org/r/1118231 (owner: Dreamrimmer)
[14:26:44] (Merged) jenkins-bot: zuul: Add Bunnypranav to CI allowlist [integration/config] - https://gerrit.wikimedia.org/r/1118231 (owner: Dreamrimmer)
[14:27:08] !log Zuul: Add Bunnypranav to CI allowlist
[14:27:08] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[14:37:42] Continuous-Integration-Infrastructure, Jenkins, ci-test-error: quibble-vendor-mysql-php74-selenium fails: " no space left on device" - https://phabricator.wikimedia.org/T385987#10535584 (Physikerwelt) Open→Resolved a:Physikerwelt This seems to be resolved https://wm-bot.wmflabs.org/li...
[16:09:12] Scap: Weird scap install-world failure for aphlict2001.codfw.wmnet - https://phabricator.wikimedia.org/T382271#10535873 (jnuche) p:Triage→Low I think this is this old friend here: T337394 I added a mitigation [[ https://gitlab.wikimedia.org/repos/releng/scap/-/blob/c10b48998b5bc2791ff947d505ed1635031...
[16:14:05] (CR) Novem Linguae: [C:+1] "I'm not sure if this patch will affect patchdemo at all. But it will let CI run for your gerrit patches instead of needing someone to comm" [integration/config] - https://gerrit.wikimedia.org/r/1118231 (owner: Dreamrimmer)
[16:53:33] Continuous-Integration-Infrastructure, Castor: Running `cypress` in Wikimedia CI requires unusual env variables - https://phabricator.wikimedia.org/T361624#10536042 (thcipriani) It seems like `CYPRESS_CACHE_FOLDER` is overriding the cache folder the CI currently sets (see @hashar's earlier comment and [[...
[17:11:16] Release-Engineering-Team (Doing 😎), Scap (SpiderPig 🕸️), Codex, Epic: [EPIC] scap web interface: Create SpiderPig web UI - https://phabricator.wikimedia.org/T375782#10536081 (CCiufo-WMF) Untagging #dst since our major involvement here is done. We're still happy to provide design/code review and a...
[17:12:08] Hey, can someone take a look at T384209? It is resulting in CI failures in GrowthExperiments
[17:12:08] T384209: TAR_ENTRY_ERROR ENOSPC: no space left on device (January 2025) (integration-agent-docker-1048) - https://phabricator.wikimedia.org/T384209
[17:19:57] urbanecm: Can you provide a link to a recent failure? I'll take a look
[17:20:24] dancy: https://integration.wikimedia.org/ci/job/mediawiki-quibble-apitests-vendor-php74/35097/consoleFull, for example
[17:20:33] thx
[17:20:45] I was just going to delete one directory
[17:20:53] namely /srv/jenkins/workspace/wmf-quibble-selenium-php74@4
[17:20:57] which AFAICT isn’t used
[17:21:15] Lucas_WMDE: Go for it.
[17:21:19] (https://integration.wikimedia.org/ci/job/wmf-quibble-selenium-php74/53866/consoleFull uses php74@3 and https://integration.wikimedia.org/ci/job/wmf-quibble-selenium-php74/53876/consoleFull uses php74 and https://integration.wikimedia.org/ci/computer/integration%2Dagent%2Ddocker%2D1048/ shows no other builds)
[17:21:20] ok
[17:21:45] bleh, permission denied
[17:21:50] I tried:
[17:21:51] sudo -u jenkins-deploy rm -rf /srv/jenkins/workspace/wmf-quibble-selenium-php74@4 # 3.5G, not used AFAICT: T302477
[17:21:51] T302477: Pipeline lib leaves workspace behind on the Jenkins agents - https://phabricator.wikimedia.org/T302477
[17:22:11] Ok. I'll see what I can do.
[17:22:11] :D
[17:22:19] (I don’t have permission to sudo as root)
[17:22:47] thanks dancy :)
[17:34:11] urbanecm, Lucas_WMDE: I deleted about 3.5GB of old workspaces.
[17:34:18] ty!
[17:35:27] (update) dancy: Handle inaccessible train-blockers.toolforge.org [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/662 (https://phabricator.wikimedia.org/T358876)
[17:35:31] (update) dancy: Handle inaccessible train-blockers.toolforge.org [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/662 (https://phabricator.wikimedia.org/T358876)
[17:35:58] (update) dancy: Handle inaccessible train-blockers.toolforge.org [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/662 (https://phabricator.wikimedia.org/T358876)
[17:38:59] gate-and-submit is extra large today
[17:39:15] Elapsed: 1 hr 33 min for https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ContentTranslation/+/1118489, that doesn't seem normal
[17:48:58] Indeed.
[17:49:25] I'll poke around again
[18:24:52] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure): TAR_ENTRY_ERROR ENOSPC: no space left on device (January 2025) (integration-agent-docker-1048) - https://phabricator.wikimedia.org/T384209#10536446 (Michael)
[18:26:38] FIRING: DatasourceError: Queue (Jenkins jobs + Zuul functions) - https://grafana.wikimedia.org/alerting/grafana/b9a8470a-ebab-46f7-9be2-22b5e74a528b/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[18:28:40] FIRING: [2x] DatasourceNoData: - https://alerts.wikimedia.org/?q=alertname%3DDatasourceNoData
[18:30:40] (open) dancy: Release 4.135.0 [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/664
[18:32:33] dancy: While you are at it, could you also somehow magic more space for 1047? See https://integration.wikimedia.org/ci/job/wmf-quibble-selenium-php74/53690/consoleFull (C-F ENOSPC)
[18:32:39] (merge) dancy: Release 4.135.0 [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/664
[18:33:06] MichaelG_WMF: I'll see what I can do.
[18:33:36] dancy: Thank you! 🙏
[18:33:40] RESOLVED: [2x] DatasourceNoData: - https://alerts.wikimedia.org/?q=alertname%3DDatasourceNoData
[18:36:16] MichaelG_WMF: I don't see any lingering workspaces that would free significant space. `/srv` usage is at 27% right now.
[18:36:38] RESOLVED: DatasourceError: Queue (Jenkins jobs + Zuul functions) - https://grafana.wikimedia.org/alerting/grafana/b9a8470a-ebab-46f7-9be2-22b5e74a528b/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[18:36:43] I think the issue is that we're allowing too many large-disk-usage jobs to run on the same node at the same time.
[18:37:02] (and the meager disk allocation)
[18:37:22] ok, thank you for checking!
[19:33:51] Project beta-scap-sync-world build #193279: FAILURE in 2 min 10 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/193279/
[19:37:35] Yippee, build fixed!
[19:37:35] Project beta-scap-sync-world build #193280: FIXED in 1 min 51 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/193280/
[21:13:21] Beta-Cluster-Infrastructure, Patch-For-Review: Determine usefulness of questionable hosts found in 2025-02-06 audit - https://phabricator.wikimedia.org/T385849#10536805 (Southparkfan) Regarding: `deployment-parsoid14`: `parsoid-external-ci-access.beta.wmflabs.org` points to it. Have we verified nothing i...
[21:42:49] (open) dancy: spiderpig: Add admin operations [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/665
[21:42:52] (update) dancy: spiderpig: Add admin operations [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/665
[22:00:32] (update) dancy: spiderpig: Add admin operations [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/665
[22:35:15] Beta-Cluster-Infrastructure, Patch-For-Review: Determine usefulness of questionable hosts found in 2025-02-06 audit - https://phabricator.wikimedia.org/T385849#10537109 (bd808) >>! In T385849#10536805, @Southparkfan wrote: > Regarding: `deployment-parsoid14`: `parsoid-external-ci-access.beta.wmflabs.org`...
[22:50:59] (update) dancy: spiderpig: Add admin operations [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/665