[04:56:08] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.41.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T340243 (hashar)
[05:18:11] (CR) Hashar: [C: +2] Zuul: [mediawiki/extensions/ArticlePlaceholder] Enable Sonar Codehealth [integration/config] - https://gerrit.wikimedia.org/r/933167 (https://phabricator.wikimedia.org/T321837) (owner: Pwangai)
[05:18:14] (CR) Hashar: [C: +2] Zuul: [mediawiki/extensions/CentralAuth] Enable Sonar Codehealth [integration/config] - https://gerrit.wikimedia.org/r/933168 (https://phabricator.wikimedia.org/T321837) (owner: Pwangai)
[05:18:22] (CR) Hashar: [C: +2] Zuul: [mediawiki/extensions/GeoData] Enable Sonar Codehealth [integration/config] - https://gerrit.wikimedia.org/r/933144 (https://phabricator.wikimedia.org/T321837) (owner: Pwangai)
[05:18:27] (CR) Hashar: [C: +2] Zuul: [mediawiki/extensions/GlobalUserPage] Enable Sonar Codehealth [integration/config] - https://gerrit.wikimedia.org/r/933145 (https://phabricator.wikimedia.org/T321837) (owner: Pwangai)
[05:19:53] (Merged) jenkins-bot: Zuul: [mediawiki/extensions/ArticlePlaceholder] Enable Sonar Codehealth [integration/config] - https://gerrit.wikimedia.org/r/933167 (https://phabricator.wikimedia.org/T321837) (owner: Pwangai)
[05:20:11] (Merged) jenkins-bot: Zuul: [mediawiki/extensions/CentralAuth] Enable Sonar Codehealth [integration/config] - https://gerrit.wikimedia.org/r/933168 (https://phabricator.wikimedia.org/T321837) (owner: Pwangai)
[05:20:18] (Merged) jenkins-bot: Zuul: [mediawiki/extensions/GeoData] Enable Sonar Codehealth [integration/config] - https://gerrit.wikimedia.org/r/933144 (https://phabricator.wikimedia.org/T321837) (owner: Pwangai)
[05:20:21] (Merged) jenkins-bot: Zuul: [mediawiki/extensions/GlobalUserPage] Enable Sonar Codehealth [integration/config] - https://gerrit.wikimedia.org/r/933145 (https://phabricator.wikimedia.org/T321837) (owner: Pwangai)
[05:21:11] !log Reloaded Zuul to enable Sonar Codehealth on ArticlePlaceholder, CentralAuth, GeoData, GlobalUserPage # T321837
[05:21:14] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[05:21:14] T321837: Repositories integrated into Codehealth Pipeline (Production) - https://phabricator.wikimedia.org/T321837
[07:09:12] Gerrit, Release-Engineering-Team: CI marking all changes as -1 - https://phabricator.wikimedia.org/T340518 (Marostegui)
[07:09:24] Gerrit, Release-Engineering-Team: CI marking all changes as -1 - https://phabricator.wikimedia.org/T340518 (Marostegui) p:Triage→Unbreak!
[07:11:31] Gerrit, Release-Engineering-Team, ci-test-error (WMF-deployed Build Failure): CI marking all changes as -1 - https://phabricator.wikimedia.org/T340518 (RhinosF1)
[07:17:26] Gerrit, Release-Engineering-Team, ci-test-error (WMF-deployed Build Failure): CI marking all changes as -1 - https://phabricator.wikimedia.org/T340518 (Marostegui)
[07:21:28] Gerrit, Release-Engineering-Team, ci-test-error (WMF-deployed Build Failure): CI marking all changes as -1 - https://phabricator.wikimedia.org/T340518 (Marostegui)
[07:47:21] Continuous-Integration-Infrastructure, Gerrit, Release-Engineering-Team, Zuul, ci-test-error (WMF-deployed Build Failure): CI marking all changes as -1 - https://phabricator.wikimedia.org/T340518 (hashar) a:hashar That is T309376 striking again. I will restart Zuul
[07:49:05] Continuous-Integration-Infrastructure, Gerrit, Release-Engineering-Team, Zuul, ci-test-error (WMF-deployed Build Failure): CI marking all changes as -1 - https://phabricator.wikimedia.org/T340518 (hashar)
[07:49:43] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), Zuul, Patch-For-Review: Zuul jenkins-bot user holding open SSH sessions - https://phabricator.wikimedia.org/T309376 (hashar)
[08:33:03] !log Deleted beta-mediawiki-config-update-eqiad Jenkins job. No more used since August 2022 / T314378
[08:33:05] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[08:33:06] T314378: Stop triggering `beta-scap-sync-world` on `beta-mediawiki-config-update-eqiad` completion - https://phabricator.wikimedia.org/T314378
[09:30:10] GitLab (Auth & Access), Release-Engineering-Team, CAS-SSO, Infrastructure-Foundations, and 2 others: Add GitLab to offboarding workflow - https://phabricator.wikimedia.org/T339843 (MoritzMuehlenhoff)
[10:46:33] PROBLEM - Check systemd state on doc2002 is CRITICAL: CRITICAL - degraded: The following units failed: rsync-doc-host-data-sync.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[11:36:36] Continuous-Integration-Config, CirrusSearch, Discovery-Search, Wikidata, and 2 others: WikibaseCirrusSearch and WikibaseLexemeCirrusSearch tests should run with WIkibase in CI - https://phabricator.wikimedia.org/T244487 (Addshore)
[11:43:32] RECOVERY - Check systemd state on doc2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[11:51:03] Beta-Cluster-Infrastructure, serviceops, wikidiff2, Better-Diffs-2023, Community-Tech (CommTech-Kanban): Install wikidiff2 1.14.0 deb on deployment-prep & test - https://phabricator.wikimedia.org/T340542 (TheresNoTime)
[11:54:22] ref T340542, I intend to run `sudo apt install ./path/to/deb` on `mediawiki11` and `mediawiki12` — any objections/reasons why I shouldn't?
[11:54:22] T340542: Install wikidiff2 1.14.0 deb on deployment-prep & test - https://phabricator.wikimedia.org/T340542
[12:22:16] GitLab, ExtensionDistributor: Add Gitlab Provider - https://phabricator.wikimedia.org/T340523 (Reedy)
[12:23:14] GitLab, ExtensionDistributor: Add Gitlab Provider - https://phabricator.wikimedia.org/T340523 (Reedy)
[12:23:20] GitLab (Project Migration), Release-Engineering-Team (They Live 🕶️🧟), User-brennen: Migrate mediawiki/ namespace from Gerrit to GitLab - https://phabricator.wikimedia.org/T335921 (Reedy)
[12:23:21] !log deployment-prep: ran `sudo apt install /home/samtar/php7.4-wikidiff2_1.14.0-0+wmf1+buster1_amd64.deb` on `deployment-mediawiki[11-12]` for T340542, watching logs
[12:23:23] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[12:23:24] T340542: Install wikidiff2 1.14.0 deb on deployment-prep & test - https://phabricator.wikimedia.org/T340542
[12:25:09] Beta-Cluster-Infrastructure, serviceops, wikidiff2, Better-Diffs-2023, Community-Tech (CommTech-Kanban): Install wikidiff2 1.14.0 deb on deployment-prep & test - https://phabricator.wikimedia.org/T340542 (TheresNoTime) nb. ` samtar@deployment-mediawiki12:~$ php --ri wikidiff2 | grep version w...
[12:34:00] Beta-Cluster-Infrastructure, Wikidata, wdwb-tech: Wikidata on beta is getting too many edits - https://phabricator.wikimedia.org/T168101 (Addshore)
[12:35:28] Continuous-Integration-Infrastructure, MediaWiki-extensions-WikibaseRepository, Wikidata, wdwb-tech, Test-Coverage: generate Wikibase.git code coverage on Jenkins - https://phabricator.wikimedia.org/T88434 (Addshore)
[12:37:33] Beta-Cluster-Infrastructure, Wikidata, wdwb-tech, Browser-Tests: Wikidata daily browser tests fails on Beta due to "Unable to store text to external storage" - https://phabricator.wikimedia.org/T242717 (Addshore)
[13:45:43] (PS1) Subramanya Sastry: WIP: Config for running diffs with core-integrated Parsoid [integration/visualdiff] - https://gerrit.wikimedia.org/r/933461
[13:46:21] (CR) CI reject: [V: -1] WIP: Config for running diffs with core-integrated Parsoid [integration/visualdiff] - https://gerrit.wikimedia.org/r/933461 (owner: Subramanya Sastry)
[13:47:46] !log Creating new DB tables on beta wikishared for the CampaignEvents extension # T339997
[13:47:49] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[13:47:49] T339997: Create the tables for participant questions in beta - https://phabricator.wikimedia.org/T339997
[13:55:47] Continuous-Integration-Infrastructure, Release-Engineering-Team, Zuul, ci-test-error: Gerrit gives spurious V-1 Merge Failed in wikimedia/fundraising/tools repo - https://phabricator.wikimedia.org/T336902 (hashar) > Could you help us move it to just /wikimedia/fundraising/DjangoBannerStats, pleas...
[14:02:06] (PS1) Subramanya Sastry: WIP: Reorg configs [integration/visualdiff] - https://gerrit.wikimedia.org/r/933466
[14:02:41] (CR) CI reject: [V: -1] WIP: Reorg configs [integration/visualdiff] - https://gerrit.wikimedia.org/r/933466 (owner: Subramanya Sastry)
[14:11:42] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 3 others: Migrate group0 to Kubernetes - https://phabricator.wikimedia.org/T337490 (Clement_Goubert)
[14:14:18] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 2 others: Migrate group1 to Kubernetes - https://phabricator.wikimedia.org/T340549 (Clement_Goubert)
[14:14:44] Release-Engineering-Team (Seen), MW-on-K8s, SRE, Traffic, and 2 others: Migrate group1 to Kubernetes - https://phabricator.wikimedia.org/T340549 (Clement_Goubert) p:Triage→Medium
[14:19:55] is anyone else having trouble getting "Edit Subtasks" / "Edit Parent Tasks" to work on Phab?
[14:19:58] It spins forever for me
[14:20:23] WFM
[14:20:29] great
[14:20:30] cdanis: Has it worked since the upgrades last night?
[14:20:34] might just be JS cache stuffs
[14:20:37] Reedy: dunno, just logging in this morning
[14:21:06] it works if I open in a new window, instead of letting the JS do the websocket thing it's apparently trying to do
[14:29:59] Beta-Cluster-Infrastructure, serviceops, wikidiff2, Better-Diffs-2023, Community-Tech (CommTech-Kanban): Install wikidiff2 1.14.0 deb on deployment-prep & test - https://phabricator.wikimedia.org/T340542 (TheresNoTime)
[14:38:42] GitLab (Project Migration), Release-Engineering-Team (They Live 🕶️🧟), serviceops-collab, Patch-For-Review: Provide mechanism to publish to doc.wikimedia.org from GitLab CI - https://phabricator.wikimedia.org/T336168 (CodeReviewBot) jnuche merged https://gitlab.wikimedia.org/repos/releng/docpub/-/...
[15:15:36] maintenance-disconnect-full-disks build 503752 integration-agent-docker-1035 (/: 31%, /srv: 11%, /var/lib/docker: 95%): OFFLINE due to disk space
[15:26:13] maintenance-disconnect-full-disks build 503754 integration-agent-docker-1035 (/: 31%, /srv: 8%, /var/lib/docker: 95%): RECOVERY disk space OK
[15:41:17] maintenance-disconnect-full-disks build 503757 integration-agent-docker-1035 (/: 31%, /srv: 20%, /var/lib/docker: 100%): OFFLINE due to disk space
[15:56:05] maintenance-disconnect-full-disks build 503760 integration-agent-docker-1035 (/: 31%, /srv: 8%, /var/lib/docker: 96%): RECOVERY disk space OK
[16:01:17] maintenance-disconnect-full-disks build 503761 integration-agent-docker-1035 (/: 31%, /srv: 13%, /var/lib/docker: 100%): OFFLINE due to disk space
[16:01:19] Should https://wikitech.wikimedia.org/wiki/Deployments/Inclusion_criteria have "creating a new production wiki"? :-)
[16:05:23] yes, those are complex enough that I don't think backport windows should be used, and in practice we've done those in dedicated windows as long as I've been here
[16:05:46] maintenance-disconnect-full-disks build 503762 integration-agent-docker-1035 (/: 31%, /srv: 8%, /var/lib/docker: 99%): RECOVERY disk space OK
[16:05:51] * taavi does not like how there are several deployment-related policies that are just ignored
[16:10:46] maintenance-disconnect-full-disks build 503763 integration-agent-docker-1035 (/: 31%, /srv: 16%, /var/lib/docker: 100%): OFFLINE due to disk space
[16:21:27] maintenance-disconnect-full-disks build 503765 integration-agent-docker-1035 (/: 31%, /srv: 8%, /var/lib/docker: 99%): RECOVERY disk space OK
[16:30:45] maintenance-disconnect-full-disks build 503767 integration-agent-docker-1035 (/: 31%, /srv: 22%, /var/lib/docker: 99%): OFFLINE due to disk space
[16:35:42] maintenance-disconnect-full-disks build 503768 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 99%): RECOVERY disk space OK
[16:38:35] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments, User-brennen: 1.41.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T340243 (brennen)
[16:40:47] maintenance-disconnect-full-disks build 503769 integration-agent-docker-1035 (/: 31%, /srv: 13%, /var/lib/docker: 100%): OFFLINE due to disk space
[16:45:47] maintenance-disconnect-full-disks build 503770 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 99%): RECOVERY disk space OK
[16:55:41] maintenance-disconnect-full-disks build 503772 integration-agent-docker-1035 (/: 31%, /srv: 22%, /var/lib/docker: 100%): OFFLINE due to disk space
[16:58:27] GitLab (Pipeline Services Migration🐤), Release-Engineering-Team (Priority Backlog 📥), Toolforge, cloud-services-team: Move Toolforge PipelineLib repositories to GitLab - https://phabricator.wikimedia.org/T334399 (thcipriani)
[16:59:02] GitLab (Pipeline Services Migration🐤), Release-Engineering-Team (Priority Backlog 📥), translatewiki.net: Set up translatewiki.net exports to push (and merge) to Wikimedia GitLab - https://phabricator.wikimedia.org/T334419 (thcipriani)
[17:00:44] maintenance-disconnect-full-disks build 503773 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 99%): RECOVERY disk space OK
[17:00:51] GitLab (Pipeline Services Migration🐤), Release-Engineering-Team (They Live 🕶️🧟), serviceops-collab: Provide mechanism to publish to doc.wikimedia.org from GitLab CI - https://phabricator.wikimedia.org/T336168 (thcipriani)
[17:06:12] maintenance-disconnect-full-disks build 503774 integration-agent-docker-1035 (/: 31%, /srv: 22%, /var/lib/docker: 100%): OFFLINE due to disk space
[17:11:04] maintenance-disconnect-full-disks build 503775 integration-agent-docker-1035 (/: 31%, /srv: 20%, /var/lib/docker: 99%): still OFFLINE due to disk space
[17:15:42] maintenance-disconnect-full-disks build 503776 integration-agent-docker-1033 (/: 29%, /srv: 21%, /var/lib/docker: 99%): OFFLINE due to disk space
[17:20:41] maintenance-disconnect-full-disks build 503777 integration-agent-docker-1033 (/: 29%, /srv: 21%, /var/lib/docker: 99%): RECOVERY disk space OK
[17:25:53] maintenance-disconnect-full-disks build 503778 integration-agent-docker-1033 (/: 29%, /srv: 20%, /var/lib/docker: 99%): OFFLINE due to disk space
[17:25:53] maintenance-disconnect-full-disks build 503778 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 100%): RECOVERY disk space OK
[17:31:13] maintenance-disconnect-full-disks build 503779 integration-agent-docker-1033 (/: 29%, /srv: 12%, /var/lib/docker: 100%): RECOVERY disk space OK
[17:31:13] maintenance-disconnect-full-disks build 503779 integration-agent-docker-1035 (/: 31%, /srv: 16%, /var/lib/docker: 100%): OFFLINE due to disk space
[17:36:05] maintenance-disconnect-full-disks build 503780 integration-agent-docker-1035 (/: 31%, /srv: 18%, /var/lib/docker: 100%): still OFFLINE due to disk space
[17:43:03] Gerrit, serviceops-collab: Gerrit LFS objects lack an automatic sync to gerrit replicas - https://phabricator.wikimedia.org/T257741 (LSobanski) Resolving based on the most recent comment. @QChris, please reopen if this doesn't address the use case you had in mind.
[17:46:00] maintenance-disconnect-full-disks build 503782 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 96%): RECOVERY disk space OK
[17:58:52] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments, User-brennen: 1.41.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T340243 (brennen)
[18:05:15] (PS2) Subramanya Sastry: Config for running diffs with core-integrated Parsoid [integration/visualdiff] - https://gerrit.wikimedia.org/r/933461
[18:05:17] (PS2) Subramanya Sastry: Reorg configs and helper files [integration/visualdiff] - https://gerrit.wikimedia.org/r/933466
[18:05:19] (PS1) Subramanya Sastry: Remove stale version code [integration/visualdiff] - https://gerrit.wikimedia.org/r/933601
[18:05:33] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments, User-brennen: 1.41.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T340243 (brennen)
[18:06:20] maintenance-disconnect-full-disks build 503786 integration-agent-docker-1035 (/: 31%, /srv: 23%, /var/lib/docker: 100%): OFFLINE due to disk space
[18:06:45] (CR) CI reject: [V: -1] Config for running diffs with core-integrated Parsoid [integration/visualdiff] - https://gerrit.wikimedia.org/r/933461 (owner: Subramanya Sastry)
[18:06:56] (CR) CI reject: [V: -1] Reorg configs and helper files [integration/visualdiff] - https://gerrit.wikimedia.org/r/933466 (owner: Subramanya Sastry)
[18:07:16] (CR) CI reject: [V: -1] Remove stale version code [integration/visualdiff] - https://gerrit.wikimedia.org/r/933601 (owner: Subramanya Sastry)
[18:07:29] (PS3) Subramanya Sastry: Config for running diffs with core-integrated Parsoid [integration/visualdiff] - https://gerrit.wikimedia.org/r/933461
[18:07:31] (PS3) Subramanya Sastry: Reorg configs and helper files [integration/visualdiff] - https://gerrit.wikimedia.org/r/933466
[18:07:33] (PS2) Subramanya Sastry: Remove stale version code [integration/visualdiff] - https://gerrit.wikimedia.org/r/933601
[18:08:33] (Queue (Jenkins jobs + Zuul functions) alert) firing: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[18:09:10] (CR) CI reject: [V: -1] Reorg configs and helper files [integration/visualdiff] - https://gerrit.wikimedia.org/r/933466 (owner: Subramanya Sastry)
[18:09:12] (CR) CI reject: [V: -1] Remove stale version code [integration/visualdiff] - https://gerrit.wikimedia.org/r/933601 (owner: Subramanya Sastry)
[18:10:50] maintenance-disconnect-full-disks build 503787 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 100%): RECOVERY disk space OK
[18:15:47] maintenance-disconnect-full-disks build 503788 integration-agent-docker-1033 (/: 30%, /srv: 59%, /var/lib/docker: 99%): OFFLINE due to disk space
[18:18:33] (Queue (Jenkins jobs + Zuul functions) alert) resolved: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[18:18:48] (Queue (Jenkins jobs + Zuul functions) alert) firing: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[18:20:56] maintenance-disconnect-full-disks build 503789 integration-agent-docker-1033 (/: 30%, /srv: 8%, /var/lib/docker: 94%): RECOVERY disk space OK
[18:23:48] (Queue (Jenkins jobs + Zuul functions) alert) resolved: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[18:25:40] maintenance-disconnect-full-disks build 503790 integration-agent-docker-1035 (/: 31%, /srv: 17%, /var/lib/docker: 100%): OFFLINE due to disk space
[18:28:48] (Queue (Jenkins jobs + Zuul functions) alert) firing: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[18:30:37] maintenance-disconnect-full-disks build 503791 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 99%): RECOVERY disk space OK
[18:33:48] (Queue (Jenkins jobs + Zuul functions) alert) resolved: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[18:40:37] maintenance-disconnect-full-disks build 503793 integration-agent-docker-1035 (/: 31%, /srv: 19%, /var/lib/docker: 100%): OFFLINE due to disk space
[18:44:21] Hey there, several Gerrit patches are not merging due to a "No space left on device" CI issue. I assume this relates to the above? Is this being looked into?
[18:45:34] maintenance-disconnect-full-disks build 503794 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 99%): RECOVERY disk space OK
[18:51:55] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments, User-brennen: 1.41.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T340243 (brennen)
[18:53:05] Continuous-Integration-Config: integration-agent-docker-1035 free disk space flapping, causing Gerrit patches to not merge - https://phabricator.wikimedia.org/T340569 (TheresNoTime)
[18:53:20] Jdlrobson: created T340569 just in case..
[18:53:21] T340569: integration-agent-docker-1035 free disk space flapping, causing Gerrit patches to not merge - https://phabricator.wikimedia.org/T340569
[19:15:35] maintenance-disconnect-full-disks build 503800 integration-agent-docker-1035 (/: 31%, /srv: 18%, /var/lib/docker: 99%): OFFLINE due to disk space
[19:20:37] maintenance-disconnect-full-disks build 503801 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 100%): RECOVERY disk space OK
[19:25:37] maintenance-disconnect-full-disks build 503802 integration-agent-docker-1035 (/: 31%, /srv: 18%, /var/lib/docker: 99%): OFFLINE due to disk space
[19:34:45] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments, User-brennen: 1.41.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T340243 (brennen)
[19:40:35] maintenance-disconnect-full-disks build 503805 integration-agent-docker-1035 (/: 31%, /srv: 21%, /var/lib/docker: 99%): still OFFLINE due to disk space
[19:45:40] maintenance-disconnect-full-disks build 503806 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 99%): RECOVERY disk space OK
[19:48:52] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments, User-brennen: 1.41.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T340243 (matmarex)
[19:50:38] maintenance-disconnect-full-disks build 503807 integration-agent-docker-1035 (/: 31%, /srv: 16%, /var/lib/docker: 100%): OFFLINE due to disk space
[19:55:37] maintenance-disconnect-full-disks build 503808 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 99%): RECOVERY disk space OK
[20:05:45] maintenance-disconnect-full-disks build 503810 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 99%): OFFLINE due to disk space
[20:10:41] maintenance-disconnect-full-disks build 503811 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 99%): RECOVERY disk space OK
[20:13:19] Phabricator, Release-Engineering-Team, VPS-project-Phabricator, User-brennen: After a deployment, Phabricator errors out with `Unable to load the "Arcanist" library. Put "arcanist/" next to "phabricator/" on disk.` - https://phabricator.wikimedia.org/T314460 (brennen) Resolved→Open p:T...
[20:20:59] maintenance-disconnect-full-disks build 503813 integration-agent-docker-1035 (/: 31%, /srv: 17%, /var/lib/docker: 99%): OFFLINE due to disk space
[20:21:39] ah
[20:23:30] so there was an issue with large disk consumption due to npm caches being very large for MediaWiki builds but I solved that by running `npm cache verify` which garbage collects objects
[20:23:32] that was T340092
[20:23:33] T340092: Figure out how to garbage collect the npm cache - https://phabricator.wikimedia.org/T340092
[20:23:50] and caused the Jenkins workspaces under `/srv` to fill up
[20:24:06] the above errors are `/var/lib/docker`
[20:24:34] I am pretty sure that is `pytorch` again https://phabricator.wikimedia.org/T338317
[20:28:40] Continuous-Integration-Infrastructure, Release-Engineering-Team, Cloud-VPS (Quota-requests): Rebuild WMCS integration instances to larger flavor - https://phabricator.wikimedia.org/T340070 (hashar) Declined→Open Reopening cause installing `pytorch` generates a 14GB layer in Docker Buildkit ca...
[20:30:38] maintenance-disconnect-full-disks build 503815 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 99%): RECOVERY disk space OK
[20:32:10] hashar: Are the GitLab runners pruned too? Just got ENOSPC on runner-1026.gitlab-runners.eqiad1.wikimedia.cloud
[20:35:59] maintenance-disconnect-full-disks build 503816 integration-agent-docker-1035 (/: 31%, /srv: 15%, /var/lib/docker: 100%): OFFLINE due to disk space
[20:37:50] Continuous-Integration-Infrastructure, Release-Engineering-Team, Machine-Learning-Team: Python torch fills disk of CI Jenkins instances - https://phabricator.wikimedia.org/T338317 (hashar) Reopening since the disks keep filling and I also reopened the task to resize the instances (T340070). Not sure...
[20:38:06] pytorch takes 15GBytes :(
[20:38:15] Oww.
[20:38:53] !log integration-agent-docker-1035: `docker buildx prune` T338317
[20:38:56] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:38:56] T338317: Python torch fills disk of CI Jenkins instances - https://phabricator.wikimedia.org/T338317
[20:40:46] maintenance-disconnect-full-disks build 503817 integration-agent-docker-1035 (/: 31%, /srv: 12%, /var/lib/docker: 11%): RECOVERY disk space OK
[20:44:03] (PS4) Subramanya Sastry: Config for running diffs with core-integrated Parsoid [integration/visualdiff] - https://gerrit.wikimedia.org/r/933461
[20:44:05] (PS4) Subramanya Sastry: Reorg configs and helper files [integration/visualdiff] - https://gerrit.wikimedia.org/r/933466
[20:44:07] (PS3) Subramanya Sastry: Remove stale version code [integration/visualdiff] - https://gerrit.wikimedia.org/r/933601
[20:44:44] (CR) CI reject: [V: -1] Reorg configs and helper files [integration/visualdiff] - https://gerrit.wikimedia.org/r/933466 (owner: Subramanya Sastry)
[20:45:05] (CR) CI reject: [V: -1] Remove stale version code [integration/visualdiff] - https://gerrit.wikimedia.org/r/933601 (owner: Subramanya Sastry)
[20:55:40] (Queue (Jenkins jobs + Zuul functions) alert) firing: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[20:58:14] Continuous-Integration-Config: integration-agent-docker-1035 free disk space flapping, causing Gerrit patches to not merge - https://phabricator.wikimedia.org/T340569 (Jdlrobson) p:Triage→Unbreak! This is blocking several patches and thus productivity in the web team today. Hope you don't mind me ma...
[21:00:04] hashar: is the work you did above related to this ^
[21:00:17] I'm seeing "Installation failed, reverting ./composer.json and ./composer.lock to their original content." errors in CI now rather than disk space issues.
[21:05:39] (Queue (Jenkins jobs + Zuul functions) alert) resolved: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[21:10:00] win 4
[21:21:28] (PS5) Subramanya Sastry: Config for running diffs with core-integrated Parsoid [integration/visualdiff] - https://gerrit.wikimedia.org/r/933461
[21:21:30] (PS5) Subramanya Sastry: Reorg configs and helper files [integration/visualdiff] - https://gerrit.wikimedia.org/r/933466
[21:21:33] (PS4) Subramanya Sastry: Remove stale version code [integration/visualdiff] - https://gerrit.wikimedia.org/r/933601
[21:22:07] (CR) CI reject: [V: -1] Remove stale version code [integration/visualdiff] - https://gerrit.wikimedia.org/r/933601 (owner: Subramanya Sastry)
[21:22:13] (CR) CI reject: [V: -1] Reorg configs and helper files [integration/visualdiff] - https://gerrit.wikimedia.org/r/933466 (owner: Subramanya Sastry)
[21:27:37] GitLab (CI & Job Runners), Release-Engineering-Team: GitLab CI: "ENOSPC: no space left on device, mkdir" - https://phabricator.wikimedia.org/T340586 (kostajh)
[21:30:38] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments, User-brennen: 1.41.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T340243 (matmarex)
[21:46:52] (CR) Arlolra: [C: +2] Remove stale version code [integration/visualdiff] - https://gerrit.wikimedia.org/r/933601 (owner: Subramanya Sastry)
[22:01:46] Continuous-Integration-Config: integration-agent-docker-1035 free disk space flapping, causing Gerrit patches to not merge - https://phabricator.wikimedia.org/T340569 (Jdlrobson) p:Unbreak!→Triage The above error seems to be fixed now (possibly by T340092 ? ) There was an "Installation failed, revert...
[22:34:47] Gerrit, VPS-project-Codesearch, VPS-project-Extdist, serviceops-collab: Move clients off of gerrit-replica.wikimedia.org back to gerrit.wikimedia.org - https://phabricator.wikimedia.org/T336710 (Dzahn) so.. summarizing this: codesearch itself was switched to main gerrit in https://gerrit.wikime...
[23:03:48] Release-Engineering-Team (They Live 🕶️🧟), Patch-For-Review, Release Pipeline (Blubber): Implement acceptance tests for Blubber as executable examples - https://phabricator.wikimedia.org/T338160 (CodeReviewBot) dduvall opened https://gitlab.wikimedia.org/repos/releng/blubber/-/merge_requests/48 Provi...
[23:04:04] Release-Engineering-Team (They Live 🕶️🧟), Patch-For-Review, Release Pipeline (Blubber): Implement acceptance tests for Blubber as executable examples - https://phabricator.wikimedia.org/T338160 (CodeReviewBot)