[00:38:40] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team, 06collaboration-services, 13Patch-For-Review: setup 2 contint machines for jenkins - https://phabricator.wikimedia.org/T418521#12052337 (10Dzahn) With [[ https://gerrit.wikimedia.org/r/c/operations/puppet/+/1300916 | gerrit:1300916 ]] I h... [03:13:13] 10Phabricator (Upstream): Typos on various codes? - https://phabricator.wikimedia.org/T430097#12052422 (10Pppery) [03:22:50] 10Phabricator (Upstream): Typos on various codes? - https://phabricator.wikimedia.org/T430097#12052423 (10Pppery) a:03Pppery [03:47:07] 10Phabricator (Upstream), 07Upstream: Typos on various codes? - https://phabricator.wikimedia.org/T430097#12052434 (10Pppery) https://we.phorge.it/D27080 https://we.phorge.it/D27081 https://we.phorge.it/D27082 [07:14:55] (03PS1) 10Hslater: Zuul: [mediawiki/extensions/ImportOfficeFiles] Re-enable master jobs [integration/config] - 10https://gerrit.wikimedia.org/r/1305576 [07:18:46] 10Gerrit, 06collaboration-services, 13Patch-For-Review: Gerrit backups are growing - https://phabricator.wikimedia.org/T411583#12052720 (10ABran-WMF) This might be a consequence of {T425667}, I excluded the [[ https://gerrit.wikimedia.org/r/1305578 | httpd log ]] dir from backups [07:19:03] (03PS1) 10Hslater: Zuul: [mediawiki/extensions/FilterSpecialPages] Re-enable master jobs [integration/config] - 10https://gerrit.wikimedia.org/r/1305579 [07:22:27] (03open) 10jnuche: jenkins-rel: update plugins to address vulnerabilities [repos/releng/jenkins-deploy] - 10https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/118 (https://phabricator.wikimedia.org/T430110) [07:22:40] (03merge) 10jnuche: jenkins-rel: update plugins to address vulnerabilities [repos/releng/jenkins-deploy] - 10https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/118 (https://phabricator.wikimedia.org/T430110) [07:26:30] (03PS1) 10Hslater: Zuul: [mediawiki/extensions/MergeArticles] Re-enable master 
jobs [integration/config] - 10https://gerrit.wikimedia.org/r/1305582 [07:30:19] (03CR) 10Hashar: [C:03+2] Zuul: [mediawiki/extensions/FilterSpecialPages] Re-enable master jobs [integration/config] - 10https://gerrit.wikimedia.org/r/1305579 (owner: 10Hslater) [07:30:20] (03CR) 10Hashar: [C:03+2] Zuul: [mediawiki/extensions/ImportOfficeFiles] Re-enable master jobs [integration/config] - 10https://gerrit.wikimedia.org/r/1305576 (owner: 10Hslater) [07:30:21] (03CR) 10Hashar: [C:03+2] Zuul: [mediawiki/extensions/MergeArticles] Re-enable master jobs [integration/config] - 10https://gerrit.wikimedia.org/r/1305582 (owner: 10Hslater) [07:32:02] (03Merged) 10jenkins-bot: Zuul: [mediawiki/extensions/FilterSpecialPages] Re-enable master jobs [integration/config] - 10https://gerrit.wikimedia.org/r/1305579 (owner: 10Hslater) [07:32:06] (03Merged) 10jenkins-bot: Zuul: [mediawiki/extensions/ImportOfficeFiles] Re-enable master jobs [integration/config] - 10https://gerrit.wikimedia.org/r/1305576 (owner: 10Hslater) [07:32:13] (03Merged) 10jenkins-bot: Zuul: [mediawiki/extensions/MergeArticles] Re-enable master jobs [integration/config] - 10https://gerrit.wikimedia.org/r/1305582 (owner: 10Hslater) [07:35:56] (03PS2) 10Hashar: Remove API Portal skin and extension [integration/config] - 10https://gerrit.wikimedia.org/r/1305507 (https://phabricator.wikimedia.org/T429373) (owner: 10Alex Paskulin) [07:38:18] (03CR) 10Hashar: [C:03+2] "**Thank you!**" [integration/config] - 10https://gerrit.wikimedia.org/r/1305507 (https://phabricator.wikimedia.org/T429373) (owner: 10Alex Paskulin) [07:40:20] (03Merged) 10jenkins-bot: Remove API Portal skin and extension [integration/config] - 10https://gerrit.wikimedia.org/r/1305507 (https://phabricator.wikimedia.org/T429373) (owner: 10Alex Paskulin) [07:58:08] (03open) 10jnuche: jenkins.yaml: update Groovy method signature [repos/releng/jenkins-deploy] - 10https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/119 [07:58:16] (03merge) 
10jnuche: jenkins.yaml: update Groovy method signature [repos/releng/jenkins-deploy] - 10https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/119 [08:01:31] 06Release-Engineering-Team (Priority Backlog 📥), 10Development-Metrics: Evaluate sharing indexed repositories between Grimoirelab and Codesearch - https://phabricator.wikimedia.org/T430117 (10Aklapper) 03NEW p:05Triage→03Low [08:05:08] (03open) 10jnuche: jenkins-rel: update plugins to latest version [repos/releng/jenkins-deploy] - 10https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/120 [08:05:11] (03merge) 10jnuche: jenkins-rel: update plugins to latest version [repos/releng/jenkins-deploy] - 10https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/120 [08:54:04] 10Gerrit, 06collaboration-services, 13Patch-For-Review: Gerrit backups are growing - https://phabricator.wikimedia.org/T411583#12053080 (10ABran-WMF) p:05Unbreak!→03Medium Lowering down priority, while we evaluate the impact of the recent change [09:06:44] (03PS2) 10Mszwarc: WikimediaCustomizations: add CheckUser as dependency [integration/config] - 10https://gerrit.wikimedia.org/r/1305604 (https://phabricator.wikimedia.org/T429785) [09:13:19] (03CR) 10Kosta Harlan: "Should we add it as a phan dependency too?"
[integration/config] - 10https://gerrit.wikimedia.org/r/1305604 (https://phabricator.wikimedia.org/T429785) (owner: 10Mszwarc) [09:16:22] (03PS3) 10Mszwarc: WikimediaCustomizations: add CheckUser as dependency [integration/config] - 10https://gerrit.wikimedia.org/r/1305604 (https://phabricator.wikimedia.org/T429785) [09:16:53] (03CR) 10Dreamy Jazz: [C:03+1] WikimediaCustomizations: add CheckUser as dependency [integration/config] - 10https://gerrit.wikimedia.org/r/1305604 (https://phabricator.wikimedia.org/T429785) (owner: 10Mszwarc) [09:18:44] (03CR) 10Mszwarc: "Added Phan dependency" [integration/config] - 10https://gerrit.wikimedia.org/r/1305604 (https://phabricator.wikimedia.org/T429785) (owner: 10Mszwarc) [09:22:24] (03CR) 10Kosta Harlan: [C:03+1] WikimediaCustomizations: add CheckUser as dependency [integration/config] - 10https://gerrit.wikimedia.org/r/1305604 (https://phabricator.wikimedia.org/T429785) (owner: 10Mszwarc) [09:47:46] 06Project-Admins: Request for project "User-tfogli" for organizing my personal workboard - https://phabricator.wikimedia.org/T430128 (10tappof) 03NEW [09:54:12] (03PS1) 10Jforrester: Docker: Bump Node 24 / Node 26 to new releases [integration/config] - 10https://gerrit.wikimedia.org/r/1305618 [09:54:13] (03PS1) 10Jforrester: jjb: Bump jobs to Node 24 / Node 26 new release images [integration/config] - 10https://gerrit.wikimedia.org/r/1305619 [09:54:27] (03CR) 10Jforrester: [C:03+2] Docker: Bump Node 24 / Node 26 to new releases [integration/config] - 10https://gerrit.wikimedia.org/r/1305618 (owner: 10Jforrester) [09:56:14] (03Merged) 10jenkins-bot: Docker: Bump Node 24 / Node 26 to new releases [integration/config] - 10https://gerrit.wikimedia.org/r/1305618 (owner: 10Jforrester) [10:07:08] !log Docker: Bump Node 24 / Node 26 to new releases [10:07:09] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [10:09:03] 10Gerrit, 06collaboration-services: Gerrit backups are growing - 
https://phabricator.wikimedia.org/T411583#12053339 (10jcrespo) Thanks for the quick response. The main issue was gerrit2003.wikimedia.org-Hourly-Tue-ReposEqiad-gerrit-repo-data: ` | 4,161 | 7,316,041 | 2,545,343,084,546 | gerrit1001.wikimedi... [11:00:52] 10Phabricator, 10Phabricator maintenance bot: Replace deprecated Phabricator Conduit API calls with their stable equivalents - https://phabricator.wikimedia.org/T428847#12053565 (10Aklapper) From a quick poke (obviously untested plus I don't really python): ` diff --git a/column_mover.py b/column_mover.py inde... [11:36:20] (03CR) 10Jforrester: [C:03+2] jjb: Bump jobs to Node 24 / Node 26 new release images [integration/config] - 10https://gerrit.wikimedia.org/r/1305619 (owner: 10Jforrester) [11:38:41] (03Merged) 10jenkins-bot: jjb: Bump jobs to Node 24 / Node 26 new release images [integration/config] - 10https://gerrit.wikimedia.org/r/1305619 (owner: 10Jforrester) [11:46:28] 06Release-Engineering-Team (Doing 😎), 10Development-Metrics: Improve data quality in Development Metrics database (June 2026) - https://phabricator.wikimedia.org/T429739#12053794 (10Aklapper) 05Open→03Resolved [11:58:20] (03CR) 10Hashar: [C:03+2] "Note some CheckUser tests might start failing if they rely on other dependencies.
The tests would need to be skipped ;)" [integration/config] - 10https://gerrit.wikimedia.org/r/1305604 (https://phabricator.wikimedia.org/T429785) (owner: 10Mszwarc) [12:00:24] (03Merged) 10jenkins-bot: WikimediaCustomizations: add CheckUser as dependency [integration/config] - 10https://gerrit.wikimedia.org/r/1305604 (https://phabricator.wikimedia.org/T429785) (owner: 10Mszwarc) [12:12:27] (03PS1) 10Kim.pham: Zuul: Add extension-apitests to WikibaseLexeme extension [integration/config] - 10https://gerrit.wikimedia.org/r/1305645 (https://phabricator.wikimedia.org/T430038) [12:21:49] (03CR) 10Hashar: [C:03+1] "As soon as this is merged and deployed, the new API testing job will fail until I883d930bb38aaca0cae1681cafb1a2c670b1e611 is merged :]" [integration/config] - 10https://gerrit.wikimedia.org/r/1305645 (https://phabricator.wikimedia.org/T430038) (owner: 10Kim.pham) [12:23:10] (03CR) 10Hashar: [C:03+1] "...continuing cause I have pressed send too fast" [integration/config] - 10https://gerrit.wikimedia.org/r/1305645 (https://phabricator.wikimedia.org/T430038) (owner: 10Kim.pham) [12:38:45] (03PS1) 10Hslater: Zuul: [mediawiki/extensions/ContainerFilter] Re-enable master jobs [integration/config] - 10https://gerrit.wikimedia.org/r/1305663 [12:50:37] 10Gerrit, 06collaboration-services: Gerrit backups are growing - https://phabricator.wikimedia.org/T411583#12054018 (10ABran-WMF) Thanks @jcrespo for the heads up. Hopefully not backing up logs should help reduce the total volume. Let me know if it's not enough so we can try something else. What do you think... [12:56:44] (03CR) 10Kim.pham: "ok thank you!" 
[integration/config] - 10https://gerrit.wikimedia.org/r/1305645 (https://phabricator.wikimedia.org/T430038) (owner: 10Kim.pham) [13:17:49] 10Phabricator, 06DBA: Switchover m3 (phabricator) master (db1250 -> db1228) - https://phabricator.wikimedia.org/T430158#12054162 (10Marostegui) [13:22:05] 10Gerrit, 06collaboration-services: Gerrit backups are growing - https://phabricator.wikimedia.org/T411583#12054172 (10jcrespo) >>! In T411583#12054018, @ABran-WMF wrote: > What do you think about [[ https://codesearch.wmcloud.org/search/?q=predict_linear&files=&excludeFiles=&repos=operations%2Falerts | "predi... [13:26:04] 06Release-Engineering-Team, 10Scap: https://versions.toolforge.org/ shows group2 as partially-deployed - https://phabricator.wikimedia.org/T430159 (10Jdforrester-WMF) 03NEW [13:37:21] (03CR) 10Kosta Harlan: "I implemented the trait approach in Ic8436b96876bb8feebb7cae53006d6f354260e1a" [integration/config] - 10https://gerrit.wikimedia.org/r/1295049 (https://phabricator.wikimedia.org/T426089) (owner: 10Kosta Harlan) [13:37:26] (03Abandoned) 10Kosta Harlan: Zuul: [mediawiki/extensions/Wikibase] Add ConfirmEdit to repository job deps [integration/config] - 10https://gerrit.wikimedia.org/r/1295049 (https://phabricator.wikimedia.org/T426089) (owner: 10Kosta Harlan) [13:40:40] (03CR) 10Alex Paskulin: "Thank you!" 
[integration/config] - 10https://gerrit.wikimedia.org/r/1305507 (https://phabricator.wikimedia.org/T429373) (owner: 10Alex Paskulin) [13:46:56] (03approved) 10aghirelli: Add rule reference [repos/ci-tools/wikimedia-spectral-ruleset] - 10https://gitlab.wikimedia.org/repos/ci-tools/wikimedia-spectral-ruleset/-/merge_requests/13 (https://phabricator.wikimedia.org/T422930) (owner: 10kbach) [13:47:03] (03merge) 10aghirelli: Add rule reference [repos/ci-tools/wikimedia-spectral-ruleset] - 10https://gitlab.wikimedia.org/repos/ci-tools/wikimedia-spectral-ruleset/-/merge_requests/13 (https://phabricator.wikimedia.org/T422930) (owner: 10kbach) [14:09:27] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team, 06collaboration-services, 13Patch-For-Review: setup 2 contint machines for jenkins - https://phabricator.wikimedia.org/T418521#12054392 (10hashar) >>! In T418521#12051810, @Dzahn wrote: > @hashar regarding the test with netcat: > > This... [14:20:43] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team, 06collaboration-services, 13Patch-For-Review: setup 2 contint machines for jenkins - https://phabricator.wikimedia.org/T418521#12054436 (10hashar) >>! In T418521#12052337, @Dzahn wrote: > With [[ https://gerrit.wikimedia.org/r/c/operation... 
[14:21:43] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team, 06collaboration-services, 13Patch-For-Review: setup 2 contint machines for jenkins - https://phabricator.wikimedia.org/T418521#12054440 (10hashar) [14:32:32] FIRING: InstanceDown: Project deployment-prep instance deployment-cirrussearch14 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [14:33:32] FIRING: PuppetStaleCertificates: Found non-revoked Puppet certificates for 4 deleted instances on deployment-puppetserver-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates [14:33:40] 10Beta-Cluster-Infrastructure: Found non-revoked Puppet certificates for 4 deleted instances on deployment-puppetserver-1 - https://phabricator.wikimedia.org/T430165 (10wmcs-alerts) 03NEW [14:37:32] RESOLVED: [3x] InstanceDown: Project deployment-prep instance deployment-cirrussearch13 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [14:53:27] (03CR) 10Jakob: [C:03+1] "I883d930bb38aaca0cae1681cafb1a2c670b1e611 has landed, so I think this is also good to go now! 
✨" [integration/config] - 10https://gerrit.wikimedia.org/r/1305645 (https://phabricator.wikimedia.org/T430038) (owner: 10Kim.pham) [14:53:34] FIRING: InstanceDown: Project deployment-prep instance deployment-dancy3 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [14:53:40] 10Beta-Cluster-Infrastructure: Project deployment-prep instance deployment-dancy3 is down - https://phabricator.wikimedia.org/T430168 (10wmcs-alerts) 03NEW [14:58:34] RESOLVED: InstanceDown: Project deployment-prep instance deployment-dancy3 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [14:58:51] 10Beta-Cluster-Infrastructure: Found non-revoked Puppet certificates for 4 deleted instances on deployment-puppetserver-1 - https://phabricator.wikimedia.org/T430165#12054636 (10dancy) I ran ` sudo puppetserver ca clean --certname deployment-dancy2.deployment-prep.eqiad1.wikimedia.cloud sudo puppetserver ca cle... [15:01:52] 06Release-Engineering-Team, 10Scap: https://versions.toolforge.org/ shows group2 as partially-deployed - https://phabricator.wikimedia.org/T430159#12054665 (10Aklapper) Only a vague idea: Expanding the list for group2 lists `apiportalwiki` while {T429372} is resolved [15:03:34] 10Continuous-Integration-Infrastructure, 07Jenkins, 07SecTeam-Processed, 05Vuln-VulnComponent: Jenkins plugins security advisory 2026-06-24 - https://phabricator.wikimedia.org/T430110#12054669 (10sbassett) [15:03:36] 10Continuous-Integration-Infrastructure, 07Jenkins, 07SecTeam-Processed, 05Vuln-VulnComponent: Jenkins plugins security advisory 2026-06-24 - https://phabricator.wikimedia.org/T430110#12054670 (10sbassett) p:05Triage→03Medium [15:03:58] 10Beta-Cluster-Infrastructure: No Puppet resources found on instance deployment-cirrussearch14 on project deployment-prep - https://phabricator.wikimedia.org/T428819#12054678 (10dancy) →14Duplicate dup:03T424100 [15:04:03] 10Beta-Cluster-Infrastructure, 06Data-Platform-SRE (2026-06-05 - 2026-06-26): No
Puppet resources found on instance deployment-cirrussearch14 on project deployment-prep - https://phabricator.wikimedia.org/T424100#12054680 (10dancy) [15:04:08] 10Beta-Cluster-Infrastructure: Found non-revoked Puppet certificates for 4 deleted instances on deployment-puppetserver-1 - https://phabricator.wikimedia.org/T430165#12054682 (10dancy) 05Open→03Resolved a:03dancy And: ` for n in $(seq 12 14); do host=deployment-cirrussearch$n.deployment-prep.eqiad1.wik... [15:06:02] 10Beta-Cluster-Infrastructure: Project deployment-prep instance deployment-cirrussearch14 is down - https://phabricator.wikimedia.org/T429296#12054690 (10dancy) 05Open→03Resolved a:03dancy T425585 T430165#1205468 [15:10:45] 10Beta-Cluster-Infrastructure, 06Data-Platform-SRE (2026-06-05 - 2026-06-26): Write lightweight OCI-image-based Puppet plans for beta cluster - https://phabricator.wikimedia.org/T425585#12054747 (10bking) @dcausse thanks for the questions, let me address those: > I believe that search is back on the beta clus... [15:11:35] (03PS2) 10Ottomata: phan: Add Wikibase as EventBus phan dependency [integration/config] - 10https://gerrit.wikimedia.org/r/1297767 (https://phabricator.wikimedia.org/T428176) [15:13:32] RESOLVED: PuppetStaleCertificates: Found non-revoked Puppet certificates for 3 deleted instances on deployment-puppetserver-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates [15:19:32] 06Release-Engineering-Team, 10Scap: https://versions.toolforge.org/ shows group2 as partially-deployed - https://phabricator.wikimedia.org/T430159#12054789 (10Jdforrester-WMF) Aha, yes, that might be the issue.
[15:21:08] 10Phabricator, 10LibUp: Replace deprecated (frozen) Phabricator Conduit API calls with their stable equivalents - https://phabricator.wikimedia.org/T427620#12054826 (10Aklapper) [15:23:52] (03CR) 10Kim.pham: "ready for merge" [integration/config] - 10https://gerrit.wikimedia.org/r/1305645 (https://phabricator.wikimedia.org/T430038) (owner: 10Kim.pham) [15:35:37] 06Release-Engineering-Team (Priority Backlog 📥), 05Release, 05Train Deployments: 1.47.0-wmf.8 deployment blockers - https://phabricator.wikimedia.org/T423917#12054948 (10brennen) a:05brennen→03jeena Filed a few things during this morning's log triage meeting. Logs are a little messy, but I don't think th... [15:41:11] (03update) 10dancy: main.py: Only announce rollback when k8s deployment is enabled [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1210 [15:41:13] (03open) 10dancy: main.py: Only announce rollback when k8s deployment is enabled [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1210 [15:44:54] (03merge) 10dancy: main.py: Only announce rollback when k8s deployment is enabled [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1210 [15:45:54] (03update) 10dancy: Release 4.270.1 [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1211 [15:45:55] (03open) 10dancy: Release 4.270.1 [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1211 [15:48:15] (03merge) 10dancy: Release 4.270.1 [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1211 [17:30:29] 10Beta-Cluster-Infrastructure, 06Data-Platform-SRE (2026-06-05 - 2026-06-26), 13Patch-For-Review: Write lightweight OCI-image-based Puppet plans for beta cluster - https://phabricator.wikimedia.org/T425585#12055784 (10Krinkle) [18:00:38] 10Beta-Cluster-Infrastructure, 13Patch-For-Review: Eliminate sudo rm -rf
/var/lib/puppet/ssl step in new deployment-prep WMCS project (and others) - https://phabricator.wikimedia.org/T429413#12055959 (10bd808) 05Open→03In progress a:03dancy [18:55:32] FIRING: PuppetAgentNoResources: No Puppet resources found on instance deployment-cirrussearch15 on project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [18:55:45] 10Beta-Cluster-Infrastructure: No Puppet resources found on instance deployment-cirrussearch15 on project deployment-prep - https://phabricator.wikimedia.org/T430217 (10wmcs-alerts) 03NEW [19:04:02] 10Beta-Cluster-Infrastructure, 06Data-Platform-SRE (2026-06-05 - 2026-06-26): No Puppet resources found on instance deployment-cirrussearch15 on project deployment-prep - https://phabricator.wikimedia.org/T430217#12056264 (10bking) 05Open→03In progress p:05Triage→03Medium a:03bking [19:14:55] (03PS1) 101Veertje: Fix login form: use label elements instead of th for accessibility [software/gerrit] (wmf/stable-3.10) - 10https://gerrit.wikimedia.org/r/1305742 [19:15:32] RESOLVED: PuppetAgentNoResources: No Puppet resources found on instance deployment-cirrussearch15 on project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [19:20:05] James_F: oops, I should probably deploy https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1304247 before the next branch goes out. [19:20:27] I failed to consider what would happen if ShortUrl is not branched / present as wmf branch submodule with that still there [19:20:37] it's unreachable I guess, except for extension-list [19:20:57] I'm surprised it hasn't caused an issue yet [19:22:16] https://gitlab.wikimedia.org/repos/releng/scap/-/blob/908f68c23ce69b0911fe9cac617285e13dd95d35/scap/tasks.py#L568 [19:22:21] "fall back to the old location in wmf-config" [19:22:40] Either that's very outdated or there is an as-of-yet unused new location.
[19:29:04] (03PS2) 101Veertje: Fix login form: use label elements instead of th for accessibility [software/gerrit] (wmf/stable-3.10) - 10https://gerrit.wikimedia.org/r/1305742 [19:29:17] At a glance mergeMessageFileList.php would error if it finds a file in extension-list that doesn't exist [19:29:57] That feels dejavu [19:31:48] (03PS1) 101Veertje: Fix login form: responsive labels and mobile-friendly layout [software/gerrit] (wmf/stable-3.10) - 10https://gerrit.wikimedia.org/r/1305745 [19:45:32] Krinkle: it’s caused an issue in the nightly builds. [19:45:39] So yes, please deploy that. [19:50:37] 06Release-Engineering-Team (Priority Backlog 📥), 05Release, 05Train Deployments: 1.47.0-wmf.8 deployment blockers - https://phabricator.wikimedia.org/T423917#12056375 (10cscott) ##### Risky Patch! 🚂🔥 * **Change**: https://gerrit.wikimedia.org/r/c/1305746 * **Summary**: ** This configuration change depends... [19:50:53] James_F: trying to find it in logstash, where should I look? [19:51:06] 14Phabricator (2026-06-09), 06Release-Engineering-Team (Doing 😎): Update to Phorge/Arcanist upstream 2026-06-01 - https://phabricator.wikimedia.org/T410849#12056378 (10bd808) [19:51:38] I'm getting nothing from scap and from mediawiki I only see the (absence of) "role-primary: INSERT IGNORE INTO `shorturls`" write on get warnings after June 19 [19:51:46] when searching for ShortUrl [19:52:41] Krinkle: releases-Jenkins [19:53:31] only seeing docpub jobs at https://releases-jenkins.wikimedia.org/computer/releases1003%2Eeqiad%2Ewmnet/builds [19:54:25] Yes, you won’t see it.
[20:49:58] !log Investigating castor-save-workspace-cache clog [20:49:59] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:57:49] !log Restarting Jenkins to unstick builds [20:57:50] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [21:13:58] 10Beta-Cluster-Infrastructure, 06Data-Platform-SRE (2026-06-05 - 2026-06-26), 13Patch-For-Review: Write lightweight OCI-image-based Puppet plans for beta cluster - https://phabricator.wikimedia.org/T425585#12056578 (10bking) [21:20:04] (03PS1) 10Dduvall: zuul: Fix uploading of logs for base job [integration/config] - 10https://gerrit.wikimedia.org/r/1305764 [21:32:17] (03Abandoned) 10Dduvall: zuul: Fix uploading of logs for base job [integration/config] - 10https://gerrit.wikimedia.org/r/1305764 (owner: 10Dduvall) [21:44:51] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team, 06collaboration-services, 13Patch-For-Review: setup 2 contint machines for jenkins - https://phabricator.wikimedia.org/T418521#12056664 (10Dzahn) @hashar - let apache httpd load all the proxy configs - loaded ssl module - turned SSLProxy... [21:45:20] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team, 06collaboration-services, 13Patch-For-Review: setup 2 contint machines for jenkins - https://phabricator.wikimedia.org/T418521#12056672 (10Dzahn) note: this only works with `nc -6` when using IPv6 [21:47:29] mutante: well done! I am quite happy you have figured out the Envoy config fix!! And sorry for the misleading debugging due to nc only listening on IPv4 :] [21:48:10] I guess we can attempt the switch over again next week [21:48:27] hashar: :) thank you. well, I can proof now that Apache proxies via https to envoy and envoy connects to 8080 on localhost [21:48:50] the thing is: so far it will only work if jenkins will listen on IPv6.. 
not if it's IPv4-only [21:49:06] wondering if I should start jenkins itself now [21:49:12] or leave it at the test with nc [21:50:09] well, I say it should just work though [21:50:17] because this is how jenkins looks on the contint1002 host [21:50:18] tcp6 0 0 :::8080 :::* LISTEN 924 197983639 2873424/java [21:50:23] obviously v6 [21:50:37] we can schedule the next meeting I guess [21:51:02] or I can bring up the service so we can see the web UI [21:51:23] mutante: so that is an issue with Envoy? [21:51:54] if you bring it up, you'd need to first delete the jobs/nodes configuration on contint1003 [21:52:06] /var/lib/jenkins/config/nodes /var/lib/jenkins/config/jobs [21:52:38] that'll prevent the new Jenkins from attaching the WMCS agents [21:53:01] and since there are no more jobs, prevent it from triggering the jobs that run on a timer (doc generation, selenium tests etc) [21:53:17] I would not necessarily call it an issue. it's just that it prefers IPv6. [21:53:26] and we currently say to use "localhost" [21:53:30] or well have a nc listening on ipv4 and another one on ipv6 ;) [21:53:35] we could hardcode 127.0.0.1 [21:53:43] but I don't think we need to or should [21:53:49] since it should work as is [21:54:04] > so far it will only work if jenkins will listen on IPv6.. not if it's IPv4-only [21:54:18] that can be verified on contint1002?
I would expect it to listen on both protocols [21:54:20] yea, but then I looked at the jenkins we have [21:54:28] and it's on IPv6 [21:54:33] so it's alright [21:54:44] tcp6 0 0 :::8080 :::* LISTEN 924 197983639 2873424/java [21:55:17] and if you want to hardcode the localhost as ipv6 you can use `ip6-localhost` [21:55:22] /etc/hosts: ::1 localhost ip6-localhost ip6-loopback [21:55:31] yea, this should mean on BOTH [21:55:36] yup [21:56:18] 10Beta-Cluster-Infrastructure: Project deployment-prep instance deployment-cirrussearch12 is down - https://phabricator.wikimedia.org/T429299#12056688 (10bd808) 05Open→03Invalid [21:57:08] https://integration.wikimedia.org/jenkins [21:58:34] let me follow the advice on how to safely just start jenkins [21:59:53] hashar: already has no config: /var/lib/jenkins/config: cannot open `/var/lib/jenkins/config' (No such file or directory) [22:01:30] well wrong path maybe :b [22:01:58] yeah job configs are in /var/lib/jenkins/jobs [22:02:01] you can nuke it [22:02:19] ACK, done. rm -rf jobs [22:02:50] after all this "making sure jenkins is REALLY not running" now need to figure out how to make it run again, hah [22:02:52] and nodes are in /var/lib/jenkins/nodes , you can keep them though they are not going to run anything magically [22:03:01] ok, nod [22:03:06] and that validates the new jenkins can attach to the wmcs instances [22:03:14] ok, that's good [22:03:23] I am pretty sure it mostly worked behind the scenes when we did the switch [22:03:39] the blocking point was the envoy config + apache https/SSL thing and you have resolved both [22:03:45] :) [22:03:48] so I guess that unblocks the switchover! (kudos!) [22:04:19] great to hear, thx. ok, I will turn jenkins on [22:05:30] do you have a link for the migration handbook? I'll revisit/amend it tomorrow ;) [22:05:35] and then we can schedule a switchover [22:06:20] also deleting jobs dir on contint2003 just in case [22:06:57] I have to write a new one now reflecting the new setup.
[22:07:00] but it will be short :) [22:07:22] I can mail that to you so you have it tomorrow when you wake up. [22:08:35] one way can be: 'use /jenkins instead of /ci'. the end." [22:08:53] or you hate that path and we renamed /jenkins to /ci :) [22:11:04] RECOVERY - jenkins_service_running on contint1003 is OK: PROCS OK: 1 process with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins [22:11:46] there we go [22:11:52] hashar: success:) https://integration.wikimedia.org/jenkins/ [22:12:10] and with that.. let you go to bed, heh [22:16:18] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team, 06collaboration-services, 13Patch-For-Review: setup 2 contint machines for jenkins - https://phabricator.wikimedia.org/T418521#12056769 (10Dzahn) Setup/fixed the apache->envoy->jenkins proxy chain from contint1002->contint1003 over SSL.... [22:21:55] 10Continuous-Integration-Infrastructure, 10Castor: gate-and-submit backlogged due to waiting for castor-save-workspace-cache - https://phabricator.wikimedia.org/T353925#12056781 (10hashar) @Peter rediscovered it as one job has a very large cache. The investigation has been conducted at T427450 @Mhurd was inv... [22:22:35] 10Continuous-Integration-Infrastructure, 10Castor: gate-and-submit backlogged due to waiting for castor-save-workspace-cache - https://phabricator.wikimedia.org/T353925#12056784 (10hashar) [22:37:12] mutante: great and congrats! Can you shut it down on contint1003? 
;] [22:37:23] in case there is some side effect [23:20:49] ok, reverted service activation with https://gerrit.wikimedia.org/r/c/operations/puppet/+/1305782 [23:29:06] 10Beta-Cluster-Infrastructure, 06Traffic: Project deployment-prep instance deployment-cache-text08 is down - https://phabricator.wikimedia.org/T429552#12056902 (10bd808) 05Open→03Invalid [23:29:14] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team, 06collaboration-services, 13Patch-For-Review: setup 2 contint machines for jenkins - https://phabricator.wikimedia.org/T418521#12056905 (10Dzahn) masked the service again because I was asked to deactivate it in case there is a side effect... [23:29:31] 10Beta-Cluster-Infrastructure: Project deployment-prep instance deployment-mwmaint03 is down - https://phabricator.wikimedia.org/T429592#12056906 (10bd808) 05Open→03Invalid [23:30:04] PROBLEM - jenkins_service_running on contint1003 is CRITICAL: PROCS CRITICAL: 0 processes with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins [23:30:48] 10Beta-Cluster-Infrastructure: Project deployment-prep instance deployment-dancy3 is down - https://phabricator.wikimedia.org/T430168#12056908 (10bd808) 05Open→03Invalid [23:36:01] 10Beta-Cluster-Infrastructure: Error during startup of php8.3-fpm on deployment-jobrunner05.deployment-prep - https://phabricator.wikimedia.org/T430062#12056931 (10bd808) The file is where I would expect it to be. `lang=shell-session bd808@deployment-jobrunner05:~$ ls -lh /srv/mediawiki/php-master/includes/Logge...
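Note on the 21:48-21:55 exchange above, where the `tcp6 ... :::8080 ... LISTEN` line from contint1002 is read as "listening on BOTH": a minimal Python sketch of that reading, not anything run in the channel. The helper names `parse_listener` and `may_accept_ipv4` are hypothetical, and the column layout is assumed to match the pasted netstat-style line. It only encodes the usual Linux behaviour, where a tcp6 socket bound to the wildcard `::` also accepts IPv4 unless IPV6_V6ONLY (or the `net.ipv6.bindv6only` sysctl) is enabled, which the listing itself cannot show.

```python
from dataclasses import dataclass


@dataclass
class Listener:
    proto: str
    host: str
    port: int


def parse_listener(line: str) -> Listener:
    """Parse one netstat-style LISTEN line into protocol, host and port."""
    fields = line.split()
    proto, local = fields[0], fields[3]
    # The port follows the last colon; everything before it is the address.
    host, _, port = local.rpartition(":")
    return Listener(proto=proto, host=host, port=int(port))


def may_accept_ipv4(listener: Listener) -> bool:
    """Heuristic only: True if the socket could plausibly take IPv4 clients."""
    if listener.proto == "tcp":
        return True
    # Assumption: tcp6 on the wildcard "::" is dual-stack, which holds on
    # Linux only while IPV6_V6ONLY is off; the listing cannot confirm that.
    return listener.proto == "tcp6" and listener.host == "::"


line = "tcp6 0 0 :::8080 :::* LISTEN 924 197983639 2873424/java"
listener = parse_listener(line)
print(listener, may_accept_ipv4(listener))
```

A tcp6 listener bound to `::1` instead would be IPv6 loopback only, which is the case the `nc` test ran into from the other direction: an IPv4-only listener behind a proxy that resolved "localhost" to IPv6 first.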