[00:01:39] FIRING: DatasourceError: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[00:11:39] RESOLVED: DatasourceError: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[00:14:12] Continuous-Integration-Config, doc.wikimedia.org: Jsdoc for MediaWiki extension is published with an extra directory: master/js/js - https://phabricator.wikimedia.org/T359907#9924435 (Novem_Linguae)
[00:14:23] Continuous-Integration-Config, doc.wikimedia.org: Jsdoc for MediaWiki extension is published with an extra directory: master/js/js - https://phabricator.wikimedia.org/T359907#9924433 (Novem_Linguae)
[00:18:32] Continuous-Integration-Config, doc.wikimedia.org: Jsdoc for MediaWiki extension is published with an extra directory: master/js/js - https://phabricator.wikimedia.org/T359907#9924470 (Novem_Linguae) I noticed this too. Pros of having CI automatically add /js: - no chance of it conflicting with Doxygen o...
[02:37:59] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure): Builds failing with "Failed to clone https://github.com/wikimedia/oauth2-server.git" - https://phabricator.wikimedia.org/T368490 (matmarex) NEW
[03:42:17] Beta-Cluster-Infrastructure: Request for interface admin permission on Chinese Wikipedia Beta Cluster - https://phabricator.wikimedia.org/T368491 (Diskdance) NEW
[03:48:08] Phabricator, Patch-For-Review: Make config page display version information - https://phabricator.wikimedia.org/T360756#9924751 (Pppery) Apologies for misunderstanding the process.
Nevertheless, the fact that for certain repositories (apparently puppet/operations stuff) there is an people need to get exp...
[07:02:26] Beta-Cluster-Infrastructure: Request for interface admin permission on Chinese Wikipedia Beta Cluster - https://phabricator.wikimedia.org/T368491#9924955 (DannyS712) Open→Resolved a:DannyS712 Granted for 1 year https://zh.wikipedia.beta.wmflabs.org/wiki/Special:Redirect/logid/10419
[08:03:41] (CR) Hashar: [C:+2] Merge branch 'stable-3.10' into wmf/stable-3.10 [software/gerrit] (wmf/stable-3.10) - https://gerrit.wikimedia.org/r/1043813 (https://phabricator.wikimedia.org/T367419) (owner: Hashar)
[08:03:51] (CR) Hashar: [C:+2] Gerrit 3.10.x rebuild plugins and update TypeScript API [software/gerrit] (deploy/wmf/stable-3.10) - https://gerrit.wikimedia.org/r/1047175 (https://phabricator.wikimedia.org/T367419) (owner: Hashar)
[08:15:17] (CR) CI reject: [V:-1] Merge branch 'stable-3.10' into wmf/stable-3.10 [software/gerrit] (wmf/stable-3.10) - https://gerrit.wikimedia.org/r/1043813 (https://phabricator.wikimedia.org/T367419) (owner: Hashar)
[08:15:41] (CR) Hashar: [C:-2] Gerrit 3.10.x rebuild plugins and update TypeScript API [software/gerrit] (deploy/wmf/stable-3.10) - https://gerrit.wikimedia.org/r/1047175 (https://phabricator.wikimedia.org/T367419) (owner: Hashar)
[08:18:50] (CR) Hashar: [C:+2] Merge branch 'stable-3.10' into wmf/stable-3.10 [software/gerrit] (wmf/stable-3.10) - https://gerrit.wikimedia.org/r/1043813 (https://phabricator.wikimedia.org/T367419) (owner: Hashar)
[08:29:04] (Merged) jenkins-bot: Merge branch 'stable-3.10' into wmf/stable-3.10 [software/gerrit] (wmf/stable-3.10) - https://gerrit.wikimedia.org/r/1043813 (https://phabricator.wikimedia.org/T367419) (owner: Hashar)
[08:29:23] (CR) Hashar: [C:+2] Gerrit 3.10.x rebuild plugins and update TypeScript API [software/gerrit] (deploy/wmf/stable-3.10) - https://gerrit.wikimedia.org/r/1047175
(https://phabricator.wikimedia.org/T367419) (owner: Hashar)
[08:29:59] (Merged) jenkins-bot: Gerrit 3.10.x rebuild plugins and update TypeScript API [software/gerrit] (deploy/wmf/stable-3.10) - https://gerrit.wikimedia.org/r/1047175 (https://phabricator.wikimedia.org/T367419) (owner: Hashar)
[08:41:20] Project beta-code-update-eqiad build #501828: FAILURE in 4 min 17 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/501828/
[08:43:01] Project beta-code-update-eqiad build #501829: STILL FAILING in 0.55 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/501829/
[08:54:16] (open) jnuche: branch-cut-test-patches: temporarily remove cleaup [repos/releng/release] - https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/84
[08:54:45] !log gerrit: changed HEAD of operations/software/gerrit from deploy/wmf/stable-3.9 to deploy/wmf/stable-3.10 # T367419
[08:54:47] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[08:54:47] T367419: Upgrade to Gerrit 3.10 - https://phabricator.wikimedia.org/T367419
[08:55:10] (update) jnuche: branch-cut-test-patches: temporarily remove cleaup [repos/releng/release] - https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/84
[08:55:22] Yippee, build fixed!
[08:55:22] Project beta-code-update-eqiad build #501830: FIXED in 2 min 21 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/501830/
[08:55:43] (update) jnuche: branch-cut-test-patches: temporarily remove cleanup [repos/releng/release] - https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/84
[08:56:04] Gerrit (Gerrit 3.10), Release-Engineering-Team, collaboration-services: Upgrade to Gerrit 3.10 - https://phabricator.wikimedia.org/T367419#9925355 (hashar) Open→Resolved
[08:56:34] (merge) jnuche: branch-cut-test-patches: temporarily remove cleanup [repos/releng/release] - https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/84
[09:02:57] codders: do you have any early feedback from WMDE engineers on the parallel phpunit rollout?
[09:03:09] morning!
[09:03:24] yeah - had a couple of reports of issues so far - FileImporter is blocked
[09:03:42] but we have a patch to fix that. It was because of global state between tests
[09:04:00] the WMDE team is taking part in a works-council meeting this morning, so there's also not much development happening so far
[09:04:06] no complaints besides that though
[09:04:13] (open) jnuche: scap clean: perform l10n cleanup only when l10n files can be found [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/365 (https://phabricator.wikimedia.org/T368239)
[09:04:34] (update) jnuche: scap clean: perform l10n cleanup only when l10n files can be found [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/365 (https://phabricator.wikimedia.org/T368239)
[09:05:30] kostajh: codders: thank you for monitoring the aftermath of enabling PHPUnit parallelization
[09:05:31] :)
[09:06:14] did the servers fall over? or are we doing okay for load?
[09:06:58] (update) jnuche: scap clean: perform l10n cleanup only when l10n files can be found [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/365 (https://phabricator.wikimedia.org/T368239)
[09:07:19] codders: ok thx. Can you please link the tasks / patches to https://phabricator.wikimedia.org/T361190 if they are not already?
[09:07:58] codders: I have no idea :)
[09:09:15] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), Composer: "composer install" flaky in CI due to "Failed to connect to github.com port 443: Connection timed out" - https://phabricator.wikimedia.org/T362095#9925393 (Tgr)
[09:09:24] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure): Builds failing with "Failed to clone https://github.com/wikimedia/oauth2-server.git" - https://phabricator.wikimedia.org/T368490#9925391 (Tgr) →Duplicate dup:T362095
[09:09:25] (update) jnuche: scap clean: perform l10n cleanup only when l10n files can be found [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/365 (https://phabricator.wikimedia.org/T368239)
[09:11:11] Continuous-Integration-Infrastructure, Developer Productivity: CI depending on GitHub results in numerous failures outside our control - https://phabricator.wikimedia.org/T362426#9925407 (Tgr)
[09:11:13] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), Composer: "composer install" flaky in CI due to "Failed to connect to github.com port 443: Connection timed out" - https://phabricator.wikimedia.org/T362095#9925410 (Tgr)
[09:11:17] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), MW-1.43-notes (1.43.0-wmf.12; 2024-07-02), Patch-For-Review: ForeignResourceStructureTest flaky in CI due to "Failed to download resource at https://codeload.github.c...
- https://phabricator.wikimedia.org/T362425#9925408
[09:13:55] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), Composer: "composer install" flaky in CI due to "Failed to connect to github.com port 443: Connection timed out" - https://phabricator.wikimedia.org/T362095#9925430 (Tgr)
[09:18:03] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), Composer: "composer install" flaky in CI due to "Failed to connect to github.com port 443: Connection timed out" - https://phabricator.wikimedia.org/T362095#9925444 (hashar) Open→Resolved Marking this task...
[09:24:49] Continuous-Integration-Infrastructure, cloud-services-team, MediaWiki-Vagrant, Composer, Upstream: Composer activity from Cloud VPS hosts can be rate limited by GitHub - https://phabricator.wikimedia.org/T106452#9925469 (hashar) Open→Resolved a:bd808 This was solved for #conti...
[09:25:50] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), Composer: "composer install" flaky in CI due to "Failed to connect to github.com port 443: Connection timed out" - https://phabricator.wikimedia.org/T362095#9925482 (hashar)
[09:27:16] codders: there were a bunch of CI issues yesterday that had to do with other things, so it was not really a "normal" day for comparison
[09:27:29] Continuous-Integration-Infrastructure, Developer Productivity: CI depending on GitHub results in numerous failures outside our control - https://phabricator.wikimedia.org/T362426#9925502 (hashar) Open→Declined I feel this task was filed as GitHub had some transient issue and I don't think the...
[09:42:05] Phabricator: Automate weekly request for Phabricator data for potential Tech News entries - https://phabricator.wikimedia.org/T368460#9925573 (Aklapper) p:Triage→High
[09:45:50] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), MW-1.43-notes (1.43.0-wmf.12; 2024-07-02), Patch-For-Review: ForeignResourceStructureTest flaky in CI due to "Failed to download resource at https://codeload.github.c... - https://phabricator.wikimedia.org/T362425#9925584
[10:29:34] FIRING: DatasourceError: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[10:34:34] RESOLVED: DatasourceError: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[10:47:29] Beta-Cluster-Infrastructure, Cloud-VPS (Debian Buster Deprecation): Migrate deployment-prep away from Debian Buster to Bullseye/Bookworm - https://phabricator.wikimedia.org/T327742#9925836 (hnowlan)
[10:48:32] Phabricator, Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review: Automate weekly request for Phabricator data for potential Tech News entries - https://phabricator.wikimedia.org/T368460#9925837 (Aklapper)
[10:54:22] you can copy lines on the diff screen on gerrit in safari (it's been fixed, from 3.10)
[10:57:55] Phabricator, Patch-For-Review: Make config page display version information - https://phabricator.wikimedia.org/T360756#9925855 (Aklapper) a:Dzahn I guess I should realistically assign this to @Dzahn who's been brave enough to sort out all those small bits and pieces in our infrastructure setup? (Tha...
[11:51:05] (PS1) Daimona Eaytoy: Zuul: [mediawiki/extensions/CampaignEvents] Add Translate dependency [integration/config] - https://gerrit.wikimedia.org/r/1049912
[11:51:13] (CR) CI reject: [V:-1] Zuul: [mediawiki/extensions/CampaignEvents] Add Translate dependency [integration/config] - https://gerrit.wikimedia.org/r/1049912 (owner: Daimona Eaytoy)
[11:52:33] (open) btullis: Allow repos/data-engineering/pgbouncer to use trusted runners [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/86 (https://phabricator.wikimedia.org/T368030)
[12:18:08] (PS1) Arthur taylor: Add notice at the end of console log for parallel test runs [integration/quibble] - https://gerrit.wikimedia.org/r/1049919 (https://phabricator.wikimedia.org/T361190)
[12:34:21] (PS2) Arthur taylor: Add notice at the end of console log for parallel test runs [integration/quibble] - https://gerrit.wikimedia.org/r/1049919 (https://phabricator.wikimedia.org/T361190)
[12:35:46] kostajh, hashar - per the request yesterday, I made a patch to add a little notice to the end of quibble runs that have parallel enabled - https://gerrit.wikimedia.org/r/c/integration/quibble/+/1049919
[12:50:23] Phabricator, Patch-For-Review: Make config page display version information - https://phabricator.wikimedia.org/T360756#9926267 (Dzahn) Yea, I can confirm that adding the git config snippet will fix the "display version" issue. But still working on how to apply the config via puppet and/or scap.
[13:01:58] (CR) Kosta Harlan: "Thanks! Can you add an entry to the CHANGELOG.rst please?"
[integration/quibble] - https://gerrit.wikimedia.org/r/1049919 (https://phabricator.wikimedia.org/T361190) (owner: Arthur taylor)
[13:03:55] thanks codders
[13:08:43] (PS3) Arthur taylor: Add notice at the end of console log for parallel test runs [integration/quibble] - https://gerrit.wikimedia.org/r/1049919 (https://phabricator.wikimedia.org/T361190)
[13:09:07] (CR) Arthur taylor: "done!" [integration/quibble] - https://gerrit.wikimedia.org/r/1049919 (https://phabricator.wikimedia.org/T361190) (owner: Arthur taylor)
[13:27:07] (open) lucaswerkmeister-wmde: scap backport: Also allow /r/c/ change URLs [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/366
[14:59:47] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), MW-1.43-notes (1.43.0-wmf.12; 2024-07-02), Patch-For-Review: ForeignResourceStructureTest flaky in CI due to "Failed to download resource at https://codeload.github.c... - https://phabricator.wikimedia.org/T362425#9926852
[15:03:03] hashar: looking at castor.. the php80 job was removed and other jobs don't seem to have this cache?
[15:03:51] yeah
[15:04:03] I don't get why :/
[15:06:22] hashar: so.. here is a question. if I write to XDG_CACHE_HOME/foobar from a PHPUnit job, and I merge that patch.
[15:06:32] And then I write a second patch that removes writing to that sub directory and merge that too.
[15:06:37] Will XDG_CACHE_HOME/foobar still exist for some time?
[15:07:17] Yesterday this unit test was disabled so it is not used right now, I wonder if that might be why we're not seeing it on Castor anymore.
[15:08:04] (update) sg912: Adding CIM repo [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/84
[15:10:41] hmm maybe
[15:11:00] then if it is restored from cache it should be saved back
[15:11:06] but maybe
[15:13:21] I am trying with https://integration.wikimedia.org/ci/job/mediawiki-quibble-vendor-mysql-php74/5640/console
[15:13:26] made it to be a "postmerge" job
[15:13:29] which should trigger castor
[15:13:32] we shall see
[15:31:25] `krinkle@integration-castor05:/srv/castor/mediawiki-core/master$ ls -d */mw-foreign`
[15:31:33] moauahaha
[15:31:37] found an issue
[15:31:51] Krinkle: ah thanks! so I had a script watching it
[15:31:57] while true; do [ ! -d /srv/castor/mediawiki-core/master/mediawiki-quibble-vendor-mysql-php74/mw-foreign ] && echo "VANISHED ! $(date -R)"; sleep 1; done;
[15:32:04] VANISHED ! Wed, 26 Jun 2024 15:30:02 +0000
[15:32:10] so yeah something deleted the dir
[15:32:24] wait, it gets uploaded and then deleted?
[15:32:44] different jobs I guess
[15:32:45] or instances
[15:32:47] who knows
[15:32:51] so something is restoring it without it, and then overwriting it.
[15:33:01] as long as those are atomic, not too bad I guess. equally worthy
[15:33:06] but is it atomic?
[15:33:34] what does it mean if two separate processes both `rsync --delete-after` very different content to the same destination.
[15:33:38] presumably it ends up with a mix
[15:34:01] symlink hackery might be a way to improve on that.
[15:34:25] also you don't want someone to start restoring while it is mid-write.
[15:34:34] but maybe you avoid that already
[15:34:35] Continuous-Integration-Infrastructure, Castor: castor does not restore caches?
- https://phabricator.wikimedia.org/T368550 (hashar) NEW
[15:34:46] yeah I think we have been running mostly without caches
[15:34:59] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), MW-1.43-notes (1.43.0-wmf.12; 2024-07-02), Patch-For-Review: ForeignResourceStructureTest flaky in CI due to "Failed to download resource at https://codeload.github.c... - https://phabricator.wikimedia.org/T362425#9927007
[15:35:25] interesting, if it was "just" the above race, that should not leave it empty.
[15:35:39] so yeah I guess it smells like something is really explicitly deleting it for another reason
[15:35:57] but mw-foreign disappearing might be just a race
[15:37:41] yeah
[15:37:45] it is a race for sure
[15:37:54] Continuous-Integration-Infrastructure, Castor: castor does not restore caches? - https://phabricator.wikimedia.org/T368550#9927034 (hashar) p:Triage→Unbreak! The rsync fails: `counterexample sudo docker run --rm -it -e CASTOR_HOST=integration-castor05.integration.eqiad1.wikimedia.cloud -e JOB_NAM...
[15:38:10] and the other issue is I have removed `docker` from the job names which in turn killed castor
[15:43:33] so yeah I am restoring castor first
[15:47:42] Continuous-Integration-Config, Release-Engineering-Team (Seen): Remove "docker" suffix from Jenkins jobs - https://phabricator.wikimedia.org/T360327#9927126 (hashar) That eventually caused castor to no more restore caches in jobs since it relied on the job name having `docker`: {T368550}
[15:48:14] Continuous-Integration-Infrastructure, Castor: castor does not restore caches?
- https://phabricator.wikimedia.org/T368550#9927132 (hashar) That was caused by {T360327}
[15:48:58] (PS1) Hashar: dockerfiles: castor: always restore to /cache [integration/config] - https://gerrit.wikimedia.org/r/1049976 (https://phabricator.wikimedia.org/T368550)
[15:49:28] that means since May 13th ~ 13:48 UTC, we have been running more or less without cache
[15:49:45] and that should show up in whatever metric/graph tracking the job duration
[15:50:23] (CR) CI reject: [V:-1] dockerfiles: castor: always restore to /cache [integration/config] - https://gerrit.wikimedia.org/r/1049976 (https://phabricator.wikimedia.org/T368550) (owner: Hashar)
[15:54:03] (PS2) Hashar: dockerfiles: castor: always restore to /cache [integration/config] - https://gerrit.wikimedia.org/r/1049976 (https://phabricator.wikimedia.org/T368550)
[15:56:23] (PS1) Hashar: jjb: switch jobs to releng/castor:0.4.0 image [integration/config] - https://gerrit.wikimedia.org/r/1049981 (https://phabricator.wikimedia.org/T368550)
[15:57:49] (CR) Jforrester: "Oh no." [integration/config] - https://gerrit.wikimedia.org/r/1049976 (https://phabricator.wikimedia.org/T368550) (owner: Hashar)
[15:58:01] (CR) Hashar: [C:+2] dockerfiles: castor: always restore to /cache [integration/config] - https://gerrit.wikimedia.org/r/1049976 (https://phabricator.wikimedia.org/T368550) (owner: Hashar)
[15:58:08] hashar: Wow.
[15:58:10] James_F: yeah :/
[15:58:41] (CR) Hashar: [C:+2] jjb: switch jobs to releng/castor:0.4.0 image [integration/config] - https://gerrit.wikimedia.org/r/1049981 (https://phabricator.wikimedia.org/T368550) (owner: Hashar)
[15:58:50] It's amazing things worked for six months without.
[15:58:55] Err. Six weeks.
[15:58:58] which also means we can hammer packagist / npmjs quite a lot
[15:59:07] Yeah. Surprised they didn't throttle us.
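Note on the 15:33 to 15:34 exchange: Krinkle suggests that "symlink hackery" could keep a reader from ever seeing a half-written cache when two saves race. A minimal sketch of that idea, assuming a POSIX filesystem and a layout where each save writes into its own directory and a `current` symlink points at the live one. The `publish_cache` helper, the `current` link name and the `v1`/`v2` directory names are hypothetical; nothing here describes how castor is actually implemented.

```python
import os
import tempfile


def publish_cache(base_dir: str, new_version: str, link_name: str = "current") -> None:
    """Repoint base_dir/link_name at base_dir/new_version in one atomic step.

    Hypothetical helper: writers fill a fresh directory first, then call this.
    Readers that resolve the symlink see either the old or the new directory,
    never a mix of both.
    """
    link_path = os.path.join(base_dir, link_name)
    tmp_link = os.path.join(base_dir, f".{link_name}.tmp.{os.getpid()}")

    # Clean up a temp link left behind by an earlier crashed run of this PID.
    if os.path.lexists(tmp_link):
        os.unlink(tmp_link)

    # Relative target, so the tree stays valid if base_dir is moved.
    os.symlink(new_version, tmp_link)

    # rename(2) replaces the destination atomically on POSIX filesystems.
    os.replace(tmp_link, link_path)


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    for version in ("v1", "v2"):
        os.mkdir(os.path.join(base, version))
        publish_cache(base, version)
        print(version, "->", os.readlink(os.path.join(base, "current")))
```

With this layout a restore would read through `current`, and old version directories could be pruned once no reader is using them; how to detect that is left out of the sketch.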
[15:59:23] I guess they have worse offenders
[15:59:27] (Or maybe they did, but their throttles are very high and we didn't often hit them and just thought it was line noise?)
[15:59:32] (CR) Jforrester: [C:+2] dockerfiles: castor: always restore to /cache [integration/config] - https://gerrit.wikimedia.org/r/1049976 (https://phabricator.wikimedia.org/T368550) (owner: Hashar)
[15:59:40] or because we are a famous non profit we get some allow list for us
[16:00:01] who knows
[16:00:08] I will apply for a job at GitHub and figure it out
[16:00:14] then come back with the answer
[16:00:15] ;)
[16:00:19] Ha.
[16:00:48] I imagine my interview returning to the WMF:
[16:00:48] Interviewer: do you have the answer
[16:00:48] Me: I do
[16:00:48] Interviewer: you are hired
[16:00:53] I'm reminded of the story of a programmer joining a company whose product he uses. Day 1, he submits a PR for a bug that really irritates him. Day 2, it lands, and he quits.
[16:01:01] (Merged) jenkins-bot: dockerfiles: castor: always restore to /cache [integration/config] - https://gerrit.wikimedia.org/r/1049976 (https://phabricator.wikimedia.org/T368550) (owner: Hashar)
[16:01:03] lol
[16:01:39] !log Updating docker-pkg files on contint primary for https://gerrit.wikimedia.org/r/1049976
[16:01:40] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:01:59] (Merged) jenkins-bot: jjb: switch jobs to releng/castor:0.4.0 image [integration/config] - https://gerrit.wikimedia.org/r/1049981 (https://phabricator.wikimedia.org/T368550) (owner: Hashar)
[16:02:50] (PS2) Daimona Eaytoy: Zuul: [mediawiki/extensions/CampaignEvents] Add Translate dependency [integration/config] - https://gerrit.wikimedia.org/r/1049912
[16:03:19] !log Updating all jobs to switch to releng/castor:0.4.0 and fix cache restoration # T368550
[16:03:21] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:03:22] T368550: castor does not
restore caches? - https://phabricator.wikimedia.org/T368550
[16:04:24] hmm
[16:04:24] https://grafana.wmcloud.org/d/0g9N-7pVz/cloud-vps-project-board?orgId=1&var-project=integration&var-instance=integration-castor05&viewPanel=106&from=now-90d&to=now
[16:04:30] (open) dancy: Add repos/generated-data-platform/aqs/commons-impact-analytics to projects.json [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/87 (https://phabricator.wikimedia.org/T358718)
[16:04:32] (update) dancy: Add repos/generated-data-platform/aqs/commons-impact-analytics to projects.json [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/87 (https://phabricator.wikimedia.org/T358718)
[16:04:33] we only have a month worth of data on grafana :/
[16:04:57] (merge) dancy: Add repos/generated-data-platform/aqs/commons-impact-analytics to projects.json [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/87 (https://phabricator.wikimedia.org/T358718)
[16:06:54] (close) dancy: Adding CIM repo [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/84 (owner: sg912)
[16:07:29] stuff updating
[16:12:51] Release-Engineering-Team, Scap, MW-on-K8s, serviceops: Pushing mediawiki-multiversion Docker image from deploy server takes 4 minutes - https://phabricator.wikimedia.org/T341441#9927255 (akosiaris) Just to point out that this is probably not from the network. We don't have networking rate limitin...
[16:13:00] the build that erased the mw-foreign dir is https://integration.wikimedia.org/ci/job/castor-save-workspace-cache/4678152/console
[16:13:05] which comes from https://integration.wikimedia.org/ci/job/mediawiki-quibble-vendor-mysql-php74/5641/console
[16:15:48] Release-Engineering-Team, Scap, MW-on-K8s, serviceops: Pushing mediawiki-multiversion Docker image from deploy server takes 4 minutes - https://phabricator.wikimedia.org/T341441#9927265 (Clement_Goubert) It's possible it's to do with docker using single-threaded gzip for compression on push https...
[16:18:14] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), MW-1.43-notes (1.43.0-wmf.12; 2024-07-02), Patch-For-Review: ForeignResourceStructureTest flaky in CI due to "Failed to download resource at https://codeload.github.c... - https://phabricator.wikimedia.org/T362425#9927274
[16:18:14] Krinkle: the explanation is https://phabricator.wikimedia.org/T362425#9927274
[16:18:21] so yeah no cache restored
[16:18:31] and the test being skipped means there is no mw-foreign ever generated
[16:18:37] and thus it is never entering the cache
[16:18:59] neither via the test running (it is skipped currently) nor by a restored cache from a previous build (since that was broken)
[16:19:00] ;)
[16:21:13] Continuous-Integration-Infrastructure, Castor, Patch-For-Review: castor does not restore caches? - https://phabricator.wikimedia.org/T368550#9927285 (hashar)
[16:23:11] Continuous-Integration-Infrastructure, Castor, Patch-For-Review: castor does not restore caches? - https://phabricator.wikimedia.org/T368550#9927303 (hashar)
[16:25:03] hashar: nice work!
[16:25:15] yeah well hmm
[16:25:39] I am not sure whether breaking the infra for 30+ days without noticing it qualifies as good work :D
[16:25:58] I wonder if this bug is visible in download stats at https://packagist.org/packages/wikimedia/at-ease/stats
[16:26:06] at least the good news is that we know the infra works without cache
[16:26:15] if it's not noticed, is it really broken?
[16:26:37] mutante: well, it was noticed yesterday/today by CI being down when github was spotty.
[16:26:48] We now roll the dice 300 times in every build instead of between 0-1 times per build.
[16:26:53] ah, it's THAT
[16:26:56] hehe
[16:26:59] sorry to bother: I'm getting https://integration.wikimedia.org/ci/job/mediawiki-quibble-vendor-mysql-php74/5644/console failure on load of FR again for backport. Is there a patch on master we should backport or just wait for github?
[16:27:03] I like that "roll the dice" analogy
[16:27:35] 16:03:41 1) ForeignResourceStructureTest::testVerifyIntegrity
[16:27:35] 16:03:41 LogicException: Failed to download resource at https://codeload.github.com/wikimedia/jquery.i18n/tar.gz/70b5ee20a638cb8fe36baef8d51ac2eb577ce012
[16:27:36] ...
[16:27:42] but it is not enabled !?
[16:27:52] hashar: nice work for fixing it. breaking things is easy.
[16:27:57] It's disabled in master but not wmf.11 yet.
[16:28:00] oh that is wmf/1.43.0-wmf.11
[16:28:08] yeah, which patch should I backport?
[16:28:27] Phabricator, Patch-For-Review: Make config page display version information - https://phabricator.wikimedia.org/T360756#9927323 (Dzahn) This turned into a much bigger discussion than anticipated. Running this command manually fixes the situation while avoiding to use *. ` git config --global --add safe...
[16:28:32] I can force merge the patches to annoy James too
[16:28:43] Amir1: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1049594 ?
[16:28:48] https://gerrit.wikimedia.org/r/c/1049594
[16:29:23] Amir1: Annoys hashar more.
[16:29:38] Thanks.
I backported them
[16:31:28] Continuous-Integration-Infrastructure, Castor: castor does not restore caches? - https://phabricator.wikimedia.org/T368550#9927353 (hashar)
[16:31:33] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), MW-1.43-notes (1.43.0-wmf.12; 2024-07-02), Patch-For-Review: ForeignResourceStructureTest flaky in CI due to "Failed to download resource at https://codeload.github.c... - https://phabricator.wikimedia.org/T362425#9927352
[16:34:02] Release-Engineering-Team (Priority Backlog 📥), Epic, Release Pipeline (Blubber): Deprecate Blubber's CLI and microservice (blubberoid) interfaces - https://phabricator.wikimedia.org/T318289#9927367 (dduvall)
[16:38:13] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), MW-1.43-notes (1.43.0-wmf.12; 2024-07-02), Patch-For-Review: ForeignResourceStructureTest flaky in CI due to "Failed to download resource at https://codeload.github.c... - https://phabricator.wikimedia.org/T362425#9927379
[16:39:49] Amir1: sorry for the mess :\
[16:40:18] oh no worries. Thanks for working on it!
[16:42:31] Phabricator, Patch-For-Review: Make config page display version information - https://phabricator.wikimedia.org/T360756#9927394 (Dzahn) >>! In T360756#9924751, @Pppery wrote: > Apologies for misunderstanding the process. That's not your fault. Don't worry. >Nevertheless, the fact that for certain repos...
[16:43:01] Release-Engineering-Team (Priority Backlog 📥), Epic, Release Pipeline (Blubber): Deprecate Blubber's CLI and microservice (blubberoid) interfaces - https://phabricator.wikimedia.org/T318289#9927401 (dduvall) >>! In T318289#9827420, @bd808 wrote: >>>! In T318289#9826786, @dancy wrote: >> I was recentl...
[16:51:33] Continuous-Integration-Infrastructure, Castor: castor does not restore caches?
- https://phabricator.wikimedia.org/T368550#9927432 (hashar) Open→Resolved I have picked a build of https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php74-noselenium/ assuming the cache got saved, and...
[16:54:56] !log integration: fixed castor cache restoration which was broken since mid may # T368550
[16:54:59] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:54:59] T368550: castor does not restore caches? - https://phabricator.wikimedia.org/T368550
[17:02:03] andre: well done :)
[17:05:19] (open) dduvall: epic: Native LLB refactor of Blubber [repos/releng/blubber] - https://gitlab.wikimedia.org/repos/releng/blubber/-/merge_requests/99 (https://phabricator.wikimedia.org/T345458)
[18:23:20] Release-Engineering-Team (Priority Backlog 📥), Epic, Release Pipeline (Blubber): Deprecate Blubber's CLI and microservice (blubberoid) interfaces - https://phabricator.wikimedia.org/T318289#9927845 (dancy) >>! In T318289#9927401, @dduvall wrote: >>>! In T318289#9827420, @bd808 wrote: >>>>! In T318289...
[18:25:25] (approved) dancy: epic: Native LLB refactor of Blubber [repos/releng/blubber] - https://gitlab.wikimedia.org/repos/releng/blubber/-/merge_requests/99 (https://phabricator.wikimedia.org/T345458) (owner: dduvall)
[19:08:47] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.43.0-wmf.11 deployment blockers - https://phabricator.wikimedia.org/T366956#9928063 (jeena)
[19:12:04] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.43.0-wmf.11 deployment blockers - https://phabricator.wikimedia.org/T366956#9928074 (jeena) I have rolled back the train because of a higher than normal DBQueryError rate. I've created a bug task with t...
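Note on the 16:26:48 "roll the dice" remark: without a restored cache every dependency is fetched over the network, so each build has many more chances to hit a transient failure. A rough worked example, assuming each fetch fails independently with the same probability. The 0.1% per-fetch failure rate is made up for illustration, and 300 is simply the fetch count quoted in the chat, not a measured value.

```python
def build_failure_probability(p_fetch_fail: float, fetches: int) -> float:
    """Chance that at least one of `fetches` independent downloads fails.

    Uses the complement rule: P(at least one failure) = 1 - P(no failures).
    """
    if not 0.0 <= p_fetch_fail <= 1.0:
        raise ValueError("p_fetch_fail must be a probability between 0 and 1")
    if fetches < 0:
        raise ValueError("fetches must be non-negative")
    return 1.0 - (1.0 - p_fetch_fail) ** fetches


if __name__ == "__main__":
    p = 0.001  # hypothetical per-fetch failure rate, not a measured figure
    for n in (0, 1, 300):
        print(f"{n:>3} fetches: {build_failure_probability(p, n):.1%} chance of a failed build")
```

Under these assumptions a per-fetch failure rate that is negligible for a single download becomes roughly a one-in-four chance of a failed build at 300 fetches, which is consistent with CI only showing the problem once GitHub became spotty.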
[19:20:29] (open) dancy: projects.json: Fix ordering of repos/sre/miscweb/security-landing-page [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/88 [19:20:32] (update) dancy: projects.json: Fix ordering of repos/sre/miscweb/security-landing-page [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/88 [19:23:05] (merge) dancy: projects.json: Fix ordering of repos/sre/miscweb/security-landing-page [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/88 [19:23:30] (open) dancy: verify-config: Ensure repos are maintained in sorted order [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/89 [19:23:33] (update) dancy: verify-config: Ensure repos are maintained in sorted order [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/89 [19:24:07] (merge) dancy: verify-config: Ensure repos are maintained in sorted order [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/89 [19:33:45] (update) dancy: Allow repos/data-engineering/pgbouncer to use trusted runners [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/86 (https://phabricator.wikimedia.org/T368030) (owner: btullis) [19:37:21] Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments: 1.43.0-wmf.11 deployment blockers - https://phabricator.wikimedia.org/T366956#9928206 (jeena) [19:37:22] (update) dancy: Allow repos/data-engineering/pgbouncer to use trusted runners [repos/releng/gitlab-trusted-runner] -
https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/86 (https://phabricator.wikimedia.org/T368030) (owner: btullis) [19:38:36] (merge) dduvall: epic: Native LLB refactor of Blubber [repos/releng/blubber] - https://gitlab.wikimedia.org/repos/releng/blubber/-/merge_requests/99 (https://phabricator.wikimedia.org/T345458) [19:38:46] (update) dancy: Allow repos/data-engineering/pgbouncer to use trusted runners [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/86 (https://phabricator.wikimedia.org/T368030) (owner: btullis) [19:39:20] (merge) dancy: Allow repos/data-engineering/pgbouncer to use trusted runners [repos/releng/gitlab-trusted-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/86 (https://phabricator.wikimedia.org/T368030) (owner: btullis) [19:46:06] (approved) dancy: scap backport: Also allow /r/c/ change URLs [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/366 (owner: lucaswerkmeister-wmde) [19:46:12] (merge) dancy: scap backport: Also allow /r/c/ change URLs [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/366 (owner: lucaswerkmeister-wmde) [19:50:32] GitLab (CI & Job Runners), Release-Engineering-Team, Patch-For-Review: Buildkit v0.14.0 released - https://phabricator.wikimedia.org/T367352#9928235 (dancy) In progress→Resolved a:dancy [20:27:40] Phabricator, Patch-For-Review: Explore restricting editing task priority to Trusted-Contributors/staff/etc - https://phabricator.wikimedia.org/T363337#9928392 (thcipriani) >>! In T363337#9811366, @Aklapper wrote: > It would mean that only members of #Trusted-Contributors, #WMF-NDA #acl_sre-team or #acl_s...
[20:31:50] GitLab (Administration, Settings & Policy), Release-Engineering-Team (Seen): Onboard non-WMF staff as GitLab admins - https://phabricator.wikimedia.org/T333386#9928421 (thcipriani) Open→Stalled Still interested in doing something akin to this (and figuring out how to onboard more folks), but the... [20:36:53] GitLab (CI & Job Runners), Release-Engineering-Team (Priority Backlog 📥), observability: Explore monitoring for the GitLab runners k8s cluster - https://phabricator.wikimedia.org/T363919#9928451 (brennen) [20:36:59] GitLab (CI & Job Runners), Release-Engineering-Team (Priority Backlog 📥), observability: Explore monitoring for the GitLab runners k8s cluster - https://phabricator.wikimedia.org/T363919#9928453 (brennen) [20:37:20] GitLab (Auth & Access), Phabricator, Toolforge: Look for ways to consolidate "we trust this human" access lists - https://phabricator.wikimedia.org/T364516#9928449 (brennen) [20:38:25] GitLab (Project Migration), Wikidata, wmde-wikidata-tech, Wikidata Dev Team (Quality Tools "Sprint"): [QB] Investigate moving Query Builder from Gerrit to GitLab - https://phabricator.wikimedia.org/T350705#9928455 (brennen) [20:41:33] (open) dduvall: buildkit: Use dockerui upstream pkg for common frontend features [repos/releng/blubber] - https://gitlab.wikimedia.org/repos/releng/blubber/-/merge_requests/100 [20:41:34] (update) dduvall: buildkit: Use dockerui upstream pkg for common frontend features [repos/releng/blubber] - https://gitlab.wikimedia.org/repos/releng/blubber/-/merge_requests/100 [20:41:45] (update) dduvall: buildkit: Use dockerui upstream pkg for common frontend features [repos/releng/blubber] - https://gitlab.wikimedia.org/repos/releng/blubber/-/merge_requests/100 [20:41:48] (update) dduvall: buildkit: Use dockerui upstream pkg for common frontend features [repos/releng/blubber] - https://gitlab.wikimedia.org/repos/releng/blubber/-/merge_requests/100 [20:47:47]
(open) dancy: kubernetes: Make k8s deployment failures fatal [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/367 [20:47:50] (update) dancy: kubernetes: Make k8s deployment failures fatal [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/367 [20:49:57] (update) dancy: kubernetes: Make k8s deployment failures fatal [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/367 [20:50:34] (update) dancy: scap backport: Also allow /r/c/ change URLs [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/366 (owner: lucaswerkmeister-wmde) [20:54:09] (approved) dancy: buildkit: Use dockerui upstream pkg for common frontend features [repos/releng/blubber] - https://gitlab.wikimedia.org/repos/releng/blubber/-/merge_requests/100 (owner: dduvall) [21:00:12] (merge) dduvall: buildkit: Use dockerui upstream pkg for common frontend features [repos/releng/blubber] - https://gitlab.wikimedia.org/repos/releng/blubber/-/merge_requests/100 [21:02:27] MediaWiki-Releasing, MediaWiki-Engineering, MW-1.42-release: Write and send release announcement for 1.42.0 - https://phabricator.wikimedia.org/T359849#9928594 (Krinkle) [21:05:39] Continuous-Integration-Infrastructure, Release-Engineering-Team: Rebuild integration-agent-pkgbuilder-1001 and integration-agent-pkgbuilder-1002 to get rid of Debian Buster - https://phabricator.wikimedia.org/T360786#9928596 (hashar) [21:05:41] Release-Engineering-Team, Infrastructure-Foundations, Cloud-VPS (Debian Buster Deprecation): Cloud VPS "integration" project Buster deprecation - https://phabricator.wikimedia.org/T367534#9928595 (hashar) [21:07:08] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), Patch-For-Review: Move all Wikimedia CI (WMCS integration project) instances from stretch to buster/bullseye -
https://phabricator.wikimedia.org/T252071#9928598 (hashar) [21:09:08] Release-Engineering-Team, Infrastructure-Foundations, Cloud-VPS (Debian Buster Deprecation): Cloud VPS "integration" project Buster deprecation - https://phabricator.wikimedia.org/T367534#9928606 (hashar) [21:11:07] Release-Engineering-Team, Infrastructure-Foundations, Cloud-VPS (Debian Buster Deprecation): Cloud VPS "integration" project Buster deprecation - https://phabricator.wikimedia.org/T367534#9928610 (hashar) [21:11:08] Phabricator, collaboration-services, LDAP-Access-Requests, SRE: Offboard Lea WMDE (Lea Voget) from the WMF systems - https://phabricator.wikimedia.org/T368139#9928619 (Dzahn) @MoritzMuehlenhoff I _think_ this is complete? Except we might want to follow-up regarding the part where the offboard scr... [22:30:32] Yippee, build fixed! [22:30:32] Project beta-update-databases-eqiad build #77023: FIXED in 10 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/77023/