[01:15:07] 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team, 06Infrastructure-Foundations, 10observability, and 2 others: beta-scap-sync-world fails: logstash_checker.py: KeyError: 'aggregations' - https://phabricator.wikimedia.org/T360595#9664170 (10Ladsgroup) That was one hell of a rabbit hole. So as I sa... [03:54:15] 10Phabricator, 10Phabricator maintenance bot: #patch-for-review tags are not being removed from tasks - https://phabricator.wikimedia.org/T361074#9664283 (10Pppery) Looks like it's working to me. The above task shouldn't have patch-for-review removed since it has an open patch (https://gerrit.wikimedia.org/r/c... [04:16:09] 10Phabricator, 06translatewiki.net, 10Language-Team (Language-2024-January-March), 03Localization Infrastructure FY2023-24, and 2 others: Reduce or remove translation export threshold for Phabricator - https://phabricator.wikimedia.org/T360861#9664297 (10abi_) p:05Triage→03Medium [04:16:14] 10Phabricator, 06translatewiki.net, 10Language-Team (Language-2024-January-March), 03Localization Infrastructure FY2023-24, and 2 others: Reduce or remove translation export threshold for Phabricator - https://phabricator.wikimedia.org/T360861#9664298 (10abi_) [05:28:05] (03CR) 10Hashar: [C:03+2] Zuul: [mediawiki/extensions/CodeMirror] Publish JS documentation [integration/config] - 10https://gerrit.wikimedia.org/r/1012723 (https://phabricator.wikimedia.org/T359986) (owner: 10MusikAnimal) [05:30:07] (03Merged) 10jenkins-bot: Zuul: [mediawiki/extensions/CodeMirror] Publish JS documentation [integration/config] - 10https://gerrit.wikimedia.org/r/1012723 (https://phabricator.wikimedia.org/T359986) (owner: 10MusikAnimal) [07:00:52] 10Continuous-Integration-Infrastructure, 06Code-Review-Workgroup, 10Quibble, 10Quality-and-Test-Engineering-Team (Test engineering): Lightweight preview environment for gerrit changes - https://phabricator.wikimedia.org/T241140#9664419 (10hashar) I think that has been addressed by patchdemo T76245. Long t... 
[08:53:00] PROBLEM - jenkins_service_running on releases1003 is CRITICAL: PROCS CRITICAL: 0 processes with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins [08:54:00] RECOVERY - jenkins_service_running on releases1003 is OK: PROCS OK: 1 process with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins [09:03:00] PROBLEM - jenkins_service_running on releases1003 is CRITICAL: PROCS CRITICAL: 0 processes with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins [09:07:00] RECOVERY - jenkins_service_running on releases1003 is OK: PROCS OK: 1 process with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins [09:22:02] ^ that was me upgrading Jenkins (T360759) and failing at it: T361084 [09:22:02] T360759: Jenkins core security advisory - 2024-03-20 - https://phabricator.wikimedia.org/T360759 [09:22:03] T361084: Upgrade matrix-auth for Jenkins 2.440 - https://phabricator.wikimedia.org/T361084 [10:01:13] 10Continuous-Integration-Infrastructure, 07Jenkins, 10Release-Engineering-Team (Radar), 07Upstream: 14Jenkins Gearman plugin has deadlock on executor threads (was: Beta Cluster stopped receiving code updates (beta-update-databases-eqiad hung) - 14https://phabricator.wikimedia.org/T72597#9664464 (10hasha... [10:03:25] 10Continuous-Integration-Infrastructure, 07Jenkins, 06Release-Engineering-Team, 07SecTeam-Processed, and 2 others: Jenkins core security advisory - 2024-03-20 - https://phabricator.wikimedia.org/T360759#9664500 (10hashar) >>! In T360759#9653467, @Stashbot wrote: > {nav icon=file, name=Mentioned in SAL (#wi... 
[10:04:01] 10Phabricator maintenance bot: #patch-for-review tags are not being removed from tasks - https://phabricator.wikimedia.org/T361074#9664513 (10Aklapper) [10:04:17] 10Continuous-Integration-Infrastructure, 07Jenkins, 06Release-Engineering-Team, 07SecTeam-Processed, and 2 others: Jenkins core security advisory - 2024-03-20 - https://phabricator.wikimedia.org/T360759#9664516 (10hashar) The release Jenkins fails: ` Mar 27 08:53:49 releases1003 jenkins[1678621]: [03/27/24... [10:04:28] 10Phabricator, 10Observability-Logging: Phabricator error log PHP entries truncated - https://phabricator.wikimedia.org/T360975#9664517 (10Aklapper) p:05Triage→03Low Thanks, that's helpful (also to know that we do not "lose" messages but just split). Cannot find `Got error` in the Phorge/Arcanist codebases... [10:04:40] 10Phabricator (Upstream), 07Regression, 07Upstream: PhabricatorDataNotAttachedException when rendering project hovercard with username mentioned in project description - https://phabricator.wikimedia.org/T360530#9664525 (10Aklapper) [10:04:48] 10Phabricator (Upstream), 07Regression, 07Upstream: PhabricatorDataNotAttachedException in PhabricatorRepositoryCommit (via getRepository()) - https://phabricator.wikimedia.org/T360714#9664524 (10Aklapper) [10:04:52] 10Phabricator (Upstream), 07Regression, 07Upstream: PhabricatorDataNotAttachedException in PhabricatorRepositoryCommit (via getRepository()) - https://phabricator.wikimedia.org/T360714#9664521 (10Aklapper) 05Open→03Stalled Let's wait if deploying the patch for T360530 will also fix this [10:04:56] 10Phabricator (Upstream), 07Regression, 07Upstream: PhabricatorDataNotAttachedException in PhabricatorRepositoryCommit (via getRepository()) - https://phabricator.wikimedia.org/T360714#9664526 (10Aklapper) p:05Triage→03Low [10:05:00] PROBLEM - jenkins_service_running on releases1003 is CRITICAL: PROCS CRITICAL: 0 processes with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war 
https://wikitech.wikimedia.org/wiki/Jenkins [10:05:18] 10Phabricator, 06Release-Engineering-Team, 06Trust-and-Safety: Account recovery help needed for Phabricator account Ifeatu_Nnaobi_WMDE - https://phabricator.wikimedia.org/T355414#9664527 (10Aklapper) @Ifeatu_Nnaobi_WMDE / @WMDE-leszek: Could you please answer the last comment? Thanks in advance! [10:06:43] 10Continuous-Integration-Infrastructure, 07Jenkins: Upgrade matrix-auth for Jenkins 2.440 - https://phabricator.wikimedia.org/T361084 (10hashar) 03NEW [10:06:59] 10Continuous-Integration-Infrastructure, 07Jenkins: Upgrade matrix-auth for Jenkins 2.440 - https://phabricator.wikimedia.org/T361084#9664567 (10hashar) [10:07:03] 10Continuous-Integration-Infrastructure, 07Jenkins, 06Release-Engineering-Team, 07SecTeam-Processed, and 2 others: Jenkins core security advisory - 2024-03-20 - https://phabricator.wikimedia.org/T360759#9664566 (10hashar) [10:11:00] RECOVERY - jenkins_service_running on releases1003 is OK: PROCS OK: 1 process with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins [10:14:06] 10Phabricator, 06Release-Engineering-Team, 06Trust-and-Safety: Account recovery help needed for Phabricator account Ifeatu_Nnaobi_WMDE - https://phabricator.wikimedia.org/T355414#9664691 (10taavi) I don't think I did anything other than moving this task to the correct projects? [10:20:00] PROBLEM - jenkins_service_running on releases1003 is CRITICAL: PROCS CRITICAL: 0 processes with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins [10:23:23] 10GitLab (Pipeline Services Migration🐤), 06collaboration-services, 13Patch-For-Review: move security.wikimedia.org to kubernetes - https://phabricator.wikimedia.org/T350796#9664760 (10Arnoldokoth) [10:24:26] hashar: are you still working on jenkins? 
jenkins on releases1003 is unhappy and throws stacktraces [10:26:22] jelto: that's my fault now, can you disable the alarms temporarily while I look into that? [10:26:31] 10Continuous-Integration-Infrastructure, 07Jenkins, 06Release-Engineering-Team, 07SecTeam-Processed, and 2 others: Jenkins core security advisory - 2024-03-20 - https://phabricator.wikimedia.org/T360759#9664765 (10Jelto) [10:27:16] jnuche: sure. Should I create a 1h downtime for releases1003? [10:27:23] and thanks for taking a look! [10:27:33] yes please, and thanks! [10:30:08] downtime active for releases1003 [10:32:00] RECOVERY - jenkins_service_running on releases1003 is OK: PROCS OK: 1 process with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins [10:32:10] 10Continuous-Integration-Infrastructure, 07Jenkins, 06Release-Engineering-Team, 06collaboration-services, and 3 others: Jenkins core security advisory - 2024-03-20 - https://phabricator.wikimedia.org/T360759#9664782 (10Jelto) [10:40:49] 10Continuous-Integration-Infrastructure, 07Jenkins: Upgrade matrix-auth for Jenkins 2.440 - https://phabricator.wikimedia.org/T361084#9664800 (10hashar) https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/55 has added stacktrace to the logging. So we now get: `counterexample javaposse.job... [10:41:17] 10Continuous-Integration-Infrastructure, 07Jenkins: Upgrade matrix-auth for Jenkins 2.440 - https://phabricator.wikimedia.org/T361084#9664801 (10hashar) [10:41:43] I have posted my fringe theory [10:42:00] which is that the matrix authorization in conf/releasing/casc/jobs/docpub.groovy has to be adjusted [10:42:07] presumably cause something changed in jenkins core itself [10:43:30] something doesn't add up, I got 2.440.2 running in the scap3-dev env without any changes to the plugin versions, including matrix-auth [10:43:50] or [10:44:00] could it be that another version of matrix-auth get installed? 
[10:44:07] there is a breaking change in 3.2.0 [10:45:33] the thing is that matrix-auth-3.1.5 has nothing matching an "entries" string [10:45:35] but master does [10:46:14] that aligns with the breaking change yes: [10:46:14] "In all three cases, the permissions list has been replaced with the entries list and a more elaborate element syntax decoupled from the serialized XML configuration format. See examples below for the new syntax." [10:46:29] but doesn't explain how the matrix plugin got updated in prod [10:46:32] and src/main/java/org/jenkinsci/plugins/matrixauth/integrations/casc/MatrixAuthorizationStrategyConfigurator.java has: [10:46:32] Loading deprecated attribute 'permissions' for instance of ' + container.getClass().getName() + "'. Use 'entries' instead."); [10:46:54] so [10:47:04] on deployment my guess is we get another version than the one which is pinned [10:47:10] and therefore, the pinning does not work [10:47:24] * hashar shakes fists at jenkins plugin manager [10:48:36] maybe we have its output somewhere in scap logs [10:57:15] I can't find anything relevant in the logs at `/srv/deployment/releng/jenkins-deploy/scap/log` in the deploy server or in logstash [10:57:59] on the deployment server I went with `scap deploy-log --verbose` [10:58:10] Mar 27, 2024 10:31:17 AM org.apache.http.impl.execchain.RetryExec execute [10:58:10] INFO: Retrying request to {tls}->http://url-downloader.wikimedia.org:8080->https://updates.jenkins.io:443 [10:58:10] Done [10:58:35] that is cause the jenkins plugin manager runs in quiet / non verbose mode by default [10:58:46] if we pass it `--verbose` there should be more output in the scap debug log [11:01:05] at least locally with the 2.440.2 jenkins.war it still honors our pinfile [11:01:08] but [11:01:48] matrix-auth is a detached plugin in Jenkins core and it is updated in 2.440.2 [11:02:01] 2.426.3 had 3.1.8 [11:02:51] then releases jenkins has 3.1.5 [11:02:52] so I don't know [11:06:02] maybe we can try debugging by
limiting the deployment to releases2003 [11:06:29] 10GitLab (Pipeline Services Migration🐤), 06collaboration-services: Move micro sites from Ganeti to Kubernetes and from Gerrit to GitLab - https://phabricator.wikimedia.org/T300171#9664869 (10Jelto) [11:08:22] OR [11:08:34] 1) we get the casc jobs generated [11:09:35] 2) the Jenkins Debian package is installed (which installs the bundled matrix-auth 3.2.1) [11:09:45] (but does it?) [11:10:25] the apt install spins up the Jenkins daemon [11:10:50] the casc fails cause our job has an obsolete definition compared to the bundled matrix-auth plugin [11:10:54] 3) we run the jenkins plugin manager [11:12:28] mmmh, I thought Jenkins hasn't bundled plugins in ages [11:12:41] there are some in the war [11:12:46] named "detachedplugins" [11:13:23] ex: https://github.com/jenkinsci/jenkins/blob/master/war/pom.xml#L277-L283 [11:14:10] funny, still, according to docs pinned plugins shouldn't be overwritten by bundles whatever the version: https://www.jenkins.io/doc/book/managing/plugins/#pinned-plugins [11:14:28] and matrix-auth is the example they give for a pinned plugin [11:14:38] yeah that is a plugin whose filename ends with .pinned I guess? [11:14:47] or I don't know [11:14:53] it is merely a fringe theory I have [11:15:11] mmh, not sure, but .pinned seems to be manually pinned ones [11:19:11] 10Phabricator, 06Release-Engineering-Team, 06Trust-and-Safety: Account recovery help needed for Phabricator account Ifeatu_Nnaobi_WMDE - https://phabricator.wikimedia.org/T355414#9664885 (10Aklapper) Ah, thanks. Ideally this would require a quick video call for verification, as Phab does not offer 2FA recove...
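For readers following the 10:46 quote about the `permissions` list being replaced by an `entries` list, here is a rough Python sketch of what such a migration could look like. It is not taken from the jenkins-deploy repository. It assumes legacy strings of the form `[TYPE:]Permission:sid` and a target shape of one entry per user or group with its own `permissions` list; the key names `user`, `group` and `userOrGroup`, and the helper name `legacy_to_entries`, are assumptions to verify against the matrix-auth documentation for the installed version.

```python
def legacy_to_entries(legacy):
    """Group legacy 'permission:sid' strings into per-principal entries.

    Assumed input forms (hypothetical, check the plugin docs):
      "Overall/Read:authenticated"      (no explicit type)
      "USER:Overall/Administer:admin"   (explicit type prefix)
    """
    grouped = {}  # (kind, sid) -> list of permissions, insertion ordered
    for item in legacy:
        parts = item.split(":")
        if len(parts) == 3:
            kind, permission, sid = parts
            key = {"USER": "user", "GROUP": "group"}.get(kind.upper(), "userOrGroup")
        elif len(parts) == 2:
            permission, sid = parts
            key = "userOrGroup"  # the legacy string does not say which one
        else:
            raise ValueError(f"unrecognised legacy permission string: {item!r}")
        grouped.setdefault((key, sid), []).append(permission)
    # One entry per principal, each carrying its own permission list.
    return [
        {key: {"name": sid, "permissions": perms}}
        for (key, sid), perms in grouped.items()
    ]
```

The result is a plain list of dicts that could be dumped to YAML or hand-translated into the Groovy job definition; the sketch does not handle principal names that themselves contain a colon.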
[11:20:53] ok, I can't explain how the problem is triggered, but passing `--verbose` to the plugin manager and deploying to the second release Jenkins is a good idea [11:20:57] I'm going to do that [11:25:54] I think we would need a sudo rule to allow `systemctl mask jenkins` and `systemctl unmask jenkins` [11:25:58] 10Deployments, 06Release-Engineering-Team, 06serviceops, 13Patch-For-Review: httpbb appserver test breaks deployment of the week due to a timeout parsing page - https://phabricator.wikimedia.org/T360867#9664889 (10Clement_Goubert) Some context given by @RLazarus from the CR: > At the time we added this tes... [11:25:58] then the script would: [11:26:01] a) stop jenkins [11:26:05] b) systemctl mask jenkins [11:26:10] c) apt-get install jenkins [11:26:14] d) plugin manager install [11:26:22] e) systemctl unmask jenkins [11:26:25] f) start jenkins [11:26:40] maybe there is another way to prevent jenkins from starting [11:27:58] apt-get install jenkins fails if the service is masked, we ran into that before [11:28:11] but we don't need it, scap is already taking care of disabling the service in the sceondary host [11:28:16] *secondary [11:33:25] going to have lunch now, will continue later [11:39:17] oh my god [11:39:19] it is never ending [11:39:23] cause: profile::releases::mediawiki::jenkins_service_enable: "mask" [11:39:30] but if I look at releases2003 it is clearly not masked [11:41:49] * hashar lunches & [11:45:27] let me know if you need anything. I can run mask/unmask commands if needed. You plan to failover to releases2003?
[11:55:14] 10GitLab (Pipeline Services Migration🐤), 06collaboration-services, 13Patch-For-Review: move security.wikimedia.org to kubernetes - https://phabricator.wikimedia.org/T350796#9664966 (10Arnoldokoth) [11:59:40] 10Phabricator maintenance bot: #patch-for-review tags are not being removed from tasks - https://phabricator.wikimedia.org/T361074#9664969 (10Ladsgroup) Looking at the activities of the bot, it does remove the tag: https://phabricator.wikimedia.org/p/Maintenance_bot/ [12:01:53] jelto: nope, I'm going to deploy to releases2003 to do some debugging, but releases1003 is going to stay as the primary [12:02:06] ack sounds good [12:02:10] still need to do some local checks first though [12:02:47] hashar: `profile::releases::mediawiki::jenkins_service_enable` seems some leftover deprecated configuration in puppet [12:03:16] the service in releases2003 doesn't get masked anymore, it is now disabled by scap since https://phabricator.wikimedia.org/T343447 [12:09:00] ahh [12:18:26] jnuche: on the CI jenkins, the service is masked and `apt install jenkins` works for sure [12:18:44] cause I don't think there is any good way to prevent apt/dpkg from starting the service [12:19:46] no idea why that works there, maybe it's something that changed between buster and bullseye [12:22:14] also `systemctl disable` is merely to prevent the service from starting on the machine boot [12:24:05] AHHHH https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/170/diffs#b722e25e5e724662ece3bc57f9397d8523a73e53_889_909 [12:24:38] yeah it is not doing what you think it should :) [12:24:46] a disabled service can still be started/restarted [12:25:12] and since it is only disabled, `apt install jenkins` happily starts it [12:26:08] which is why scap disables it, yes [12:28:15] hmm [12:28:41] I gave it a try on bullseye and the installation works with a masked unit [12:32:25] I don't know man, I'm getting a different weird error locally with scap now, let me focus on that :)
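The disabled versus masked distinction raised at 12:22 to 12:25 can be sketched in a few lines of Python. This is only an illustration and not part of scap or puppet: the helper names are made up, and the only outside fact relied on is that `systemctl is-enabled UNIT` prints a state word such as `enabled`, `disabled`, `masked` or `masked-runtime`.

```python
import subprocess

# States printed by `systemctl is-enabled` for which a start request is refused.
# A merely "disabled" unit is absent on purpose: it is only skipped at boot.
MASKED_STATES = {"masked", "masked-runtime"}


def start_is_blocked(is_enabled_output: str) -> bool:
    """True if a manual (or package-triggered) start would be refused."""
    return is_enabled_output.strip() in MASKED_STATES


def query_unit_state(unit: str) -> str:
    """Ask systemd for the unit state (hypothetical helper).

    The exit code is deliberately ignored because `is-enabled` exits
    non-zero for states such as disabled and masked.
    """
    result = subprocess.run(
        ["systemctl", "is-enabled", unit],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

On a host with systemd, `start_is_blocked(query_unit_state("jenkins"))` would report whether the unit is protected from being started by a package install; for a unit that is only disabled it returns `False`, which matches the behaviour described above.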
[12:32:57] hehe [12:35:39] oh my god [12:36:04] surely if I head to /srv/deployment/releng/jenkins-deploy/scap/log [12:36:21] there are no logs from the deployments I did this morning [12:37:47] * hashar checks kibana [12:43:23] Mar 27 08:35:03 releases1003 wmf-auto-restart[1674912]: INFO: 2024-03-27 08:35:03,368 : Detected necessary restart for service jenkins (1465086) [12:43:24] joy [13:07:44] 10GitLab (CI & Job Runners), 06collaboration-services, 13Patch-For-Review: Create a special-purpose Trusted Runner with Dockerfile frontend - https://phabricator.wikimedia.org/T357612#9665255 (10CodeReviewBot) jelto opened https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/66 m... [13:08:13] 10GitLab (CI & Job Runners), 06collaboration-services, 13Patch-For-Review: Create a special-purpose Trusted Runner with Dockerfile frontend - https://phabricator.wikimedia.org/T357612#9665256 (10Jelto) [13:08:20] some progress, after adding `--verbose` to the plugin manager, I can see detailed output from it locally with `scap deploy-log --verbose` [13:08:42] in particular, I can see it installing the right matrix-auth version: [13:08:42] Will install new plugin matrix-auth 3.1.5 [13:09:46] now let's deploy this to releases2003.codfw.wmnet [13:20:01] (03PS1) 10Jforrester: Revert "jjb: record beta-scap-sync-world is disabled" [integration/config] - 10https://gerrit.wikimedia.org/r/1014715 [13:20:43] (03PS2) 10Jforrester: Revert "jjb: record beta-scap-sync-world is disabled" [integration/config] - 10https://gerrit.wikimedia.org/r/1014715 (https://phabricator.wikimedia.org/T360595) [13:34:41] 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team, 06Infrastructure-Foundations, 10observability, and 2 others: beta-scap-sync-world fails: logstash_checker.py: KeyError: 'aggregations' - https://phabricator.wikimedia.org/T360595#9665327 (10colewhite) >>! In T360595#9664170, @Ladsgroup wrote: > At... 
[13:35:49] hashar: the plugin manager downloaded the right matrix-auth version in releases2003.codfw.wmnet and the bug still reproduced [13:36:19] plugin manager installed the right matrix-auth and then the failure happened during the subsequent jenkins restart at the end of the deployment [13:36:24] same issue: "No signature of method: permissions()" [13:36:41] at least now we have a place to reproduce the error [13:37:32] I'm going to update the CasC configuration to use the new syntax used by the plugin and see if I can fix the problem that way [13:38:16] (by "right version" version about I mean the one we specify in our plugins file: 3.1.5) [13:38:26] s/about/above [13:41:39] jnuche: my theory is that jenkins boot with the wrong plugin [13:42:00] I mean, maybe it loads the bundled version instead of the one downloaded by the plugin manager [13:42:35] it is a mess :/ [13:47:35] 10Beta-Cluster-Infrastructure: 14error.log is not rotated in beta - 14https://phabricator.wikimedia.org/T345566#9665389 (10Tgr) 14Seems properly rotated now: ` tgr@deployment-mwlog02:~$ ls -l /srv/mw-log/archive/err* -rw-r--r-- 1 udp2log udp2log 39300 Mar 12 15:45 /srv/mw-log/archive/error.log-20240313.gz -... [13:50:59] 10Beta-Cluster-Infrastructure: 14error.log is not rotated in beta - 14https://phabricator.wikimedia.org/T345566#9665413 (10Jdforrester-WMF) 14Brilliant! 
[13:52:17] hashar: that could be, I think the detached plugins are not installed by default, they are there in case the user selects them during a fresh install [13:52:36] but maybe a bug is causing them to overwrite the stuff already installed in some cases [13:52:39] ah who knows [13:55:13] something telling: the matrix-auth installed on disk for my local dev environment is different from the one that ended up installed in releases2003 [13:55:24] scap-deploy@jenkins-rel:/var/lib/jenkins/plugins$ ls -l matrix-auth.jpi [13:55:24] -rw-r--r-- 1 jenkins jenkins 158964 Mar 27 13:16 matrix-auth.jpi [13:55:34] --- [13:55:41] jnuche@releases2003:/var/lib/jenkins/plugins$ ls -l matrix-auth.jpi [13:55:41] -rw-rw-r-- 1 jenkins jenkins 177310 Mar 27 13:20 matrix-auth.jpi [13:55:59] and my local env has 3.1.5 [13:56:20] so it did end up with a different version, somehow [14:06:58] Gerrit question: Wondering if at some point Gerrit gained functionality to _pull_ (and thus mirror) a git repo from a third place, e.g. from GitHub? [14:19:34] andre: no we don't do that [14:19:36] :) [14:20:03] okay. my question was if the Gerrit software is able to do that, not what we do or not :) [14:20:19] there was a hack to satisfy some people that wanted to use github and did not want hear about Gerrit at all [14:20:34] and the hack they found was to have Phabricator to do the replication from Github to Gerrit [14:20:40] I know [14:20:43] since we forbid to deploy from github [14:20:45] ah yeah that is andre [14:20:46] sorry [14:21:14] My question is if the Gerrit software is able to import a git repo from a third-party location. [14:21:22] so no Gerrit is not able to fetch / mirror from some other places [14:21:29] yay, thanks! 
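The 13:55 comparison infers a version difference from the size of `matrix-auth.jpi`. A more direct check, sketched below, is to read the version from the plugin archive itself. This assumes that a `.jpi` file is a zip archive whose `META-INF/MANIFEST.MF` carries a `Plugin-Version` header, which is how Jenkins plugins are normally packaged; the function names are made up for the example.

```python
import zipfile


def parse_manifest(text):
    """Parse the main section of a JAR-style manifest into a dict."""
    headers = {}
    key = None
    for line in text.splitlines():
        if line == "":
            break  # a blank line ends the main section
        if line.startswith(" ") and key is not None:
            headers[key] += line[1:]  # continuation of the previous header
        elif ": " in line:
            key, value = line.split(": ", 1)
            headers[key] = value
    return headers


def plugin_version(jpi):
    """Return the Plugin-Version of a .jpi given as a path or file object."""
    with zipfile.ZipFile(jpi) as archive:
        manifest = archive.read("META-INF/MANIFEST.MF").decode("utf-8")
    return parse_manifest(manifest).get("Plugin-Version")
```

Calling `plugin_version("/var/lib/jenkins/plugins/matrix-auth.jpi")` on each host would show which version is actually on disk rather than relying on file size.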
[14:22:39] for importing repos into Gerrit one can use git mirroring: `git fetch --mirror https://source.example.org' and then `git push --mirror ssh://gerrit.wikimedia.org:29418/dest` https://www.mediawiki.org/wiki/Git/Creating_new_repositories#Importing_from_an_existing_repository [14:23:05] true, thanks [14:24:23] :) [14:26:03] 06Project-Admins: Expansion of cookbooks tag usage across teams - https://phabricator.wikimedia.org/T357895#9665590 (10Aklapper) a:03joanna_borun @joanna_borun Please reset/remove the task assignee and set the task status to "open" once required data has been provided - thanks [14:26:21] 06Project-Admins: 14Create project tag for "Machine-Learning-Backlog" - 14https://phabricator.wikimedia.org/T352997#9665596 (10Aklapper) 05Stalled→03Declined 14Unfortunately closing this Phabricator task as no further information has been provided. @calbon: After replying to my previous, please set the... [14:27:49] 06Project-Admins: 14Replace tracking bug T5646 by new project tag "Feeds" - 14https://phabricator.wikimedia.org/T102495#9665606 (10Aklapper) 05Open→03Declined [14:27:59] jnuche: 177310 Mar 27 13:20 matrix-auth.jpi <--- yes that is the bundled plugin :/ [14:30:00] 10Beta-Cluster-Infrastructure, 06MediaWiki-Platform-Team, 10MW-1.42-notes (1.42.0-wmf.24; 2024-03-26), 13Patch-For-Review: Cannot create a new wiki on beta cluster - https://phabricator.wikimedia.org/T358236#9665616 (10pmiazga) >>! In T358236#9635697, @Ladsgroup wrote: > > Even more ideally, it shouldn't... [14:30:13] silly Jenkins [14:30:44] and why is that happening in the prod servers but not in the dev environment? bleh [14:33:09] maybe cause jenkins is running or not running [14:38:51] doesn't look like that, the problem reproduced in both prod servers, one was running, the other wasn't [14:43:12] 06Release-Engineering-Team, 10Scap: Ask for a commit/change summary in scap train if one not provided? 
- https://phabricator.wikimedia.org/T360729#9665646 (10dancy) [15:02:03] 10Continuous-Integration-Infrastructure, 07Jenkins, 13Patch-For-Review: Upgrade matrix-auth for Jenkins 2.440 - https://phabricator.wikimedia.org/T361084#9665760 (10CodeReviewBot) jnuche opened https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/57 jenkins-rel: updated matrix-auth plug... [15:10:37] 10Continuous-Integration-Infrastructure, 07Jenkins, 13Patch-For-Review: Upgrade matrix-auth for Jenkins 2.440 - https://phabricator.wikimedia.org/T361084#9665809 (10CodeReviewBot) jnuche merged https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/57 jenkins-rel: updated matrix-auth plug... [15:15:03] NOOOo [15:21:31] hashar: I deployed the fix to releases2003 and that worked, Jenkins now can read the CasC config and start up correctly :) [15:21:45] and I can see the permissions are still set correctly with the new config [15:22:01] I'm going to deploy the fix to the primary now [15:22:42] well, unless that Darth Vader impersonation has something to do with the fix :P [15:24:48] hashar: any objection to deploy to releases1003? [15:29:41] jnuche: yeah go for it [15:30:02] I am still puzzled as to why it failed [15:32:05] or it is a good old race condition [15:32:39] so imagine [15:32:48] Jenkins boots [15:33:12] the jenkins-plugin-manager has `--clean-download-directory` which nukes JENKINS_HOME/plugins [15:33:20] while Jenkins is running in the background [15:33:45] at some point it checks whether the plugin is there and since there is none (cause the dir got nuked), Jenkins uses the bundled one [15:35:27] 10Deployments, 06Release-Engineering-Team, 06serviceops, 13Patch-For-Review: httpbb appserver test breaks deployment of the week due to a timeout parsing page - https://phabricator.wikimedia.org/T360867#9665942 (10RLazarus) >>! In T360867#9664888, @Clement_Goubert wrote: > The reason we're catching it now,...
[15:35:55] 10Continuous-Integration-Infrastructure, 07Jenkins: Upgrade matrix-auth for Jenkins 2.440 - https://phabricator.wikimedia.org/T361084#9665945 (10hashar) Some debugging session from releases2003 which has the issue. At first, in `/var/lib/jenkins/plugins` all plugins are from 13:17 except for the bundled ones:... [15:37:55] releases1003 updated and it's healthy 🥳 [15:39:02] hashar: not sure, if that were the case we would have seen non-matching versions for the plugins in prod by now [15:39:21] the current setup has been running for more than a year I think [15:45:43] (I just verified that the plugins in prod match exactly what we have configured in `plugins.txt`) [15:46:40] what's more, the problem reproduced every time we tried in prod (3 times in total I think) [15:46:56] and not a single time locally in scap3-dev (I must have tried at least 3-4 times) [15:47:18] in prod happening in two different boxes too [15:47:27] yeah I can't repro either :\ [15:47:35] I don't know, it might be a race condition, but doesn't look much like one [15:52:16] 10Deployments, 06Release-Engineering-Team, 06serviceops, 13Patch-For-Review: 14httpbb appserver test breaks deployment of the week due to a timeout parsing page - 14https://phabricator.wikimedia.org/T360867#9665997 (10CodeReviewBot) 14dancy opened https://gitlab.wikimedia.org/repos/releng/train-dev/-/... [15:52:22] 10Deployments, 06Release-Engineering-Team, 06serviceops, 13Patch-For-Review: 14httpbb appserver test breaks deployment of the week due to a timeout parsing page - 14https://phabricator.wikimedia.org/T360867#9665992 (10Clement_Goubert) 05Open→03Resolved a:03Clement_Goubert 14`--retry_on_timeout`... 
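The 15:45 check, that the plugins in prod match `plugins.txt`, could be scripted roughly as below. The sketch assumes the pin file uses one `name:version` entry per line, optionally followed by `:url`, with `#` starting a comment; that format is an assumption about the plugin manager's text input, not something shown in this log. The installed versions are passed in as a plain dict, so how they are collected is left open, and `job-dsl` in the usage note is only a placeholder name.

```python
def parse_pins(text):
    """Parse 'name:version[:url]' lines into {name: version or None}."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split(":", 2)
        name = fields[0]
        version = fields[1] if len(fields) > 1 and fields[1] else None
        pins[name] = version
    return pins


def find_mismatches(pins, installed):
    """Return {name: (pinned, installed)} for every pin that is not honoured.

    A plugin that is pinned but missing from `installed` is reported
    with None as its installed version. Unpinned entries are skipped.
    """
    mismatches = {}
    for name, wanted in pins.items():
        actual = installed.get(name)
        if wanted is not None and actual != wanted:
            mismatches[name] = (wanted, actual)
    return mismatches
```

With a pin of `matrix-auth:3.1.5` and an installed map reporting `3.2.1`, `find_mismatches` would flag exactly that plugin, which is the situation suspected on releases2003 earlier in the day.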
[15:52:42] jnuche: well I am willing to forget about it :) [15:52:49] 10Deployments, 06Release-Engineering-Team, 06serviceops, 13Patch-For-Review: 14httpbb appserver test breaks deployment of the week due to a timeout parsing page - 14https://phabricator.wikimedia.org/T360867#9666018 (10CodeReviewBot) 14dancy merged https://gitlab.wikimedia.org/repos/releng/train-dev/-/... [15:56:09] 10Continuous-Integration-Infrastructure, 07Jenkins: 14Upgrade matrix-auth for Jenkins 2.440 - 14https://phabricator.wikimedia.org/T361084#9666025 (10jnuche) 05Open→03Resolved a:03jnuche 14The fix from https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/57 has fixed the issue. B... [15:57:11] 10Continuous-Integration-Infrastructure, 07Jenkins, 06Release-Engineering-Team, 06collaboration-services, and 3 others: Jenkins core security advisory - 2024-03-20 - https://phabricator.wikimedia.org/T360759#9666032 (10jnuche) Release Jenkins instances have now been upgraded to 2.440.2. An update of some c... 
[16:03:24] jnuche: or that is related to upgrade Jenkins itself [16:06:04] (03PS1) 10Physikerwelt: Zuul: [mediawiki/extensions/MathSearch] Disable PHP 8.2 testing for now [integration/config] - 10https://gerrit.wikimedia.org/r/1015079 (https://phabricator.wikimedia.org/T360709) [16:09:05] (03PS2) 10Physikerwelt: Zuul: [mediawiki/extensions/MathSearch] Disable PHP 8.2 testing for now [integration/config] - 10https://gerrit.wikimedia.org/r/1015079 (https://phabricator.wikimedia.org/T360709) [16:10:13] (03CR) 10Physikerwelt: "See https://gerrit.wikimedia.org/r/c/mediawiki/extensions/MathSearch/+/1013229 for failing tests" [integration/config] - 10https://gerrit.wikimedia.org/r/1015079 (https://phabricator.wikimedia.org/T360709) (owner: 10Physikerwelt) [16:11:30] 10Beta-Cluster-Infrastructure, 06MediaWiki-Platform-Team, 10MW-1.42-notes (1.42.0-wmf.24; 2024-03-26), 13Patch-For-Review: Cannot create a new wiki on beta cluster - https://phabricator.wikimedia.org/T358236#9666109 (10Ladsgroup) Sounds good for now! 
[16:16:51] 10Phabricator, 03Wikimedia-Hackathon-2024: Phorge (Phabricator) Code Review Sprint - https://phabricator.wikimedia.org/T356384#9666130 (10Aklapper) [16:26:15] 10Continuous-Integration-Infrastructure, 07Jenkins, 13Patch-For-Review: 14Upgrade matrix-auth for Jenkins 2.440 - 14https://phabricator.wikimedia.org/T361084#9666212 (10CodeReviewBot) 14hashar opened https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/58 Log hudson.pluginManager  [16:52:57] 10Phabricator: "Unexpected object type from git cat-file" errors in various imported Gerrit repositories - https://phabricator.wikimedia.org/T360270#9666327 (10Aklapper) [16:53:49] 10Release-Engineering-Team (Priority Backlog 📥), 10wikimedia.biterg.io: Sort out how to pull data (affiliations etc) from Bitergia DB via SortingHat API to find needed data updates - https://phabricator.wikimedia.org/T360762#9666330 (10Aklapper) a:03Aklapper [16:55:54] andre: for that `git cat-file` issue I am pretty sure it is an issue within Phabricator [16:56:04] yes, me too [16:56:26] it is fed the output of `git branch -r` [16:56:48] and it is cool to see Phabricator logs are in logstash \o/ [16:57:19] heh [17:12:20] I so hate the kibana search [17:12:22] or lucene [17:12:23] dql [17:12:29] I can never find what I am looking for :/ [17:15:43] `http.request.referrer:*EMEM*` [17:37:29] 10Phabricator: "Unexpected object type from git cat-file" errors in various imported Gerrit repositories - https://phabricator.wikimedia.org/T360270#9666499 (10hashar) From a stacktrace: ` 2024-03-22 23:44:16 Unexpected object type from `git cat-file` in rEMEM: HEAD -> master missing ... referer https://phabric... 
[17:39:32] 10Phabricator: "Unexpected object type from git cat-file" errors in various imported Gerrit repositories - https://phabricator.wikimedia.org/T360270#9666501 (10hashar) Another one I have investigated: > [2024-02-02 15:06:01] Unexpected object type from git cat-file in rEBTX: tag: 3.1.8 missing, referer: https:... [17:40:11] so yeah I don't know [20:23:31] (03PS1) 10Kosta Harlan: zuul: Add EventBus to phan dependencies for WikimediaEvents [integration/config] - 10https://gerrit.wikimedia.org/r/1015106 (https://phabricator.wikimedia.org/T354597) [20:25:21] (03CR) 10CI reject: [V:04-1] zuul: Add EventBus to phan dependencies for WikimediaEvents [integration/config] - 10https://gerrit.wikimedia.org/r/1015106 (https://phabricator.wikimedia.org/T354597) (owner: 10Kosta Harlan) [20:26:51] (03PS2) 10Kosta Harlan: zuul: Add EventBus to phan dependencies for WikimediaEvents [integration/config] - 10https://gerrit.wikimedia.org/r/1015106 (https://phabricator.wikimedia.org/T354597) [20:35:13] (03CR) 10Hashar: [C:03+2] zuul: Add EventBus to phan dependencies for WikimediaEvents [integration/config] - 10https://gerrit.wikimedia.org/r/1015106 (https://phabricator.wikimedia.org/T354597) (owner: 10Kosta Harlan) [20:35:27] kostajh: last minute deploy :) [20:36:11] thanks! [20:36:48] (03Merged) 10jenkins-bot: zuul: Add EventBus to phan dependencies for WikimediaEvents [integration/config] - 10https://gerrit.wikimedia.org/r/1015106 (https://phabricator.wikimedia.org/T354597) (owner: 10Kosta Harlan) [20:37:53] kostajh: I have reloaded Zuul [20:38:06] merci [20:38:08] and well I am sleeping now! Happy hacking [21:35:10] 10Phabricator maintenance bot: #patch-for-review tags are not being removed from tasks - https://phabricator.wikimedia.org/T361074#9667232 (10matmarex) >>! In T361074#9664283, @Pppery wrote: > Looks like it's working to me. The above task shouldn't have patch-for-review removed since it has an open patch (https:... 
[22:59:15] 10Continuous-Integration-Config, 10MediaWiki-extensions-CentralAuth, 10MediaWiki-Installer, 06MediaWiki-Platform-Team, 07ci-test-error: Admin account created by the installer isn't made global by CentralAuth - https://phabricator.wikimedia.org/T358985#9667428 (10matmarex) I started playing with the appro... [23:20:57] 10MediaWiki-Releasing: Big holes in the MediaWiki release archive - https://phabricator.wikimedia.org/T190369#9667498 (10tstarling) I uploaded the following release tarballs from my personal archives, which mostly derive from a copy I made of the [[https://sourceforge.net/projects/wikipedia/files/|SourceForge fi... [23:45:48] 10Continuous-Integration-Config, 10MediaWiki-Documentation, 10MediaWiki-Platform-Team (Radar), 05MW-1.39-notes, and 4 others: MediaWiki core docs unavailable for MW 1.35 and later - https://phabricator.wikimedia.org/T317451#9667565 (10Krinkle) Yay! https://doc.wikimedia.org/mediawiki-core/1.39.7-alpha-T317... [23:59:12] 10Release-Engineering-Team (Priority Backlog 📥), 13Patch-For-Review, 05Release, 05Train Deployments: 1.42.0-wmf.24 deployment blockers - https://phabricator.wikimedia.org/T360156#9667593 (10Ladsgroup)