[00:00:20] GitLab (Project Migration), Release-Engineering-Team (Done by Feb 23 🧟): Create new GitLab project group for Wikimedia Italia - https://phabricator.wikimedia.org/T301791 (brennen) Open→In progress a:brennen I've created https://gitlab.wikimedia.org/repos/wikimedia-it and added @valerio.bozzol...
[00:08:27] Release-Engineering-Team (Next), Patch-For-Review, Release, Train Deployments, User-brennen: 1.38.0-wmf.24 deployment blockers - https://phabricator.wikimedia.org/T300200 (brennen) Open→Resolved Seems stable on all wikis at end-of-workday. Optimistically resolving.
[06:32:55] Phabricator, Release-Engineering-Team (Seen), Technical-Debt: Stop using Differential for code review - https://phabricator.wikimedia.org/T191182 (hashar) Not much to be done to Gerrit. The last active repositories in Phabricator Differential will be migrated to #gitlab
[06:51:29] Gerrit: Gerrit notification email contains "null" - https://phabricator.wikimedia.org/T288312 (hashar) The message payload can be retrieved from the change metadata `git fetch origin refs/changes/84/710084/meta` ` commit f79c7956faef859fa07144f50dd1188158f558d1 Author: Gerrit User 6729 <6729@e9e9afe9-4712-48...
[07:02:39] Gerrit: Gerrit notification email contains "null" - https://phabricator.wikimedia.org/T288312 (hashar) The email comes from templates and I found out that we forked the upstream ones for some reason, the sources are in operations/puppet: ` $ git ls-files modules/gerrit/ |grep soy modules/gerrit/files/homedir...
[07:04:42] Gerrit: Gerrit notification email contains "null" - https://phabricator.wikimedia.org/T288312 (hashar) a:hashar Looks like the template got introduced in 2017 by 84c651bfc6e8db67933e6322f37efa082946d259 for T43608. I think I will drop them in favor of using the upstream built in templates.
[07:10:29] (PS1) Majavah: log: fix inconsistent separators [tools/scap] - https://gerrit.wikimedia.org/r/767923
[08:32:30] Krinkle & hashar: I'm pretty sure I've seen timeouts with npm install before we implemented the parallel install flag. Are you seeing this regularly? or just once?
[08:33:56] kostajh: apparently it was a one time issue
[08:34:13] I looked at the build log, there is something sketchy
[08:34:25] cause we have an equal amount of "Start: npm install" and "Finish: npm install"
[08:34:36] which would indicate that all the npm install commands have completed
[08:34:51] then the ParallelTask reporter keeps idling waiting for one last task to complete
[08:35:15] apparently all the npm install console output got successfully logged
[08:35:24] we capture the output and then log.info() it
[08:35:46] so I suspect a bug in the ProgressReporter that might not properly track completion of tasks
[08:43:38] hashar: hmm, yeah that's possible, but you'd think an off-by-one error for ProgressReporter would get surfaced a lot more
[08:46:12] kostajh: the timed out step was for selenium-test, with no output or progress available after 60min
[08:46:41] essentially I have no idea what is going on :/
[08:46:48] It was one off but the lack of output seems a new bug
[08:46:53] but all npm install commands seem to have properly completed
[08:47:02] at least based on the log messages
[08:50:26] I'm not sure if there's a way to dump output once Jenkins decides it's time to end the job
[08:51:35] we could have quibble stream its output (including the logs held by ProgressReporter) to e.g. quibble.log, as a backup for this type of scenario? then you'd be able to get it from build artifacts https://integration.wikimedia.org/ci/job/wmf-quibble-selenium-php72-docker/138874/
[08:52:27] Jenkins streams the log output that is emitted
[08:52:38] but our parallel task captures the output of `npm install` in a buffer
[08:52:49] and on completion of the task we call log.info(capture)
[08:53:19] and after that the task completion is logged via a message that looks like `>>> Finish npm install in /x/y/z`
[08:54:34] which leads me to believe that all "npm install" commands successfully completed
[08:55:23] Krinkle: have you filed a task about the stalled quibble run ? Or I can file one
[09:13:51] Oh I see. kostajh, it's Jenkins ending it a few seconds before quibble has a chance to call it time and print?
[09:15:17] The "60 elapsed" made me think it was quibble calling the time out
[09:15:21] Makes sense indeed
[09:15:35] I wonder what signal that sends
[09:16:02] Maybe that's something docker passes on and gives us a natural chance to respond. Or whether it's a hard switch immediately
[09:18:36] But yeah streaming to a file could be a good fallback and perhaps even tail from a post step for easy access
[09:23:09] Jenkins should be sending a SIGTERM on timeout
[09:23:19] and SIGKILL after some time
[09:29:14] my suspicion is that the "completed" variable that keeps track of tasks being completed is not thread safe
[09:29:22] and somehow sometime it might end up being off by one
[09:35:04] (CR) Thiemo Kreuz (WMDE): [C: +2] log: fix inconsistent separators [tools/scap] - https://gerrit.wikimedia.org/r/767923 (owner: Majavah)
[09:35:46] (Merged) jenkins-bot: log: fix inconsistent separators [tools/scap] - https://gerrit.wikimedia.org/r/767923 (owner: Majavah)
[09:40:42] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.25 deployment blockers - https://phabricator.wikimedia.org/T300201 (dom_walden)
[09:59:01] Phabricator, Patch-For-Review: Delete unused custom /src/other/CustomLoginHandler.php in Phab extensions - https://phabricator.wikimedia.org/T228518 (Aklapper)
[10:10:09] Project-Admins, Tool-DrTrigonBot---General: Archive DrTrigonBot* Phab projects and decline its open tasks - https://phabricator.wikimedia.org/T300969 (Aklapper) It sounds like it's unknown what the software projects and (now defunct) services did or do (plus no other maintainers joined this conversation)...
[10:13:38] kostajh: Krinkle: I highly suspect the ProgressReporter is thread unsafe when it keeps track of tasks completed
[10:13:53] and somehow `self.completed += 1` might not be taken into account
[10:13:57] it is a rabbit hole really
[10:49:27] I give up on it :\
[10:49:48] but maybe we can add some debug output when the progressreporter has been running for too long
[11:59:45] Project-Admins: Create project tag for Wikinews - https://phabricator.wikimedia.org/T303039 (Lens0021)
[12:10:32] MediaWiki-Releasing, AbuseFilter (Overhaul-2020), MW-1.38-release: Bundle AbuseFilter extension with MediaWiki - https://phabricator.wikimedia.org/T191740 (Daimona) Can this be closed now?
[12:17:26] GitLab (Auth & Access), Release-Engineering-Team, User-brennen: Investigate what's required to allow a user to fork or transfer a project to a group - https://phabricator.wikimedia.org/T300935 (tchin) I can sort of get around this issue if I go to the project group page (in my case `repos/api-pla...
[12:32:28] Project-Admins: Create project tag for Wikinews - https://phabricator.wikimedia.org/T303039 (Aklapper) I'm very reluctant about this; see T154549#2916179. Who would use it and how, and for what, and especially for what not?
[12:45:18] Project-Admins: Create project tag for Wikinews - https://phabricator.wikimedia.org/T303039 (Lens0021) Though there is no guarantee to be implemented, [[ https://meta.wikimedia.org/wiki/Community_Wishlist_Survey_2022/Larger_suggestions/Refinement_of_MediaWiki_to_meet_the_needs_of_the_Wikinews_project_(news_w...
[13:53:04] Continuous-Integration-Config, MediaWiki-Core-Tests, Quibble, Code-Health, Patch-For-Review: Quibble: Run PHPUnit databaseless and database stages in parallel - https://phabricator.wikimedia.org/T235449 (kostajh)
[14:48:21] did we get a new Jenkins version? https://i.imgur.com/5IeYqEE.png
[14:48:32] I don’t remember seeing this behavior before https://i.imgur.com/dEe0xdO.png
[14:48:57] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.25 deployment blockers - https://phabricator.wikimedia.org/T300201 (Ammarpad)
[14:59:57] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), SRE, serviceops, Patch-For-Review: replace doc1001.eqiad.wmnet with a buster VM and create the codfw equivalent - https://phabricator.wikimedia.org/T247653 (Majavah)
[15:13:36] Lucas_WMDE: I think that is cause we run `npm install` in parallel
[15:13:49] and the Jenkins plugin that creates the section ends up confused somehow :-`
[15:18:21] OH
[15:19:43] I see, I wasn’t sure if the sections displayed as nested before but it’s possible I just didn’t notice
[15:19:49] do you want a task to track it?
[15:20:01] I once did a hack of that plugin to fix a bug in it
[15:20:06] and thought that maybe we have lost those hacks
[15:20:17] but they got merged upstream and are included in the version of the plugin we run
[15:20:34] so that must be due to a change in the output generated by the new Quibble (we have upgraded yesterday)
[15:20:45] or maybe we never noticed
[15:21:01] the parallel npm install is from mid-February, so probably we never noticed
[15:21:22] I will file the task reusing your screenshots :]
[15:21:51] ok :)
[15:22:05] * Lucas_WMDE hereby CC0s the screenshots if they are copyrightable ;)
[15:24:00] (PS1) Krinkle: Revert "zuul: Install MobileFrontend when testing Echo" [integration/config] - https://gerrit.wikimedia.org/r/768068 (https://phabricator.wikimedia.org/T225730)
[15:24:10] :]]]
[15:25:22] Continuous-Integration-Infrastructure, OOUI: Demos page for OOUI in php is broken - https://phabricator.wikimedia.org/T297035 (Krinkle)
[15:25:30] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), SRE, serviceops, Patch-For-Review: replace doc1001.eqiad.wmnet with a buster VM and create the codfw equivalent - https://phabricator.wikimedia.org/T247653 (Krinkle)
[15:26:38] Continuous-Integration-Infrastructure, Jenkins, Release-Engineering-Team: Jenkins collapsible sections are deeply nested in Quibble output - https://phabricator.wikimedia.org/T303056 (hashar)
[15:32:08] Continuous-Integration-Infrastructure, Jenkins, Release-Engineering-Team, Quibble: Jenkins collapsible sections are deeply nested in Quibble output - https://phabricator.wikimedia.org/T303056 (hashar) Quibble 1.3.0 build: https://integration.wikimedia.org/ci/job/wmf-quibble-selenium-php72-docker/...
[15:33:30] Lucas_WMDE: it is definitely caused by the Quibble 1.3.0 > 1.4.3 upgrade
[15:37:43] ok
[15:44:05] Continuous-Integration-Infrastructure, Jenkins, Release-Engineering-Team, Quibble: Jenkins collapsible sections are deeply nested in Quibble output - https://phabricator.wikimedia.org/T303056 (hashar) The plugin regex are at https://integration.wikimedia.org/ci/configure | Start | ^(?:.+)?INFO(?...
[15:44:39] why?
[15:44:40] well
[15:44:43] I HAVE NO CLUE :D
[15:45:11] :D
[15:45:21] sorry to send you down this rabbit hole :'D
[15:46:16] yeah well I found it I think
[15:46:22] stupid logging module
[15:46:23] :D
[15:46:55] we log the output of the npm install command
[15:46:58] which has no trailing new line
[15:47:24] and when we log the magic <<< Finish line it ends up appended to the npm output
[15:47:32] all of that on a single log entry
[15:47:44] o_O bad npm
[15:47:58] gotta flush something somewhere
[15:48:23] maybe npm does some cursor-movement stuff where, at the end, it clears the line instead of terminating it
[15:48:35] could Quibble always print a newline before its markers? might make them stand out to humans too (would look like a blank line after well-behaved tools)
[15:52:40] (Queue (Jenkins jobs + Zuul functions) alert) firing: Queue (Jenkins jobs + Zuul functions) alert - https://alerts.wikimedia.org
[15:58:18] at least I can reproduce it ;]
[15:58:20] Continuous-Integration-Infrastructure, Jenkins, Release-Engineering-Team, Quibble: Jenkins collapsible sections are deeply nested in Quibble output - https://phabricator.wikimedia.org/T303056 (hashar) Reproducible with: ` lang=python import logging from quibble.commands import Parallel logging....
[16:05:48] PROBLEM - Work requests waiting in Zuul Gearman server on contint2001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [400.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/d/000000322/zuul-gearman?panelId=10&fullscreen&orgId=1
[16:10:06] RECOVERY - Work requests waiting in Zuul Gearman server on contint2001 is OK: OK: Less than 100.00% above the threshold [200.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/d/000000322/zuul-gearman?panelId=10&fullscreen&orgId=1
[16:12:40] (Queue (Jenkins jobs + Zuul functions) alert) firing: (2) Queue (Jenkins jobs + Zuul functions) alert - https://alerts.wikimedia.org
[16:22:07] Continuous-Integration-Infrastructure, Jenkins, Release-Engineering-Team, Quibble: Jenkins collapsible sections are deeply nested in Quibble output - https://phabricator.wikimedia.org/T303056 (hashar) The issue is in quibble/util.py _redirect_logging. It creates a new logger `logger = logging.get...
[16:22:18] Lucas_WMDE: I will try to remember about it next week. There is no obvious simple fix for the issue :]
[16:22:25] ok :)
[16:32:40] (Queue (Jenkins jobs + Zuul functions) alert) resolved: Queue (Jenkins jobs + Zuul functions) alert - https://alerts.wikimedia.org
[17:08:37] (PS7) Dduvall: Provide scap prep auto history browsing and replay [tools/scap] - https://gerrit.wikimedia.org/r/763864 (https://phabricator.wikimedia.org/T301417)
[17:09:39] (CR) Dduvall: "Fixed the _many_ flake errors and fixed .pipeline/blubber.yaml in the latest PS." [tools/scap] - https://gerrit.wikimedia.org/r/763864 (https://phabricator.wikimedia.org/T301417) (owner: Dduvall)
[17:09:53] (CR) VolkerE: doc: Add Codex, wikimedia-ui-base, less.php (2 comments) [integration/docroot] - https://gerrit.wikimedia.org/r/740937 (owner: Krinkle)
[19:09:43] (CR) Krinkle: [C: +2] Revert "zuul: Install MobileFrontend when testing Echo" [integration/config] - https://gerrit.wikimedia.org/r/768068 (https://phabricator.wikimedia.org/T225730) (owner: Krinkle)
[19:12:10] (Merged) jenkins-bot: Revert "zuul: Install MobileFrontend when testing Echo" [integration/config] - https://gerrit.wikimedia.org/r/768068 (https://phabricator.wikimedia.org/T225730) (owner: Krinkle)
[19:13:42] !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/768068
[19:13:44] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:08:24] hi, it is me again. can anyone explain https://phabricator.wikimedia.org/T303074 ?
[20:17:36] (CR) Dduvall: "This is ready I think, although the commit message needs a little TLC. I'll do that after my lunch." [tools/scap] - https://gerrit.wikimedia.org/r/763864 (https://phabricator.wikimedia.org/T301417) (owner: Dduvall)
[20:26:34] (PS1) Krinkle: zuul: Install MobileFrontend when testing Echo (take 2) [integration/config] - https://gerrit.wikimedia.org/r/768146
[20:26:38] (CR) Krinkle: [C: +2] zuul: Install MobileFrontend when testing Echo (take 2) [integration/config] - https://gerrit.wikimedia.org/r/768146 (owner: Krinkle)
[20:28:35] (Merged) jenkins-bot: zuul: Install MobileFrontend when testing Echo (take 2) [integration/config] - https://gerrit.wikimedia.org/r/768146 (owner: Krinkle)
[20:29:38] !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/768146
[20:29:39] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:32:49] MediaWiki-Releasing, MediaWiki-Installer, MediaWiki-Stakeholders-Group, Epic, MW-1.38-release: Expand the set of bundled extensions and skins in MediaWiki 1.38 - https://phabricator.wikimedia.org/T290934 (Jdforrester-WMF)
[22:32:52] MediaWiki-Releasing, MediaWiki-Installer, MediaWiki-Stakeholders-Group, Epic, MW-1.37-release: Expand the set of bundled extensions and skins in MediaWiki 1.37 - https://phabricator.wikimedia.org/T279842 (Jdforrester-WMF)
[22:33:11] MediaWiki-Releasing, AbuseFilter (Overhaul-2020), MW-1.38-release: Bundle AbuseFilter extension with MediaWiki - https://phabricator.wikimedia.org/T191740 (Jdforrester-WMF) Open→Resolved a:Daimona Yup!
[22:56:57] GitLab (Auth & Access), Release-Engineering-Team (Next), gitlab-settings, User-brennen: gitlab-settings: Automate people/* group policies - https://phabricator.wikimedia.org/T288697 (brennen) Open→Declined See J273 for some thoughts on mostly scrapping the `/people` groups.
[23:02:14] Release-Engineering-Team (Next), Release Pipeline (Blubber), User-dduvall: Implement buildkit frontend support in blubber for use on GitLab runners - https://phabricator.wikimedia.org/T301169 (dduvall) p:Triage→Medium
[23:56:22] (PS1) Dduvall: Update all dependencies and stop using dep [blubber] - https://gerrit.wikimedia.org/r/768179 (https://phabricator.wikimedia.org/T301169)
[23:56:58] (CR) jerkins-bot: [V: -1] Update all dependencies and stop using dep [blubber] - https://gerrit.wikimedia.org/r/768179 (https://phabricator.wikimedia.org/T301169) (owner: Dduvall)