[06:27:17] Project-Admins, User-Urbanecm: Create User-IN project for IN / Q28 - https://phabricator.wikimedia.org/T289915 (IN) >>! In T289915#7323771, @Aklapper wrote: > @IN: Hi, I don't see a usecase for the workboard columns "Feature request", "Invalid task", "Won't fix", "Fixed bug". All this information is alre...
[07:44:58] Release-Engineering-Team, MobileFrontend, Quality-and-Test-Engineering-Team (QTE), Patch-For-Review, and 2 others: FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory - https://phabricator.wikimedia.org/T291145 (zeljkofilipin) Open→In progress a:zeljkofil...
[07:50:38] Continuous-Integration-Config, Wikidata, Wikidata Query Builder, wdwb-tech, and 2 others: Add branch hosting to Query Builder integration pipeline - https://phabricator.wikimedia.org/T278706 (Michael) Successfully tested it in the wild at https://gerrit.wikimedia.org/r/c/wikidata/query-builder/+/...
[09:06:52] Continuous-Integration-Config, Wikidata, Wikidata Query Builder, wdwb-tech, and 2 others: Add branch hosting to Query Builder integration pipeline - https://phabricator.wikimedia.org/T278706 (Addshore) Open→Resolved
[09:15:02] Release-Engineering-Team (Seen), Quality-and-Test-Engineering-Team (QTE), User-zeljkofilipin: Release Engineering Data Collection and Retention (aka Data³) - https://phabricator.wikimedia.org/T216085 (zeljkofilipin)
[09:15:50] Release-Engineering-Team (Seen), Quality-and-Test-Engineering-Team (QTE), User-zeljkofilipin: Release Engineering Data Collection and Retention (aka Data³) - https://phabricator.wikimedia.org/T216085 (zeljkofilipin) p:Medium→Triage
[09:16:21] logstash-beta seems to be broken (/app/dashboards#/list is blank), is that a known issue?
[09:16:42] Release-Engineering-Team (Seen), Quality-and-Test-Engineering-Team (QTE), User-zeljkofilipin: Release Engineering Data Collection and Retention (aka Data³) - https://phabricator.wikimedia.org/T216085 (zeljkofilipin) For an example of a task where having test data would be really useful, see {T277205}.
[09:16:59] (there are some other logstash-beta issues on Phabricator but they don’t seem to describe the same thing)
[09:20:36] Continuous-Integration-Config, Quality-and-Test-Engineering-Team (QTE), Patch-For-Review, User-zeljkofilipin: Drop Ruby Selenium CI jobs; we don't support them any more - https://phabricator.wikimedia.org/T220035 (zeljkofilipin) p:Medium→Triage
[09:26:33] Scap: Update scap man page - https://phabricator.wikimedia.org/T291245 (Majavah)
[09:26:45] Scap: Update scap man page - https://phabricator.wikimedia.org/T291245 (Majavah)
[09:28:06] Continuous-Integration-Config, MediaWiki-Core-Tests, Quality-and-Test-Engineering-Team (QTE), Browser-Tests, and 2 others: Make MediaWiki Wdio tests less slow (Sept 2019) - https://phabricator.wikimedia.org/T234002 (zeljkofilipin) p:Medium→Triage
[10:01:03] Continuous-Integration-Config, Patch-For-Review, User-zeljkofilipin: Drop Ruby Selenium CI jobs; we don't support them any more - https://phabricator.wikimedia.org/T220035 (zeljkofilipin)
[10:03:23] Continuous-Integration-Config, Patch-For-Review, Ruby, User-zeljkofilipin: Drop Ruby Selenium CI jobs; we don't support them any more - https://phabricator.wikimedia.org/T220035 (zeljkofilipin)
[10:09:30] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201907), Fundraising-Backlog, MediaWiki-extensions-CentralNotice, User-zeljkofilipin: Port CentralNotice Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T180223 (zeljkofilipin)
[10:29:48] Continuous-Integration-Config, ProofreadPage, Patch-For-Review: Get
ProofreadPage Lua tests working in CI - https://phabricator.wikimedia.org/T289465 (Inductiveload) The Lua tests are still not being run: this should not be passing: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ProofreadPage/...
[11:00:05] PROBLEM - Work requests waiting in Zuul Gearman server on contint2001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [150.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[11:13:15] RECOVERY - Work requests waiting in Zuul Gearman server on contint2001 is OK: OK: Less than 100.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[12:30:20] Lucas_WMDE: https://phabricator.wikimedia.org/T233134
[12:31:44] ok thanks
[12:46:01] PROBLEM - Work requests waiting in Zuul Gearman server on contint2001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [150.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[12:57:28] RECOVERY - Work requests waiting in Zuul Gearman server on contint2001 is OK: OK: Less than 100.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[12:58:12] ebernhardson: I commented about the `shellcheck` patch for puppet ( https://gerrit.wikimedia.org/r/c/operations/puppet/+/720402 ) .
Turns out we already have a Jenkins job to run shellcheck against a repository :-]
[12:58:38] it is straightforward to add to any repo
[13:01:02] if it is a more complicated use case such as invoking shellcheck from maven or whatever, we would have to add the package to the docker image
[13:01:15] so that the usual tool (maven, tox, npm etc) can in turn invoke shellcheck
[13:02:01] MediaWiki-Releasing: MEDIAWIKI CREATE (dnf) MODULE to ease install of Mediawiki and NGINX (and Apache too) on Fedora Worksation - https://phabricator.wikimedia.org/T291263 (Majavah) Open→Declined Please submit this request to the distributions in question. We (MediaWiki developers) are not familiar...
[13:42:49] Continuous-Integration-Config, Wikidata, Wikidata Query Builder, wdwb-tech, and 2 others: Add branch hosting to Query Builder integration pipeline - https://phabricator.wikimedia.org/T278706 (Lucas_Werkmeister_WMDE) How does this work? How do I use it? I just submitted a Query Builder change and...
[14:10:52] Continuous-Integration-Config, Wikidata, Wikidata Query Builder, wdwb-tech, and 2 others: Add branch hosting to Query Builder integration pipeline - https://phabricator.wikimedia.org/T278706 (Michael) >>! In T278706#7361959, @Lucas_Werkmeister_WMDE wrote: > How does this work? How do I use it? I...
[15:19:56] Project-Admins, User-Urbanecm: Create User-IN project for IN / Q28 - https://phabricator.wikimedia.org/T289915 (Aklapper) @IN: I wrote "All this information is already expressed via task status, or the "Feature" task type, so I don't understand why there are also columns for this." You did not reply to t...
[15:21:55] Project-Admins, User-Urbanecm: Create User-IN project for IN / Q28 - https://phabricator.wikimedia.org/T289915 (Aklapper) For example, this makes no sense, but tagging and moving creates notification noise: {F34646194}
[15:27:35] hashar: i'll see if i can make that work, still needs the piece that runs jjb and extracts all the shell, but i think it should be able to then re-use that in a second docker container for shellcheck?
[16:01:34] Release-Engineering-Team (Doing), GitLab, User-brennen: Early adoption signup for WMF GitLab - https://phabricator.wikimedia.org/T282842 (Mholloway) Not sure if I was supposed to say something here first, but I've just created four new repos in GitLab to host the individual #metrics-platform client l...
[16:02:55] mholloway: gitlab isn't open for non-wmf/nda use yet
[16:03:58] RhinosF1: i intend to transfer them to jason linehan
[16:04:34] mholloway: no idea who Jason is
[16:04:54] More thinking if people outside that group want access to the code
[16:05:26] They can't contribute
[16:05:27] ah, i see. jason is the other WMF engineer on the product metrics platform team
[16:05:34] Ok
[16:14:26] Continuous-Integration-Config, MediaWiki-extensions-Page_Forms, Patch-For-Review: Add phan to PageForms - https://phabricator.wikimedia.org/T228155 (Yaron_Koren) Open→Resolved a:Yaron_Koren I don't really understand phan stuff, but I assume this is resolved now...
[16:19:10] !log Enabled TLS on Jumbo Kafka instances in deployment-prep.
[16:19:16] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:26:47] Continuous-Integration-Config, MediaWiki-extensions-Page_Forms, Patch-For-Review: Add phan to PageForms - https://phabricator.wikimedia.org/T228155 (Daimona) Resolved→Open No, this wasn't finished. I stopped working on my patch because there were too many failures. Perhaps a better approach c...
[17:01:37] Continuous-Integration-Config, MediaWiki-extensions-Page_Forms, Patch-For-Review: Add phan to PageForms - https://phabricator.wikimedia.org/T228155 (Daimona) There's a bunch of errors: https://integration.wikimedia.org/ci/job/mwext-php72-phan-docker/136559/console Some are due to SMW not being clone...
[17:21:43] Release-Engineering-Team (Doing), Release, Train Deployments: 1.38.0-wmf.1 deployment blockers - https://phabricator.wikimedia.org/T281165 (DannyS712) ##### Risky Patch! 🚂🔥 * **Change**: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/718414 (T267861) * **Summary**: (Why is it risky?) ** The glo...
[18:07:40] !log Re-recreating qemu-1002 as integration-agent-qemu-1003 (Debian 11 Bullseye, g3.cores8.ram24.disk20.ephemeral40.4xiops), ref T28477
[18:07:43] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:07:43] T28477: editing of discussion page header is broken - https://phabricator.wikimedia.org/T28477
[18:08:02] !log Re-recreating qemu-1002 as integration-agent-qemu-1003 (Debian 11 Bullseye, g3.cores8.ram24.disk20.ephemeral40.4xiops), ref T284774
[18:08:09] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:08:09] T284774: Provide one or more Qemu agents in CI that use a newer version than 2.x - https://phabricator.wikimedia.org/T284774
[18:13:41] RhinosF1, well, we can throw patches over the wall from Phabricator
[18:17:38] ebernhardson: somehow back. What was the use case you had in mind for shellcheck?
I don't understand the jjb and "extracts all the shell" ;)
[18:18:00] (PS1) Subramanya Sastry: 1.37.0-wmf.23 workaround: Strip entity spans from Parsoid HTML [integration/visualdiff] - https://gerrit.wikimedia.org/r/721878
[18:20:01] hashar: there are lots of shell scripts embedded inside the jjb configuration, my part compiles the jjb and extracts all the shell code that will be executed inside jenkins into individual .sh scripts
[18:20:46] hashar: then shellcheck runs over them to find errors. I thought of it because i had a patch merged that used a multi-line command (with \ at the end of each). I forgot a \ and so it failed once deployed. Shellcheck at severity=warning catches that error (and many others)
[18:21:33] OH smart
[18:21:53] are you extracting the shell bits from the XML job definition?
[18:22:04] yes, just an xpath for the tag used
[18:22:11] "just an xpath"
[18:22:14] I love that :]
[18:22:32] i mean, you say load this document and give me all the Hudson.task.Shell tags, it's like 10 lines :P
[18:23:16] james merged that bit, so you can actually run this locally. utils/shellcheck.sh in integration/config
[18:23:22] so for integration-config jobs, they use the regular `docker-registry.wikimedia.org/releng/tox-buster`
[18:23:42] we could add shellcheck to it, or most probably craft a child image that simply has shellcheck added
[18:24:58] hmm, yea a child one should be easy enough to create. I suppose i was trying to take an easy route and do what the other scripts in utils/ were doing (but they don't have deps, so understandable)
[18:25:21] (CR) Subramanya Sastry: "I'm going to merge this and restart a fresh test run."
[integration/visualdiff] - https://gerrit.wikimedia.org/r/721878 (owner: Subramanya Sastry)
[18:25:33] (CR) Subramanya Sastry: [V: +2 C: +2] 1.37.0-wmf.23 workaround: Strip entity spans from Parsoid HTML [integration/visualdiff] - https://gerrit.wikimedia.org/r/721878 (owner: Subramanya Sastry)
[18:25:51] that sounded sane yes
[18:26:04] we stopped using puppet / dependencies on hosts a few years ago though
[18:26:16] Because getting review/merge was too hard.
[18:26:24] so the whole execution environment is in docker images. Theoretically that eases reproducibility
[18:26:32] hashar: does this run docker containers inside docker containers?
[18:26:36] Yes.
[18:26:43] It's docker all the way down.
[18:26:54] not docker in docker though!
[18:26:56] oh. So i thought it was on the host because clearly this was invoking containers, but it's just a container invoking more containers
[18:27:08] makes more sense now :)
[18:27:10] hashar: Qemu in docker though?
[18:27:21] the arch is: prod machines running jenkins --- ssh ---> WMCS integration project instances --> job runs 'docker run'
[18:28:07] there is one edge case which is a WMCS instance > Qemu > Docker image > test script > x * docker run
[18:28:17] Yeah.
[18:29:42] once upon a time we had a system spinning up a fresh WMCS instance that would then be discarded after the build is done
[18:29:49] but that is from a different era really
[18:45:34] ebernhardson: want me to forge a tox-buster child image that has shellcheck?
[18:45:45] then the integration-config-shellcheck job can be moved to it
[18:46:23] hashar: sure, i was just reading the xml outputs.
Turns out if i had read integration-config-shellcheck-docker/config.xml this would have all been clear from the start :)
[18:46:39] :D
[18:50:16] i suppose one annoyance, of course our code doesn't actually pass the linter right now :P Would be nice if there was some way to only fail if there are new errors, but seemed complicated
[18:51:45] James_F: given shellcheck is 18MB maybe I can just add it to tox-buster?
[18:52:29] ebernhardson: on our setup you gotta run against HEAD, then again against HEAD^ and then do the difference
[18:52:43] hashar: Yeah, let's do that. Do you want to or should I?
[18:53:24] Jenkins does have a system to compare the progress between builds and not fail if no new errors have been introduced. But that can't be done on our setup where the previous build most probably hasn't been built for the parent commit
[18:53:56] I am not sure how I ended up creating so many tox images :\
[18:53:59] James_F: will do
[18:54:01] that makes sense.
[18:55:01] Done.
[18:55:26] (PS1) Jforrester: dockerfiles: [tox-buster] Install shellcheck and cascade [integration/config] - https://gerrit.wikimedia.org/r/721881
[18:55:51] We should fold most of the tox images together, I agree.
[18:55:59] hashar: ^^
[18:56:10] Oh, sorry, I jumped too fast. :-)
[18:56:11] damn
[18:56:18] I HAVE LOST THE RACE
[18:56:38] I think the idea might have been to avoid having an image that is too big
[18:56:39] The slow bit was double-checking with https://packages.debian.org/buster/shellcheck that the package is called shellcheck. ;-)
[18:56:44] then having so many is a bit of a pain
[18:56:53] Yes, but ten "small" images is more of a pain than one large one.
[18:56:57] Yeah.
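For readers following along, the extraction step described earlier in the log (load the compiled job XML and pull out every shell builder) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the actual `utils/shellcheck.sh` from integration/config: the sample XML, the function names, and the `<job>/<n>.sh` file naming are all made up for the example. The one thing taken from Jenkins itself is that shell build steps are serialised as `hudson.tasks.Shell` elements with a `command` child.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed job definition; real config.xml files
# carry many more elements than this.
SAMPLE_CONFIG_XML = """\
<project>
  <builders>
    <hudson.tasks.Shell>
      <command>echo hello</command>
    </hudson.tasks.Shell>
    <hudson.tasks.Shell>
      <command>cd src &amp;&amp; make</command>
    </hudson.tasks.Shell>
  </builders>
</project>
"""


def extract_shell_commands(config_xml: str) -> list:
    """Return the text of every shell build step found in a job XML string."""
    root = ET.fromstring(config_xml)
    # ElementTree implements a small XPath subset; a descendant search
    # followed by a child step is within that subset.
    nodes = root.findall(".//hudson.tasks.Shell/command")
    return [node.text or "" for node in nodes]


def to_script_files(job_name: str, commands: list) -> dict:
    """Map illustrative file names (<job>/<index>.sh) to script bodies."""
    return {f"{job_name}/{i}.sh": body for i, body in enumerate(commands)}


if __name__ == "__main__":
    scripts = to_script_files("demo-job", extract_shell_commands(SAMPLE_CONFIG_XML))
    for name, body in scripts.items():
        print(f"--- {name}")
        print(body)
```

Each extracted body could then be written to disk and handed to shellcheck. Index-based names are simple to generate, but they carry no stable identity from one run to the next.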
[18:57:26] as seen in that change
[18:57:42] there might be a use case for a child image when something specific is being done
[18:57:43] I mean
[18:57:49] more than just adding a few packages
[18:58:25] (CR) Hashar: [C: +2] "And that provides shellcheck in the CI execution environment for releng/tox-buster image." [integration/config] - https://gerrit.wikimedia.org/r/721881 (owner: Jforrester)
[18:58:55] And filed ^
[18:58:56] Continuous-Integration-Infrastructure: Fold most of the tox images into tox-buster - https://phabricator.wikimedia.org/T291292 (Jdforrester-WMF)
[19:01:02] (Merged) jenkins-bot: dockerfiles: [tox-buster] Install shellcheck and cascade [integration/config] - https://gerrit.wikimedia.org/r/721881 (owner: Jforrester)
[19:04:36] building
[19:05:21] !log Building Docker images for [tox-buster] Install shellcheck and cascade [integration/config] - https://gerrit.wikimedia.org/r/721881
[19:05:27] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:05:37] so I can pretend I am of any assistance
[19:06:01] ebernhardson: so essentially once those docker images are built, we can bump the jenkins job to use the new version and shellcheck will be available in the execution environment
[19:07:32] (PS1) Jforrester: Docker: [tox-buster] Add several packages so we can scrap sub-images [integration/config] - https://gerrit.wikimedia.org/r/721884 (https://phabricator.wikimedia.org/T291292)
[19:07:34] (PS1) Jforrester: Docker: Drop tox-(censorshipmonitoring|conftool|eventlogging|homer|ldap) [integration/config] - https://gerrit.wikimedia.org/r/721885 (https://phabricator.wikimedia.org/T291292)
[19:07:45] I'll leave those for later.
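The "run against HEAD, then against HEAD^, then do the difference" idea from a few minutes earlier could look something like the sketch below. It is only an illustration and makes several assumptions: findings are plain dictionaries with `file`, `code`, `line` and `message` keys (modelled on what `shellcheck --format=json` emits, but not verified against any particular shellcheck version), file names are stable between the two runs, and the sample data is invented. Line numbers are deliberately left out of the comparison key so that an unrelated edit that shifts lines does not make an old finding look new.

```python
from collections import Counter


def finding_key(finding: dict) -> tuple:
    """Identity of a finding, ignoring its position within the file."""
    return (finding["file"], finding["code"], finding["message"])


def new_findings(parent: list, head: list) -> Counter:
    """Findings present at HEAD that were not already present at HEAD^.

    Counter subtraction keeps positive counts only, so a finding that
    occurs twice at HEAD but once at HEAD^ is reported once.
    """
    return Counter(map(finding_key, head)) - Counter(map(finding_key, parent))


if __name__ == "__main__":
    # Invented sample data, not real shellcheck output.
    parent_run = [
        {"file": "job/0.sh", "line": 3, "code": 2086, "message": "example A"},
    ]
    head_run = [
        {"file": "job/0.sh", "line": 5, "code": 2086, "message": "example A"},
        {"file": "job/1.sh", "line": 1, "code": 2157, "message": "example B"},
    ]
    introduced = new_findings(parent_run, head_run)
    for key, count in introduced.items():
        print(count, key)
    # A CI wrapper would exit non-zero only when `introduced` is non-empty.
```

A multiset comparison like this is cheap, but it depends on the same script getting the same file name in both runs.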
[19:10:57] excellent, i'll see what i can work up on running previous and current version, it's complicated because we don't have real names of these files, i don't really know if debian-glu-unstable/12.sh is the same file as the last time round, but maybe something can work
[19:11:26] if the current shellcheck test for jjb fails, we probably want to make the job `voting: false` in zuul/layout.yaml
[19:11:29] and make it voting once it passes
[19:11:57] ebernhardson: or we fix them all
[19:12:01] it fails at severity=critical for 1 place, but the failure isn't really a failure, it's just mixing a templating language with shell so the resulting shell is odd
[19:12:16] then we can just enforce it and don't have to bother about comparing with previous state since it is guaranteed to be all ok
[19:12:22] ah
[19:12:24] it's something like `if [ -n "{foo}" ]` and shellcheck says -n against a string constant is pointless
[19:12:26] Continuous-Integration-Infrastructure, Performance-Team, Patch-For-Review: Provide one or more Qemu agents in CI that use a newer version than 2.x - https://phabricator.wikimedia.org/T284774 (Krinkle)
[19:12:34] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), Cloud-VPS, Patch-For-Review, cloud-services-team (Kanban): Support Cinder for CI docker workers - https://phabricator.wikimedia.org/T277078 (Krinkle)
[19:12:45] but aren't the templates expanded when crawling the xml configurations?
[19:13:08] hashar: yes, so shellcheck sees `if [ -n "debian" ]`. And it says this shouldn't be an if condition because it can only go one way
[19:13:17] instead of seeing that this is a variable
[19:13:25] OH true
[19:13:50] i can fix it easily, i just sometimes wonder if we really modify reasonable code to make a linter happy
[19:13:52] well I guess we can add a comment to have shellcheck ignore that specific error at that specific line
[19:14:00] add a variable to the script and fix that linting issue?
[19:14:12] bd808: yea i have a patch for that, just trying to write the commit message now :)
[19:14:19] :)
[19:14:23] or that yeah
[19:14:46] you could argue that any jjb variables should correspond to shell variables at the top of the shell snippet
[19:14:57] foo="{foo}"
[19:15:04] // do stuff with $foo
[19:15:49] (PS1) Ebernhardson: jjb: Pass shellcheck at severity=critical [integration/config] - https://gerrit.wikimedia.org/r/721889
[19:17:25] (PS2) Ebernhardson: jjb: Pass shellcheck at severity=critical [integration/config] - https://gerrit.wikimedia.org/r/721889
[19:17:26] maybe better commit message
[19:19:04] Successfully published image docker-registry.discovery.wmnet/releng/tox-buster:0.4.1
[19:20:43] (PS3) Ebernhardson: jjb: Pass shellcheck at severity=critical [integration/config] - https://gerrit.wikimedia.org/r/721889
[19:21:13] (PS1) Hashar: jjb: bump image for integration-config-shellcheck [integration/config] - https://gerrit.wikimedia.org/r/721892
[19:21:27] ebernhardson: I have updated the job
[19:21:40] (CR) Ebernhardson: "check experimental" [integration/config] - https://gerrit.wikimedia.org/r/721889 (owner: Ebernhardson)
[19:21:45] (CR) Hashar: [C: +2] "Job updated" [integration/config] - https://gerrit.wikimedia.org/r/721892 (owner: Hashar)
[19:22:01] and I think that is all for tonight
[19:22:34] hashar: thanks!!
[19:23:45] (Merged) jenkins-bot: jjb: bump image for integration-config-shellcheck [integration/config] - https://gerrit.wikimedia.org/r/721892 (owner: Hashar)
[19:24:36] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), Cloud-VPS, Patch-For-Review, cloud-services-team (Kanban): Support Cinder for CI docker workers - https://phabricator.wikimedia.org/T277078 (Krinkle) I've done the following: * Deleted agent-qemu-1002 which we both did va...
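To make the templating point above concrete, here is a small sketch of why the linter sees a constant, and how the `foo="{foo}"` convention changes what it sees. Python's `str.format` is used here only as a stand-in for jjb's placeholder expansion, and the template strings and the `expand` helper are invented for the example.

```python
# Placeholder used directly inside the test expression.
INLINE_TEMPLATE = 'if [ -n "{foo}" ]; then echo "building for {foo}"; fi'

# Placeholder assigned once to a shell variable, which is then tested.
VARIABLE_TEMPLATE = (
    'foo="{foo}"\n'
    'if [ -n "$foo" ]; then echo "building for $foo"; fi'
)


def expand(template: str, **params: str) -> str:
    """Substitute {name} placeholders, standing in for template expansion."""
    return template.format(**params)


if __name__ == "__main__":
    # After expansion the test is against a literal string, which a linter
    # can only read as a condition that always goes the same way.
    print(expand(INLINE_TEMPLATE, foo="debian"))
    print()
    # Here the literal appears only in an assignment; the condition itself
    # still tests a variable, so it no longer looks constant.
    print(expand(VARIABLE_TEMPLATE, foo="debian"))
```

The second form also keeps every template parameter visible at the top of the snippet, which is the argument made in the chat.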
[19:34:26] Continuous-Integration-Config, ProofreadPage, Patch-For-Review: Get ProofreadPage Lua tests working in CI - https://phabricator.wikimedia.org/T289465 (Inductiveload) p:Triage→High I think this is fairly important to get working if we can.
[19:53:26] ebernhardson: might want to add the output of `shellcheck --version` . I think we get 0.5.0 with Buster
[19:53:47] https://packages.debian.org/search?keywords=shellcheck shows buster-backports has 0.7.1 which would be an improvement
[19:54:12] but that will be for later. I am off to bed
[20:02:09] (PS1) Ebernhardson: dockerfile: [tox-buster] Install shellcheck from buster-backports [integration/config] - https://gerrit.wikimedia.org/r/721895
[20:17:44] James_F: how are you writing the changelogs? I imagine not by hand
[20:28:46] ebernhardson: There’s a Debian tool that is used as part of docker-pkg. See the README for the docker-pkg update command.
[20:29:31] James_F: thx
[21:42:51] (PS2) Ebernhardson: dockerfile: [tox-buster] Install shellcheck from buster-backports and cascade [integration/config] - https://gerrit.wikimedia.org/r/721895