[08:02:21] (CR) Kosta Harlan: [C: +2] Clean-up: remove old parallel_run [integration/quibble] - https://gerrit.wikimedia.org/r/587890 (owner: Awight)
[08:02:39] (PS5) Kosta Harlan: phpbench: Support aggregate reports [integration/quibble] - https://gerrit.wikimedia.org/r/741974 (https://phabricator.wikimedia.org/T291549)
[08:10:25] (PS1) Kosta Harlan: ci-full-run: Add an extension to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381
[08:13:37] kostajh: The further in we go, the more convinced I'm becoming that "ParallelCommand" was not quite the right abstraction...
[08:13:53] I think a dependency tree might be more appropriate.
[08:14:26] (CR) jerkins-bot: [V: -1] ci-full-run: Add an extension to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381 (owner: Kosta Harlan)
[08:16:54] Along the same lines, starting to think that the fork points would be much simpler if we could use copy-on-write semantics... if that could magically span the instantiated DB as well...
[08:18:35] (Merged) jenkins-bot: Clean-up: remove old parallel_run [integration/quibble] - https://gerrit.wikimedia.org/r/587890 (owner: Awight)
[08:18:55] Something like your docker concept: phase 1 is that we check out the single repo and run linters, phase 2 is to set up the full environment with dependencies and a MW install; but build this as a docker image, and then run the remaining tests on top of several instances of that container...
[08:19:56] I guess it comes down to the concrete trade-offs, how much memory each container would require, vs. how much extra processing and disk churn to recreate the base state in a single container.
[08:20:26] I should stop letting "hunches" guide my thinking because I have no feel for what's happening on VPSes.
[08:21:07] Random thought: we add telemetry to sample resource usage and elapsed time for each Command.
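The telemetry idea could look roughly like the sketch below. It is not Quibble code: `sampled` and the `samples` list are hypothetical names, and it relies on the Unix-only standard library `resource` module to read CPU time and peak memory of child processes around one step.

```python
import resource
import time
from contextlib import contextmanager


@contextmanager
def sampled(name, samples):
    """Record wall-clock time and child-process resource usage for one step."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    try:
        yield
    finally:
        after = resource.getrusage(resource.RUSAGE_CHILDREN)
        cpu_before = before.ru_utime + before.ru_stime
        cpu_after = after.ru_utime + after.ru_stime
        samples.append({
            "command": name,
            "elapsed_s": time.monotonic() - start,
            "child_cpu_s": cpu_after - cpu_before,
            # ru_maxrss is a high-water mark (kilobytes on Linux), not a delta.
            "child_max_rss": after.ru_maxrss,
        })


samples = []
with sampled("example step", samples):
    time.sleep(0.01)  # stand-in for running one Command

print(samples)
```

Because `RUSAGE_CHILDREN` only counts children that have been waited for, numbers gathered this way would be approximate when several steps run concurrently.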
[08:22:51] (PS2) Kosta Harlan: ci-full-run: Add an extension to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381
[08:26:44] (CR) jerkins-bot: [V: -1] ci-full-run: Add an extension to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381 (owner: Kosta Harlan)
[08:27:42] (CR) Awight: [C: +2] "Worth noting that our full-run will have to be either core or an extension, and the code paths diverge in many places. But adding an exte" [integration/quibble] - https://gerrit.wikimedia.org/r/757381 (owner: Kosta Harlan)
[08:31:24] (CR) jerkins-bot: [V: -1] ci-full-run: Add an extension to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381 (owner: Kosta Harlan)
[08:42:37] (PS3) Kosta Harlan: ci-full-run: Add an extension to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381
[08:44:47] awight: I think ParallelCommand is still compatible with the other idea you've described. (like you might want to run the linters in parallel in step 1.) But yeah, I think there are advantages to a staged approach for running the tests, so we can reuse assets generated in a previous stage
[08:48:05] for GitClean, the idea is that when quibble runs for an extension it clones mediawiki/core, vendor, the extension and its dependencies
[08:48:41] then one of the first steps is to run the linters for the extension or skin.
So we do composer/npm install in order to run the linters, which installs a bunch of packages there
[08:49:02] (CR) jerkins-bot: [V: -1] ci-full-run: Add an extension to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381 (owner: Kosta Harlan)
[08:49:17] after the lint stage is done, we want to get rid of those composer and npm packages cause that is not how they should be shipped in reality
[08:49:53] a) for composer packages of an extension, they must be installed from mediawiki/core using the composer merge plugin which would resolve the set of dependencies across core + all the extensions enabled. So we can't do a composer install in each of the extensions
[08:50:19] b) for npm packages, we don't support npm. The JavaScript libs have to be vendored (either by git adding them or using a build script which is what Wikibase is doing afaik)
[08:50:40] so the GitClean after lint is to ensure we never borrow dependencies directly from npm
[08:51:06] and the extension composer dependencies get installed later when running composer install from mediawiki/core after having created a composer.local.json which is used by the composer merge plugin
[08:52:44] awight: the CI Jenkins is indeed not configurable by people. The rules can't even be seen unless you are an admin (they are at https://integration.wikimedia.org/ci/configure )
[08:52:55] awight: the plugin rule for Quibble is a catch all
[08:53:02] Section start: `^INFO:quibble.commands:>>> Start: ((?!User commands: mediawiki-fresnel-patch)[^{,]+).*`
[08:53:13] Section end: `^INFO:quibble.commands:<<< Finish: .+, in .+ s`
[08:53:20] hashar: Strange that it doesn't catch sections in our quibble ci-fullrun-job, then.
[08:53:26] Section name: `{1}`
[08:53:37] the name is thus whatever the user command is and shows up as a header
[08:53:51] off to commute my kid back to school.
Be back in ~ 15 mins
[08:54:14] eg, https://integration.wikimedia.org/ci/job/integration-quibble-fullrun/608/consoleFull
[08:54:26] 20:11:56 INFO:quibble.commands:>>> Start: Extension and skin submodule update under MediaWiki root /workspace/src
[08:59:38] (PS4) Kosta Harlan: ci-full-run: Add an extension to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381
[09:01:31] awight: ^ I added a depends-on to a GrowthExperiments patch that I think should fix the ci-full-run failure
[09:02:11] however I've also changed the patch to set TEST_PROJECT to an extension, which ends up running different code paths than when TEST_PROJECT is mediawiki/core... I'll experiment with this
[09:04:44] (PS5) Kosta Harlan: ci-full-run: Add an extension to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381
[09:09:32] (CR) jerkins-bot: [V: -1] ci-full-run: Add an extension to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381 (owner: Kosta Harlan)
[09:10:52] oh right, depends-on doesn't really work with quibble's ci-full-run
[09:12:59] > some Selenium tests are already relying on Apache
[09:13:15] this is a surprising development. What makes them dependent?
[09:13:33] awight: OH maybe because the fullrun job has ansi coloring for the quibble.commands logging lines and the regex does not take those ansi escapes into account
[09:14:06] (PS6) Kosta Harlan: [WIP] ci-full-run: Add an extension to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381
[09:16:33] So I guess the regex just needs to account for the escapes?
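A minimal sketch of a section-start pattern that tolerates color escapes, written with Python's `re` rather than in the Jenkins plugin (whose regex dialect may differ). The exact escape sequences Quibble emits are an assumption here; the sample lines and the `ANSI` and `START` names are made up for illustration.

```python
import re

# Zero or more SGR color sequences such as "\x1b[32m" or "\x1b[0m".
ANSI = r"(?:\x1b\[[0-9;]*m)*"

START = re.compile(
    r"^" + ANSI + r"INFO" + ANSI + r":quibble\.commands:>>> Start: "
    r"((?!User commands: mediawiki-fresnel-patch)[^{,]+).*"
)

# Hypothetical log lines: one plain, one with the level name colored.
plain = "INFO:quibble.commands:>>> Start: Install composer dev-requires"
colored = "\x1b[32mINFO\x1b[0m:quibble.commands:>>> Start: Install composer dev-requires"

for line in (plain, colored):
    match = START.match(line)
    print(match.group(1) if match else None)
```

Matching the escapes explicitly keeps the pattern anchored at the start of the line, at the cost of assuming the colors only wrap the level name.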
[09:18:02] https://integration.wikimedia.org/ci/job/integration-quibble-fullrun/608/consoleText
[09:19:38] awight: https://regexr.com/6e2ci
[09:19:53] the regex I have entered does (start pattern|end pattern)
[09:19:58] as taken from the jenkins config
[09:20:13] there are four tests, two with pure text, two with ansi sequences
[09:21:29] (CR) jerkins-bot: [V: -1] [WIP] ci-full-run: Add an extension to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381 (owner: Kosta Harlan)
[09:22:57] awight: the tests that rely on apache are the ones that use visualeditor to save edits to existing pages. However, those should check if QUIBBLE_APACHE env is set before being run
[09:23:38] (IRC is terrible for threaded conversations)
[09:24:24] (PS7) Kosta Harlan: [WIP] ci-full-run: Add extensions to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381
[09:26:02] hashar:
[09:26:08] hashar: cat consoleText | grep -E '^[^:]*INFO[^:]*:quibble.commands:>>> Start'
[09:26:22] oh
[09:26:26] [^:] is rad
[09:26:39] <3
[09:27:40] then if .+ is not greedy we can use it
[09:28:36] https://regexr.com/6e2ci updated
[09:28:46] which has:
[09:28:48] `^(\u001B.+)?INFO(\u001B[^:]+)?:quibble.commands:>>> Start: ((?!User commands: mediawiki-fresnel-patch)[^{,]+).*`
[09:28:58] `^(\u001B.+)?INFO(\u001B[^:]+)?:quibble.commands:<<< Finish: .+, in .+ s`
[09:31:18] Quibble, Continuous-Integration-Infrastructure, Jenkins: integration-quibble-funrun job console does not have collapsible sections - https://phabricator.wikimedia.org/T300112 (hashar)
[09:31:33] I should make my little burst of quibble energy transparent on my team's work board, and was wondering which related task we consider a high priority.
[09:32:03] Quibble: Quibble concurrent capture tests fail on MacOS - https://phabricator.wikimedia.org/T299840 (awight) p:Triage→Low
[09:32:16] Quibble, Continuous-Integration-Infrastructure, Jenkins: integration-quibble-funrun job console does not have collapsible sections - https://phabricator.wikimedia.org/T300112 (awight) p:Triage→Low
[09:33:34] Quibble, Continuous-Integration-Infrastructure, Jenkins: integration-quibble-funrun job console does not have collapsible sections - https://phabricator.wikimedia.org/T300112 (hashar) a:hashar The reason is that most of the jobs have Quibble to autodetect coloring and they end up with no coloring...
[09:34:29] awight: there is one about LocalSettings.php not being deleted between runs which is quite annoying when playing with quibble locally
[09:34:41] might not be a high prio though :]]]
[09:35:08] Quibble, Unplanned-Sprint-Work, User-awight, WMDE-TechWish-Sprint-2022-01-19: Use interruptable parallelism - https://phabricator.wikimedia.org/T234309 (awight) Open→Resolved a:awight
[09:36:10] Quibble, Unplanned-Sprint-Work, User-awight, WMDE-TechWish-Sprint-2022-01-19: Use interruptable parallelism - https://phabricator.wikimedia.org/T234309 (awight) Resolved→Open
[09:36:22] Quibble, Unplanned-Sprint-Work, User-awight, WMDE-TechWish-Sprint-2022-01-19: Use interruptable parallelism - https://phabricator.wikimedia.org/T234309 (awight) a:awight→None
[09:38:10] (CR) jerkins-bot: [V: -1] [WIP] ci-full-run: Add extensions to full run [integration/quibble] - https://gerrit.wikimedia.org/r/757381 (owner: Kosta Harlan)
[09:40:25] how fragile our phpunit tests are :\
[09:41:55] and of course `\x1B` is not recognized bah
[09:42:03] awight: not exactly quibble related, but are you familiar with the wikibase tests, and do you know why this failure happens at 10:28.517 ?
https://integration.wikimedia.org/ci/job/integration-quibble-fullrun/619/console
[09:42:36] I noticed the same issue when trying to parallelize PHPUnit tests. As soon as you start executing tests outside the currently configured order in CI, weird things start to happen
[09:50:32] oh
[09:52:24] Quibble, Continuous-Integration-Infrastructure, Jenkins: integration-quibble-funrun job console does not have collapsible sections - https://phabricator.wikimedia.org/T300112 (hashar) Open→Resolved Somehow `\u001B` is not recognized so I went with `.`, in the end I have applied: | Start | ^(...
[09:52:36] Quibble: Switch QUnit tests to use Apache backend - https://phabricator.wikimedia.org/T299491 (awight) Noting that some qunit tests e.g. in MediaWiki extensions are already using Apache, but at least the mediawiki/core repo is still launching a PHP standalone server.
[09:52:45] awight: sections are available when colored logging is used https://integration.wikimedia.org/ci/job/integration-quibble-fullrun/609/consoleFull
[09:53:06] Right on!
[09:53:29] It's even nesting... wow
[09:55:17] I am not sure what would happen though when the finish lines are out of order
[10:00:26] nice
[10:00:38] shouldn't the "Start backends" part be back at root level indentation though?
[10:00:50] https://integration.wikimedia.org/ci/job/integration-quibble-fullrun/609/consoleFull#console-section-14
[10:10:35] Quibble, Patch-For-Review, Unplanned-Sprint-Work, WMDE-TechWish-Sprint-2022-01-19: Switch QUnit tests to use Apache backend - https://phabricator.wikimedia.org/T299491 (awight)
[10:12:23] Quibble, MediaWiki-Core-Tests, Code-Health, Continuous-Integration-Config, Patch-For-Review: Quibble: Run PHPUnit databaseless and database stages in parallel - https://phabricator.wikimedia.org/T235449 (awight) a:awight→None
[10:18:18] Quibble: Quibble should backup and delete LocalSettings.php - https://phabricator.wikimedia.org/T218647 (awight) >>!
In T218647#5640971, @Niedzielski wrote: > I've been away for awhile so please excuse this comment if it doesn't make sense, but did you consider a "`--deleteLocalSettings`" (or similar) flag?...
[10:22:20] Quibble, Continuous-Integration-Infrastructure, Continuous-Integration-Config, Patch-For-Review: Run linters before starting longer running jobs - https://phabricator.wikimedia.org/T297561 (awight) Should this task be stalled, pending the discussion in https://gerrit.wikimedia.org/r/c/integration...
[10:22:57] kostajh: yeah indentation seems to be accidental, the "*" triggers some sort of markdown-like logic in jenkins or the collapsing section plugin.
[10:23:51] IMO we don't need any nesting for parallel command logs, they can be interpreted as if they were run sequentially.
[10:29:15] kostajh: I see, yes that makes sense to skip the VE editing tests if we can detect they'll fail for reasons unrelated to the patch being tested. I suppose this limitation will go away with multithreaded php-standalone, soon enough.
[10:33:09] kostajh: sorry for answering threads backwards. Finally took a look at the wikibase etc. failures you linked, it's quite an impressive pile of assumptions that break!
[10:33:35] But it seems like we can fix incrementally as we did with browser tests.
[10:34:22] Only 18 tests to fix (for now) out of 43k is not a bad head start.
[10:34:32] I'm sure more will emerge later.
[10:36:30] I misunderstood. The quibble patch being tested doesn't add any parallelism to phpunit, so why is the db so badly contaminated?
[10:41:50] Because our tests are fragile, and work only when executed with the exact configuration run in CI.
Since they are not run in the usual way, there’s probably some test that is relying on global state that has been modified in unexpected ways
[10:44:14] Quibble, Patch-For-Review, User-zeljkofilipin: Run Parsoid service in quibble - https://phabricator.wikimedia.org/T218534 (awight) Open→Resolved Looks like this is done.
[10:52:26] Another example of how ParallelCommand isn't quite right: phpunit:unit can run before install-mediawiki, but phpunit:dbless and phpunit:standalone need LocalSettings.php. This makes a nice tree, where phpunit:unit branches off and installation proceeds in parallel, then the other two phpunit suites run in parallel. But it isn't expressed nicely using explicit parallel groups.
[11:09:19] Quibble: Quibble initialize step should only clone the target repository - https://phabricator.wikimedia.org/T211702 (awight) Open→Stalled Deploying a naive patch for this caused problems described in the revert, e19c2a96e998d. The implementation needs to be planned better before trying again. Mayb...
[11:10:57] Quibble, Continuous-Integration-Infrastructure, Continuous-Integration-Config, Patch-For-Review: Run linters before starting longer running jobs - https://phabricator.wikimedia.org/T297561 (kostajh) >>! In T297561#7651941, @awight wrote: > Should this task be stalled, pending the discussion in ht...
[11:11:02] Quibble: Quibble initialize step should only clone the target repository - https://phabricator.wikimedia.org/T211702 (awight) It's still a motivating problem: cloning all dependencies takes 50 seconds of early startup, which must finish before any tests can be run.
[11:22:34] (PS1) Kosta Harlan: [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411
[11:22:46] awight: perhaps, yeah. I made ^ while thinking about that. It is awkward indeed.
[11:23:33] I think we could try to split up cmd.py into different phases (methods?)
so it's easier to reason about what is happening where/when
[11:25:35] (CR) jerkins-bot: [V: -1] [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411 (owner: Kosta Harlan)
[11:32:53] awight: how can I get the full diff when the plan YAML doesn't match? The test output says ` ...Full output truncated (2 lines hidden), use '-vv' to show` but passing -vv to tox doesn't do anything
[11:37:56] (PS2) Kosta Harlan: [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411
[11:42:17] hmm, well -v and -vv seem to do something, but they don't expand the full diff, anyway, i'll leave that for another day
[11:42:43] (CR) jerkins-bot: [V: -1] [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411 (owner: Kosta Harlan)
[11:50:08] (PS3) Kosta Harlan: [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411
[11:54:22] (CR) jerkins-bot: [V: -1] [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411 (owner: Kosta Harlan)
[12:00:28] kostajh: +1 SequentialCommand might help with rolling up steps into larger units.
[12:00:46] kostajh: I use "pytest -vv"
[12:02:17] (CR) Kosta Harlan: "The error seems to be:" [integration/quibble] - https://gerrit.wikimedia.org/r/757411 (owner: Kosta Harlan)
[12:05:20] awight: hmm, I'm using `tox -e py3-unit -v`
[12:05:47] kostajh: under the hood, afaik that calls pytest
[12:06:01] But I don't know if you can pass arbitrary cli args through.
[12:06:25] `pytest` exists globally on your system?
[12:06:32] oh cool, https://tox.wiki/en/latest/example/general.html#interactively-passing-positional-arguments
[12:06:42] kostajh: no, but it's on the path when I use venv
[12:07:28] hmm, it isn't for me
[12:07:51] I run `python setup.py install` and then `quibble` is on path, but `pytest` isn't
[12:17:02] kostajh: tox installs it under .tox/venv-py38/bin/pytest, I believe I ran `pip install pytest` locally to make it easy to run from the cli.
[12:17:20] But we can try the {posargs} technique from above--lemme push a patch for that.
[12:18:18] oh! we already have {posargs}
[12:18:31] tox -e py3-unit -- -vv
[12:18:48] TIL
[12:21:33] awight: nice, thanks
[12:26:09] (PS8) Awight: Sequence of commands as a command [integration/quibble] - https://gerrit.wikimedia.org/r/588083
[12:26:21] (CR) Awight: "PS 8: manual rebase" [integration/quibble] - https://gerrit.wikimedia.org/r/588083 (owner: Awight)
[12:54:28] Whew, I'm really struggling finding a comfortable notation for this workflow. Fun problem to have, though.
[12:55:12] (PS9) Awight: Sequence of commands as a command [integration/quibble] - https://gerrit.wikimedia.org/r/588083
[12:56:56] The interesting things are, * many steps are conditional, * several potential fan-out points, * but requires fan-in at several points before continuing.
[12:57:18] Maybe it will help to draw another version of the activity diagram, with the ideal arrangement of steps.
[12:58:09] btw I used the online editor with preview, "online server" in the sidebar of https://plantuml.com/
[13:02:25] Quibble, Continuous-Integration-Infrastructure, Jenkins: integration-quibble-funrun job console does not have collapsible sections - https://phabricator.wikimedia.org/T300112 (awight) I think that `(?:[^:]+)?` can be simplified to `[^:]*`, now?
[13:13:39] awight: I mean, one thing is that we may not be wanting to do this parallelization in quibble at all. you could have job 1 which clones repos and does nothing more.
jobs 2, 3, 4 run in parallel and install MW, run composer/npm test, and run phpunit:unit. jobs 5, 6, 7, 8 run the different PHPUnit flavors, API Testing, Selenium. Each reusing the assets generated in the previous phase
[13:19:42] kostajh: +1 this is definitely an unresolved tension in the architecture, the responsibilities of jjb + jenkins + quibble overlap quite a bit.
[13:38:41] awight: do you have an idea what is the problem in https://gerrit.wikimedia.org/r/c/integration/quibble/+/757411/3#message-85e95b9d2a8f5aa189c54e59ce9840127a48161b ?
[14:20:00] kostajh: I haven't been able to look in depth, but it certainly seems interesting...
[14:22:25] I'm hoping it's just that a sad path coming back from one of the child processes needs to handle exceptions differently.
[15:11:41] awight: multiprocessing attempts to serialize a thread object to pass to the pool worker
[15:11:53] and it looks like the thread object can't be serialized
[15:12:29] probably one of the backends such as the database backend which has a thread streaming the server output to logging
[15:12:58] based on https://stackoverflow.com/a/8805244/639804 :]
[15:14:38] I think we passed the backend solely to retrieve the parameters such as username or the uri to reach it
[15:15:46] and +1 on kostajh's remarks regarding multiple jobs doing essentially the same step. The clone and install dependencies one being the obvious one
[15:16:27] or a change might well fail the lint step (which can be detected by cloning solely the repository independently of quibble)
[15:16:34] but CI would still run all the other heavy jobs
[15:16:46] I might have filed a task about that
[15:16:57] one possibility is to have a fast lint job that does git clone / npm test / composer test
[15:17:11] and once that one has run, run the other jobs.
Which in Zuul can be defined with something such as:
[15:17:13] test:
[15:17:16]   - lint-job
[15:17:19] ERRR
[15:17:21] test:
[15:17:25]   - lint-job:
[15:17:34]     - heavy-job1
[15:17:36]     - heavy-job2
[15:17:38]     - heavy-job3
[15:17:38] etc
[15:17:58] which is exactly what we do for integration/quibble which runs the tox job first and only if it passed the fullrun will be triggered
[15:18:38] but in our Zuul layout config we combine jobs coming from different templates and I don't think we can define such a hierarchy in templates and have different trees of jobs sharing the same root be merged together
[15:19:23] an alternative is to add a `lint` pipeline in Zuul which votes on a new label in Gerrit, Lint +1
[15:19:33] and when CI votes Lint +1 that would trigger the test pipeline
[15:20:10] so send patch > lint pipeline > lint tests > vote Lint +1 in Gerrit > Zuul receives the event and adds the change to test > rest of the flow
[15:25:12] kids commuting &
[20:33:37] re: thread error when I run it locally I get `AttributeError: module 'quibble.mediawiki' has no attribute 'maintenance'`
[20:39:08] kostajh: :-\
[20:41:18] I am off
[20:41:27] hopefully will be able to allocate time to Quibble stuff tomorrow!
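The serialization problem discussed at 15:11 can be reproduced in isolation. The sketch below is not Quibble code: `FakeBackend` is a hypothetical stand-in for a backend that keeps a log-streaming thread, and `pickle` is used directly because that is what `multiprocessing` relies on to hand arguments to pool workers.

```python
import pickle
import threading


class FakeBackend:
    """Hypothetical backend that keeps a thread for streaming server output."""

    def __init__(self, uri):
        self.uri = uri
        # Never started; merely holding the Thread object breaks pickling.
        self.log_thread = threading.Thread(target=print)


backend = FakeBackend("http://localhost:8080")

try:
    pickle.dumps(backend)
    picklable = True
except Exception as err:  # the exact exception type varies by Python version
    picklable = False
    print(type(err).__name__, err)

# Passing only the plain values a worker needs avoids the problem.
params = {"uri": backend.uri}
restored = pickle.loads(pickle.dumps(params))
print(restored)
```

This matches the suggestion in the log: if the backend is only passed along to read a username or URI, handing over those values directly would avoid serializing the thread at all.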
[21:32:38] (PS4) Kosta Harlan: [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411
[21:36:52] (CR) jerkins-bot: [V: -1] [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411 (owner: Kosta Harlan)
[21:54:45] (PS5) Kosta Harlan: [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411
[21:57:36] (PS6) Kosta Harlan: [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411
[22:00:57] (CR) jerkins-bot: [V: -1] [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411 (owner: Kosta Harlan)
[22:06:28] (PS7) Kosta Harlan: [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411
[22:09:15] (CR) jerkins-bot: [V: -1] [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411 (owner: Kosta Harlan)
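The dependency-tree idea raised at 08:13 and 10:52 can be sketched with the standard library `graphlib` module (Python 3.9+). This is only an illustration of the shape, not how Quibble plans its steps; the step names are borrowed loosely from the conversation and `clone` is a made-up root step.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it must wait for.
deps = {
    "phpunit:unit": {"clone"},
    "install-mediawiki": {"clone"},
    "phpunit:dbless": {"install-mediawiki"},
    "phpunit:standalone": {"install-mediawiki"},
}

sorter = TopologicalSorter(deps)
sorter.prepare()

waves = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # steps in one wave could run in parallel
    waves.append(ready)
    sorter.done(*ready)

print(waves)
```

Grouping into waves still forces a fan-in after each group. A scheduler that calls `done()` as each step finishes would let `phpunit:dbless` start as soon as `install-mediawiki` completes, even while `phpunit:unit` is still running, which is the part explicit parallel groups express poorly.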