[07:11:20] (PS8) Kosta Harlan: [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411
[07:15:01] (CR) jerkins-bot: [V: -1] [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411 (owner: Kosta Harlan)
[10:02:29] ^ new failure I don't understand; it works fine locally :\
[10:07:43] (PS9) Kosta Harlan: [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411
[10:11:28] (CR) jerkins-bot: [V: -1] [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411 (owner: Kosta Harlan)
[10:14:16] oh, fun
[10:14:41] I am trying to get the dbserver from the MySQL database_backend variable, but that hasn't been evaluated yet, of course
[10:20:00] so, I need a way to get the value of dbserver after MySQL (or other DB) has started, and be able to pass that to the InstallMediaWiki command. awight do you know how I might go about doing that?
[10:41:09] (sorry, in a meeting for a bit longer and then I can look)
[11:13:52] kostajh: The issue is that db installation is done in a bg thread after this patch, and we need some data returned from it?
[11:15:04] awight: in `master`, the DB backend starts and then the `db` object is passed to InstallMediaWiki; InstallMediaWiki then reads `db.dbserver` which contains the socket that was created when the DB backend started
[11:15:17] Maybe this isn't worth the trouble--starting the db backend takes 3s and installmediawiki takes 2.5s, btw
[11:15:50] in the patch, the DB backend starts, but then InstallMediaWiki is run in parallel with npm/composer test and phpunit:unit. We can't pass in the db object because of some issues with pickle and serializing a thread object
[11:16:47] ha, fair. I remember it being longer but you're right
[11:16:51] I see, yeah the dbserver is a socket path under the tmp dir
[11:17:12] with the current arrangement of code in backend.py, we can't calculate it until late in the startup
[11:17:30] We could move the tmpdir step to initialization perhaps, and pass that *into* the db backend.
[11:17:44] Seems that the best outcome would be to make the db object serializable, though.
[11:20:17] okay so self.server is the problem. mebbe we can omit that from serialization.
[11:24:04] This looks like a way to accomplish it: https://stackoverflow.com/questions/2345944/exclude-objects-field-from-pickling-in-python/2345985
[12:42:19] kostajh: I'll give this a try in a follow-up patch.
[13:00:14] oops, most of the patch is unwinding the db_config change. Should be squashed, if it works.
[13:03:50] awight: thanks!
[13:04:59] that seems to have worked
[13:05:01] kostajh: Looks like it's holding up to CI, so far :+1:
[13:05:23] I'll leave for you to squash, then
[13:06:20] (PS10) Kosta Harlan: [WIP] Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411
[13:08:02] done, thx
[13:08:12] btw, I realized that most of the Parallel machinery will be reusable by a more flexible dependency-tree visitor. "waiting for" becomes a progress report about the entire job (e.g. 7/23 steps completed), and the stream capture deinterleaves command output from each child.
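Note on the pickling fix discussed above (11:20:17 to 11:24:04): a minimal sketch of leaving one attribute out of serialization with `__getstate__`, which is the approach the linked Stack Overflow answer describes. The class name, the socket path, and the use of a lock as the unpicklable attribute are assumptions made for illustration; this is not quibble's actual backend code.

```python
import pickle
import threading


class ExampleBackend:
    """Hypothetical stand-in for a DB backend; not quibble's real class."""

    def __init__(self, socket_path):
        # Plain data that another process needs (the socket path).
        self.dbserver = socket_path
        # Stand-in for the attribute that cannot be pickled
        # (self.server in the discussion); a lock is unpicklable too.
        self.server = threading.Lock()

    def __getstate__(self):
        # Copy the instance dict and blank out the unpicklable field,
        # so pickle only ever sees serializable values.
        state = self.__dict__.copy()
        state["server"] = None
        return state


db = ExampleBackend("/tmp/example/mysqld.sock")  # hypothetical path
restored = pickle.loads(pickle.dumps(db))
print(restored.dbserver)
print(restored.server)
```

The restored copy keeps `dbserver` but has no live server handle, so it is only good for reading connection details, which is all the install step was described as needing.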
[13:09:08] kostajh: I'm wondering if this is a collaborative editor or if the URL just prepopulates each visitor's browser independently:
[13:09:11] https://www.plantuml.com/plantuml/umla/dLEnafmm3Etz5IgXoGVOHQv8cxacCwcD4Qm68tEMVKd-FZPWShEJtUAssB6bz_HuJ_0YoSQKLw_sWzBE1qQmtaF4BOZgVhn-UzLmbDCDGOXdZtiN9egIUgDeWYwY1FzU6s-PokLhN-4CtH-KNW7eUu1Hw0MXuz0hv94cfVCsseGWULZ3c3qA40F-JiX2WGKiZo0BiHoI9E12n4kfNlfpwiM247TEVOOfn4L1-MaNrk8E8n1Bgcw9TqlchfrGaZcPO6TBV01uMnzGQwUnYgscgMTAu3nQDgVRwk3IaP2ZkvnSzJqu5x6cqcfVV73M6_KV5otIzw8G3l9nl00jwUssPQQawvzHUby-3DOjfI-b
[13:09:18] qBnznv7iJrh8g7lFDMtpRz8uU1JvK2X07gqFVA86P7TxjhXd-_d4Nfxju-4xSiJJyeZXLnCZdZ_cyrOjddErtQWTSyMBYxIb0difwFfjgRy0
[13:09:23] (oh that sort of answers the question)
[13:11:13] I think it's not collaborative. Maybe some time we have a meeting where we try to draw a few optimized process diagrams.
[13:21:51] sure. or it might be a good start to write down which jobs exist and where there is duplication of effort (cloning repos, running composer/npm install) to see if we can use dependent jobs to 1) reduce duplication and 2) implement parallelization
[13:37:14] (CR) Awight: [C: +1] "Looks like it should give a small speed gain: the three steps might even be complementary in using cpu, network, and disk, but unfortunate" [integration/quibble] - https://gerrit.wikimedia.org/r/757411 (owner: Kosta Harlan)
[13:39:30] (CR) Kosta Harlan: [WIP] Run post-dependency install, pre-test steps in parallel (1 comment) [integration/quibble] - https://gerrit.wikimedia.org/r/757411 (owner: Kosta Harlan)
[14:06:49] I'll look at T211702 again, I started to suspect there's something we might be able to do with ZuulCloner
[14:12:03] Another thing to consider when auditing the workflow is that we've already split it up into different segments as separate zuul jobs, many overlapping.
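Note on the parallel steps and the "N/M steps completed" progress idea: a rough sketch, assuming each step is an independent command whose output can be captured and printed as one block when it finishes. The step names and commands are placeholders, and this uses the standard library's `concurrent.futures` rather than whatever the patch itself implements.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_step(name, argv):
    # Capture stdout per child so output from different steps never interleaves.
    proc = subprocess.run(argv, capture_output=True, text=True)
    return name, proc.returncode, proc.stdout


def run_parallel(steps):
    """Run all steps concurrently; report progress as each one finishes."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        futures = [pool.submit(run_step, name, argv) for name, argv in steps.items()]
        for done, future in enumerate(as_completed(futures), start=1):
            name, code, output = future.result()
            results[name] = (code, output)
            # Progress line for the whole job, then that step's output as one chunk.
            print("[%d/%d] %s exited with %d" % (done, len(steps), name, code))
            print(output, end="")
    return results


# Placeholder commands standing in for real steps.
steps = {
    "step-a": [sys.executable, "-c", "print('a1'); print('a2')"],
    "step-b": [sys.executable, "-c", "print('b1')"],
}
results = run_parallel(steps)
```

Buffering per step keeps the log readable at the cost of showing nothing from a step until it exits; streaming with a per-line prefix would be the alternative.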
[14:26:02] (PS11) Kosta Harlan: Run post-dependency install, pre-test steps in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/757411 (https://phabricator.wikimedia.org/T225730)
[15:23:41] kostajh: Very hard to say yet, but at 11m06s you may have shaved a minute from runtime?
[15:34:15] hmm, yeah hard to say without aggregating a bunch of runs
[16:01:44] looking for documentation about phpunit-standalone...
[16:01:56] ah here https://www.mediawiki.org/wiki/Manual:PHP_unit_testing/Writing_unit_tests_for_extensions
[16:02:28] okay these are excluded from the gated test
[16:26:59] kostajh: here's a wacky diagram, the forks are a bad notation for what I'm trying to do: shorturl.at/bovzY
[16:27:34] The basic idea is that very few of our tests should actually be waiting on other tests, only the linters should be "blocking"
[16:37:46] hashar: ^ you might be interested in that shorturl as well
[16:38:39] awight: link is broken :D apparently
[16:39:03] the `@group Standalone` got introduced to let developers exclude some heavy tests
[16:39:08] I think we could represent this even with the basic "Parallel", if the bigger branches are rolled up into a sequence like "browser tests"
[16:39:20] so yeah they run only for changes proposed to that repo
[16:39:51] very strange, the short url lasted only a few minutes
[16:41:58] that UML diagram looks a bit like the doc I painted before implementing quibble
[16:42:18] I guess we can generate one based on the build plan
[16:42:26] hashar: here's the diagram I was trying to send, https://gerrit.wikimedia.org/r/c/integration/quibble/+/757682
[16:43:00] +1 an automatic diagram would be great, but I'm also using it to document the desired future state
[16:43:16] +1
[16:44:47] http://blockdiag.com/en/seqdiag/examples.html is quite nice
[16:44:57] there is a sphinx plugin for it
[16:45:21] so then the graph is generated on the fly when building the doc and ends up incorporated in the Sphinx generated doc
[16:45:23] cool, I'm happy with any language
[16:45:49] +1 translating the test plan fixtures to a diagram would be fun
[16:46:09] What do you think about all the "background threads" introduced in my proposal?
[16:48:45] background threads?
[16:49:36] the thing is that on top of that workflow
[16:49:39] all the forking in my "ideal" diagram results in a single thread going forward, and the others finishing in the background (leads to "X")
[16:49:52] we have multiple jobs running various variants of the flow due to --run / --skip
[16:50:00] or jobs using different environments
[16:51:33] Yeah I think some of this is being implemented at the zuul level, mostly it can stay like that but a structure like this can still be used internally, so e.g. if you run qunit and selenium tests together they'll be in parallel...
[16:51:53] i.e., there can be an "if" around most of these steps
[16:52:31] phpunit:unit should be quite fast, for phpunit:dbless that has to happen before the database / mediawiki install
[16:52:45] then for the others I don't quite know :]
[16:53:21] one of the problems I faced is that we have the CI jobs for php 7.2 / 7.4 etc
[16:53:40] and we repeat a lot of common tasks between the various flavors
[16:53:43] Not sure I understand that last point about the phpunits--they can both run before the MW install? or maybe they need LocalSettings.php
[16:53:56] typically cloning the repo / installing the dependencies. That should probably be done only once in a root job
[16:54:10] then the resulting artifacts get dispatched to jobs having different php
[16:54:15] +1 I would love to find a way to do this
[16:54:31] phpunit:unit doesn't need anything installed
[16:54:49] :dbless I don't know, maybe it requires some LocalSettings and has some logic to avoid hitting the db
[16:54:52] can't remember :]
[16:54:55] then
[16:54:59] Does it make sense to even build a docker image for each patch, and use that as the environment to run the php* in separate containers?
[16:55:29] no idea
[16:55:37] hehe that's how I feel.
[16:55:38] building the docker image is a bit of a pain though
[16:55:48] and we would have to push them to a registry
[16:56:21] What do you think about adding some instrumentation to quibble, allowing more fine-grained analysis of our jobs?
[16:56:34] fine grained analysis?
[16:57:26] if we could have a sequence diagram representing the CI jobs and the Quibble tasks run by each of those jobs, that can surely be helpful
[16:57:41] I was thinking, like sending timestamps to statsd as quibble runs each step, then we can see that step X takes so much time for some repos, etc.
[16:57:54] ah possibly
[16:58:15] though the runtime varies a lot between instances or repositories/branch
[16:58:48] anyway I am sorry I have to leave
[16:58:54] gotta prepare dinner and take care of the kids :D
[16:58:57] totally, and the variation might be interesting? although most of that is just how busy machines are, good point
[16:59:03] same. I'm being mobbed o/
[16:59:19] mobbed?
[16:59:52] anyway that UML graph is a great basis :]
[17:00:19] the original one I made is https://docs.google.com/drawings/d/1PYTo8sMPIZ2CSRdDQ7qNcGpLmDOgmNbGeHYdYAUbebM/edit
[17:00:23] dinner halfway prepped and kids are excited
[17:00:30] (which is all serialized execution)
[17:00:33] yes! I was inspired by that one
[17:00:33] ahah good
[17:00:53] going to get my kids excited as well. Have to figure out a dinner plan :D
[17:01:04] see you tomorrow!
[17:31:03] phpunit:dbless actually depends on the database in various ways.
[17:41:17] ty
[21:43:45] (CR) Krinkle: "How does parallel handle output, does it buffer them internally and then flush as whole chunks whenever one of them finishes? Or continous" [integration/quibble] - https://gerrit.wikimedia.org/r/757411 (https://phabricator.wikimedia.org/T225730) (owner: Kosta Harlan)
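Note on the instrumentation idea from 16:57:41: a small sketch of timing each step and formatting the result as statsd timer lines (`<metric>:<value>|ms`). The metric prefix, step name, and helper names are made up for illustration, and actually sending the lines to a statsd server is left out.

```python
import time
from contextlib import contextmanager

# Step name -> elapsed milliseconds, filled in as steps finish.
timings = {}


@contextmanager
def timed_step(name, clock=time.monotonic):
    """Record how long the wrapped block takes, even if it raises."""
    start = clock()
    try:
        yield
    finally:
        timings[name] = (clock() - start) * 1000.0


def to_statsd_lines(step_timings, prefix="quibble.example"):
    # "quibble.example" is a hypothetical prefix, not an existing metric namespace.
    return [
        "%s.%s:%d|ms" % (prefix, name, round(ms))
        for name, ms in step_timings.items()
    ]


with timed_step("install_mediawiki"):
    time.sleep(0.01)  # placeholder for the real step

lines = to_statsd_lines(timings)
print(lines)
```

As noted in the discussion, runtimes vary with how busy the CI instance is, so numbers like these would only become meaningful when aggregated over many runs.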