[10:29:00] (PS16) Kosta Harlan: [WIP] Run PHPUnit tests in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/742200 (https://phabricator.wikimedia.org/T50217)
[10:33:28] !log applying schema changes from https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CentralAuth/+/743661 on deployment-prep by hand
[10:33:29] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[10:46:43] (PS34) Kosta Harlan: [DNM] CI full run with extensions [integration/quibble] - https://gerrit.wikimedia.org/r/742201
[10:47:08] (PS35) Kosta Harlan: [DNM] CI full run with extensions [integration/quibble] - https://gerrit.wikimedia.org/r/742201
[10:56:03] (CR) jerkins-bot: [V: -1] [DNM] CI full run with extensions [integration/quibble] - https://gerrit.wikimedia.org/r/742201 (owner: Kosta Harlan)
[13:39:52] (Queue (Jenkins jobs + Zuul functions) alert) firing: Queue (Jenkins jobs + Zuul functions) alert - https://alerts.wikimedia.org
[13:40:44] (PS36) Kosta Harlan: [DNM] CI full run with extensions [integration/quibble] - https://gerrit.wikimedia.org/r/742201
[13:44:16] PROBLEM - Work requests waiting in Zuul Gearman server on contint2001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [400.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[14:05:01] (PS1) Jforrester: Zuul: Add PHP81 as voting for libraries, PHP extensions etc. [integration/config] - https://gerrit.wikimedia.org/r/743995 (https://phabricator.wikimedia.org/T293509)
[14:08:15] Beta-Cluster-Infrastructure, Maps, Product-Infrastructure-Team-Backlog: Puppet config is broken for the maps instance on deployment-prep - https://phabricator.wikimedia.org/T291624 (Jgiannelos) This instance is used as an environment for testing changes to the maps stack so it's useful to keep it as p...
[14:31:40] (CR) jerkins-bot: [V: -1] [DNM] CI full run with extensions [integration/quibble] - https://gerrit.wikimedia.org/r/742201 (owner: Kosta Harlan)
[15:09:23] Phabricator, Data-Engineering, Data-Engineering-Kanban: Herald rule for Data-Engineering - https://phabricator.wikimedia.org/T295397 (Ottomata) https://github.com/Ladsgroup/Phabricator-maintenance-bot/pull/42 It doesn't look like the bot can remove tags, only add them.
[15:09:31] Phabricator, Data-Engineering, Data-Engineering-Kanban: Herald rule for Data-Engineering - https://phabricator.wikimedia.org/T295397 (Ottomata) a:Ottomata
[15:20:10] Project beta-update-databases-eqiad build #55102: FAILURE in 9.7 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/55102/
[15:45:02] Amir1: ^ flaggedtemplates!
[15:45:13] nooo
[15:46:00] I'll take a look soon
[15:46:48] that almost belongs into bash. “^ flaggedtemplates! — nooo”
[15:47:05] no further context needed
[15:55:05] Release-Engineering-Team (Doing), Scap, serviceops, Patch-For-Review: Deploy Scap version 4.0.2 - https://phabricator.wikimedia.org/T291095 (dancy) >>! In T291095#7550012, @gerritbot wrote: > Change 744032 had a related patch set uploaded (by Simone Cuomo; author: Simone Cuomo): > %%%[mediawiki/e...
[16:04:52] (Queue (Jenkins jobs + Zuul functions) alert) firing: (2) Queue (Jenkins jobs + Zuul functions) alert - https://alerts.wikimedia.org
[16:09:32] (CR) Ahmon Dancy: [C: +2] values-traindev.yaml: Enable php.devel_mode [tools/train-dev] - https://gerrit.wikimedia.org/r/743033 (owner: Ahmon Dancy)
[16:10:36] (Merged) jenkins-bot: values-traindev.yaml: Enable php.devel_mode [tools/train-dev] - https://gerrit.wikimedia.org/r/743033 (owner: Ahmon Dancy)
[16:18:47] !log running ladsgroup@deployment-deploy03:~$ foreachwikiindblist all-labs mysql.php --write -- -e "delete from flaggedtemplates where ft_tmp_rev_id = 0;"
[16:18:49] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:19:35] Beta-Cluster-Infrastructure, Maps, Product-Infrastructure-Team-Backlog: Puppet config is broken for the maps instance on deployment-prep - https://phabricator.wikimedia.org/T291624 (Majavah) When are you planning on upgrading/fixing it? Puppet failures are causing it to get outdated compared other ho...
[16:20:28] Project beta-update-databases-eqiad build #55103: STILL FAILING in 28 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/55103/
[16:22:03] "Data truncated for column 'ft_tmp_rev_id' at row 43132 (deployment-db07)"
[16:22:07] wat
[16:29:19] Beta-Cluster-Infrastructure, Discovery-Search: deployment-wdqs01 Puppet failure - https://phabricator.wikimedia.org/T296959 (MPhamWMF) p:Triage→Medium
[16:35:57] !log delete from dewiki.flaggedtemplates where ft_tmp_rev_id is NULL;
[16:35:58] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:36:03] this should be fixed now
[16:43:57] it looks like CI has got completely stuck
[16:44:07] more of overwhelmed
[16:45:31] Everything seems to be running. It's just busy.
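
The two flaggedtemplates cleanups logged at 16:18 and 16:35 differ in one detail: a `= 0` comparison never matches NULL, so rows with a NULL ft_tmp_rev_id survive the first delete. Below is a minimal sketch of that behaviour, using Python's built-in sqlite3 as a stand-in for the beta cluster database (an assumption; the sample rows are made up, and only the table and column names come from the log). That the failing update rejected NULL values is inferred from the "Data truncated" error, not stated in the channel.

```python
import sqlite3

# In-memory stand-in for the flaggedtemplates table. Only the column named in
# the log is modelled, and the sample values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flaggedtemplates (ft_tmp_rev_id INTEGER)")
conn.executemany(
    "INSERT INTO flaggedtemplates (ft_tmp_rev_id) VALUES (?)",
    [(0,), (None,), (5,), (None,)],
)


def remaining(connection):
    """Return the ft_tmp_rev_id values still in the table, in insertion order."""
    rows = connection.execute(
        "SELECT ft_tmp_rev_id FROM flaggedtemplates ORDER BY rowid"
    )
    return [row[0] for row in rows]


# First cleanup, as in the 16:18 entry: "= 0" is never true for a NULL value.
conn.execute("DELETE FROM flaggedtemplates WHERE ft_tmp_rev_id = 0")
after_zero_delete = remaining(conn)

# Second cleanup, as in the 16:35 entry: IS NULL catches the leftover rows.
conn.execute("DELETE FROM flaggedtemplates WHERE ft_tmp_rev_id IS NULL")
after_null_delete = remaining(conn)

print(after_zero_delete)
print(after_null_delete)
```

The NULL rows are still present after the first delete and only disappear after the second, which is consistent with build #55103 still failing between the two log entries.
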
[16:53:05] RECOVERY - Work requests waiting in Zuul Gearman server on contint2001 is OK: OK: Less than 100.00% above the threshold [200.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[16:55:10] Beta-Cluster-Infrastructure, Discovery-Search, Wikidata-Query-Service: deployment-wdqs01 Puppet failure - https://phabricator.wikimedia.org/T296959 (MPhamWMF)
[17:04:52] (Queue (Jenkins jobs + Zuul functions) alert) resolved: Queue (Jenkins jobs + Zuul functions) alert - https://alerts.wikimedia.org
[17:04:56] the CI queue is at least going down, as far as I can tell
[17:04:58] jinx!
[17:05:06] haha
[17:10:47] I'm noticing that mediawiki-docker seems to not do more than ~6 php requests in parallel. They end up waiting for another to complete once there are more than that. I'm noticing this when using load.php in debug mode (work in progress patch) where there could easily be up to ~20 parallel load.php requests.
[17:11:03] Looking at `ps aux` from `docker-compose exec mediawiki bash` I see 10 php-fpm procs.
[17:11:16] https://gitlab.wikimedia.org/repos/releng/dev-images/-/blob/main/common/base/www.conf#L8
[17:11:28] I'm guessing this has something to do with it, but I don't know much about how that work
[17:11:31] works*
[17:12:30] also keepalive perhaps, given locally we're on HTTP/1 without TLS; https://gitlab.wikimedia.org/repos/releng/dev-images/-/blob/main/dockerfiles/buster-apache2/apache2.conf#L88
[17:13:51] https://stackoverflow.com/a/985704/319266
[17:14:13] oh right, ofc, pipelining, under HTTP browsers generally only do 6 conns per host.
[17:14:30] if only firefox devtools differentiated between local wait time vs server wait time
[17:15:58] ... which it does. Brilliant.
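
A rough model of the per-host connection cap mentioned at 17:14, assuming all requests are queued at the same moment and each takes the same time on the server. The figure of 20 requests comes from the 17:10 message; the 100 ms duration and the `simulate` helper are hypothetical, chosen only to make the queueing visible.

```python
import heapq


def simulate(durations_ms, max_parallel=6):
    """Schedule requests over a fixed pool of connections.

    Every request is assumed to be queued at t=0. Returns a list of
    (start_ms, finish_ms) tuples, one per request, in queue order.
    """
    # Each heap entry is the time at which one connection becomes free.
    free_at = [0.0] * max_parallel
    heapq.heapify(free_at)
    timeline = []
    for duration in durations_ms:
        start = heapq.heappop(free_at)  # earliest available connection
        finish = start + duration
        heapq.heappush(free_at, finish)
        timeline.append((start, finish))
    return timeline


# 20 load.php requests at a hypothetical 100 ms each, 6 connections per host.
timeline = simulate([100.0] * 20)

# Because everything is queued at t=0, the start time equals the time a
# request spent waiting locally for a free connection.
blocked = [start for start, _ in timeline]
total = max(finish for _, finish in timeline)

print(blocked)
print(total)
```

In this model the first six requests start immediately and later ones wait for a connection, which is the local wait that the Firefox timings panel separates from time spent waiting on the server.
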
[17:16:22] https://developer.mozilla.org/en-US/docs/Tools/Network_Monitor/request_details#Timings https://usercontent.irccloud-cdn.com/file/xmdCiLpa/Screenshot%202021-12-06%20at%2018.15.44.png
[17:18:15] compared to status quo / latest master https://usercontent.irccloud-cdn.com/file/CSNFN7cy/Screenshot%202021-12-06%20at%2018.16.57.png
[17:19:36] it's 100x slower with my patch (2ms vs 200ms), but faster overall because 60 requests at 200ms with 6 in parallel apparently beats 150-200 requests serially (1 at a time) at 2ms each with synchronous JS parsing/execution between each request.
[17:22:20] Yippee, build fixed!
[17:22:20] Project beta-update-databases-eqiad build #55104: FIXED in 2 min 20 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/55104/
[18:18:06] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.12 deployment blockers - https://phabricator.wikimedia.org/T293953 (ssastry) Just so this doesn't get lost in the comment above, if the train is rolled back after deploy, we will need to purge RESTBase content (See T296425) to en...
[18:19:09] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.12 deployment blockers - https://phabricator.wikimedia.org/T293953 (dancy) Thanks ssastry.
[19:32:48] Phabricator, SRE: H34 adds an archived project - https://phabricator.wikimedia.org/T297141 (RhinosF1)
[19:39:06] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.12 deployment blockers - https://phabricator.wikimedia.org/T293953 (thcipriani) >>! In T293953#7550675, @ssastry wrote: > Just so this doesn't get lost in the comment above, if the train is rolled back after deploy, we will need...
[19:44:51] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.12 deployment blockers - https://phabricator.wikimedia.org/T293953 (Majavah) I'll note that this train will have a large amount of patches in the CentralAuth extension.
The extension in general is risky (old and complicated codeb...
[19:45:34] over/under on the train actually making it to group0 this week
[19:45:39] *group2
[19:46:41] hehe
[19:47:42] * majavah continues seeking review for https://gerrit.wikimedia.org/r/c/integration/config/+/741752
[19:48:02] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.12 deployment blockers - https://phabricator.wikimedia.org/T293953 (dancy) Thank you Majavah.
[19:48:21] majavah: I will check it out
[19:51:33] Phabricator, SRE: H34 adds an archived project - https://phabricator.wikimedia.org/T297141 (Aklapper) Open→Resolved a:Aklapper Thanks for catching that! Done.
[19:51:54] Phabricator, SRE: H34 adds an archived project - https://phabricator.wikimedia.org/T297141 (Aklapper) ...and backlinking to T101712 for the records
[19:52:25] Majavah: If I deploy, do you have a quick job to run to verify it worked?
[19:52:25] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.12 deployment blockers - https://phabricator.wikimedia.org/T293953 (ssastry) >>! In T293953#7550888, @thcipriani wrote: >>>! In T293953#7550675, @ssastry wrote: >> Just so this doesn't get lost in the comment above, if the train...
[19:55:21] dancy: merge commits to most mediawiki repos will trigger it, I can merge something if needed
[19:58:06] ok. stand by
[20:08:34] !log Deploying https://gerrit.wikimedia.org/r/c/integration/config/+/741752 for testing
[20:08:35] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:17:29] (CR) Ahmon Dancy: [C: +1] publish: use discovery name (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/741752 (https://phabricator.wikimedia.org/T247653) (owner: Majavah)
[20:22:42] majavah: Deployed. I updated all 720 jobs because I wasn't sure what filter to use. :-)
[20:23:10] thank you!!
I +2'd https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CentralAuth/+/743969 which should give us a test job in a few minutes
[20:24:11] (PS37) Kosta Harlan: [DNM] CI full run with extensions [integration/quibble] - https://gerrit.wikimedia.org/r/742201
[20:29:17] dancy: https://integration.wikimedia.org/ci/job/publish-to-doc/1/console was successful and I see the changes on doc.wikimedia.org
[20:29:27] 👍🏾
[20:29:36] (CR) Ahmon Dancy: [C: +2] publish: use discovery name [integration/config] - https://gerrit.wikimedia.org/r/741752 (https://phabricator.wikimedia.org/T247653) (owner: Majavah)
[20:30:04] (CR) Ahmon Dancy: [C: +2] "Deployed. I rebuilt all 720 jobs because I wasn't sure what filter to use." [integration/config] - https://gerrit.wikimedia.org/r/741752 (https://phabricator.wikimedia.org/T247653) (owner: Majavah)
[20:30:37] !log Deleted publish-to-doc1001 Jenkins job
[20:30:38] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:30:56] hopefully that makes remaining steps easy for removing the stretch vm
[20:31:31] (CR) jerkins-bot: [V: -1] [DNM] CI full run with extensions [integration/quibble] - https://gerrit.wikimedia.org/r/742201 (owner: Kosta Harlan)
[20:31:39] (Merged) jenkins-bot: publish: use discovery name [integration/config] - https://gerrit.wikimedia.org/r/741752 (https://phabricator.wikimedia.org/T247653) (owner: Majavah)
[20:34:13] (PS1) QChris: Allow “Gerrit Managers” to import history [wikimedia/developer-portal] (refs/meta/config) - https://gerrit.wikimedia.org/r/744090
[20:34:15] (CR) QChris: [V: +2 C: +2] Allow “Gerrit Managers” to import history [wikimedia/developer-portal] (refs/meta/config) - https://gerrit.wikimedia.org/r/744090 (owner: QChris)
[20:34:21] (PS1) QChris: Import done.
Revoke import grants [wikimedia/developer-portal] (refs/meta/config) - https://gerrit.wikimedia.org/r/744091
[20:34:23] (CR) QChris: [V: +2 C: +2] Import done. Revoke import grants [wikimedia/developer-portal] (refs/meta/config) - https://gerrit.wikimedia.org/r/744091 (owner: QChris)
[20:38:28] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.12 deployment blockers - https://phabricator.wikimedia.org/T293953 (ssastry) I also [[ https://meta.wikimedia.org/wiki/Meta_talk:Babylon#Updates_to_Parsoid_for_improved_support_for_translate_extension | dropped a note on metawiki...
[20:57:33] Phabricator, SRE: H34 adds an archived project - https://phabricator.wikimedia.org/T297141 (RhinosF1) Thanks for the quick fix!
[21:13:04] (CR) Krinkle: publish: use discovery name (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/741752 (https://phabricator.wikimedia.org/T247653) (owner: Majavah)
[21:13:34] dancy: I was expecting (foo|bar) to work, but it didn't. from code, it's variadic fnmatch, doc'ed at https://www.mediawiki.org/w/index.php?title=Continuous_integration%2FJenkins_job_builder&type=revision&diff=4949257&oldid=4901802
[21:13:52] missing from the ./jenkins-jobs update --help screen, unfortunately. incl in latest upstream.
[21:15:44] (CR) Ahmon Dancy: publish: use discovery name (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/741752 (https://phabricator.wikimedia.org/T247653) (owner: Majavah)
[21:28:10] GitLab (CI & Job Runners), Security Team AppSec, Security-Team, SecTeam-Processed, and 2 others: Finish node/npm initial tool ci templates for auditjs (Node 10, 12, 14) - https://phabricator.wikimedia.org/T294311 (thcipriani) >>! In T294311#7507348, @sbassett wrote: >>>! In T294311#7505539, @thci...
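
On the 21:13 remark about `(foo|bar)` not working as a job filter: with fnmatch-style globbing, parentheses and `|` are literal characters, so alternation has to be expressed as several separate patterns. A small sketch with Python's standard fnmatch module follows; the `matches_any` helper and the `some-other-job` name are hypothetical, and this only illustrates fnmatch semantics, not how jenkins-jobs itself is implemented.

```python
from fnmatch import fnmatchcase


def matches_any(job_name, *patterns):
    """Return True if job_name matches at least one glob pattern."""
    return any(fnmatchcase(job_name, pattern) for pattern in patterns)


# The first two names appear in the log; the third is a made-up placeholder.
jobs = ["publish-to-doc", "publish-to-doc1001", "some-other-job"]

# Regex-style alternation is treated literally by fnmatch, so nothing matches.
alternation = [j for j in jobs if matches_any(j, "(publish-to-doc|some-other-job)")]

# Passing several patterns (the "variadic" form) selects each named job.
variadic = [j for j in jobs if matches_any(j, "publish-to-doc", "some-other-job")]

# A single glob with a wildcard also works.
wildcard = [j for j in jobs if matches_any(j, "publish-*")]

print(alternation)
print(variadic)
print(wildcard)
```

Under these semantics, narrowing an update to a handful of jobs means listing one glob per job or finding a shared wildcard prefix.
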
[22:01:23] Continuous-Integration-Config, LibUp, Security-Team, Tools, SecTeam-Processed: Move php-security-checker.wmcloud.org to Toolforge - https://phabricator.wikimedia.org/T296967 (sbassett) Hey @Legoktm - thanks for the tag. To be honest, we don't do a lot with these reports at the moment. We ne...
[22:13:58] (CR) Dzahn: "nice :)" [integration/config] - https://gerrit.wikimedia.org/r/741752 (https://phabricator.wikimedia.org/T247653) (owner: Majavah)
[23:25:32] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.12 deployment blockers - https://phabricator.wikimedia.org/T293953 (ssastry) I looked at [[ https://github.com/wikimedia/operations-mediawiki-config/blob/6dcc2c6d8db872b931e0eac4fe4e2569fc4e11d0/wmf-config/InitialiseSettings.php#...
[23:29:16] Continuous-Integration-Infrastructure, Wikidata, wdwb-tech, Browser-Tests, User-zeljkofilipin: Centrally look for flakey browser tests - https://phabricator.wikimedia.org/T277205 (thcipriani) I made a script that parses all Jenkins junit files and puts them in a database. The database (see [[...
[23:44:00] Release-Engineering-Team (Next), Release, Train Deployments: 1.38.0-wmf.12 deployment blockers - https://phabricator.wikimedia.org/T293953 (thcipriani) >>! In T293953#7551761, @ssastry wrote: > All that said, if train operators prefer that we mitigate that for your sanity, I will happily work with my...
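
Going back to the 17:19 load.php comparison: on request time alone, 60 requests at 200 ms with 6 in parallel take longer than 150-200 serial requests at 2 ms, so the claimed overall win depends on the synchronous JS parsing/execution between serial requests. The sketch below works out how large that per-request overhead would need to be for the two to break even. It assumes uniform request times and full waves of 6; the function names are hypothetical and the overhead itself is not measured anywhere in the log.

```python
import math


def parallel_total_ms(requests, per_request_ms, max_parallel=6):
    """Total time when requests run in waves of max_parallel."""
    return math.ceil(requests / max_parallel) * per_request_ms


def serial_total_ms(requests, per_request_ms, overhead_ms):
    """Total time when requests run one at a time with work in between."""
    return requests * (per_request_ms + overhead_ms)


def break_even_overhead_ms(serial_requests, serial_request_ms, parallel_total):
    """Per-request overhead at which the serial total equals parallel_total."""
    return parallel_total / serial_requests - serial_request_ms


# Patched case from the log: 60 requests, 200 ms each, 6 at a time.
patched = parallel_total_ms(60, 200)

# Status quo from the log: 150-200 requests, 2 ms each, one at a time.
serial_network_only = serial_total_ms(200, 2, overhead_ms=0)
break_even_150 = break_even_overhead_ms(150, 2, patched)
break_even_200 = break_even_overhead_ms(200, 2, patched)

print(patched, serial_network_only)
print(round(break_even_150, 2), break_even_200)
```

Under these assumptions the patched version comes out ahead once the serial path spends more than roughly 8 to 11 ms per request on parsing and execution.
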