[01:59:48] hello, have you met TOML?
[02:00:43] yes i hate it
[02:01:58] (mostly because i don't like .ini)
[02:03:48] fair I suppose
[02:10:27] i always thought .ini was... pretty ok for a lot of things, really.
[02:10:56] on reflection maybe a lot of that is because i think deep nesting / hierarchy is the enemy of comprehension.
[02:11:54] so toml seems like it could be worse, but the meta-level thing i've learned about every language like this is that i won't fully know how much i hate it until i've used it a whole bunch.
[02:31:36] imo TOML is a more JSON-like ini that's well standardized, unlike ini which had all these subtle variations based on implementation details
[02:31:57] big +1 to deep nesting being the real problem
[03:01:01] I've looked into the singularity, it is configured by array-returning php files and I don't not hate it.
[07:46:33] GitLab (Project Migration), Machine-Learning-Team, ORES: Migrate ORES/Revscoring/etc. repos to Gitlab or Gerrit - https://phabricator.wikimedia.org/T264651 (Aklapper)
[07:53:54] Phabricator, Patch-For-Review: Remove unneeded translation overrides - https://phabricator.wikimedia.org/T309746 (Aklapper) Reverting overwrites from T152, T257, T865 (mentioning just to have followup pings in those tasks).
[07:56:06] Phabricator: Make sure anti-vandalism features are up to snuff - https://phabricator.wikimedia.org/T84 (Aklapper)
[08:02:25] Gerrit, Data-Engineering: Remove unused Gerrit repository mediawiki/services/aqs/deploy - https://phabricator.wikimedia.org/T309731 (Aklapper) Resolved→Open The repository [still exists and has content](https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/services/aqs/deploy/+/refs/heads/mast...
[08:46:42] Phabricator: Herald rule for #serviceops-collab - https://phabricator.wikimedia.org/T311605 (Aklapper) Open→Resolved a:Aklapper Done in H404.
[08:46:52] GitLab (CI & Job Runners), serviceops, serviceops-collab: DNS/networking not working on Trusted Runners - https://phabricator.wikimedia.org/T311241 (Aklapper)
[08:48:36] Continuous-Integration-Config, Project-Admins, mediawiki-extensions-MultiMail: Publish MultiMail extension - https://phabricator.wikimedia.org/T311542 (Mainframe98)
[08:57:32] (PS1) Mainframe98: Zuul: [mediawiki/extensions/MultiMail] add basic quibble and phan jobs [integration/config] - https://gerrit.wikimedia.org/r/809918 (https://phabricator.wikimedia.org/T311542)
[08:59:29] (CR) CI reject: [V: -1] Zuul: [mediawiki/extensions/MultiMail] add basic quibble and phan jobs [integration/config] - https://gerrit.wikimedia.org/r/809918 (https://phabricator.wikimedia.org/T311542) (owner: Mainframe98)
[09:05:52] (PS2) Mainframe98: Zuul: [mediawiki/extensions/MultiMail] add basic quibble and phan jobs [integration/config] - https://gerrit.wikimedia.org/r/809918 (https://phabricator.wikimedia.org/T311542)
[09:07:51] Continuous-Integration-Config, Quality-and-Test-Engineering-Team (QTE), Sonarqubebot, Developer Productivity: SonarQube is unhelpfully suggesting ES6 features in ES5 code - https://phabricator.wikimedia.org/T289957 (kostajh) Open→Resolved >>! In T289957#8038543, @matmarex wrote: > Thanks...
[11:15:55] Project-Admins: Requests for addition to the #acl*Project-Admins group (in comments) - https://phabricator.wikimedia.org/T706 (MaryMunyoki) Hello @Aklapper. I am a Technical Program Manager for the Language & Inuka teams. I would like to be able to create milestones for the sprint work for the Language team....
[11:16:22] Project-Admins, Wikimedia-Site-requests: Archive "Config-to process " column at #wikimedia-site-requests - https://phabricator.wikimedia.org/T311089 (MarcoAurelio)
[11:24:12] Gerrit, Upstream: Cherry pick to multiple branches - https://phabricator.wikimedia.org/T311703 (Reedy)
[11:40:39] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.39.0-wmf.18 deployment blockers - https://phabricator.wikimedia.org/T308071 (Lucas_Werkmeister_WMDE) Regarding my risky patch (T308071#8032891), the logging seems to be working and the volume on the `Wi...
[12:11:32] Gerrit, Upstream: Cherry pick to multiple branches - https://phabricator.wikimedia.org/T311703 (hashar) The REST API is https://gerrit.wikimedia.org/r/Documentation/rest-api-changes.html#cherry-pick which takes as input a [[ https://gerrit.wikimedia.org/r/Documentation/rest-api-changes.html#cherrypick-i...
[12:12:55] Gerrit, Upstream: Cherry pick to multiple branches - https://phabricator.wikimedia.org/T311703 (Reedy) p:Triage→Lowest Setting priority to match expectations :).
[12:13:31] Reedy: for cherry-picks to multiple branches, the easiest is probably to write a command line script which asks Gerrit rest api to do them.
[12:14:08] the devil is figuring out all the rest api endpoints and their parameters, then the api is well documented :]
[12:15:54] heh
[12:16:08] I imagine writing some sort of JS/greasemonkey script might be a way forward too
[13:18:55] Does anyone know how to update Mathoid on the beta cluster?
[13:24:02] The service, presumably?
[13:24:29] Errr, is there another sort?
[13:25:09] Some of the math code is in the math mw extension :P
[13:25:16] I was just making sure it wasn't something else ;)
[13:25:43] https://wikitech.wikimedia.org/wiki/Mathoid
[13:25:52] >Mathoid is currently built and deployed using the Deployment pipeline.
When a change is merged to the master branch, a Docker image is automatically built and pushed to the Docker-registry. A change is automatically generated (similar to this change), which itself is then merged upon receiving a +2.
[13:26:20] I'm guessing a cherry pick of that deployment chart change needs putting over there
[13:26:52] nope, deployment-charts isn't used at all on deployment-prep
[13:27:05] likely a change to the hieradata of the VM running the mathoid docker container
[13:27:07] taavi: /me pretends to be surprised
[13:36:42] Thanks taavi, how does one go about doing that?
[13:47:30] https://horizon.wikimedia.org/project/puppet/
[13:47:35] mathoid/deploy:
[13:47:35] checkout_submodules: true
[13:47:35] service_name: mathoid
[13:47:35] upstream: https://gerrit.wikimedia.org/r/mediawiki/services/mathoid/deploy
[13:47:39] Deprecated repo?
[13:48:00] dwalden: Is it seemingly still using an ancient version?
[13:49:07] https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+blame/master/deployment-prep/_.yaml#306
[13:53:29] beta + puppet always equals problems
[13:53:59] <_joe_> mathoid should not be deployed from there
[13:54:37] <_joe_> jnuche, dancy I have an issue that might need a scap release, still not sure though
[13:56:12] Phabricator, Security-Team: Onboard Cleo to Security Team - https://phabricator.wikimedia.org/T311721 (sbassett)
[13:56:39] Reedy I don't know the exact version beta is using (I don't know how to find out)
[13:57:00] but cloning https://gerrit.wikimedia.org/r/mediawiki/services/mathoid/deploy latest commit is from January
[13:57:11] 2018
[13:57:26] Yeah, hence my question
[13:57:32] Which is ooooold
[13:57:56] _joe_: I presume you mean it should be deployed from the deploy repo? (which is why i was asking if it was an ancient version)
[13:58:02] *shouldn't be deployed. ffs
[13:58:03] <_joe_> no
[13:58:18] <_joe_> wait, isn't mathoid running in a docker container in beta?
[13:58:21] as I said, hieradata of the vm running the mathoid container, that would be this line: https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/refs/heads/master/deployment-prep/deployment-docker-mathoid01.deployment-prep.eqiad1.wikimedia.cloud.yaml#51
[13:58:23] <_joe_> like the rest of the services?
[13:58:52] So we should remove those 4 lines from _.yaml ?
[13:58:59] because it's old/redundant
[13:59:12] <_joe_> taavi: yep, they should change that to a recent version, in theory it should be done by whomever maintains mathoid
[14:00:35] it is still on an old version then
[14:00:36] lol
[14:02:10] it's beta, of course it's outdated
[14:03:53] _joe_ I am happy to make the change as I have been making changes to Mathoid recently.
[14:04:10] How do I make the change? instance-puppet repo says I have to do it via Horizon.
[14:07:34] <_joe_> dwalden: at the line taavi indicated, you can change that with the latest image deployed in production
[14:07:38] <_joe_> let me see
[14:08:12] <_joe_> if gerrit worked...
[14:08:37] <_joe_> https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/helmfile.d/services/mathoid/values.yaml#16
[14:08:40] <_joe_> this version
[14:09:05] <_joe_> just change the version in horizon and it should automatically apply
[14:10:47] I am not sure how to change the version in horizon. Maybe I need special permissions.
[14:15:13] Phabricator, Security-Team: Onboard Cleo to Security Team - https://phabricator.wikimedia.org/T311721 (sbassett)
[14:15:26] Can you get to https://horizon.wikimedia.org/project/puppet/ ?
[14:16:18] Yep
[14:18:53] Do you see edit buttons?
[14:19:22] hmm
[14:19:23] Nope
[14:19:33] I think only projectadmins can Reedy ?
[14:19:45] hauskatze: Yeah, I was being lazy to see if he had the rights ;)
[14:20:19] > You have selected: "Dom Walden". Please confirm your selection.
The selected user will gain the ability to modify settings in this project, including membership.
[14:22:54] Beta-Cluster-Infrastructure, Maps, Product-Infrastructure-Team-Backlog: deployment-maps-master01 puppet failures - https://phabricator.wikimedia.org/T311609 (Jgiannelos)
[14:23:12] Beta-Cluster-Infrastructure, Maps, Product-Infrastructure-Team-Backlog: deployment-maps-master01 puppet failures - https://phabricator.wikimedia.org/T311609 (Jgiannelos) p:Triage→Low
[14:31:30] taavi: Is that mathoid page hidden in horizon somewhere? Or does it want changing in gerrit?
[14:31:47] Reedy: it's a tab on the instance details page
[14:32:17] duh, tag :)
[14:32:19] *ta :)
[14:32:24] dwalden: https://horizon.wikimedia.org/project/instances/eb0903b7-9c0e-42a0-9545-f3b3e3e82b21/
[14:32:29] You should have the rights now
[14:32:33] Might need to log out and in again
[14:41:53] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.39.0-wmf.18 deployment blockers - https://phabricator.wikimedia.org/T308071 (thcipriani) >>! In T308071#8039810, @Lucas_Werkmeister_WMDE wrote: > Regarding my risky patch (T308071#8032891), the logging...
[14:42:32] Reedy: Thank you. That worked. Thanks also to _joe_ and taavi.
[15:09:31] _joe_: OK (regarding another scap release). Lemme know what you need.
[15:12:05] <_joe_> dancy: I think I found a better solution
[15:12:14] <_joe_> which doesn't involve modifying scap
[15:12:17] 👍🏾
[15:14:00] _joe_ While you're around: What would happen if say 80 nodes ran safe-service-restart of php-fpm at the same time. Would they somehow use the poolcounter stuff to regulate how many are depooled from their respective cluster at a time?
[15:14:18] <_joe_> yes
[15:14:24] <_joe_> --concurrency 7 on the appservers
[15:14:30] Phabricator, Data-Engineering-Kanban, Product-Analytics, wmfdata-python, Data Engineering Planning: Herald rule to add Product Analytics and Data Engineering tags to Wmfdata-Python tasks - https://phabricator.wikimedia.org/T304572 (JArguello-WMF)
[15:14:31] <_joe_> smaller on the other clusters
[15:14:44] <_joe_> ah sigh I just realized I need to tweak my patch a bit more
[15:15:13] So we don't need all the extra logic in _restart_php_hostgroups to separate by group
[15:15:26] <_joe_> that speeds things up though
[15:15:50] <_joe_> ah i see what you mean
[15:15:53] GitLab (Project Migration), Data-Engineering-Kanban, Product-Analytics, wmfdata-python, Data Engineering Planning: Move Wmfdata-Python from Github to Gitlab - https://phabricator.wikimedia.org/T304544 (JArguello-WMF)
[15:16:24] <_joe_> we do need to select just the right ones though
[15:16:49] Can you explain that more?
[15:26:31] <_joe_> yes, so for instance we don't want to restart php on the jobrunners
[15:28:04] Gotcha.
[15:28:16] <_joe_> nor in servers where we deploy the code but don't run php-fpm
[15:28:25] Alright. I'm going to make some hacks to scap that I'll run by you.
[15:28:30] <_joe_> sorry, I was looking at the concurrency stuff :)
[15:29:19] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), SRE, serviceops, Patch-For-Review: replace doc1001.eqiad.wmnet with a buster VM and create the codfw equivalent - https://phabricator.wikimedia.org/T247653 (akosiaris)
[15:31:39] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), SRE, serviceops, Patch-For-Review: replace doc1001.eqiad.wmnet with a buster VM and create the codfw equivalent - https://phabricator.wikimedia.org/T247653 (akosiaris) As pointed out in T311732 (now merged as duplicate of...
[15:34:19] GitLab (Project Migration), Data-Engineering-Kanban, Product-Analytics, wmfdata-python, Data Engineering Planning (Sprint 01): Move Wmfdata-Python from Github to Gitlab - https://phabricator.wikimedia.org/T304544 (JArguello-WMF)
[16:14:14] (PS1) Giuseppe Lavagetto: php_fpm: simplify restart logic [tools/scap] - https://gerrit.wikimedia.org/r/810047
[16:18:47] (CR) CI reject: [V: -1] php_fpm: simplify restart logic [tools/scap] - https://gerrit.wikimedia.org/r/810047 (owner: Giuseppe Lavagetto)
[16:26:16] who's around to approve adding a new contributor to the zuul allow-list? (https://gerrit.wikimedia.org/r/c/integration/config/+/810053)
[16:26:51] (CR) Ori: "This change is ready for review." [integration/config] - https://gerrit.wikimedia.org/r/810053 (owner: Ori)
[16:26:59] GitLab (CI & Job Runners), Security Team AppSec, Security-Team, Security: Create osv.dev ci includes - https://phabricator.wikimedia.org/T307514 (mmartorana) In progress→Resolved
[16:27:05] GitLab (CI & Job Runners), Security Team AppSec, Security-Team, Security: Design and Build Application Security Pipeline Components for Gitlab - https://phabricator.wikimedia.org/T289290 (mmartorana)
[16:32:22] GitLab, Release-Engineering-Team: Changes to modules/gitlab_runner/templates/config-template.toml.erb have no effect on existing runners - https://phabricator.wikimedia.org/T311746 (dduvall)
[16:33:40] GitLab, Release-Engineering-Team: Changes to modules/gitlab_runner/templates/config-template.toml.erb have no effect on existing runners - https://phabricator.wikimedia.org/T311746 (dduvall) @Dzahn helped troubleshoot this issue yesterday so CC'ing. @Jelto and @brennen too.
[16:34:00] GitLab, Release-Engineering-Team: Changes to modules/gitlab_runner/templates/config-template.toml.erb have no effect on existing runners - https://phabricator.wikimedia.org/T311746 (dduvall)
[16:40:58] ori: looking
[16:44:26] (CR) Majavah: [C: +2] Zuul: Add Teleosteen to CI allow list [integration/config] - https://gerrit.wikimedia.org/r/810053 (owner: Ori)
[16:46:39] (Merged) jenkins-bot: Zuul: Add Teleosteen to CI allow list [integration/config] - https://gerrit.wikimedia.org/r/810053 (owner: Ori)
[16:47:07] !log reloading zuul to deploy https://gerrit.wikimedia.org/r/810053
[16:47:08] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:47:32] ori: {{done}}!
[16:50:11] taavi: thanks!
[16:56:27] GitLab, Release-Engineering-Team: Changes to modules/gitlab_runner/templates/config-template.toml.erb have no effect on existing runners - https://phabricator.wikimedia.org/T311746 (dduvall) Refactoring this will be tricky I think. Runners store their unique token in the configuration after the registrat...
[17:11:27] GitLab (Project Migration), Data-Engineering-Kanban, Product-Analytics, wmfdata-python: Move Wmfdata-Python from Github to Gitlab - https://phabricator.wikimedia.org/T304544 (EChetty)
[18:08:29] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.39.0-wmf.18 deployment blockers - https://phabricator.wikimedia.org/T308071 (dduvall) >>! In T308071#8040560, @thcipriani wrote: > Thanks for following up here! It's always valuable for deployers to kno...
[18:12:34] (PS1) Ahmon Dancy: Move serializing_lock_file to a setgid directory [tools/scap] - https://gerrit.wikimedia.org/r/810069 (https://phabricator.wikimedia.org/T310395)
[18:16:48] (CR) CI reject: [V: -1] Move serializing_lock_file to a setgid directory [tools/scap] - https://gerrit.wikimedia.org/r/810069 (https://phabricator.wikimedia.org/T310395) (owner: Ahmon Dancy)
[18:22:25] (PS2) Ahmon Dancy: Move serializing_lock_file to a setgid directory [tools/scap] - https://gerrit.wikimedia.org/r/810069 (https://phabricator.wikimedia.org/T310395)
[18:35:59] GitLab, Release-Engineering-Team: Changes to modules/gitlab_runner/templates/config-template.toml.erb have no effect on existing runners - https://phabricator.wikimedia.org/T311746 (Dzahn) Or we could also say that it's unlikely we will often make changes to the config after the initial setup and for whe...
[19:02:55] * Krinkle is in the water cooler at https://meet.google.com/sut-zxhw-jqy
[19:08:54] what's the water cooler?
[19:09:18] GitLab (CI & Job Runners), serviceops, serviceops-collab: DNS/networking not working on Trusted Runners - https://phabricator.wikimedia.org/T311241 (Dzahn) currently the issue here is not DNS anymore. but it is now: 'This job is stuck because you don't have any active runners online or available wi...
[19:13:18] GitLab (Infrastructure), Release-Engineering-Team, serviceops, serviceops-collab, User-brennen: GitLab major release: 15.x - https://phabricator.wikimedia.org/T309062 (Dzahn)
[19:29:14] (CR) Chad: [C: +2] scap stage-train: Set umask to 002 [tools/scap] - https://gerrit.wikimedia.org/r/809713 (owner: Ahmon Dancy)
[19:33:25] (Merged) jenkins-bot: scap stage-train: Set umask to 002 [tools/scap] - https://gerrit.wikimedia.org/r/809713 (owner: Ahmon Dancy)
[19:34:40] (CR) Ahmon Dancy: "Thanks Chad!"
[tools/scap] - https://gerrit.wikimedia.org/r/809713 (owner: Ahmon Dancy)
[19:39:04] (PS1) Stang: zuul: Adjust description for serveral gate pipelines [integration/config] - https://gerrit.wikimedia.org/r/810074
[19:48:38] GitLab (CI & Job Runners): DNS/networking not working on Trusted Runners - https://phabricator.wikimedia.org/T311241 (Dzahn)
[19:49:09] GitLab (CI & Job Runners), serviceops, serviceops-collab: DNS/networking not working on Trusted Runners - https://phabricator.wikimedia.org/T311241 (Dzahn)
[19:49:18] Phabricator: Herald rule for #serviceops-collab - https://phabricator.wikimedia.org/T311605 (Dzahn) Thanks. Tested and works
[20:00:30] (PS1) Jeena Huneidi: Scap backport --list: Add mergeable column [tools/scap] - https://gerrit.wikimedia.org/r/810078 (https://phabricator.wikimedia.org/T303967)
[20:19:34] (PS1) Ahmon Dancy: train-dev ssh: Use login shell [tools/train-dev] - https://gerrit.wikimedia.org/r/810082
[20:19:52] (PS1) Ahmon Dancy: Disable build_mw_container_image to save testing time [tools/train-dev] - https://gerrit.wikimedia.org/r/810083
[20:20:02] (PS1) Ahmon Dancy: train-dev: Another round of removal of relative paths [tools/train-dev] - https://gerrit.wikimedia.org/r/810084
[20:20:11] (PS1) Ahmon Dancy: Only restart gerrit when the config actually changed [tools/train-dev] - https://gerrit.wikimedia.org/r/810085
[20:20:22] (PS1) Ahmon Dancy: Use pregenerated ssh host keys for gerrit container [tools/train-dev] - https://gerrit.wikimedia.org/r/810086
[20:34:36] Beta-Cluster-Infrastructure, Abstract Wikipedia team, Patch-For-Review: Create a Beta Cluster version of Wikifunctions.org - https://phabricator.wikimedia.org/T284162 (ori) In progress→Resolved
[20:43:50] (CR) Dduvall: [C: -1] scap prep: Ensure umask is 002 before running (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/809708 (owner: Ahmon Dancy)
[20:46:05] Phabricator: Delete
account xcollazo - https://phabricator.wikimedia.org/T311772 (XCollazo-WMF)
[20:48:47] (CR) Dduvall: [C: -1] Move serializing_lock_file to a setgid directory (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/810069 (https://phabricator.wikimedia.org/T310395) (owner: Ahmon Dancy)
[20:56:21] GitLab, Release-Engineering-Team: Changes to modules/gitlab_runner/templates/config-template.toml.erb have no effect on existing runners - https://phabricator.wikimedia.org/T311746 (dduvall) >>! In T311746#8042025, @Dzahn wrote: > Or we could also say that it's unlikely we will often make changes to the...
[21:03:00] (PS1) Reedy: makesecuritytasks.py: Add missing view and edit policies [tools/release] - https://gerrit.wikimedia.org/r/810098
[21:03:07] GitLab, Release-Engineering-Team: Changes to modules/gitlab_runner/templates/config-template.toml.erb have no effect on existing runners - https://phabricator.wikimedia.org/T311746 (Dzahn) puppet already does the register command. It also is prepared to do the unregister command. It would do that if you...
[21:03:25] (CR) Reedy: [C: +2] "Tested and working" [tools/release] - https://gerrit.wikimedia.org/r/810098 (owner: Reedy)
[21:04:36] Project beta-code-update-eqiad build #398138: ABORTED in 1 min 34 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/398138/
[21:05:36] (Merged) jenkins-bot: makesecuritytasks.py: Add missing view and edit policies [tools/release] - https://gerrit.wikimedia.org/r/810098 (owner: Reedy)
[21:05:55] !log cancelled beta-code-update-eqiad#398138 to make way for pending beta-scap-sync-world#57641, queued another beta-code-update-eqiad
[21:05:56] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:07:28] GitLab, Release-Engineering-Team: Changes to modules/gitlab_runner/templates/config-template.toml.erb have no effect on existing runners - https://phabricator.wikimedia.org/T311746 (dduvall) Or move the (de/re)-registration to the systemd service definition for restarts file and let puppet notify the ser...
[21:16:10] GitLab, Release-Engineering-Team: Changes to modules/gitlab_runner/templates/config-template.toml.erb have no effect on existing runners - https://phabricator.wikimedia.org/T311746 (dduvall) If we rely on a manual process, I think we're going to be dealing with consistency issues in our pool of runners q...
[21:29:31] Scap: scap does not fully deploy its code in some cases - https://phabricator.wikimedia.org/T311788 (Urbanecm)
[21:33:34] Scap: scap does not fully deploy MW code in some cases - https://phabricator.wikimedia.org/T311788 (Urbanecm)
[21:41:16] GitLab, Release-Engineering-Team: Changes to modules/gitlab_runner/templates/config-template.toml.erb have no effect on existing runners - https://phabricator.wikimedia.org/T311746 (dduvall) Interestingly, the `gitlab_ci_runner` module from puppet forge does the weird token grabbing dance. See https://gi...
[21:47:13] GitLab, Release-Engineering-Team: Changes to modules/gitlab_runner/templates/config-template.toml.erb have no effect on existing runners - https://phabricator.wikimedia.org/T311746 (brennen) I do think we wind up tweaking runner config more often than I would have expected at first, so this will likely k...
[21:53:39] Scap: scap does not fully deploy MW code in some cases - https://phabricator.wikimedia.org/T311788 (dancy)
[22:02:37] !log unstuck beta-mediawiki-config-update-eqiad jobs, will comment at T72597
[22:02:39] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:02:39] T72597: Jenkins Gearman plugin has deadlock on executor threads (was: Beta Cluster stopped receiving code updates (beta-update-databases-eqiad hung) - https://phabricator.wikimedia.org/T72597
[22:03:59] GitLab, Release-Engineering-Team, serviceops, serviceops-collab: Changes to modules/gitlab_runner/templates/config-template.toml.erb have no effect on existing runners - https://phabricator.wikimedia.org/T311746 (Dzahn)
[22:07:00] Continuous-Integration-Infrastructure, Jenkins, Release-Engineering-Team (Radar), Patch-For-Review, Upstream: Jenkins Gearman plugin has deadlock on executor threads (was: Beta Cluster stopped receiving code updates (beta-update-databases-eqiad hung) - https://phabricator.wikimedia.org/T72597 (1...
[22:09:45] Project beta-code-update-eqiad build #398142: ABORTED in 28 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/398142/
[22:10:01] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), SRE, serviceops, and 2 others: replace doc1001.eqiad.wmnet with a buster VM and create the codfw equivalent - https://phabricator.wikimedia.org/T247653 (Dzahn)
[22:11:20] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), SRE, serviceops, and 2 others: replace doc1001.eqiad.wmnet with a buster VM and create the codfw equivalent - https://phabricator.wikimedia.org/T247653 (Dzahn) @Krinkle and I agreed on doing this tomorrow at 14:00 PST
[22:11:42] Continuous-Integration-Infrastructure, OOUI: Demos page for OOUI in php is broken - https://phabricator.wikimedia.org/T297035 (Dzahn)
[22:11:47] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), SRE, serviceops, and 2 others: replace doc1001.eqiad.wmnet with a buster VM and create the codfw equivalent - https://phabricator.wikimedia.org/T247653 (Dzahn) Open→In progress
[22:15:02] Scap: scap does not fully deploy MW code in some cases - https://phabricator.wikimedia.org/T311788 (dancy)
[22:16:05] Scap: scap does not fully deploy MW code in some cases - https://phabricator.wikimedia.org/T311788 (dancy)
[22:23:17] Scap: scap does not fully deploy MW code in some cases - https://phabricator.wikimedia.org/T311788 (dancy)
[22:23:37] Project beta-code-update-eqiad build #398144: ABORTED in 36 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/398144/
[22:25:06] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.39.0-wmf.18 deployment blockers - https://phabricator.wikimedia.org/T308071 (dduvall) Open→Resolved
[22:33:05] Scap: scap does not fully deploy MW code in some cases - https://phabricator.wikimedia.org/T311788 (jeena)
checking /var/log/php7.2-fpm on mw1369 I see a restart around deployment time: ` Jun 30 20:57:41 mw1369 php7.2-fpm[31932]: [NOTICE] Terminating ... Jun 30 20:57:41 mw1369 php7.2-fpm[31932]: [NOTICE] ex...
[22:33:48] Scap: scap does not fully deploy MW code in some cases - https://phabricator.wikimedia.org/T311788 (dancy)
[22:47:50] Scap: scap does not fully deploy MW code in some cases - https://phabricator.wikimedia.org/T311788 (dancy)
[22:47:52] Release-Engineering-Team (Radar), Scap, Patch-For-Review, User-jijiki: Update Scap to perform rolling restart for all MW deploy - https://phabricator.wikimedia.org/T266055 (dancy)
[22:52:05] Scap: scap does not fully deploy MW code in some cases - https://phabricator.wikimedia.org/T311788 (dancy) p:Triage→High @Joe @Krinkle Jeena's evidence above shows that the php-fpm restart happened on the afflicted hosts. At this point I don't know what's going on but we're definitely in a state of...
[23:12:43] dancy: okay, so I'm currently at a point where it seems logical to me that config gets outdated and stuck
[23:13:03] Now I'm trying to figure out why 1) it sometimes works and 2) why it worked before
[23:27:59] Scap: scap does not fully deploy MW code in some cases - https://phabricator.wikimedia.org/T311788 (DDeSouza) I think deployment of [809961](https://gerrit.wikimedia.org/r/c/809961) experienced a similar issue. It's working on **mwdebug1002** but not on production. At a moment *it worked* but I'm assuming...
[23:33:44] Krinkle: Good luck!!
[23:33:59] I'm going offline for the day.
I'll circle back tomorrow to see what has transpired
[23:40:53] Scap: MW wmf-config tmp cache stays outdated after Scap deploy - https://phabricator.wikimedia.org/T311788 (Krinkle)
[23:41:27] Scap: MW wmf-config tmp cache stays outdated after Scap deploy (opcache revalidation is off) - https://phabricator.wikimedia.org/T311788 (Krinkle)
[23:42:03] Scap, Performance-Team, serviceops: MW wmf-config tmp cache stays outdated after Scap deploy (opcache revalidation is off) - https://phabricator.wikimedia.org/T311788 (Krinkle) a:Krinkle
[23:42:11] Scap, Performance-Team, serviceops: MW wmf-config tmp cache stays outdated after Scap deploy (opcache revalidation is off) - https://phabricator.wikimedia.org/T311788 (Krinkle) Okay, let's consider the following timeline for a typical config change as it was prior to last week. This means prior to do...
[23:44:36] Scap, Performance-Team, serviceops: MW wmf-config tmp cache stays outdated after Scap deploy (opcache revalidation is off) - https://phabricator.wikimedia.org/T311788 (Krinkle) >>! In T311788#8042832, @Krinkle wrote: > […] > > With the above information, and the knowledge that "turning off live opcac...