[09:17:27] (PS1) Physikerwelt: Review access change [extensions/MathSearch] (refs/meta/config) - https://gerrit.wikimedia.org/r/801200
[09:39:23] (CR) Stegmujo: [C: +1] Review access change [extensions/MathSearch] (refs/meta/config) - https://gerrit.wikimedia.org/r/801200 (owner: Physikerwelt)
[10:01:40] Beta-Cluster-Infrastructure, Wikidata, Wikidata-Termbox, wdwb-tech, and 3 others: Move Termbox SSR for Beta Wikidata into deployment-prep project - https://phabricator.wikimedia.org/T304328 (ItamarWMDE) Thank you @Majavah for pointing us in the right direction, but I think we will actually shut d...
[10:06:35] Beta-Cluster-Infrastructure, Wikidata, Wikidata-Termbox, wdwb-tech, and 3 others: Move Termbox SSR for Beta Wikidata into deployment-prep project - https://phabricator.wikimedia.org/T304328 (ItamarWMDE) In light of the decision above, the next steps for this are: - [ ] Shut down and remove the `...
[10:43:15] Release-Engineering-Team (Radar), Scap, Patch-For-Review, User-jijiki: Update Scap to perform rolling restart for all MW deploy - https://phabricator.wikimedia.org/T266055 (Joe) a:Joe
[11:03:50] <_joe_> jnuche: is scap still deployed as a deb? Or did you switch to local sync?
[11:04:05] _joe_: it's still a deb
[11:04:10] <_joe_> ack
[11:04:21] <_joe_> I might need to deploy a new version today
[11:04:46] roger
[11:40:52] (CR) Physikerwelt: [V: +2 C: +2] Review access change [extensions/MathSearch] (refs/meta/config) - https://gerrit.wikimedia.org/r/801200 (owner: Physikerwelt)
[12:14:42] GitLab (Infrastructure), serviceops, Patch-For-Review: bring new gitlab hardware servers into production - https://phabricator.wikimedia.org/T307142 (Jelto) Checklist for today's gitlab-replica migration from `gitlab2001` to `gitlab1003`: **Preparations before downtime:** [x] register second service...
[12:39:37] Release-Engineering-Team (Next), Scap, Patch-For-Review: New scap install-world command for self-deploy - https://phabricator.wikimedia.org/T307081 (jnuche) Open→In progress
[12:39:41] Release-Engineering-Team (Priority Backlog 📥), Scap, Infrastructure-Foundations, serviceops: Use scap to deploy itself to scap targets - https://phabricator.wikimedia.org/T303559 (jnuche)
[12:44:00] (PS1) Physikerwelt: Review access change [extensions/MathSearch] (refs/meta/config) - https://gerrit.wikimedia.org/r/801672
[12:49:32] (CR) Physikerwelt: [V: +2 C: +2] Review access change [extensions/MathSearch] (refs/meta/config) - https://gerrit.wikimedia.org/r/801672 (owner: Physikerwelt)
[13:15:57] jnuche: can we please add a progress bar to php-fpm-restarts if it's expected to take a while now?
[13:17:09] taavi: that shouldn't be a problem I think, I'll create a ticket for that later today
[13:17:21] thank you!
[14:15:30] taavi: we already added the progress bar, I had forgotten: https://gerrit.wikimedia.org/r/c/mediawiki/tools/scap/+/793842
[14:15:36] it will be deployed with the next scap version
[14:43:59] hihi, have I done something stupid by setting https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/35628/console off without specifying the nodes, because it looks like it's going to do *a lot* https://puppet-compiler.wmflabs.org/pcc-worker1001/35628/
[14:46:49] Beta-Cluster-Infrastructure, Abstract Wikipedia team, Patch-For-Review: Create a Beta Cluster version of Wikifunctions.org - https://phabricator.wikimedia.org/T284162 (ori) Logs from the function-* services are now shipped to Logstash, and I've created a simple dashboard: https://beta-logs.wmcloud.o...
[15:37:10] <_joe_> jnuche, dancy: how can I run scap tests locally?
[15:39:21] _joe_: I normally use my IDE, but you can also use the Makefile: https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/tools/scap/+/refs/heads/master/Makefile
[15:39:55] <_joe_> jnuche: how do you deal with the requirements in requirements.txt? just install in a venv?
[15:40:11] _joe_: yeah
[15:40:46] <_joe_> ah I see the makefile uses blubber basically
[16:05:37] (PS1) Giuseppe Lavagetto: Restart canaries before checking, remove the opcache invalidation function [tools/scap] - https://gerrit.wikimedia.org/r/801767
[16:06:18] <_joe_> jnuche, dancy ^^ is needed before we can stop revalidating opcache
[16:06:45] _joe_: `make test`
[16:06:55] ah you found that. :-)
[16:06:58] <_joe_> dancy: yeah I saw :)
[16:07:26] <_joe_> my problem was that pyyaml 3.13.0 doesn't compile anymore on debian sid
[16:07:49] <_joe_> so I can't build a local venv for my ide
[16:16:34] _joe_: Reviewing the commit now
[16:17:24] <_joe_> dancy: thanks, tomorrow is my last day, but if we don't get to disable opcache revalidation by tomorrow, I'll leave jayme proper instructions to be able to continue with that work
[16:17:34] OK
[16:25:59] hm, would the removed opcache revalidation mean we could sync several files with interdependent changes at once?
[16:26:20] or should we not rely on that and continue to sync compatible changes separately as much as possible?
[16:32:28] GitLab (Infrastructure), serviceops, Patch-For-Review: bring new gitlab hardware servers into production - https://phabricator.wikimedia.org/T307142 (Jelto) Checklist for gitlab migration from `gitlab1001` to `gitlab1004`: **Preparations before downtime:** [x] register second service IPs for `gitlab...
[16:36:28] <_joe_> Lucas_WMDE: sadly there isn't a very simple answer
[16:37:17] <_joe_> for instance, if you're adding new files, they might be evaluated before the restart happens
[16:37:33] <_joe_> to do proper atomic changes we'd need to branch to a new directory
[16:37:47] <_joe_> the answer will be a full "yes" once we move to k8s fully
[16:37:54] *nod*
[16:37:59] <_joe_> also, the jobrunners still do revalidation heh
[16:38:22] then I’ll try to continue doing compatible syncs for now
[16:38:35] (e.g. syncing files that add constants before files that use the new constant)
[16:43:07] (PS1) Ahmon Dancy: Add some additional packages to mediawiki_image_extra_packages [tools/train-dev] - https://gerrit.wikimedia.org/r/801772
[16:43:27] (CR) Ahmon Dancy: [C: +2] Add some additional packages to mediawiki_image_extra_packages [tools/train-dev] - https://gerrit.wikimedia.org/r/801772 (owner: Ahmon Dancy)
[16:43:57] (Merged) jenkins-bot: Add some additional packages to mediawiki_image_extra_packages [tools/train-dev] - https://gerrit.wikimedia.org/r/801772 (owner: Ahmon Dancy)
[16:44:19] <_joe_> Lucas_WMDE: yeah that's best
[16:44:30] (CR) Ahmon Dancy: Restart canaries before checking, remove the opcache invalidation function (3 comments) [tools/scap] - https://gerrit.wikimedia.org/r/801767 (owner: Giuseppe Lavagetto)
[16:45:36] (PS1) Ahmon Dancy: Set php_fpm_always_restart to True in deploy:/etc/scap.cfg [tools/train-dev] - https://gerrit.wikimedia.org/r/801773 (https://phabricator.wikimedia.org/T266055)
[16:46:28] (CR) Ahmon Dancy: [C: +2] Set php_fpm_always_restart to True in deploy:/etc/scap.cfg [tools/train-dev] - https://gerrit.wikimedia.org/r/801773 (https://phabricator.wikimedia.org/T266055) (owner: Ahmon Dancy)
[16:46:56] (Merged) jenkins-bot: Set php_fpm_always_restart to True in deploy:/etc/scap.cfg [tools/train-dev] - https://gerrit.wikimedia.org/r/801773 (https://phabricator.wikimedia.org/T266055) (owner: Ahmon Dancy)
[16:48:09] _joe_: I guess there’s a reason why you only said “slightly more atomic” :P
[16:49:31] (CR) Ahmon Dancy: [V: +1] "Tested in train-dev." [tools/scap] - https://gerrit.wikimedia.org/r/801767 (owner: Giuseppe Lavagetto)
[16:54:30] (PS2) Giuseppe Lavagetto: Restart canaries before checking, remove the opcache invalidation function [tools/scap] - https://gerrit.wikimedia.org/r/801767 (https://phabricator.wikimedia.org/T266055)
[16:54:35] (CR) Giuseppe Lavagetto: Restart canaries before checking, remove the opcache invalidation function (3 comments) [tools/scap] - https://gerrit.wikimedia.org/r/801767 (https://phabricator.wikimedia.org/T266055) (owner: Giuseppe Lavagetto)
[16:59:33] (CR) Ahmon Dancy: [V: +1 C: +2] "LGTM. Tested in train-dev again." [tools/scap] - https://gerrit.wikimedia.org/r/801767 (https://phabricator.wikimedia.org/T266055) (owner: Giuseppe Lavagetto)
[17:04:14] (Merged) jenkins-bot: Restart canaries before checking, remove the opcache invalidation function [tools/scap] - https://gerrit.wikimedia.org/r/801767 (https://phabricator.wikimedia.org/T266055) (owner: Giuseppe Lavagetto)
[17:07:23] !log Upgrading scap to 4.8.0-1+0~20220531170512.289~1.gbp143729 in beta cluster
[17:07:24] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:11:49] Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments: 1.39.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T308068 (thcipriani) a:jeena→dduvall
[17:12:08] Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments: 1.39.0-wmf.14 deployment blockers - https://phabricator.wikimedia.org/T308067 (thcipriani) a:dduvall→jeena
[17:12:16] Release-Engineering-Team, Platform Engineering, Similar Editors, Anti-Harassment (AHaT Sprint 8: The Beret): Configure SimilarEditors in production with Similarusers credentials - https://phabricator.wikimedia.org/T308670 (ARamirez_WMF)
[17:15:03] Beta-Cluster-Infrastructure: Create deployment-deploy04 as future secondary/upgrade - https://phabricator.wikimedia.org/T309437 (dancy) Note that 20GB of storage is inadequate for a deploy server. deploy03 has a total of 60GB of storage split between the root filesystem (20GB, ~6GB used) and the /srv files...
[17:15:59] Project beta-scap-sync-world build #53472: FAILURE in 1 min 6 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/53472/
[17:17:55] Release-Engineering-Team, Scap, SRE, serviceops: Deploy Scap version 4.8.0 - https://phabricator.wikimedia.org/T309116 (thcipriani)
[17:18:49] Release-Engineering-Team (GitLab-a-thon 🦊), serviceops: Debianize releng/jwt-authorizer - https://phabricator.wikimedia.org/T309646 (dduvall)
[17:20:21] Beta-Cluster-Infrastructure: Create deployment-deploy04 as future secondary/upgrade - https://phabricator.wikimedia.org/T309437 (TheresNoTime) >>! In T309437#7971429, @dancy wrote: > Note that 20GB of storage is inadequate for a deploy server. deploy03 has a total of 60GB of storage split between the root...
[17:23:58] GitLab (CI & Job Runners), Release-Engineering-Team (GitLab-a-thon 🦊), Patch-For-Review, User-brennen: Authenticate trusted runners for registry access against GitLab using temporary JSON Web Token - https://phabricator.wikimedia.org/T308501 (dduvall) In progress→Stalled Blocked on review...
[17:24:02] GitLab (CI & Job Runners), Release-Engineering-Team (GitLab-a-thon 🦊), Patch-For-Review, User-brennen: Deploy buildkitd to trusted GitLab runners - https://phabricator.wikimedia.org/T308271 (dduvall)
[17:26:02] Project beta-scap-sync-world build #53473: STILL FAILING in 1 min 6 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/53473/
[17:28:14] Project beta-scap-sync-world build #53474: STILL FAILING in 1 min 5 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/53474/
[17:30:29] (PS1) Ahmon Dancy: Don't suppress backtrace if DshTargetList._get_targets_for_key() fails [tools/scap] - https://gerrit.wikimedia.org/r/801780
[17:33:09] !log Reverted to scap 4.8.0-1+0~20220524160924.288~1.gbp794a08 in beta cluster
[17:33:11] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:34:50] (CR) Ahmon Dancy: [C: +2] Don't suppress backtrace if DshTargetList._get_targets_for_key() fails [tools/scap] - https://gerrit.wikimedia.org/r/801780 (owner: Ahmon Dancy)
[17:37:38] Yippee, build fixed!
[17:37:39] Project beta-scap-sync-world build #53475: FIXED in 1 min 3 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/53475/
[17:38:58] (Merged) jenkins-bot: Don't suppress backtrace if DshTargetList._get_targets_for_key() fails [tools/scap] - https://gerrit.wikimedia.org/r/801780 (owner: Ahmon Dancy)
[17:40:43] !log Upgrading scap to 4.8.0-1+0~20220531173912.291~1.gbp21a7ef in beta cluster
[17:40:52] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:45:51] Project beta-scap-sync-world build #53476: FAILURE in 1 min 4 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/53476/
[17:52:30] GitLab (CI & Job Runners), Release-Engineering-Team (GitLab-a-thon 🦊), Patch-For-Review, User-brennen: Authenticate trusted runners for registry access against GitLab using temporary JSON Web Token - https://phabricator.wikimedia.org/T308501 (dduvall) @akosiaris and/or @Jelto do you have time to...
[17:53:49] GitLab (CI & Job Runners), Release-Engineering-Team (GitLab-a-thon 🦊), Patch-For-Review, User-brennen: Authenticate trusted runners for registry access against GitLab using temporary JSON Web Token - https://phabricator.wikimedia.org/T308501 (dduvall)
[17:55:55] Yippee, build fixed!
[17:55:55] Project beta-scap-sync-world build #53477: FIXED in 1 min 2 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/53477/
[19:34:06] Release-Engineering-Team, Data-Persistence (Consultation), Security-API-Service, Security-Team, and 4 others: Determine CI best practices for service which connects to MySQL - https://phabricator.wikimedia.org/T308789 (sbassett)
[20:18:27] (PS1) Ahmon Dancy: Don't attempt php fpm restart if not configured [tools/scap] - https://gerrit.wikimedia.org/r/801807 (https://phabricator.wikimedia.org/T266055)
[20:26:01] (PS1) SBassett: Disable email reporting to security admin feed [integration/config] - https://gerrit.wikimedia.org/r/801808 (https://phabricator.wikimedia.org/T309655)
[20:47:50] Beta-Cluster-Infrastructure, Horizon, cloud-services-team (Kanban): Two volumes not deleting/creating on deployment-prep - https://phabricator.wikimedia.org/T309659 (bd808) >>! In T309659#7972123, @TheresNoTime wrote: > (wrong tag, apologies #wmcs-team) You had it right the first time. :) We have a...
[21:07:04] (CR) Ahmon Dancy: [C: +2] Don't attempt php fpm restart if not configured [tools/scap] - https://gerrit.wikimedia.org/r/801807 (https://phabricator.wikimedia.org/T266055) (owner: Ahmon Dancy)
[21:11:07] (Merged) jenkins-bot: Don't attempt php fpm restart if not configured [tools/scap] - https://gerrit.wikimedia.org/r/801807 (https://phabricator.wikimedia.org/T266055) (owner: Ahmon Dancy)
[21:12:05] Project beta-update-databases-eqiad build #58982: FAILURE in 46 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/58982/
[21:16:02] !log Upgrading scap to 4.8.0-1+0~20220531211114.292~1.gbp8dbbcf in beta cluster
[21:16:03] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:19:22] Release-Engineering-Team (Radar), tech-decision-forum: Beta Cluster Tech Decision Forum - https://phabricator.wikimedia.org/T308283 (Ladsgroup) Regarding data and databases in production. You could potentially have a dedicated section in databases and make sure the appservers in test environment only be...
[21:25:38] Yippee, build fixed!
[21:25:38] Project beta-update-databases-eqiad build #58983: FIXED in 5 min 37 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/58983/
[21:31:25] Yay!
[22:24:00] Release-Engineering-Team (GitLab-a-thon 🦊), serviceops: Debianize releng/jwt-authorizer - https://phabricator.wikimedia.org/T309646 (Dzahn) We will need an upstream tarball that includes a ./vendor/ directory with all the needed artifacts. "go mod vendor" created such a directory for me when I tried. It...
[23:00:06] Release-Engineering-Team, SRE, SRE-OnFire, Sustainability: Remove old scap repositories from deploy1002 - https://phabricator.wikimedia.org/T309162 (Dzahn) @jcrespo You are correct. In that case I still don't understand what this ticket is really asking for, first I thought it was about both depl...
[23:20:21] Beta-Cluster-Infrastructure, Wikistories: Call to undefined method ForeignDBFile::getExtendedMetadata() - https://phabricator.wikimedia.org/T309668 (AlexisJazz)
[23:32:12] GitLab (Infrastructure), Data-Persistence-Backup, serviceops, Patch-For-Review, User-brennen: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (Dzahn) Also, after manually starting the "backup-restore" service on gitlab2001, which was still alerting in Icinga, we now have: 23:2...
[23:36:57] GitLab (Infrastructure), serviceops, Patch-For-Review: gitlab-restore: version detection fail / restore fail - https://phabricator.wikimedia.org/T308089 (Dzahn) After T274463#7966543 ff we now have a working backup-restore service again: <+icinga-wm> RECOVERY - Check systemd state on gitlab2001 is O...
[23:37:19] GitLab (Infrastructure), serviceops, Patch-For-Review: gitlab-restore: version detection fail / restore fail - https://phabricator.wikimedia.org/T308089 (Dzahn) In progress→Resolved