[07:09:10] serviceops, SRE, Traffic, Performance-Team (Radar): Reconcile MediaWiki POST timeout and Varnish/ATS timeouts - https://phabricator.wikimedia.org/T294800 (Legoktm) >>! In T294800#7473846, @Joe wrote: > If anything, I think we should go in the other direction, and progressively and drastically red...
[07:32:34] serviceops, SRE, Traffic, Performance-Team (Radar): Reconcile MediaWiki POST timeout and Varnish/ATS timeouts - https://phabricator.wikimedia.org/T294800 (Joe) >>! In T294800#7480542, @Legoktm wrote: > Sidenote, I wonder if we can get some basic stats from the envoy metrics about how many POST r...
[07:34:57] serviceops, SRE, Traffic, Performance-Team (Radar): Reconcile MediaWiki POST timeout and Varnish/ATS timeouts - https://phabricator.wikimedia.org/T294800 (Joe) Let me add another data point: Of those 8 requests over 175 seconds, only 2 were to POSTs to Special:Upload.
[07:41:13] serviceops, SRE, MW-1.38-notes (1.38.0-wmf.6; 2021-10-26), Patch-For-Review, Sustainability: Jobrunner timeouts on cross-DC file uploads because of HTTP/2 - https://phabricator.wikimedia.org/T275752 (Legoktm)
[07:41:28] serviceops, SRE, MW-1.38-notes (1.38.0-wmf.6; 2021-10-26), Patch-For-Review, Sustainability: Jobrunner timeouts on cross-DC file uploads because of HTTP/2 - https://phabricator.wikimedia.org/T275752 (Legoktm)
[07:58:41] serviceops, Community-Tech, SRE, wikidiff2, and 2 others: Deploy wikidiff2 1.13.0 - https://phabricator.wikimedia.org/T285857 (Nardog)
[08:09:11] serviceops, Internet-Archive, SRE: Improve download speed from archive.org on appservers - https://phabricator.wikimedia.org/T295009 (Legoktm)
[10:40:06] Hello. Is this the best channel to ask about if/when/how we might be able to instantiate a new group runner in GitLab? Many thanks.
[10:54:44] btullis: Hello. Most work around GitLab Runner was done by RelEng. So you could head over to #wikimedia-releng and ping ddu.vall or bren.nen. I might be able to help as well, but RelEng folks have way more experience with the current Runner setup and should be able to help better :)
[11:00:24] jelto: Great, thanks. I'll try my luck in #wikimedia-releng for now. For background info, it relates to this idea/requirement for deploying Airflow DAGs to the Analytics servers. https://phabricator.wikimedia.org/T286958#7450771
[11:43:56] serviceops, Citoid, VisualEditor, WMSE-Bug-Reporting-and-Translation-2021, and 2 others: Automatic citation generation using ISBN on Wikipedia doesn't work - https://phabricator.wikimedia.org/T294010 (Mvolz) > > It's ~70 rps at those peaks. They are most definitely violating https://www.mediawi...
[12:07:19] btullis: most of releng is on USA west coast and are sleeping right now
[12:08:06] btullis: on top of irc, you can file a task in Phabricator against #gitlab (it has a "CI & Jobrunners" column)
[12:09:07] hashar: Ah, thanks. Good to know. I'll do that. 👍
[13:16:13] jelto: thanks for doing the comment templates upgrade stuff, sorry about eventstreams/eventgate. i really want to get them inline to use common template soon
[14:02:42] ottomata: hey. Yes it would be nice if all services share the same templates and structure. I think it shouldn't be too much work to migrate to the common templates. If you need a review or help feel free to ping or tag me
[14:44:25] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Upgrade kubernetes clusters to a security supported (LTS) version - https://phabricator.wikimedia.org/T244335 (Jelto)
[14:45:20] serviceops, SRE, Kubernetes, Patch-For-Review: Migrate to helm v3 - https://phabricator.wikimedia.org/T251305 (Jelto) Open→In progress I migrated all services in `staging` to helm3 using the snippet https://phabricator.wikimedia.org/P17671. It took around 1 hour. helm2 list shows no relea...
[14:59:07] what do you think of my suggestion to change the mw cumin aliases: https://gerrit.wikimedia.org/r/c/operations/puppet/+/736596
[15:01:40] don't want to unexpectedly change the meaning of an alias that people might use in their standard workflows, so I'll mention it in team meeting
[15:02:16] mutante: if you change those please make sure to git grep the cookbooks repo to see if they are used in any cookbook
[15:02:49] the switchdc ones come to mind that might use them
[15:03:31] ACK, that was the other thought.. "it's possible they are used by other tools". will be careful before merging that
[15:04:42] spotted syntax issue, one more PS
[15:17:20] serviceops, Release-Engineering-Team, Scap: Deploy Scap version 4.0.3 - https://phabricator.wikimedia.org/T294966 (dancy) @Legoktm Please roll forward to 4.0.3 again. The most recent problem in T294936 is not a bug in scap but a change in behavior in Python 3 that requires changes to some templates.
[15:17:22] there is no reason parsoid servers would care about font packages, right? let's also purge the packages from parsoid-canary?
[15:17:45] because wtp/parse are like appservers nowadays of course, so they all have them
[15:45:32] <_joe_> yep
[15:45:36] <_joe_> +1
[15:45:49] <_joe_> mutante: I would *not* change existing aliases, but create new ones
[15:53:20] serviceops, Citoid, VisualEditor, WMSE-Bug-Reporting-and-Translation-2021, and 2 others: Automatic citation generation using ISBN on Wikipedia doesn't work - https://phabricator.wikimedia.org/T294010 (akosiaris) >>! In T294010#7481262, @Mvolz wrote: > >> >> It's ~70 rps at those peaks. They are...
[15:53:31] _joe_: hmm, I was trying to keep "mw" but change its meaning and drop instead the "all-mw-*" ones, I'll wait on that one. and thanks for the confirmation on the parsoid part
[15:54:01] <_joe_> mutante: I'll try to see if I can come up with an alternative on the patch
[15:54:56] :) cool! no rush, I am more into the LVS one
[15:55:18] serviceops, Scap, Patch-For-Review, Release-Engineering-Team (Doing): RESTBase deployment fails with scap internal error - https://phabricator.wikimedia.org/T294936 (Pchelolo) Open→Resolved a:Pchelolo Success! Thank you @dancy
[15:55:25] <_joe_> yeah sorry I have to finish this mammoth patchset that keeps growing like an avalance
[15:55:29] <_joe_> *avalanche
[15:56:00] no worries!:)
[15:56:25] serviceops, Scap, Patch-For-Review, Release-Engineering-Team (Doing): RESTBase deployment fails with scap internal error - https://phabricator.wikimedia.org/T294936 (dancy) Great news!
[16:14:53] serviceops, Internet-Archive, SRE: Improve download speed from archive.org on appservers - https://phabricator.wikimedia.org/T295009 (Yann) Slow bandwidth from IA seems indeed the issue. I expected that upload-by-url (i.e. direct transfer from IA servers to WM servers) would just trump any limit to an...
[16:48:46] FYI serviceopsen :) https://gerrit.wikimedia.org/r/c/operations/puppet/+/736823/
[16:50:14] <_joe_> heh arrived late for my +1 :D
[16:52:37] 🎉🎉🎉
[16:53:01] ah :)
[16:54:38] ✨
[17:17:14] serviceops, Data-Engineering, Platform Engineering, Wikibase change dispatching scripts to jobs: Better observability/visualization for MediaWiki jobs - https://phabricator.wikimedia.org/T291620 (odimitrijevic)
[19:10:36] serviceops, SRE, Patch-For-Review, Release-Engineering-Team (Radar): Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (Legoktm) Open→Resolved
[19:10:47] serviceops, Performance-Team (Radar): Migrate WMF Production from PHP 7.2 to PHP 7.4 - https://phabricator.wikimedia.org/T271736 (Legoktm)
[20:59:56] serviceops, MediaWiki-General, SRE, MW-1.35-notes (1.35.0-wmf.28; 2020-04-14), and 5 others: Some pages will become completely unreachable after PHP7 update due to Unicode changes - https://phabricator.wikimedia.org/T219279 (Pchelolo) Open→Resolved Ok, fixed all the stuck renames. Done.
[21:13:21] serviceops, Data-Engineering, Platform Engineering, Wikibase change dispatching scripts to jobs: Better observability/visualization for MediaWiki jobs - https://phabricator.wikimedia.org/T291620 (Ottomata) @Ladsgroup @Michael let us know if having the MW job event data in Hadoop would be useful,...