[06:28:55] hello folks!
[06:31:41] I have some changeprop things to propose
[06:32:22] 1) Upgrade to Buster - https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/943037. The next step will be to upgrade to nodejs12/bullseye, but as an intermediate step migrating to buster and keeping nodejs10 seems ok as well.
[06:34:31] 2) https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/943038/1 and next - afaics the prometheus statsd exporter in changeprop is heavily throttled (almost constantly), I fear that this is why we lose metrics
[06:56:51] for example, https://grafana.wikimedia.org/d/hyl18XgMk/kubernetes-container-details?orgId=1&var-datasource=eqiad%20prometheus%2Fk8s&var-namespace=changeprop&var-pod=changeprop-production-76447bc6bf-nwdds&var-container=All is one of the pods most affected
[07:20:09] ah wow afaics eventgate runs on buster + nodejs 10 sigh
[07:20:25] <_joe_> elukey: sigh
[07:20:46] <_joe_> elukey: now a lot of service owners will have an interesting wake-up call regarding that
[07:21:22] <_joe_> buster, old versions of nodejs
[07:21:58] <_joe_> elukey: apart from that, I don't know if I agree with upgrading to buster or anything else before there's someone with ownership
[07:22:14] <_joe_> actually, I disagree but I want to take this to the team
[07:23:37] _joe_ I think a bare minimum upgrade is needed, I am just a worried "customer" since ML is planning on using it for streams.. I think that upgrading to buster + fixing the metrics is overdue, all the rest can become something to discuss for sure
[07:23:54] (well ML is already using it for streams)
[07:23:58] <_joe_> elukey: I agree it's overdue
[07:24:09] <_joe_> and please tell your manager that changeprop is unowned
[07:24:14] <_joe_> and that it's blocking you
[07:24:24] <_joe_> see why things never get properly fixed here?
[07:24:39] <_joe_> you do some work that's not on you out of goodwill/need to unblock yourself
[07:24:49] <_joe_> and the problem gets kicked down the road by a year
[07:28:50] sure sure I 100% agree
[09:05:01] https://usercontent.irccloud-cdn.com/file/1EG9jNo7/7u9wk4.jpg
[09:12:40] <_joe_> ahahahah
[09:14:09] <_joe_> now the other patches require a bit more scrutiny I think
[09:14:20] <_joe_> although I tested all I could locally
[09:17:50] is it possible to load them in mwdebug and test them?
[09:31:16] <_joe_> I guess so, but we also need to add the vhost for noc
[09:31:54] <_joe_> Amir1: tell me when you want to run such a test
[09:32:06] <_joe_> I added tests for most things I could think of at least
[09:32:21] sounds good
[09:36:46] 10serviceops, 10DC-Ops, 10SRE, 10ops-eqiad: hw troubleshooting: CPU machine check failure for parse1002.eqiad.wmnet - https://phabricator.wikimedia.org/T339340 (10Clement_Goubert)
[09:38:00] Amir1: lmao
[09:38:55] xD
[10:25:34] just discovered that somebody outside the WMF runs a changeprop instance (with UA SampleChangePropInstance, the default in the repo's config)
[10:25:45] and hits the /precache ORES URI as well
[10:25:46] lol
[10:26:07] O_O
[10:26:09] lmao
[10:26:18] I wonder how old it is
[10:26:57] hnowlan: https://logstash.wikimedia.org/goto/d616b4f5a215b46be5d14f2baa37e121
[10:27:02] same IP
[10:28:43] no idea how to reach out
[10:28:55] maybe I can add a requestctl rule
[10:29:12] That's a lot more requests than I was expecting.
[10:31:18] interestingly, it seems that it started around the 21st
[10:31:24] and ramped up since then
[10:37:51] 10serviceops, 10DC-Ops, 10SRE, 10ops-eqiad: hw troubleshooting: CPU machine check failure for parse1002.eqiad.wmnet - https://phabricator.wikimedia.org/T339340 (10Clement_Goubert) 05Open→03Resolved Resolving for now, we will reopen if issues reappear.
[11:09:22] 10serviceops, 10SRE, 10Wikimedia-Site-requests, 10Performance-Team (Radar): Raise limit of $wgMaxArticleSize for Hebrew Wikisource - https://phabricator.wikimedia.org/T275319 (10Fuzzy) >>! In T275319#9054277, @stjn wrote: > Wikisource editors can absolutely split pages into smaller ones, since those longer...
[11:29:38] 10serviceops, 10wikidiff2, 10Better-Diffs-2023, 10Community-Tech (CommTech-Kanban): Deploy wikidiff2 1.14.1 - https://phabricator.wikimedia.org/T340087 (10MoritzMuehlenhoff) There were some hosts still on 1.13 (cloudweb, mwmaint, deployment servers, scandium, snapshot) and parse1002 (which was down during...
[11:31:34] 10serviceops, 10wikidiff2, 10Better-Diffs-2023, 10Community-Tech (CommTech-Kanban): Deploy wikidiff2 1.14.1 - https://phabricator.wikimedia.org/T340087 (10Clement_Goubert) >>! In T340087#9054939, @MoritzMuehlenhoff wrote: > There were some hosts still on 1.13 (cloudweb, mwmaint, deployment servers, scandiu...
[12:12:06] <_joe_> elukey: yeah let's block this?
[12:42:35] _joe_ back sorry, I am thinking of a requestctl rule targeting the IP, does it make sense?
[12:50:41] 10serviceops, 10SRE, 10Wikimedia-Site-requests, 10Performance-Team (Radar): Raise limit of $wgMaxArticleSize for Hebrew Wikisource - https://phabricator.wikimedia.org/T275319 (10Alexey_Skripnik) >>! In T275319#9054275, @stjn wrote: > For the record, I don't think that the need to be able to build even long...
[12:57:41] 10serviceops, 10SRE, 10Wikimedia-Site-requests, 10Performance-Team (Radar): Raise limit of $wgMaxArticleSize for Hebrew Wikisource - https://phabricator.wikimedia.org/T275319 (10stjn) ‘Readers expect us to dump everything on one page’ is just your opinion, and so is ‘from usability standpoint, it’s better...
[13:08:14] 10serviceops, 10SRE, 10Wikimedia-Site-requests, 10Performance-Team (Radar): Raise limit of $wgMaxArticleSize for Hebrew Wikisource - https://phabricator.wikimedia.org/T275319 (10Alexey_Skripnik) >>! In T275319#9055247, @stjn wrote: > From usability standpoint, it’s better to have a page that doesn’t weigh...
[13:29:39] <_joe_> elukey: the UA actually
[13:29:57] <_joe_> create a per-IP throttle for people calling us with that generic UA
[13:30:09] <_joe_> that was my idea
[13:30:13] <_joe_> I can get on it later
[13:30:33] yep yep, I'll use Superset's requestctl rule generator later on
[13:52:33] <_joe_> elukey: oh no that's verbose and creates rules that are not properly linted
[13:52:45] <_joe_> it shows the author has put no spicerack in that implementation
[13:54:55] you don't have to follow it to the letter, aggregation is left for the user as an exercise
[13:55:13] as for linting... what's the issue?
patches are welcome you know ;)
[14:00:15] Joe still has it, I cannot catch Riccardo like this
[14:07:47] <_joe_> the difference is you don't first make him feel like he's on the wrong side of a linter
[14:13:14] x)
[14:38:43] Ok so this is the baseline: https://superset.wikimedia.org/requestctl-generator?q=d7VvNGwB1X0
[14:39:10] one more note - this particular UA acts as changeprop and hits /v3/precache, which in turn warms up the Redis cache
[14:39:30] a complete block may also be ok
[14:39:36] but we can start with throttling
[14:40:23] I can stage the above if people are ok, then some soul can review and I'll apply
[14:58:03] <_joe_> elukey: seems legit
[15:08:33] 10serviceops, 10Abstract Wikipedia team, 10Service-deployment-requests: New Service Request memcached-wikifunctions - https://phabricator.wikimedia.org/T297815 (10Jdforrester-WMF) p:05Triage→03Medium
[15:52:48] 10serviceops, 10Abstract Wikipedia team, 10Service-deployment-requests: New Service Request memcached-wikifunctions - https://phabricator.wikimedia.org/T297815 (10Joe) We have already procured the servers for this work, and they're set up already.
[17:19:59] 10serviceops, 10Abstract Wikipedia team, 10Service-deployment-requests: New Service Request memcached-wikifunctions - https://phabricator.wikimedia.org/T297815 (10Jdforrester-WMF) >>! In T297815#9056015, @Joe wrote: > We have already procured the servers for this work, and they're set up already. Yup, I ju...
[18:30:15] 10serviceops, 10Abstract Wikipedia team, 10Service-deployment-requests: New Service Request memcached-wikifunctions - https://phabricator.wikimedia.org/T297815 (10Joe) >>! In T297815#9056348, @Jdforrester-WMF wrote: >>>! In T297815#9056015, @Joe wrote: >> We have already procured the servers for this work, a...
[21:23:24] 10serviceops, 10SRE, 10Wikimedia-Site-requests, 10Performance-Team (Radar): Raise limit of $wgMaxArticleSize for Hebrew Wikisource - https://phabricator.wikimedia.org/T275319 (10Vladis13) >>! In T275319#9054277, @stjn wrote: > Wikisource editors can absolutely split pages into smaller ones, since those lon...
[22:02:35] 10serviceops, 10SRE, 10Wikimedia-Site-requests, 10Performance-Team (Radar): Raise limit of $wgMaxArticleSize for Hebrew Wikisource - https://phabricator.wikimedia.org/T275319 (10stjn) >>! In T275319#9055301, @Alexey_Skripnik wrote: > Could you elaborate on why serving 2.3 Mb of HTML is bad from a usability...
[22:49:36] 10serviceops, 10SRE, 10Wikimedia-Site-requests, 10Performance-Team (Radar): Raise limit of $wgMaxArticleSize for Hebrew Wikisource - https://phabricator.wikimedia.org/T275319 (10Alexey_Skripnik) >>! In T275319#9057235, @stjn wrote: > Because heavy pages load worse for readers, especially on poorer connecti...
[23:15:15] 10serviceops, 10SRE, 10Wikimedia-Site-requests, 10Performance-Team (Radar): Raise limit of $wgMaxArticleSize for Hebrew Wikisource - https://phabricator.wikimedia.org/T275319 (10Vladis13) >>! In T275319#9057235, @stjn wrote: > (@Vladis13 please keep in mind https://www.mediawiki.org/wiki/Bug_management/Pha...
[23:44:29] 10serviceops, 10SRE, 10Wikimedia-Site-requests, 10Performance-Team (Radar): Raise limit of $wgMaxArticleSize for Hebrew Wikisource - https://phabricator.wikimedia.org/T275319 (10Reedy) None of this is helping move the discussion forward. Timo's comment in T275319#7947012 is still relevant. And at the sam...
[23:45:58] 10serviceops, 10SRE, 10Wikimedia-Site-requests, 10Performance-Team (Radar): Raise limit of $wgMaxArticleSize for Hebrew Wikisource - https://phabricator.wikimedia.org/T275319 (10Vladis13) >>! In T275319#9057297, @Alexey_Skripnik wrote: > Readers don't care directly about the weight of a webpage's HTML. Wha...