[06:04:05] serviceops, GitLab, Release-Engineering-Team: LDAP group sync fails on gitlab1004: EOFError: EOF when reading a line - https://phabricator.wikimedia.org/T347132 (Jelto)
[06:24:28] serviceops, observability, SRE Observability (FY2023/2024-Q1): Hardcode the SLO time windows in Grafana dashboards generated via Grizzly - https://phabricator.wikimedia.org/T346144 (elukey) Open→Resolved a:elukey Change merged! Thanks to all for the feedback :) I'd say that we can close,...
[08:09:30] serviceops, MW-on-K8s, MediaWiki-Configuration, MediaWiki-Platform-Team (Radar), and 3 others: Uncaught ConfigException: Failed to load configuration from etcd - https://phabricator.wikimedia.org/T346971 (JMeybohm) p:Triage→Low I've checked the logs from September (https://logstash.wikime...
[08:57:59] serviceops, ChangeProp, Prod-Kubernetes, SRE-Sprint-Week-Sustainability-March2023, and 2 others: Raise an alarm on container restarts/OOMs in kubernetes - https://phabricator.wikimedia.org/T256256 (JMeybohm)
[08:59:11] serviceops, Kubernetes: Document how k8s logging works - https://phabricator.wikimedia.org/T289639 (JMeybohm) Open→Resolved a:JMeybohm
[08:59:56] serviceops, Machine-Learning-Team, Observability-Metrics, Kubernetes: Don't scrape every containerPort for metrics - https://phabricator.wikimedia.org/T318707 (JMeybohm) Open→Resolved a:JMeybohm
[09:00:29] serviceops, Prod-Kubernetes, Kubernetes: cfssl-issuer: Generate Kubernetes Events - https://phabricator.wikimedia.org/T337928 (JMeybohm)
[09:01:23] serviceops, Prod-Kubernetes, Kubernetes: Create kube-state-metrics docker image - https://phabricator.wikimedia.org/T343801 (JMeybohm)
[09:01:42] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Helm chart packaging should update dependencies - https://phabricator.wikimedia.org/T316347 (JMeybohm)
[09:02:01] serviceops, Prod-Kubernetes, Kubernetes: Integrate kube-metrics-server into our infrastructure - https://phabricator.wikimedia.org/T249929 (JMeybohm)
[09:07:53] serviceops, Infrastructure-Foundations, Prod-Kubernetes, SRE, and 4 others: Create a cookbook to perform a rolling reboot of a kubernetes cluster - https://phabricator.wikimedia.org/T260661 (JMeybohm)
[09:10:34] serviceops, Prod-Kubernetes, Kubernetes: Reduction of Secret-based Service Account Tokens - https://phabricator.wikimedia.org/T345892 (JMeybohm)
[09:12:19] serviceops, Prod-Kubernetes, Shared-Data-Infrastructure, Kubernetes: Update Kubernetes clusters to >1.25 - https://phabricator.wikimedia.org/T341984 (JMeybohm) p:Medium→High
[09:13:46] serviceops, Prod-Kubernetes, Kubernetes: [EPIC] Docker deprecation as a container runtime enginer for kubernetes. - https://phabricator.wikimedia.org/T269684 (JMeybohm) p:Triage→High
[09:45:25] serviceops, CX-cxserver, RESTBase Sunsetting, Language-Team (Language-2023-July-September): Make cxserver call parsoid endpoints on MediaWiki, instead of going through RESTbase - https://phabricator.wikimedia.org/T344982 (MSantos) @Nikerabbit I would like to know if the Language team can prioriti...
[09:54:35] serviceops, CX-cxserver, RESTBase Sunsetting, Language-Team (Language-2023-July-September): Make cxserver call parsoid endpoints on MediaWiki, instead of going through RESTbase - https://phabricator.wikimedia.org/T344982 (daniel) Note that there are two options: 1) call the parsoid endpoints exp...
[10:26:36] Hi serviceops! Growth's having a mysterious issue with one of its jobs, which is basically unable to find `Cdb\Exception`, even though the class can be found well in normal circumstances. I summarized what is known about the issue so far at https://phabricator.wikimedia.org/T344428#9173651, but we'd appreciate if serviceops could have a look and advise what might be happening here. Thanks!
[10:35:08] <_joe_> urbanecm: I saw the task but didn't have time to dig into it
[10:35:43] <_joe_> one thing that wasn't clear to me in a cursory read is
[10:35:58] <_joe_> that error comes from the jobrunners or from mwmaint?
[10:37:28] <_joe_> because if it comes from mwmaint, it's one class of potential issues
[10:37:49] <_joe_> if it's coming from the jobs themselves, I fear there's something more subtle going on
[10:38:39] _joe_: It's coming from the jobrunners. I tried to reproduce the error locally at mwmaint and I didn't succeed.
[10:39:19] (The script has a parameter that controls whether it should fire Jobs or do the work locally)
[10:40:02] <_joe_> urbanecm: ok then the two differences I see are that a job is run by php-fpm
[10:40:16] <_joe_> and if you don't use the jobqueue you're on cli
[10:40:28] <_joe_> which means no opcache and no apc
[10:47:25] Good to know
[11:06:17] <_joe_> so the problem is most probably somewhere around those
[11:30:33] <_joe_> urbanecm: I left a comment on the phab task
[11:32:02] ty
[12:56:13] serviceops, AQS2.0, Cassandra, SRE, Service-deployment-requests: AQS 2.0 differentially private pageviews deploy API - https://phabricator.wikimedia.org/T343855 (JAllemandou) It feels wrong to me to be willing to return all page views on a date: the result set would be enormous and wouldn't b...
[14:10:18] serviceops, AQS2.0, Cassandra, SRE, Service-deployment-requests: AQS 2.0 differentially private pageviews deploy API - https://phabricator.wikimedia.org/T343855 (Eevans) >>! In T343855#9190791, @JAllemandou wrote: > It feels wrong to me to be willing to return all page views on a date: the re...
[16:36:12] serviceops, Wikimedia-Site-requests, Campaign-Registration, Campaign-Tools (Campaign-Tools-Current-Sprint), Patch-For-Review: Configure the aggregation job to run periodically on Wikimedia wikis - https://phabricator.wikimedia.org/T339984 (ifried)