[06:43:45] serviceops, Prod-Kubernetes, Patch-For-Review: Better management for helm charts - https://phabricator.wikimedia.org/T320782 (Joe)
[07:06:16] good morning folks
[07:06:30] powercycled parse1002, it was stuck in weird state
[07:07:55] <_joe_> elukey: thanks, I was about to go check
[08:09:42] Morning
[08:35:04] serviceops, Performance-Team: Migrate WMF production from PHP 7.4 to PHP 8.1 - https://phabricator.wikimedia.org/T319432 (hashar) >>! In T319432#8315497, @Jdforrester-WMF wrote: > The wmf-quibble jobs are rather expensive. Do we really need to run these for 8.0 as well as 8.1 if we're not going planning...
[08:49:41] serviceops, SRE, observability, Maps (Kartotherian): Get Kartotherian SLO metrics into Prometheus - https://phabricator.wikimedia.org/T320748 (hnowlan) I'd be curious to hear @Jgiannelos's input on this one - if we want to not bother rewriting Kartotherian to speak to Prometheus directly via the...
[08:58:20] serviceops, Data Engineering Planning, SRE, Event-Platform Value Stream (Sprint 02), Patch-For-Review: eventgate chart should use common_templates - https://phabricator.wikimedia.org/T303543 (Clement_Goubert) In preparation of the redeploy, I lowered the TTL for service discovery to 30 second...
[09:00:16] serviceops, SRE, observability, Maps (Kartotherian): Get Kartotherian SLO metrics into Prometheus - https://phabricator.wikimedia.org/T320748 (Jgiannelos) The effort required to configure service runner to migrate from statsd to prometheus is not that much (its abstracted so its a matter of confi...
[09:46:27] serviceops, Security-Team, serviceops-collab, GitLab (CI & Job Runners), and 3 others: Setup GitLab Runner in trusted environment - https://phabricator.wikimedia.org/T295481 (Jelto)
[10:01:15] serviceops, Security-Team, serviceops-collab, GitLab (CI & Job Runners), and 3 others: Setup GitLab Runner in trusted environment - https://phabricator.wikimedia.org/T295481 (Jelto) In progress→Resolved Firewall issues are solved with the patch above. So the Trusted Runners are functional...
[10:41:07] serviceops, SRE, observability, Maps (Kartotherian): Get Kartotherian SLO metrics into Prometheus - https://phabricator.wikimedia.org/T320748 (hnowlan) I hadn't considered how we get traffic to Kartotherian - for the most part we just directly rewrite requests for maps.wikimedia.org to kartotheri...
[10:56:28] Morning all. I'm still seeking reviews on the spark production-image change, if anyone has any time: https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/838151
[12:28:01] serviceops, Data Engineering Planning, SRE, Event-Platform Value Stream (Sprint 03), Patch-For-Review: eventgate chart should use common_templates - https://phabricator.wikimedia.org/T303543 (lbowmaker)
[12:32:33] claime: o/ i'm here, doing meetings, emails etc. but am here! :)
[12:33:48] ottomata: Awesome :) I've reduced the TTLs preemptively so we don't have to wait 5 minutes at each switch. lmk if there's a specific order you want me to do the redeploys in.
[12:34:21] ottomata: I can wait for you to be out of meetings if you want to pair
[12:46:13] claime: order doesn't matter, i usually do eventgate-main last tho.
[12:46:21] ack
[12:46:41] yeah lets wait, i'd love to pair, especially for the depool
[12:46:46] ty
[12:48:12] np
[13:16:47] ok claime les gooo
[13:16:57] ottomata: You can join me on meet, I'm there :)
[13:18:55] oh oh the cal
[13:18:55] k
[14:17:26] serviceops, SRE, Thumbor, Thumbor Migration, and 2 others: Migrate thumbor to Kubernetes - https://phabricator.wikimedia.org/T233196 (hnowlan)
[14:20:37] serviceops, SRE, User-WDoran, User-brennen: Canaries canaries canaries - https://phabricator.wikimedia.org/T210143 (jijiki)
[14:21:08] serviceops, Patch-For-Review: Improve Scap2 testing - https://phabricator.wikimedia.org/T216518 (jijiki) Open→Resolved a:jijiki Bluntly closing this task.
[14:46:48] serviceops, Data Engineering Planning, SRE, Event-Platform Value Stream (Sprint 03), Patch-For-Review: eventgate chart should use common_templates - https://phabricator.wikimedia.org/T303543 (Clement_Goubert) Open→Resolved All eventgate services redeployed, including staging environme...
[14:46:54] serviceops, SRE, Patch-For-Review, good first task: Upgrade all deployment charts to use the latest version of common_templates - https://phabricator.wikimedia.org/T292390 (Clement_Goubert)
[15:09:07] serviceops, Data Engineering Planning, SRE, Event-Platform Value Stream (Sprint 03), Patch-For-Review: eventgate chart should use common_templates - https://phabricator.wikimedia.org/T303543 (Ottomata) Yeehaw thank you so much Clem!
[15:53:39] serviceops, SRE, Thumbor, Service-deployment-requests: New Service Request Wikimedia-Thumbor - https://phabricator.wikimedia.org/T304436 (hnowlan) Open→Invalid
[15:53:55] serviceops, SRE, Thumbor, Service-deployment-requests: New Service Request Wikimedia-Thumbor - https://phabricator.wikimedia.org/T304436 (hnowlan) Closing in favour of T233196 for main tracking
[16:07:49] serviceops, Observability-Logging, SRE: rsyslogd: omkafka: action will suspended due to kafka error -187: Local: All broker connections are down - https://phabricator.wikimedia.org/T240560 (jijiki) Open→Resolved a:jijiki I am closing this as it appears that it is not an issue any more, wi...
[16:11:46] serviceops, Scap, Release-Engineering-Team (Seen): Missing annotations for sync-wikiversions - https://phabricator.wikimedia.org/T235787 (jijiki) Open→Resolved a:jijiki
[16:19:58] serviceops, SRE, MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), MW-1.38-notes (1.38.0-wmf.19; 2022-01-24), and 2 others: Undeploy graphoid - https://phabricator.wikimedia.org/T242855 (jijiki)
[17:53:53] serviceops, Discovery-Search, SRE, serviceops-collab, and 2 others: Sunset search.wikimedia.org service - https://phabricator.wikimedia.org/T316296 (mpopov) > Just for clarification, we are talking about the service named `apple-search` in service discovery and not `search` or `search-https`, as...
[18:23:17] If I remove a service from hieradata/common/service.yaml then on which hosts will there be an actual change?
[18:25:24] LVS I suppose.. and alertmanager.. hmm
[18:36:45] no. none of those but conf*. right...
[18:40:45] serviceops, Phabricator, serviceops-collab, Patch-For-Review, Release-Engineering-Team (Bonus Level 🕹️): Deprecate git-ssh service on phabricator.wikimedia.org - https://phabricator.wikimedia.org/T296022 (Dzahn) shutdown was announced in today's SRE meeting
[19:06:25] serviceops, Parsoid (Tracking): Parsoid deb: Error with apt-get update - https://phabricator.wikimedia.org/T242757 (jijiki) Open→Invalid ParsoidJS is no more
[19:15:52] serviceops, Performance-Team: Migrate WMF production from PHP 7.4 to PHP 8.1 - https://phabricator.wikimedia.org/T319432 (Krinkle) p:Triage→Medium