[05:53:56] Hello 👋 anyone around for reviewing an admin helmfile.d change? :) https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/927998
[06:25:03] jelto: +1ed
[06:25:22] thanks!
[06:36:38] you're welcome
[06:51:55] Deployment of admin helm chart to add additional tlsExtraSANs to miscweb worked fine in staging.
[06:51:55] Any objections to continuing in eqiad/codfw after the MediaWiki infrastructure window (ending in 10 minutes)? Diff in production shows only a change in the miscweb certificate (similar to staging).
[07:17:27] jelto: no objections
[07:23:52] thanks, done!
[11:22:00] Hi! Is there any way we can check for a given k8s deployment in the past what was applied by helm?
[11:33:53] You can get the manifest for a release if it's one of the last 10
[11:34:11] CI job output may help if it's still in retention
[11:34:27] nemo-yiannis: ^
[11:35:07] It is one of the last 10, any hint on how?
[11:36:08] nemo-yiannis: You can get the release revision with helm history
[11:36:15] thanks!
[11:36:19] Then helm get manifest
[11:36:24] Then helm get manifest --revision
[11:36:28] sorry
[11:37:01] If it doesn't work it's because of perms, I think there's a kubeconfig trick
[12:57:40] hi! looking for review on https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/684855 - using vendor egress templates for eventgate
[13:02:11] ottomata: LGTM
[13:05:04] same :)
[13:15:05] serviceops, Mobile-Content-Service, Product-Infrastructure-Team-Backlog-Deprecated, SRE, Traffic: Setup allowed list for MCS decom - https://phabricator.wikimedia.org/T340036 (MSantos)
[13:15:43] serviceops, Mobile-Content-Service, Product-Infrastructure-Team-Backlog-Deprecated, SRE, Traffic: Setup allowed list for MCS decom - https://phabricator.wikimedia.org/T340036 (MSantos) p:Triage→High
[13:16:53] serviceops, Mobile-Content-Service, Product-Infrastructure-Team-Backlog-Deprecated, SRE, Traffic: Setup allowed list for MCS decom - https://phabricator.wikimedia.org/T340036 (MSantos)
[13:17:53] serviceops, Mobile-Content-Service, Product-Infrastructure-Team-Backlog-Deprecated, SRE, Traffic: Setup allowed list for MCS decom - https://phabricator.wikimedia.org/T340036 (MSantos)
[13:19:19] serviceops, Wikifeeds: Wikifeeds error: upstream connect error or disconnect/reset before headers. reset reason: connection failure - https://phabricator.wikimedia.org/T340037 (Jgiannelos)
[13:19:31] serviceops, Content-Transform-Team, Wikifeeds: Wikifeeds error: upstream connect error or disconnect/reset before headers. reset reason: connection failure - https://phabricator.wikimedia.org/T340037 (Jgiannelos)
[13:19:43] serviceops, Mobile-Content-Service, Product-Infrastructure-Team-Backlog-Deprecated, RESTbase Sunsetting, and 2 others: Setup allowed list for MCS decom - https://phabricator.wikimedia.org/T340036 (MSantos)
[13:38:31] serviceops, Mobile-Content-Service, Product-Infrastructure-Team-Backlog-Deprecated, RESTbase Sunsetting, and 2 others: Setup allowed list for MCS decom - https://phabricator.wikimedia.org/T340036 (akosiaris) So, we need something to identify those users. wikiwand, if I understand the usage of the...
[13:42:35] nemo-yiannis: I was looking at your ticket on wikifeeds error, it seems the errors have stopped with your last deployment at 1330UTC?
[13:42:56] Yeah, i think a rolling restart of the pods did the trick.
[13:43:19] I was keeping an eye to see if there are any errors left and i am gonna close it
[13:43:38] ack
[13:45:13] serviceops, Content-Transform-Team, Wikifeeds: Wikifeeds error: upstream connect error or disconnect/reset before headers. reset reason: connection failure - https://phabricator.wikimedia.org/T340037 (Jgiannelos) Open→Resolved a:Jgiannelos It looks like after restarting the pods service s...
[13:52:56] serviceops, Mobile-Content-Service, Product-Infrastructure-Team-Backlog-Deprecated, RESTbase Sunsetting, and 2 others: Setup allowed list for MCS decom - https://phabricator.wikimedia.org/T340036 (MSantos) >>! In T340036#8952804, @akosiaris wrote: > So, we need something to identify those users....
[13:55:27] serviceops, Mobile-Content-Service, Product-Infrastructure-Team-Backlog-Deprecated, RESTbase Sunsetting, and 2 others: Setup allowed list for MCS decom - https://phabricator.wikimedia.org/T340036 (akosiaris) >>! In T340036#8952836, @MSantos wrote: > Sounds great to me, no objections. Cool. Do we...
[14:05:08] thanks y'all!
[14:05:16] and this one?
[14:05:19] https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/684856/8
[14:18:18] ottomata: CI says no ?
[14:37:58] i think that's because the chart isn't merged yet?
[14:44:32] yeah something funky okay...
[14:50:48] serviceops: Setup kubernetes namespaces for wikifuctions - https://phabricator.wikimedia.org/T340041 (akosiaris)
[14:59:10] needed fixtures
[15:00:29] okay akosiaris fixed if you find a sec to check it out
[15:02:30] serviceops: Setup kubernetes namespaces for wikifuctions - https://phabricator.wikimedia.org/T340041 (Jdforrester-WMF)
[15:06:40] serviceops, Service-deployment-requests: New Service Request memcached-wikifunctions - https://phabricator.wikimedia.org/T297815 (Jdforrester-WMF)
[15:06:45] serviceops, SRE, Abstract Wikipedia team (Phase λ – Launch), Service-deployment-requests: New Service Request: function-orchestrator and function-evaluator (for Wikifunctions launch) - https://phabricator.wikimedia.org/T297314 (Jdforrester-WMF)
[15:14:46] serviceops: Setup kubernetes namespaces for wikifuctions - https://phabricator.wikimedia.org/T340041 (Jdforrester-WMF)
[15:14:51] serviceops, SRE, Abstract Wikipedia team (Phase λ – Launch), Service-deployment-requests: New Service Request: function-orchestrator and function-evaluator (for Wikifunctions launch) - https://phabricator.wikimedia.org/T297314 (Jdforrester-WMF)
[15:41:07] serviceops: Setup kubernetes namespaces for wikifunctions - https://phabricator.wikimedia.org/T340041 (Aklapper)
[16:03:44] serviceops, All-and-every-Wikisource, Thumbor, MW-1.41-notes (1.41.0-wmf.13; 2023-06-13): Thumbor fails to render thumbnails of djvu/tiff/pdf files quite often in eqiad - https://phabricator.wikimedia.org/T337649 (hnowlan) Thanks to Amir's change and a change in jobqueue's concurrency for Thumbna...
[17:45:32] serviceops, Data-Engineering, Event-Platform Value Stream, SRE, Patch-For-Review: DRY kafka broker declaration in helmfiles - https://phabricator.wikimedia.org/T253058 (Ottomata) Status update: networkpolicy for Kafka brokers has been DRY, but referencing the hostnames for Kafka brokers for...
[17:47:15] Is anyone aware of problems with wikikube staging? all my helmfile/kubectl commands seem to be getting ignored, no permission denied so I don't think it's perms. Using namespace `mw-page-content-change-enrich` if that helps
[18:40:49] inflatador: o/ gmodena and I were working on that yesterday, thought we fixed it with a reboot of the flink operator?
[18:41:27] ^ dcausse
[18:41:31] ottomata gmodena , dcausse and I were working on it about 4 hrs ago and it was in a bad state then
[18:44:32] ?
[18:44:36] so weird
[18:45:08] ottomata rebooting fixed it for a while
[18:45:19] but then we hit a regression
[18:45:22] now it's back to how it was before?
[18:45:41] i see there is no helm release atm, but the pods are still running, so same problem as yesterday?
[18:50:13] inflatador: gmodena should we jump in a huddle again and troubleshoot?
[18:51:02] ottomata i'm around for a huddle, but out of ideas.
[18:51:32] inflatador: are you messing with it right now?
[18:51:47] ottomata nope, but I'm up at https://meet.google.com/ueo-bznt-igw if anyone wants to join
[19:43:15] serviceops, Data-Engineering, Event-Platform Value Stream: Flink k8s operator in staging sometimes will not sync changes to FlinkDeployments - https://phabricator.wikimedia.org/T340059 (Ottomata)
[19:43:30] serviceops, Data-Engineering, Event-Platform Value Stream: Flink k8s operator in staging sometimes will not sync changes to FlinkDeployments - https://phabricator.wikimedia.org/T340059 (Ottomata)
[19:53:54] https://phabricator.wikimedia.org/T340059
[20:26:56] serviceops, GitLab (CI & Job Runners), Patch-For-Review, Release-Engineering-Team (They Live 🕶️🧟): Provide ability to tag GitLab CI built images with a datetime format, set as default in pipeline-to-gitlab conversion - https://phabricator.wikimedia.org/T338224 (CodeReviewBot) dancy closed https:/...
[20:42:44] serviceops, GitLab (CI & Job Runners), Patch-For-Review, Release-Engineering-Team (They Live 🕶️🧟): Provide ability to tag GitLab CI built images with a datetime format, set as default in pipeline-to-gitlab conversion - https://phabricator.wikimedia.org/T338224 (CodeReviewBot) dancy opened https:/...
[20:52:00] serviceops, GitLab (CI & Job Runners), Patch-For-Review, Release-Engineering-Team (They Live 🕶️🧟): Provide ability to tag GitLab CI built images with a datetime format, set as default in pipeline-to-gitlab conversion - https://phabricator.wikimedia.org/T338224 (CodeReviewBot) kharlan closed https...
[21:34:23] serviceops, SRE, Wikimedia-Site-requests, Performance-Team (Radar): Raise limit of $wgMaxArticleSize for Hebrew Wikisource - https://phabricator.wikimedia.org/T275319 (neriah) @Aklapper Seems like a good idea. can you send it there?
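
Note on the 11:33 to 11:37 exchange (helm history, then helm get manifest --revision): a rough sketch of how those two steps could be scripted. The commands in the log were cut off after "manifest" and "--revision", so the release name, namespace, and revision number used here are hypothetical placeholders, not values from the channel. The helper names `history_cmd` and `manifest_cmd` are made up for illustration; they only build the argument lists and do not contact a cluster.

```python
def history_cmd(release: str, namespace: str) -> list[str]:
    """Build the argv that lists the stored revisions of a Helm release."""
    return ["helm", "history", release, "--namespace", namespace]


def manifest_cmd(release: str, namespace: str, revision: int) -> list[str]:
    """Build the argv that prints the manifest applied at a given revision."""
    return [
        "helm", "get", "manifest", release,
        "--namespace", namespace,
        "--revision", str(revision),
    ]


# Hypothetical values; substitute the real release, namespace and revision.
print(" ".join(history_cmd("main", "example-namespace")))
print(" ".join(manifest_cmd("main", "example-namespace", 3)))
```

Each list can be passed to `subprocess.run(..., check=True)` on a host where `helm` is installed. As mentioned at 11:37:01, this may fail for permission reasons depending on which kubeconfig is in use, and per 11:33:53 it only helps if the revision is still among the retained ones.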