[08:05:37] serviceops, Patch-For-Review, Service-deployment-requests: New Service Request 'iPoid' - https://phabricator.wikimedia.org/T325147 (kostajh)
[08:27:05] jayme: I'm going to add the new miscweb SAN to tlsExtraSANs in admin staging with helmfile -e staging-eqiad -l name=namespace-certificates -i apply. Change https://gerrit.wikimedia.org/r/q/924898.
[08:27:47] jelto: thanks for the heads up, please go ahead :)
[09:42:46] serviceops, SRE, Thumbor, Thumbor Migration, Platform Team Workboards (Platform Engineering Reliability): Final steps for fully-Kubernetes Thumbor - https://phabricator.wikimedia.org/T334488 (hnowlan)
[09:50:38] serviceops, SRE, Thumbor, Thumbor Migration, Platform Team Workboards (Platform Engineering Reliability): Future of Thumbor's memcached backend - https://phabricator.wikimedia.org/T318695 (jijiki)
[09:52:04] serviceops, SRE, Thumbor, Thumbor Migration, Platform Team Workboards (Platform Engineering Reliability): Future of Thumbor's memcached backend - https://phabricator.wikimedia.org/T318695 (jijiki)
[10:07:34] serviceops, Infrastructure-Foundations, Mail, TranslationNotifications: Investigate if TranslationNotification's DigestEmailer.php is really sending emails and what happens to them - https://phabricator.wikimedia.org/T333899 (Nikerabbit) Open→Resolved a:Nikerabbit Closing this as pare...
[10:53:21] serviceops, RESTbase Sunsetting, Parsoid (Tracking): Enable WarmParsoidParserCache on all wikis - https://phabricator.wikimedia.org/T329366 (daniel) Note to self: ` 12:51 <_joe_> so the changeprop change - as a quick pointer - you need to edit operations/deployment-charts:helmfile.d/services/change...
[10:53:41] serviceops, All-and-every-Wikisource, Thumbor: Thumbor fails to render thumbnails of djvu/tiff/pdf files quite often in eqiad - https://phabricator.wikimedia.org/T337649 (Yann) Here are 2 big PDF files for which thumbnails failed: * https://commons.wikimedia.org/wiki/File:The_Century_Dictionary_and_C...
[11:16:30] serviceops, Observability-Metrics, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate use of infrastructure_users tokens to client certificates - https://phabricator.wikimedia.org/T325268 (JMeybohm)
[11:19:42] serviceops, SRE, Thumbor, Thumbor Migration, Platform Team Workboards (Platform Engineering Reliability): Future of Thumbor's memcached backend - https://phabricator.wikimedia.org/T318695 (hnowlan)
[11:20:06] serviceops, SRE, Thumbor, Thumbor Migration, Platform Team Workboards (Platform Engineering Reliability): Future of Thumbor's memcached backend - https://phabricator.wikimedia.org/T318695 (hnowlan) fwiw I see no reason not to move to mcrouter
[12:02:15] ottomata: there are a bunch of probe fails for flink-operator across all clusters... https://logstash.wikimedia.org/goto/3f95905b26c079f41fc70dc2e6e44585
[12:08:41] Hello! I'd like to deploy admin helm chart to codfw and eqiad to add additional tlsExtraSANs to namespace certificates (miscweb). I've done this in staging using helmfile -e staging-eqiad -l name=namespace-certificates -i apply and it worked fine.
[12:09:37] jayme: Any objections? Diff in production shows one line in certificate/miscweb (similar to staging).
[12:15:29] jelto: gogo
[12:17:21] thanks done!
[12:37:11] serviceops, Abstract Wikipedia team, Abstract Wikipedia Fix-It tasks: Please hide from the docker registry two no-longer-used Abstract Wiki images (now moved to GitLab) - https://phabricator.wikimedia.org/T337505 (Clement_Goubert) a:Clement_Goubert
[12:37:39] serviceops, Kubernetes: cfssl-issuer: Generate Kubernetes Events - https://phabricator.wikimedia.org/T337928 (JMeybohm)
[12:39:21] serviceops, Abstract Wikipedia team, Abstract Wikipedia Fix-It tasks: Please hide from the docker registry two no-longer-used Abstract Wiki images (now moved to GitLab) - https://phabricator.wikimedia.org/T337505 (Clement_Goubert) For clarification, can I just delete them from the registry?
[13:11:40] serviceops, Abstract Wikipedia team, Abstract Wikipedia Fix-It tasks: Please hide from the docker registry two no-longer-used Abstract Wiki images (now moved to GitLab) - https://phabricator.wikimedia.org/T337505 (Jdforrester-WMF) >>! In T337505#8895312, @Clement_Goubert wrote: > For clarification, c...
[13:12:35] serviceops, Prod-Kubernetes, Kubernetes: Drop the use of nonexisting groups in kubernetes infrastructure_users - https://phabricator.wikimedia.org/T290963 (JMeybohm) Open→Resolved a:JMeybohm infrastructure_users is no more
[13:12:41] serviceops, Foundational Technology Requests, Prod-Kubernetes, Shared-Data-Infrastructure, and 2 others: Post Kubernetes v1.23 cleanup - https://phabricator.wikimedia.org/T328291 (JMeybohm)
[13:15:10] serviceops, Foundational Technology Requests, Prod-Kubernetes, Shared-Data-Infrastructure, and 2 others: Post Kubernetes v1.23 cleanup - https://phabricator.wikimedia.org/T328291 (JMeybohm)
[13:15:19] serviceops, Observability-Metrics, Prod-Kubernetes, Kubernetes: Migrate use of infrastructure_users tokens to client certificates - https://phabricator.wikimedia.org/T325268 (JMeybohm) Open→Resolved infrastructure_users is no more
[13:34:06] serviceops, Abstract Wikipedia team, Abstract Wikipedia Fix-It tasks: Delete obsolete Abstract Wiki images mediawiki-services-function-orchestrator and mediawiki-services-function-evaluator (moved to GitLab) - https://phabricator.wikimedia.org/T337505 (Clement_Goubert)
[13:35:17] serviceops, Abstract Wikipedia team, Abstract Wikipedia Fix-It tasks: Delete obsolete Abstract Wiki images mediawiki-services-function-orchestrator and mediawiki-services-function-evaluator from registry - https://phabricator.wikimedia.org/T337505 (Clement_Goubert)
[13:35:21] serviceops, Prod-Kubernetes, Kubernetes: Refactor common_templates/0.2/default-network-policy-conf.yaml into a GlobalNetworkPolicy - https://phabricator.wikimedia.org/T275035 (JMeybohm) Open→Resolved A bunch of deploys already happened since the change so I'm resolving this.
[13:44:22] serviceops: Upgrade ICU version for MediaWiki in preparation to move to debian bullseye - https://phabricator.wikimedia.org/T324447 (Joe) Looks like a different task was used in the end, closing as duplicate.
[13:44:35] serviceops: Upgrade ICU version for MediaWiki in preparation to move to debian bullseye - https://phabricator.wikimedia.org/T324447 (Joe)
[14:11:33] serviceops, Abstract Wikipedia team, Abstract Wikipedia Fix-It tasks: Delete obsolete Abstract Wiki images mediawiki-services-function-orchestrator and mediawiki-services-function-evaluator from registry - https://phabricator.wikimedia.org/T337505 (Clement_Goubert)
[14:11:39] serviceops, Abstract Wikipedia team, Abstract Wikipedia Fix-It tasks: Delete obsolete Abstract Wiki images mediawiki-services-function-orchestrator and mediawiki-services-function-evaluator from registry - https://phabricator.wikimedia.org/T337505 (Clement_Goubert) Open→In progress
[14:25:44] serviceops, Abstract Wikipedia team, Abstract Wikipedia Fix-It tasks: Delete obsolete Abstract Wiki images mediawiki-services-function-orchestrator and mediawiki-services-function-evaluator from registry - https://phabricator.wikimedia.org/T337505 (Clement_Goubert) In progress→Resolved
[14:25:51] serviceops, Abstract Wikipedia team, Abstract Wikipedia Fix-It tasks: Delete obsolete Abstract Wiki images mediawiki-services-function-orchestrator and mediawiki-services-function-evaluator from registry - https://phabricator.wikimedia.org/T337505 (Clement_Goubert) All done.
[14:26:41] serviceops, Abstract Wikipedia team, Abstract Wikipedia Fix-It tasks: Delete obsolete Abstract Wiki images mediawiki-services-function-orchestrator and mediawiki-services-function-evaluator from registry - https://phabricator.wikimedia.org/T337505 (Jdforrester-WMF) >>! In T337505#8895783, @Clement_Go...
[15:55:10] Re: https://www.irccloud.com/pastebin/1gvNR5oB/
[15:55:14] <_joe_> so mszabo it seems we're trying to solve the same issue
[15:55:34] I'm using the same pattern and can trigger drain, but I see you just wait on connexions to be 0 ?
[15:56:00] <_joe_> mszabo: ^^ we can try to join forces I guess :)
[15:57:33] hmm, trigger drain may be better
[15:57:44] because the problem with that script is that it does not currently work
[15:57:54] the trick may be to do both - triggering drain and then waiting
[15:58:17] <_joe_> yes, in our case we want to wait what we do for normal releases
[15:58:25] <_joe_> which is 5 seconds of grace period from depooling
[15:58:53] <_joe_> we might give k8s a bit more time as I'm not sure kube-proxy is as efficient as our fine-tuned scripts in catching up with changes
[15:59:01] Triggering drain is the same but POST /drain_listeners?graceful
[15:59:12] if you have netstat or equivalent you can also do
[15:59:16] https://www.irccloud.com/pastebin/LsbyTPcu/
[15:59:31] wish I did :p
[15:59:33] this one is proven to work but then the upstream envoy image removed netstat
[16:00:13] <_joe_> claime: niet!
[16:00:33] _joe_: I feel declawed
[16:00:35] <_joe_> the docker police doesn't allow such hippie tools in our images.
[16:01:38] draining probably also requires tuning --drain-strategy and --drain-time-s
[16:01:52] since those default to ramping it up over 10 minutes
[16:02:08] yes
[16:02:37] I am not at practical tests yet, but it'd be a very short drain-time-s and drain-strategy immediate
[16:03:06] yea
[16:36:45] <_joe_> I still can't believe mszabo talked to me about his preStop pains the second we were discussing the same problem ourselves :P
[16:37:01] Yeah, quite a coincidence
[16:47:00] serviceops, Dumps-Generation, Performance-Team (Radar): Migrate WMF production from PHP 7.2 to PHP 7.4 - https://phabricator.wikimedia.org/T271736 (Func)
[16:56:31] jayme: probe fails eh? hmmm...
[17:51:04] _joe_: yeah, perfect timing :)
[17:56:01] serviceops, SRE, Abstract Wikipedia team (Phase λ – Launch), Service-deployment-requests: New Service Request: function-orchestrator and function-evaluator (for Wikifunctions launch) - https://phabricator.wikimedia.org/T297314 (Jdforrester-WMF)
[19:02:59] jayme: does the liveness/readiness probe port need ingress rules?
[19:49:08] serviceops, Observability-Alerting, observability, Patch-For-Review: Port openapi/swagger checks/alerts to Prometheus - https://phabricator.wikimedia.org/T320620 (colewhite) Open→In progress p:Triage→Medium a:colewhite
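Editor's sketch for the 15:55-16:03 preStop thread ("triggering drain and then waiting"). This is not the script from either pastebin, which is not reproduced in this log; it is a minimal illustration that assumes the Envoy admin interface is reachable at 127.0.0.1:9901 (a hypothetical address) and that the image has Python but no netstat. It POSTs /drain_listeners?graceful as mentioned in the channel, then polls the admin /stats endpoint until the listener downstream_cx_active gauges sum to zero or a timeout expires. The function names, timeout and polling interval are illustrative assumptions, and the admin listener is skipped because the polling request itself holds a connection on it.

```python
import time
import urllib.request

# Assumption: Envoy admin address; adjust to the real deployment.
ADMIN = "http://127.0.0.1:9901"


def trigger_drain(admin=ADMIN):
    """POST /drain_listeners?graceful to start draining listeners."""
    req = urllib.request.Request(
        f"{admin}/drain_listeners?graceful", data=b"", method="POST"
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


def fetch_stats(admin=ADMIN):
    """Fetch only the active-connection gauges from the admin stats page."""
    url = f"{admin}/stats?filter=downstream_cx_active"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode()


def active_connections(stats_text):
    """Sum listener.*.downstream_cx_active, ignoring the admin listener."""
    total = 0
    for line in stats_text.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if not sep or not name.endswith("downstream_cx_active"):
            continue
        # Only per-listener gauges; the admin listener counts our own poll.
        if not name.startswith("listener.") or name.startswith("listener.admin."):
            continue
        total += int(value.strip())
    return total


def wait_for_drain(get_stats=fetch_stats, timeout_s=30.0, interval_s=1.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Return True once no connections remain, False if the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if active_connections(get_stats()) == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A preStop hook built on this would call trigger_drain() and then wait_for_drain(). As noted at 16:01, the wait is only useful if --drain-time-s and --drain-strategy are tuned so that Envoy does not spread the drain over its default ten minutes.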