[05:32:27] serviceops, MW-on-K8s: Add a way to suspend CronJobs - https://phabricator.wikimedia.org/T394409#10828644 (Clement_Goubert) Open→Resolved
[05:32:27] serviceops, MW-on-K8s: Investigate startingDeadlineSeconds setting for kubernetes CronJobs - https://phabricator.wikimedia.org/T394423#10828641 (Clement_Goubert)
[05:33:00] serviceops, MW-on-K8s: Investigate startingDeadlineSeconds setting for kubernetes CronJobs - https://phabricator.wikimedia.org/T394423#10828648 (Clement_Goubert)
[05:34:19] serviceops, MW-on-K8s: Investigate startingDeadlineSeconds setting for kubernetes CronJobs - https://phabricator.wikimedia.org/T394423#10828649 (Clement_Goubert)
[06:55:17] serviceops, Growth-Team, MW-on-K8s, Notifications (Echo): Migrate Notifications (Echo) periodic jobs - https://phabricator.wikimedia.org/T394471 (Clement_Goubert) NEW
[06:55:25] serviceops, Growth-Team, MW-on-K8s, Notifications (Echo): Migrate Notifications (Echo) periodic jobs - https://phabricator.wikimedia.org/T394471#10828731 (Clement_Goubert) p:Triage→High
[07:10:46] serviceops, Growth-Team, MW-on-K8s, Notifications (Echo), Patch-For-Review: Migrate Notifications (Echo) periodic jobs - https://phabricator.wikimedia.org/T394471#10828740 (Clement_Goubert) Hmm I see that Herald added #growth-team to subscribers, @Michael should we merge this one into {T38578...
[08:00:09] serviceops, Growth-Team, MW-on-K8s, Notifications (Echo), Patch-For-Review: Migrate Notifications (Echo) periodic jobs - https://phabricator.wikimedia.org/T394471#10828772 (Michael) >>! In T394471#10828738, @Clement_Goubert wrote: > Hmm I see that Herald added #growth-team to subscribers, @Mi...
[08:00:49] serviceops, Prod-Kubernetes, Epic, Kubernetes: [EPIC] Docker deprecation as a container runtime enginer for kubernetes. - https://phabricator.wikimedia.org/T269684#10828775 (JMeybohm) Open→Resolved Most of the migration is done, only the ml cluster is not yet complete, see T387854 Cle...
[08:01:15] serviceops, Growth-Team, MW-on-K8s, Notifications (Echo), Patch-For-Review: Migrate Notifications (Echo) periodic jobs - https://phabricator.wikimedia.org/T394471#10828782 (Clement_Goubert)
[08:23:04] serviceops, Data-Persistence: Onboard the Docker Registry to apus - https://phabricator.wikimedia.org/T394476 (elukey) NEW
[08:24:15] serviceops, Patch-For-Review: docker-registry.wikimedia.org keeps serving bad blobs - https://phabricator.wikimedia.org/T390251#10828862 (elukey) Opened T394476 to see if apus can take over the current load that we have on Swift. After the sign-off we'll be able to reason about concrete next steps.
[08:45:05] serviceops, Data-Persistence: Onboard the Docker Registry to apus - https://phabricator.wikimedia.org/T394476#10828930 (akosiaris) > Storage-wise, we are currently approaching 7TB on Swift (the dashboard shows the metrics as well). We added ~1TB in the past 3 months I should note that per the proposed p...
[08:45:34] serviceops, collaboration-services, Prod-Kubernetes, Data-Platform-SRE (2025.05.02 - 2025.05.23), Kubernetes: Check/update grafana dashboards for k8s 1.31 - https://phabricator.wikimedia.org/T389084#10828932 (Gehel)
[09:29:07] serviceops, Page Content Service: mobileapps consistently 503s when a summary of an image is requested - https://phabricator.wikimedia.org/T394433#10829123 (hnowlan)
[09:31:37] serviceops, MW-on-K8s: Investigate startingDeadlineSeconds setting for kubernetes CronJobs - https://phabricator.wikimedia.org/T394423#10829130 (Clement_Goubert) Confirming that: # Suspending a CronJob through helmfile is an edit, and not a delete/create of the CronJob object # It will start a Job for a...
[10:18:04] serviceops, Growth-Team, GrowthExperiments, MW-on-K8s, Patch-For-Review: Migrate GrowthExperiments maintenance jobs to mw-cron - https://phabricator.wikimedia.org/T385782#10829272 (Clement_Goubert) >>! In T394018#10829189, @Michael wrote: > Regardless, I would like to emphasize that the migra...
[11:38:32] serviceops, Growth-Team, GrowthExperiments, MW-on-K8s, Patch-For-Review: Migrate GrowthExperiments maintenance jobs to mw-cron - https://phabricator.wikimedia.org/T385782#10829521 (Urbanecm_WMF) @Clement_Goubert FWIW, the refreshLinkRecommendations job is closer to a daemon/service rather tha...
[14:23:53] serviceops, Infrastructure-Foundations, SRE: Clean up the Docker Registry catalog and Swift storage from old images - https://phabricator.wikimedia.org/T375645#10830133 (elukey) >>! In T375645#10194826, @elukey wrote: > It failed with: > > ` > failed to garbage collect: failed to mark: swift: sw...
[14:40:06] serviceops, Growth-Team, GrowthExperiments, MW-on-K8s, Patch-For-Review: Migrate GrowthExperiments maintenance jobs to mw-cron - https://phabricator.wikimedia.org/T385782#10830193 (hnowlan) >>! In T385782#10829521, @Urbanecm_WMF wrote: > @Clement_Goubert FWIW, the refreshLinkRecommendations j...
[15:55:52] serviceops, DBA, Editing-team (Tracking), MW-1.45-notes (1.45.0-wmf.1; 2025-05-13), and 2 others: Fatal exception of type "Wikimedia\Rdbms\DBUnexpectedError: Database servers in extension1 are overloaded. In order to protect application servers, t... - https://phabricator.wikimedia.org/T393513#10830531
[16:46:12] serviceops, DBA, Editing-team (Tracking), MW-1.45-notes (1.45.0-wmf.1; 2025-05-13), and 2 others: Fatal exception of type "Wikimedia\Rdbms\DBUnexpectedError: Database servers in extension1 are overloaded. In order to protect application servers, t... - https://phabricator.wikimedia.org/T393513#10830636
[17:04:47] serviceops: low rate of mw-memcached errors - https://phabricator.wikimedia.org/T371881#10830693 (jijiki) Adding thos graphs as notes, though I do not think they are the cause. Chatted with @cmooney as well if network issues could be the culprit, though we deduced it may not be it. {F59360185} {F59360187}
[17:18:06] serviceops: low rate of mw-memcached errors - https://phabricator.wikimedia.org/T371881#10830716 (jijiki) p:Low→Medium I have observed that the rate those errors surface has increased, I will keep working on it and update.
[17:28:09] serviceops, MW-on-K8s: Visualise mw-script jobs - https://phabricator.wikimedia.org/T394534 (jijiki) NEW
[17:30:38] serviceops, MW-on-K8s: Visualise mw-script jobs - https://phabricator.wikimedia.org/T394534#10830780 (RLazarus)
[17:30:45] serviceops, MW-on-K8s, Patch-For-Review: Allow running one-off scripts manually - https://phabricator.wikimedia.org/T341553#10830781 (RLazarus)
[17:30:56] serviceops, MW-on-K8s: Visualise mw-script jobs - https://phabricator.wikimedia.org/T394534#10830784 (RLazarus) See also T387268.
[17:48:37] serviceops, Patch-For-Review: Turn down MediaWiki image builds for PHP 7.4 - https://phabricator.wikimedia.org/T391057#10830830 (Scott_French) Given that we're well on our way to completing the periodic jobs migration and have not run into showstopper 8.1-compatibility issues, and we've removed the 7.4 f...
[21:28:13] serviceops: Clean up UcfirstOverrides.php following PHP 7.4 -> 8.1 transition - https://phabricator.wikimedia.org/T394556 (Scott_French) NEW
[21:29:19] serviceops: Clean up UcfirstOverrides.php following PHP 7.4 -> 8.1 transition - https://phabricator.wikimedia.org/T394556#10831567 (Scott_French)
[21:30:04] serviceops: Clean up UcfirstOverrides.php following PHP 7.4 -> 8.1 transition - https://phabricator.wikimedia.org/T394556#10831568 (Scott_French)
[21:30:13] serviceops, Data-Engineering, Data-Engineering-Radar, Dumps-Generation, MediaWiki-Platform-Team: Migrate WMF production from PHP 7.4 to PHP 8.1 - https://phabricator.wikimedia.org/T319432#10831569 (Scott_French)
[21:46:11] serviceops: Clean up UcfirstOverrides.php following PHP 7.4 -> 8.1 transition - https://phabricator.wikimedia.org/T394556#10831592 (Scott_French) Going by the git history on `UcfirstOverrides.php`, {T292552} appears to contain the most recent prior art on this process. Notably, that wasn't //just// cleaning...