[00:43:38] serviceops, MW-on-K8s, SRE, Traffic, Release-Engineering-Team (Seen): Create mw-web helmfile deployment - https://phabricator.wikimedia.org/T321900 (Krinkle)
[00:43:48] serviceops, MW-on-K8s, SRE, Traffic, Release-Engineering-Team (Seen): Create mw-jobrunner helmfile deployment - https://phabricator.wikimedia.org/T321897 (Krinkle)
[00:43:58] serviceops, MW-on-K8s, SRE, Traffic, Release-Engineering-Team (Seen): Create mw-videoscaler helmfile deployment - https://phabricator.wikimedia.org/T321899 (Krinkle)
[00:44:07] serviceops, MW-on-K8s, SRE, Traffic, Release-Engineering-Team (Seen): Create mw-api-int helmfile deployment - https://phabricator.wikimedia.org/T321895 (Krinkle)
[00:44:29] serviceops, MW-on-K8s, SRE, Traffic, Release-Engineering-Team (Seen): Create mw-api-ext helmfile deployment - https://phabricator.wikimedia.org/T321896 (Krinkle)
[02:10:37] serviceops, MediaWiki-Authentication-and-authorization, Platform Engineering, SRE: Increased session loss since 20221001 - https://phabricator.wikimedia.org/T319279 (jijiki) p:Triage→High
[08:12:49] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Target Sources (component/kubernetes-future/source/Sources) is configured multiple times - https://phabricator.wikimedia.org/T270271 (JMeybohm) Open→Resolved a:JMeybohm Thanks @jbond !
[08:12:51] serviceops, Foundational Technology Requests, Prod-Kubernetes, Shared-Data-Infrastructure, and 2 others: Update Kubernetes clusters to v1.23 - https://phabricator.wikimedia.org/T307943 (JMeybohm)
[09:22:59] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate from command line flags to config files for kubernetes components - https://phabricator.wikimedia.org/T300499 (JMeybohm) Open→Resolved
[09:23:04] serviceops, Foundational Technology Requests, Prod-Kubernetes, Shared-Data-Infrastructure, and 2 others: Update Kubernetes clusters to v1.23 - https://phabricator.wikimedia.org/T307943 (JMeybohm)
[10:38:51] serviceops, Dumps-Generation, SRE, MW-1.40-notes (1.40.0-wmf.10; 2022-11-14), and 2 others: conf* hosts ran out of disk space due to log spam - https://phabricator.wikimedia.org/T322360 (jcrespo)
[10:43:27] serviceops, Dumps-Generation, SRE, MW-1.40-notes (1.40.0-wmf.10; 2022-11-14), and 2 others: conf* hosts ran out of disk space due to log spam - https://phabricator.wikimedia.org/T322360 (ArielGlenn) After the patch https://gerrit.wikimedia.org/r/c/mediawiki/core/+/852990 was backported and deplo...
[10:54:40] serviceops, Dumps-Generation, SRE, MW-1.40-notes (1.40.0-wmf.10; 2022-11-14), and 2 others: conf* hosts ran out of disk space due to log spam - https://phabricator.wikimedia.org/T322360 (jcrespo)
[11:04:22] serviceops, observability: Monitor high load on etcd/conf* hosts to prevent incidents of software requiring config reload too often - https://phabricator.wikimedia.org/T322400 (jcrespo)
[11:06:05] serviceops, Dumps-Generation, SRE, MW-1.40-notes (1.40.0-wmf.10; 2022-11-14), and 2 others: conf* hosts ran out of disk space due to log spam - https://phabricator.wikimedia.org/T322360 (jcrespo) DB maintenance is back to normal/no longer affected, as far as I understood from @Marostegui and @Lad...
[11:08:22] serviceops, Dumps-Generation, SRE, MW-1.40-notes (1.40.0-wmf.10; 2022-11-14), and 2 others: conf* hosts ran out of disk space due to log spam - https://phabricator.wikimedia.org/T322360 (jcrespo) A small incident report summary should happen soon at: https://wikitech.wikimedia.org/wiki/Incident_s...
[11:12:25] serviceops, observability: Monitor high load on etcd/conf* hosts to prevent incidents of software requiring config reload too often - https://phabricator.wikimedia.org/T322400 (jcrespo)
[11:12:32] serviceops, Dumps-Generation, SRE, MW-1.40-notes (1.40.0-wmf.10; 2022-11-14), and 2 others: conf* hosts ran out of disk space due to log spam - https://phabricator.wikimedia.org/T322360 (jcrespo)
[11:13:00] serviceops, Dumps-Generation, SRE, MW-1.40-notes (1.40.0-wmf.10; 2022-11-14), and 2 others: conf* hosts ran out of disk space due to log spam - https://phabricator.wikimedia.org/T322360 (jcrespo)
[11:13:15] serviceops, Dumps-Generation, SRE, MW-1.40-notes (1.40.0-wmf.10; 2022-11-14), and 2 others: conf* hosts ran out of disk space due to log spam - https://phabricator.wikimedia.org/T322360 (jcrespo) p:Triage→High
[11:33:11] serviceops, Release-Engineering-Team (Radar), User-Joe: Create jenkins job for creating deployment artifacts for `docker-pkg-deploy` - https://phabricator.wikimedia.org/T179562 (jbond)
[13:26:44] serviceops, Release-Engineering-Team, SRE, Continuous-Integration-Config: operations/docker-images/production-images has no CI - https://phabricator.wikimedia.org/T283855 (jbond)
[13:32:22] serviceops, Scap, Release-Engineering-Team (Priority Backlog 📥): Re-imaged mw app servers can end up with missing l10n cache for old versions of MW needed for rollback - https://phabricator.wikimedia.org/T273334 (jbond)
[13:47:32] serviceops, SRE, Documentation: document redis upgrade/restart procedures - https://phabricator.wikimedia.org/T101585 (jbond)
[13:52:25] serviceops, SRE, conftool: Not all confd errors throw icinga alerts - https://phabricator.wikimedia.org/T110933 (jbond)
[13:58:46] serviceops, Sustainability: Automate the provisioning and management of MediaWiki clusters - https://phabricator.wikimedia.org/T118829 (jbond)
[14:01:07] serviceops: Split the API MediaWiki appserver pool into two external/internal pools - https://phabricator.wikimedia.org/T125085 (jbond) Open→Resolved a:jbond I'm going to boldly close this as I believe the infrastructure has changed significantly since this was raised and is likely no longer valid....
[14:14:30] serviceops, SRE: Turn on etcd TLS for intra-cluster communications - https://phabricator.wikimedia.org/T135128 (jbond) Open→Resolved a:jbond I believe this is now in place but please re-open if I'm wrong
[14:16:00] serviceops, observability: Monitor high load on etcd/conf* hosts to prevent incidents of software requiring config reload too often - https://phabricator.wikimedia.org/T322400 (andrea.denisse) a:andrea.denisse
[14:30:59] serviceops, SRE, conftool: confctl no longer logs a non-changing state change - https://phabricator.wikimedia.org/T161096 (jbond)
[15:20:37] serviceops, SRE, Security: Filter potentially harmful PostScript commands in Commons upload/thumbor - https://phabricator.wikimedia.org/T210833 (jbond)
[15:28:42] serviceops, SRE, Release Pipeline (Blubber): blubber template for nodejs should allow defining configuration files to copy to the container - https://phabricator.wikimedia.org/T211580 (jbond)
[15:36:05] serviceops, Continuous-Integration-Infrastructure, SRE, serviceops-collab: contint1002 service implementation tracking - https://phabricator.wikimedia.org/T313832 (LSobanski) p:Triage→High
[15:38:48] serviceops, Observability-Logging, SRE, WMF-General-or-Unknown: Re-consider ` >/dev/null 2>&1` as output of many cron'd MW maintenance scripts - https://phabricator.wikimedia.org/T187078 (jbond)
[15:43:42] serviceops, Continuous-Integration-Infrastructure, SRE, serviceops-collab: contint1002 service implementation tracking - https://phabricator.wikimedia.org/T313832 (hashar) @Jnuche we will have to set up a spare Jenkins and a Zuul merger on this new host contint1002 :-)
[15:49:44] serviceops, Continuous-Integration-Infrastructure, SRE, serviceops-collab: contint1002 service implementation tracking - https://phabricator.wikimedia.org/T313832 (jnuche) @hashar sounds like a good opportunity to pair!
[16:52:50] serviceops, VisualEditor, MW-1.39-notes (1.39.0-wmf.21; 2022-07-18), Parsoid (Tracking), and 4 others: Preemptively warm caches for Parsoid output - https://phabricator.wikimedia.org/T301371 (daniel)
[17:10:16] serviceops, Infrastructure-Foundations, SRE, ARM support: SRE Summit 2022 Outcome of Session "Adoption of aarch64 (aka arm64) in WMF production?" - https://phabricator.wikimedia.org/T320811 (jbond)
[18:50:43] I have a service currently based on docker-registry.wikimedia.org/wikimedia-buster, is there an equivalent image for bullseye?
[18:54:49] yes, /bullseye
[18:56:03] taavi: cool; is that just a change in naming convention, or indicative of something else?
[23:03:23] serviceops, Release Pipeline (Blubber), Release-Engineering-Team (Priority Backlog 📥): WMF container registry does not accept a manifest list (aka OCI manifest index, or "fat" manifest) - https://phabricator.wikimedia.org/T322453 (dduvall)