[08:30:22] Data-Platform-SRE, sre-alert-triage: Alert in need of triage: Updater process (instance wdqs1022) - https://phabricator.wikimedia.org/T357496 (LSobanski)
[08:30:45] Data-Platform-SRE, sre-alert-triage: Alert in need of triage: Updater process (instance wdqs1022) - https://phabricator.wikimedia.org/T357496 (LSobanski) There are also alerts for wdqs1023 and wdqs1024.
[10:12:57] Data-Engineering (Sprint 8): [Maintenance] Delete sanitized events removed from sanitization list - https://phabricator.wikimedia.org/T347586 (gmodena) a:gmodena
[11:07:25] Data-Engineering (Sprint 8): [Maintenance] Delete sanitized events removed from sanitization list - https://phabricator.wikimedia.org/T347586 (gmodena) May I proceed with deleting the tables from the Hive metastore for the impacted datasets?
[12:05:05] Data-Engineering, CX-cxserver, Citoid, Content-Transform-Team-WIP, and 11 others: Migrate node-based services in production to node18 - https://phabricator.wikimedia.org/T349118 (Lucas_Werkmeister_WMDE)
[12:05:37] Data-Engineering: Turn off ReportUpdater jobs no longer used - https://phabricator.wikimedia.org/T357419 (lbowmaker)
[12:06:06] Data-Engineering, Wikidata, Wikidata-Termbox, serviceops, and 4 others: Migrate Termbox SSR from Node 16 to 18 - https://phabricator.wikimedia.org/T355685 (Lucas_Werkmeister_WMDE) Open→Resolved a:Lucas_Werkmeister_WMDE I deployed the update and it’s working as far as I can tell – I th...
[13:22:24] Data-Engineering, Data-Platform-SRE, Event-Platform: Upgrade eventlogging VM to bullseye (or bookworm) - https://phabricator.wikimedia.org/T349289 (brouberol) a:brouberol
[13:22:32] Data-Engineering, Data-Platform-SRE ( 2024.02.12 - 2024.03.03), Event-Platform: Upgrade eventlogging VM to bullseye (or bookworm) - https://phabricator.wikimedia.org/T349289 (brouberol)
[13:24:16] Data-Engineering, Data-Platform-SRE ( 2024.02.12 - 2024.03.03), Event-Platform: Upgrade eventlogging VM to bullseye (or bookworm) - https://phabricator.wikimedia.org/T349289 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brouberol@cumin1002 for host eventlog1003.eqiad.wmnet...
[13:59:18] Data-Engineering, Data-Platform-SRE ( 2024.02.12 - 2024.03.03), Event-Platform: Upgrade eventlogging VM to bullseye (or bookworm) - https://phabricator.wikimedia.org/T349289 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brouberol@cumin1002 for host eventlog1003.eqiad.wmnet with...
[14:24:05] Data-Platform-SRE, observability, Epic: [Epic] Review alerting strategy for Data Platform SRE - https://phabricator.wikimedia.org/T346438 (bking)
[14:43:08] Data-Engineering, Data-Platform-SRE ( 2024.02.12 - 2024.03.03), Event-Platform, Patch-For-Review: Upgrade eventlogging VM to bullseye (or bookworm) - https://phabricator.wikimedia.org/T349289 (brouberol) Open→Resolved
[14:43:11] Data-Platform-SRE, Epic: Upgrade the Data Engineering infrastructure to Debian Bullseye - https://phabricator.wikimedia.org/T288804 (brouberol)
[14:45:08] Data-Engineering, Data-Platform-SRE ( 2024.02.12 - 2024.03.03): Configure ingress internal DNS records - https://phabricator.wikimedia.org/T356481 (brouberol) We figured it out. ATS was sending a request with the header `Host: superset-next-k8s.wikimedia.org`, which was not part of the Ingress alternativ...
[14:45:19] Data-Engineering, Data-Platform-SRE ( 2024.02.12 - 2024.03.03): Configure ingress internal DNS records - https://phabricator.wikimedia.org/T356481 (brouberol) Open→Resolved
[14:45:27] Data-Engineering, Data-Platform-SRE, Epic: Migrate the Analytics Superset instances to our DSE Kubernetes cluster - https://phabricator.wikimedia.org/T347710 (brouberol)
[14:46:46] Data-Platform-SRE, Epic: Upgrade the Data Engineering infrastructure to Debian Bullseye - https://phabricator.wikimedia.org/T288804 (brouberol)
[14:47:23] Data-Engineering, Data-Platform-SRE: Alerts Review: determine if we can use Prometheus to alert based on historical datasets - https://phabricator.wikimedia.org/T357537 (bking)
[15:44:15] Data-Engineering, Data Pipelines: [PLACEHOLDER] Pipelines Rep Structure Changes after RFC - https://phabricator.wikimedia.org/T295364 (lbowmaker) Open→Declined Declining as we have moved forward with Airflow/changed teams
[15:48:25] Data-Engineering, Data Pipelines: Add support for repository artifacts in Airflow - https://phabricator.wikimedia.org/T322690 (lbowmaker) Open→Resolved a:lbowmaker Implemented here: https://phabricator.wikimedia.org/T333001
[15:51:44] Data-Engineering, Data-Platform-SRE, Data Products: Generate a list of Superset users affected by changes to IP masking/temp users - https://phabricator.wikimedia.org/T347510 (lbowmaker) Once we start the changes for IP masking we can run this query, I think it might need SRE to run against the DB.
[15:52:47] Data-Engineering, Data-Engineering-Wikistats, Data Products: Add Farsi/Persian to WikiStats interface languages - https://phabricator.wikimedia.org/T348674 (lbowmaker)
[15:54:23] Data-Engineering, Data Pipelines, Data Products: Add Ukrainian Wikipedia to Clickstream dataset - https://phabricator.wikimedia.org/T310972 (lbowmaker)
[16:28:03] Data-Engineering, Data Pipelines: Add support for repository artifacts in Airflow - https://phabricator.wikimedia.org/T322690 (mforns) @lbowmaker I think the solution offered by T333001 is slightly different from what this task proposes. The linked task allows to include SQL query file paths directly in S...
[16:33:51] Data-Engineering: Turn off ReportUpdater jobs no longer used - https://phabricator.wikimedia.org/T357419 (lbowmaker)
[20:05:05] Data-Platform-SRE ( 2024.02.12 - 2024.03.03), Patch-For-Review: Migrate cloudelastic from public to private IPs - https://phabricator.wikimedia.org/T355617 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by bking@cumin2002 for hosts: `cloudelastic1007.wikimedia.org` - cloudelastic1007.wiki...
[21:37:09] Data-Engineering, Data-Platform, Movement-Insights: Add movement insights group/users to MWH denormalize job alerts - https://phabricator.wikimedia.org/T357472 (Mayakp.wiki) p:Triage→Medium
[21:52:30] Data-Engineering, Community-Tech, Multiblocks, Data Products (Data Products Sprint 09), Event-Platform: Investigate if the new 'Multiblocks' user blocks feature affects the mediawiki.user-blocks-change event stream - https://phabricator.wikimedia.org/T356597 (JWheeler-WMF) @VirginiaPoundstone...
[22:34:18] Data-Platform-SRE ( 2024.02.12 - 2024.03.03), Discovery-Search (Current work): Review wikitech:Search and write processes for k8s world - https://phabricator.wikimedia.org/T356303 (EBernhardson) I've been reviewing our options for backfilling and trying to come up with a plan, i think the following will...
[22:59:00] (PS1) Clare Ming: Update app base schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1003564 (https://phabricator.wikimedia.org/T357371)
[23:02:16] (CR) Clare Ming: "hi Santi, hi Surbhi -- Sam is out for the rest of the week -- my UBN fixes depend on this schema update." [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1003564 (https://phabricator.wikimedia.org/T357371) (owner: Clare Ming)
[23:07:38] (CR) Clare Ming: Update app base schema (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1003564 (https://phabricator.wikimedia.org/T357371) (owner: Clare Ming)