[01:48:10] Analytics, Data-Engineering, Data-Engineering-Icebox: Count the number of video plays - https://phabricator.wikimedia.org/T198628#11037995 (Doc_James) This is something we at VideoWiki would love to see. Accurate metrics for number of plays of videos and duration of video played.
[02:49:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[02:54:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[03:19:28] FIRING: [2x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem
[03:53:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[04:18:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[04:22:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[04:57:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[05:02:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[05:12:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[05:18:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[05:38:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[05:47:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[06:02:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[06:19:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[06:34:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[06:38:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[06:42:05] Data-Engineering, DBA, Schema-change-in-production: Add cl_timestamp_id index to categorylinks table - https://phabricator.wikimedia.org/T399249#11038121 (Marostegui)
[06:53:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[06:58:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[07:08:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[07:19:28] FIRING: [2x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem
[07:52:45] Data-Engineering, Data-Engineering-Radar, CheckUser, DBA, and 2 others: Add '*_actor_ip_hex_time' indexes to 'cu_changes', 'cu_log_event', and 'cu_private_event' on WMF wikis - https://phabricator.wikimedia.org/T399728#11038218 (FCeratto-WMF)
[08:46:41] FIRING: MediawikiPageContentChangeEnrichHighKafkaConsumerLag: ...
[08:46:41] High Kafka consumer lag for mw_page_content_change_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichHighKafkaConsumerLag
[09:01:41] RESOLVED: MediawikiPageContentChangeEnrichHighKafkaConsumerLag: ...
[09:01:41] High Kafka consumer lag for mw_page_content_change_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichHighKafkaConsumerLag
[09:56:12] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Data-Platform-SRE, Event-Platform: Build Flink docker image on bookwork - https://phabricator.wikimedia.org/T400600 (gmodena) NEW
[10:02:09] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Data-Platform-SRE, Event-Platform: Build Flink docker image on bookworm - https://phabricator.wikimedia.org/T400600#11038552 (gmodena)
[10:22:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[10:27:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[10:58:30] Data-Engineering-Roadmap, DPE-Mediawiki-Content, Data-Platform-SRE (2025.07.05 - 2025.07.25), Discovery-Search (2025.07.04 - 2025.07.25): Flink: Update k8s operator to 1.12.0 - https://phabricator.wikimedia.org/T398162#11038681 (BTullis) The new operator image has been built and published. ` (bas...
[11:12:15] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Dumps-Generation, Wikidata, Data-Platform-SRE (2025.07.05 - 2025.07.25), Essential-Work: wikidata-20250707-all.json.gz is corrupted - https://phabricator.wikimedia.org/T399077#11038698 (DVrandecic) Thank you, @BTullis !
[11:19:28] FIRING: [2x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem
[12:09:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[12:14:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[12:24:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[12:26:20] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Experimentation Lab, Event-Platform: EventGate: Log unparseable X-Experiment-Enrollments headers to a distinct error stream - https://phabricator.wikimedia.org/T396359#11038940 (phuedx) >>! In T396359#11035912, @dr0ptp4kt wrote: > @phuedx do yo...
[12:34:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[12:57:45] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Patch-For-Review: NEW BUG REPORT wmf.interlanguage_navigation missing mobile data - https://phabricator.wikimedia.org/T396514#11039075 (Isaac) Thanks @Milimetric ! One thought: would it be easier to just record whether the previous URL is `mobile`...
[13:04:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[13:09:41] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Patch-For-Review: NEW BUG REPORT wmf.interlanguage_navigation missing mobile data - https://phabricator.wikimedia.org/T396514#11039151 (CMyrick-WMF) Thanks! Since we're dealing with webrequest data, I was thinking that the addition of an **access_m...
[13:39:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[13:54:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[14:09:01] Data-Engineering, Data-Engineering-Icebox, Data Pipelines: Add support for repository artifacts in Airflow - https://phabricator.wikimedia.org/T322690#11039361 (Ottomata) Okay, sounds like we leave it open for now then, thanks.
[14:12:35] Data-Engineering, Data-Platform-SRE: FAIL: refinery-drop-raw-event alerting - https://phabricator.wikimedia.org/T400393#11039378 (BTullis) This has been continuing to happen. Here is today's alert email: https://groups.google.com/a/wikimedia.org/g/data-engineering-alerts/c/HdaJDq4hxms/m/8MjIlTktFAAJ I t...
[14:17:25] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Experimentation Lab, Event-Platform: EventGate: Log unparseable X-Experiment-Enrollments headers to a distinct error stream - https://phabricator.wikimedia.org/T396359#11039392 (Ottomata) > is there a quick way at this for queries from a stat b...
[14:20:39] Data-Engineering, DBA, Schema-change-in-production: Add cl_timestamp_id index to categorylinks table - https://phabricator.wikimedia.org/T399249#11039410 (Marostegui)
[14:22:06] Data-Engineering, DBA, Schema-change-in-production: Add cl_timestamp_id index to categorylinks table - https://phabricator.wikimedia.org/T399249#11039430 (Marostegui)
[14:29:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[14:40:22] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Patch-For-Review: NEW BUG REPORT wmf.interlanguage_navigation missing mobile data - https://phabricator.wikimedia.org/T396514#11039524 (Isaac) > I was thinking that the addition of an access_method column, which would then be populated with desktop...
[14:51:48] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Data-Platform-SRE, Event-Platform, Patch-For-Review: Build Flink docker image on bookworm - https://phabricator.wikimedia.org/T400600#11039594 (gmodena) a:gmodena→None
[14:59:03] Data-Engineering (Q1 FY25/26 July 1st - September 30th): update mediawiki-event-enrichment base docker image - https://phabricator.wikimedia.org/T400623 (gmodena) NEW
[14:59:17] FIRING: [2x] EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[15:09:17] FIRING: [2x] EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[15:19:17] FIRING: [2x] EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[15:19:28] FIRING: [2x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem
[15:24:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[15:31:05] Data-Engineering (Q1 FY25/26 July 1st - September 30th), EventStreams, SRE Observability, Event-Platform: Eventstreams 'assignments' logstash field type - https://phabricator.wikimedia.org/T390140#11039766 (tchin) `assignments` [[ https://gitlab.wikimedia.org/repos/data-engineering/kafkasse/-/blo...
[15:36:25] (CR) Milimetric: "k, updated based on comments in the phab task and tested with:" [analytics/refinery] - https://gerrit.wikimedia.org/r/1172085 (https://phabricator.wikimedia.org/T396514) (owner: Milimetric)
[15:37:06] (CR) Milimetric: "`" [analytics/refinery] - https://gerrit.wikimedia.org/r/1172085 (https://phabricator.wikimedia.org/T396514) (owner: Milimetric)
[15:53:44] (CR) Milimetric: "Third try's a charm. So the checking query shows that we get lots of new data, 14 thousand new unique combinations of current/previous in" [analytics/refinery] - https://gerrit.wikimedia.org/r/1172085 (https://phabricator.wikimedia.org/T396514) (owner: Milimetric)
[15:57:06] (PS2) Milimetric: Migrate, update, and test interlanguage_navigation data [analytics/refinery] - https://gerrit.wikimedia.org/r/1172085 (https://phabricator.wikimedia.org/T396514)
[16:02:27] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Event-Platform, MW-1.45-notes (1.45.0-wmf.12; 2025-07-29), Patch-For-Review: Update event-producing tools to overwrite `meta.dt` - https://phabricator.wikimedia.org/T376026#11039888 (Ottomata) Deployed meta.dt change to eventgate-analytics-...
[16:02:49] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Event-Platform, MW-1.45-notes (1.45.0-wmf.12; 2025-07-29), Patch-For-Review: Update event-producing tools to overwrite `meta.dt` - https://phabricator.wikimedia.org/T376026#11039889 (Ottomata) a:Ottomata
[16:03:11] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Patch-For-Review: NEW BUG REPORT wmf.interlanguage_navigation missing mobile data - https://phabricator.wikimedia.org/T396514#11039891 (Milimetric) k, works for me, updated code in [[ https://gerrit.wikimedia.org/r/c/analytics/refinery/+/1172085/2/...
[16:16:18] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Experimentation Lab, Event-Platform: EventGate: Log unparseable X-Experiment-Enrollments headers to a distinct error stream - https://phabricator.wikimedia.org/T396359#11039912 (Ottomata) > I wasn't sure if there may be a way to cheaply ksql o...
[16:46:44] Data-Engineering, cloud-services-team, Data-Persistence, Data-Services, Data-Platform-SRE (2025.07.26 - 2025.08.15): Create wiki replicas views for globaljsonlinks tables - https://phabricator.wikimedia.org/T387419#11040008 (BTullis)
[16:47:14] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE (2025.07.26 - 2025.08.15), Patch-For-Review: Fix the hard dependency between the Airflow scheduler and the DataHub GMS service - https://phabricator.wikimedia.org/T395106#11040028 (BTullis)
[16:47:22] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Dumps-Generation, Wikidata, Data-Platform-SRE (2025.07.26 - 2025.08.15), Essential-Work: wikidata-20250707-all.json.gz is corrupted - https://phabricator.wikimedia.org/T399077#11040024 (BTullis)
[16:47:32] Data-Engineering, Discovery-Search, Java-Scala-Standardization, Data-Platform-SRE (2025.07.26 - 2025.08.15), Epic: [Epic] Replace Archiva with Gitlab artifact repositories - https://phabricator.wikimedia.org/T367315#11040034 (BTullis)
[16:48:03] Data-Engineering-Roadmap, DPE-Mediawiki-Content, Data-Platform-SRE (2025.07.26 - 2025.08.15), Discovery-Search (2025.07.04 - 2025.07.25), Patch-For-Review: Flink: Update k8s operator to 1.12.0 - https://phabricator.wikimedia.org/T398162#11040044 (BTullis)
[16:48:27] Data-Engineering, Data-Engineering-Radar, Discovery-Search, Infrastructure-Foundations, and 2 others: Elasticsearch dependency upgrade in spicerack - https://phabricator.wikimedia.org/T390860#11040062 (BTullis)
[16:50:23] Data-Engineering, Data-Platform-SRE (2025.07.26 - 2025.08.15), Documentation: https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log should be on Wikitech - https://phabricator.wikimedia.org/T387878#11040112 (BTullis)
[16:51:46] Data-Engineering, Data-Platform-SRE (2025.07.26 - 2025.08.15): Request for dedicated Airflow instance for WME - https://phabricator.wikimedia.org/T396672#11040146 (BTullis)
[16:52:26] Data-Engineering, Data-Engineering-Radar, CirrusSearch, Structured Data Engineering, and 4 others: Migrate image recommendation to use page_weighted_tags_changed stream - https://phabricator.wikimedia.org/T372912#11040156 (BTullis)
[16:53:40] Data-Engineering, Technical-blog-posts, Data-Platform-SRE (2025.07.26 - 2025.08.15): Write a blog post about the recent Airflow migration to Kubernetes - https://phabricator.wikimedia.org/T393603#11040180 (BTullis)
[16:58:54] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Experimentation Lab, Event-Platform: EventGate: Add Prometheus metric for hoisting errors - https://phabricator.wikimedia.org/T398922#11040247 (Ottomata) > Change HoistingError to extend from ContextualError Hm, When extended from ValidationEr...
[17:27:54] Data-Engineering, Data-Platform-SRE (2025.07.26 - 2025.08.15): FAIL: refinery-drop-raw-event alerting - https://phabricator.wikimedia.org/T400393#11040382 (BTullis)
[17:27:59] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Data-Platform-SRE (2025.07.26 - 2025.08.15), Event-Platform, Patch-For-Review: Build Flink docker image on bookworm - https://phabricator.wikimedia.org/T400600#11040380 (BTullis)
[17:33:02] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Experimentation Lab, Event-Platform: EventGate: Add Prometheus metric for hoisting errors - https://phabricator.wikimedia.org/T398922#11040416 (Ottomata) > Shall we just add a new error_type label to the existent eventgate_validation_errors_tot...
[17:42:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[17:47:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[19:19:28] FIRING: [2x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem
[20:09:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[20:12:58] Data-Engineering, Data-Platform-SRE (2025.07.26 - 2025.08.15): Request for dedicated Airflow instance for WME - https://phabricator.wikimedia.org/T396672#11041048 (dr0ptp4kt) Update here, I took the access patch for Haroon, Ricardo, and me out of WIP: >>! In T399899#11041030, @dr0ptp4kt wrote: > Hi @ssi...
[20:25:48] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Experimentation Lab, Event-Platform, Patch-For-Review: EventGate: Add Prometheus metric for hoisting errors - https://phabricator.wikimedia.org/T398922#11041108 (Ottomata) Alright! https://gitlab.wikimedia.org/repos/data-engineering/eventg...
[20:29:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[20:34:58] Data-Engineering, Data-Platform-SRE (2025.07.26 - 2025.08.15), Patch-For-Review: Request for dedicated Airflow instance for WME - https://phabricator.wikimedia.org/T396672#11041138 (ssingh) >>! In T396672#11041046, @dr0ptp4kt wrote: > Update here, I took the access patch for Haroon, Ricardo, and me o...
[20:39:22] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[20:49:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[22:07:19] Data-Engineering, Data-Platform-SRE (2025.07.26 - 2025.08.15): FAIL: refinery-drop-raw-event alerting - https://phabricator.wikimedia.org/T400393#11041372 (amastilovic) > I can't explain why it didn't hit the permission errors when I ran it manually. Could it be the difference between HDFS users (or Ker...
[23:19:28] FIRING: [2x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem
[23:46:28] Data-Engineering, Data-Engineering-Radar, Dumps-Generation, MediaWiki-Platform-Team, serviceops: Migrate WMF production from PHP 7.4 to PHP 8.1 - https://phabricator.wikimedia.org/T319432#11041567 (Krinkle)