[01:23:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [01:28:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [01:51:51] Data-Engineering, Commons-Impact-Metrics, Commons-Impact-Metrics-Requests: Update Commons Impact Metrics allow-list July 2025 - https://phabricator.wikimedia.org/T400665 (GFontenelle_WMF) NEW [02:10:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [02:19:23] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [03:12:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad.
- https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [03:17:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [03:19:28] FIRING: [2x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem [03:31:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [03:36:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [04:42:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. 
- https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [05:12:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [06:57:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [07:12:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. 
- https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [07:19:28] FIRING: [2x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem [07:21:12] Data-Engineering, Data-Engineering-Radar, CheckUser, DBA, and 2 others: Add '*_actor_ip_hex_time' indexes to 'cu_changes', 'cu_log_event', and 'cu_private_event' on WMF wikis - https://phabricator.wikimedia.org/T399728#11041919 (FCeratto-WMF) [07:26:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [08:06:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [08:49:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad.
- https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [08:54:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [09:00:19] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Experimentation Lab, Event-Platform, Patch-For-Review: EventGate: Add Prometheus metric for hoisting errors - https://phabricator.wikimedia.org/T398922#11042114 (phuedx) >>! In T398922#11040416, @Ottomata wrote: > If we made the error_type... [09:17:38] Analytics, Data-Engineering, Data-Engineering-Icebox: Count the number of video plays - https://phabricator.wikimedia.org/T198628#11042172 (AndrewTavis_WMDE) Hey @Doc_James 👋 Would be nice if we could have these kinds of high level metrics, but that's out of the scope of the current project which is... [09:25:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [09:30:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad.
- https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [09:37:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [09:42:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [09:58:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [10:08:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [10:14:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. 
- https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [10:34:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [10:40:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [10:45:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [10:54:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [10:59:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. 
- https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [11:04:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [11:09:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [11:19:28] FIRING: [2x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem [12:06:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [12:11:30] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Investigate mw-page-content-change memory alerts - https://phabricator.wikimedia.org/T397336#11042604 (gmodena) Regarding the alert linked in OP. That issue was fixed by increasing Job Manager container memory. The application has been stable since...
[12:26:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [12:33:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [12:53:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [13:17:55] Data-Engineering, Data-Platform-SRE (2025.07.26 - 2025.08.15), Patch-For-Review: Request for dedicated Airflow instance for WME - https://phabricator.wikimedia.org/T396672#11042891 (Ottomata) Approved. [13:22:14] Data-Engineering, Data-Platform-SRE (2025.07.26 - 2025.08.15), Patch-For-Review: Request for dedicated Airflow instance for WME - https://phabricator.wikimedia.org/T396672#11042901 (ssingh) Thanks @Ottomata! @dr0ptp4kt: Access request merged. Please try in ~30 minutes and let us know if there any i...
[13:24:45] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Spike: Assess feasibility of integrating the new file export with the legacy Dumps1 UI - https://phabricator.wikimedia.org/T400507#11042907 (xcollazo) Open→In progress [13:45:12] Data-Engineering, Traffic: Fix Hive event.development_network_probe table - https://phabricator.wikimedia.org/T400360#11043014 (Ottomata) p:Triage→High FYI this is causing data-engineering-alerts email spam every hour. [13:51:04] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Experimentation Lab, Event-Platform, Patch-For-Review: EventGate: Log unparseable X-Experiment-Enrollments headers to an error stream - https://phabricator.wikimedia.org/T396359#11043047 (Ottomata) [14:26:52] Data-Engineering, Traffic: Fix Hive event.development_network_probe table - https://phabricator.wikimedia.org/T400360#11043291 (Ottomata) FYI, the Refine Yarn application id that evolved the event.development_network_probe table is , and the ALTER that was run is: `application_1750705250302_912703` `lan... [14:33:18] Analytics-Radar, Data-Engineering, Data-Engineering-Icebox, MediaWiki-Action-API: Add Application errors for MediaWiki API to x-analytics - https://phabricator.wikimedia.org/T116658#11043360 (Reedy) [14:33:56] Data-Engineering, MobileFrontend, PHP 8.1 support, Wikimedia-production-error: PHP Deprecated: urldecode(): Passing null to parameter #1 ($string) of type string is deprecated / PHP Warning: Undefined array key 1 - https://phabricator.wikimedia.org/T400522#11043363 (Reedy) Code seems related to s...
[14:35:07] Data-Engineering, MobileFrontend, XAnalytics, PHP 8.1 support, Wikimedia-production-error: PHP Deprecated: urldecode(): Passing null to parameter #1 ($string) of type string is deprecated / PHP Warning: Undefined array key 1 - https://phabricator.wikimedia.org/T400522#11043367 (Reedy) [14:46:46] Data-Engineering, Release-Engineering-Team (Radar): Create a GitLab CI/CD Component project for WMF CI/CD templates and components - https://phabricator.wikimedia.org/T382430#11043433 (Ottomata) I heard @amastilovic say today that GitLab CI components can't be used for manual CI job runs? If so, that is... [14:54:38] Analytics, Data-Engineering, Data-Engineering-Icebox: Count the number of video plays - https://phabricator.wikimedia.org/T198628#11043478 (Doc_James) Okay thanks, we are funding Yaron to help with this work and hopefully make some headway with more video specific metrics. [15:07:45] Analytics-Radar, Data-Engineering, Data-Engineering-Icebox, MediaWiki-Action-API: Add Application errors for MediaWiki API to x-analytics - https://phabricator.wikimedia.org/T116658#11043628 (Ottomata) Open→Declined Hm, I'm not sure if this is a task we'd ever do? Probably better wou... [15:19:28] FIRING: [2x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem [15:20:40] Data-Engineering (Q1 FY25/26 July 1st - September 30th), EventStreams, SRE Observability, Event-Platform: Eventstreams 'assignments' logstash field type - https://phabricator.wikimedia.org/T390140#11043685 (colewhite) >>! In T390140#11039766, @tchin wrote: > `assignments` [[ https://gitlab.wikime...
[15:28:51] Data-Engineering, Traffic: Fix Hive event.development_network_probe table - https://phabricator.wikimedia.org/T400360#11043731 (Ottomata) Also for reference, here is a link to the airflow mapped task (116) that evolved this table. https://airflow.wikimedia.org/task?dag_id=refine_to_hive_hourly&task_id=r... [15:35:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [15:36:31] Data-Engineering, Traffic: Fix Hive event.development_network_probe table - https://phabricator.wikimedia.org/T400360#11043769 (Ottomata) I spent a little time today trying to figure out the right thing to do. If @cdanis etc. is okay with missing some historical data, I think we should: - drop the `eve... [15:37:27] Data-Engineering, Traffic: Fix Hive event.development_network_probe table - https://phabricator.wikimedia.org/T400360#11043771 (CDanis) >>! In T400360#11043769, @Ottomata wrote: > I believe this would preserve the past 60(?) days of data, as well as keep the newly added `host` field in 1.1.0 of the schem... [15:40:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [15:42:26] Data-Engineering, Traffic: Fix Hive event.development_network_probe table - https://phabricator.wikimedia.org/T400360#11043815 (mforns) @Ottomata Sounds good!
If the record is formatted correctly we can assume the re-creation of the table from scratch will work fine. [15:54:40] Data-Engineering, Traffic: Fix Hive event.development_network_probe table - https://phabricator.wikimedia.org/T400360#11043862 (Ottomata) Testing. Ran ` spark3-submit --driver-cores 1 --master 'local[1]' --conf spark.yarn.maxAppAttempts=1 --conf write.spark.accept-any-schema=true --conf spark.hadoo... [16:07:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [16:12:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [16:14:42] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Experimentation Lab, Event-Platform, Patch-For-Review: EventGate: Add Prometheus metric for hoisting errors - https://phabricator.wikimedia.org/T398922#11043958 (Ottomata) [16:15:13] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Experimentation Lab, Event-Platform, Patch-For-Review: EventGate: Add Prometheus metric for hoisting errors - https://phabricator.wikimedia.org/T398922#11043964 (Ottomata) Deployed and tested code in beta, but I can't directly test Hoisting...
[16:15:20] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Experimentation Lab, Event-Platform, Patch-For-Review: EventGate: Log unparseable X-Experiment-Enrollments headers to an error stream - https://phabricator.wikimedia.org/T396359#11043965 (Ottomata) Deployed and tested code in beta, but I ca... [16:28:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [16:33:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [16:45:11] !ran `kerberos-run-command hdfs hdfs dfs -cp /user/xcollazo/artifacts/spark-3.4.1-assembly.zip /user/spark/share/lib` to make Spark 3.4.1 assembly available for prod use. [16:45:27] !log ran `kerberos-run-command hdfs hdfs dfs -cp /user/xcollazo/artifacts/spark-3.4.1-assembly.zip /user/spark/share/lib` to make Spark 3.4.1 assembly available for prod use. [16:45:29] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [18:17:10] Data-Engineering (Q1 FY25/26 July 1st - September 30th), DPE-Mediawiki-Content: MediaWiki Content History alerts too much for minor reconcile issues - https://phabricator.wikimedia.org/T395139#11044435 (xcollazo) Recent runs should show improvements after MR 1574, but they did not: ` spark.sql(""" SELECT...
[18:21:04] Data-Engineering (Q1 FY25/26 July 1st - September 30th), DPE-Mediawiki-Content: MediaWiki Content History alerts too much for minor reconcile issues - https://phabricator.wikimedia.org/T395139#11044448 (xcollazo) Killed currently running one via: ` $ kerberos-run-command analytics yarn application -kill... [18:34:12] Data-Engineering, Traffic: Fix Hive event.development_network_probe table - https://phabricator.wikimedia.org/T400360#11044504 (Ottomata) @mforns and I worked on this today, and it turned out that backfilling this data would be quite difficult. RefineToHiveDataset (the new Refine on Airflow CLI) only wo... [18:39:06] Data-Engineering, Traffic: Fix Hive event.development_network_probe table - https://phabricator.wikimedia.org/T400360#11044508 (Ottomata) So status: - old data is in `event`.`development_network_probe_T400360_original`; Data in this table since the schema change was merged (2025-07-23T11:00:00 UTC ?) i... [18:45:25] Data-Engineering, Traffic: Fix Hive event.development_network_probe table - https://phabricator.wikimedia.org/T400360#11044553 (Ottomata) [19:07:57] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Event-Platform, Patch-For-Review: [Event Platform] eventutilites-python: improve consistency guarantees of async process functions - https://phabricator.wikimedia.org/T347282#11044683 (gmodena) >>! In T347282#10990516, @gmodena wrote: > Re-open...
[19:19:28] FIRING: [2x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem [20:34:11] FIRING: [3x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem [20:53:31] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-DomainEvents: Finalize and move Cross-Service Integration events design document to mediawiki.org - https://phabricator.wikimedia.org/T400095#11045134 (Ottomata) p:Triage→Medium [21:04:11] FIRING: [3x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem [21:14:32] Data-Engineering, Traffic: Fix Hive event.development_network_probe table - https://phabricator.wikimedia.org/T400360#11045172 (CDanis) >>! In T400360#11044508, @Ottomata wrote: > I'll update the task description, and maybe set a scheduled reminder message in slack? :) That sounds good, thanks Andrew.... [21:17:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [21:22:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad.
- https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [21:28:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [21:33:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [21:34:11] FIRING: [3x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem [21:43:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [21:48:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. 
- https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [21:51:41] Data-Engineering, Release-Engineering-Team (Radar): Create a GitLab CI/CD Component project for WMF CI/CD templates and components - https://phabricator.wikimedia.org/T382430#11045258 (amastilovic) >>! In T382430#11043433, @Ottomata wrote: > I heard @amastilovic say today that GitLab CI components can't... [21:54:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [21:59:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [22:04:11] FIRING: [3x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem [22:07:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad.
- https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [22:17:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [22:21:26] Analytics-Canonical-Data, MediaWiki-Configuration, MediaWiki-extensions-Wikibase-Client, MediaWiki-extensions-Wikibase-Repo, and 3 others: Completely distinguish various concepts including "global site ID", "global site key", "site DB name", "site ... - https://phabricator.wikimedia.org/T329424#11045366 [22:34:11] FIRING: [3x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem [22:44:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [22:59:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad.
- https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [23:04:11] FIRING: [3x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem [23:06:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly [23:34:11] FIRING: [3x] AlertLintProblem: Linting problems found for HaproxyKafkaDeliveryErrors - https://wikitech.wikimedia.org/wiki/Alertmanager#Alert_linting_found_problems - TODO - https://alerts.wikimedia.org/?q=alertname%3DAlertLintProblem [23:34:19] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Experimentation Lab, Event-Platform: EventGate: Add Prometheus metric for hoisting errors - https://phabricator.wikimedia.org/T398922#11045553 (dr0ptp4kt) Varnish will not allow one to fake out one's `X-Experiment-Enrollments` header from the o... [23:46:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-main in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=eqiad%2Bprometheus/k8s&var-service=eventgate-main - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly