[04:36:04] (PS4) TChin: Support inserting ResultKey into DeequVerificationSuiteToDataQualityAlerts [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1127964 (https://phabricator.wikimedia.org/T384962)
[04:48:03] (PS5) TChin: Support inserting ResultKey into DeequVerificationSuiteToDataQualityAlerts [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1127964 (https://phabricator.wikimedia.org/T384962)
[04:49:07] (CR) TChin: Support inserting ResultKey into DeequVerificationSuiteToDataQualityAlerts (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1127964 (https://phabricator.wikimedia.org/T384962) (owner: TChin)
[08:32:53] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, Patch-For-Review: Migrate Benthos `webrequest_sampled_live` to feed from HAProxy data - https://phabricator.wikimedia.org/T390029#10685623 (elukey) The upgrade went fine, but there is a big difference in behavior. From the Benthos graphs,...
[08:42:05] Data-Engineering (Q3 2025 January 1st - March 31th), Patch-For-Review: Deprecate `webrequest_sampled_128` druid datasource - https://phabricator.wikimedia.org/T385198#10685630 (Antoine_Quhen) We need to remove the deletion job. Received those error emails: ` [data engineering alerts] FAIL: refinery-drop-...
[09:53:27] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic: Migrate Benthos `webrequest_sampled_live` to feed from HAProxy data - https://phabricator.wikimedia.org/T390029#10685814 (elukey) Worked nicely, no need to bump threads :)
[10:36:47] Data-Engineering, Machine-Learning-Team, Research, Event-Platform: Emit revision revert risk scores as a stream and expose in EventStreams API - https://phabricator.wikimedia.org/T326179#10685922 (achou) > * revision is for any revision of any page For the revertrisk model, revision is for any re...
[10:52:17] Data-Engineering (Q3 2025 January 1st - March 31th), MediaWiki-Engineering, TemplateData, VisualEditor, and 4 others: PHP Unknown error: EventLoggingLegacyConverter: Failed proxying legacy EventLogging event query string to WMF Event Platform JSON... - https://phabricator.wikimedia.org/T383939#10685943
[15:24:42] Data-Engineering (Q3 2025 January 1st - March 31th), MediaWiki-Engineering, TemplateData, VisualEditor, and 4 others: PHP Unknown error: EventLoggingLegacyConverter: Failed proxying legacy EventLogging event query string to WMF Event Platform JSON... - https://phabricator.wikimedia.org/T383939#10686678
[16:05:32] Data-Engineering (Q3 2025 January 1st - March 31th), MediaWiki-Engineering, TemplateData, VisualEditor, and 4 others: PHP Unknown error: EventLoggingLegacyConverter: Failed proxying legacy EventLogging event query string to WMF Event Platform JSON... - https://phabricator.wikimedia.org/T383939#10686800
[16:19:21] Data-Engineering, Machine-Learning-Team, Research, Event-Platform: Emit revision revert risk scores as a stream and expose in EventStreams API - https://phabricator.wikimedia.org/T326179#10686852 (Ottomata) > vs. they can only get a score for the latest revision of a page (article country). Just...
[16:24:54] Data-Engineering, Machine-Learning-Team, Research, Event-Platform: Emit revision revert risk scores as a stream and expose in EventStreams API - https://phabricator.wikimedia.org/T326179#10686873 (Ottomata) Another Q about revertrisk. Are visibilty settings relevant to possible revert risk? E.g...
[16:34:49] Data-Engineering, Machine-Learning-Team, Research, Event-Platform: Emit revision revert risk scores as a stream and expose in EventStreams API - https://phabricator.wikimedia.org/T326179#10686919 (Ottomata) Re naming thoughts: We currently have a `mediawiki.page_change.v1` stream, in which the e...
[16:35:09] Data-Engineering, Machine-Learning-Team, Research, Event-Platform: Emit revision revert risk scores as a stream and expose in EventStreams API - https://phabricator.wikimedia.org/T326179#10686920 (Ottomata) cc also @gmodena for the (undefined) mediawiki entity stream naming convention discussion
[17:07:06] Data-Engineering (Q3 2025 January 1st - March 31th), Commons-Impact-Metrics, Patch-For-Review: [CIM] Skewed ranking with the top Editors monthly API - https://phabricator.wikimedia.org/T370470#10687053 (Eevans) Notes for production (when we're ready): Step 1: Upgrade the data-gateway to v1.0.12 Ste...
[17:31:23] Data-Engineering, Event-Platform: Event Platform - Support JSON Schema draft-2019-09 schemas - https://phabricator.wikimedia.org/T390232#10687136 (mpopov)
[18:54:52] Data-Engineering (Q3 2025 January 1st - March 31th), Wikidata: The latest wikidata entity dump (latest-all.ttl.bz2) contains each triple twice - https://phabricator.wikimedia.org/T389787#10687328 (Hannah_Bast) Good news, the latest version of `latest-all.ttl.bz2` in https://dumps.wikimedia.org/wikidatawi...
[19:26:20] Data-Engineering (Q3 2025 January 1st - March 31th): Assess data platform implications for RFC domain unification - https://phabricator.wikimedia.org/T389696#10687473 (mforns) Heya! I've put together a spreadsheet with all the datasets we Data Engineering maintain, with information about whether they would...
[20:27:08] Data-Engineering, observability, Event-Platform: Data Platform, SRE Observability, overlaps, use cases, and potential - https://phabricator.wikimedia.org/T390323#10687654 (Ottomata)
[20:27:24] Data-Engineering, observability, Event-Platform: Data Platform, SRE Observability, overlaps, use cases, and potential - https://phabricator.wikimedia.org/T390323#10687660 (Ottomata)
[20:27:27] Analytics, Data-Engineering, Observability-Logging, SRE, and 2 others: Produce ECS formatted logstash logs to Event Platform, allowing them to be queried in the WMF Data Lake with SQL - https://phabricator.wikimedia.org/T291645#10687659 (Ottomata)
[20:29:53] Data-Engineering, observability, Observability-Metrics, Event-Platform: Produce MediaWiki client emitted operational metrics into Event Platform, allowing them to be queried in the WMF Data Lake with SQL - https://phabricator.wikimedia.org/T390328#10687668 (Ottomata)
[21:03:32] (PS1) Aleksandar Mastilovic: Add new gobblin jar and update symlinks [analytics/refinery] - https://gerrit.wikimedia.org/r/1132031 (https://phabricator.wikimedia.org/T390247)
[21:31:32] Data-Engineering, observability, Observability-Metrics, Event-Platform: Produce MediaWiki client emitted operational metrics into Event Platform, allowing them to be queried in the WMF Data Lake with SQL - https://phabricator.wikimedia.org/T390328#10687810 (VirginiaPoundstone) CC @kappakayala let...
[21:33:48] Data-Engineering, observability, Event-Platform: Data Platform, SRE Observability, overlaps, use cases, and potential - https://phabricator.wikimedia.org/T390323#10687811 (VirginiaPoundstone) CC @kappakayala this is a parent task that we can use to collect various use cases for observability use case...
[23:22:38] Data-Engineering (Q4 2025 April 1st - June 30th), EventStreams, SRE Observability, Patch-For-Review: Eventstreams 'assignments' logstash field type - https://phabricator.wikimedia.org/T390140#10688011 (colewhite) We've rolled out a logstash filter to check for name KafkaSSE and to cast the assign...