[11:35:41] Data-Engineering, Traffic: Reduce noise from duplicate sequence-gap alerts on HaProxy-webrequests - https://phabricator.wikimedia.org/T401383 (Antoine_Quhen) NEW
[12:40:24] (CR) Joal: [V:+2 C:+2] "LGTM - merging" [analytics/refinery] - https://gerrit.wikimedia.org/r/1176294 (https://phabricator.wikimedia.org/T398236) (owner: CDanis)
[12:41:36] (CR) Joal: [V:+2 C:+2] "LGTM - merging" [analytics/refinery] - https://gerrit.wikimedia.org/r/1176298 (https://phabricator.wikimedia.org/T400753) (owner: CDanis)
[12:48:59] !log Reload webrequest druid realtime ingestion job with client_port removal and wmfuniq addition
[12:49:00] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:49:05] cdanis: for you to know --^
[12:51:39] thanks joal!
[13:20:26] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE: Investigate Hadoop 3 container support with reference to Airflow deployment pipelines - https://phabricator.wikimedia.org/T288247#11068260 (BTullis) I have written a design document for this: [[https://docs.google.com/document/d/1kQJMCoEdBL...
[13:26:11] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE: Investigate Hadoop 3 container support with reference to Airflow deployment pipelines - https://phabricator.wikimedia.org/T288247#11068298 (BTullis)
[13:39:38] (PS1) Brouberol: druid/kafka: update kafka-jumbo-eqiad broker list [analytics/refinery] - https://gerrit.wikimedia.org/r/1176482 (https://phabricator.wikimedia.org/T397447)
[13:42:13] (CR) Joal: [C:+2] "LGTM!
merge at will" [analytics/refinery] - https://gerrit.wikimedia.org/r/1176482 (https://phabricator.wikimedia.org/T397447) (owner: Brouberol)
[13:50:14] (CR) Brouberol: [V:+2] druid/kafka: update kafka-jumbo-eqiad broker list [analytics/refinery] - https://gerrit.wikimedia.org/r/1176482 (https://phabricator.wikimedia.org/T397447) (owner: Brouberol)
[14:47:06] Analytics-Radar, observability, Vector 2022, Wikimedia-Logstash, Epic: Client side error logging production launch - https://phabricator.wikimedia.org/T226986#11068606 (dr0ptp4kt) @Krinkle I failed to link to the ticket where we're looking at this (thanks @Ottomata for helping me realize!...
[14:49:30] Analytics, Data-Engineering, Event-Platform, Wikimedia-Performance-recommendation: Avoid extra HTTPS connections for most Event Platform beacons - https://phabricator.wikimedia.org/T263049#11068617 (dr0ptp4kt) I should have put the comment on this here ticket instead of the old closed ticket, so...
[14:52:20] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE (2025.07.26 - 2025.08.15): Investigate Hadoop 3 container support with reference to Airflow deployment pipelines - https://phabricator.wikimedia.org/T288247#11068633 (BTullis) Open→Resolved I will resolve this ticket, based on th...
[14:55:50] (CR) Btullis: [C:+1] druid/kafka: update kafka-jumbo-eqiad broker list [analytics/refinery] - https://gerrit.wikimedia.org/r/1176482 (https://phabricator.wikimedia.org/T397447) (owner: Brouberol)
[15:09:39] !log failing over hadoop nameserver to an-master1004 for T397160
[15:09:42] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:09:43] T397160: Set 52 hadoop workers into decommissioning status - https://phabricator.wikimedia.org/T397160
[15:14:19] Analytics, Data-Engineering, Event-Platform, Wikimedia-Performance-recommendation: Avoid extra HTTPS connections for most Event Platform beacons - https://phabricator.wikimedia.org/T263049#11068809 (CDanis) You got me curious, so I spent a little time digging on this -- as far as I can tell not...
[15:19:06] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Data-Platform-SRE (2025.07.26 - 2025.08.15): FAIL: refinery-drop-raw-event alerting - https://phabricator.wikimedia.org/T400393#11068843 (BTullis) Open→Resolved Thanks @Ottomata - I didn't think to check the test cluster :-)
[15:48:21] Data-Engineering, Data-Platform-SRE (2025.07.26 - 2025.08.15): Request for dedicated Airflow instance for WME - https://phabricator.wikimedia.org/T396672#11068946 (BTullis) @dr0ptp4kt The most useful thing to check on each instance is the [[https://airflow-platform-eng.wikimedia.org/users/userinfo/|Your...
[17:54:22] Data-Engineering-Radar, CheckUser, DBA, Trust and Safety Product Team, Schema-change-in-production: Add '*_actor_ip_hex_time' indexes to 'cu_changes', 'cu_log_event', and 'cu_private_event' on WMF wikis - https://phabricator.wikimedia.org/T399728#11069426 (FCeratto-WMF)
[18:32:49] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-DomainEvents: Finalize and move Cross-Service Integration events design document to mediawiki.org - https://phabricator.wikimedia.org/T400095#11069502 (Ottomata)
[18:33:33] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-DomainEvents: Finalize and move Cross-Service Integration events design document to mediawiki.org - https://phabricator.wikimedia.org/T400095#11069505 (Ottomata) The document is now at https://www.mediawiki.org/wiki/Manual:Domain_events/O...
[18:34:06] Data-Engineering (Q1 FY25/26 July 1st - September 30th), EventStreams, Event-Platform, Patch-For-Review: EventStreams: duplicate events from double compute (wdqs/rdf) streams - https://phabricator.wikimedia.org/T396564#11069518 (Ottomata) a:Ottomata→dcausse
[19:20:56] hello, just wanted to report this one: etcd fails to start on 3 dse-k8s-etcd servers. we noticed this while looking at other puppet failures. errors here:
[19:20:59] https://puppetboard.wikimedia.org/failures
[19:21:23] puppet repo tells me the owners are Data Platform and Machine Learning. so dropping it here
[19:29:01] joal: milimetric: RE implementation of T389696 - is there a new task I may refer to?
It's okay if it takes a bit longer to complete, but that way going forward the chain of comms can be shorter between the task and everyone asking me about it :)
[19:29:02] T389696: Analyze impact for webrequest and unique devices pipelines to derive access_method without m-dot domain - https://phabricator.wikimedia.org/T389696
[19:53:16] Data-Engineering, Product-Analytics: Most edits to subsequently deleted pages have Null event_comment data in mediawiki_history - https://phabricator.wikimedia.org/T401436 (Samwalton9-WMF) NEW
[19:54:20] Data-Engineering, Product-Analytics: Most edits to subsequently deleted pages have Null event_comment data in mediawiki_history - https://phabricator.wikimedia.org/T401436#11069685 (Samwalton9-WMF)
[19:54:28] Data-Engineering, Product-Analytics: Most edits to subsequently deleted pages have Null event_comment data in mediawiki_history - https://phabricator.wikimedia.org/T401436#11069686 (Samwalton9-WMF)
[23:23:06] Data-Engineering, Data-Platform-SRE (2025.07.26 - 2025.08.15): Request for dedicated Airflow instance for WME - https://phabricator.wikimedia.org/T396672#11070066 (dr0ptp4kt) @BTullis thanks! IIRC I had previously logged out and logged back in post the LDAP grant and //not// seen the additional functiona...