[01:13:39] Data-Engineering, AQS2.0, Cassandra: Golang-based Cassandra clients do not perform TLS host verification - https://phabricator.wikimedia.org/T361964#11105012 (Eevans)
[01:14:26] Data-Engineering, AQS2.0, Cassandra: Golang-based Cassandra clients do not perform TLS host verification - https://phabricator.wikimedia.org/T361964#11105013 (Eevans) @elukey as I recall, you didn't want to go the IP SAN route, is that correct?
[11:09:19] Data-Engineering, DBA, Schema-change-in-production: Add cl_timestamp_id index to categorylinks table - https://phabricator.wikimedia.org/T399249#11105951 (Ladsgroup)
[11:52:41] (CR) Joal: [C:+2] Update parent POM [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1147771 (https://phabricator.wikimedia.org/T367405) (owner: Peter Fischer)
[12:06:16] (Merged) jenkins-bot: Update parent POM [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1147771 (https://phabricator.wikimedia.org/T367405) (owner: Peter Fischer)
[12:13:50] (CR) Joal: [V:+2 C:+2] "Merging for deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/1178532 (https://phabricator.wikimedia.org/T401665) (owner: Joal)
[12:14:32] (CR) Joal: [V:+2 C:+2] "Merge for deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/1178895 (https://phabricator.wikimedia.org/T367405) (owner: Joal)
[12:36:11] Data-Engineering, Traffic: Export development_network_probe data to Puppet servers for CDN deployment - https://phabricator.wikimedia.org/T402512 (Vgutierrez) NEW
[12:36:25] Data-Engineering, Traffic: Export development_network_probe data to Puppet servers for CDN deployment - https://phabricator.wikimedia.org/T402512#11106139 (Vgutierrez) p:Triage→Medium
[12:36:54] (PS1) Joal: Update project to release v0.3.0 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1180854
[12:40:45] Data-Engineering, Traffic: Export development_network_probe data to Puppet servers for CDN deployment - https://phabricator.wikimedia.org/T402512#11106154 (Vgutierrez)
[13:16:24] stevemunene: internet is currently down at home, causing me to be ooo for now. Would you mind relaying on slack. Thanks!
[13:32:20] (CR) Xcollazo: [C:+1] "Code LGTM, but can we associate a phab to this ticket?" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1180854 (owner: Joal)
[13:32:25] sure np brouberol hopefully back soon
[13:35:25] (PS2) Joal: Update project to release v0.3.0 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1180854 (https://phabricator.wikimedia.org/T367405)
[13:37:07] (CR) Joal: [C:+2] "Merging for deploy" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1180854 (https://phabricator.wikimedia.org/T367405) (owner: Joal)
[13:47:38] (Merged) jenkins-bot: Update project to release v0.3.0 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1180854 (https://phabricator.wikimedia.org/T367405) (owner: Joal)
[13:53:17] Data-Engineering, Patch-For-Review: Improve handling of new stream onboarding in Refine - https://phabricator.wikimedia.org/T402186#11106544 (Antoine_Quhen) Open→In progress a:Antoine_Quhen We now have an implemented solution. The problem is, with the new stream, event if Gobblin ingest sinc...
[13:53:41] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Patch-For-Review: Improve handling of new stream onboarding in Refine - https://phabricator.wikimedia.org/T402186#11106549 (Antoine_Quhen)
[14:00:31] Starting build #42 for job analytics-refinery-maven-release
[14:02:52] don't panic! :-P
[14:19:01] everybody get a towel
[14:22:33] Project analytics-refinery-maven-release build #42: SUCCESS in 22 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release/42/
[14:29:59] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Data-Platform, Movement-Insights: Consider making the Automata heuristics private - https://phabricator.wikimedia.org/T402336#11106750 (Ahoelzl) To consider: would it be sufficient to parameterize the classification algorithm and store the para...
[14:59:55] (CR) Mforns: Add support for multiple compression algorithms to MW Dumper. (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1180212 (https://phabricator.wikimedia.org/T402209) (owner: Xcollazo)
[15:09:23] !log Deploy refinery using scap
[15:09:24] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:20:45] Starting build #32 for job analytics-refinery-update-jars
[15:21:23] (PS1) Maven-release-user: Add refinery-source jars for v0.3.0 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1180891
[15:21:23] Project analytics-refinery-update-jars build #32: SUCCESS in 37 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars/32/
[15:21:59] (PS5) Xcollazo: Add support for multiple compression algorithms to MW Dumper. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1180212 (https://phabricator.wikimedia.org/T402209)
[15:23:05] (CR) Xcollazo: Add support for multiple compression algorithms to MW Dumper. (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1180212 (https://phabricator.wikimedia.org/T402209) (owner: Xcollazo)
[15:25:02] (CR) Joal: [V:+2 C:+2] Add refinery-source jars for v0.3.0 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1180891 (owner: Maven-release-user)
[15:26:08] (CR) Xcollazo: Add support for multiple compression algorithms to MW Dumper. (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1180212 (https://phabricator.wikimedia.org/T402209) (owner: Xcollazo)
[15:37:25] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Refine to Hive with Airflow – Update Refine Documentation on Wikitech - https://phabricator.wikimedia.org/T392697#11107223 (Antoine_Quhen) https://wikitech.wikimedia.org/wiki/Data_Platform/Systems/Refine
[15:47:59] !log Deploying refinery onto HDFS
[15:48:00] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:51:04] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Refine to Hive with Airflow – Update Refine Documentation on Wikitech - https://phabricator.wikimedia.org/T392697#11107300 (Antoine_Quhen) https://wikitech.wikimedia.org/wiki/Data_Platform_Engineering/Ops_week#Refine_failure_report
[16:32:50] Data-Engineering, CirrusSearch, DPE-Mediawiki-Content, Discovery-Search (2025.08.15 - 2025.09.03), Patch-For-Review: Source the CirrusSearch index dumps from hadoop instead of a MW maintenance script - https://phabricator.wikimedia.org/T366248#11107620 (EBernhardson) It was fairly trivial to...
[17:22:55] Data-Engineering, Traffic: Reduce noise from duplicate sequence-gap alerts on HaProxy-webrequests - https://phabricator.wikimedia.org/T401383#11107896 (Vgutierrez) I think I've identified the issue, right now haproxy always log `sequence: 0` for `` requests
[17:30:23] Data-Engineering (Q1 FY25/26 July 1st - September 30th), DPE-Mediawiki-Content: Compare outputs between XML File Export and MediaWiki's Special:Export - https://phabricator.wikimedia.org/T402229#11107934 (xcollazo)
[17:30:41] Data-Engineering (Q1 FY25/26 July 1st - September 30th), DPE-Mediawiki-Content: Compare outputs between XML File Export and MediaWiki's Special:Export - https://phabricator.wikimedia.org/T402229#11107935 (xcollazo) Open→In progress
[17:31:22] Data-Engineering (Q1 FY25/26 July 1st - September 30th), DPE-Mediawiki-Content, Patch-For-Review: Modify XML dumping code to support multiple compression algorithms - https://phabricator.wikimedia.org/T402209#11107945 (xcollazo) Open→In progress
[17:31:50] Data-Engineering (Q1 FY25/26 July 1st - September 30th), DPE-Mediawiki-Content, Patch-For-Review: Modify XML dumping code to support multiple compression algorithms - https://phabricator.wikimedia.org/T402209#11107951 (xcollazo) a:xcollazo
[17:44:00] Data-Engineering, Traffic: Reduce noise from duplicate sequence-gap alerts on HaProxy-webrequests - https://phabricator.wikimedia.org/T401383#11107990 (Vgutierrez) Right now we get the sequence number from haproxy `%rt` log format, that's `request_counter (HTTP req or TCP session)` according to its docum...
[17:44:58] Data-Engineering, Traffic: Reduce noise from duplicate sequence-gap alerts on HaProxy-webrequests - https://phabricator.wikimedia.org/T401383#11107995 (CDanis)
[17:51:29] Data-Engineering (Q3 2025 January 1st - March 31th), DPE-Mediawiki-Content, Patch-For-Review: Optimize XML Dump code to be able to handle wikis from simplewiki to enwiki - https://phabricator.wikimedia.org/T381016#11108042 (xcollazo) For completeness, here is a working `spark3-submit` as of today...
[17:56:18] Data-Engineering, Traffic, Patch-For-Review: Reduce noise from duplicate sequence-gap alerts on HaProxy-webrequests - https://phabricator.wikimedia.org/T401383#11108064 (Vgutierrez) p:Triage→High flagging as high cause this is already making the downsampling in benthos fail (nice catch by @CD...
[18:12:44] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Movement-Insights, Traffic: NEW BUG REPORT: Investigate rise in May 2025 Reader metrics - https://phabricator.wikimedia.org/T395934#11108125 (Mayakp.wiki) Movement Insights is currently testing 1 week of baseline (April) and Issue (May) data; a...
[18:40:43] Data-Engineering, CirrusSearch, DPE-Mediawiki-Content, Discovery-Search (2025.08.15 - 2025.09.03), Patch-For-Review: Source the CirrusSearch index dumps from hadoop instead of a MW maintenance script - https://phabricator.wikimedia.org/T366248#11108298 (xcollazo) >>! In T366248#11107620, @EBe...
[19:53:13] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Superset / LDAP access for aude - https://phabricator.wikimedia.org/T402022#11108512 (Dzahn)
[19:56:17] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Superset / LDAP access for aude - https://phabricator.wikimedia.org/T402022#11108515 (Dzahn) Open→In progress uploaded a patch. Tagged with "Data Engineering" per https://wikitech.wikimedia.org/wiki/SRE/Clinic_Duty/Access_requ...
[22:07:38] Data-Engineering, CirrusSearch, DPE-Mediawiki-Content, Discovery-Search (2025.08.15 - 2025.09.05), Patch-For-Review: Source the CirrusSearch index dumps from hadoop instead of a MW maintenance script - https://phabricator.wikimedia.org/T366248#11108926 (EBernhardson) >>! In T366248#11108298,...
[22:34:07] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Hack unique_devices_per_domain recreating `.m` subdomain use `x_analytics` `ismobile` value - https://phabricator.wikimedia.org/T401666#11109039 (Hghani) Hi @JAllemandou, we also think it would be a good idea to add the access_method field to the domai...