[02:53:08] Data-Engineering, Movement-Insights, Product-Analytics, Research, and 2 others: Temporary Accounts Initiative (IP Masking) - Add user_is_temporary and user_is_permanent to data tables - https://phabricator.wikimedia.org/T356701#10507142 (nshahquinn-wmf) →Duplicate dup:T377293
[02:53:09] Data-Engineering, DPE Temporary Accounts (Sprint 1), Epic: [Epic] Modify DPE pipelines to account for Temp Accounts - https://phabricator.wikimedia.org/T377293#10507144 (nshahquinn-wmf)
[06:13:07] Data-Engineering (Q3 2024 January 1st - March 31th), function-evaluator, Wikifunctions, Abstract Wikipedia team (25Q3 (Jan–Mar)), Patch-For-Review: Function Evaluator log data loss due to ECS nonconforming fields - https://phabricator.wikimedia.org/T383448#10507246 (ecarg) I can see our logs...
[14:06:33] Data-Engineering (Q3 2024 January 1st - March 31th): HDFS capacity needs data engineering and platform users - https://phabricator.wikimedia.org/T384100#10508348 (JAllemandou) We can add the `webrequest_actor` and `pageview_actor` datasets if we are in a similar case as the unique-devices one. This would rep...
[14:10:36] Data-Engineering: [Maintenance] Add a deletion job for `hdfs_usage` data - https://phabricator.wikimedia.org/T348774#10508371 (JAllemandou) Nope, we haven't addressed this. We could reuse a similar pattern as what has been done with `drop_older_than` in the `webrequest_frontend` airflow jobs: https://gitlab....
[14:13:01] (CR) Joal: "The code looks good but I have a broader question about this change: does the SRE team still need the `webrequest_sampled` druid datasourc" [analytics/refinery] - https://gerrit.wikimedia.org/r/1114340 (https://phabricator.wikimedia.org/T383900) (owner: Filippo Giunchedi)
[14:24:10] (PS2) Filippo Giunchedi: Pick up tls_sess and nocookies for webrequest_sampled [analytics/refinery] - https://gerrit.wikimedia.org/r/1114340 (https://phabricator.wikimedia.org/T383900)
[14:24:17] (CR) Filippo Giunchedi: "Good point, I'm happy to leave webrequest_sampled behind and will amend the patch to change webrequest_sampled_live only." [analytics/refinery] - https://gerrit.wikimedia.org/r/1114340 (https://phabricator.wikimedia.org/T383900) (owner: Filippo Giunchedi)
[14:31:22] Data-Engineering, Dumps 2.0: Modify XML dumping code to be able to do 'partial' dumps - https://phabricator.wikimedia.org/T384383#10508536 (pfischer) Open→In progress a:pfischer
[14:33:18] Data-Engineering, Dumps 2.0: Refactor code to use new table and column names - https://phabricator.wikimedia.org/T384385#10508553 (pfischer) Open→In progress a:pfischer
[14:46:49] Data-Engineering (Q3 2024 January 1st - March 31th), Product-Analytics, Event-Platform, Patch-For-Review: Enable Event Platform instruments to opt out of collecting User-Agent data - https://phabricator.wikimedia.org/T382173#10508703 (Ottomata) > Clients should set the value of their user-agent i...
[14:53:34] Data-Engineering (Q3 2024 January 1st - March 31th), Essential-Work: Analyze Dumps Usage Through Apache Logs - https://phabricator.wikimedia.org/T383175#10508760 (HCoplin-WMF) I browsed through the data for the dumps logs. It's interesting, especially seeing ZH content as the top hit. Were you able to de...
[14:56:58] Data-Engineering, WikiLambda, WikiLambda Front-end, Abstract Wikipedia team (25Q3 (Jan–Mar)), Schema-change: Run the T383561 ALTERs on wikifunctionswiki - https://phabricator.wikimedia.org/T385183 (Jdforrester-WMF) NEW
[14:58:07] (CR) Joal: [C:+1] "Ok great - Can we take this as a "yes, you can deprecate the `webrequest_sampled` datasource in Druid"? If so, I'll create a ticket and we" [analytics/refinery] - https://gerrit.wikimedia.org/r/1114340 (https://phabricator.wikimedia.org/T383900) (owner: Filippo Giunchedi)
[15:07:01] (PS1) Gerrit maintenance bot: Add knc.wikipedia to pageview allowlist [analytics/refinery] - https://gerrit.wikimedia.org/r/1115418 (https://phabricator.wikimedia.org/T385185)
[15:30:51] (CR) Filippo Giunchedi: "Yes please and thank you, it is my understanding that webrequest_sampled can be left behind. However I'll loop in SRE in the task as well " [analytics/refinery] - https://gerrit.wikimedia.org/r/1114340 (https://phabricator.wikimedia.org/T383900) (owner: Filippo Giunchedi)
[15:41:30] (CR) CDanis: "+1 from me" [analytics/refinery] - https://gerrit.wikimedia.org/r/1114340 (https://phabricator.wikimedia.org/T383900) (owner: Filippo Giunchedi)
[15:49:21] (PS15) Peter Fischer: Rewrite MediawikiDumper partitioning implementation [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101892 (https://phabricator.wikimedia.org/T381016)
[15:49:22] (PS1) Peter Fischer: Partial dumps [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1115441 (https://phabricator.wikimedia.org/T384383)
[16:06:08] Data-Engineering: Deprecate `webrequest_sampled_128` druid datasource - https://phabricator.wikimedia.org/T385198 (JAllemandou) NEW
[16:11:40] Data-Engineering (Q3 2024 January 1st - March 31th), Dumps 2.0, Patch-For-Review: Modify XML dumping code to be able to do 'partial' dumps - https://phabricator.wikimedia.org/T384383#10509246 (pfischer)
[16:12:18] Data-Engineering: airflow-dags: Mutualization of _IMPORTED flag sensors creations - https://phabricator.wikimedia.org/T371373#10509248 (amastilovic) > We could add an option to the Dataset library to describe the flag in the YAML file and use it by default to build a sensor for the _PARTITIONED flag. The dat...
[16:12:54] Data-Engineering (Q3 2024 January 1st - March 31th), Dumps 2.0, Patch-For-Review: Refactor code to use new table and column names - https://phabricator.wikimedia.org/T384385#10509251 (pfischer)
[16:17:29] Data-Engineering (Q3 2024 January 1st - March 31th), Dumps 2.0: Modify code to dump all slots - https://phabricator.wikimedia.org/T384945#10509263 (pfischer) Open→In progress a:pfischer
[16:19:33] (CR) CI reject: [V:-1] Rewrite MediawikiDumper partitioning implementation [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101892 (https://phabricator.wikimedia.org/T381016) (owner: Peter Fischer)
[16:19:38] (CR) CI reject: [V:-1] Adapt table/column names [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1115440 (https://phabricator.wikimedia.org/T384385) (owner: Peter Fischer)
[16:19:41] (CR) CI reject: [V:-1] Partial dumps [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1115441 (https://phabricator.wikimedia.org/T384383) (owner: Peter Fischer)
[16:21:08] Data-Engineering: [Data Quality] Improve Superset visualizations - https://phabricator.wikimedia.org/T372678#10509276 (Ahoelzl)
[16:25:24] Data-Engineering (Q3 2024 January 1st - March 31th), Experimentation Lab, Dumps 2.0 (Kanban Board), Patch-For-Review: Dashboard and alerting of data quality metrics for wmf_content.mediawiki_content_history_v1 - https://phabricator.wikimedia.org/T357684#10509283 (tchin)
[16:36:10] (CR) Joal: [V:+2 C:+2] "Thank you folks for the answers about `webrequest_sampled`. Related ticket: https://phabricator.wikimedia.org/T385198. I'm going to merge " [analytics/refinery] - https://gerrit.wikimedia.org/r/1114340 (https://phabricator.wikimedia.org/T383900) (owner: Filippo Giunchedi)
[17:02:37] analytics1073 is down since two days, known issue? I couldn't find a hw task for it in Phab
[17:19:26] Data-Engineering: Deprecate `webrequest_sampled_128` druid datasource - https://phabricator.wikimedia.org/T385198#10509552 (JAllemandou)
[17:22:47] Data-Engineering (Q3 2024 January 1st - March 31th), Commons-Impact-Metrics, Commons-Impact-Metrics-Requests: Update Commons Impact Metrics allow-list January 2025 - https://phabricator.wikimedia.org/T384259#10509566 (mforns)
[18:18:23] Data-Engineering, Machine-Learning-Team, Event-Platform: Create new mediawiki.page_links_change stream based on fragment/mediawiki/state/change/page - https://phabricator.wikimedia.org/T331399#10509906 (Ladsgroup) In my volunteer capacity, I would love to have a stream of external links added (e.g. l...