[00:56:20] Data-Engineering, Language and Product Localization: Shut down the Language Reportcard - https://phabricator.wikimedia.org/T384409 (nshahquinn-wmf) NEW
[00:57:06] Data-Engineering, Language and Product Localization: Shut down the Language Reportcard - https://phabricator.wikimedia.org/T384409#10482497 (nshahquinn-wmf) Not sure whether this is for #language_and_product_localization or #data-engineering.
[01:00:23] Data-Engineering: Shut down the Page Creation dashboard - https://phabricator.wikimedia.org/T384410 (nshahquinn-wmf) NEW
[01:03:21] Data-Engineering: Shut down the Vital Signs dashboard - https://phabricator.wikimedia.org/T384411 (nshahquinn-wmf) NEW
[01:12:41] Data-Engineering, Data-Engineering-Dashiki, Data-Engineering-Radar, Experimentation Lab, and 3 others: Data Platform - Public dashboard support - https://phabricator.wikimedia.org/T361214#10482550 (nshahquinn-wmf)
[01:16:52] Data-Engineering: Shut down the Flow Reportcard - https://phabricator.wikimedia.org/T384413 (nshahquinn-wmf) NEW
[06:40:06] Data-Engineering, Data-Engineering-Radar, CampaignEvents, Data-Persistence, and 7 others: Add "event_is_test_event" field to "campaign_events" table - https://phabricator.wikimedia.org/T381759#10482712 (Marostegui) @Daimona so this ticket is fully ready to be deployed in production?
[07:09:59] Data-Engineering, Dumps 2.0, Dumps-Generation, SRE: Dumps generation cause disruption to the production environment - https://phabricator.wikimedia.org/T368098#10482785 (Nemo_bis) Thanks for the update on XML data dumps list. I see there's progress on the other side: https://phabricator.wikimedia...
[09:33:50] Data-Engineering, Data-Engineering-Radar, CampaignEvents, Data-Persistence, and 7 others: Add "event_is_test_event" field to "campaign_events" table - https://phabricator.wikimedia.org/T381759#10483021 (cmelo) >>! In T381759#10482712, @Marostegui wrote: > @Daimona so this ticket is fully ready to...
[09:36:50] Data-Engineering, Data-Engineering-Radar, CampaignEvents, Data-Persistence, and 7 others: Add "event_is_test_event" field to "campaign_events" table - https://phabricator.wikimedia.org/T381759#10483026 (Marostegui) a:MHorsey-WMF→Marostegui Excellent, thank you
[11:01:05] Data-Engineering, Data-Engineering-Radar, CampaignEvents, Data-Persistence, and 7 others: Add "event_is_test_event" field to "campaign_events" table - https://phabricator.wikimedia.org/T381759#10483401 (Marostegui) In progress→Resolved Deployed on the listed wikis.
[12:18:53] Data-Engineering, Data-Engineering-Dashiki, Data-Engineering-Radar, Experimentation Lab, and 3 others: Data Platform - Public dashboard support - https://phabricator.wikimedia.org/T361214#10483721 (BTullis) >>! In T361214#10035582, @AndrewTavis_WMDE wrote: > Bringing a point from @Ottomata from t...
[12:40:35] Data-Engineering, Data-Engineering-Radar, DBA, Patch-For-Review: Deploy new file tables in production - https://phabricator.wikimedia.org/T383491#10483790 (Ladsgroup)
[12:41:52] Data-Engineering, Data-Engineering-Dashiki, Data-Engineering-Radar, Experimentation Lab, and 3 others: Data Platform - Public dashboard support - https://phabricator.wikimedia.org/T361214#10483793 (Ottomata) Amazing Ben! Details to discuss, but one requirement we might need to add is accessible o...
[12:43:24] Data-Engineering, Data-Engineering-Dashiki, Data-Engineering-Radar, Experimentation Lab, and 3 others: Data Platform - Public dashboard support - https://phabricator.wikimedia.org/T361214#10483798 (Ottomata) In some ways, this proposal sounds similar to {T377362}. Too bad we can't all just use th...
[12:58:46] Data-Engineering, Data-Engineering-Radar, DBA, Patch-For-Review: Deploy new file tables in production - https://phabricator.wikimedia.org/T383491#10483822 (Ladsgroup)
[13:01:37] Data-Engineering, Data-Engineering-Radar, DBA, Patch-For-Review: Deploy new file tables in production - https://phabricator.wikimedia.org/T383491#10483838 (Ladsgroup)
[13:17:58] Data-Engineering, DBA, Patch-For-Review: Deploy new file tables in production - https://phabricator.wikimedia.org/T383491#10483870 (fnegri) a:Ladsgroup→fnegri Thanks @Ladsgroup for reviewing and merging, I'm gonna run the `update-views` cookbook, following https://wikitech.wikimedia.org/wiki/...
[13:31:47] Data-Engineering, DBA: Deploy new file tables in production - https://phabricator.wikimedia.org/T383491#10483906 (Ladsgroup)
[13:32:41] Data-Engineering, DBA: Deploy new file tables in production - https://phabricator.wikimedia.org/T383491#10483908 (fnegri) a:fnegri→Ladsgroup Reassigning it to @Ladsgroup as he's already on it!
[14:09:08] Data-Engineering, DBA: Deploy new file tables in production - https://phabricator.wikimedia.org/T383491#10484040 (Ladsgroup) Open→Resolved >>! In T383491#10483908, @fnegri wrote: > Reassigning it to @Ladsgroup as he's already on it! Sorry I stepped on your toes, I missed the comment and didn...
[14:16:16] Data-Engineering, Data-Engineering-Dashiki, Data-Engineering-Radar, Experimentation Lab, and 3 others: Data Platform - Public dashboard support - https://phabricator.wikimedia.org/T361214#10484083 (BTullis) >>! In T361214#10483793, @Ottomata wrote: > Details to discuss, but one requirement we mig...
[14:32:45] Data-Engineering, Data-Platform-SRE, Dumps-Generation, MW-1.39-notes, and 3 others: WE 5.4 KR - Hypothesis 5.4.6 - Q3 FY24/25 - Validate Dumps 1.0 compatibility with PHP 8.1 - https://phabricator.wikimedia.org/T382484#10484142 (BTullis) I performed a full dump of `eswiki` without errors. The file...
[14:33:15] Data-Engineering, DBA: Deploy new file tables in production - https://phabricator.wikimedia.org/T383491#10484145 (fnegri) That's ok! The exact workflow for these changes is still unclear to everyone, I'm doing my best to clarify it, see {T382607}.
[15:06:27] Data-Engineering (Q3 2024 January 1st - March 31th), Data-Platform-SRE, Epic: HDFS capacity needs FY24/25 - https://phabricator.wikimedia.org/T384098#10484387 (Gehel)
[15:06:51] Data-Engineering, Data-Engineering-Radar, SRE, Data-Platform-SRE (2025.01.11 - 2025.01.31), Patch-For-Review: Data Platform access streamlining for WMDE staff - https://phabricator.wikimedia.org/T381824#10484390 (jcrespo) Feel free to review the patch above as well as the wiki changes documen...
[15:08:47] Data-Engineering (Q3 2024 January 1st - March 31th), Data-Platform-SRE, Epic: HDFS capacity needs FY24/25 - https://phabricator.wikimedia.org/T384098#10484401 (Gehel) We also want to know if we need to plan for additional retention of webrequests
[15:53:42] Data-Engineering, Data-Engineering-Dashiki, Data-Engineering-Radar, Experimentation Lab, and 3 others: Data Platform - Public dashboard support - https://phabricator.wikimedia.org/T361214#10484759 (xcollazo) First, I love that this use case is getting attention, this is great! I want to caution...
[16:07:05] Data-Engineering: Fix skein/spark memory unit missfit - https://phabricator.wikimedia.org/T383589#10484950 (JAllemandou) Open→Resolved
[16:08:55] Data-Engineering, EventStreams, Event-Platform, Patch-For-Review: EventStreams: kafka key should be serialized as a string - https://phabricator.wikimedia.org/T373689#10484961 (Ottomata)
[16:09:02] Data-Engineering, EventStreams, Event-Platform, Patch-For-Review: EventStreams: kafka key should be serialized as a string - https://phabricator.wikimedia.org/T373689#10484963 (Ottomata) p:Triage→Medium
[16:17:47] Data-Engineering, MediaWiki-extensions-EventLogging, Essential-Work, Experimentation Lab (Experiment Platform Sprint 1), and 2 others: [SPIKE] Investigate possible event loss on navigation in Google Chrome - https://phabricator.wikimedia.org/T384307#10485006 (Milimetric)
[16:20:09] Data-Engineering, CampaignEvents, Campaigns-Product-Team, Data-Persistence, Schema-change: Convert `ce_participants.cep_private` from boolean to mwtinyint - https://phabricator.wikimedia.org/T383777#10485026 (Roshan_s) a:Roshan_s
[16:20:58] Data-Engineering, CampaignEvents, Campaigns-Product-Team, Data-Persistence, Schema-change: Convert `ce_participants.cep_private` from boolean to mwtinyint - https://phabricator.wikimedia.org/T383777#10485039 (Roshan_s) I would like to work on this issue !!
[16:59:29] Data-Engineering, Data-Persistence, Dumps-Generation, Data-Platform-SRE (2025.01.11 - 2025.01.31), Patch-For-Review: Switch dumps 1.0 processes to use the analytics MariadB replicas (dbstore100[7-9]) - https://phabricator.wikimedia.org/T382947#10485316 (BTullis) I have also tried a test dump...
[17:13:18] Data-Engineering, Data-Persistence, Dumps-Generation, Data-Platform-SRE (2025.01.11 - 2025.01.31), Patch-For-Review: Switch dumps 1.0 processes to use the analytics MariadB replicas (dbstore100[7-9]) - https://phabricator.wikimedia.org/T382947#10485391 (BTullis) >>! In T382947#10476420, @Joe...
[18:31:03] Data-Engineering (Q3 2024 January 1st - March 31th): HDFS capacity needs data engineering and platform users - https://phabricator.wikimedia.org/T384100#10485910 (Ahoelzl) Working assumption: we need to be able to retain up to 6 months of webrequests data, legal might at any time request that.
[18:31:27] Data-Engineering (Q3 2024 January 1st - March 31th): HDFS capacity needs data engineering and platform users - https://phabricator.wikimedia.org/T384100#10485915 (Ahoelzl) a:JAllemandou
[18:31:47] Data-Engineering (Q3 2024 January 1st - March 31th), Data-Platform-SRE, Epic: HDFS capacity needs FY24/25 - https://phabricator.wikimedia.org/T384098#10485918 (Ahoelzl) a:Ahoelzl
[18:32:00] Data-Engineering (Q3 2024 January 1st - March 31th): HDFS capacity needs HTML dumps - https://phabricator.wikimedia.org/T384099#10485919 (Ahoelzl) a:JAllemandou
[18:33:19] Data-Engineering (Q3 2024 January 1st - March 31th): HDFS capacity needs HTML dumps - https://phabricator.wikimedia.org/T384099#10485920 (Ahoelzl) As planning, conversations on the actual HTML dumps pipeline and operationalization are in progress this can be a reasonable first estimate.
[18:54:06] !log starting deployment of data lake temp accounts changes
[18:54:08] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:07:31] (PS9) Mforns: Update MediaWiki History to support Temp Accounts [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1088232 (https://phabricator.wikimedia.org/T379230)
[19:08:58] (CR) Mforns: [C:+2] "Deploying!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1088232 (https://phabricator.wikimedia.org/T379230) (owner: Mforns)
[19:09:07] (CR) Mforns: [V:+2 C:+2] Update MediaWiki History to support Temp Accounts [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1088232 (https://phabricator.wikimedia.org/T379230) (owner: Mforns)
[19:23:23] !log [data lake temp accounts] stopped DAGs
[19:23:24] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:28:31] (PS1) Mforns: Update changelog.md for v0.2.56 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1113531
[19:30:26] (CR) Mforns: [V:+2 C:+2] "LGTM!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1113531 (owner: Mforns)
[19:33:05] Starting build #25 for job analytics-refinery-maven-release
[19:39:02] (CR) Mforns: [V:+2] Modify MediaWiki History queries to support Temp Accounts (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/1088342 (https://phabricator.wikimedia.org/T379230) (owner: Mforns)
[19:49:52] (CR) Snwachukwu: [C:+2] Modify MediaWiki History queries to support Temp Accounts [analytics/refinery] - https://gerrit.wikimedia.org/r/1088342 (https://phabricator.wikimedia.org/T379230) (owner: Mforns)
[19:50:12] (CR) Snwachukwu: [C:+2] Add New Edit Hourly HQL for Temp Account Change [analytics/refinery] - https://gerrit.wikimedia.org/r/1084873 (https://phabricator.wikimedia.org/T377767) (owner: Jennifer Ebe)
[19:50:29] (CR) Snwachukwu: [C:+2] Edit Geoeditors Daily Monthly to support Temp Account Changes [analytics/refinery] - https://gerrit.wikimedia.org/r/1098083 (https://phabricator.wikimedia.org/T379728) (owner: Jennifer Ebe)
[19:51:25] (CR) Snwachukwu: [V:+2 C:+2] Update Druid Geoeditors monthly to Support Temp Accounts. [analytics/refinery] - https://gerrit.wikimedia.org/r/1099006 (https://phabricator.wikimedia.org/T379772) (owner: Snwachukwu)
[19:51:35] (CR) Snwachukwu: [V:+2] unique_editors_by_country support temporary and permanent users [analytics/refinery] - https://gerrit.wikimedia.org/r/1106349 (https://phabricator.wikimedia.org/T379771) (owner: Aleksandar Mastilovic)
[19:52:02] (CR) Snwachukwu: [V:+2] Update Geoeditors Edits Monthly to support Temp Accounts. [analytics/refinery] - https://gerrit.wikimedia.org/r/1092890 (https://phabricator.wikimedia.org/T379768) (owner: Snwachukwu)
[19:53:48] Project analytics-refinery-maven-release build #25: SUCCESS in 20 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release/25/
[19:59:31] Starting build #22 for job analytics-refinery-update-jars
[19:59:55] (PS1) Maven-release-user: Add refinery-source jars for v0.2.56 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1113536
[19:59:55] Project analytics-refinery-update-jars build #22: SUCCESS in 24 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars/22/
[20:05:00] (CR) Mforns: [C:+2] Add refinery-source jars for v0.2.56 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1113536 (owner: Maven-release-user)
[20:05:03] (CR) Mforns: [V:+2 C:+2] Add refinery-source jars for v0.2.56 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1113536 (owner: Maven-release-user)
[20:06:39] !log [data lake temp accounts] deployed refinery source
[20:06:41] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:11:42] !log [data lake temp accounts] starting refinery deployment
[20:11:44] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:28:24] !log [data lake temp accounts] deployed airflow
[20:33:13] !log [data lake temp accounts] finished refinery deployment
[20:33:15] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:39:44] Data-Engineering, Data-Engineering-Radar, Wmfdata-Python: Remove Wmfdata's custom update-notification code - https://phabricator.wikimedia.org/T346706#10486377 (nshahquinn-wmf) p:Triage→Low
[20:51:08] Data-Engineering, Wmfdata-Python: Deprecate the Hive module - https://phabricator.wikimedia.org/T384541 (nshahquinn-wmf) NEW p:Triage→Low
[20:57:13] Data-Engineering, Wmfdata-Python: Deprecate the Hive module - https://phabricator.wikimedia.org/T384541#10486442 (nshahquinn-wmf) @fkaelin has put up [an MR](https://gitlab.wikimedia.org/repos/data-engineering/wmfdata-python/-/merge_requests/61) that simply switches `run` and `load_csv` to use Spark unde...
[21:34:47] Data-Engineering, Dumps 2.0 (Kanban Board): HDFS capacity needs for XML Dumps temporary storage - https://phabricator.wikimedia.org/T384397#10486593 (xcollazo) A full dump from Dumps 1.0 for all wikis weights about ~7.5TB: ` xcollazo@snapshot1011:/mnt/dumpsdata/xmldatadumps$ hostname -f snapshot1011.eqia...
[21:35:37] Data-Engineering, Dumps 2.0 (Kanban Board): HDFS capacity needs for XML Dumps temporary storage - https://phabricator.wikimedia.org/T384397#10486595 (xcollazo) Open→Resolved p:Triage→High
[21:38:23] Data-Engineering, Data-Engineering-Icebox, Movement-Insights, Wmfdata-Python: Support importing a Parquet file into HDFS using wmfdata-python - https://phabricator.wikimedia.org/T273196#10486601 (nshahquinn-wmf) a:nshahquinn-wmf @fkaelin has put up [MR 66](https://gitlab.wikimedia.org/repos/d...
[22:12:17] Data-Engineering, Wmfdata-Python: mariadb.run returns SQL dates types as Python datetime.date, not Pandas datetime - https://phabricator.wikimedia.org/T384546 (nshahquinn-wmf) NEW p:Triage→Low
[22:18:40] Data-Engineering, Wmfdata-Python: mariadb.run returns SQL dates types as Python datetime.date, not Pandas datetime - https://phabricator.wikimedia.org/T384546#10486772 (nshahquinn-wmf)
[22:18:50] Data-Engineering, Wmfdata-Python: mariadb.run returns SQL dates as Python datetime.date, not Pandas datetime - https://phabricator.wikimedia.org/T384546#10486773 (nshahquinn-wmf)
[22:19:55] Data-Engineering, Wmfdata-Python: mariadb.run returns SQL dates as Python datetime.date, not Pandas datetime - https://phabricator.wikimedia.org/T384546#10486781 (nshahquinn-wmf)
[22:34:44] Data-Engineering, Wmfdata-Python: spark.run returns SQL dates as Python datetime.date, not Pandas datetime - https://phabricator.wikimedia.org/T384548 (nshahquinn-wmf) NEW p:Triage→Low
[22:36:19] !log [data lake temp accounts] re-ran DAG mediawiki_history_denormalized for 2024-12
[22:36:21] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log