[04:48:03] Data-Engineering, Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#9975787 (Marostegui)
[05:16:13] Data-Engineering, Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#9975803 (Marostegui)
[05:18:13] Data-Engineering, Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#9975810 (Marostegui)
[05:18:17] Data-Engineering, Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#9975812 (Marostegui)
[09:11:11] Data-Engineering, Data Products, DBA, Schema-change-in-production: Drop deprecated abuse filter fields on wmf wikis - https://phabricator.wikimedia.org/T367781#9976003 (ABran-WMF) execution collided with T367856 on s7, stopped and repooling will resume Monday.
[10:17:16] Seeing a lot of SLA miss Airflow alerts from `2024-07-11T10:00:00+00:00`; most seem to have run successfully but took longer than usual
[11:07:52] Data-Engineering, SRE, SRE-Access-Requests: Request for Kerb credentials for Ariel Glenn - https://phabricator.wikimedia.org/T368911#9976186 (ArielGlenn) In progress→Resolved a:ArielGlenn→Dzahn Hey Daniel, I'd just assumed that getting added to the analytics-privatedata-users grou...
[11:35:23] Amir1: by the way, https://analytics.wikimedia.org/published/datasets/querypage/MostTranscludedPages/?cachebust shows the results of our updated query now. (note the ?cachebust hack because something's wrong with the cache settings on that server, I pinged SRE about it)
[11:36:20] Thanks!
I will continue working on the core patch
[11:37:22] (CR) Milimetric: [V:+2 C:+2] "Ah, sorry we missed that in review as well, thanks Ben" [analytics/refinery] - https://gerrit.wikimedia.org/r/1053364 (https://phabricator.wikimedia.org/T363434) (owner: Btullis)
[11:58:01] (PS1) Milimetric: wikifunctions sqoop: Fix table location [analytics/refinery] - https://gerrit.wikimedia.org/r/1053905
[12:00:42] (CR) Milimetric: [V:+2 C:+2] wikifunctions sqoop: Fix table location [analytics/refinery] - https://gerrit.wikimedia.org/r/1053905 (owner: Milimetric)
[12:52:56] Data-Engineering: Develop Airflow ExternalTaskSensor to orchestrate DAG dependencies - https://phabricator.wikimedia.org/T369900 (Ahoelzl) NEW
[12:58:50] Data-Engineering, Data-Platform-SRE, Discovery-Search, Java-Scala-Standardization, Epic: [Epic] Replace Archiva with Gitlab artifact repositories - https://phabricator.wikimedia.org/T367315#9976462 (Gehel)
[13:26:09] (PS2) Milimetric: Implement new way to aggregate browser statistics [analytics/refinery] - https://gerrit.wikimedia.org/r/1049281 (https://phabricator.wikimedia.org/T342267)
[13:26:13] (CR) Milimetric: Implement new way to aggregate browser statistics (5 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/1049281 (https://phabricator.wikimedia.org/T342267) (owner: Milimetric)
[13:27:31] Data-Engineering, Data Products, DBA, Schema-change-in-production: Drop deprecated abuse filter fields on wmf wikis - https://phabricator.wikimedia.org/T367781#9976558 (ABran-WMF)
[14:24:12] Data-Engineering (Q1 2024 July 1st - September 30th): Develop Airflow ExternalTaskSensor to orchestrate DAG dependencies - https://phabricator.wikimedia.org/T369900#9976711 (lbowmaker)
[14:27:16] Data-Engineering (Q1 2024 July 1st - September 30th): [Spike] [Refine Refactoring] List out all production Refine datasets that need to be migrated to the config store (Airflow and Iceberg) - https://phabricator.wikimedia.org/T361498#9976715 (lbowmaker)
[14:28:56] Data-Engineering, Event-Platform: Evaluate ESC and explore an alternative design. - https://phabricator.wikimedia.org/T365005#9976724 (lbowmaker)
[14:29:29] Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform: Evaluate ESC and explore an alternative design. - https://phabricator.wikimedia.org/T365005#9976727 (lbowmaker)
[14:32:38] Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review: [Refine Refactoring] Integrate Refine workflow configuration into ESC - https://phabricator.wikimedia.org/T367134#9976743 (lbowmaker)
[14:32:48] Data-Engineering (Q1 2024 July 1st - September 30th): [Refine Refactoring] Configure and deploy all Refine data sets for parallel production processing and testing - https://phabricator.wikimedia.org/T361501#9976748 (lbowmaker)
[14:33:17] Data-Engineering (Q1 2024 July 1st - September 30th): [Refine Refactoring] Define and implement a automated testing / comparison tool for config store configured datasets - https://phabricator.wikimedia.org/T361502#9976751 (lbowmaker)
[14:33:58] Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review: [Refine refactoring] Extract refine schema management into a dedicated tool - https://phabricator.wikimedia.org/T356762#9976753 (lbowmaker)
[14:34:35] Data-Engineering (Q1 2024 July 1st - September 30th): [Refine Refactoring] Switch new Refine system outputs to production location and monitor - https://phabricator.wikimedia.org/T369845#9976755 (lbowmaker)
[14:38:33] Data-Engineering (Q1 2024 July 1st - September 30th): Migrate and re-deploy eventstreams using new service runner - https://phabricator.wikimedia.org/T361769#9976768 (lbowmaker)
[14:38:34] Data-Engineering (Q1 2024 July 1st - September 30th), MediaWiki-General, Event-Platform, MediaWiki-Platform-Team (Radar), Patch-For-Review: Create legacy EventLogging proxy HTTP intake (for MediaWikiPingback) endpoint to EventGate - https://phabricator.wikimedia.org/T353817#9976762 (lbowmaker)
[14:38:36] Data-Engineering (Q1 2024 July 1st - September 30th), MediaWiki-extensions-EventLogging, Event-Platform, Patch-For-Review: Decommission EventLogging backend components by migrating to MEP - https://phabricator.wikimedia.org/T238230#9976766 (lbowmaker)
[14:38:47] Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review: Create gitlab ci npm publish pipeline and job in workflow_utils gitlab_ci_templates - https://phabricator.wikimedia.org/T366537#9976772 (lbowmaker)
[14:39:18] Data-Engineering (Q1 2024 July 1st - September 30th): Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code - https://phabricator.wikimedia.org/T366612#9976775 (lbowmaker)
[14:39:53] Data-Engineering (Q1 2024 July 1st - September 30th): [Developer Experience] Implement CI hql Linting - https://phabricator.wikimedia.org/T360967#9976776 (lbowmaker)
[14:40:15] Data-Engineering (Q1 2024 July 1st - September 30th): Implement automatic sync of refinery HQL files to HDFS - https://phabricator.wikimedia.org/T365659#9976778 (lbowmaker)
[14:40:53] Data-Engineering (Q1 2024 July 1st - September 30th): Migrate refinery HQL files to CI/CD supported GitLab repository - https://phabricator.wikimedia.org/T362832#9976780 (lbowmaker)
[14:41:24] Data-Engineering (Q1 2024 July 1st - September 30th), Data Pipelines: Improve Airflow DAG testing process - https://phabricator.wikimedia.org/T368944#9976782 (lbowmaker)
[14:42:05] Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform, Patch-For-Review: Migrate Data Engineering NodeJS library repos to GitLab - https://phabricator.wikimedia.org/T366611#9976788 (lbowmaker)
[14:43:56] Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform: Migrate Event Platform Schema Respositories to Gitlab - https://phabricator.wikimedia.org/T366836#9976793 (lbowmaker)
[14:44:31] Data-Engineering (Q1 2024 July 1st - September 30th), Data-Platform-SRE, Discovery-Search, Java-Scala-Standardization: Migrate existing Java packages to deploying to Gitlab, including new version of parent pom, validation that all dependencies are a... - https://phabricator.wikimedia.org/T367405#9976795
[14:44:53] Data-Engineering (Q1 2024 July 1st - September 30th), Data Products, Movement-Insights, Movement-Metrics: Data Quality Issue: Wikitext History Job fail / rerun in Airflow - https://phabricator.wikimedia.org/T342911#9976797 (lbowmaker)
[14:46:00] Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform, MW-1.43-notes (1.43.0-wmf.14; 2024-07-16): [Event Platform] Instrument EventBus with prometheus MW Statslib - https://phabricator.wikimedia.org/T363587#9976799 (lbowmaker)
[14:46:34] Analytics-Data-Problem, Data-Engineering, Data-Engineering-Dashiki, Data Products (Data Products Sprint 16), and 2 others: Investigate surprising "10% Other" portion of Analytics Browsers report - https://phabricator.wikimedia.org/T342267#9976806 (Milimetric) Ok, sent updated code, it's fast now...
[14:47:18] (PS3) Milimetric: Implement new way to aggregate browser statistics [analytics/refinery] - https://gerrit.wikimedia.org/r/1049281 (https://phabricator.wikimedia.org/T342267)
[14:48:26] Data-Engineering (Q1 2024 July 1st - September 30th): Develop Airflow ExternalTaskSensor to orchestrate DAG dependencies - https://phabricator.wikimedia.org/T369900#9976807 (lbowmaker)
[14:48:26] Data-Engineering (Q4 2024 April 1st - June 30th), Dumps 2.0 (Kanban Board), Spike: [Status Store] [SPIKE] Investigate and document approach for Iceberg Sensors - https://phabricator.wikimedia.org/T360922#9976808 (lbowmaker)
[14:48:49] Data-Engineering, Discovery-Search, Java-Scala-Standardization, Data-Platform-SRE (2024.07.08 - 2024.07.28): Update parent pom to disable fetching dependencies from Archiva and use Gitlab instead - https://phabricator.wikimedia.org/T367404#9976810 (Gehel)
[14:49:27] Data-Engineering, Discovery-Search, Java-Scala-Standardization, Data-Platform-SRE (2024.07.08 - 2024.07.28): Update parent pom to disable fetching dependencies from Archiva and use Gitlab instead - https://phabricator.wikimedia.org/T367404#9976812 (Gehel) a:Gehel
[14:50:03] Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board): MediaWiki Reconciliation API - https://phabricator.wikimedia.org/T368782#9976813 (lbowmaker)
[14:50:08] Data-Engineering (Q1 2024 July 1st - September 30th): [SPIKE] Define process to build out lineage in DataHub - https://phabricator.wikimedia.org/T369758#9976818 (lbowmaker)
[14:50:17] Data-Engineering (Q1 2024 July 1st - September 30th): [SPIKE] Define process to build out lineage in DataHub - https://phabricator.wikimedia.org/T369758#9976819 (lbowmaker)
[14:50:47] Data-Engineering (Q1 2024 July 1st - September 30th), Spike: [SPIKE] Evaluate and document solutions for table-management tooling - https://phabricator.wikimedia.org/T360969#9976820 (lbowmaker)
[14:58:15] Data-Engineering, Observability-Metrics, Patch-For-Review, Sustainability (Incident Followup): Site Issue: Delayed data in the `webrequest_sampled_live` kafka topic - https://phabricator.wikimedia.org/T369737#9976830 (fgiunchedi) Today I've done extensive tests and tweaking of benthos@webrequest_...
[14:59:10] Analytics-Data-Problem, Data-Engineering, Data-Platform, Movement-Insights: NEW BUG REPORT Mediawiki_history contains duplicate rows for some revisions - https://phabricator.wikimedia.org/T369851#9976831 (lbowmaker) a:Snwachukwu
[14:59:32] Analytics-Data-Problem, Data-Engineering, Data-Platform, Movement-Insights: NEW BUG REPORT Mediawiki_history contains duplicate rows for some revisions - https://phabricator.wikimedia.org/T369851#9976834 (lbowmaker)
[15:01:53] Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform: Event Platform schemas should not support type changes to structs as array element or map value types - https://phabricator.wikimedia.org/T366487#9976848 (lbowmaker)
[15:01:57] Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform: [Event Platform] - Add schema CI test that array ensures properties with object types also enumerate object properties - https://phabricator.wikimedia.org/T366562#9976850 (lbowmaker)
[15:03:27] Data-Engineering (Q1 2024 July 1st - September 30th), Data-Platform: 6 new wikis missing from mediawiki_history - https://phabricator.wikimedia.org/T368788#9976852 (lbowmaker)
[15:05:50] Data-Engineering (Q4 2024 April 1st - June 30th): Update converted reportupdater DAG queries to correct CSV options - https://phabricator.wikimedia.org/T362699#9976867 (lbowmaker) Open→Resolved
[15:05:51] Data-Engineering (Q4 2024 April 1st - June 30th): Clickstream datasets only reference 'other' link type, no 'link' - https://phabricator.wikimedia.org/T366042#9976866 (lbowmaker) Open→Resolved
[15:05:53] Data-Engineering (Q4 2024 April 1st - June 30th): [Maintenance] Resolve long launch times for canary events on Airflow (30mins in total) - https://phabricator.wikimedia.org/T361499#9976868 (lbowmaker) Open→Resolved
[15:05:55] Data-Engineering (Q4 2024 April 1st - June 30th): [Maintenance] Migrate pingback to Airflow - https://phabricator.wikimedia.org/T357372#9976869 (lbowmaker) Open→Resolved
[15:05:57] Data-Engineering (Q4 2024 April 1st - June 30th): [Refine Refactoring] Refactor refinery code for compatibility with Airflow integration - https://phabricator.wikimedia.org/T356363#9976870 (lbowmaker) Open→Resolved
[15:05:58] Data-Engineering (Q4 2024 April 1st - June 30th), Data Products: Modify ClickStreamBuilder pipeline to cope with pagelinks schema changes - https://phabricator.wikimedia.org/T355588#9976873 (lbowmaker) Open→Resolved
[15:06:02] Data-Engineering (Q4 2024 April 1st - June 30th): [Data Quality] Implement basic data quality metrics for MW history - https://phabricator.wikimedia.org/T354692#9976875 (lbowmaker) Open→Resolved
[15:06:05] Data-Engineering (Q4 2024 April 1st - June 30th): We should provide DQ integration with Python - https://phabricator.wikimedia.org/T353940#9976877 (lbowmaker) Open→Resolved
[15:06:09] Data-Engineering (Q4 2024 April 1st - June 30th), Patch-For-Review: [Maintenance] Migrate ReportUpdater browser queries to Airflow - https://phabricator.wikimedia.org/T354552#9976874 (lbowmaker) Open→Resolved
[15:11:12] Data-Engineering (Q1 2024 July 1st - September 30th), Data-Platform: 7 new wikis missing from mediawiki_history - https://phabricator.wikimedia.org/T368788#9976883 (CMyrick-WMF)
[15:15:19] Data-Engineering (Q1 2024 July 1st - September 30th), Data-Platform: 7 new wikis missing from mediawiki_history - https://phabricator.wikimedia.org/T368788#9976903 (CMyrick-WMF) Now that there is a June mediawiki_history snapshot, I've updated this ticket to include the missing wiki that was created in J...