[02:37:59] Analytics-Data-Problem, Data-Engineering, Data-Engineering-Dashiki, Data Products (Data Products Sprint 17), MediaWiki-Platform-Team (Radar): Investigate surprising "10% Other" portion of Analytics Browsers report - https://phabricator.wikimedia.org/T342267#10065754 (Milimetric) >>! In T34226...
[03:54:56] FIRING: MediawikiPageContentChangeEnrichAvailability: ...
[03:54:56] Low percentage of enriched events produced by mw_page_content_change_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichAvailability
[04:51:01] Data-Engineering, Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#10065817 (Marostegui)
[05:09:27] Data-Engineering, Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#10065856 (Marostegui)
[05:57:58] PROBLEM - Check if active EventStreams endpoint is delivering messages. on alert1001 is CRITICAL: CRITICAL: No EventStreams message was consumed from https://stream.wikimedia.org/v2/stream/recentchange within 10 seconds. https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams/Administration
[06:57:58] RECOVERY - Check if active EventStreams endpoint is delivering messages. on alert1001 is OK: OK: An EventStreams message was consumed from https://stream.wikimedia.org/v2/stream/recentchange within 10 seconds. https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams/Administration
[07:54:56] FIRING: MediawikiPageContentChangeEnrichAvailability: ...
[07:54:56] Low percentage of enriched events produced by mw_page_content_change_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichAvailability
[08:16:43] Data-Engineering, collaboration-services, Data Pipelines, Data-Platform-SRE (2024.07.29 - 2024.08.16), and 2 others: Upgrade Airflow to 2.9.3 - https://phabricator.wikimedia.org/T365449#10066074 (Stevemunene) Open→Resolved All hosts are currently on v2.9.3 now so I am marking this as...
[08:22:32] Data-Engineering, Data-Platform-SRE (2024.07.29 - 2024.08.16): Requesting Kerberos access for ifrahkhanyaree - https://phabricator.wikimedia.org/T371894#10066083 (Ifrahkhanyaree_WMDE) That got me a step further! New error after I add my passphrase is ` identity_sign: private key /home/ifrah.khanyaree/...
[09:04:05] (PS5) Jon Harald Søby: Remove text-transform:capitalize; and clean up capital letter use [analytics/wikistats2] - https://gerrit.wikimedia.org/r/928802
[09:25:08] Data-Engineering, Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#10066196 (Marostegui)
[09:26:15] Data-Engineering, Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#10066200 (Marostegui) s8 revision table is even bigger than enwiki so this will take even longer. First I am going to be altering codfw so nothing to worry...
[11:54:56] FIRING: MediawikiPageContentChangeEnrichAvailability: ...
[11:54:56] Low percentage of enriched events produced by mw_page_content_change_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichAvailability
[12:33:18] Data-Engineering (Q1 2024 July 1st - September 30th), Data Pipelines, Data-Catalog: Spike: Integrate Spark with DataHub - https://phabricator.wikimedia.org/T306896#10066708 (tchin) I ran a simple spark sql job that `lang=bash sudo -u analytics-privatedata spark3-sql --jars ./acryl-spark-lineage-0.2.1...
[12:48:49] Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform, MW-1.43-notes (1.43.0-wmf.18; 2024-08-13), Patch-For-Review: [Event Platform] Instrument EventBus with prometheus MW Statslib - https://phabricator.wikimedia.org/T363587#10066735 (gmodena) > I'd al...
[13:01:30] Data-Engineering (Q1 2024 July 1st - September 30th), Data Pipelines, Data-Catalog: Spike: Integrate Spark with DataHub - https://phabricator.wikimedia.org/T306896#10066758 (tchin) I can see that [[ https://github.com/apache/iceberg/blob/1.3.x/spark/v3.1/spark/src/main/java/org/apache/iceberg/spark/S...
[13:13:09] Data-Engineering (Q1 2024 July 1st - September 30th), Data Pipelines, Data-Catalog: Spike: Integrate Spark with DataHub - https://phabricator.wikimedia.org/T306896#10066771 (BTullis) > `sudo -u analytics-privatedata spark3-sql --jars ./acryl-spark-lineage-0.2.16.jar...` I might be confused, but are...
[13:20:12] Data-Engineering (Q1 2024 July 1st - September 30th), Data Pipelines, Data-Catalog: Spike: Integrate Spark with DataHub - https://phabricator.wikimedia.org/T306896#10066804 (tchin) I'm using the newer `acryl-spark-lineage` which works for datahub 0.13.3 https://datahubproject.io/docs/metadata-integra...
[13:26:07] Data-Engineering (Q1 2024 July 1st - September 30th), Data Pipelines, Data-Catalog: Spike: Integrate Spark with DataHub - https://phabricator.wikimedia.org/T306896#10066810 (BTullis) Oh yes, I see. Sorry, I got that the wrong way around :-)
[14:14:23] Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board): MediaWiki Reconciliation API - https://phabricator.wikimedia.org/T368782#10066973 (xcollazo) >>! In T368782#10064666, @Ottomata wrote: > @gmodena has a hacky PoC here: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Eve...
[15:50:32] Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform, MW-1.43-notes (1.43.0-wmf.18; 2024-08-13), Patch-For-Review: [Event Platform] Instrument EventBus with prometheus MW Statslib - https://phabricator.wikimedia.org/T363587#10067286 (Ottomata) Yes, le...
[15:51:10] Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform, MW-1.43-notes (1.43.0-wmf.18; 2024-08-13), Patch-For-Review: [Event Platform] Instrument EventBus with prometheus MW Statslib - https://phabricator.wikimedia.org/T363587#10067287 (Ottomata)
[15:53:47] !log reran druid_load_geoeditors_monthly, cassandra_load_editors_by_country_monthly, and druid_load_edit_hourly airflow dags with run_id scheduled__2024-06-01T00:00:00+00:00 as part of downstream tasks after rerunning mediawiki_history_denormalize for 2024-06 snapshot.
[15:53:49] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:56:58] Analytics-Data-Problem, Data-Engineering (Q1 2024 July 1st - September 30th), Data-Platform, Movement-Insights: NEW BUG REPORT Mediawiki_history contains duplicate rows for some revisions - https://phabricator.wikimedia.org/T369851#10067298 (Snwachukwu) I reran the following downstream airflow da...
[15:58:56] Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board): MediaWiki Reconciliation API - https://phabricator.wikimedia.org/T368782#10067307 (Ottomata) >> we use MW job queue. > EventBus could do that regardless of API design, yes? True. I ask about the stream enrichment or job qu...
[16:38:57] Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform: Migrate Event Platform Schema Respositories to Gitlab - https://phabricator.wikimedia.org/T366836#10067435 (Snwachukwu) a:Snwachukwu
[16:43:58] (PS3) Gmodena: webrequest: add error schema. [schemas/event/primary] - https://gerrit.wikimedia.org/r/1036272 (https://phabricator.wikimedia.org/T314956)
[16:53:52] Data-Engineering: Handle Late-Arrived Events from Gobblin into Airflow triggered Refine - https://phabricator.wikimedia.org/T370665#10067481 (Antoine_Quhen) I've performed a short study of our late events, which I detect according to the timestamp of the file created by Gobblin. As the current Refine is che...
[18:00:13] Data-Engineering (Q1 2024 July 1st - September 30th): [Developer Experience] Implement CI hql Linting - https://phabricator.wikimedia.org/T360967#10067588 (Ahoelzl) a:amastilovic
[19:00:46] Data-Engineering: Handle Late-Arrived Events from Gobblin into Airflow triggered Refine - https://phabricator.wikimedia.org/T370665#10067707 (Ottomata) @mforns IIUC, your idea is doable without ExternalTaskSensor, yes? For Refine, the sensor is on the _IMPORTED flag written by gobblin.
[19:02:42] Data-Engineering: Handle Late-Arrived Events from Gobblin into Airflow triggered Refine - https://phabricator.wikimedia.org/T370665#10067721 (Ottomata) @aqu, nice! Okay, seeing as the max is 7.6 hours, and that is I think too long to wait, we should solve this in some way, probably like @mforns suggested. L...