[05:36:29] Data-Engineering, Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#10063377 (Marostegui) clouddb1017 is back in sync with no lag.
[07:27:59] Data-Engineering, collaboration-services, Data Pipelines, Data-Platform-SRE (2024.07.29 - 2024.08.16), Release-Engineering-Team (Radar): Upgrade Airflow to 2.9.3 - https://phabricator.wikimedia.org/T365449#10063424 (Stevemunene) Seems we missed the `confluent_kafka` deletion while generating...
[08:11:08] Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform: Rollback haproxy feed automated ingestion - https://phabricator.wikimedia.org/T372456 (gmodena) NEW
[08:16:22] Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform, Patch-For-Review: Rollback haproxy feed automated ingestion - https://phabricator.wikimedia.org/T372456#10063491 (gmodena)
[08:40:21] Data-Engineering, Data-Platform-SRE, Discovery-Search, IPv6: Some Search clusters have inconsistent AAAA DNS records for the primary IPv6 of the hosts - https://phabricator.wikimedia.org/T312555#10063567 (Gehel) p:High→Medium
[08:40:22] Data-Engineering, Data-Platform-SRE, SRE: Streamline Data Platform access approvals for WMF staff - https://phabricator.wikimedia.org/T370424#10063565 (Gehel) p:Triage→Medium
[08:47:52] (PS1) Gmodena: gobblin: remove webrequest_frontend_rc0.pull [analytics/refinery] - https://gerrit.wikimedia.org/r/1062677 (https://phabricator.wikimedia.org/T372456)
[08:59:24] Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform, Patch-For-Review: Rollback haproxy feed automated ingestion - https://phabricator.wikimedia.org/T372456#10063602 (gmodena)
[09:57:23] Data-Engineering, Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#10063700 (fnegri) I repooled clouddb1017.
[11:22:01] Data-Engineering, Data-Platform-SRE (2024.07.29 - 2024.08.16): Requesting Kerberos access for ifrahkhanyaree - https://phabricator.wikimedia.org/T371894#10063814 (BTullis) Hi @Ifrahkhanyaree_WMDE and apologies for the delay in responding. I think that the most important line from the log above is pr...
[11:22:37] Data-Engineering, Data-Platform-SRE (2024.07.29 - 2024.08.16): Requesting Kerberos access for ifrahkhanyaree - https://phabricator.wikimedia.org/T371894#10063815 (BTullis) Resolved→Open
[11:49:46] Data-Engineering, Data-Platform-SRE (2024.07.29 - 2024.08.16): request for new matomo site: trace.wikimedia.org/ - https://phabricator.wikimedia.org/T371124#10063881 (BTullis) I have added the new site to Matomo: {F57273160,width=60%} So you can obtain the Javascript tracking code now from https://piwi...
[11:54:41] FIRING: MediawikiPageContentChangeEnrichAvailability: ...
[11:54:47] Low percentage of enriched events produced by mw_page_content_change_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichAvailability
[12:05:12] We shall be deploying a new airflow version to fix broken DAGs with the "No module named 'confluent_kafka' error" and then restarting the airflow scheduler across all airflow instances in 5.
[12:13:31] !log deploy new airflow version 2.9.3-py3.10-20240814 across all instances
[12:13:33] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:17:48] Data-Engineering, Data-Platform-SRE, observability, serviceops, and 3 others: Upgrade Kafka to from 1.x to later version - https://phabricator.wikimedia.org/T300102#10063948 (elukey) @brouberol after T355550 do we have any plans to start testing the upgrade on kafka-test or similar? I can help if...
[12:19:44] !log restart airflow services across all instances to pick up new version T365449
[12:19:52] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:19:52] T365449: Upgrade Airflow to 2.9.3 - https://phabricator.wikimedia.org/T365449
[13:33:20] (PS2) Ottomata: gobblin: remove webrequest_frontend_rc0.pull [analytics/refinery] - https://gerrit.wikimedia.org/r/1062677 (https://phabricator.wikimedia.org/T372456) (owner: Gmodena)
[13:34:40] Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform, Patch-For-Review: Rollback haproxy feed automated ingestion - https://phabricator.wikimedia.org/T372456#10064121 (Ottomata) @gmodena I just pushed https://gerrit.wikimedia.org/r/c/operations/puppet/+/1062707, it prob has to go befo...
[14:06:15] Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform, Patch-For-Review: Rollback haproxy feed automated ingestion - https://phabricator.wikimedia.org/T372456#10064173 (gmodena) >>! In T372456#10064121, @Ottomata wrote: > @gmodena I just pushed https://gerrit.wikimedia.org/r/c/operatio...
[14:13:23] Data-Engineering, Data-Platform-SRE (2024.07.29 - 2024.08.16): request for new matomo site: trace.wikimedia.org/ - https://phabricator.wikimedia.org/T371124#10064192 (BTullis) Actually, it looks like the https://plugins.matomo.org/LoginLdap plugin could work better than the OIDC authentication. {F5727354...
[14:14:19] (PS1) Gehel: ci: migrate to new parent pom [analytics/gobblin-wmf] - https://gerrit.wikimedia.org/r/1062711 (https://phabricator.wikimedia.org/T360219)
[14:17:20] (CR) Gehel: [C:-1] "There is still an issue with the newer Maven version blocking some non-HTTPS repositories. We should track them down, or find a way to ign" [analytics/gobblin-wmf] - https://gerrit.wikimedia.org/r/1062711 (https://phabricator.wikimedia.org/T360219) (owner: Gehel)
[14:18:52] (PS1) Gehel: build: introduce .sdkmanrc to document JDK version to use for this project [analytics/gobblin-wmf] - https://gerrit.wikimedia.org/r/1062712 (https://phabricator.wikimedia.org/T346611)
[14:54:46] (PS1) Mforns: Add Special:AllEvents to the PageviewDefinition [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1062719 (https://phabricator.wikimedia.org/T368303)
[15:33:01] Starting build #12 for job analytics-refinery-maven-release
[15:54:56] FIRING: MediawikiPageContentChangeEnrichAvailability: ...
[15:54:56] Low percentage of enriched events produced by mw_page_content_change_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichAvailability
[15:56:51] Project analytics-refinery-maven-release build #12: FAILURE in 23 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release/12/
[16:02:51] Starting build #13 for job analytics-refinery-maven-release
[16:19:43] Yippee, build fixed!
[16:19:43] Project analytics-refinery-maven-release build #13: FIXED in 16 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release/13/
[16:30:03] Starting build #12 for job analytics-refinery-update-jars
[16:32:01] (PS1) Maven-release-user: Add refinery-source jars for v0.2.47 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1062738
[16:32:01] Project analytics-refinery-update-jars build #12: SUCCESS in 1 min 57 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars/12/
[16:40:23] (CR) Ottomata: [C:+2] Add refinery-source jars for v0.2.47 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1062738 (owner: Maven-release-user)
[16:40:32] (CR) Ottomata: [V:+2 C:+2] Add refinery-source jars for v0.2.47 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1062738 (owner: Maven-release-user)
[16:42:35] !log deploying refinery for weekly train
[16:42:38] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:48:00] Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board): MediaWiki Reconciliation API - https://phabricator.wikimedia.org/T368782#10064666 (Ottomata) @gmodena has a hacky PoC here: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventBus/+/1053043 But ya let's discuss. >...
[16:48:36] !log reran editors_daily_monthly airflow dag with run_id scheduled__2024-06-01T00:00:00+00:00 as part of downstream tasks after rerunning mediawiki_history_denormalize dag
[16:48:47] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:52:51] !log reran edit_hourly airflow dag with run_id scheduled__2024-06-01T00:00:00+00:00 as part of down stream tasks after rerunning mediawiki_history_denormalize for 2024-06 snapshot.
[16:53:00] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:55:19] !log reran unique_editors_by_country_monthly airflow dag with run_id scheduled__2024-06-01T00:00:00+00:00 as part of down stream tasks after rerunning mediawiki_history_denormalize for 2024-06 snapshot.
[16:55:25] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:09:37] !log reran geoeditors_edits_monthly airflow dag with run_id scheduled__2024-06-01T00:00:00+00:00 as part of down stream tasks after rerunning mediawiki_history_denormalize for 2024-06 snapshot.
[17:09:39] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:12:17] !log reran geoeditors_monthly airflow dag with run_id scheduled__2024-06-01T00:00:00+00:00 as part of down stream tasks after rerunning mediawiki_history_denormalize for 2024-06 snapshot.
[17:12:23] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:16:50] !log reran geoeditors_public_monthly airflow dag with run_id scheduled__2024-06-01T00:00:00+00:00 as part of down stream tasks after rerunning mediawiki_history_denormalize for 2024-06 snapshot.
[17:16:52] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:18:10] !log scap deploy airflow analytics_product for vandalism_pageviews_dag - T362612
[17:18:13] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:18:13] T362612: Create ETL pipelines for Automoderator baseline metrics - https://phabricator.wikimedia.org/T362612
[17:35:16] Data-Engineering, Data-Platform-SRE (2024.07.29 - 2024.08.16): Design a suitable DAG deployment method - https://phabricator.wikimedia.org/T368033#10064944 (BTullis) >>! In T368033#10013702, @hashar wrote: > The scary part is any merge to the repo end up straight in production which sounds super scary, i...
[17:49:10] Data-Engineering, Data-Platform-SRE (2024.07.29 - 2024.08.16): Requesting Kerberos access for ifrahkhanyaree - https://phabricator.wikimedia.org/T371894#10064967 (BTullis) It could be caused by incorrect permissions on your SSH configuration directory and key files. You might like to try these commands t...
[18:25:24] (CR) Milimetric: [C:+2] Add Special:AllEvents to the PageviewDefinition [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1062719 (https://phabricator.wikimedia.org/T368303) (owner: Mforns)
[18:35:26] (Merged) jenkins-bot: Add Special:AllEvents to the PageviewDefinition [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1062719 (https://phabricator.wikimedia.org/T368303) (owner: Mforns)
[19:54:56] FIRING: MediawikiPageContentChangeEnrichAvailability: ...
[19:54:56] Low percentage of enriched events produced by mw_page_content_change_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichAvailability
[21:07:36] Data-Engineering (Q1 2024 July 1st - September 30th): Airflow mapped tasks UI & metrics - https://phabricator.wikimedia.org/T357430#10065483 (Ottomata) Airflow 2.9 is deployed. Xabriel is naming mapped tasks here: https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/804/diffs Ca...
[23:54:56] FIRING: MediawikiPageContentChangeEnrichAvailability: ...
[23:54:56] Low percentage of enriched events produced by mw_page_content_change_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichAvailability