[03:03:38] Data-Engineering, Pageviews-API, RESTBase-API, Wikifeeds, Chinese-Sites: There are anomalies in some of the mostread data on zhwiki for March 2024 - https://phabricator.wikimedia.org/T360499 (Shizhao) NEW
[06:04:04] Data-Engineering, Data-Persistence, Data-Platform-SRE: analytics-mysql centralauth returns a MySQL error - https://phabricator.wikimedia.org/T360482#9644764 (Marostegui) The host is up and responding normally. There are no traces of crashes on the logs, so it might be related to the client connection.
[06:06:01] Data-Engineering, Data-Persistence, Data-Platform-SRE: analytics-mysql centralauth returns a MySQL error - https://phabricator.wikimedia.org/T360482#9644765 (Marostegui) It all seems fine now? ` root@stat1005:~# telnet dbstore1008 3317 Trying 10.64.131.23... Connected to dbstore1008.eqiad.wmnet. Esc...
[08:09:53] (HiveServerHeapUsage) firing: Hive Server JVM Heap usage is above 80% on an-coord1003:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-coord1003:10100 - https://alerts.wikimedia.org/?q=alertname%3DHiveServerHeapUsage
[09:08:22] Data-Engineering, Data-Persistence, Data-Platform-SRE (2024.03.04 - 2024.03.24): analytics-mysql centralauth returns a MySQL error - https://phabricator.wikimedia.org/T360482#9644961 (Gehel)
[09:26:30] Data-Engineering, Data-Persistence, Data-Platform-SRE (2024.03.04 - 2024.03.24): analytics-mysql centralauth returns a MySQL error - https://phabricator.wikimedia.org/T360482#9644992 (Urbanecm_WMF) @Marostegui Now it indeed does seem fine, but it was not working for two consecutive days (yesterday an...
[09:33:11] Data-Engineering, Data-Persistence, DBA, Data-Platform-SRE (2024.03.04 - 2024.03.24): analytics-mysql centralauth returns a MySQL error - https://phabricator.wikimedia.org/T360482#9645010 (Marostegui) Open→Resolved a:Marostegui On a mariadb level, the host hasn't had any cr...
[10:01:31] joal: Hey! Would you have 5' for a chat about maven parent pom?
[10:04:47] I'll start the discussion on slack, that might be enough to unblock. See #wmf-java (slack)
[10:20:43] !log migrating superset to Kubernetes. Some CAS errors are expected during ~15 minutes
[10:20:46] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:50:01] !log superset.wikimedia.org is now migrated to the DSE k8s cluster, CAS errors have receded
[10:50:03] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:52:07] (PS1) Gmodena: WIP - Add gobblin job webrequest_frontend to pull new webrequest stream [analytics/refinery] - https://gerrit.wikimedia.org/r/983926 (https://phabricator.wikimedia.org/T314956) (owner: Ottomata)
[10:57:08] (PS2) Gmodena: Add gobblin job webrequest_frontend [analytics/refinery] - https://gerrit.wikimedia.org/r/983926 (https://phabricator.wikimedia.org/T314956) (owner: Ottomata)
[11:02:07] (CR) Gmodena: Add gobblin job webrequest_frontend (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/983926 (https://phabricator.wikimedia.org/T314956) (owner: Ottomata)
[11:32:13] Analytics-Radar, Data-Engineering, Data Products, Metrics Platform Backlog: mw.user.generateRandomSessionId should return a UUID - https://phabricator.wikimedia.org/T266813#9645366 (phuedx) @VirginiaPoundstone: Yes. It's the `performer.session_id` contextual attribute (which is soon to be renamed...
[11:49:53] (HiveServerHeapUsage) resolved: Hive Server JVM Heap usage is above 80% on an-coord1003:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-coord1003:10100 - https://alerts.wikimedia.org/?q=alertname%3DHiveServerHeapUsage
[12:17:41] Data-Engineering, Data Products, Pageviews-API, RESTBase-API, and 2 others: There are anomalies in some of the mostread data on zhwiki for March 2024 - https://phabricator.wikimedia.org/T360499#9645466 (lbowmaker)
[12:21:37] Data-Engineering, Data-Platform-SRE, Epic: Migrate the Analytics Superset instances to our DSE Kubernetes cluster - https://phabricator.wikimedia.org/T347710#9645476 (brouberol)
[12:21:57] Data-Engineering, Data-Platform-SRE (2024.03.04 - 2024.03.24): Migrate bare-metal superset services over to Kubernetes - https://phabricator.wikimedia.org/T358569#9645474 (brouberol) Open→Resolved
[13:03:09] (PS3) Gmodena: Add gobblin job webrequest_frontend [analytics/refinery] - https://gerrit.wikimedia.org/r/983926 (https://phabricator.wikimedia.org/T314956) (owner: Ottomata)
[13:09:49] (CR) Gmodena: Add gobblin job webrequest_frontend (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/983926 (https://phabricator.wikimedia.org/T314956) (owner: Ottomata)
[13:22:36] Data-Engineering, Data-Platform-SRE, Epic: Migrate the Analytics Superset instances to our DSE Kubernetes cluster - https://phabricator.wikimedia.org/T347710#9645597 (brouberol)
[13:22:39] Data-Engineering, Data-Platform-SRE, Epic, Patch-For-Review: Remove all resources associated with the superset-(next-)k8s.wimedia.org domains - https://phabricator.wikimedia.org/T358480#9645596 (brouberol) Open→Resolved
[13:26:50] Data-Engineering, Data-Platform-SRE, Epic: Migrate the Analytics Superset instances to our DSE Kubernetes cluster - https://phabricator.wikimedia.org/T347710#9645603 (brouberol) https://superset.wikimedia.org is now served by a service running in dse-k8s-eqiad. We only have a couple of cleanup tasks...
[13:27:51] Data-Engineering, Data-Platform-SRE, Epic: Migrate the Analytics Superset instances to our DSE Kubernetes cluster - https://phabricator.wikimedia.org/T347710#9645607 (brouberol)
[13:50:51] (CR) Phuedx: [C:-1] "A couple of minor points inline. Other than those points, this LGTM!" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1008541 (https://phabricator.wikimedia.org/T358758) (owner: Santiago Faci)
[13:58:02] (PS1) Gmodena: Add webrequest_frontent raw schema. [analytics/refinery] - https://gerrit.wikimedia.org/r/1012656 (https://phabricator.wikimedia.org/T314956)
[14:01:34] Data-Engineering: aqs endpoint health alerting about mismatched check - https://phabricator.wikimedia.org/T360522 (MoritzMuehlenhoff) NEW
[14:09:03] (GobblinKafkaRecordsExtractedNotEqualRecordsExpected) firing: Gobblin job event_default ingested an unexpected number of records for a Kafka topic partition. ...
[14:09:04] - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Gobblin - https://grafana.wikimedia.org/d/pAQaJwEnk/gobblin?orgId=1&var-gobblin_job_name=event_default&var-kafka_topic=eqiad.mediawiki.cirrussearch.page_rerender.v1&viewPanel=4 - https://alerts.wikimedia.org/?q=alertname%3DGobblinKafkaRecordsExtractedNotEqualRecordsExpected
[14:21:22] Data-Engineering: aqs endpoint health alerting about mismatched check - https://phabricator.wikimedia.org/T360522#9645849 (Milimetric) I'm working to find the relevant tickets, but AQS 1 should be sunset and I think it's ok to take it offline for now and follow through with the rest of the process. I've jus...
[14:24:00] Data-Engineering: aqs endpoint health alerting about mismatched check - https://phabricator.wikimedia.org/T360522#9645880 (Milimetric) Ah, thanks @will for finding {T358793}, @brouberol and others can go ahead and take AQS 1 offline and follow through with decommissioning. Take note of what Eric said there,...
[15:09:03] (GobblinKafkaRecordsExtractedNotEqualRecordsExpected) resolved: Gobblin job event_default ingested an unexpected number of records for a Kafka topic partition. ...
[15:09:04] - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Gobblin - https://grafana.wikimedia.org/d/pAQaJwEnk/gobblin?orgId=1&var-gobblin_job_name=event_default&var-kafka_topic=eqiad.mediawiki.cirrussearch.page_rerender.v1&viewPanel=4 - https://alerts.wikimedia.org/?q=alertname%3DGobblinKafkaRecordsExtractedNotEqualRecordsExpected
[15:14:38] Data-Engineering, Data-Platform-SRE (2024.03.04 - 2024.03.24), Patch-For-Review: aqs endpoint health alerting about mismatched check - https://phabricator.wikimedia.org/T360522#9646019 (brouberol)
[15:22:18] Data-Engineering, Data-Engineering-Wikistats, Data Products, Data-Platform, Movement-Insights: Wikistats "Active Editors by Country" does not follow definition for active editors - https://phabricator.wikimedia.org/T360073#9646043 (VirginiaPoundstone)
[15:26:35] Data-Engineering, Data-Platform-SRE (2024.03.04 - 2024.03.24), Patch-For-Review: aqs endpoint health alerting about mismatched check - https://phabricator.wikimedia.org/T360522#9646066 (brouberol) a:brouberol
[15:31:12] Data-Engineering, Data Products, Data-Platform, Movement-Insights: Wikistats "Active Editors by Country" does not follow definition for active editors - https://phabricator.wikimedia.org/T360073#9646084 (VirginiaPoundstone) @kzimmerman thank you for filing this task. I have a few follow up questi...
[15:34:06] Data-Engineering, Data Products, Data-Platform, Movement-Insights: Wikistats "Active Editors by Country" does not follow definition for active editors - https://phabricator.wikimedia.org/T360073#9646110 (Milimetric) I believe this dataset that's already being published is strictly better and in m...
[15:34:18] Data-Engineering, Data Products: NEW BUG REPORT - Pageviews Missing Hourly Partition - https://phabricator.wikimedia.org/T358142#9646116 (VirginiaPoundstone)
[15:34:23] Data-Engineering, Data Products: NEW BUG REPORT - Pageviews Missing Hourly Partition - https://phabricator.wikimedia.org/T358142#9646112 (VirginiaPoundstone) @cjming please review as part of ops week
[15:40:47] Data-Engineering, Data Products (Data Products Sprint 11): project-title-country missing US data in recent data, and double quote escaping - https://phabricator.wikimedia.org/T341139#9646149 (VirginiaPoundstone) @Htriedman this seems to be related to your differential privacy release.
[15:41:08] Data-Engineering, Data Products: project-title-country missing US data in recent data, and double quote escaping - https://phabricator.wikimedia.org/T341139#9646150 (VirginiaPoundstone)
[15:57:16] Data-Engineering, AQS2.0, Data Products (Data Products Sprint 11): Method for per-file cumulative total in mediarequests API - https://phabricator.wikimedia.org/T343947#9646203 (VirginiaPoundstone) @Dominicbm after grooming, we are adding this as a feature request for AQS2. I am putting this in the b...
[15:57:21] Data-Engineering, AQS2.0: Method for per-file cumulative total in mediarequests API - https://phabricator.wikimedia.org/T343947#9646205 (VirginiaPoundstone)
[16:09:49] Data-Engineering, Data Products: project-title-country missing US data in recent data, and double quote escaping - https://phabricator.wikimedia.org/T341139#9646308 (Htriedman) Hi @Ogiermaitre! Thanks for bringing this to our attention, and sorry it's taken so long to respond to you about this, I didn't...
[16:27:44] Data-Engineering, Data Products, Data-Platform, Movement-Insights: Wikistats "Active Editors by Country" does not follow definition for active editors - https://phabricator.wikimedia.org/T360073#9646395 (Htriedman) @kzimmerman @Milimetric happy to set up a meeting next week to discuss the differe...
[16:32:01] Data-Engineering, Data Products, Data-Platform, Movement-Insights: Wikistats "Active Editors by Country" does not follow definition for active editors - https://phabricator.wikimedia.org/T360073#9646417 (Htriedman) @Milimetric FWIW about the weekly dataset, folks from product analytics told me t...
[17:35:30] Ah - back with write capabilities
[17:35:43] gehel: sorry I was mostly not online today, we'll talk tomorrow
[17:35:59] btullis: I could do with some help on a small gerrit task
[17:36:08] btullis: would you have some time, or possibly tomorrow?
[17:43:06] Data-Engineering, Data-Engineering-Wikistats, Data Products: arywiki view stats too low for agent = user? - https://phabricator.wikimedia.org/T359004#9646725 (VirginiaPoundstone) After some preliminary review by @nshahquinn-wmf and @Mayakp.wiki we would like to observe the trend for another two month...
[20:24:39] Analytics-Radar, Data-Engineering-Icebox, Web-Team-Backlog: % of "none" referers seems too high - https://phabricator.wikimedia.org/T195880#9647424 (Jdlrobson)
[21:29:54] (PS1) Cooltey: Add `app_patroller_experience` to allowlist for Android app [analytics/refinery] - https://gerrit.wikimedia.org/r/1013142 (https://phabricator.wikimedia.org/T358299)
[22:49:42] (SparkHistoryTestServiceUnavailable) firing: ...
[22:49:42] spark-history-analytics-test-hadoop is unavailable on k8s-dse - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Spark_History#The_app_isn't_running - https://grafana.wikimedia.org/d/hyl18XgMk/kubernetes-container-details?orgId=1&var-datasource=eqiad%2Bprometheus/k8s-dse&var-namespace=spark-history-test&var-container=All - https://alerts.wikimedia.org/?q=alertname%3DSparkHistoryTestServiceUnavailable
[22:54:42] (SparkHistoryTestServiceUnavailable) resolved: ...
[22:54:42] spark-history-analytics-test-hadoop is unavailable on k8s-dse - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Spark_History#The_app_isn't_running - https://grafana.wikimedia.org/d/hyl18XgMk/kubernetes-container-details?orgId=1&var-datasource=eqiad%2Bprometheus/k8s-dse&var-namespace=spark-history-test&var-container=All - https://alerts.wikimedia.org/?q=alertname%3DSparkHistoryTestServiceUnavailable
[23:15:19] Quarry: Remove redis - https://phabricator.wikimedia.org/T360584 (rook) NEW
[23:16:11] (PS1) Cooltey: Adding meta: { dt:keep } to align with the app_donor_experience [analytics/refinery] - https://gerrit.wikimedia.org/r/1013149
[23:16:44] (Abandoned) Cooltey: Adding meta: { dt:keep } to align with the app_donor_experience [analytics/refinery] - https://gerrit.wikimedia.org/r/1013149 (owner: Cooltey)
[23:19:13] (PS2) Cooltey: Add `app_patroller_experience` to allowlist for Android app [analytics/refinery] - https://gerrit.wikimedia.org/r/1013142 (https://phabricator.wikimedia.org/T358299)