[00:22:00] Analytics-Radar, Data-Engineering-Icebox, Data-Services, cloud-services-team: Implement technical details and process for "datasets_p" on wikireplica hosts - https://phabricator.wikimedia.org/T173511#9594675 (bd808) Open→Declined With no useful activity, including activism for implementat...
[10:07:47] Data-Engineering, Foundational Technology Requests, Data-Platform-SRE (2024.03.04 - 2024.03.24): Enable the Marketing Campaigns Reporting plugin for matomo - https://phabricator.wikimedia.org/T319013#9595225 (BTullis) a:BTullis
[10:32:00] !log restart hive-server2 and hive-metastore service on an-coord1004
[10:32:02] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:37:03] Data-Engineering, Data-Engineering-Jupyter, Data-Platform-SRE, Security: Use custom CDN if possible for Jupyter HTML exported notebooks - https://phabricator.wikimedia.org/T357064#9595480 (BTullis) Moving back to the parent project, as I don't think we will have time to look at it in the next thr...
[12:00:49] !log migrating analytics-hive from an-coord1003 to an-coord1004 with https://gerrit.wikimedia.org/r/c/operations/dns/+/1008414
[12:00:51] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:16:40] Data-Engineering, Data-Engineering-Wikistats, Data Products: arywiki view stats too low for agent = user? - https://phabricator.wikimedia.org/T359004#9595649 (lbowmaker)
[12:22:45] !log restarting hive-server2 and hive-metastore service on an-coord1003
[12:22:46] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:33:40] Data-Engineering (Sprint 9), ChangeProp, observability, service-runner, Event-Platform: Upgrade prom-client in NodeJS service-runner and enable collectDefaultMetrics - https://phabricator.wikimedia.org/T350180#9595693 (gmodena) >>! In T350180#9583483, @Jdforrester-WMF wrote: >> Happy to pair...
[13:05:38] Data-Engineering (Sprint 9): eventstreams: change default num_workers to 0 - https://phabricator.wikimedia.org/T359051 (gmodena)
[13:06:34] Data-Engineering (Sprint 9): eventstreams: change default num_workers to 0 - https://phabricator.wikimedia.org/T359051#9595810 (gmodena)
[13:14:55] hey folks! Nice job with superset-next :)
[13:15:02] so I assume that kerberos on k8s works now?
[13:18:31] Thanks! it does, for a definition of "kerberos on k8s". We generate a keytab for the service that is running in k8s, deploy it onto the deployment server as base64 via the private puppet repo, and create a k8s Secret from it. Then, we render it in a volume, shared between the app container and a sidecar in charge of regenerating a TGT every hour
[13:19:11] what does not work is "kerberos tokens associated with a multitude of human beings submitted short-lived jobs outside of our control"
[13:19:21] *submitting
[13:19:37] Here is our kerberos-kinit sidecar. https://gitlab.wikimedia.org/repos/data-engineering/kerberos-kinit We've used it on the spark-history and now superset.
[13:20:38] It runs `k5start` as a daemon to keep the authentication active. https://linux.die.net/man/1/k5start
[13:26:06] super
[13:31:06] Data-Engineering, Dumps-Generation, SRE, Data-Platform-SRE (2024.03.04 - 2024.03.24), Patch-For-Review: Migrate Dumps Snapshot hosts from Buster to Bullseye - https://phabricator.wikimedia.org/T325228#9595872 (BTullis) a:BTullis
[13:31:16] Data-Engineering, Dumps-Generation, SRE, Data-Platform-SRE (2024.03.04 - 2024.03.24), Patch-For-Review: Migrate Dumps Snapshot hosts from Buster to Bullseye - https://phabricator.wikimedia.org/T325228#9595870 (BTullis) Moving this into our current milestone, as we are currently working on tes...
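Note on the keytab flow described at 13:18:31 to 13:20:38: a minimal Python sketch of the two pieces mentioned there, a Kubernetes Secret carrying a base64-encoded keytab and a k5start command line for the sidecar. This is an illustration under stated assumptions, not the content of the kerberos-kinit repository. The Secret name, namespace, data key, and file paths are hypothetical, and the k5start flags (-f keytab, -U to read the principal from the keytab, -k ticket cache, -K wake-up interval in minutes) are taken from the k5start man page rather than from the actual deployment.

```python
import base64
from typing import Dict, List


def keytab_secret_manifest(name: str, namespace: str, keytab: bytes,
                           key: str = "service.keytab") -> Dict:
    """Build a Kubernetes Secret manifest holding a keytab.

    Secret `data` values must be base64-encoded strings. All names used
    here are hypothetical placeholders.
    """
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {key: base64.b64encode(keytab).decode("ascii")},
    }


def k5start_args(keytab_path: str, ccache_path: str,
                 interval_minutes: int = 60) -> List[str]:
    """Build an argv for k5start that keeps a ticket cache fresh.

    Assumed flags, per the k5start man page:
      -f  keytab to authenticate from
      -U  take the principal from the keytab
      -k  ticket cache to write
      -K  wake up every N minutes and renew the ticket if needed
    """
    return ["k5start", "-f", keytab_path, "-U",
            "-k", ccache_path, "-K", str(interval_minutes)]
```

In this sketch the Secret would be mounted as a volume in both containers, and the sidecar would run the argv from k5start_args with the ticket cache on a path that the app container can also read.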
[13:31:30] Data-Engineering, Dumps-Generation, SRE, Data-Platform-SRE (2024.03.04 - 2024.03.24), Patch-For-Review: Migrate Dumps Snapshot hosts from Buster to Bullseye - https://phabricator.wikimedia.org/T325228#9595880 (BTullis)
[13:36:17] Data-Engineering, Data-Platform-SRE (2024.03.04 - 2024.03.24), Patch-For-Review: Update the From: addresses of all email from DPE pipelines so that they use routable addresses - https://phabricator.wikimedia.org/T358675#9595895 (BTullis) After some discussion on https://gerrit.wikimedia.org/r/1007576...
[13:50:44] (PS10) Joal: Add DataPivoter job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/995271 (https://phabricator.wikimedia.org/T354552) (owner: Snwachukwu)
[14:03:40] Data-Engineering (Sprint 9): [Data Quality] Update data_quality schemas to be compatible with Iceberg tables - https://phabricator.wikimedia.org/T356866#9596009 (gmodena) a:gmodena
[14:28:32] Data-Engineering (Sprint 9), ChangeProp, observability, service-runner, Event-Platform: Upgrade prom-client in NodeJS service-runner and enable collectDefaultMetrics - https://phabricator.wikimedia.org/T350180#9596108 (gmodena) > I spent some time learning this code base, touching base to val...
[14:38:58] (CR) Joal: Extract RefineSingleApp code from Refine (10 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1003745 (https://phabricator.wikimedia.org/T356363) (owner: Joal)
[14:39:48] (PS17) Joal: Extract RefineSingleApp code from Refine [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1003745 (https://phabricator.wikimedia.org/T356363)
[14:41:00] Data-Engineering, Data-Platform-SRE (2024.03.04 - 2024.03.24): Alerts Review: determine if we can use Prometheus to alert based on historical datasets - https://phabricator.wikimedia.org/T357537#9596152 (bking)
[14:51:40] Data-Engineering, Data-Platform-SRE (2024.03.04 - 2024.03.24): Alerts Review: determine if we can use Prometheus to alert based on historical datasets - https://phabricator.wikimedia.org/T357537#9596222 (bking) Based on [[ https://docs.google.com/document/d/1x2YioD__Ry_O6bbJOXOvHUcQxA_lO7gHLJYX6brp3Ko/ed...
[14:52:04] Data-Engineering, Data-Platform-SRE (2024.03.04 - 2024.03.24): Alerts Review: determine if we can use Prometheus to alert based on historical datasets - https://phabricator.wikimedia.org/T357537#9596225 (bking) In progress→Resolved
[15:03:16] (PS11) Joal: Add DataPivoter job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/995271 (https://phabricator.wikimedia.org/T354552) (owner: Snwachukwu)
[17:47:59] Analytics-Radar, Data-Engineering, Growth-Team, Growth-Team-Filtering, Event-Platform: Edits to Flow pages result in a page-links-change event with no performer - https://phabricator.wikimedia.org/T216726#9597498 (HouseBlaster) I don't see why #the-wikipedia-library is tagged here, so I have...
[19:09:03] (GobblinKafkaRecordsExtractedNotEqualRecordsExpected) firing: Gobblin job event_default ingested an unexpected number of records for a Kafka topic partition. ...
[19:09:03] - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Gobblin - https://grafana.wikimedia.org/d/pAQaJwEnk/gobblin?orgId=1&var-gobblin_job_name=event_default&var-kafka_topic=eqiad.mediawiki.cirrussearch.page_rerender.v1&viewPanel=4 - https://alerts.wikimedia.org/?q=alertname%3DGobblinKafkaRecordsExtractedNotEqualRecordsExpected
[20:05:36] Data-Engineering (Sprint 9), ChangeProp, observability, service-runner, Event-Platform: Upgrade prom-client in NodeJS service-runner and enable collectDefaultMetrics - https://phabricator.wikimedia.org/T350180#9598036 (gmodena) @Ottomata @Jdforrester-WMF there's a caveat wrt using `collectDe...
[20:09:03] (GobblinKafkaRecordsExtractedNotEqualRecordsExpected) resolved: Gobblin job event_default ingested an unexpected number of records for a Kafka topic partition. ...
[20:09:03] - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Gobblin - https://grafana.wikimedia.org/d/pAQaJwEnk/gobblin?orgId=1&var-gobblin_job_name=event_default&var-kafka_topic=eqiad.mediawiki.cirrussearch.page_rerender.v1&viewPanel=4 - https://alerts.wikimedia.org/?q=alertname%3DGobblinKafkaRecordsExtractedNotEqualRecordsExpected
[20:43:02] (PS1) Santiago Faci: Adding a new contextual attribute: performer.activity_token [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1008541 (https://phabricator.wikimedia.org/T358758)
[20:44:09] (PS2) Santiago Faci: Adding a new contextual attribute to the web/base schema: performer.activity_token [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1008541 (https://phabricator.wikimedia.org/T358758)