[02:35:00] (EventgateLoggingExternalLatency) firing: (2) Elevated latency for GET events on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org
[06:35:00] (EventgateLoggingExternalLatency) firing: (2) Elevated latency for GET events on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org
[09:32:51] Analytics-Clusters, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, and 2 others: Setup Analytics team in VO/splunk oncall - https://phabricator.wikimedia.org/T273064 (BTullis) In progress→Resolved
[09:55:21] Data-Engineering, Data-Engineering-Kanban, observability, serviceops: Move kafka-jumbo to a fixed uid/gid - https://phabricator.wikimedia.org/T296990 (BTullis) Open→Stalled Postponing this work until the New Year, based on the feedback from @Jgreen here: T296064#7538028
[09:55:25] Data-Engineering, observability, serviceops, Patch-For-Review: Move kafka clusters to fixed uid/gid - https://phabricator.wikimedia.org/T296982 (BTullis)
[10:12:37] Data-Engineering, observability, serviceops, Patch-For-Review: Move kafka clusters to fixed uid/gid - https://phabricator.wikimedia.org/T296982 (elukey)
[10:24:25] Data-Engineering, DBA, Infrastructure-Foundations, Patch-For-Review, Puppet: Split mariadb::dbstore_multiinstance into 2 separate roles (backup sources and analytics) - https://phabricator.wikimedia.org/T296285 (Kormat) Open→Resolved >>! In T296285#7545417, @jcrespo wrote: > ` > from:...
[10:30:42] Analytics, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban: Repair and reload cassandra2 mediarequest_per_file data table - https://phabricator.wikimedia.org/T291470 (BTullis) Beginning the reload script now: ` ### Moving table data in keyspace local_group_default_T_mediarequest_per_f...
[10:31:31] Analytics, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban: Repair and reload cassandra2 mediarequest_per_file data table - https://phabricator.wikimedia.org/T291470 (BTullis)
[10:35:00] (EventgateLoggingExternalLatency) firing: (2) Elevated latency for GET events on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org
[10:46:35] Data-Engineering, Infrastructure-Foundations, SRE, Traffic-Icebox, and 2 others: Collect netflow data for internal traffic - https://phabricator.wikimedia.org/T263277 (ayounsi) @JAllemandou This is great, thanks! Note that we can tune sampling to adapt. What would be the next steps?
[10:59:27] Data-Engineering, Data-Engineering-Kanban, Patch-For-Review, User-razzi: Increase Superset Timeout - https://phabricator.wikimedia.org/T294771 (BTullis) @Razzi, you're spot on! Many thanks. At first glance I thought that we **didn't** run envoy on an-tool1010, since trafficserver was already doi...
[11:06:10] Data-Engineering, Data-Engineering-Kanban, Patch-For-Review, User-razzi: Increase Superset Timeout - https://phabricator.wikimedia.org/T294771 (elukey) We use envoy in front of all UIs that we run since ATS backends can call our endpoints from any caching pop, so potentially traversing non secure...
[11:45:04] Data-Engineering, Data-Engineering-Kanban, Patch-For-Review, User-razzi: Increase Superset Timeout - https://phabricator.wikimedia.org/T294771 (BTullis) Thanks elukey. It works! And we didn't get hit by the 2 minute keepalive timeout in ATS either. My query ran to the full 180 seconds until the t...
[12:17:43] eqi stat1008
[12:17:46] woops
[12:19:53] :-)
[12:33:32] Hi btullis :)
[12:33:47] Hi joal.
[12:33:50] btullis: I'm doing data vetting for cassandra for pageviews, so far so good
[12:34:08] Oh, really glad to hear it.
[12:35:04] The re-loading of mediaviews is under way now. 4 repaired snapshots.
[12:40:57] \o/
[14:02:25] Analytics: Requesting Kerberos Identity - https://phabricator.wikimedia.org/T297114 (SCherukuwada)
[14:24:28] (PS1) Kosta Harlan: Add WelcomeSurvey Interaction schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/744003 (https://phabricator.wikimedia.org/T267273)
[14:32:52] Data-Engineering, Infrastructure-Foundations, SRE, Traffic-Icebox, and 2 others: Collect netflow data for internal traffic - https://phabricator.wikimedia.org/T263277 (Ottomata) > @BTullis thanks! Real-time, would be a nice plus, but a hard requirement (unlike netflow). Did you mean _not_ a hard...
[14:35:00] (EventgateLoggingExternalLatency) firing: (2) Elevated latency for GET events on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org
[14:38:12] (CR) jerkins-bot: [V: -1] Add WelcomeSurvey Interaction schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/744003 (https://phabricator.wikimedia.org/T267273) (owner: Kosta Harlan)
[15:02:36] Data-Engineering: Try to improve the LDAP integration for Superset user account creation - https://phabricator.wikimedia.org/T297120 (BTullis)
[15:03:00] Data-Engineering, Data-Engineering-Kanban, Epic: Presto/Superset User Experience Improvement - https://phabricator.wikimedia.org/T294259 (BTullis)
[15:03:02] Data-Engineering: Try to improve the LDAP integration for Superset user account creation - https://phabricator.wikimedia.org/T297120 (BTullis)
[15:04:24] Data-Engineering, Data-Engineering-Kanban, Epic: Presto/Superset User Experience Improvement - https://phabricator.wikimedia.org/T294259 (BTullis)
[15:09:21] Data-Engineering, Data-Engineering-Kanban, Phabricator: Herald rule for Data-Engineering - https://phabricator.wikimedia.org/T295397 (Ottomata) https://github.com/Ladsgroup/Phabricator-maintenance-bot/pull/42 It doesn't look like the bot can remove tags, only add them.
[15:09:27] Data-Engineering, Data-Engineering-Kanban, Phabricator: Herald rule for Data-Engineering - https://phabricator.wikimedia.org/T295397 (Ottomata) a:Ottomata
[15:55:34] Data-Engineering, Infrastructure-Foundations, SRE, Traffic-Icebox, and 2 others: Collect netflow data for internal traffic - https://phabricator.wikimedia.org/T263277 (ayounsi) > Did you mean _not_ a hard requirement? Yep, my bad :)
[15:57:17] Analytics-Radar, Data-Engineering, wmfdata-python, Product-Analytics (Kanban): Create a wmfdata-python test script - https://phabricator.wikimedia.org/T247261 (odimitrijevic)
[16:07:14] Data-Engineering, Data-Engineering-Kanban, SRE-swift-storage, Patch-For-Review: Deploy research_poc Swift credidentials to Hadoop - https://phabricator.wikimedia.org/T296945 (Ottomata) @fkaelin `sudo -u analytics-research kerberos-run-command analytics-research hdfs dfs -cat /user/analytics-re...
[17:00:34] Analytics-Radar, Data-Engineering, wmfdata-python, Product-Analytics (Kanban): Create a wmfdata-python test script - https://phabricator.wikimedia.org/T247261 (nshahquinn-wmf) Okay, I've reworked the [pull request](https://github.com/wikimedia/wmfdata-python/pull/16) and solved the outstanding is...
[18:35:00] (EventgateLoggingExternalLatency) firing: (2) Elevated latency for GET events on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org
[18:48:38] Analytics, Data-Engineering, Event-Platform: Discussion of Event Driven Systems - https://phabricator.wikimedia.org/T290203 (Ottomata) FYI, if anyone is interested, there is a free talk from Confluent on Dec 16 2021: Consistency and Completeness: Rethinking Distributed Stream Processing in Apache Kaf...
[19:49:41] Analytics, Patch-For-Review, User-Elukey: Port architecture of irc-recentchanges to Kafka - https://phabricator.wikimedia.org/T234234 (AntiCompositeNumber) Sandboxing connections would make it more difficult to debug problems like https://github.com/countervandalism/CVNBot/issues/72 by making it impo...
[20:44:35] Data-Engineering: Investigate Superset Druid Timeouts - https://phabricator.wikimedia.org/T297148 (odimitrijevic)
[22:35:00] (EventgateLoggingExternalLatency) firing: (2) Elevated latency for GET events on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org