[02:35:00] (EventgateLoggingExternalLatency) firing: (2) Elevated latency for GET events on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org
[04:42:35] PROBLEM - Check unit status of produce_canary_events on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[04:53:47] RECOVERY - Check unit status of produce_canary_events on an-launcher1002 is OK: OK: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[06:35:00] (EventgateLoggingExternalLatency) firing: (2) Elevated latency for GET events on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org
[08:15:42] Data-Engineering, Data-Engineering-Kanban: Triage Superset Dashboard Timeouts - https://phabricator.wikimedia.org/T294768 (razzi) Open→Resolved
[08:15:44] Data-Engineering, Data-Engineering-Kanban, Epic: Presto/Superset User Experience Improvement - https://phabricator.wikimedia.org/T294259 (razzi)
[08:23:13] (PS1) GoranSMilovanovic: T296926 [analytics/wmde/WD/WikidataAnalytics] - https://gerrit.wikimedia.org/r/745192
[08:23:38] (CR) GoranSMilovanovic: [V: +2 C: +2] T296926 [analytics/wmde/WD/WikidataAnalytics] - https://gerrit.wikimedia.org/r/745192 (owner: GoranSMilovanovic)
[10:35:00] (EventgateLoggingExternalLatency) firing: (2) Elevated latency for GET events on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org
[10:44:04] (PS1) Btullis: Upgrade Superset to version 1.3.2 [analytics/superset/deploy] - https://gerrit.wikimedia.org/r/745212 (https://phabricator.wikimedia.org/T295983)
[13:03:41] PROBLEM - Webrequests Varnishkafka log producer on cp5006 is CRITICAL: PROCS CRITICAL: 0 processes with args /usr/bin/varnishkafka -S /etc/varnishkafka/webrequest.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[13:04:44] --^ This alert is in #wikimedia-operations too. The host looks down.
[13:05:28] Ah, it was power cycled by ema.
[13:05:47] RECOVERY - Webrequests Varnishkafka log producer on cp5006 is OK: PROCS OK: 1 process with args /usr/bin/varnishkafka -S /etc/varnishkafka/webrequest.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[13:25:45] Data-Engineering, Data-Engineering-Kanban: Re-enable Superset metadata caching - https://phabricator.wikimedia.org/T295295 (BTullis) Open→Resolved
[13:25:47] Data-Engineering, Data-Engineering-Kanban, Epic: Presto/Superset User Experience Improvement - https://phabricator.wikimedia.org/T294259 (BTullis)
[13:35:01] mforns: o/ Hello. Is there any chance you could boost my superpowers in this group please? https://gitlab.wikimedia.org/people/wmf-team-data-engineering
[14:23:40] btullis: done :]
[14:32:54] Data-Engineering, Data-Engineering-Kanban, Airflow: [Airflow] Automate sync'ing archiva packages to HDFS - https://phabricator.wikimedia.org/T294024 (mforns) @Ottomata That's amazing!
[14:35:00] (EventgateLoggingExternalLatency) firing: (2) Elevated latency for GET events on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org
[14:46:41] mforns: Many thanks.
[14:46:54] btullis: did it work?
[14:48:00] Analytics, Analytics-Wikistats, Data-Engineering, Data-Engineering-Kanban: Wikistats Bug differing view numbers - https://phabricator.wikimedia.org/T295298 (Milimetric) Verified that the changes are now live. @Kipala, the approach we decided to go for was to link directly to the metric definitio...
[15:01:01] Analytics-Radar, Data-Engineering, wmfdata-python, Product-Analytics (Kanban): Create a wmfdata-python test script - https://phabricator.wikimedia.org/T247261 (Milimetric) I had, thanks, the regex looks good. I'm not able to test for merging this week, so I hope someone else can get to it. If n...
[15:04:35] mforns: Yes, thanks. All good. I can now create a new project under repos/data-engineering and I can get to the CI runners configuration too.
[15:04:47] 👍
[15:06:36] mforns: You might as well upgrade razzi too, while you're there please. He can then transfer https://gitlab.wikimedia.org/razzi/presto-query-logger into the repos/data-engineering/ namespace.
[15:13:12] Analytics, Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Add a presto query logger - https://phabricator.wikimedia.org/T269832 (BTullis) >>! In T269832#7553450, @razzi wrote: > Sounds good @BTullis; I'm available to help as well. Repository is at https://gitlab.wikimedia.org/razzi/...
[15:16:17] Analytics-Kanban, Airflow: Tooling for Deploying Conda Environments - https://phabricator.wikimedia.org/T296543 (Ottomata)
[15:23:04] done btullis and razzi, lmk if there's any problem
[15:23:17] Excellent! Many thanks.
[16:01:30] Analytics, Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Add a presto query logger - https://phabricator.wikimedia.org/T269832 (razzi) Good call @btulis. Imported the project into https://gitlab.wikimedia.org/repos/data-engineering/presto-query-logger, I'll keep a fork in case I wa...
[16:03:11] hey a-team, quick question, are page protections stored in mediawiki_history? I don't see them in the event_types
[16:14:25] dsaez: I don't recall collecting page protect events for mediawiki history
[16:15:16] got it... so it would be difficult to know when a page became protected
[16:17:38] dsaez: I think there's no way to extract that info from mediawiki_history
[16:18:25] got it, thx
[16:18:31] dsaez: I wonder if page protects are in the logging table, probably they are, but I don't remember seeing any when developing mediawiki history reconstruction.
[16:20:40] dsaez: Oh, just checked and it seems the logging table has the page protect logs, so theoretically we could add them to the algorithm. I imagine it would be a big effort though.
[16:22:26] oh, I see, maybe I can just go directly to that table on maria_db
[16:24:56] Analytics, Analytics-Wikistats, Data-Engineering, Data-Engineering-Kanban: Wikistats Bug differing view numbers - https://phabricator.wikimedia.org/T295298 (Kipala) Thanks for that. Now it is a bit more work to get an idea, but not misleading any more. Thanks!! Kipala Am 08/12/2021 um 17:48...
[16:27:00] dsaez: we import the logging table, it's in wmf_raw.mediawiki_logging, with data across dbs (wiki_db column), the only problem is that it's only imported monthly. I don't think it would be a big effort to include page protection events for the base dataset, the problems start when we try to reduce the data for consumption by AQS and APIs
[16:37:31] (CR) Razzi: [C: +1] Upgrade Superset to version 1.3.2 [analytics/superset/deploy] - https://gerrit.wikimedia.org/r/745212 (https://phabricator.wikimedia.org/T295983) (owner: Btullis)
[16:37:35] dsaez: maybe what you need is in the page restrictions change stream
[16:37:42] in the event.mediawiki_page_restrictions_change
[16:37:43] table
[16:38:06] ooh
[16:38:24] thx milimetric, maybe it makes sense not to have it there..
[16:38:30] I see things in page_restrictions like edit=editautopatrolprotected and
[16:38:30] edit=editextendedsemiprotected
[16:38:33] ottomata... this is very interesting
[16:39:02] there is also mediawiki_revision_visibility_change
[16:40:05] yeah, so you can query historically in wmf_raw.mediawiki_logging, and look at incoming events to catch up to the latest
[18:35:00] (EventgateLoggingExternalLatency) firing: (2) Elevated latency for GET events on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org
[19:22:42] joal: mforns am around in case you are and want to discuss airflow deployment
[19:23:12] ottomata: in meeting, but yes, I'll ping you after is OK?
[19:23:27] ya, sure! joal had specific concerns, but i could talk about other stuff too
[20:22:21] ottomata: just finished meeting, you still there?
[20:22:33] yup!
[20:22:37] bc?
[20:22:39] sure
[20:23:49] Analytics, SRE, Traffic-Icebox: varnishkafka / ATSkafka should support setting the kafka message timestamp - https://phabricator.wikimedia.org/T277553 (razzi) a:razzi→None I haven't been working on this for months; putting it up for grabs. Definitely still worth doing.
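A minimal sketch of the historical half of the [16:40:05] suggestion, assuming the imported wmf_raw.mediawiki_logging table keeps the upstream MediaWiki logging columns (log_type, log_action, log_timestamp, log_namespace, log_title, log_params) and is partitioned by a `snapshot` value alongside the `wiki_db` column mentioned at [16:27:00]. The helper name, the `snapshot` column, and the example arguments are hypothetical; the function only builds the query text and does not run it.

```python
import re

# Hypothetical helper: builds (does not execute) a query for page protection
# log entries in the monthly import of the MediaWiki logging table.
# Assumption: column names follow the upstream MediaWiki `logging` schema and
# the import exposes `snapshot` and `wiki_db` columns.
_SAFE = re.compile(r"^[A-Za-z0-9_\-]+$")


def protection_history_query(wiki_db: str, snapshot: str) -> str:
    """Return SQL text selecting protect-type log rows for one wiki."""
    for value in (wiki_db, snapshot):
        # Reject anything that is not a plain identifier-like token, since the
        # values are interpolated directly into the query text.
        if not _SAFE.match(value):
            raise ValueError(f"unexpected characters in {value!r}")
    return (
        "SELECT log_timestamp, log_action, log_namespace, log_title, log_params\n"
        "FROM wmf_raw.mediawiki_logging\n"
        f"WHERE snapshot = '{snapshot}'\n"
        f"  AND wiki_db = '{wiki_db}'\n"
        "  AND log_type = 'protect'\n"
        "ORDER BY log_timestamp"
    )


if __name__ == "__main__":
    # Example values are placeholders, not a statement about which snapshots exist.
    print(protection_history_query("enwiki", "2021-11"))
```

Anything newer than the latest monthly import would still have to come from the incoming events, as noted at [16:40:05].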
[20:25:55] Data-Engineering: Superset annotation text overlaps illegibly - https://phabricator.wikimedia.org/T279738 (razzi) a:razzi→None I'm not working on this, so I'm unassigning myself in case anybody else wants to give it a go. Steps to reproduce the upstream issue are at https://github.com/apache/superset...
[21:02:22] Analytics-Radar, Data-Engineering, wmfdata-python, Product-Analytics (Kanban): Create a wmfdata-python test script - https://phabricator.wikimedia.org/T247261 (nshahquinn-wmf) >>! In T247261#7556444, @Milimetric wrote: > I had, thanks, the regex looks good. I'm not able to test for merging this...
[22:01:40] PROBLEM - eventgate-analytics-external validation error rate too high on alert1001 is CRITICAL: 2.084 gt 2 https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate https://grafana.wikimedia.org/d/ePFPOkqiz/eventgate?orgId=1&refresh=1m&var-service=eventgate-analytics-external&var-stream=All&var-kafka_broker=All&var-kafka_producer_type=All&var-dc=thanos
[22:05:39] (CR) MewOphaswongse: [C: +2] Add mediawiki.welcomesurvey.interaction schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/744003 (https://phabricator.wikimedia.org/T267273) (owner: Kosta Harlan)
[22:06:26] (Merged) jenkins-bot: Add mediawiki.welcomesurvey.interaction schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/744003 (https://phabricator.wikimedia.org/T267273) (owner: Kosta Harlan)
[22:33:33] PROBLEM - eventgate-analytics-external validation error rate too high on alert1001 is CRITICAL: 2.082 gt 2 https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate https://grafana.wikimedia.org/d/ePFPOkqiz/eventgate?orgId=1&refresh=1m&var-service=eventgate-analytics-external&var-stream=All&var-kafka_broker=All&var-kafka_producer_type=All&var-dc=thanos
[22:35:00] (EventgateLoggingExternalLatency) firing: (2) Elevated latency for GET events on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org
[23:01:13] PROBLEM - eventgate-analytics-external validation error rate too high on alert1001 is CRITICAL: 2.035 gt 2 https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate https://grafana.wikimedia.org/d/ePFPOkqiz/eventgate?orgId=1&refresh=1m&var-service=eventgate-analytics-external&var-stream=All&var-kafka_broker=All&var-kafka_producer_type=All&var-dc=thanos
[23:26:43] PROBLEM - eventgate-analytics-external validation error rate too high on alert1001 is CRITICAL: 2.041 gt 2 https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate https://grafana.wikimedia.org/d/ePFPOkqiz/eventgate?orgId=1&refresh=1m&var-service=eventgate-analytics-external&var-stream=All&var-kafka_broker=All&var-kafka_producer_type=All&var-dc=thanos