[00:01:13] RECOVERY - Check unit status of drop_event on an-launcher1002 is OK: OK: Status of the systemd unit drop_event https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[00:14:53] PROBLEM - Check unit status of drop_event on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit drop_event https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[08:05:40] Analytics-Kanban, Data-Engineering, Pontoon: Move the Analytics/DE testing infrastructure to Pontoon - https://phabricator.wikimedia.org/T292388 (Aklapper)
[10:25:16] Data-Engineering-Kanban, Data Engineering Planning, Event-Platform Value Stream, Metrics-Platform-Planning (Metrics Platform Kanban), Patch-For-Review: Remove StreamConfig::INTERNAL_SETTINGS logic from EventStreamConfig and do it in EventLogging client... - https://phabricator.wikimedia.org/T286344
[11:06:54] Data-Engineering, Infrastructure-Foundations, Product-Analytics, Research, and 3 others: Maybe restrict domains accessible by webproxy - https://phabricator.wikimedia.org/T300977 (jbond)
[12:15:15] RECOVERY - Check unit status of refinery-drop-webrequest-raw-partitions on an-launcher1002 is OK: OK: Status of the systemd unit refinery-drop-webrequest-raw-partitions https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[12:19:37] (PS1) Gerrit maintenance bot: Add bn.wikiquote to pageview whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/837660 (https://phabricator.wikimedia.org/T319191)
[12:21:42] (CR) Joal: [C: +1] "LGTM :) Thanks for the latest addition @mforns :)" [analytics/refinery] - https://gerrit.wikimedia.org/r/836295 (https://phabricator.wikimedia.org/T316746) (owner: Mforns)
[12:24:37] (CR) Joal: [C: +2] "LGTM - Thanks @xcollazo" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/836222 (https://phabricator.wikimedia.org/T316371) (owner: Xcollazo)
[12:28:55] PROBLEM - Check unit status of refinery-drop-webrequest-raw-partitions on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit refinery-drop-webrequest-raw-partitions https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[12:34:14] (Merged) jenkins-bot: Add unit test for MediaWikiEvent. Fix empty path bug. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/836222 (https://phabricator.wikimedia.org/T316371) (owner: Xcollazo)
[14:57:39] (CR) Joal: [V: +2 C: +2] "Merging for next deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/837660 (https://phabricator.wikimedia.org/T319191) (owner: Gerrit maintenance bot)
[15:01:30] Data-Engineering, Equity-Landscape: Editorship Input Metrics - https://phabricator.wikimedia.org/T309274 (ntsako) Hi @JAnstee_WMF, Please note the queries that make up the data: ` --geoeditor_input_metrics SELECT mon.country_code, sum(mon.distinct_editors) as distinct_editors, lower(wdb.g...
[15:01:50] Data-Engineering, Equity-Landscape: Editorship Input Metrics - https://phabricator.wikimedia.org/T309274 (ntsako) a: ntsako→JAnstee_WMF
[16:05:44] Data-Engineering, Event-Platform Value Stream, Product-Analytics: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (10Ottomata) a: Ottomata Keeping assigned to me. We are hunting down remainders in T282131, and then creating subtasks for each one. Offic...
[16:14:52] Data-Engineering, Event-Platform Value Stream (Sprint 02), Patch-For-Review: Design Schema for page state and page state with content (enriched) streams - https://phabricator.wikimedia.org/T308017 (Ottomata) @daniel am modeling slots in the revision entity and also the page change event now: Easiest...
[16:33:46] Data-Engineering, Machine-Learning-Team, observability: Evaluate Benthos as stream processor - https://phabricator.wikimedia.org/T319214 (elukey)
[16:42:57] Data-Engineering, Machine-Learning-Team, observability: Evaluate Benthos as stream processor - https://phabricator.wikimedia.org/T319214 (JAllemandou) Ping @gmodena, as we talked about this exact topic this morning :)
[17:03:09] I created a tool to analyze user retention in wikis; maybe it can be useful for the analytics team: https://retention.toolforge.org
[17:27:44] danilo: this visualization is pretty cool! I have forwarded the link to our Product Analytics team.
[17:33:38] :)
[17:39:23] awesome work danilo :) I'll be interested to know what data you've used as a source to compute the retention :)
[17:45:02] joal: the db revision table, I found a way to get all revisions of all wikis in batches without overloading the db servers
[17:45:26] ack danilo - thanks for letting me know :)
[17:54:06] Data-Engineering, Machine-Learning-Team, observability: Evaluate Benthos as stream processor - https://phabricator.wikimedia.org/T319214 (gmodena) thanks for the ping @JAllemandou. This looks really interesting, especially for ease of deployment. @elukey do you know if `http_client` calls are async...
[18:32:50] Data-Engineering, Pageviews-API: Pageviews API: Problems accessing data from python (requests) - https://phabricator.wikimedia.org/T319233 (diego)
[18:34:25] Data-Engineering, Pageviews-API: Pageviews API: Problems accessing data from python (requests) - https://phabricator.wikimedia.org/T319233 (diego) I see that the error says: ` Scripted requests from your IP have been blocked. ` However, the error persists from different IPs.
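The "Scripted requests from your IP have been blocked" error above is what the Wikimedia User-Agent policy (https://meta.wikimedia.org/wiki/User-Agent_policy, linked later in the ticket) rejects: anonymous scripted clients. A minimal sketch of a compliant client, assuming the public per-article Pageviews REST endpoint; the tool name and contact address are placeholders you would replace with your own:

```python
import json
from urllib.request import Request, urlopen

# Placeholder identification; the policy asks for a descriptive
# User-Agent with a way to contact the operator.
USER_AGENT = "my-pageviews-client/0.1 (contact: user@example.org)"

def pageviews_url(project, article, start, end):
    """Build the per-article daily pageviews URL (dates as YYYYMMDD)."""
    return (
        "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
        f"{project}/all-access/all-agents/{article}/daily/{start}/{end}"
    )

def fetch_daily_views(project, article, start, end):
    """Fetch daily pageviews, sending a descriptive User-Agent so the
    request is not rejected as an anonymous scripted client."""
    req = Request(pageviews_url(project, article, start, end),
                  headers={"User-Agent": USER_AGENT})
    with urlopen(req) as resp:
        return json.load(resp)["items"]
```

The same `User-Agent` header works with the `requests` library mentioned in the ticket title (`requests.get(url, headers={"User-Agent": USER_AGENT})`).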
[18:36:28] Data-Engineering, Pageviews-API: Pageviews API: Problems accessing data from python (requests) - https://phabricator.wikimedia.org/T319233 (diego) Sorry, I've read the documentation here: https://meta.wikimedia.org/wiki/User-Agent_policy and everything is clear. I'm going to close this ticket. Just for...
[18:36:42] Data-Engineering, Pageviews-API: Pageviews API: Problems accessing data from python (requests) - https://phabricator.wikimedia.org/T319233 (diego) Open→Resolved
[18:37:27] Data-Engineering, Pageviews-API: Pageviews API: Problems accessing data from python (requests) - https://phabricator.wikimedia.org/T319233 (taavi) Resolved→Open https://meta.wikimedia.org/wiki/User-Agent_policy
[18:37:36] Data-Engineering, Pageviews-API: Pageviews API: Problems accessing data from python (requests) - https://phabricator.wikimedia.org/T319233 (taavi) Open→Resolved a: taavi
[19:42:20] joal: heya, I made another round of changes to the deletion script, for performance reasons. The method get_data_info was being called more often than necessary. I also noticed that we can sort the directories after hdfs.ls() and assume they order alphanumerically, so the first one that returns a start date is the one that is rolled up recursively, instead of calling all children.
[20:45:52] seems the changes have indeed improved performance: the drop-predictions-actor_label-hourly job went from not finishing within 1 hour to finishing in ~1 minute.
[21:22:59] Data-Engineering, Equity-Landscape: Editorship Input Metrics - https://phabricator.wikimedia.org/T309274 (JAnstee_WMF) @ntsako - second pass complete - while we do not fully align, they are close, and the discrepancy is related to the correction in the pipeline to pull the average whereas the 2021 workbook...
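The deletion-script optimization described at 19:42 can be sketched as follows. This is an illustrative reconstruction, not the actual refinery code: `hdfs_ls` and `get_start_date` are hypothetical stand-ins for the real helpers, and the key assumption (stated in the chat) is that partition directory names sort alphanumerically in date order, so the first child that yields a start date is the oldest and the rest can be skipped:

```python
def earliest_partition(path, hdfs_ls, get_start_date):
    """Return the start date of the oldest partition under `path`.

    Children are probed in sorted (assumed chronological) order, and we
    stop at the first one that yields a date, instead of recursing into
    every child as the previous version of the script did.
    """
    for child in sorted(hdfs_ls(path)):
        date = get_start_date(child)
        if date is not None:
            return date  # oldest partition found; skip the remaining children
    return None
```

With N children this turns N `get_data_info`-style probes into (usually) one, which is consistent with the reported speedup from over an hour to about a minute.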
[21:26:01] Analytics, Data-Engineering, Event-Platform Value Stream, Performance-Team (Radar): Avoid extra HTTPS connections for most Event Platform beacons - https://phabricator.wikimedia.org/T263049 (Krinkle)
[21:27:26] Analytics, Data-Engineering, Event-Platform Value Stream, Performance-Team (Radar): Avoid extra HTTPS connections for most Event Platform beacons - https://phabricator.wikimedia.org/T263049 (Krinkle) Updated title to recognise that the original one of these (NEL: Network Error Logging)...
[22:18:09] Data-Engineering, Product-Analytics, wmfdata-python, Data Pipelines (Sprint 02): Upgrade WMFData Python Package to use Spark3 - https://phabricator.wikimedia.org/T318587 (nshahquinn-wmf) Based on the activity above, it looks like #data-engineering is planning to do this work.