[00:02:42] 10Analytics, 10Data-Engineering, 10Event-Platform: Discussion of Event Driven Systems - https://phabricator.wikimedia.org/T290203 (10Ottomata) Today, @Andrew and @Cparle said they weren't sure what was meant by 'turning the database inside out'. Here's a quote from the Designing Event Driven Systems book:...
[00:09:01] (03CR) 10Bstorm: [C: 03+1] "I think everything works in this version. I tried whatever I could think of 😊" [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/716031 (owner: 10Andrew Bogott)
[00:12:24] 10Quarry: Dev environment should have separate database to test against - https://phabricator.wikimedia.org/T287902 (10Bstorm) 05Open→03Resolved People might not notice right away that I updated the readme to explain how to use it now, but it works pretty consistently and far more resembles working with the...
[00:18:33] 10Analytics-Radar, 10Product-Analytics, 10Editing-team (Tracking): How often do people try to edit on mobile devices, using the desktop site, at the English Wikipedia? - https://phabricator.wikimedia.org/T288972 (10MNeisler) p:05Triage→03Low
[01:15:01] 10Analytics, 10Community-Tech, 10Event Metrics, 10Event-Platform: EventStreams sending same data over and over (page links change) - https://phabricator.wikimedia.org/T290211 (10Green_Cardamom)
[01:16:02] 10Analytics, 10Community-Tech, 10Event Metrics, 10Event-Platform: EventStreams sending same data over and over (page links change) - https://phabricator.wikimedia.org/T290211 (10Green_Cardamom) `{"$schema":"/mediawiki/page/links-change/1.0.0","meta":{"uri":"https://arz.wikipedia.org/wiki/%D8%B1%D9%88%D8%AF...
[01:21:01] 10Analytics, 10Community-Tech, 10Event Metrics, 10Event-Platform: EventStreams sending same data over and over (page links change) - https://phabricator.wikimedia.org/T290211 (10Samwilson) Is this related to #event_metrics?
[01:27:34] 10Analytics, 10Community-Tech, 10Event Metrics, 10Event-Platform: EventStreams sending same data over and over (page links change) - https://phabricator.wikimedia.org/T290211 (10Green_Cardamom) No idea. Feel free to adjust for the right audience I wasn't sure.
[01:36:23] 10Analytics, 10Event-Platform: EventStreams sending same data over and over (page links change) - https://phabricator.wikimedia.org/T290211 (10Samwilson)
[06:59:00] 10Analytics, 10SRE, 10SRE-Access-Requests: Requesting access to analytics-privatedata-users group for Abban Dunne - https://phabricator.wikimedia.org/T289775 (10fgiunchedi) @odimitrijevic hello, a friendly reminder this task is pending approval (cc @Ottomata too)
[07:55:45] (03PS1) 10Joal: Update pageview-monthly-dump oozie SLA [analytics/refinery] - 10https://gerrit.wikimedia.org/r/716215
[08:16:10] (03PS1) 10Jgiannelos: Map tile expiration event schema [schemas/event/primary] - 10https://gerrit.wikimedia.org/r/716219 (https://phabricator.wikimedia.org/T289771)
[08:38:55] 10Analytics: Check home/HDFS leftovers of fdans - https://phabricator.wikimedia.org/T290231 (10MoritzMuehlenhoff)
[08:40:52] :( :( :(
[08:45:25] (03CR) 10Joal: [C: 03+2] "LGTM!
Thanks Hugh" [analytics/aqs/deploy] - 10https://gerrit.wikimedia.org/r/715995 (owner: 10Hnowlan)
[08:49:49] (03CR) 10Joal: [C: 03+1] "The code looks good to me - Given how much impact the code has on our scheduled jobs, I'd like to synchronize releasing this with changing" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/495141 (https://phabricator.wikimedia.org/T217967) (owner: 10Gehel)
[08:54:15] (03CR) 10Joal: [C: 03+1] "Code looks good - As for the other patch making -shaded jars explicit, I'd synchronize this with changes in the various places they're use" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/716033 (https://phabricator.wikimedia.org/T217967) (owner: 10Ottomata)
[09:01:19] 10Analytics: Check home/HDFS leftovers of gilles - https://phabricator.wikimedia.org/T290232 (10MoritzMuehlenhoff)
[09:02:28] (03CR) 10David Caro: Move flask creation into create_app() factory, move routes into blueprints (031 comment) [analytics/quarry/web] (buster) - 10https://gerrit.wikimedia.org/r/715621 (owner: 10Andrew Bogott)
[09:45:54] (03CR) 10Milimetric: [V: 03+2 C: 03+2] Update pageview-monthly-dump oozie SLA [analytics/refinery] - 10https://gerrit.wikimedia.org/r/716215 (owner: 10Joal)
[10:32:17] hnowlan: Thank you for the document - I updated minimal stuff on loading jobs, added the step of quality checking, and added a comment - The note about sstableloader is interesting!
[10:44:39] joal: nice, thank you!
[12:28:53] (03CR) 10Gehel: [C: 03+1] "LGTM, minor comment inline.
I have not done any kind of testing and don't know much about the larger context of this project, my +1 is onl" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/495141 (https://phabricator.wikimedia.org/T217967) (owner: 10Gehel)
[13:29:39] (03CR) 10Hnowlan: [V: 03+2] Add new AQS hosts to scap groups [analytics/aqs/deploy] - 10https://gerrit.wikimedia.org/r/715995 (owner: 10Hnowlan)
[13:40:18] 10Analytics, 10SRE, 10SRE-Access-Requests: Requesting access to analytics-privatedata-users group for Abban Dunne - https://phabricator.wikimedia.org/T289775 (10Ottomata) Approved
[13:42:56] (03CR) 10Ottomata: "In prod, we should be using versioned jar files explicitly, not symlinks. So this should only matter for when we want to bump jar version" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/716033 (https://phabricator.wikimedia.org/T217967) (owner: 10Ottomata)
[13:44:06] (03CR) 10Ottomata: Publish both shaded and unshaded artifacts. (031 comment) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/495141 (https://phabricator.wikimedia.org/T217967) (owner: 10Gehel)
[14:07:33] heya teammm
[14:07:36] (03CR) 10David Caro: [C: 03+1] Move flask creation into create_app() factory, move routes into blueprints (031 comment) [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/716031 (owner: 10Andrew Bogott)
[14:24:37] (03PS1) 10MNeisler: Add the content_translation_event stream to the allowlist [analytics/refinery] - 10https://gerrit.wikimedia.org/r/716339 (https://phabricator.wikimedia.org/T281511)
[14:25:50] 10Analytics, 10CX-analytics, 10Language-analytics, 10Patch-For-Review, 10Product-Analytics (Kanban): Add content_translation_event data stream to the sanitization allowlist - https://phabricator.wikimedia.org/T281511 (10MNeisler)
[14:32:21] hello mforns !
[14:40:20] Ñ+
[14:40:28] oops, wrong character set
[14:40:35] it was: :]
[14:55:52] mforns: can i help you figure out the sanitization alert?
[14:56:18] ottomata: sure, I appreciate help :]
[14:56:42] ok looking
[14:57:27] It's weird, because the code is correct, but the interval checked by the monitor job is still 3 days (24 hours before and 24 hours after...)
[14:58:07] OH
[14:58:17] mforns: i think we might have changed the wrong place in puppet.
[14:58:18] looking
[14:58:30] ??
[14:58:32] cat /usr/local/bin/monitor_refine_event_sanitized_analytics_delayed
[14:58:33] hm
[14:58:35] --config_file /etc/refinery/refine/refine_event_sanitized_analytics_delayed.properties --since 1128 --until 1056 "${@}"
[14:58:54] 72 hours...
[15:00:24] oh, I changed the main job
[15:00:28] oh, yeah
[15:00:29] not analytics
[15:00:38] which should also do the same thing though, right?
[15:00:43] yes, I think so
[15:01:08] ok submitting patch
[15:01:15] oh, I can do that!
[15:01:21] oh ok if you like
[15:01:33] yep
[15:04:49] ottomata: https://gerrit.wikimedia.org/r/c/operations/puppet/+/716372
[15:07:28] thanks ottomata for the help! I didn't really look into that after first patch. was focusing on aqs tests.
[15:08:08] no prob glad it was an easy fix!
[15:11:04] (03PS9) 10Ottomata: Publish both shaded and unshaded artifacts. [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/495141 (https://phabricator.wikimedia.org/T217967) (owner: 10Gehel)
[15:11:08] (03CR) 10Ottomata: Publish both shaded and unshaded artifacts. (031 comment) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/495141 (https://phabricator.wikimedia.org/T217967) (owner: 10Gehel)
[15:11:10] (03CR) 10jerkins-bot: [V: 04-1] Publish both shaded and unshaded artifacts.
[analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/495141 (https://phabricator.wikimedia.org/T217967) (owner: 10Gehel)
[15:11:57] (03PS2) 10Ottomata: bin/update-refinery-source-jars now downloads and symlinks shaded jars [analytics/refinery] - 10https://gerrit.wikimedia.org/r/716033 (https://phabricator.wikimedia.org/T217967)
[15:12:09] (03CR) 10Ottomata: [V: 03+2 C: 03+2] bin/update-refinery-source-jars now downloads and symlinks shaded jars [analytics/refinery] - 10https://gerrit.wikimedia.org/r/716033 (https://phabricator.wikimedia.org/T217967) (owner: 10Ottomata)
[15:12:31] (03CR) 10Ottomata: "recheck" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/495141 (https://phabricator.wikimedia.org/T217967) (owner: 10Gehel)
[15:14:19] (03CR) 10Ottomata: [V: 03+2 C: 03+2] Publish both shaded and unshaded artifacts. [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/495141 (https://phabricator.wikimedia.org/T217967) (owner: 10Gehel)
[15:18:58] 10Analytics, 10Event-Platform, 10Patch-For-Review: WikipediaPortal Event Platform Migration - https://phabricator.wikimedia.org/T282012 (10Ottomata) Hi @EYener! Checking in, how's the Q1 migration of this instrumentation going?
[15:22:29] awight: hello!
[15:22:39] is the EditConflict EventLogging schema in use?
[15:22:44] i see you made a change to the schema in 2020
[15:22:45] https://meta.wikimedia.org/w/index.php?title=Schema%3AEditConflict&type=revision&diff=19778941&oldid=8861166
[15:28:48] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Better Use Of Data, and 4 others: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned - https://phabricator.wikimedia.org/T282131 (10Ottomata)
[15:29:06] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Better Use Of Data, and 4 others: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned - https://phabricator.wikimedia.org/T282131 (10Ottomata) Edited task to only include schemas we don't yet know what to d...
[15:48:43] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Better Use Of Data, and 4 others: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned - https://phabricator.wikimedia.org/T282131 (10Ottomata) @MMiller_WMF do you know if we can decom the GuidedTour* schema...
[16:10:08] 10Analytics, 10Analytics-Kanban, 10Product-Analytics, 10wmfdata-python: wmfdata-python's Hive query output includes logspam - https://phabricator.wikimedia.org/T275233 (10Milimetric) Will try the solution Andrew mentions above, after @nshahquinn-wmf lets me know how he'd like me to contribute to wmfdata.
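Note on the 14:57 to 15:00 sanitization-monitor exchange above: a minimal sketch of the window arithmetic, assuming `--since` and `--until` are both counted in hours before the current time (the log does not state the unit, although the "72 hours..." remark is consistent with it). The helper name `monitor_window` and the reference timestamp are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta, timezone


def monitor_window(since_hours, until_hours, now):
    """Return (start, end) of the checked interval.

    Assumption: both arguments are "hours before now", so the larger
    value (since) marks the older edge of the window.
    """
    start = now - timedelta(hours=since_hours)
    end = now - timedelta(hours=until_hours)
    return start, end


# Hypothetical reference time; any timestamp gives the same window width.
now = datetime(2021, 9, 1, 15, 0, tzinfo=timezone.utc)

# Values taken from the wrapper script quoted at 14:58:35.
start, end = monitor_window(1128, 1056, now)
width_hours = (end - start) / timedelta(hours=1)

print(f"window: {start.isoformat()} .. {end.isoformat()}")
print(f"width: {width_hours} hours")
```

Under that assumption the window is 1128 - 1056 = 72 hours wide, which matches the 3-day interval mforns observed before the puppet fix.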
[16:12:04] 10Analytics-Kanban, 10Patch-For-Review: Add a spark job loading Cassandra 3 - https://phabricator.wikimedia.org/T280649 (10JAllemandou) 05Open→03Resolved
[16:12:06] 10Analytics, 10Cassandra, 10Data-Engineering, 10Data-Engineering-Kanban, 10Epic: Cassandra3 migration for Analytics AQS - https://phabricator.wikimedia.org/T249755 (10JAllemandou)
[16:12:43] 10Analytics, 10Analytics-Kanban, 10Cassandra, 10Data-Engineering, and 2 others: Update cassandra oozie jobs to load cassandra3 using Spark job - https://phabricator.wikimedia.org/T289161 (10JAllemandou) 05Open→03Resolved
[16:12:49] 10Analytics, 10Cassandra, 10Data-Engineering, 10Data-Engineering-Kanban, 10Epic: Cassandra3 migration for Analytics AQS - https://phabricator.wikimedia.org/T249755 (10JAllemandou)
[16:13:00] 10Analytics-Kanban, 10Data-Engineering, 10Privacy Engineering, 10Privacy, and 2 others: Add check to make sure deny-list countries aren't being passed through AQS - https://phabricator.wikimedia.org/T289279 (10JAllemandou) 05Open→03Resolved
[16:20:43] 10Analytics, 10Data-Engineering, 10Event-Platform: Discussion of Event Driven Systems - https://phabricator.wikimedia.org/T290203 (10Milimetric) Here's some of my (somewhat naive) perspective on link tables. I know there are lots of other problems, but this is stuff I've been thinking about over the years:...
[16:44:43] 10Analytics: Check home/HDFS leftovers of gilles - https://phabricator.wikimedia.org/T290232 (10odimitrijevic) p:05Triage→03High
[16:48:45] 10Analytics, 10Event-Platform: EventStreams sending same data over and over (page links change) - https://phabricator.wikimedia.org/T290211 (10Ottomata) Hmmm very strange! @Pchelolo? Sounds like something is strange with the MW hook.
[16:50:40] 10Analytics, 10Event-Platform, 10Platform Engineering: EventStreams sending same data over and over (page links change) - https://phabricator.wikimedia.org/T290211 (10odimitrijevic) p:05Triage→03High
[16:51:52] 10Analytics, 10Analytics-Kanban: Fix `wmf.editors_daily` data deletion - https://phabricator.wikimedia.org/T290093 (10odimitrijevic) p:05Triage→03High
[16:55:27] 10Analytics, 10Event-Platform, 10Patch-For-Review: Users should run explicit commands to materialize schema versions, rather than using magic git hooks - https://phabricator.wikimedia.org/T290074 (10odimitrijevic) p:05Triage→03Medium
[16:58:30] 10Analytics, 10Analytics-Kanban: Fix `wmf.editors_daily` data deletion - https://phabricator.wikimedia.org/T290093 (10Milimetric) a:03Milimetric
[16:59:14] 10Analytics: Fill holes in pageview-complete dumps using pageview-count-raw - https://phabricator.wikimedia.org/T290060 (10odimitrijevic) p:05Triage→03Medium Good starter task.
[16:59:52] 10Analytics: Fill holes in pageview-complete dumps using pageview-count-raw - https://phabricator.wikimedia.org/T290060 (10JAllemandou) p:05Medium→03Triage
[16:59:59] 10Analytics: Fill holes in pageview-complete dumps using pageview-count-raw - https://phabricator.wikimedia.org/T290060 (10JAllemandou) p:05Triage→03Medium
[17:02:15] 10Analytics, 10Event-Platform, 10Metrics-Platform: Source geolocation directly rather than using IP in schema - https://phabricator.wikimedia.org/T290014 (10odimitrijevic) p:05Triage→03Medium
[17:05:02] 10Analytics, 10Analytics-EventLogging: '.event.finalSlide' should be integer - https://phabricator.wikimedia.org/T289866 (10odimitrijevic) p:05Triage→03High @gabriel-wmde @CorinnaHillebrand_WMDE @Tim_WMDE Is this a schema that your team is instrumenting?
[17:06:30] 10Analytics-Radar: Update ROCm version on GPU instances. - https://phabricator.wikimedia.org/T287267 (10odimitrijevic) That's great! @BTullis and @razzi are busy but do reach out if you have any questions.
[17:07:05] 10Analytics-Radar, 10Product-Analytics (Kanban): [REQUEST] Investigate decrease in New Registered Users - https://phabricator.wikimedia.org/T289799 (10odimitrijevic)
[17:08:45] 10Analytics-Radar, 10CX-analytics, 10Language-analytics, 10Patch-For-Review, 10Product-Analytics (Kanban): Add content_translation_event data stream to the sanitization allowlist - https://phabricator.wikimedia.org/T281511 (10odimitrijevic)
[17:09:29] (03CR) 10Andrew Bogott: Move flask creation into create_app() factory, move routes into blueprints (031 comment) [analytics/quarry/web] (buster) - 10https://gerrit.wikimedia.org/r/715621 (owner: 10Andrew Bogott)
[17:10:55] (03PS5) 10Andrew Bogott: Move flask creation into create_app() factory, move routes into blueprints [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/716031
[17:13:36] (03CR) 10Andrew Bogott: [C: 03+2] Move flask creation into create_app() factory, move routes into blueprints [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/716031 (owner: 10Andrew Bogott)
[17:15:08] 10Analytics, 10Data-Engineering, 10FR-Tech-Analytics, 10Privacy Engineering: event.WikipediaPortal referer modification - https://phabricator.wikimedia.org/T279952 (10odimitrijevic)
[17:16:53] 10Analytics-Clusters, 10Data-Engineering: Deploy an-test-presto1002 as a Ganeti VM to test Presto and Alluxio integration - https://phabricator.wikimedia.org/T288766 (10odimitrijevic)
[17:17:19] (03Merged) 10jenkins-bot: Move flask creation into create_app() factory, move routes into blueprints [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/716031 (owner: 10Andrew Bogott)
[17:18:59] 10Analytics, 10SRE, 10Patch-For-Review: Trash cleanup cron spams on an-test hosts - https://phabricator.wikimedia.org/T286442 (10Ottomata) Could we just make the script use /home which is everywhere?
[17:57:56] (03PS1) 10Andrew Bogott: quarry.wsgi: create application object even if we aren't main [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/716460
[18:01:42] (03CR) 10Bstorm: [C: 03+1] "That should make the real uwsgi happy. It wasn't finding an "application" callable because __name__ is not "__main__"" [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/716460 (owner: 10Andrew Bogott)
[18:02:17] (03CR) 10Andrew Bogott: [C: 03+2] quarry.wsgi: create application object even if we aren't main [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/716460 (owner: 10Andrew Bogott)
[18:03:49] mforns: Hi!
[18:03:53] mforns: batcave?
[18:07:41] joal: yesss, gimme 3 minutes
[18:08:04] sure mforns no problem
[18:15:34] joal: omw!
[18:22:09] (03CR) 10Ottomata: "Would it make sense to model this as a state change, rather than an expiration command?" [schemas/event/primary] - 10https://gerrit.wikimedia.org/r/716219 (https://phabricator.wikimedia.org/T289771) (owner: 10Jgiannelos)
[18:27:34] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10Jclark-ctr)
[18:38:05] 10Analytics, 10Event-Platform, 10Patch-For-Review: Users should run explicit commands to materialize schema versions, rather than using magic git hooks - https://phabricator.wikimedia.org/T290074 (10Urbanecm) >>! In T290074#7321592, @Ottomata wrote: > I've created this ticket to discuss with users of the sch...
[18:50:21] heya joal, you still there?
[18:50:38] Joining!
[18:50:42] k
[19:30:49] 10Analytics, 10Data-Engineering, 10Event-Platform: Discussion of Event Driven Systems - https://phabricator.wikimedia.org/T290203 (10daniel) I made this doodle of an "event driven mediawiki" architecture a while ago. I had forgotten about this, but listening the "db inside out talk" made me remember. I'd be...
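Note on the quarry.wsgi review at 18:01:42 above: a minimal sketch of the pattern it describes, not the actual Quarry code. A WSGI server imports the entry module rather than running it as a script, so anything defined only under `if __name__ == "__main__":` is invisible to it. `create_app` here is a hypothetical stdlib stand-in for the real Flask factory, and `call_once` is an illustrative helper.

```python
from wsgiref.util import setup_testing_defaults


def create_app():
    """Hypothetical stand-in for the real application factory."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok"]
    return app


# Defined at module level, so it exists whether this file is imported by a
# WSGI server (where __name__ is the module name) or run directly.
application = create_app()


def call_once(app):
    """Invoke a WSGI callable once with a synthetic request."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    # Only the local smoke test lives behind the main guard.
    print(call_once(application))
```

Had `application = create_app()` been placed inside the `__main__` guard, an importing server would find no such attribute, which is the failure the review comment describes.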
[19:32:57] 10Analytics-Radar, 10Product-Analytics (Kanban): [REQUEST] Investigate decrease in New Registered Users - https://phabricator.wikimedia.org/T289799 (10mpopov) From https://meta.wikimedia.org/wiki/Research:Newly_registered_user#Data_sources: > Newly registered users are logged globally via Schema:ServerSideAcc...
[19:33:09] Gone for tonight team - see you tomorrow
[20:05:30] 10Analytics, 10Data-Engineering, 10FR-Tech-Analytics, 10Privacy Engineering: event.WikipediaPortal referer modification - https://phabricator.wikimedia.org/T279952 (10mforns) @sguebo_WMF & @EYener, we discussed this task and will go ahead and implement this feature. However, as it is something not trivial,...
[21:27:10] (03Abandoned) 10Shay Nowick: Creating android_setting_action schema Bug: T285779 [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/714871 (https://phabricator.wikimedia.org/T285779) (owner: 10Shay Nowick)
[21:27:30] (03Abandoned) 10Shay Nowick: Create android_user_state schema Bug: T285779 Change-Id: Iac1e505db912758cc02541b218d8095d0dc7ef4b [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/715562 (https://phabricator.wikimedia.org/T285779) (owner: 10Shay Nowick)
[21:27:38] (03Abandoned) 10Shay Nowick: Create mobile-apps/android_user_state 1.0.0 [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/715563 (owner: 10Shay Nowick)
[21:33:15] (03PS1) 10Andrew Bogott: Added minimal page load test for '/' route [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/716558
[21:39:38] (03CR) 10jerkins-bot: [V: 04-1] Added minimal page load test for '/' route [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/716558 (owner: 10Andrew Bogott)
[22:10:40] 10Analytics, 10Patch-For-Review: Decide whether to migrate from Presto to Trino - https://phabricator.wikimedia.org/T266640 (10nshahquinn-wmf)
[22:58:56] 10Analytics, 10Analytics-Kanban, 10Product-Analytics, 10wmfdata-python: wmfdata-python's Hive query output includes logspam - 
https://phabricator.wikimedia.org/T275233 (10nshahquinn-wmf) @Milimetric thank you so much for volunteering to take this on! As you know, we want to need to rewrite the [hive module...
[23:02:05] 10Analytics, 10Readers-Web-Backlog, 10Patch-For-Review, 10Product-Analytics (Kanban): Add UniversalLanguageSelector to the allowlist - https://phabricator.wikimedia.org/T287256 (10jwang) The sanitization is enabled. Now the data is available in event_sanitized.UniversalLanguageSelector. The earliest event...