[01:35:03] (CR) Clare Ming: "waiting for naming convention resolution before updating schema name again" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/732089 (https://phabricator.wikimedia.org/T292587) (owner: Clare Ming)
[04:15:20] (PS6) DLynch: talk_page_event schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731333 (https://phabricator.wikimedia.org/T286076)
[04:15:34] (CR) DLynch: talk_page_event schema (2 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731333 (https://phabricator.wikimedia.org/T286076) (owner: DLynch)
[09:01:06] !log reverted hive services back to an-coord1001.
[09:01:08] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:02:06] Analytics-Kanban, Data-Engineering: Purge any Kerberos keytab files that are not managed by puppet - https://phabricator.wikimedia.org/T294124 (BTullis) p:Triage→Medium a:BTullis
[11:07:44] Analytics, Analytics-Kanban, Data-Engineering, Patch-For-Review: Snapshot and Reload cassandra2 pageview_per_article data table from all 12 instances - https://phabricator.wikimedia.org/T291472 (BTullis) I have now created a patch to open port 7000 on the aqs_next servers to an-presto1001. https:...
[11:20:28] Quarry: Find somewhere else (not NFS) to store Quarry's resultsets - https://phabricator.wikimedia.org/T178520 (Majavah)
[11:20:33] Quarry: Find somewhere else (not NFS) to store Quarry's resultsets - https://phabricator.wikimedia.org/T178520 (Majavah)
[12:36:01] Analytics, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban: Write document about "Fast Enough Superset" - https://phabricator.wikimedia.org/T294046 (JAllemandou)
[12:36:20] Analytics, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban: Write document about "Fast Enough Superset" - https://phabricator.wikimedia.org/T294046 (JAllemandou)
[12:36:46] Analytics: Reduce superset timeouts problem - https://phabricator.wikimedia.org/T294048 (JAllemandou)
[12:37:49] Analytics, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Push Gobblin import metrics to Prometheus and add alerts on some critical imports - https://phabricator.wikimedia.org/T286503 (JAllemandou)
[13:14:21] (CR) Ottomata: talk_page_event schema (2 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731333 (https://phabricator.wikimedia.org/T286076) (owner: DLynch)
[14:16:32] Analytics-Clusters, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, and 2 others: Upgrade Superset to 1.3.1 or higher - https://phabricator.wikimedia.org/T288115 (BTullis) I've checked Superset again a few more times and not run into any issues.
[14:29:36] o/
[14:30:11] how does deduplication work when refining raw json events into the event.* tables?
[14:31:13] I think I found an event in the raw json files that is apparently missing in the corresponding event.table and was wondering if it could be related to dedup
[14:41:32] Analytics, Analytics-Kanban, Data-Engineering, Patch-For-Review: Snapshot and Reload cassandra2 pageview_per_article data table from all 12 instances - https://phabricator.wikimedia.org/T291472 (BTullis) At last, loading of snapshot 7 of 12 is now under way. ` progress: [/10.64.32.128]0:0/2131 0...
[14:56:24] Analytics, Data-Engineering, Desktop Improvements, MediaWiki-extensions-WikimediaEvents, Readers-Web-Backlog (Kanbanana-FY-2021-22): Add agent_type and access_method to event data - https://phabricator.wikimedia.org/T294246 (cjming) hi @ovasileva - just some context on why this ticket is on o...
[14:56:49] (CR) MNeisler: Add the SearchSatisfaction legacy schema to the allowlist (5 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/715055 (https://phabricator.wikimedia.org/T274607) (owner: MNeisler)
[15:04:07] dcausse: https://github.com/wikimedia/analytics-refinery-source/blob/354b8a45ac0081d6d17397376c46f623e522f5fa/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/refine/TransformFunctions.scala#L63-L115
[15:04:52] ah thanks
[15:05:05] so no uuid nor meta.id no dedup
[15:05:24] hm... my events do not have any of this
[15:06:04] will file a task with my investigation I might have missed some important bits
[15:07:35] (CR) DLynch: talk_page_event schema (2 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731333 (https://phabricator.wikimedia.org/T286076) (owner: DLynch)
[15:08:35] (PS7) DLynch: talk_page_event schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731333 (https://phabricator.wikimedia.org/T286076)
[15:27:02] Analytics, Wikidata-Query-Service: Events missing from event.rdf_streaming_updater_fetch_failure but present in /wmf/data/raw/event/eqiad.rdf-streaming-updater.fetch-failure - https://phabricator.wikimedia.org/T294361 (dcausse)
[15:33:32] a-team anything to deploy for the weekly train?
[15:33:53] Nothing from me, thanks.
[15:34:59] ottomata: nothing new - do we test/redeploy the gobblin-purge?
[15:35:04] Analytics, Data-Engineering, Data-Engineering-Kanban, Epic: Alluxio for Improved Superset Query Performance - https://phabricator.wikimedia.org/T288252 (BTullis) Should we decline this ticket now, or mark it as resolved, or re-title it?
[15:36:42] Analytics, Wikidata-Query-Service: Events missing from event.rdf_streaming_updater_fetch_failure but present in /wmf/data/raw/event/eqiad.rdf-streaming-updater.fetch-failure - https://phabricator.wikimedia.org/T294361 (dcausse)
[15:40:00] joal: can we do in 20 mins?
[15:40:17] sure ottomata - even tomorrow!
[15:40:30] oh no i mean
[15:40:34] will it take only 20 minutes?
[15:40:35] Analytics, Data-Engineering, Data-Engineering-Kanban, Epic: Alluxio for Improved Superset Query Performance - https://phabricator.wikimedia.org/T288252 (Ottomata) I think decline.
[15:40:39] we got standup in 20
[15:40:40] Ah!
[15:40:51] hm, maybe, but not sure
[15:41:11] ottomata: Let me schedule some time tomorrow (same as today?) - With an hour we should be safe
[15:41:19] ya lets do that
[15:44:00] Analytics: Check home/HDFS leftovers of tonina - https://phabricator.wikimedia.org/T293676 (Ottomata) Done: ` sudo -u hdfs kerberos-run-command hdfs hdfs dfs -rm -r /user/tonina ` ` sudo cumin 'C:profile::analytics::cluster::client or C:profile::hadoop::master or C:profile::hadoop::master::standby' 'rm -rf...
[15:44:07] Analytics: Check home/HDFS leftovers of tonina - https://phabricator.wikimedia.org/T293676 (Ottomata) Open→Resolved
[15:49:54] (CR) Ottomata: [C: +1] talk_page_event schema (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731333 (https://phabricator.wikimedia.org/T286076) (owner: DLynch)
[16:03:02] Analytics, Event-Platform, Observability-Logging, SRE, and 2 others: Integrate Event Platform and ECS logs - https://phabricator.wikimedia.org/T291645 (Ottomata) @colewhite, in https://phabricator.wikimedia.org/T288851#7456931 you said: > topics prefixed by rsyslog- will be automatically picked...
[16:04:14] Analytics, Analytics-Kanban: Check home/HDFS leftovers of jmads - https://phabricator.wikimedia.org/T290715 (Ottomata) a:Ottomata
[16:05:36] Analytics, Data-Engineering, Desktop Improvements, MediaWiki-extensions-WikimediaEvents, Readers-Web-Backlog (Kanbanana-FY-2021-22): Add agent_type and access_method to event data - https://phabricator.wikimedia.org/T294246 (ovasileva) p:Medium→High
[17:14:10] Analytics, Product-Analytics: conda list does not show all packages in environment - https://phabricator.wikimedia.org/T294368 (nshahquinn-wmf)
[17:14:39] Analytics, Analytics-SWAP, Product-Analytics: conda list does not show all packages in environment - https://phabricator.wikimedia.org/T294368 (nshahquinn-wmf)
[17:15:15] Analytics, Analytics-SWAP, Product-Analytics: conda list does not show all packages in environment - https://phabricator.wikimedia.org/T294368 (nshahquinn-wmf)
[17:17:06] Analytics, Analytics-SWAP, Product-Analytics: conda list does not show all packages in environment - https://phabricator.wikimedia.org/T294368 (nshahquinn-wmf)
[17:21:19] Analytics, Analytics-SWAP, Product-Analytics: conda list does not show all packages in environment - https://phabricator.wikimedia.org/T294368 (Ottomata) Hm, that is annoying, and looks like it is due to conda's 'stacking' support. Python knows how to import packages from the base environment, but c...
[17:22:02] stats.wikimedia.org now powers one of the tweets by one of my wikimedia related twitter bots :) https://twitter.com/addshore/status/1453048839280148483
[17:22:09] thanks for the slick API ;)
[17:22:19] \o/
[17:22:40] it'll tweet out every 10 million edits across all language wikipedias :)
[17:23:10] Thanks a lot addshore for having built that :)
[17:23:32] im looking around for other stats that might be interesting to tweet out as milestones now
[17:23:48] wondering if I need to register WikimediaMeter too xD
[17:23:58] hehe :)
[17:27:07] registered, now to think of what to put there xD
[17:28:03] Analytics, Data-Engineering, Event-Platform, Wikidata, and 3 others: Add MCR slot information to revision-create events - https://phabricator.wikimedia.org/T293195 (dcausse) This is blocked on https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/629406 which is required to support the new...
[17:34:07] Analytics, Data-Engineering, Event-Platform, Wikidata, and 3 others: Add MCR slot information to revision-create events - https://phabricator.wikimedia.org/T293195 (Ottomata) I was about to merge that today but then thought that your suggestion to ensure that properties validate with the addition...
[17:35:21] I'm going to register WikimeteorMeter
[17:56:27] i beat you to WikimediaMeter
[17:56:44] Can WikimeteorMeter be a daily progress bar to something about meteors? :D
[18:27:29] Analytics, Data-Services, Privacy Engineering, cloud-services-team (Kanban): Increased visibility in wiki-replicas for volunteers fighting vandals - https://phabricator.wikimedia.org/T284944 (nskaggs) @odimitrijevic I don't. If you want to explore some sampled queries you can look at the research...
[18:46:46] Analytics, Data-Services, Privacy Engineering, cloud-services-team (Kanban): Increased visibility in wiki-replicas for volunteers fighting vandals - https://phabricator.wikimedia.org/T284944 (sguebo_WMF) > @sguebo_WMF Is this data visible on the wikis? @odimitrijevic -- yes, it is. My take on...
[18:59:55] Analytics-Clusters, DC-Ops, SRE, ops-eqiad: (Need By: TBD) rack/setup/install an-test-coord1002 - https://phabricator.wikimedia.org/T293938 (Cmjohnson) @Jclark-ctr I noticed this moved out of D6, can you update task and netbox when you get a chance
[20:39:00] Analytics-Clusters, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban: Upgrade Matomo to latest upstream - https://phabricator.wikimedia.org/T275144 (Ottomata) Open→Resolved Today Razzi and I looked into this. We upgraded to Matomo 3.14.1. However, http://debian.matomo.org/...
[20:43:24] Analytics, Analytics-Kanban: Check home/HDFS leftovers of jmads - https://phabricator.wikimedia.org/T290715 (Ottomata) Via Slack, @MNovotny_WMF informed me that she'd like jmads' data archived in HDFS.
[20:52:39] Analytics-Radar, Fundraising-Backlog, Product-Analytics, Wikipedia-iOS-App-Backlog, and 2 others: Understand impact of Apple's Relay Service - https://phabricator.wikimedia.org/T289795 (SNowick_WMF)
[21:35:29] Analytics, EventStreams: Expose mediawiki/revision/tags-change in stream.wikimedia.org - https://phabricator.wikimedia.org/T294391 (Urbanecm)