[07:03:04] !log stop jupyter-kaywong-singleuser.service on stat1005 to allow puppet to clean up
[07:03:07] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:03:55] hello folks, puppet was complaining about not being able to clean up the user since it was holding a process
[07:13:47] Ah!
[07:13:49] thanks elukey
[07:13:55] good morning :)
[07:17:55] bonjour :)
[08:02:29] Analytics-Clusters, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, Product-Analytics: Upgrade Superset to 1.3 - https://phabricator.wikimedia.org/T288115 (elukey) Tried to quickly check and I noticed that presto dashboards were not working (due to the krb settings mentioning an-...
[08:04:56] bonjour
[08:05:07] which hadoop dataset has deleted (all) revisions in it? D:
[08:06:00] Hi addshore - when you ask for revisions, do you mean revision metadata or content?
[08:06:06] content
[08:06:27] addshore: we don't have them
[08:06:32] D: I see
[08:36:21] joal: when it's convenient for you, can we have a look together at this missing rows issue please?
[08:37:18] Hi btullis - In meeting now, will ping you when available
[08:37:29] Perfect, thanks.
[08:51:26] btullis: I have 10 mins now if you wish
[08:52:00] OK, let's just catch up quickly then. Don't want to take up too much of your time.
[08:52:05] BC?
[08:52:09] sure
[09:44:06] joal: and no plan to have them at all?
[09:44:23] (deleted revision content)
[09:46:21] addshore: plans - yes, resourced plans - no :)
[09:46:30] is there a ticket?
[09:46:55] I was trying to come up with a data set of the last English labels items had on wikidata before being deleted, to see if things regularly got recreated
[09:47:02] but that would require this data set :P
[09:47:11] at least I have all the queries ironed out now :D
[09:47:33] :)
[09:58:23] addshore: I can't find the ticket that describes the main idea (getting all created content on HDFS by doing API calls from events)
[10:26:37] The existing datasets come from dumps right? which is why they miss deleted things?
[10:27:06] I thought there was also something about sqooping data from sql tables though too? (But I guess content isn't there? So I guess we have the meta data, not content for all revisions)?
[10:27:28] addshore: you are absolutely right --^
[10:27:35] dang it :P okay :)
[11:36:59] Analytics-Clusters, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, vm-requests: Site: Eqiad - 1 VM request for analytics test cluster - coordinator replica role - https://phabricator.wikimedia.org/T289664 (BTullis) The request (T289784) for a physical machine to serve as an-test...
[12:12:52] btullis: I have tested all tables except the two big ones (mediarequest_per_file and pageviews_per_article_flat) - They are mostly complete except for the small number of missing rows we discussed earlier
[12:15:37] Right. I've done a scripted reload of all of them for the avoidance of doubt. The last of the `top_pageviews/data` one is at 75% and that's the last table that I've scripted.
[12:16:19] ack btullis - let me know when it's done, I can easily retest problematic dates first, and if they are solved, all dates (longer)
[12:16:20] So it does seem to confirm that there is something wrong with the method and that we need more instance snapshots to get a complete set of data.
[12:16:55] btullis: IMO it means we have issues with data-replication on the old cluster
[12:17:51] Analytics, Analytics-Kanban: Check AQS with cassandra (serving + data) - https://phabricator.wikimedia.org/T290068 (JAllemandou) Updates after Cassandra-2 data has been copied to the cassandra-3 cluster (1 rack) for small-ish tables (all except 2 tables): All tables are mostly complete, except for a smal...
[12:18:21] If I'm correct, the process we've followed is good in theory but in practice misses the rows with replication problems
[12:18:35] Yes, I see what you mean. You suggested that we carry out a test on `pageviews_per_project_v2/data` 1) Taking a *new* snapshot on the remaining 8 instances 2) Importing the `pageviews_per_project_v2/data` from all of the remaining 8 snapshots, 3) re-testing.
[12:19:28] Correct btullis - Another option is to run a 'repair table' on cassandra on the table (pageviews_per_project_v2/data) and retry with a single rack snapshot
[12:22:22] Yes, I see. It would probably need to be a *full* repair, I suppose. https://cassandra.apache.org/doc/latest/cassandra/operating/repair.html#full-repair-example
[12:22:35] > Full repair is typically needed to redistribute data after increasing the replication factor of a keyspace or after adding a node to the cluster.
[12:22:38] yes btullis - full repair, with -pr
[12:23:04] to only repair primary ranges, on all 4 instances of aqs100[47]
[12:24:13] but I thought hnowlan had done that before his trials of data movement months ago
[12:25:44] Do we know roughly how long a full repair takes across all keyspaces? Is it feasible for us to run a full repair against the biggest tables and have it complete in a reasonable time?
[12:26:18] joal: I don't believe I did a full repair on the old cluster, just cleanups
[12:26:56] Ah! my bad hnowlan :) We might have talked about it and I printed that you did it maybe :)
[12:27:13] btullis: for big tables it's gonna take a huge time
[12:27:31] not sure how useful this is as a datapoint but I think I did a full repair on one of the new nodes for all tables once a full import was complete and it took around 2 days
[12:27:37] btullis: we can test for the pageviews_per_project_v2/data first, see if it solves our issue, and then decide
[12:27:53] hm, not that bad
[12:29:45] By the way, that 4th import of `top_pageviews/data` finished.
[12:31:00] joal: I'm a bit confused by this:
[12:31:03] > to only repair primary ranges, on all 4 instances of aqs100[47]
[12:31:14] btullis: batcave?
[12:31:25] Cool.
[12:31:41] Analytics-Clusters, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, vm-requests: Site: Eqiad - 1 VM request for analytics test cluster - coordinator replica role - https://phabricator.wikimedia.org/T289664 (jcrespo) > cleaning up netbox and DNS There is a script for that that is...
[12:48:52] Analytics-Clusters, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, vm-requests: Site: Eqiad - 1 VM request for analytics test cluster - coordinator replica role - https://phabricator.wikimedia.org/T289664 (elukey) >>! In T289664#7358626, @BTullis wrote: > The request (T289784) f...
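[editor's note] The "full repair, with -pr ... on all 4 instances of aqs100[47]" suggestion above can be sketched as a small command generator. This is only an illustration under stated assumptions: the `nodetool-a` wrapper, the keyspace name and the host names are taken from the 13:48 task update later in this log, while the second instance suffix `b` and the exact flag order are guesses, not the commands that were actually run. `--full` and `-pr` are standard `nodetool repair` options.

```python
import shlex

# Keyspace and table for pageviews_per_project_v2/data, as named in the log.
KEYSPACE = "local_group_default_T_pageviews_per_project_v2"
TABLE = "data"


def repair_commands(hosts, instances=("a", "b")):
    """Return one (host, command) pair per Cassandra instance.

    Assumption: each host runs one instance per suffix, reachable through
    a `nodetool-<suffix>` wrapper. Only `nodetool-a` appears in the log;
    the `b` suffix is hypothetical.
    """
    commands = []
    for host in hosts:
        for instance in instances:
            argv = [
                "sudo", f"nodetool-{instance}", "repair",
                "--full",  # full (non-incremental) repair
                "-pr",     # only this node's primary ranges
                KEYSPACE, TABLE,
            ]
            commands.append((host, shlex.join(argv)))
    return commands


if __name__ == "__main__":
    hosts = ["aqs1004.eqiad.wmnet", "aqs1007.eqiad.wmnet"]
    for host, cmd in repair_commands(hosts):
        print(f"{host}: {cmd}")
```

With `-pr`, each instance repairs only the token ranges it is primary for, so the command has to be run on every instance for the repair to be complete, which is why the list has one entry per instance.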
[13:01:54] Analytics-Clusters, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, vm-requests: Site: Eqiad - 1 VM request for analytics test cluster - coordinator replica role - https://phabricator.wikimedia.org/T289664 (jcrespo) > I would personally try this road +1
[13:04:06] (PS1) GoranSMilovanovic: T291170 [analytics/wmde/WD/WikidataAdHocAnalytics] - https://gerrit.wikimedia.org/r/721516
[13:04:15] (CR) GoranSMilovanovic: [V: +2 C: +2] T291170 [analytics/wmde/WD/WikidataAdHocAnalytics] - https://gerrit.wikimedia.org/r/721516 (owner: GoranSMilovanovic)
[13:06:14] Analytics-Clusters, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, Product-Analytics: Upgrade Superset to 1.3 - https://phabricator.wikimedia.org/T288115 (elukey) I have tried to create a `presto_elukey` database entry for Presto, adding `{"connect_args":{"KerberosConfigPath":"/...
[13:14:43] addshore: q: would it be useful to have all revision wikitext content? or would you need template expanded html?
[13:15:05] for wikidata, i don't care about templates ;)
[13:15:37] And I haven't really thought about usecases outside of wikidata right now
[13:15:56] I could imagine if I thought about other possible things I'd want to do, only having HTML could end up being annoying
[13:16:11] but also, having HTML additionally could be nice
[13:21:14] Analytics, WMDE-Analytics-Engineering, Wikidata Analytics: Privacy Policy Review for Global South Wikidata edits and active editors datasets - https://phabricator.wikimedia.org/T291186 (GoranSMilovanovic)
[13:22:03] Analytics, WMDE-Analytics-Engineering, Wikidata Analytics: Privacy Policy Review for Global South Wikidata edits and active editors datasets - https://phabricator.wikimedia.org/T291186 (GoranSMilovanovic)
[13:22:41] Analytics, WMDE-Analytics-Engineering, Wikidata Analytics: Privacy Policy Review for Global South Wikidata edits and active editors datasets - https://phabricator.wikimedia.org/T291186 (GoranSMilovanovic)
[13:48:49] Analytics, Analytics-Kanban: Check AQS with cassandra (serving + data) - https://phabricator.wikimedia.org/T290068 (BTullis) I am running the following commands sequentially on aqs1004.eqiad.wmnet and aqs1007.eqiad.wmnet ` sudo nodetool-a repair --full local_group_default_T_pageviews_per_project_v2 data...
[14:51:37] addshore: ah sorry misunderstood your q, nm! joal answered all :)
[15:12:41] (CR) Dave Pifke: [C: -1] Add TLS support (2 comments) [analytics/statsv] - https://gerrit.wikimedia.org/r/721044 (https://phabricator.wikimedia.org/T290131) (owner: Dave Pifke)
[15:19:21] (PS2) Dave Pifke: Add TLS support [analytics/statsv] - https://gerrit.wikimedia.org/r/721044 (https://phabricator.wikimedia.org/T290131)
[15:26:51] Analytics-Clusters, Analytics-Kanban, observability, Patch-For-Review: Setup Analytics team in VO/splunk oncall - https://phabricator.wikimedia.org/T273064 (razzi) Open→In progress
[15:27:42] Analytics, Product-Analytics: Add MariaDB replicas to Superset - https://phabricator.wikimedia.org/T291195 (MNeisler)
[15:28:43] (CR) Dave Pifke: "Latest patch set seems to work in deployment-prep." [analytics/statsv] - https://gerrit.wikimedia.org/r/721044 (https://phabricator.wikimedia.org/T290131) (owner: Dave Pifke)
[15:47:58] Analytics, Patch-For-Review: Fix the open bugs for Hue - https://phabricator.wikimedia.org/T264896 (elukey) Open→Declined All the cloudera issues were self closed by a bot for inactivity (I tried to keep them alive for a few times). Airflow will likely push Hue out of our infrastructure, so I am...
[16:06:47] (CR) Krinkle: "@Filippo @Andrew I couldn't find anything in other production codebases about disabling hostname validation. Could one of you confirm that" [analytics/statsv] - https://gerrit.wikimedia.org/r/721044 (https://phabricator.wikimedia.org/T290131) (owner: Dave Pifke)
[16:08:28] Analytics-Clusters, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, Product-Analytics: Upgrade Superset to 1.3 - https://phabricator.wikimedia.org/T288115 (razzi) Interesting @elukey, perhaps it's a regression in 1.3; never ran into that error in any prior versions. Thanks for tr...
[16:12:54] (CR) Dave Pifke: Add TLS support (1 comment) [analytics/statsv] - https://gerrit.wikimedia.org/r/721044 (https://phabricator.wikimedia.org/T290131) (owner: Dave Pifke)
[16:14:19] Analytics, Analytics-Kanban: Check home/HDFS leftovers of jkatz - https://phabricator.wikimedia.org/T287235 (Ottomata) Open→Resolved
[16:14:43] (CR) Dave Pifke: Add TLS support (1 comment) [analytics/statsv] - https://gerrit.wikimedia.org/r/721044 (https://phabricator.wikimedia.org/T290131) (owner: Dave Pifke)
[16:16:50] (CR) Ottomata: [C: +1] Add num-partitions param to mw-history checkers [analytics/refinery/source] - https://gerrit.wikimedia.org/r/719290 (https://phabricator.wikimedia.org/T290469) (owner: Joal)
[16:18:56] (CR) Ottomata: [C: +1] "One nit but feel free to merge as is if you prefer." [analytics/refinery] - https://gerrit.wikimedia.org/r/719111 (https://phabricator.wikimedia.org/T290469) (owner: Joal)
[16:21:00] (CR) MNeisler: Add the content_translation_event stream to the allowlist (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/716339 (https://phabricator.wikimedia.org/T281511) (owner: MNeisler)
[16:24:09] (CR) Ottomata: [C: +1] Add TLS support [analytics/statsv] - https://gerrit.wikimedia.org/r/721044 (https://phabricator.wikimedia.org/T290131) (owner: Dave Pifke)
[16:28:12] (CR) Ottomata: [C: +1] Add TLS support (1 comment) [analytics/statsv] - https://gerrit.wikimedia.org/r/721044 (https://phabricator.wikimedia.org/T290131) (owner: Dave Pifke)
[16:30:57] Analytics, Product-Analytics: Add MariaDB replicas to Superset - https://phabricator.wikimedia.org/T291195 (Milimetric) We talked about this in the meeting, but good reminder from @elukey that Superset is accessible by everyone with an LDAP nda account, so we should openly discuss whether that's sufficie...
[16:33:14] Analytics, Product-Analytics: Add MariaDB replicas to Superset - https://phabricator.wikimedia.org/T291195 (elukey) Without 2FA anybody with a weak password can be problematic, I would prefer to use clouddb1020 since it contains only sanitized data (if possible).
[16:39:56] Analytics-Clusters, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, vm-requests: Site: Eqiad - 1 VM request for analytics test cluster - coordinator replica role - https://phabricator.wikimedia.org/T289664 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by btullis@...
[16:43:15] Analytics, Product-Analytics: Add MariaDB replicas to Superset - https://phabricator.wikimedia.org/T291195 (odimitrijevic) @MNeisler can you please add the use case(s) that this is requested for so that we can explore other possible solutions.
[16:46:03] Analytics, Product-Analytics: Add MariaDB replicas to Superset - https://phabricator.wikimedia.org/T291195 (odimitrijevic) p:Triage→Medium
[16:46:23] Quarry: make automatically deploy-able staging quarry - https://phabricator.wikimedia.org/T291204 (mdipietro)
[16:47:12] Quarry: make automatically deploy-able staging quarry - https://phabricator.wikimedia.org/T291204 (mdipietro)
[16:49:22] Analytics-Radar, Privacy Engineering, WMDE-Analytics-Engineering, Wikidata, Wikidata Analytics: Privacy Policy Review for Global South Wikidata edits and active editors datasets - https://phabricator.wikimedia.org/T291186 (odimitrijevic)
[16:51:59] Analytics-Radar, Privacy Engineering, WMDE-Analytics-Engineering, Wikidata, Wikidata Analytics: Privacy Policy Review for Global South Wikidata edits and active editors datasets - https://phabricator.wikimedia.org/T291186 (Milimetric) We are forwarding this request to #privacy_engineering but...
[16:52:03] Analytics, Analytics-Kanban: Check home/HDFS leftovers of fdans - https://phabricator.wikimedia.org/T290231 (Ottomata) a:Ottomata→None
[16:53:14] Analytics-Radar, Dumps-Generation, Wikidata, wdwb-tech: Proposal: Generate Wikidata JSON & RDF dumps from Hadoop - https://phabricator.wikimedia.org/T291089 (odimitrijevic)
[16:53:28] Analytics: Check home/HDFS leftovers of kaywong - https://phabricator.wikimedia.org/T291060 (odimitrijevic) p:Triage→High
[16:58:35] Analytics, Data-Engineering: LVS in Analytics VLANs - https://phabricator.wikimedia.org/T288750 (odimitrijevic) p:Triage→Low
[17:03:28] Analytics: hdfs directory for analytics-research - https://phabricator.wikimedia.org/T290918 (mforns)
[17:03:30] Analytics, Platform Team Workboards (Image Suggestion API): Airflow collaborations - https://phabricator.wikimedia.org/T282033 (mforns)
[17:04:42] (PS2) Milimetric: Stop retaining all GuidedTour events [analytics/refinery] - https://gerrit.wikimedia.org/r/715967 (https://phabricator.wikimedia.org/T288416)
[17:05:48] Analytics, Analytics-Kanban: hdfs directory for analytics-research - https://phabricator.wikimedia.org/T290918 (odimitrijevic) a:Ottomata
[17:07:12] Analytics, Analytics-Kanban: Improve Refine bad data handling - https://phabricator.wikimedia.org/T289003 (odimitrijevic) a:Milimetric→Ottomata
[17:08:35] Analytics, Product-Analytics: Add MariaDB replicas to Superset - https://phabricator.wikimedia.org/T291195 (MNeisler)
[17:09:15] Analytics, Analytics-Kanban, Patch-For-Review, Product-Analytics (Kanban): Add SearchSatisfaction to the allowlist - https://phabricator.wikimedia.org/T274607 (odimitrijevic) p:Medium→High a:MNeisler→mforns
[17:09:19] Analytics, Analytics-Kanban, Patch-For-Review, Product-Analytics (Kanban): Add SearchSatisfaction to the allowlist - https://phabricator.wikimedia.org/T274607 (odimitrijevic)
[17:09:25] Analytics, Product-Analytics: Add MariaDB replicas to Superset - https://phabricator.wikimedia.org/T291195 (MNeisler) @odimitrijevic - Sure, I've updated the task subscription with one current open request from the Editing team. I'll reach out to my team to see if there are others that would be useful to...
[17:10:28] Analytics-EventLogging, Analytics-Radar, Metrics-Platform: Consider how to best architect transmission of events - https://phabricator.wikimedia.org/T240454 (Ottomata)
[17:10:32] Analytics-EventLogging, Analytics-Radar, Metrics-Platform: Consider how to best architect transmission of events from Browser Client - https://phabricator.wikimedia.org/T240454 (odimitrijevic)
[17:12:27] Analytics, Analytics-Kanban, Traffic: Review use of realloc in varnishkafka - https://phabricator.wikimedia.org/T287561 (Ottomata) a:odimitrijevic
[17:14:20] Analytics-Clusters, DC-Ops, Data-Engineering, SRE, ops-eqiad: Q1:(Need By: TBD) rack/setup/install an-presto10[06-15] - https://phabricator.wikimedia.org/T290987 (odimitrijevic)
[17:16:06] Analytics, Analytics-Kanban: Investigate why gobblin pulls webrequest data late - https://phabricator.wikimedia.org/T290723 (JAllemandou) Open→Resolved
[17:16:42] Analytics, Data-Engineering, Data-Engineering-Kanban, Epic: Alluxio for Improved Superset Query Performance - https://phabricator.wikimedia.org/T288252 (Ottomata)
[17:17:27] Analytics-Radar, DC-Ops, Data-Engineering, SRE, ops-eqiad: Q1:(Need By: TBD) rack/setup/install an-presto10[06-15] - https://phabricator.wikimedia.org/T290987 (odimitrijevic)
[17:32:26] Analytics, Product-Analytics: Add MariaDB replicas to Superset - https://phabricator.wikimedia.org/T291195 (elukey) @MNeisler Hi :) I'd be very interested to know if any table with PII data will be needed, or if something like the mariadb sanitized replicas could be fine for your use case (https://wikite...
[17:39:21] Analytics: Agree on a repository structure for Airflow-related code - https://phabricator.wikimedia.org/T290664 (ACraze) mostly echoing @gmodena - I'm in favor of option 1. The ML Team has also been working with a monorepo for #lift-wing [[ https://gerrit.wikimedia.org/g/machinelearning/liftwing/inference-se...
[18:12:51] heya ottomata - Can I manually create the user for Fabian? Or would it impair your investigation?
[18:14:24] Analytics-Radar, Privacy Engineering, WMDE-Analytics-Engineering, Wikidata, Wikidata Analytics: Privacy Policy Review for Global South Wikidata edits and active editors datasets - https://phabricator.wikimedia.org/T291186 (GoranSMilovanovic) @Milimetric Thank you Dan.
[18:25:45] joal: i think that's fine, i think i can experiment, go ahead!
[18:25:54] ack! doing ottomata - thanks
[18:25:55] i can probably test with the other system user
[18:30:04] !log Create HDFS home folder for user 'analytics-research'
[18:30:08] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:34:51] (CR) Ottomata: [C: +1] Stop retaining all GuidedTour events [analytics/refinery] - https://gerrit.wikimedia.org/r/715967 (https://phabricator.wikimedia.org/T288416) (owner: Milimetric)
[18:57:11] oh mforns still there?
[18:57:17] yes
[18:57:22] sup?
[18:57:23] ok so
[18:57:37] ssh -t -N -L8000:an-web1001.eqiad.wmnet:80 an-web1001.eqiad.wmnet
[18:57:46] http://localhost:8000/
[18:57:55] you need to have your browser override the host header though
[18:58:11] for that i use
[18:58:11] https://modheader.com/
[18:58:24] ok
[18:58:26] set Host to either
[18:58:28] stats.wikimedia.org
[18:58:33] analytics.wikimedia.org
[18:58:35] datasets.wikimedia.org
[18:58:41] heya ottomata - We need those system users to be in analytics-private-data-users group :)
[18:58:43] and just verify that things look about right to you
[18:58:47] joal, aren't they?
[18:58:55] if they aren't that's why the home dir wasn't created!!!
[18:58:59] looking
[18:59:35] THEY ARE NOT
[18:59:39] welp that answers it!
[19:00:16] ottomata: \o/ solving that will also solve the yarn-launching problem Fabian is facing
[19:00:25] fixing now
[19:00:54] great :) ottomata can you please let Fabian know it's done?
[19:01:40] I can tell him if you want
[19:13:19] Ok gone for tonight - see you tomorrow team
[19:15:52] ok mforns should be done
[19:15:55] where is fab....
[19:16:29] hey ottomata I'm looking at those sites, and can not find anything weird, all I have tested works
[19:16:31] Analytics, Analytics-Kanban: hdfs directory for analytics-research - https://phabricator.wikimedia.org/T290918 (Ottomata) Ah! These users were not properly added to the analytics-privatedata-users group. Done.
[19:16:34] what am I looking for?
[19:16:45] do you expect anything in particular to fail?
[19:16:47] mforns: anything weird!
[19:16:50] no i expect it to work
[19:16:54] just wanted a second pair of eyes
[19:16:57] couldn't find anything!
[19:17:02] i'm going to proceed then mforns
[19:17:09] going to do another rsync and switchover public routing
[19:18:44] ottomata: the only minor thing that is broken is the "Code for this page can be seen here: " link, but that is also broken in prod...
[19:18:51] oh ok
[19:18:58] welp if broken in prod oh well
[19:19:15] actually, the link works, but is invisible O.o
[19:25:48] !log pointing analytics-web cname at new an-web1001, this moves stats and analytics .wm.org from thorium to an-web1001 - T285355
[19:25:51] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:25:51] T285355: Set up an-web1001 and decommission thorium - https://phabricator.wikimedia.org/T285355
[19:32:23] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Set up an-web1001 and decommission thorium - https://phabricator.wikimedia.org/T285355 (Ottomata) stats.wikimedia.org and analytics.wikimedia.org are now being served by an-web1001
[19:40:07] Analytics, Product-Analytics, Editing-team (Tracking): Add MariaDB replicas to Superset - https://phabricator.wikimedia.org/T291195 (ppelberg)
[19:41:23] Analytics, Product-Analytics, Editing-team (Tracking): Add MariaDB replicas to Superset - https://phabricator.wikimedia.org/T291195 (ppelberg)
[19:43:13] Analytics, Product-Analytics, Editing-team (Tracking): Add MariaDB replicas to Superset - https://phabricator.wikimedia.org/T291195 (ppelberg)
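[editor's note] The pre-cutover check described at 18:57 (SSH tunnel to an-web1001, then overriding the Host header in the browser) could also be scripted. The sketch below is a hedged illustration, not what was run in the channel: it assumes the tunnel from the log is already open on `localhost:8000`, and the function names `build_request` and `check_sites` are made up for this example. Only the three site names and the tunnel URL come from the log.

```python
import urllib.request

# Local end of the tunnel opened with:
#   ssh -t -N -L8000:an-web1001.eqiad.wmnet:80 an-web1001.eqiad.wmnet
TUNNEL_URL = "http://localhost:8000/"

# Virtual hosts named in the log.
SITES = (
    "stats.wikimedia.org",
    "analytics.wikimedia.org",
    "datasets.wikimedia.org",
)


def build_request(site, url=TUNNEL_URL):
    """Build a request to the tunnel that presents `site` as the Host header."""
    return urllib.request.Request(url, headers={"Host": site})


def check_sites(sites=SITES, timeout=10):
    """Fetch each site through the tunnel and return {site: HTTP status}.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a
    returned dict means every site answered successfully.
    """
    results = {}
    for site in sites:
        with urllib.request.urlopen(build_request(site), timeout=timeout) as resp:
            results[site] = resp.status
    return results
```

Calling `check_sites()` with the tunnel open returns the status code per site. A status check only shows that the new host answers for each name; it does not replace the visual "does this look right" review that was asked for in the channel.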