[00:07:03] Analytics, Code-Health-Objective, Epic, Platform Engineering Roadmap, and 2 others: AQS 2.0 - https://phabricator.wikimedia.org/T263489 (Clarakosi)
[03:02:04] RECOVERY - Check unit status of refinery-import-page-history-dumps on an-launcher1002 is OK: OK: Status of the systemd unit refinery-import-page-history-dumps https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[06:52:04] PROBLEM - Check unit status of refinery-import-page-history-dumps on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit refinery-import-page-history-dumps https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[07:13:11] Analytics, Dumps-Generation: xmldatadumps dumpstatus.json files only readable by root - https://phabricator.wikimedia.org/T287989 (ArielGlenn) @Ottomata I only found two owned by root, and I have fixed them up: hywwiki and ugwikibooks. The process that writes them is running as the dumpsgen user, so it c...
[08:14:41] (PS2) David Caro: docs: added docker compose link and minor rewording [analytics/quarry/web] - https://gerrit.wikimedia.org/r/709951
[08:14:44] (CR) David Caro: docs: added docker compose link and minor rewording (2 comments) [analytics/quarry/web] - https://gerrit.wikimedia.org/r/709951 (owner: David Caro)
[08:53:43] (CR) David Caro: add stop query function (1 comment) [analytics/quarry/web] - https://gerrit.wikimedia.org/r/710067 (https://phabricator.wikimedia.org/T71037) (owner: Michael DiPietro)
[09:11:19] Small request so that our team can merge patches: https://gerrit.wikimedia.org/r/c/analytics/reportupdater-queries/+/709646
[09:28:36] Quarry: quarry explain not working since move to multiple databases - https://phabricator.wikimedia.org/T288170 (Aklapper)
[12:50:14] hi team :]
[12:50:29] Hi mforns.
:-)
[13:45:00] Analytics-Clusters, Analytics-Kanban, User-MoritzMuehlenhoff: Improve user experience for Kerberos by creating automatic token renewal service - https://phabricator.wikimedia.org/T268985 (BTullis) I think I need to seek some input from SRE on this, as to what is the best way to proceed. I'm trying to...
[14:47:49] Analytics: SPIKE - Will Hadoop 3 container support help us for Airflow deployment pipelines? - https://phabricator.wikimedia.org/T288247 (Ottomata)
[14:48:24] Analytics: SPIKE - Will Hadoop 3 container support help us for Airflow deployment pipelines? - https://phabricator.wikimedia.org/T288247 (Ottomata)
[14:48:26] Analytics, Platform Team Workboards (Image Suggestion API): Airflow collaborations - https://phabricator.wikimedia.org/T282033 (Ottomata)
[15:16:50] Analytics, Data-Engineering: SPIKE - Will Hadoop 3 container support help us for Airflow deployment pipelines? - https://phabricator.wikimedia.org/T288247 (odimitrijevic)
[15:18:03] Quarry: update python for quarry - https://phabricator.wikimedia.org/T288249 (mdipietro)
[15:19:10] Analytics, Analytics-Kanban, Epic, Patch-For-Review: Replace Camus by Gobblin - https://phabricator.wikimedia.org/T271232 (odimitrijevic)
[15:26:12] Analytics, Data-Engineering, Data-Engineering-Kanban, Epic: Gobblin Monitoring - https://phabricator.wikimedia.org/T287991 (odimitrijevic) p:Triage→High
[15:27:18] Analytics, Data-Engineering, Better Use Of Data, Data-Engineering-Kanban, Product-Analytics: Upgrade Superset to 1.2 - https://phabricator.wikimedia.org/T288115 (odimitrijevic)
[15:28:06] Analytics, Data-Engineering, Data-Engineering-Kanban: Push Gobblin import metrics to Prometheus and add alerts on some critical imports - https://phabricator.wikimedia.org/T286503 (odimitrijevic)
[15:34:25] If I need to disable puppet temporarily on a node in the test cluster, am I right in thinking that it's just `sudo puppet agent
--disable` - No special WMF sauce for this?
[15:35:38] Analytics, Data-Engineering, Data-Engineering-Kanban, Epic: Alluxio for Improved Superset Query Performance - https://phabricator.wikimedia.org/T288252 (odimitrijevic)
[15:37:28] btullis: there's a wrapper `sudo disable-puppet` that forces you to add a message and auto adds your username
[15:37:55] Perfect. Thanks majavah.
[15:46:08] btullis: o/ usually we also add a msg like "elukey - testing something something"
[15:46:20] so if the host is left disabled there is a poc
[15:46:27] Analytics, Analytics-Kanban, Patch-For-Review: jupyter notebook causing syslog/etc.. to fill up with error messages - https://phabricator.wikimedia.org/T287339 (BTullis) After a little testing and tweaking I got the configuration to be passed through to the transient unit file. ` '-c.SystemdSpawner....
[15:46:34] (otherwise it is more difficult to ping people etc..)
[15:49:01] Great, thanks. I had forgotten that it was run from cron, not the service and not a systemd timer either; so my test changes got overwritten. :-)
[16:03:22] a-team standup!
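Note on the puppet exchange above (15:34 to 15:46): the convention described is to disable the agent with a message of the form "username - reason", so that a host left disabled has a point of contact. A minimal Python sketch of that convention follows. It only builds the command line and does not run it. The helper name `build_disable_command` is hypothetical; it relies on plain `puppet agent --disable`, which accepts an optional message, because the log does not show the interface of the `disable-puppet` wrapper.

```python
import getpass
import shlex
from typing import List, Optional


def build_disable_command(reason: str, user: Optional[str] = None) -> List[str]:
    """Return argv for disabling the puppet agent with an attributed message.

    Hypothetical helper: it mirrors the "username - reason" convention from
    the chat, not any existing WMF tool.
    """
    if not reason.strip():
        raise ValueError("a reason is required so others know why puppet is disabled")
    # Fall back to the invoking user's login name when none is given.
    who = user or getpass.getuser()
    # `puppet agent --disable` takes the message as an optional argument.
    return ["sudo", "puppet", "agent", "--disable", f"{who} - {reason.strip()}"]


if __name__ == "__main__":
    # Print a shell-quoted command line for review before running it by hand.
    print(shlex.join(build_disable_command("testing something", user="elukey")))
```

Where the `disable-puppet` wrapper is available it is probably the better choice, since per the chat it enforces the message and adds the username itself.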
[16:04:33] razzi i sent an email about those refinery-import jobs
[16:04:35] see also https://phabricator.wikimedia.org/T287989
[16:04:38] need to follow up
[16:24:47] ottomata: about to ask service-ops about the analytics_base_url in deployment_charts, had a question or two I wanted to get clear first tho
[16:24:50] Analytics-Radar, Product-Analytics, Growth-Team (Current Sprint): Add geolocation information to Growth schemas - https://phabricator.wikimedia.org/T287121 (mewoph)
[16:24:50] this was added early march in https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/667561 (see also dependent patch https://gerrit.wikimedia.org/r/c/research/mwaddlink/+/667553) by service-ops
[16:24:59] looks like the goal was to avoid going through edge caches for various reasons, and just talk to the production instance directly
[16:25:26] Analytics-Radar, Product-Analytics, Growth-Team (Current Sprint): Add geolocation information to Growth schemas - https://phabricator.wikimedia.org/T287121 (mewoph) a:mewoph
[16:27:27] ryankemper: huh.
[16:27:28] that makes sense
[16:27:40] i guess we don't have a way to get to the service aside from hostname then
[16:27:44] unless we make a cname?
[16:28:08] ryankemper: or.
[16:28:11] yup right that's my question basically
[16:28:18] it is possible to get puppet to populate some helm values
[16:28:55] possibly dumb question: why does puppet need to know that?
[16:29:11] if right now this value is hardcoded in `deployment-charts`
[16:29:11] well, something needs to know what the host should be
[16:29:16] could be done via cname
[16:29:24] or make puppet be the one who knows
[16:29:29] ideally it would be in one place either way
[16:29:31] maybe cname is cleaner
[16:29:53] leaning towards cname I think
[16:29:56] yeah
[16:30:10] ryankemper: i guess something should be authoritative, and this value in deployment-charts is not it :)
[16:30:12] so puppet or cname
[16:30:22] cname probably cleaner, since really that's what we use for the public routing anyway
[16:30:29] just need one for internal routing too
[16:30:39] so ya ok, let's add a cname, we probably only need one for analytics
[16:30:45] so...analytics.eqiad.wmnet ?
[16:31:27] I like the sound of that
[16:31:46] and we'll just point that to `http://an-web1001.eqiad.wmnet/published/datasets/one-off/research-mwaddlink/`? [after an-web1001 is setup ofc]
[16:31:59] ya, the cname will point to an-web1001
[16:32:05] er yeah just the an-web1001 part
[16:32:07] yeah
[16:32:16] any place that needs to access that internally will use analytics.eqiad.wmnet
[16:32:36] okay cool that clears things up
[16:34:29] Analytics, Dumps-Generation: xmldatadumps dumpstatus.json files only readable by root - https://phabricator.wikimedia.org/T287989 (Ottomata) Interesting, this is still happening, but the example I see right now isn't owned by root, but is only readable by dumpsgen: ` 16:33:38 [@labstore1006:/home/otto]...
[16:36:12] Analytics-Radar, SRE, ops-eqiad: Try to move some new analytics worker nodes to different racks - https://phabricator.wikimedia.org/T276239 (Cmjohnson) We are close to moving these to A7 now. Several MW's have been decom'd and John and I need to get them out of the rack. Looking to have this done...
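Note on the cname discussion above (16:27 to 16:32): the point agreed on is that internal clients should refer to one stable name rather than to a specific backend host, so that moving the service only means repointing the CNAME. A rough Python sketch of what a client-side setting could look like under that plan follows. The constant and function names are hypothetical, `analytics.eqiad.wmnet` is only the name proposed in the chat (it may not exist yet), and the path is the one quoted at 16:31:46.

```python
from urllib.parse import urlunsplit

# Assumption: the internal CNAME proposed in the chat. At the time of this
# log it was only a proposal, pending the setup of an-web1001.
ANALYTICS_INTERNAL_HOST = "analytics.eqiad.wmnet"


def internal_dataset_url(
    path: str,
    host: str = ANALYTICS_INTERNAL_HOST,
    scheme: str = "http",
) -> str:
    """Build an internal URL from the single authoritative hostname.

    Callers pass only the path; the backend host (an-web1001 in the chat)
    never appears in client configuration.
    """
    # Normalise to exactly one leading slash so both "a/b" and "/a/b" work.
    normalized = "/" + path.lstrip("/")
    return urlunsplit((scheme, host, normalized, "", ""))


if __name__ == "__main__":
    print(internal_dataset_url("published/datasets/one-off/research-mwaddlink/"))
```

Keeping the hostname in one constant reflects the "ideally it would be in one place either way" remark; the same effect could be had by having puppet populate the helm value, which was the alternative raised in the chat.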
[16:37:21] (PS1) Michael DiPietro: upgrade quarry to python 3.9 [analytics/quarry/web] - https://gerrit.wikimedia.org/r/710305 (https://phabricator.wikimedia.org/T288249)
[16:46:32] Analytics-Radar, SRE, ops-eqiad: Try to move some new analytics worker nodes to different racks - https://phabricator.wikimedia.org/T276239 (Ottomata) Thank you!
[17:07:48] Analytics, Event-Platform, EventStreams: EventStreams (via KafkaSSE) does not consume from newly added partitions in topic - https://phabricator.wikimedia.org/T173006 (Ottomata)
[17:12:45] Analytics: Make it possible to use anaconda + stacked conda envs for Airflow executors - https://phabricator.wikimedia.org/T288271 (Ottomata)
[17:16:14] Analytics, Event-Platform, EventStreams: EventStreams (via KafkaSSE) does not consume from newly added partitions in topic - https://phabricator.wikimedia.org/T173006 (RBrounley_WMF) a:Protsack.stephan
[17:16:41] Analytics, Event-Platform, EventStreams: EventStreams (via KafkaSSE) does not consume from newly added partitions in topic - https://phabricator.wikimedia.org/T173006 (RBrounley_WMF) a:Protsack.stephan→None
[17:17:18] Analytics, Event-Platform, EventStreams: EventStreams (via KafkaSSE) does not consume from newly added partitions in topic - https://phabricator.wikimedia.org/T173006 (RBrounley_WMF) Hah meant to add @Protsack.stephan as a subscriber not owner
[17:20:43] (PS1) MewOphaswongse: Add client_ip to Growth schemas [schemas/event/secondary] - https://gerrit.wikimedia.org/r/710310 (https://phabricator.wikimedia.org/T287121)
[17:24:17] Analytics-Radar, Product-Analytics, Growth-Team (Current Sprint), Patch-For-Review: Add geolocation information to Growth schemas - https://phabricator.wikimedia.org/T287121 (mewoph) Sample payload from updated HomepageVisit, HomepageModule and HelpPanel schemas HomepageVisit ` "http": {...
[18:03:53] Analytics, Analytics-Kanban, Patch-For-Review: jupyter notebook causing syslog/etc.. to fill up with error messages - https://phabricator.wikimedia.org/T287339 (BTullis) Confirmed, there is no variable interpolation done by the SystemdSpawner. https://github.com/jupyterhub/systemdspawner#unit_extra_p...
[18:08:55] Analytics, Analytics-Wikistats, Wikidata, Research (FY2021-22-Research-July-Sept): Identifying controversial content in Wikidata - https://phabricator.wikimedia.org/T287946 (Manuel)
[18:09:08] Analytics, Wikidata, Wikidata Analytics, Research (FY2021-22-Research-July-Sept): Identifying controversial content in Wikidata - https://phabricator.wikimedia.org/T287946 (Manuel)
[18:17:05] btullis: we know the guy who wrote SystemdSpawner
[18:17:16] and i'm certain he would accept a patch to make the extra properties interpolated
[18:17:52] seeing as other configs do it
[18:17:56] i'd assume it wouldn't be hard
[18:19:17] Analytics, Analytics-Kanban, Patch-For-Review: jupyter notebook causing syslog/etc.. to fill up with error messages - https://phabricator.wikimedia.org/T287339 (Ottomata) @yuvipanda could/should we submit a patch to make SystemdSpawner interpolate `unit_extra_properties`?
[18:21:17] Analytics-Radar, Wikipedia-Android-App-Backlog (Android Release FY2021-22): android image_recommendation_interaction error - https://phabricator.wikimedia.org/T284620 (Sharvaniharan) Thank you for the heads-up @odimitrijevic . We do not need the data anymore, so it sounds good.
[18:21:57] Analytics, Analytics-Kanban, Patch-For-Review: jupyter notebook causing syslog/etc.. to fill up with error messages - https://phabricator.wikimedia.org/T287339 (Ottomata) @btullis it looks like just calling `self._expand_user_vars` on each of the `self.unit_extra_properties` values on [[ https://gith...
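Note on the SystemdSpawner suggestion above (18:17 to 18:21): the idea is to run each `unit_extra_properties` value through the same user-variable expansion that other templated settings already get. The standalone Python sketch below illustrates that idea without importing systemdspawner. It assumes the placeholders are `{USERNAME}` and `{USERID}` expanded via `str.format`; the function names and the example property are hypothetical, and this is not the actual patch.

```python
from typing import Dict


def expand_user_vars(template: str, username: str, userid: int) -> str:
    """Fill in per-user placeholders in a single template string.

    Assumption: placeholders are {USERNAME} and {USERID}, expanded with
    str.format, which is how the chat's `_expand_user_vars` is presumed to
    behave.
    """
    return template.format(USERNAME=username, USERID=userid)


def expand_unit_extra_properties(
    properties: Dict[str, str], username: str, userid: int
) -> Dict[str, str]:
    """Return a copy of the properties with every value interpolated.

    Only values are expanded; property names are passed through unchanged.
    """
    return {
        name: expand_user_vars(value, username, userid)
        for name, value in properties.items()
    }


if __name__ == "__main__":
    # Hypothetical property, chosen only to show a per-user value.
    configured = {"SyslogIdentifier": "jupyter-{USERNAME}"}
    print(expand_unit_extra_properties(configured, "btullis", 1234))
```

One thing a real patch would need to consider: with `str.format`, a property value containing literal braces has to escape them as `{{` and `}}`, otherwise the expansion raises an error.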
[18:24:50] (CR) Ottomata: [C: +1] Add client_ip to Growth schemas [schemas/event/secondary] - https://gerrit.wikimedia.org/r/710310 (https://phabricator.wikimedia.org/T287121) (owner: MewOphaswongse)
[18:26:13] Analytics-Radar, Wikipedia-Android-App-Backlog (Android Release FY2021-22): android image_recommendation_interaction error - https://phabricator.wikimedia.org/T284620 (Sharvaniharan) >>! In T284620#7243224, @Ottomata wrote: > Let's leave the schema for now, you can leave the patch open and we'll see abou...
[18:26:38] Analytics-Radar, Wikipedia-Android-App-Backlog (Android Release FY2021-22): android image_recommendation_interaction error - https://phabricator.wikimedia.org/T284620 (Ottomata) Ok! thanks!
[18:27:59] Analytics-Radar, Wikipedia-Android-App-Backlog (Android Release FY2021-22): android image_recommendation_interaction decommissioning - https://phabricator.wikimedia.org/T284620 (Sharvaniharan)
[18:29:17] Analytics-Radar, Wikipedia-Android-App-Backlog: android image_recommendation_interaction decommissioning - https://phabricator.wikimedia.org/T284620 (Sharvaniharan)
[18:31:03] Analytics, Analytics-Kanban: Fix default ownership and permissions for Hive managed databases in /user/hive/warehouse - https://phabricator.wikimedia.org/T280175 (Ottomata) @Mayakp.wiki asked that her db and tables be owned by analytics-product. Done: ` sudo -u hdfs hdfs dfs -chown -R analytics-product...
[18:40:34] Analytics, Analytics-Kanban, Patch-For-Review: jupyter notebook causing syslog/etc.. to fill up with error messages - https://phabricator.wikimedia.org/T287339 (BTullis) Oh nice. Thanks for that. I didn't think of patching it. Happy to do so.
[18:43:18] Hello! Do we have a pageviews-like info, but for special pages? Ideally something that works irrespective of the alias used to access the special page. Thanks!
[18:43:56] milimetric: ^ maybe?
[18:44:51] (responding)
[18:45:57] Analytics, Analytics-Kanban, Patch-For-Review: jupyter notebook causing syslog/etc.. to fill up with error messages - https://phabricator.wikimedia.org/T287339 (BTullis) I did wonder whether or not Logstash would be a good target for these user notebook logs. Given that we already have quite a bit o...
[18:49:04] urbanecm: we just have basic stats on special pages (no grouping of aliases). Here's an example query that gets hits for three special pages, you can filter to see if the ones you're interested in show up: https://w.wiki/3nZW
[18:49:59] this is the webrequest dataset, but if you have access to pageview hourly it's there too. We do exclude certain Special pages from pageviews hourly, where the title/URI Path would divulge potentially sensitive information
[18:51:11] Analytics, Analytics-Kanban, Patch-For-Review: jupyter notebook causing syslog/etc.. to fill up with error messages - https://phabricator.wikimedia.org/T287339 (Ottomata) Sure!
[18:53:35] Analytics, Analytics-Kanban: Fix default ownership and permissions for Hive managed databases in /user/hive/warehouse - https://phabricator.wikimedia.org/T280175 (Ottomata) Ah, I reverted ^. Maya had meant the tables in wmf_product should be analytics-product owned: ` sudo -u hdfs hdfs dfs -chown -R an...
[18:53:54] thanks milimetric, this is helpful. I'll look into it!
[19:05:38] milimetric: https://www.alluxio.io/blog/building-high-performance-data-lake-using-apache-hudi-and-alluxio-at-t3go/
[19:10:06] nice ottomata, and there's a Presto Iceberg connector, so we're good.
It feels like some of the pieces are falling in place in the big data world
[19:10:13] yeah
[19:10:23] come a long way since Pig was a suggested query language (yuuuck)
[19:10:43] picturing kafka -> iceberg with hive metastore and alluxio on top
[19:11:07] cataloged by atlast
[19:11:10] atlas
[19:11:16] there's your shared data platform cold storage
[19:16:29] I think that's pretty much the common picture, I wonder if anyone disagrees
[20:01:35] ryankemper: heyo, i'm going to be out for 3 weeks
[20:01:44] anything i can help with before I go?
[20:01:59] feel free to get btullis and/or razzi to do any reviews
[20:02:01] elukey: can review things too
[20:10:19] (PS1) DLynch: EditAttemptStep: Add a new integration [schemas/event/secondary] - https://gerrit.wikimedia.org/r/710348 (https://phabricator.wikimedia.org/T270636)
[20:15:07] (PS1) MewOphaswongse: Add client_ip to serversideaccountcreation schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/710349 (https://phabricator.wikimedia.org/T287121)
[20:37:59] ottomata: razzi and I are pairing today so I should be good, I'll reach out to elukey or btullis if necessary as well
[20:38:04] have a good vacation!
[20:38:16] great!
[20:58:17] (PS1) Andrew Bogott: tox.ini: update to work with default buster tox version [analytics/quarry/web] - https://gerrit.wikimedia.org/r/710357
[20:59:45] (CR) Andrew Bogott: [C: -1] "Let's not actually merge this until we decide what our deployment platform will be."
[analytics/quarry/web] - https://gerrit.wikimedia.org/r/710357 (owner: Andrew Bogott)
[21:30:29] Analytics, Code-Health-Objective, Epic, Platform Engineering Roadmap, and 2 others: AQS 2.0 - https://phabricator.wikimedia.org/T263489 (Eevans)
[21:48:55] Analytics, Platform Engineering Roadmap, User-Eevans: AQS 2.0: Implement pageviews endpoints - https://phabricator.wikimedia.org/T288296 (Eevans)
[21:51:59] Analytics, Platform Engineering Roadmap, User-Eevans: AQS 2.0: Implement the unique devices endpoints - https://phabricator.wikimedia.org/T288298 (Eevans)
[21:52:31] Analytics, Platform Engineering Roadmap, User-Eevans: AQS 2.0: Implement the unique devices endpoints - https://phabricator.wikimedia.org/T288298 (Eevans)
[21:53:49] Analytics, Platform Engineering Roadmap, User-Eevans: AQS 2.0: Implement the unique devices endpoints - https://phabricator.wikimedia.org/T288298 (Eevans)
[21:53:51] Analytics, Platform Engineering Roadmap, User-Eevans: AQS 2.0: Implement pageviews endpoints - https://phabricator.wikimedia.org/T288296 (Eevans)
[21:53:53] Analytics, Code-Health-Objective, Epic, Platform Engineering Roadmap, and 2 others: AQS 2.0 - https://phabricator.wikimedia.org/T263489 (Eevans)
[21:54:49] Analytics, Platform Engineering Roadmap, User-Eevans: AQS 2.0: Implement wikistats 2 endpoints - https://phabricator.wikimedia.org/T288301 (Eevans)
[21:55:36] Analytics, Platform Engineering Roadmap, User-Eevans: AQS 2.0: Implement wikistats 2 endpoints - https://phabricator.wikimedia.org/T288301 (Eevans)
[21:57:13] Analytics, Platform Engineering Roadmap, User-Eevans: AQS 2.0: Implement mediarequests endpoints - https://phabricator.wikimedia.org/T288303 (Eevans)
[21:59:20] Analytics, Platform Engineering Roadmap, User-Eevans: AQS 2.0: Implement geoeditors endpoints - https://phabricator.wikimedia.org/T288305 (Eevans)
[22:00:04] Analytics-Clusters, Analytics-Kanban: Set
up an-web1001 and decommission thorium - https://phabricator.wikimedia.org/T285355 (RKemper) Realized I never linked the gerrit patches to this ticket. See https://gerrit.wikimedia.org/r/c/operations/puppet/+/709822 and https://gerrit.wikimedia.org/r/c/operations...
[22:00:21] Analytics-Clusters, Analytics-Kanban: Set up an-web1001 and decommission thorium - https://phabricator.wikimedia.org/T285355 (RKemper) test
[23:01:04] (CR) Nettrom: [C: +1] "Looks good to me" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/710349 (https://phabricator.wikimedia.org/T287121) (owner: MewOphaswongse)
[23:02:00] (CR) Nettrom: [C: +1] "Looks good to me" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/710310 (https://phabricator.wikimedia.org/T287121) (owner: MewOphaswongse)
[23:33:16] (CR) Gergő Tisza: [C: +2] Add client_ip to serversideaccountcreation schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/710349 (https://phabricator.wikimedia.org/T287121) (owner: MewOphaswongse)
[23:34:41] (Merged) jenkins-bot: Add client_ip to serversideaccountcreation schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/710349 (https://phabricator.wikimedia.org/T287121) (owner: MewOphaswongse)
[23:44:34] (CR) Gergő Tisza: [C: +2] Add client_ip to Growth schemas [schemas/event/secondary] - https://gerrit.wikimedia.org/r/710310 (https://phabricator.wikimedia.org/T287121) (owner: MewOphaswongse)
[23:45:09] (Merged) jenkins-bot: Add client_ip to Growth schemas [schemas/event/secondary] - https://gerrit.wikimedia.org/r/710310 (https://phabricator.wikimedia.org/T287121) (owner: MewOphaswongse)