[04:28:12] PROBLEM - Check unit status of monitor_refine_event_sanitized_analytics_delayed on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_event_sanitized_analytics_delayed https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[06:26:02] Analytics-Radar, Data-Engineering, Event-Platform, SRE: Allow kafka clients to verify brokers hostnames when using SSL - https://phabricator.wikimedia.org/T291905 (elukey) Thanks for all the work John! >>! In T291905#7435523, @jbond wrote: >>>! In T291905#7431136, @elukey wrote: >> Let me know y...
[07:14:23] !log Rerun cassandra-daily-wf-local_group_default_T_mediarequest_top_files-2021-10-17
[07:14:25] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:54:01] Hi btullis - All failed jobs have been rerun for cassandra loading, we're back to normal - However there still are a lot ongoing compactions from what I can see on the dashboard - let's maybe wait for them to finalize before we restart loading snapshots?
[08:18:22] joal: Thanks. That's good news. Yes I think that's a good idea. I still have 4 snapshots to copy to an-presto1001 to clear space on the aqs_next cluster, so I can do these without impacting Cassandra I think, then wait for the compactions to finish.
[08:20:46] Analytics, Analytics-Kanban, Data-Engineering: Snapshot and Reload cassandra2 pageview_per_article data table from all 12 instances - https://phabricator.wikimedia.org/T291472 (BTullis) Transfers of `aqs1012-a` and `aqs1013-a` completed successfully. ` btullis@cumin1001:~$ sudo transfer.py aqs1012.eq...
[08:21:17] great btullis :)
[08:24:53] Analytics, Analytics-Kanban, Data-Engineering: Snapshot and Reload cassandra2 pageview_per_article data table from all 12 instances - https://phabricator.wikimedia.org/T291472 (BTullis) Starting the next two transfer operations now. `aqs1014-b` and `aqs1015-a` ` btullis@cumin1001:~$ sudo transfer.py...
[08:46:34] Analytics, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Test Alluxio as cache layer for Presto - https://phabricator.wikimedia.org/T266641 (BTullis) I have deployed a patch to the alluxio resources from puppet, given that it's not going to be able to meet our...
[09:35:19] (PS7) DCausse: Add fragment/mediawiki/revision/slot [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195)
[09:38:41] (CR) DCausse: "Thanks for the review!" [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195) (owner: DCausse)
[10:01:05] Analytics-Radar, Data-Engineering, Event-Platform, SRE: Allow kafka clients to verify brokers hostnames when using SSL - https://phabricator.wikimedia.org/T291905 (jbond) > One question - when I use profile::pki::get_cert do I need to do any extra work before using a label like kafka (that I assu...
[10:02:02] Analytics-Radar, Data-Engineering, Event-Platform, SRE: Allow kafka clients to verify brokers hostnames when using SSL - https://phabricator.wikimedia.org/T291905 (elukey) >>! In T291905#7439824, @jbond wrote: >> One question - when I use profile::pki::get_cert do I need to do any extra work befo...
[10:03:27] elukey: feel free to ping here with any questions re ^^
[10:05:31] sure!! I hope to send a code review that looks legit following the docs :)
[10:05:37] thanks a lot for the help
[10:05:50] cool and np :)
[10:32:30] Analytics, Analytics-Kanban, Data-Engineering, Patch-For-Review, and 2 others: Migrate analytics cluster alerts from Icinga to AlertManager - https://phabricator.wikimedia.org/T293399 (BTullis) I have made some progress on the migration by carrying out the following: * created a CR to add the da...
[11:29:43] (CR) Jgiannelos: "Any other feedback about the schema?
Just a heads up, I don't have +2 rights in the repository so somebody else needs to approve." [schemas/event/primary] - https://gerrit.wikimedia.org/r/730841 (https://phabricator.wikimedia.org/T293366) (owner: Jgiannelos)
[12:19:49] Analytics, SRE, SRE Observability (FY2021/2022-Q2): statsd and gunicorn metrics for superset - https://phabricator.wikimedia.org/T293761 (fgiunchedi)
[12:23:38] btullis: compactions are gently being absorbed by cassandra - I think it's a good idea to let it bake for a while :)
[13:19:19] Analytics, SRE, SRE-Access-Requests: Kerberos identity for kcv-wikimf - https://phabricator.wikimedia.org/T293189 (Ottomata) @KCVelaga_WMF do you have ssh access yet? You'll need that to if not. In {T291475} I don't see that being requested. See https://wikitech.wikimedia.org/wiki/Analytics/Data_a...
[13:22:27] (CR) Ottomata: [C: +2] maps: Schema for batched tile changes [schemas/event/primary] - https://gerrit.wikimedia.org/r/730841 (https://phabricator.wikimedia.org/T293366) (owner: Jgiannelos)
[13:23:47] joal: Yes, I agree. No more loading for a while, but are you OK with me transferring the final snapshot to an-presto1001? Or would you rather I wait?
[13:24:12] btullis: please go do the transfer, I just meant waiting on loading :)
[13:24:17] thanks for triple checking btullis
[13:25:08] (PS3) Clare Ming: Add new schema for desktop UI scroll tracking. [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586)
[13:25:20] Cool. Five of the remaining six snapshots to be loaded have been copied to an-presto1001. I'll start the last now.
[13:25:23] https://usercontent.irccloud-cdn.com/file/XgPchLwH/image.png
[13:25:41] awesome :)
[13:26:48] Analytics, Analytics-Kanban, Data-Engineering: Snapshot and Reload cassandra2 pageview_per_article data table from all 12 instances - https://phabricator.wikimedia.org/T291472 (BTullis) Those two transfers have completed successfully. I will now start on the final one. ` btullis@cumin1001:~$ sudo tra...
[13:27:44] Analytics, Analytics-Kanban, Data-Engineering: Snapshot and Reload cassandra2 pageview_per_article data table from all 12 instances - https://phabricator.wikimedia.org/T291472 (BTullis) This is running now. ` btullis@cumin1001:~$ sudo transfer.py aqs1015.eqiad.wmnet:/srv/cassandra-b/tmp/local_group_d...
[13:34:16] (CR) Ottomata: "Schema looks good to me, see the one comment about event." [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586) (owner: Clare Ming)
[13:35:16] (CR) Ottomata: "Mikhael/Jason, what do you think about the convention of naming intrumentations for the MW desktop browser UI mediawiki/desktop_ui?" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586) (owner: Clare Ming)
[13:39:56] Analytics, SRE, SRE-Access-Requests: Kerberos identity for kcv-wikimf - https://phabricator.wikimedia.org/T293189 (KCVelaga_WMF) Hi @Ottomata: If I am not wrong, I think that had been done with {T292992}. I just tried, and I am able to run Hive queries through JupyterHub. Please let me know if I am m...
[13:45:54] Analytics, SRE, SRE-Access-Requests: Kerberos identity for kcv-wikimf - https://phabricator.wikimedia.org/T293189 (Ottomata) You are right! You have it! :)
[14:03:22] (PS4) Clare Ming: Add new schema for desktop UI scroll tracking.
[schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586)
[14:04:19] (CR) Clare Ming: Add new schema for desktop UI scroll tracking. (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586) (owner: Clare Ming)
[14:12:33] Analytics, Analytics-Kanban, Event-Platform, Growth-Team, and 4 others: Revisions missing from mediawiki_revision_create - https://phabricator.wikimedia.org/T215001 (dcausse)
[14:39:21] Analytics, SRE, SRE Observability (FY2021/2022-Q2): statsd and gunicorn metrics for superset - https://phabricator.wikimedia.org/T293761 (BTullis) Thanks for the heads-up @fgiunchedi I've just had a quick look and, as far as I can see, the stats from gunicorn //haven't// been used to date. The only...
[15:00:27] Analytics, Analytics-Kanban, Event-Platform, Metrics-Platform, and 2 others: wgEventStreams (EventStreamConfig) should support per wiki overrides - https://phabricator.wikimedia.org/T277193 (Ottomata) Docs updated: https://wikitech.wikimedia.org/wiki/Event_Platform/Instrumentation_How_To#Overrid...
[15:49:55] Analytics, SRE, SRE Observability (FY2021/2022-Q2): statsd and gunicorn metrics for superset - https://phabricator.wikimedia.org/T293761 (BTullis) Ah, it looks like there **is** another way of instrumenting superset, but it's only with statsd: https://superset.apache.org/docs/installation/event-loggi...
[15:50:27] Analytics, Data-Engineering, Event-Platform: Discussion of Event Driven Systems - https://phabricator.wikimedia.org/T290203 (Ottomata) https://martinfowler.com/articles/data-monolith-to-mesh.html is quite excellent. It focuses on more than just the 'event driven' part, but data products using events...
[15:52:19] Analytics, SRE, SRE Observability (FY2021/2022-Q2): statsd and gunicorn metrics for superset - https://phabricator.wikimedia.org/T293761 (Ottomata) https://github.com/prometheus/statsd_exporter ? Not the best, but it would work.
[15:53:15] ottomata: o/, when you get a chance I'd love if you could check that using a pattern like this: https://gerrit.wikimedia.org/r/c/schemas/event/primary/+/731006/7/jsonschema/mediawiki/revision/create/current.yaml#103 is supported by your jsonschema -> hive schema converter
[15:53:30] Analytics, SRE, SRE Observability (FY2021/2022-Q2): statsd and gunicorn metrics for superset - https://phabricator.wikimedia.org/T293761 (Ottomata) > Note that it’s also possible to implement you own logger by deriving superset.stats_logger.BaseStatsLogger. Probably better to implement our own prome...
[15:53:31] looking
[15:54:01] dcausse: type: object + items?
[15:54:18] damn wrong link
[15:54:43] wrong patchset even
[15:55:07] (PS8) DCausse: Add fragment/mediawiki/revision/slot [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195)
[15:55:11] forgot to ship sorry
[15:55:29] ottomata: https://gerrit.wikimedia.org/r/c/schemas/event/primary/+/731006/8/jsonschema/mediawiki/revision/create/current.yaml
[15:55:43] it's basically a map type where one entry is "forced"
[15:56:30] OHHH interesting
[15:56:45] we have kind of wanted to support that or other reasons...
lemme check
[15:57:04] (CR) Ottomata: Add fragment/mediawiki/revision/slot (1 comment) [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195) (owner: DCausse)
[15:57:41] dcausse: no, but we could change it
[15:57:48] right now if properties is defined, it will be a struct
[15:57:54] we could invert that
[15:58:15] actually, i already have an old patch for it
[15:58:15] https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/629406
[15:59:18] ottomata: cool, happy to dig into this patch if you want
[15:59:26] am re-reading it myself to remember
[15:59:36] meetings starting now, but i think we should do it dcausse
[16:00:02] ottomata: great thanks! I'll add this patch to review list
[16:01:22] ottomata: stand up !
[16:01:58] nevermind!!!!!!!!! technical decision forum!
[16:03:14] razzi do you have a meet link?
[16:15:47] Analytics, Analytics-Kanban, Patch-For-Review: Jupyter notebook logs should appear in Logstash - https://phabricator.wikimedia.org/T288348 (razzi) Open→In progress Discussed in standup today, this has stopped producing: https://logstash.wikimedia.org/app/dashboards#/view/af2a44f0-2c0e-11ec-81...
[16:18:40] Analytics, Analytics-Kanban: Request Kerberos credentials - https://phabricator.wikimedia.org/T292532 (razzi) Open→Resolved Looks done, reopen if there's more to do.
[16:20:01] (CR) Michael Große: "This change is ready for review."
[analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/732005 (owner: Michael Große)
[16:27:17] (CR) Lucas Werkmeister (WMDE): "null would indeed be a problem (see also I8163fe4856 / T293329), but I don’t see why we would get null there… shouldn’t it just be 0 if th" [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/732005 (owner: Michael Große)
[16:28:41] Analytics, User-razzi: Presto error in Superset - https://phabricator.wikimedia.org/T292879 (razzi)
[16:29:25] (CR) Michael Große: Send 0 not null if wb_changes is empty (1 comment) [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/732005 (owner: Michael Große)
[16:29:54] (CR) Lucas Werkmeister (WMDE): Send 0 not null if wb_changes is empty (1 comment) [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/732005 (owner: Michael Große)
[16:33:04] (CR) Michael Große: Send 0 not null if wb_changes is empty (1 comment) [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/732005 (owner: Michael Große)
[16:33:26] Analytics-Clusters, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, and 2 others: Upgrade Superset to 1.3.1 or higher - https://phabricator.wikimedia.org/T288115 (BTullis) It looks like this bug might also be affecting us: https://github.com/apache/superset/issues/14377
[16:39:58] (PS2) Michael Große: Don't crash if wb_changes is empty [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/732005
[16:40:23] (CR) jerkins-bot: [V: -1] Don't crash if wb_changes is empty [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/732005 (owner: Michael Große)
[16:40:46] Analytics: Kerberos request ticket for Naray-ctr - https://phabricator.wikimedia.org/T293814 (NaRay)
[16:42:15] (PS3) Michael Große: Don't crash if wb_changes is empty [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/732005
[17:11:54] Analytics-Clusters, Analytics-Kanban, Data-Engineering,
Data-Engineering-Kanban, and 2 others: Upgrade Superset to 1.3.1 or higher - https://phabricator.wikimedia.org/T288115 (BTullis) Hey @Razzi, what about if we enable user impersonation with a SQL statement, rather than by ticking the box. As...
[17:13:48] Analytics, Product-Analytics, wmfdata-python: Upstream relevant parts of wmfdata-python into refinery - https://phabricator.wikimedia.org/T293700 (ldelench_wmf) p:Triage→Low
[18:32:51] Hi razzi - I'm sorry I got hacked by kids after my previous meeting
[18:33:13] And now is very late - let's maybe deploy tomorrow?
[18:35:36] Starting build #96 for job analytics-refinery-maven-release-docker
[18:38:02] Oh hey joal, no worries, I kicked off the refinery-source build and figured I'd continue unless it seemed risky
[18:38:15] So far so good, and I've done all these deploys before so I'm not too worried
[18:42:31] (CR) Nray: Add new schema for desktop UI scroll tracking. (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586) (owner: Clare Ming)
[18:47:05] Project analytics-refinery-maven-release-docker build #96: SUCCESS in 11 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release-docker/96/
[18:48:25] (CR) Clare Ming: Add new schema for desktop UI scroll tracking.
(1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586) (owner: Clare Ming)
[18:55:12] Starting build #55 for job analytics-refinery-update-jars-docker
[18:55:40] (PS1) Maven-release-user: Add refinery-source jars for v0.1.19 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/732033
[18:55:44] Project analytics-refinery-update-jars-docker build #55: SUCCESS in 32 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars-docker/55/
[19:00:18] (CR) Razzi: [V: +2 C: +2] Add refinery-source jars for v0.1.19 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/732033 (owner: Maven-release-user)
[19:04:07] (CR) Razzi: [C: +2] "LGTM, will deploy tomorrow / whenever you're ready Joal." [analytics/refinery] - https://gerrit.wikimedia.org/r/724412 (https://phabricator.wikimedia.org/T287084) (owner: Joal)
[19:05:01] sooo joal
[19:05:02] https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/629406
[19:05:03] ?
[19:05:07] I built the 0.1.19 jars but I'll wait for joal to merge https://gerrit.wikimedia.org/r/c/analytics/refinery/+/724412/ and continue with https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Deploy/Refinery
[19:05:08] whatcha think?
[19:05:28] I think joal is out for the day ottomata
[19:05:32] aye k
[19:06:26] I can take a look ヾ(*^__^*)ゞ
[19:06:34] (CR) Ottomata: Add new schema for desktop UI scroll tracking. (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586) (owner: Clare Ming)
[19:08:03] razzi: sure if you like!
i submitted it a year ago as an idea for a different reason, but now d causse wants it for the rev_slots field in https://gerrit.wikimedia.org/r/c/schemas/event/primary/+/731006/8/jsonschema/mediawiki/revision/create/current.yaml
[19:08:15] happy to provide more context if you are interested
[19:11:13] yeah ottomata want to do a 10 minute call in a minute?
[19:13:37] can we do at 3:50?
[19:14:02] oh in 35 mins?
[19:19:19] (PS5) Clare Ming: Add new schema for desktop UI scroll tracking. [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586)
[19:19:47] (CR) Clare Ming: Add new schema for desktop UI scroll tracking. (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586) (owner: Clare Ming)
[19:58:26] (CR) Clare Ming: Add new schema for desktop UI scroll tracking. (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586) (owner: Clare Ming)
[20:06:48] razzi: , am in bc
[20:06:56] cool brt
[21:00:43] (PS3) Ottomata: Spark JsonSchemaConverter - additionalProperties with schema is always a MapType [analytics/refinery/source] - https://gerrit.wikimedia.org/r/629406 (https://phabricator.wikimedia.org/T263466)
[21:00:46] (PS4) Ottomata: Spark JsonSchemaConverter - additionalProperties with schema is always a MapType [analytics/refinery/source] - https://gerrit.wikimedia.org/r/629406 (https://phabricator.wikimedia.org/T263466)
[21:36:37] (PS1) Clare Ming: Add new web A/B test schema to track bucketing of users. [schemas/event/secondary] - https://gerrit.wikimedia.org/r/732089 (https://phabricator.wikimedia.org/T292587)
[21:43:49] (PS2) Clare Ming: Add new web A/B test schema to track bucketing of users for a given experiment. [schemas/event/secondary] - https://gerrit.wikimedia.org/r/732089 (https://phabricator.wikimedia.org/T292587)
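
Note on the jsonschema -> hive converter thread (15:53 to 15:58, and the 21:00 patch title): a rough Python sketch of the rule being discussed, as read from the chat only. It is not the refinery JsonSchemaConverter code (that lives in analytics/refinery/source and targets Spark). The function name to_type, the map_wins flag, the type-name strings, and the content_model field are all made up for illustration. With map_wins=False, an object with properties becomes a struct ("right now if properties is defined, it will be a struct"); with map_wins=True, an additionalProperties schema wins and the object becomes a map ("we could invert that").

```python
from typing import Any, Dict

# Hypothetical primitive mapping, chosen only for this illustration.
PRIMITIVES = {"string": "string", "integer": "bigint",
              "number": "double", "boolean": "boolean"}


def to_type(schema: Dict[str, Any], map_wins: bool = True) -> str:
    """Render a JSON Schema fragment as a Hive-style type string."""
    kind = schema.get("type")
    if kind in PRIMITIVES:
        return PRIMITIVES[kind]
    if kind == "array":
        return f"array<{to_type(schema['items'], map_wins)}>"
    if kind == "object":
        props = schema.get("properties")
        extra = schema.get("additionalProperties")
        has_value_schema = isinstance(extra, dict)
        # Map when additionalProperties carries a schema and either the
        # inverted rule is on or there are no named properties at all.
        if has_value_schema and (map_wins or not props):
            return f"map<string,{to_type(extra, map_wins)}>"
        if props:
            fields = ",".join(f"{name}:{to_type(sub, map_wins)}"
                              for name, sub in props.items())
            return f"struct<{fields}>"
        raise ValueError("object needs properties or an additionalProperties schema")
    raise ValueError(f"unsupported type: {kind!r}")


# A map-like object where one entry ("main") is forced, loosely modelled on
# the rev_slots idea from the chat; the slot fields here are invented.
slot = {"type": "object", "properties": {"content_model": {"type": "string"}}}
rev_slots = {"type": "object", "properties": {"main": slot},
             "additionalProperties": slot}

print(to_type(rev_slots, map_wins=False))  # struct: named properties win
print(to_type(rev_slots, map_wins=True))   # map: additionalProperties schema wins
```

The forced "main" entry still validates on the JSON Schema side either way; only the table column type changes between the two rules.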