[04:51:17] Data-Engineering, Data-Engineering-Kanban: Some varnishkafka instances dropped traffic for a long time due to the wrong version of the package installed - https://phabricator.wikimedia.org/T300164 (AndyRussG) Ahhhhhh oki so just //one// more thought that I guess //might// be worth sharing here... (Really...
[09:37:46] Good morning btullis - Could please merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/763189/ ?
[09:57:18] Morning. I've merged and deployed that now..
[10:13:23] (CR) Btullis: "These are the blubber files and small tweaks required to get the datahub containers building locally." [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453) (owner: Btullis)
[10:17:28] (CR) Elukey: Add blubber files for deployment pipeline (1 comment) [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453) (owner: Btullis)
[10:26:06] (CR) Btullis: Add blubber files for deployment pipeline (4 comments) [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453) (owner: Btullis)
[10:45:19] btullis: Thanks a lot!
[10:53:42] (PS3) Btullis: Add blubber files for deployment pipeline [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453)
[10:56:29] (PS4) Btullis: Add configuration for deployment pipeline [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453)
[10:57:11] (PS5) Btullis: Add blubber files for deployment pipeline [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453)
[10:58:33] (PS6) Btullis: Add configuration for deployment pipeline [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453)
[11:03:10] (PS7) Btullis: Add configuration for deployment pipeline [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453)
[11:13:34] (CR) Btullis: Add configuration for deployment pipeline (6 comments) [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453) (owner: Btullis)
[11:31:08] Data-Engineering, Data-Engineering-Kanban: Some varnishkafka instances dropped traffic for a long time due to the wrong version of the package installed - https://phabricator.wikimedia.org/T300164 (JAllemandou) Based on yesterday's feedback I have not started extracting data yesterday. I have sent a pat...
[11:54:19] Data-Engineering, Data-Engineering-Kanban, Data-Catalog, Release Pipeline, Patch-For-Review: Create DataHub containers with deployment pipeline - https://phabricator.wikimedia.org/T301453 (BTullis) I'm pretty happy with the progress on the blubber files and the pipeline config, but I'm awaiti...
[12:51:06] (PS1) Aqu: Migrate wikidata/item_page_link/weekly from Oozie to Airflow [analytics/refinery] - https://gerrit.wikimedia.org/r/763219
[12:55:24] (PS4) Aqu: Migrate AQS/hourly [analytics/refinery] - https://gerrit.wikimedia.org/r/756601 (https://phabricator.wikimedia.org/T299398)
[13:04:33] (CR) Elukey: "Left some comments just to understand the use case, Alex will surely have more precise suggestions, I am not 100% familiar with Blubber :)" [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453) (owner: Btullis)
[13:41:42] (CR) Btullis: Add configuration for deployment pipeline (4 comments) [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453) (owner: Btullis)
[14:04:22] (CR) Elukey: Add configuration for deployment pipeline (3 comments) [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453) (owner: Btullis)
[14:09:21] (CR) Btullis: Add configuration for deployment pipeline (1 comment) [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453) (owner: Btullis)
[14:13:50] !log deployed airflow-dags to analytics instance
[14:13:52] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:16:37] Data-Engineering, Data-Engineering-Kanban, Data-Catalog: Define the Kubernetes Deployments for Datahub - https://phabricator.wikimedia.org/T301454 (BTullis) a:BTullis Beginning work on this.
[14:25:39] (CR) Btullis: Add configuration for deployment pipeline (2 comments) [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453) (owner: Btullis)
[15:33:27] Data-Engineering, Data-Engineering-Kanban, Data-Catalog, Patch-For-Review: Define the Kubernetes Deployments for Datahub - https://phabricator.wikimedia.org/T301454 (BTullis) p:Triage→High
[15:39:20] (CR) Ottomata: [C: +1] "Couple more comments, but aside from those this LGTM! +1ing so you can merge at will and make progress while I'm out for the next 1.5 wee" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/676392 (https://phabricator.wikimedia.org/T276379) (owner: Jason Linehan)
[15:40:59] Data-Engineering, Data-Engineering-Kanban: Some varnishkafka instances dropped traffic for a long time due to the wrong version of the package installed - https://phabricator.wikimedia.org/T300164 (Milimetric) > About the webrequest data, I would like a more formal definition of the need before proceedin...
[15:56:46] Data-Engineering, Data-Engineering-Kanban: Some varnishkafka instances dropped traffic for a long time due to the wrong version of the package installed - https://phabricator.wikimedia.org/T300164 (AndyRussG) >>! In T300164#7714313, @JAllemandou wrote: > I have started a job extracting daily pageviews wi...
[16:01:34] Data-Engineering, Data-Engineering-Kanban, Data-Catalog, Patch-For-Review: Define the Kubernetes Deployments for Datahub - https://phabricator.wikimedia.org/T301454 (BTullis) @akosiaris - would you be able to advise a little on this please, just to get me going? I'm not sure, but think that I wa...
[16:13:45] Data-Engineering, Data-Engineering-Kanban, Data-Catalog, Release Pipeline: Upload required datahub dependencies to Archiva - https://phabricator.wikimedia.org/T301886 (BTullis)
[16:14:21] Data-Engineering, Data-Engineering-Kanban, Data-Catalog, Release Pipeline: Upload required datahub dependencies to Archiva - https://phabricator.wikimedia.org/T301886 (BTullis) p:Triage→High
[16:17:18] (CR) Elukey: Add configuration for deployment pipeline (1 comment) [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453) (owner: Btullis)
[16:25:35] (CR) Btullis: Add configuration for deployment pipeline (1 comment) [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/762950 (https://phabricator.wikimedia.org/T301453) (owner: Btullis)
[16:31:41] (PS27) Phuedx: Metrics Platform event schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/676392 (https://phabricator.wikimedia.org/T276379) (owner: Jason Linehan)
[16:38:22] (PS28) Phuedx: Metrics Platform event schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/676392 (https://phabricator.wikimedia.org/T276379) (owner: Jason Linehan)
[16:38:35] (CR) Phuedx: Metrics Platform event schema (4 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/676392 (https://phabricator.wikimedia.org/T276379) (owner: Jason Linehan)
[16:41:30] (CR) Emil Chetty: [C: +1] "Looks good to me -> Lets get it out :)" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/676392 (https://phabricator.wikimedia.org/T276379) (owner: Jason Linehan)
[16:53:02] Data-Engineering, Metrics-Platform, SRE, Traffic: VarnishKafka to propagate user agent client hints headers to webrequest - https://phabricator.wikimedia.org/T299401 (jbond) p:Triage→Medium
[16:54:31] joal: one question :] do you know if Spark generates success files when executing .insertInto() as opposed to executing .write() ??
[16:55:06] mforns: I just sent an email about that :)
[16:55:16] oh! reading :]
[16:55:41] mforns: indeed - spark.write.parquet generates a _SUCCESS file, while spark.insertInto doesn't
[16:55:50] yea, makes sense
[16:55:50] or at least that's what I have experienced
[16:55:58] that's why the script fails
[16:56:24] yeah, I had gone through debugging that last time and forgot to talk about it and create tasks :S my bad
[16:56:31] Data-Engineering, SRE, observability, serviceops: Upgrade Kafka to 2.x - https://phabricator.wikimedia.org/T300102 (jbond) p:Triage→Medium
[16:57:55] no worries, joal, it was a fun troubleshooting
[16:57:56] :]
[16:58:27] Data-Engineering, Infrastructure-Foundations, Product-Analytics, Research, and 2 others: Maybe restrict domains accessible by webproxy - https://phabricator.wikimedia.org/T300977 (jbond) p:Triage→Low
[17:01:36] Data-Engineering, Data-Engineering-Kanban, Data-Catalog, Release Pipeline: Upload required datahub dependencies to Archiva - https://phabricator.wikimedia.org/T301886 (BTullis) I have downloaded all of the jetty artifacts from Maven central ` btullis@marlin-wsl:~/tmp$ wget -q https://repo1.maven....
[17:11:10] Data-Engineering, Data-Engineering-Kanban, Data-Catalog, Release Pipeline: Upload required datahub dependencies to Archiva - https://phabricator.wikimedia.org/T301886 (BTullis) Ah, it looks like we already have these jetty artifacts in archiva: https://archiva.wikimedia.org/#artifact/org.eclipse....
[17:24:39] Analytics, Product-Analytics: Help with data that's not appearing on charts - https://phabricator.wikimedia.org/T301895 (Iflorez)
[17:31:12] hey folks, are we doing the sre sync or still in standup?
[17:31:49] Oh sorry Luca. I declined the sync because I'm looking after the kids (not very well).
[17:32:16] ahhh
[17:32:53] I mean I'm not doing a very good job of looking after them :-) They're glued to computers.
[17:33:11] ahahahaha :D
[17:34:40] Analytics, Data-Engineering, Product-Analytics: Help with data that's not appearing on charts - https://phabricator.wikimedia.org/T301895 (nshahquinn-wmf)
[17:36:33] Data-Engineering, Data-Engineering-Kanban, Data-Catalog, Release Pipeline: Upload required datahub dependencies to Archiva - https://phabricator.wikimedia.org/T301886 (BTullis) I have uploaded the v20190813 artifacts anyway The strange thing is that the download link is broken on Archiva. It put...
[18:00:25] Analytics, Data-Engineering, Event-Platform, Patch-For-Review, Readers-Web-Backlog (Kanbanana-FY-2021-22): WikipediaPortal Event Platform Migration - https://phabricator.wikimedia.org/T282012 (Jdrewniak) So I've looked into the event logging migration. The [[ https://wikitech.wikimedia.org/wi...
[18:12:10] Data-Engineering, Data-Engineering-Kanban: Some varnishkafka instances dropped traffic for a long time due to the wrong version of the package installed - https://phabricator.wikimedia.org/T300164 (JMando) Confirming what @AndyRussG said above about is_banner_beacon and device_family at hourly granularit...
[19:02:50] Data-Engineering, Product-Analytics: conda-create-stacked breaks wmfdata.presto - https://phabricator.wikimedia.org/T301734 (nettrom_WMF)
[19:06:22] Data-Engineering, Airflow: [Airflow] Research, discuss and decide on DAG/task dependencies VS. success/failure files (Oozie style) - https://phabricator.wikimedia.org/T301568 (mforns) @JAllemandou @Antoine_Quhen This is the task where we can continue discussions on success files vs. dag dependencies!
[19:13:03] Analytics, Data-Engineering, Product-Analytics, Superset: Help with data that's not appearing on charts - https://phabricator.wikimedia.org/T301895 (odimitrijevic) Hi @Iflorez, is the issue related to the time filter as per @Ottomata 's suggestion in the Slack conversation?
[19:13:48] Analytics, Data-Engineering, Data-Engineering-Kanban, Product-Analytics, Superset: Help with data that's not appearing on charts - https://phabricator.wikimedia.org/T301895 (odimitrijevic)
[19:26:33] joal: qq re. spark logs, do we use a log4j config file for spark jobs?
[19:28:50] (PS12) Sharvaniharan: Add a required variable to app analytics fragment [schemas/event/secondary] - https://gerrit.wikimedia.org/r/761452
[19:29:18] (CR) Sharvaniharan: "Thank you for putting so much thought towards the naming and maintenance of these fields @Ottomata and @Phuedx." [schemas/event/secondary] - https://gerrit.wikimedia.org/r/761452 (owner: Sharvaniharan)
[20:59:06] Analytics, Data-Engineering, Data-Engineering-Kanban, Product-Analytics, Superset: Help with data that's not appearing on charts - https://phabricator.wikimedia.org/T301895 (Milimetric) Looking at this closer, January 2020 and January 2016 are about a year before the first date in the screens...
[21:25:24] Data-Engineering, MediaWiki-Page-editing, Editing-team (FY2021-22 Kanban Board), Performance-Team (Radar), Product-Analytics (Kanban): Update edits_hourly to ingest new legacy wikitext editor change tag - https://phabricator.wikimedia.org/T293406 (ppelberg)
[21:33:07] Data-Engineering, MediaWiki-Page-editing, Editing-team (FY2021-22 Kanban Board), Performance-Team (Radar), Product-Analytics (Kanban): Update edits_hourly to ingest new legacy wikitext editor change tag - https://phabricator.wikimedia.org/T293406 (ppelberg) Open→Resolved
[21:34:07] (PS33) AGueyte: Basic ipinfo instrument setup [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415)
[21:34:38] (CR) jerkins-bot: [V: -1] Basic ipinfo instrument setup [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415) (owner: AGueyte)
[21:48:24] Hey mforns - I was away this evening - let's talk about that tomorrow?
[21:53:26] (PS34) AGueyte: Basic ipinfo instrument setup [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415)
[22:01:03] Analytics, Data-Engineering, Data-Engineering-Kanban, Product-Analytics, Superset: Help with data that's not appearing on charts - https://phabricator.wikimedia.org/T301895 (Milimetric) Open→Resolved a:Milimetric
[22:12:27] (PS35) AGueyte: Basic ipinfo instrument setup [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415)