[01:01:02] Data-Engineering, Cassandra: aqs1004 low disk space warning - https://phabricator.wikimedia.org/T313936 (Eevans) >>! In T313936#8108763, @Eevans wrote: [ ... ] > As I understand it, this cluster is out of production and the hardware should be coming down. If that is the case, it probably doesn't warra...
[01:26:37] PROBLEM - Check systemd state on an-launcher1002 is CRITICAL: CRITICAL - degraded: The following units failed: monitor_refine_event.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[01:33:15] PROBLEM - Check unit status of monitor_refine_event on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_event https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[07:55:44] Data-Engineering, Foundational Technology Requests, SRE: Add a webrequest sampled topic and ingest into druid/turnilo - https://phabricator.wikimedia.org/T314981 (fgiunchedi) Thank you @Ottomata for the example and extensive explanation! I'll take a closer look and play with it a little bit it too
[09:12:32] Data-Engineering, Cassandra: aqs1004 low disk space warning - https://phabricator.wikimedia.org/T313936 (BTullis) Apologies for missing the first ping @Eevans - You're right that this whole cluster aqs100* is shortly up for decommissioning and it's not serving production AQS traffic at the moment, so I t...
[09:50:54] RECOVERY - Check unit status of monitor_refine_event on an-launcher1002 is OK: OK: Status of the systemd unit monitor_refine_event https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[09:51:05] !log restarted monitor-refine-event on an-launcher1002
[09:51:06] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:52:28] RECOVERY - Check systemd state on an-launcher1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[10:31:36] (PS1) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824425
[10:48:35] Data-Engineering, Event-Platform Value Stream (Sprint 00), Spike: [SPIKE][NEEDS GROOMING] Assess what is required for the enrichment pipline to run on k8 - https://phabricator.wikimedia.org/T315428 (gmodena)
[11:13:43] (PS1) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824461
[11:18:34] (CR) Nmaphophe: [C: +2] Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824425 (owner: Nmaphophe)
[11:27:37] (Merged) jenkins-bot: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824425 (owner: Nmaphophe)
[11:30:41] (PS1) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824470
[11:30:58] (CR) CI reject: [V: -1] Added ArrayAvgUDF to calculate the average two columns by using an array struct.
It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824470 (owner: Nmaphophe)
[11:37:04] (PS1) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824471
[11:37:20] (CR) CI reject: [V: -1] Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824471 (owner: Nmaphophe)
[13:28:46] (PS12) RhinosF1: mypy: add to tox [analytics/quarry/web] - https://gerrit.wikimedia.org/r/821244
[13:55:17] (CR) Nmaphophe: [C: +2] Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824461 (owner: Nmaphophe)
[13:55:35] (CR) CI reject: [V: -1] Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824461 (owner: Nmaphophe)
[13:55:51] (Abandoned) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824242 (owner: Nmaphophe)
[13:57:05] (Abandoned) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824470 (owner: Nmaphophe)
[13:57:16] (Abandoned) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824471 (owner: Nmaphophe)
[13:57:38] (Restored) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct.
It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824471 (owner: Nmaphophe)
[13:57:49] (Abandoned) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824461 (owner: Nmaphophe)
[13:58:00] (Abandoned) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824247 (owner: Nmaphophe)
[14:01:53] (Abandoned) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824471 (owner: Nmaphophe)
[14:04:08] !log re-running refine_eventlogging_legacy for helppanel
[14:04:10] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:10:41] btullis: o/ in https://gerrit.wikimedia.org/r/c/operations/puppet/+/824503/1/modules/systemd/manifests/timesyncd.pp, how does the above declaration of service systemd-timesyncd interact with the systemd::unit one? I'd assume that systemd::unit ends up declaring a service systemd-timesyncd, which would cause a duplicate resource error
[14:11:35] also, you probably should respect the $ensure parameter too?
[14:13:06] Analytics, Analytics-Wikistats, Data-Engineering: Merge Ks-Arab and Ks-Deva to ks - https://phabricator.wikimedia.org/T314476 (Tajamul9)
[14:13:43] Good points, thanks. I think it's to do with this override parameter on the defined type: https://github.com/wikimedia/puppet/blob/production/modules/systemd/manifests/unit.pp#L18
[14:17:15] Reused the $ensure parameter as suggested, thanks. Looking into the duplicate resource potential now. How do I add a new host to labs so that I can run pcc against it again? Do you remember?
[14:19:02] Data-Engineering, Event-Platform Value Stream (Sprint 00), Spike: [SPIKE] Decide on technical solution for page state stream backfill process - https://phabricator.wikimedia.org/T314389 (Ottomata) Relevant Incremental MediaWiki History Task and doc, with some Iceberg choice details: - {T258511} - [[...
[14:20:17] (Restored) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824471 (owner: Nmaphophe)
[14:22:58] Data-Engineering, Foundational Technology Requests, SRE: Add a webrequest sampled topic and ingest into druid/turnilo - https://phabricator.wikimedia.org/T314981 (Ottomata)
[14:23:02] Data-Engineering, Event-Platform Value Stream: Declare webrequest as an Event Platform stream - https://phabricator.wikimedia.org/T314956 (Ottomata)
[14:24:53] oh huh okay, so no service declared in systemd::unit, interesting
[14:30:50] Data-Engineering: Documentathon - https://phabricator.wikimedia.org/T311413 (JArguello-WMF) Oct 12–13 is the proposed week for this event to take place. Scope: - Data Engineering Space (https://wikitech.wikimedia.org/wiki/Data_Engineering) Ben redesigned the page - Data Catalog Consolidate informati...
[14:38:26] Analytics, Product-Analytics, Epic: Data Lake incremental Data Updates - https://phabricator.wikimedia.org/T258511 (Ottomata) > I don't think that having a high level aspirational task in phabricator will help us prioritize it Hi @odimitrijevic! :) Unless there are other discoverable public refere...
[14:43:30] (CR) Vivian Rook: [C: +2] mypy: add to tox [analytics/quarry/web] - https://gerrit.wikimedia.org/r/821244 (owner: RhinosF1)
[14:48:51] (Merged) jenkins-bot: mypy: add to tox [analytics/quarry/web] - https://gerrit.wikimedia.org/r/821244 (owner: RhinosF1)
[14:51:32] > How do I add a new host to labs so that I can run pcc against it again?
Do you remember?
[14:51:43] ^ scratch that question altogether. I had the wrong hostname
[14:52:55] Quarry, Regression, good first task: Bad resultset number case is not handled - https://phabricator.wikimedia.org/T218470 (rook)
[14:53:03] Quarry, good first task: Define in a single place the pseudoname of unnamed queries - https://phabricator.wikimedia.org/T197029 (rook)
[14:54:00] :)
[14:57:05] (CR) CI reject: [V: -1] mypy: [DNM] make strict [analytics/quarry/web] - https://gerrit.wikimedia.org/r/824430 (owner: RhinosF1)
[14:57:46] (PS3) Vivian Rook: mypy: [Do Not Merge] make strict [analytics/quarry/web] - https://gerrit.wikimedia.org/r/824430 (owner: RhinosF1)
[15:03:42] (CR) CI reject: [V: -1] mypy: [Do Not Merge] make strict [analytics/quarry/web] - https://gerrit.wikimedia.org/r/824430 (owner: RhinosF1)
[16:03:49] (PS2) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824471
[16:04:36] (CR) CI reject: [V: -1] Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824471 (owner: Nmaphophe)
[16:09:31] (PS3) Nmaphophe: Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824471
[16:28:12] (CR) Nmaphophe: [C: +2] Added ArrayAvgUDF to calculate the average two columns by using an array struct. It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824471 (owner: Nmaphophe)
[16:35:59] (Merged) jenkins-bot: Added ArrayAvgUDF to calculate the average two columns by using an array struct.
It also ignores nulls [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824471 (owner: Nmaphophe)
[16:40:21] Data-Engineering, Data Pipelines (Sprint 00), Patch-For-Review: Install spark3 in analytics clusters - https://phabricator.wikimedia.org/T295072 (Antoine_Quhen) pyspark 3 is now installed with conda. The pyspark package in the conda forge is marking those as dependencies: - numpy >=1.7 - pandas >...
[16:42:49] Data-Engineering, Data Pipelines (Sprint 00), Patch-For-Review: Install spark3 in analytics clusters - https://phabricator.wikimedia.org/T295072 (Ottomata) +1 I think those are good deps too have. We’d probably add them to our ‘analytics base env’ anyway. I’d pin them at more recent versions though...
[16:49:21] (CR) Btullis: "The commit message has many change-id lines. I believe that you can remove them all apart from the last one." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824471 (owner: Nmaphophe)
[17:00:51] (PS1) Nmaphophe: Added ArrayAvgUDF. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824537
[17:10:31] Data-Engineering: Update Search Engine list - https://phabricator.wikimedia.org/T315329 (Isaac)
[17:12:39] (Abandoned) Nmaphophe: Added ArrayAvgUDF. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824537 (owner: Nmaphophe)
[17:12:56] (PS1) Nmaphophe: Revert "Added ArrayAvgUDF to calculate the average two columns by using an array struct." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824431
[17:15:46] (PS1) Nmaphophe: Revert "Added ArrayAvgUDF to calculate the average two columns by using an array struct." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824432
[17:16:50] (Abandoned) Nmaphophe: Revert "Added ArrayAvgUDF to calculate the average two columns by using an array struct."
[analytics/refinery/source] - https://gerrit.wikimedia.org/r/824432 (owner: Nmaphophe)
[17:46:11] Data-Engineering: Update Search Engine list - https://phabricator.wikimedia.org/T315329 (Isaac) Some more notes: * After adding `suche`, the highest new search engine in the `Predicted Other` category is `suche.t-online.de` but it's below 20k per day (and from a high-traffic country) so I think no need to ad...
[17:49:10] Data-Engineering: Update Search Engine list - https://phabricator.wikimedia.org/T315329 (Isaac)
[18:47:08] (Abandoned) Nmaphophe: Revert "Added ArrayAvgUDF to calculate the average two columns by using an array struct." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/824431 (owner: Nmaphophe)
[19:40:54] Data-Engineering: Update Search Engine list - https://phabricator.wikimedia.org/T315329 (Isaac) I'll try to put together the patch but the code change at [[https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/SearchEngine.java#L2...
[19:53:29] Data-Engineering, Epic, Event-Platform Value Stream (Sprint 00): Integrate Image Suggestions Feedback with Cassandra - https://phabricator.wikimedia.org/T306627 (lbowmaker) Successfully tested with: ` curl https://image-suggestion.discovery.wmnet:30443/private/image_suggestions/feedback/fawiki/10690...
[19:55:19] Data-Engineering, Epic, Event-Platform Value Stream (Sprint 00): Integrate Image Suggestions Feedback with Cassandra - https://phabricator.wikimedia.org/T306627 (lbowmaker)
[19:57:40] !log apply yarn production queue changes to allow analytics-research and analytics-platform-eng users to submit jobs to production queue - T312858
[19:57:42] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:57:43] T312858: New airflow instance related to Image Suggestion Jobs - https://phabricator.wikimedia.org/T312858