[06:34:33] Data-Engineering, SRE, ops-eqiad: Check analytics1086 mgmt's cable - https://phabricator.wikimedia.org/T320458 (elukey) Open→Resolved Indeed it works now, thanks!
[08:25:48] Data-Engineering, Equity-Landscape: Editorship Input Metrics - https://phabricator.wikimedia.org/T309274 (ntsako) Sorry didn't see this comment, I'll prepare for productionising this one
[09:13:06] (PS4) Michael Große: Track views of EntitySchema namespaces on Wikidata [analytics/refinery] - https://gerrit.wikimedia.org/r/811979 (https://phabricator.wikimedia.org/T304793)
[09:27:34] (CR) Michael Große: Track views of EntitySchema namespaces on Wikidata (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/811979 (https://phabricator.wikimedia.org/T304793) (owner: Michael Große)
[09:28:23] (PS5) Michael Große: Track views of EntitySchema namespaces on Wikidata [analytics/refinery] - https://gerrit.wikimedia.org/r/811979 (https://phabricator.wikimedia.org/T304793)
[09:55:59] Analytics-Jupyter, Data-Engineering, Product-Analytics: Replace anaconda-wmf with smaller, non-stacked Conda environments - https://phabricator.wikimedia.org/T302819 (BTullis) >>! In T302819#8312492, @nshahquinn-wmf wrote: > Okay, another suggestion about channels! What if we set up a [custom channel...
[10:09:55] (CR) Lucas Werkmeister (WMDE): [C: +1] Track views of EntitySchema namespaces on Wikidata (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/811979 (https://phabricator.wikimedia.org/T304793) (owner: Michael Große)
[10:37:29] Data-Engineering, Machine-Learning-Team, observability: Evaluate Benthos as stream processor - https://phabricator.wikimedia.org/T319214 (elukey) >>! In T319214#8307578, @Ottomata wrote: > The tricky thing about async calls in streams, is that the ordering of the events might get all messed up, as th...
[12:38:21] Data-Engineering, Event-Platform Value Stream (Sprint 02), Spike: [SPIKE] Build simple stateless service using Flink SQL - https://phabricator.wikimedia.org/T318856 (gmodena) Flink has an interface that implements [Lookup Join semantics](https://github.com/ververica/flink-sql-cookbook/blob/main/join...
[13:31:25] Data-Engineering, Equity-Landscape: Editorship Input Metrics - https://phabricator.wikimedia.org/T309274 (ntsako) a:JAnstee_WMF→ntsako
[13:43:40] !log cleared airflow job wikidata_dump_to_hive_weekly
[13:43:40] after silent sensor failure
[13:43:41] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:48:19] (PS1) DLynch: Include client_ip in EditAttemptStep schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/842452 (https://phabricator.wikimedia.org/T314178)
[13:49:13] (CR) DLynch: "@Ottomata Do I need to even bump the schema version for this?" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/842452 (https://phabricator.wikimedia.org/T314178) (owner: DLynch)
[14:02:05] (CR) Ottomata: "Yes, this is a schema change. Versions should be immutable, and in many places they are cached as if they are immutable." [schemas/event/secondary] - https://gerrit.wikimedia.org/r/842452 (https://phabricator.wikimedia.org/T314178) (owner: DLynch)
[14:03:35] Data-Engineering-Kanban, Data Engineering Planning, SRE, serviceops, and 2 others: eventgate chart should use common_templates - https://phabricator.wikimedia.org/T303543 (gmodena) The patch has been reviewed and merged. @Ottomata and @dcausse helped me out with a deployment of `eventgate-anal...
[14:16:05] Analytics-Jupyter, Data-Engineering, Product-Analytics: Replace anaconda-wmf with smaller, non-stacked Conda environments - https://phabricator.wikimedia.org/T302819 (Ottomata) Gitlab has the ability to work as a [[ https://docs.gitlab.com/ee/user/packages/pypi_repository/ | python (pip) package repo...
[14:23:12] (VarnishkafkaNoMessages) firing: (3) varnishkafka on cp2027 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[14:28:12] (VarnishkafkaNoMessages) resolved: (3) varnishkafka on cp2027 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[14:33:19] Data-Engineering, Event-Platform Value Stream (Sprint 02), Spike: [SPIKE] Build simple stateless service using Flink SQL - https://phabricator.wikimedia.org/T318856 (Ottomata) This is SO COOL. (btw, no code in https://gitlab.wikimedia.org/gmodena/flink-mediawiki-http-connector ?).
[14:46:55] Analytics, API Platform, Platform Engineering Roadmap, User-Eevans: AQS 2.0 documentation - https://phabricator.wikimedia.org/T288664 (apaskulin)
[14:53:59] Data-Engineering, Data Pipelines: Airflow Hackathon (May 2022) - https://phabricator.wikimedia.org/T307500 (EChetty)
[14:59:35] (CR) Michael Große: "Running this on stat1008 I got the following" [analytics/refinery] - https://gerrit.wikimedia.org/r/811979 (https://phabricator.wikimedia.org/T304793) (owner: Michael Große)
[15:06:18] (CR) Mforns: [C: +2] "LGTM!" [analytics/refinery] - https://gerrit.wikimedia.org/r/811979 (https://phabricator.wikimedia.org/T304793) (owner: Michael Große)
[15:07:29] Data-Engineering-Kanban, Data Engineering Planning, SRE, serviceops, and 2 others: eventgate chart should use common_templates - https://phabricator.wikimedia.org/T303543 (Ottomata) @Clement_Goubert and I will deploy the rest on Monday.
[15:07:36] (CR) Mforns: [V: +2 C: +2] Track views of EntitySchema namespaces on Wikidata [analytics/refinery] - https://gerrit.wikimedia.org/r/811979 (https://phabricator.wikimedia.org/T304793) (owner: Michael Große)
[15:09:50] Data-Engineering-Kanban, Data Engineering Planning: Investigate Gobblin dataloss during namenode failure - https://phabricator.wikimedia.org/T311263 (Ottomata) I'm not working on this anymore. Should we close? We never quite figured out exactly why Gobblin failed in this strange way, but fingers crossed...
[15:19:48] (PS2) DLynch: Include client_ip in EditAttemptStep schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/842452 (https://phabricator.wikimedia.org/T314178)
[15:20:41] (CR) DLynch: "Ah, I wasn't thinking of caching -- just of how none of the clients sending data were going to actually change anything." [schemas/event/secondary] - https://gerrit.wikimedia.org/r/842452 (https://phabricator.wikimedia.org/T314178) (owner: DLynch)
[15:24:18] (CR) Ottomata: Include client_ip in EditAttemptStep schema (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/842452 (https://phabricator.wikimedia.org/T314178) (owner: DLynch)
[15:31:47] (CR) DLynch: Include client_ip in EditAttemptStep schema (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/842452 (https://phabricator.wikimedia.org/T314178) (owner: DLynch)
[15:56:54] (CR) Ottomata: "+1 aside from needing to reset 1.3.0" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/842452 (https://phabricator.wikimedia.org/T314178) (owner: DLynch)
[16:07:33] (PS3) DLynch: Include client_ip in EditAttemptStep schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/842452 (https://phabricator.wikimedia.org/T314178)
[16:07:59] (CR) DLynch: Include client_ip in EditAttemptStep schema (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/842452 (https://phabricator.wikimedia.org/T314178) (owner: DLynch)
[16:15:37] Analytics-Jupyter, Data-Engineering, Product-Analytics: Replace anaconda-wmf with smaller, non-stacked Conda environments - https://phabricator.wikimedia.org/T302819 (nshahquinn-wmf) >>! In T302819#8313696, @BTullis wrote: > However, I wouldn't be keen for it to depend on a network file system and a...
[16:29:17] PROBLEM - MegaRAID on analytics1068 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[16:48:56] (PS1) Joal: Add XmlFsImageConverter job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/842503 (https://phabricator.wikimedia.org/T261283)
[16:55:12] (CR) CI reject: [V: -1] Add XmlFsImageConverter job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/842503 (https://phabricator.wikimedia.org/T261283) (owner: Joal)
[17:48:03] RECOVERY - MegaRAID on analytics1068 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[19:06:49] PROBLEM - MegaRAID on analytics1068 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[19:18:05] RECOVERY - MegaRAID on analytics1068 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[19:30:12] (CR) Ottomata: [C: +1] Include client_ip in EditAttemptStep schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/842452 (https://phabricator.wikimedia.org/T314178) (owner: DLynch)
[19:51:47] PROBLEM - MegaRAID on analytics1068 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[20:14:19] RECOVERY - MegaRAID on analytics1068 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[20:34:09] Data-Engineering, SRE, serviceops, Event-Platform Value Stream (Sprint 02), Patch-For-Review: eventstreams chart should use latest common_templates - https://phabricator.wikimedia.org/T310721 (JArguello-WMF) Open→Resolved
[21:09:13] (VarnishkafkaNoMessages) firing: varnishkafka on cp5010 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=eqsin%20prometheus/ops&var-cp_cluster=cache_text&var-instance=cp5010%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[21:14:12] (VarnishkafkaNoMessages) resolved: varnishkafka on cp5010 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=eqsin%20prometheus/ops&var-cp_cluster=cache_text&var-instance=cp5010%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[21:33:11] PROBLEM - MegaRAID on analytics1068 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[21:44:27] RECOVERY - MegaRAID on analytics1068 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[23:36:59] PROBLEM - MegaRAID on analytics1068 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring