[04:39:18] (DruidSegmentsUnavailable) firing: More than 10 segments have been unavailable for mediawiki_history_reduced_2022_12 on the druid_public Druid cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Druid/Alerts#Druid_Segments_Unavailable - https://grafana.wikimedia.org/d/000000538/druid?refresh=1m&var-cluster=druid_public&panelId=49&fullscreen&orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DDruidSegmentsUnavailable
[04:59:18] (DruidSegmentsUnavailable) resolved: More than 10 segments have been unavailable for mediawiki_history_reduced_2022_12 on the druid_public Druid cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Druid/Alerts#Druid_Segments_Unavailable - https://grafana.wikimedia.org/d/000000538/druid?refresh=1m&var-cluster=druid_public&panelId=49&fullscreen&orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DDruidSegmentsUnavailable
[08:52:12] Data-Engineering-Planning, Data Pipelines (Sprint 05-06): Update Airflow DAGs code to make it compatible with version V2.3.4 of Airflow - https://phabricator.wikimedia.org/T309552 (Antoine_Quhen)
[08:52:36] Data-Engineering-Planning, Data Pipelines (Sprint 05-06): Update Airflow DAGs code to make it compatible with version V2.3.4 of Airflow - https://phabricator.wikimedia.org/T309552 (Antoine_Quhen)
[08:53:20] Data-Engineering-Planning, Data Pipelines (Sprint 05-06), Patch-For-Review, SecTeam-Processed, Vuln-VulnComponent: Upgrade Airflow configuration file in puppet to be compatible with version 2.3.4 - https://phabricator.wikimedia.org/T315580 (Antoine_Quhen)
[08:53:24] Data-Engineering-Planning, Data Pipelines (Sprint 05-06): Update Airflow DAGs code to make it compatible with version V2.3.4 of Airflow - https://phabricator.wikimedia.org/T309552 (Antoine_Quhen)
[08:57:12] (VarnishkafkaNoMessages) firing: varnishkafka on cp2029 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=codfw%20prometheus/ops&var-cp_cluster=cache_text&var-instance=cp2029%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[09:02:12] (VarnishkafkaNoMessages) resolved: varnishkafka on cp2029 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=codfw%20prometheus/ops&var-cp_cluster=cache_text&var-instance=cp2029%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[11:01:41] (VarnishkafkaNoMessages) firing: varnishkafka on cp4037 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=ulsfo%20prometheus/ops&var-cp_cluster=cache_text&var-instance=cp4037%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[11:01:42] (VarnishkafkaNoMessages) firing: varnishkafka on cp4037 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=ulsfo%20prometheus/ops&var-cp_cluster=cache_text&var-instance=cp4037%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[11:06:42] (VarnishkafkaNoMessages) resolved: varnishkafka on cp4037 is not sending enough cache_text requests - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=ulsfo%20prometheus/ops&var-cp_cluster=cache_text&var-instance=cp4037%3A9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages
[11:18:01] Data-Engineering, Patch-For-Review, Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-05)): Create partman recipe for cephosd servers - https://phabricator.wikimedia.org/T324670 (BTullis) I've completed an installation with manual partitioning and I'm happy enough with it. Here's the output from `ls...
[11:49:23] (CR) Mforns: "It just occurred to me that we could add an optional parameter 'snapshot_interval' that could receive the values: month or week (maybe eve" [analytics/refinery] - https://gerrit.wikimedia.org/r/870971 (https://phabricator.wikimedia.org/T323614) (owner: Xcollazo)
[12:35:32] Data-Engineering, Patch-For-Review, Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-05)): Create partman recipe for cephosd servers - https://phabricator.wikimedia.org/T324670 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1001 for host cephosd1001.eqiad.wmne...
[14:00:32] Data-Engineering, serviceops, Discovery-Search (Current work), Event-Platform Value Stream (Sprint 05), Patch-For-Review: Flink on Kubernetes Helm charts - https://phabricator.wikimedia.org/T324576 (JMeybohm) >>! In T324576#8464284, @Ottomata wrote: > **Ingress**: I don't think we //need// an...
[14:04:06] Data-Engineering, Pageviews-API: Provide a mechanism to notify subscribers when page view data is available - https://phabricator.wikimedia.org/T326229 (kostajh)
[14:30:48] (PS1) Gerrit maintenance bot: Add guc.wikipedia to pageview whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/875328 (https://phabricator.wikimedia.org/T326236)
[14:39:26] Data-Engineering-Planning, Data Pipelines (Sprint 05-06), Patch-For-Review, SecTeam-Processed, Vuln-VulnComponent: Upgrade Puppet code to make Airflow configuration files compatible with version 2.3.4 - https://phabricator.wikimedia.org/T315580 (Antoine_Quhen)
[15:00:19] Data-Engineering, Patch-For-Review, Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-05)): Create partman recipe for cephosd servers - https://phabricator.wikimedia.org/T324670 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1001 for host cephosd1001.eqiad.wmnet wi...
[15:09:33] Data-Engineering, serviceops, Discovery-Search (Current work), Event-Platform Value Stream (Sprint 05), Patch-For-Review: Flink on Kubernetes Helm charts - https://phabricator.wikimedia.org/T324576 (Ottomata) Ingress: Okay, let's put off working on ingress for the jobmanager UI port until lat...
[15:38:20] Data-Engineering, Patch-For-Review, Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-05)): Create partman recipe for cephosd servers - https://phabricator.wikimedia.org/T324670 (BTullis) The partman recipe is largely working as expected, but I'm still finding some unpredictability regarding the dev...
[16:33:35] Data-Engineering, Patch-For-Review, Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-05)): Create partman recipe for cephosd servers - https://phabricator.wikimedia.org/T324670 (BTullis) I've now even seen `/dev/sda` and `/dev/sdb` swap over on a subsequent boot, with no other changes. ` btullis@ce...
[16:46:15] Data-Engineering, Event-Platform Value Stream, Patch-For-Review: Design Schema for page state and page state with content (enriched) streams - https://phabricator.wikimedia.org/T308017 (Milimetric) @Ottomata I don't have any preference here, it just occurred to me that you could also work around the...
[17:00:08] Data-Engineering, Event-Platform Value Stream, Patch-For-Review: Design Schema for page state and page state with content (enriched) streams - https://phabricator.wikimedia.org/T308017 (Ottomata) It would be: `lang=yaml prior_state: type: object properties: page: type: object pro...
[17:04:30] Data-Engineering-Radar, MW-on-K8s, serviceops, Patch-For-Review: IPInfo MediaWiki extension depends on presence of maxmind db in the container/host - https://phabricator.wikimedia.org/T288375 (Clement_Goubert) PSP needs to be updated before we can deploy.
[17:14:24] !log Dropped all temporary differential privacy tables with the 'DROP DATABASE tumult_temp_*' pattern.
[17:14:25] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:31:18] Analytics-Radar, Recommendation-API, SRE, SRE-swift-storage: Run swift-object-expirer as part of the swift cluster - https://phabricator.wikimedia.org/T229584 (LSobanski)
[17:51:05] Data-Engineering, CheckUser, MW-1.38-notes (1.38.0-wmf.26; 2022-03-14), MW-1.39-notes (1.39.0-wmf.23; 2022-08-01), and 4 others: Update CheckUser for actor and comment table - https://phabricator.wikimedia.org/T233004 (Dreamy_Jazz) I don't have my laptop so I can't plus +2 but the above change lo...
[18:00:29] Data-Engineering, Patch-For-Review, Shared-Data-Infrastructure (EQ2 Kanban (Sprints 04-05)): Create partman recipe for cephosd servers - https://phabricator.wikimedia.org/T324670 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1001 for host cephosd1002.eqiad.wmne...
[19:42:05] (CR) Xcollazo: Modify refinery-drop-older-than to support 'snapshot' partitions (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/870971 (https://phabricator.wikimedia.org/T323614) (owner: Xcollazo)
[20:03:46] Data-Engineering: Requesting Kerberos identity for Hxi-ctr - https://phabricator.wikimedia.org/T325857 (mpopov) @BTullis: I pinged Hua about this and apparently Google thought it was spam. Just a heads-up for future Kerberos principal requests.
[20:04:27] Data-Engineering: Requesting Kerberos identity for Hxi-ctr - https://phabricator.wikimedia.org/T325857 (mpopov) @HXi-WMF: Can't remember if the email includes a link to https://wikitech.wikimedia.org/wiki/Analytics/Systems/Kerberos/UserGuide#Authenticate_via_Kerberos but adding it here just in case.
[20:44:40] (PS1) Mforns: Add gor.wiktionary to the pageviews allow list [analytics/refinery] - https://gerrit.wikimedia.org/r/875427
[20:45:44] (CR) Mforns: [V: +2 C: +2] "LGTM, merging to stop alerts." [analytics/refinery] - https://gerrit.wikimedia.org/r/875427 (owner: Mforns)
[21:08:44] Data-Engineering-Planning, Equity-Landscape: Split country data into regional classification and main country data - https://phabricator.wikimedia.org/T320985 (JAnstee_WMF) See the mirror tab for data to hive https://docs.google.com/spreadsheets/d/1kGL-s7EACBjD_z0YjlBm24Q0macxtJCoXmnbnAEPy2Q/edit#gid=129...
[21:09:00] Data-Engineering-Planning, Equity-Landscape: Split country data into regional classification and main country data - https://phabricator.wikimedia.org/T320985 (JAnstee_WMF) Open→Resolved
[21:09:02] Data-Engineering, Equity-Landscape: Load country data - https://phabricator.wikimedia.org/T310712 (JAnstee_WMF)
[21:23:38] Data-Engineering-Planning, Data Pipelines (Sprint 05-06): Update Airflow DAGs code to make it compatible with version V2.3.4 of Airflow - https://phabricator.wikimedia.org/T309552 (Antoine_Quhen) https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/201
[21:54:09] Data-Engineering, Data-Engineering-Kanban: Pageview Data loss due to wrong version of package installed on some varnishkafka instances - https://phabricator.wikimedia.org/T300164 (odimitrijevic) A detailed summary report has been published on https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Data_I...
[22:06:26] Data-Engineering, Event-Platform Value Stream: [EPIC] Streaming and event driven Python services - https://phabricator.wikimedia.org/T324689 (Ottomata) @gmodena, I've been trying to write tests for [[ https://gitlab.wikimedia.org/repos/data-engineering/eventutilities-python | eventutilities-python ]], so...
[22:11:54] Quarry: it would be useful to run the same Quarry query conveniently in several database - https://phabricator.wikimedia.org/T95582 (Milimetric) At this point, we are making significant progress on near-real-time dumps generation. If that project continues to work out, we could have an alternate view availa...
[22:22:20] Quarry: it would be useful to run the same Quarry query conveniently in several database - https://phabricator.wikimedia.org/T95582 (bd808) >>! In T95582#8500240, @Milimetric wrote: > At this point, we are making significant progress on near-real-time dumps generation. If that project continues to work out,...