[00:33:33] (SystemdUnitFailed) firing: (18) hadoop-yarn-nodemanager.service Failed on an-test-worker1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:33:33] (SystemdUnitFailed) firing: (18) hadoop-yarn-nodemanager.service Failed on an-test-worker1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:55:54] Data-Engineering-Planning, SRE-swift-storage, Event-Platform Value Stream (Sprint 12): Storage request: swift s3 bucket for mediawiki-page-content-change-enrichment checkpointing - https://phabricator.wikimedia.org/T330693 (gmodena) >>! In T330693#8846250, @Eevans wrote: > Ok, this is setup and has b...
[07:18:35] (CR) Kosta Harlan: [C: +2] Add pageviews_token to analytics/mediawiki/mentor_dashboard/visit [schemas/event/secondary] - https://gerrit.wikimedia.org/r/919218 (https://phabricator.wikimedia.org/T325117) (owner: Urbanecm)
[07:20:32] (CR) Kosta Harlan: Add pageviews_token to analytics/mediawiki/mentor_dashboard/visit (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/919218 (https://phabricator.wikimedia.org/T325117) (owner: Urbanecm)
[08:27:00] I'm planning to merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/914928 for adding Iceberg support to spark defaults, unless anyone would rather I defer it.
[08:33:33] (SystemdUnitFailed) firing: (18) hadoop-yarn-nodemanager.service Failed on an-test-worker1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:40:45] @btullis I think it's safe and desirable, merge away
[08:41:30] milimetric: Thanks, that was my reading of it too but I appreciate the second opinion. Merging now.
[08:42:15] Done :-)
[08:50:25] This is very good: T336544 :)
[08:50:26] T336544: Codex, Graph, and Wikistats walk into a bar graph - https://phabricator.wikimedia.org/T336544
[09:21:04] Data-Engineering-Planning, Data Pipelines (Sprint 12), Patch-For-Review: Setup config to allow lineage instrumentation - https://phabricator.wikimedia.org/T333004 (BTullis) Sadly, I believe that the on-premise version of 3/ is incompatible from a licencing perspective :-( https://github.com/sqlparser...
[09:25:38] Data-Engineering-Planning, Event-Platform Value Stream, Machine-Learning-Team: Add a new outlink topic stream for EventGate main - https://phabricator.wikimedia.org/T328899 (achou) @pfischer As far as I know, the Outlink topic model does not use redirect information to predict the topic of an article...
[09:30:36] (PS2) Urbanecm: Add pageviews_token to analytics/mediawiki/mentor_dashboard/visit [schemas/event/secondary] - https://gerrit.wikimedia.org/r/919218 (https://phabricator.wikimedia.org/T325117)
[09:32:30] (CR) Urbanecm: Add pageviews_token to analytics/mediawiki/mentor_dashboard/visit (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/919218 (https://phabricator.wikimedia.org/T325117) (owner: Urbanecm)
[10:39:47] Data-Engineering, Data-Platform-SRE, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 12): Fix hiveserver2 related errors on bullseye hadoop clients and workers - https://phabricator.wikimedia.org/T336281 (BTullis) OK, this is fixed for some hosts. It came about because installing the `hive` pack...
[10:40:23] Data-Engineering-Planning, Patch-For-Review, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 12): Upgrade Hadoop test cluster to Bullseye - https://phabricator.wikimedia.org/T329363 (BTullis)
[11:06:15] btullis: you gotta help me think of a punchline. When I close that task in like two years' time, I want a really good one
[11:08:40] Data-Engineering, Superset: SQL lab access for Andrew McAllister - https://phabricator.wikimedia.org/T335940 (AndrewTavis_WMDE) @Manuel, just a note that this would be something to bring up when we attend the office hours next week :)
[11:54:15] The best I could come up with so far was: "The bartender says, sorry we don't serve tuples here"
[12:00:58] ...but I was still trying for something about RDF or SparQL.
[12:33:33] (SystemdUnitFailed) firing: (18) hadoop-yarn-nodemanager.service Failed on an-test-worker1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:59:43] Data-Engineering-Planning, Data-Platform-SRE, Patch-For-Review, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 12): Upgrade Hadoop test cluster to Bullseye - https://phabricator.wikimedia.org/T329363 (BTullis) OK, so the last remaining issue for hadoop-workers on bullseye is the fact that w...
[13:19:03] Data-Engineering-Planning, Data-Platform-SRE, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 12): Upgrade the spark YARN shuffler service on Hadoop workers from version 2 to 3 - https://phabricator.wikimedia.org/T332765 (BTullis)
[14:33:00] Data-Engineering-Planning, Data Pipelines, Shared-Data-Infrastructure: [Iceberg] Debianize and install iceberg support for Spark, Presto, and optionally Hive - https://phabricator.wikimedia.org/T311738 (xcollazo)
[15:03:53] Data-Engineering-Planning, Data-Platform-SRE, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 12): Upgrade the spark YARN shuffler service on Hadoop workers from version 2 to 3 - https://phabricator.wikimedia.org/T332765 (BTullis) I'm going to follow the excellent example from @xcollazo in {T335...
[15:23:34] Quarry, cloud-services-team (FY2022/2023-Q4): Consider moving Quarry to be an installation of a community supported analytics tool - https://phabricator.wikimedia.org/T169452 (Amire80) I saw the notice on Quarry about suggested move to Superset. Sounds interesting, but is there a manual? In Quarry, it is...
[15:32:23] Quarry, cloud-services-team (FY2022/2023-Q4): Consider moving Quarry to be an installation of a community supported analytics tool - https://phabricator.wikimedia.org/T169452 (rook) >>! In T169452#8847660, @Amire80 wrote: > I saw the notice on Quarry about suggested move to Superset. Sounds interesting,...
[15:41:58] Data-Engineering-Planning, Data Pipelines (Sprint 13): Deprecate old mobile datasets - https://phabricator.wikimedia.org/T329310 (JArguello-WMF)
[15:42:08] Data-Engineering-Planning, Data Pipelines (Sprint 13), Patch-For-Review: Setup config to allow lineage instrumentation - https://phabricator.wikimedia.org/T333004 (JArguello-WMF)
[15:50:29] Data-Engineering-Planning, Data Pipelines (Sprint 13): Deprecate old mobile datasets - https://phabricator.wikimedia.org/T329310 (mforns)
[15:50:51] (PS1) Mforns: Remove deprecated code for AppSessionMetrics [analytics/refinery/source] - https://gerrit.wikimedia.org/r/919357 (https://phabricator.wikimedia.org/T329310)
[15:51:35] Data-Engineering-Planning, Data Pipelines (Sprint 13), Patch-For-Review: Deprecate old mobile datasets - https://phabricator.wikimedia.org/T329310 (mforns)
[15:52:21] Data-Engineering, Event-Platform Value Stream (Sprint 13): Enable HA failover for flink-kubernetes-operator - https://phabricator.wikimedia.org/T336185 (JArguello-WMF)
[15:52:24] Data-Engineering, serviceops-radar, Event-Platform Value Stream (Sprint 13): Store Flink HA metadata in Zookeeper - https://phabricator.wikimedia.org/T331283 (JArguello-WMF)
[15:52:26] Data-Engineering, Event-Platform Value Stream (Sprint 13): Update eventgate helm chart to use automatic kafka egress networkpolicies - https://phabricator.wikimedia.org/T335024 (JArguello-WMF)
[15:52:28] Data-Engineering-Planning, SRE-swift-storage, Event-Platform Value Stream (Sprint 13): Storage request: swift s3 bucket for mediawiki-page-content-change-enrichment checkpointing - https://phabricator.wikimedia.org/T330693 (JArguello-WMF)
[15:52:46] Quarry: superset to external db - https://phabricator.wikimedia.org/T336588 (rook)
[15:52:59] Data-Engineering-Planning, Event-Platform Value Stream (Sprint 13): Flink Enrichment monitoring - https://phabricator.wikimedia.org/T328925 (JArguello-WMF)
[15:53:40] Data-Engineering-Planning, Event-Platform Value Stream (Sprint 13), Patch-For-Review: Improve mediawiki-event-enrichment test suite - https://phabricator.wikimedia.org/T328013 (JArguello-WMF)
[15:53:42] Data-Engineering, Event-Platform Value Stream (Sprint 13): eventutilities-python should support using Kafka TLS ports - https://phabricator.wikimedia.org/T331526 (JArguello-WMF)
[15:54:15] Data-Engineering, Event-Platform Value Stream (Sprint 13), Patch-For-Review: eventutilities-python manager should set up python logging with ECS format - https://phabricator.wikimedia.org/T335802 (JArguello-WMF)
[15:54:58] Data-Engineering-Planning, serviceops, Event-Platform Value Stream (Sprint 13), Patch-For-Review, Service-deployment-requests: New Service Request mediawiki-page-content-change-enrichment - https://phabricator.wikimedia.org/T330507 (JArguello-WMF)
[15:55:09] Data-Engineering-Planning, serviceops-radar, Event-Platform Value Stream (Sprint 13): mediawiki-event-enrichment in k8s should use mwapi-async envoy listener for MW api requests - https://phabricator.wikimedia.org/T333575 (JArguello-WMF)
[15:55:20] Data-Engineering, serviceops, Event-Platform Value Stream (Sprint 13), Patch-For-Review: New Service Request: flink-kubernetes-operator - https://phabricator.wikimedia.org/T333464 (JArguello-WMF)
[15:55:43] Data-Engineering, Metrics-Platform-Planning, Product-Analytics, WMF-Architecture-Team, Event-Platform Value Stream (Sprint 13): Major (API) versioning of Event Platform streams - https://phabricator.wikimedia.org/T332212 (JArguello-WMF)
[15:56:12] Data-Engineering, Machine-Learning-Team, Research, Event-Platform Value Stream (Sprint 12): Design event schema for ML scores/recommendations on current page state - https://phabricator.wikimedia.org/T331401 (JArguello-WMF) @achou is it ok if I close this ticket?
[16:03:07] Quarry: Superset: cannot deploy trove at 10GB - https://phabricator.wikimedia.org/T336589 (rook)
[16:03:53] Data-Engineering-Planning, Data Pipelines (Sprint 13), Patch-For-Review: Deprecate old mobile datasets - https://phabricator.wikimedia.org/T329310 (mforns)
[16:05:06] !log dropped mobile_apps_* hive tables because of https://phabricator.wikimedia.org/T329310
[16:05:09] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:15:01] (CR) Milimetric: "Nice. Overall this is looking good, a few more tweaks." [analytics/refinery] - https://gerrit.wikimedia.org/r/914799 (owner: Nick Ifeajika)
[16:33:33] (SystemdUnitFailed) firing: (18) hadoop-yarn-nodemanager.service Failed on an-test-worker1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:00:25] Data-Engineering, Data-Persistence, Event-Platform Value Stream, IP Masking, Platform Engineering: MediaWiki user types - https://phabricator.wikimedia.org/T336176 (pmiazga) After learning about the complexity (types of users, cases when we have no record in `user`, bots handling) I understan...
[17:27:56] Data-Engineering, Machine-Learning-Team, Research, Event-Platform Value Stream (Sprint 12): Design event schema for ML scores/recommendations on current page state - https://phabricator.wikimedia.org/T331401 (achou) @JArguello-WMF Yep, it's finished. Thank you! :)
[17:28:59] Data-Engineering, Machine-Learning-Team, Research, Event-Platform Value Stream (Sprint 12): Design event schema for ML scores/recommendations on current page state - https://phabricator.wikimedia.org/T331401 (JArguello-WMF) Open→Resolved
[17:29:05] Data-Engineering, Event-Platform Value Stream, Machine-Learning-Team, Research: Proposal: Create a stream end point for Revision Risk Model - https://phabricator.wikimedia.org/T326179 (JArguello-WMF)
[18:02:49] Data-Engineering-Planning, SRE-swift-storage, Event-Platform Value Stream (Sprint 13): Storage request: swift s3 bucket for mediawiki-page-content-change-enrichment checkpointing - https://phabricator.wikimedia.org/T330693 (Eevans) Open→Resolved
[18:02:55] Data-Engineering-Planning, serviceops, Event-Platform Value Stream (Sprint 13), Patch-For-Review, Service-deployment-requests: New Service Request mediawiki-page-content-change-enrichment - https://phabricator.wikimedia.org/T330507 (Eevans)
[18:02:57] Data-Engineering-Planning, SRE-swift-storage, Event-Platform Value Stream (Sprint 13): Storage request: swift s3 bucket for mediawiki-page-content-change-enrichment checkpointing - https://phabricator.wikimedia.org/T330693 (Eevans) >>! In T330693#8846701, @gmodena wrote: >>>! In T330693#8846250, @Eev...
[18:16:34] (CR) Kosta Harlan: [C: +2] Add pageviews_token to analytics/mediawiki/mentor_dashboard/visit [schemas/event/secondary] - https://gerrit.wikimedia.org/r/919218 (https://phabricator.wikimedia.org/T325117) (owner: Urbanecm)
[18:17:09] (Merged) jenkins-bot: Add pageviews_token to analytics/mediawiki/mentor_dashboard/visit [schemas/event/secondary] - https://gerrit.wikimedia.org/r/919218 (https://phabricator.wikimedia.org/T325117) (owner: Urbanecm)
[20:16:36] (CR) Milimetric: "I think maybe the tests are using node 12 or higher, because node 10 hasn't been supported for a long time. If so, that would explain the" [analytics/aqs] - https://gerrit.wikimedia.org/r/915678 (owner: Nick Ifeajika)
[20:22:08] (PS2) Urbanecm: [WIP] Add analytics/mediawiki/mentor_dashboard/interaction [schemas/event/secondary] - https://gerrit.wikimedia.org/r/919236 (https://phabricator.wikimedia.org/T325117)
[20:22:35] (CR) CI reject: [V: -1] [WIP] Add analytics/mediawiki/mentor_dashboard/interaction [schemas/event/secondary] - https://gerrit.wikimedia.org/r/919236 (https://phabricator.wikimedia.org/T325117) (owner: Urbanecm)
[20:25:26] (PS3) Urbanecm: [WIP] Add analytics/mediawiki/mentor_dashboard/interaction [schemas/event/secondary] - https://gerrit.wikimedia.org/r/919236 (https://phabricator.wikimedia.org/T325117)
[20:26:22] (CR) CI reject: [V: -1] [WIP] Add analytics/mediawiki/mentor_dashboard/interaction [schemas/event/secondary] - https://gerrit.wikimedia.org/r/919236 (https://phabricator.wikimedia.org/T325117) (owner: Urbanecm)
[20:32:48] (PS4) Urbanecm: [WIP] Add analytics/mediawiki/mentor_dashboard/interaction [schemas/event/secondary] - https://gerrit.wikimedia.org/r/919236 (https://phabricator.wikimedia.org/T325117)
[20:33:33] (SystemdUnitFailed) firing: (18) hadoop-yarn-nodemanager.service Failed on an-test-worker1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:46:04] (PS5) Urbanecm: [WIP] Add analytics/mediawiki/mentor_dashboard/interaction [schemas/event/secondary] - https://gerrit.wikimedia.org/r/919236 (https://phabricator.wikimedia.org/T325117)