[01:12:48] (03Abandoned) 10Milimetric: Code clean up [analytics/aqs] - 10https://gerrit.wikimedia.org/r/919894 (owner: 10Nick Ifeajika) [01:15:51] (03CR) 10Milimetric: [C: 04-1] "Please also send updates to the knowledge-gap.yaml pointed out by Joseph here: https://gerrit.wikimedia.org/r/c/analytics/aqs/+/915678/8/v" [analytics/aqs] - 10https://gerrit.wikimedia.org/r/933603 (https://phabricator.wikimedia.org/T337059) (owner: 10Nick Ifeajika) [01:17:18] (03CR) 10Milimetric: [V: 03+2 C: 03+2] "Good work, this worked to load data in production, merging." [analytics/refinery] - 10https://gerrit.wikimedia.org/r/914799 (https://phabricator.wikimedia.org/T337059) (owner: 10Nick Ifeajika) [02:02:49] (03PS1) 10Kimberly Sarabia: Fix editattemptstep ref [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/934032 (https://phabricator.wikimedia.org/T337270) [08:55:04] joal: Should we go for an airflow upgrade on an-launcher1002 today? [08:55:47] hi btullis - Let's do! [09:00:43] joal: OK! [09:34:46] btullis: I'll lunch with my daughter in about 1/2h - Can we postpone the migration to early afternoon (like in 2h)? [09:35:37] Yes, sure. I'm currently rebasing the gitlab MR, but it was made a bit more complicated by this one: https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/417 [09:36:40] Ah yes of course :S Let me know if I can help [09:52:46] The new tests failed, but I'm not sure if I had to trigger a manual run of the build step beforehand. That's running now: https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/pipelines/21446 [09:53:36] aqu might have some ideas of whether I've done something wrong during the rebase, or whether it's an issue with the way the new CI pipeline is configured. [09:54:02] Hello btullis, checking... [09:54:17] aqu: Many thanks. [09:56:16] Yes, the test/lint container images need to be created for this new environment version. 
[09:57:51] Would it have been done automatically if I had been doing this from scratch (as opposed to rebasing a change from before the CI revamp), or will this step always need to be run manually? [10:08:58] aqu: Thanks for the additional commit [10:09:29] Currently it's a manual step. https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob/5e77b014905b7d14d7c2199f12c9f13236b23852/.gitlab-ci.yml#L54 [10:09:29] Automatizing this process touches the subject "how do we manage versions". [10:11:26] I'm going to write some info in the doc + wikitech about the current process. [10:11:32] Yes, it looks like there are at least four or five places in airflow-dags where we have to update the main version number and package name, even before we get to versions of constituent components like python. [10:12:49] One failed test now. [10:14:09] Looks like it's related to the spark-submit path. [10:14:17] Yes, I suppose, some fixtures should be regenerated. [10:17:48] btw btullis, I wrote this doc about upgrading the debian pkg recently: https://wikitech.wikimedia.org/wiki/Data_Engineering/Systems/Airflow/Developer_guide#Airflow_Debian_package_upgrade_process [10:17:48] You may correct with your work to upgrade to 2.6.1. [10:19:01] Great, thanks. I'm just wondering, what about if we used the `changes` parameter in the `build-image-conda_env` to run this job automatically whenever the Dockerfile is changed, or a similar rule? 
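A minimal sketch of the idea raised at 10:19:01, for readers of this log. It assumes the `build-image-conda_env` job keeps its existing `stage` and `script` keys and only gains a `rules` section. The `Dockerfile` and `conda-environment.yml` paths are hypothetical placeholders, not taken from the airflow-dags repository.

```yaml
# Fragment to merge into the existing job definition in .gitlab-ci.yml.
# Assumption: the watched paths below are illustrative only.
build-image-conda_env:
  rules:
    # Run automatically when any of the listed files changes.
    - changes:
        - Dockerfile
        - conda-environment.yml
      when: on_success
    # Otherwise keep today's behaviour: the job stays available as a manual step.
    - when: manual
      allow_failure: true
```

With `allow_failure: true` on the manual rule, a pipeline that does not touch the watched files is not blocked waiting for the manual job.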
[10:19:04] https://docs.gitlab.com/ee/ci/jobs/job_control.html#complex-rules [11:02:20] (03PS1) 10Btullis: Fix the path to the init.sql file in the mysql-setup container [analytics/datahub] (wmf) - 10https://gerrit.wikimedia.org/r/934290 (https://phabricator.wikimedia.org/T329514) [11:10:41] 10Analytics-Radar, 10Data-Engineering-Icebox, 10Discovery-Search, 10Reading-Admin, and 3 others: Image Classification Working Group - https://phabricator.wikimedia.org/T215413 (10Miriam) [11:31:06] aqu: joal: I had a quick go at fixing the test failures, but it didn't work: https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/385/diffs?commit_id=eceee33c3a21a56f8048bb6b6c350ffd5a99644b [11:31:40] I confess that I'm not sure what I'm doing very much here. [11:31:55] (03CR) 10Btullis: [C: 03+2] Fix the path to the init.sql file in the mysql-setup container [analytics/datahub] (wmf) - 10https://gerrit.wikimedia.org/r/934290 (https://phabricator.wikimedia.org/T329514) (owner: 10Btullis) [11:36:43] Hi btullis - checking [11:38:13] (03CR) 10CI reject: [V: 04-1] Fix the path to the init.sql file in the mysql-setup container [analytics/datahub] (wmf) - 10https://gerrit.wikimedia.org/r/934290 (https://phabricator.wikimedia.org/T329514) (owner: 10Btullis) [11:51:07] ping btullis /aqu - is it expected that even for virtual-env spark-jobs we use `spark-submit` and not `venv/bin/spark-submit`? [11:51:32] I'm ok to update the test, but I wonder if this is really the expected behavior [11:54:56] Actually, this looks like a bug [11:55:03] or a regression [11:55:07] if I may say [11:55:10] if I may say [11:56:35] I would have to defer to someone more knowledgeable on the matter. [11:57:31] continuing my investigations [11:59:40] btullis: while I'm investigating the airflow stuff, would you have a minute to check something for me please? 
[12:15:50] ok, we have an issue [12:21:13] 10Data-Platform-SRE, 10SRE, 10decommission-hardware, 10ops-eqiad: Decommission an-test-coord1002 - https://phabricator.wikimedia.org/T336062 (10Jclark-ctr) [12:21:20] 10Data-Platform-SRE, 10SRE, 10decommission-hardware, 10ops-eqiad: Decommission an-test-coord1002 - https://phabricator.wikimedia.org/T336062 (10Jclark-ctr) 05Open→03Resolved [12:21:30] joal: Yes, I'm here. How can I help? [12:23:35] btullis: that's weird - the thing that makes our patches fail is the patch I merged yesterday [12:23:41] btullis: batcave quickly? [12:23:57] Yep. [12:46:29] (03CR) 10Btullis: [C: 03+2] "recheck" [analytics/datahub] (wmf) - 10https://gerrit.wikimedia.org/r/934290 (https://phabricator.wikimedia.org/T329514) (owner: 10Btullis) [12:49:00] (03Abandoned) 10Btullis: Experimental refactor of the datahub container build process [analytics/datahub] (wmf) - 10https://gerrit.wikimedia.org/r/900310 (https://phabricator.wikimedia.org/T301453) (owner: 10Btullis) [13:03:24] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Patch-For-Review: Upgrade Hadoop test cluster to Bullseye - https://phabricator.wikimedia.org/T329363 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1001 for host an-test-worker1003.eqiad.wmnet with OS bullseye [13:03:45] aqu, mforns - is anyone of you there? [13:03:59] I'd like a quick talk on airflow/python [13:04:00] please :) [13:04:01] Hey [13:04:24] Heya aqu - batcave? 
[13:04:29] ok [13:05:41] (03Merged) 10jenkins-bot: Fix the path to the init.sql file in the mysql-setup container [analytics/datahub] (wmf) - 10https://gerrit.wikimedia.org/r/934290 (https://phabricator.wikimedia.org/T329514) (owner: 10Btullis) [13:08:49] 10Data-Engineering, 10Event-Platform Value Stream (Sprint 14 B): All eventgate clusters should be able to use remote schema repos - https://phabricator.wikimedia.org/T340166 (10JArguello-WMF) 05Open→03Resolved [13:08:52] 10Data-Engineering, 10Event-Platform Value Stream (Sprint 14 B): Use ECS logging fields when adding extra info to mediawiki-event-enrichment - https://phabricator.wikimedia.org/T337399 (10JArguello-WMF) 05Open→03Resolved [13:08:56] 10Data-Engineering, 10Event-Platform Value Stream (Sprint 14 B): Update eventgate and eventstreams helm chart to use automatic kafka egress networkpolicies and envoy service mesh - https://phabricator.wikimedia.org/T335024 (10JArguello-WMF) 05Open→03Resolved [13:08:59] 10Data-Engineering-Planning, 10Data Pipelines (Sprint 14), 10Event-Platform Value Stream (Sprint 14 B): Event partitions missing since 2023-02-21T10:00 for stream without events (canary events not produced?) 
- https://phabricator.wikimedia.org/T330236 (10JArguello-WMF) 05Open→03Resolved [13:09:01] 10Data-Engineering-Planning, 10Event-Platform Value Stream (Sprint 14 B): Improve Event Platform and MediaWiki Event Enrichment wikitech documentation - https://phabricator.wikimedia.org/T329629 (10JArguello-WMF) 05Open→03Resolved [13:09:03] 10Data-Engineering-Planning, 10Event-Platform Value Stream, 10Epic: Event Platform Value Stream Documentation Tasks - https://phabricator.wikimedia.org/T329628 (10JArguello-WMF) [13:11:02] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Patch-For-Review: Upgrade Hadoop test cluster to Bullseye - https://phabricator.wikimedia.org/T329363 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1001 for host an-test-worker1003.eqiad.wmnet with OS bullseye executed... [13:12:17] 10Data-Engineering, 10Event-Platform Value Stream: mediawiki-event-enrichment: changes to test image seem to be ignored in CI - https://phabricator.wikimedia.org/T340195 (10JArguello-WMF) [13:12:37] 10Data-Engineering, 10Event-Platform Value Stream, 10serviceops-radar: [NEEDS GROOMING] Store Flink HA metadata in Zookeeper - https://phabricator.wikimedia.org/T331283 (10JArguello-WMF) [13:12:53] 10Data-Engineering, 10Discovery-Search, 10Event-Platform Value Stream, 10serviceops-radar: [NEEDS GROOMING] Store Flink HA metadata in Zookeeper - https://phabricator.wikimedia.org/T331283 (10JArguello-WMF) [13:14:21] 10Data-Engineering, 10serviceops, 10Event-Platform Value Stream (Sprint 14 B): Flink k8s operator in staging sometimes will not sync changes to FlinkDeployments - https://phabricator.wikimedia.org/T340059 (10JArguello-WMF) [13:14:25] 10Data-Engineering, 10Discovery-Search, 10Event-Platform Value Stream (Sprint 14 B): Flink Enrichment job alerting - https://phabricator.wikimedia.org/T340666 (10JArguello-WMF) [13:14:29] 10Data-Engineering, 10Event-Platform Value Stream (Sprint 14 B): mw-page-content-change-enrich should partition by 
and process by wiki_id,page_id - https://phabricator.wikimedia.org/T338169 (10JArguello-WMF) [13:14:31] 10Data-Engineering, 10Event-Platform Value Stream (Sprint 14 B): jsonschema-tools tests should fail if schema $id does not match title or path - https://phabricator.wikimedia.org/T300404 (10JArguello-WMF) [13:15:15] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Patch-For-Review: Upgrade Hadoop test cluster to Bullseye - https://phabricator.wikimedia.org/T329363 (10BTullis) I tried downgrading the NIC firmware from `21.81.3` to `21.80.9` but that didn't solve the issue. {F37123139,width=80%} [13:21:01] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Patch-For-Review: Upgrade Hadoop test cluster to Bullseye - https://phabricator.wikimedia.org/T329363 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1001 for host an-test-worker1003.eqiad.wmnet with OS bullseye [13:27:27] btullis: problem found and solved on airflow (thanks mforns :) [13:27:44] btullis: we can merge and deploy the patch once CI succeeds [13:28:39] btullis stevemunene I think we all forgot about the pairing session ;) . Let's skip for today [13:30:11] was there for a while, thought I read something wrong then left. Thanks inflatador Let's do it next week :) [13:30:33] Cool, sorry for missing the beginning as well... [13:33:23] Gah, sorry inflatador. stevemunene - I got totally wrapped up in what I was doing and missed my calendar pop-up. [13:33:37] joal: Great! [13:33:49] btullis: ready to be merged :) [13:35:49] 10Data-Platform-SRE, 10Patch-For-Review: Upgrade Airflow to version 2.6.1 - https://phabricator.wikimedia.org/T336286 (10CodeReviewBot) btullis merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/385 Update airflow to version 2.6.1 [13:36:25] joal: merged. Should I pause all of the DAGs on the analytics instance? Do you want to do the upgrade in the batcave together? 
[13:38:11] I'm in the batcave with Marcel [13:39:08] 10Data-Engineering, 10Data-Platform-SRE, 10CAS-SSO, 10Data-Catalog, 10Infrastructure-Foundations: Switch DataHub authentication to OIDC - https://phabricator.wikimedia.org/T305874 (10JArguello-WMF) [13:40:15] 10Data-Platform-SRE: Add the sparkctl binary to the stat boxes - https://phabricator.wikimedia.org/T318923 (10JArguello-WMF) [13:40:18] 10Data-Platform-SRE: Decide whether to migrate from Presto to Trino - https://phabricator.wikimedia.org/T266640 (10JArguello-WMF) [13:40:22] 10Data-Engineering-Kanban, 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Cassandra, 10User-Eevans: Properly add aqsloader user (w/ secrets) - https://phabricator.wikimedia.org/T305600 (10JArguello-WMF) [13:40:51] 10Data-Engineering-Planning, 10Data-Platform-SRE: Research and test methods for accessing kerberized services from spark running on the DSE K8S cluster - https://phabricator.wikimedia.org/T330162 (10JArguello-WMF) [13:40:54] 10Data-Platform-SRE: Set up Spark SQL Server - https://phabricator.wikimedia.org/T324017 (10JArguello-WMF) [13:40:56] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Data-Catalog: Datahub user records are not being created after login - https://phabricator.wikimedia.org/T327884 (10JArguello-WMF) [13:40:59] 10Data-Platform-SRE: SPIKE: Spin up a Test Trino instance (Evaluate Trino) - https://phabricator.wikimedia.org/T324011 (10JArguello-WMF) [13:41:01] 10Data-Engineering-Planning: Cleanup User Hive Databases - https://phabricator.wikimedia.org/T323884 (10JArguello-WMF) [13:41:03] 10Data-Platform-SRE, 10Data-Catalog: null shown in the user profile dropdown in datahub - https://phabricator.wikimedia.org/T327969 (10JArguello-WMF) [13:41:37] 10Data-Platform-SRE: Deploy spak cli to submit jobs on DSE K8S cluster with K8S config - https://phabricator.wikimedia.org/T331971 (10JArguello-WMF) [13:41:41] 10Data-Engineering-Planning, 10Data-Platform-SRE: Add optional TLS encryption to the druid-public-broker - 
https://phabricator.wikimedia.org/T331631 (10JArguello-WMF) [13:41:45] 10Data-Engineering-Planning, 10Data-Platform-SRE: Make YARN web interface work with both primary and standby resourcemanager - https://phabricator.wikimedia.org/T331448 (10JArguello-WMF) [13:41:49] 10Data-Platform-SRE: Rework our gitlab runner on VPS Cloud - https://phabricator.wikimedia.org/T330915 (10JArguello-WMF) [13:41:53] 10Data-Engineering-Planning, 10Data-Platform-SRE: Upgrade Stats clients to bullseye - https://phabricator.wikimedia.org/T329360 (10JArguello-WMF) [13:42:01] 10Data-Platform-SRE, 10Patch-For-Review: Deploy spark history - https://phabricator.wikimedia.org/T330176 (10JArguello-WMF) [13:42:05] 10Data-Engineering-Planning, 10Data-Platform-SRE: Move archiva to private IPs + CDN - https://phabricator.wikimedia.org/T317182 (10JArguello-WMF) [13:42:09] 10Data-Engineering-Planning: Late events in wdqs-external.sparql-query? - https://phabricator.wikimedia.org/T310790 (10JArguello-WMF) [13:43:11] 10Data-Platform-SRE, 10Foundational Technology Requests, 10Epic: POC for Running Spark on DSE - https://phabricator.wikimedia.org/T318712 (10JArguello-WMF) [13:43:13] 10Data-Engineering, 10Data-Platform-SRE, 10Epic: Data Infrastructure as a Service MVP - https://phabricator.wikimedia.org/T308317 (10JArguello-WMF) [13:43:16] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Epic: Install Ceph Cluster for Data Engineering - https://phabricator.wikimedia.org/T324660 (10JArguello-WMF) [13:43:18] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Epic: Upgrade the Data Engineering infrastructure to Debian Bullseye - https://phabricator.wikimedia.org/T288804 (10JArguello-WMF) [13:44:09] !log upgrading airflow on an-launcher1002 to version 2.6.1 [13:44:11] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [13:44:29] 10Data-Engineering, 10Data-Platform-SRE: Send a critical alert to data-engineering if produce_canary_events isn't running correctly - 
https://phabricator.wikimedia.org/T337055 (10JArguello-WMF) [13:44:31] 10Data-Engineering, 10Data-Platform-SRE, 10Patch-For-Review: Add checksumming of miniconda installer - https://phabricator.wikimedia.org/T337271 (10JArguello-WMF) [13:44:33] 10Data-Engineering, 10Data-Platform-SRE: Decommission kafka-jumbo100[1-6] - https://phabricator.wikimedia.org/T336044 (10JArguello-WMF) [13:44:35] 10Data-Engineering, 10Data-Platform-SRE: Decommission druid100[4-6] - https://phabricator.wikimedia.org/T336043 (10JArguello-WMF) [13:44:37] 10Data-Engineering, 10Data-Platform-SRE: Bring an-coord100[3-4] into service - https://phabricator.wikimedia.org/T336045 (10JArguello-WMF) [13:44:40] 10Data-Engineering, 10Data-Platform-SRE: Bring stat1010 into service with GPU from stat1005 - https://phabricator.wikimedia.org/T336040 (10JArguello-WMF) [13:44:42] 10Data-Engineering, 10Data-Platform-SRE: Bring kafka-jumbo10[09-15] into service - https://phabricator.wikimedia.org/T336041 (10JArguello-WMF) [13:44:44] 10Data-Engineering, 10Data-Platform-SRE: Bring druid10[09-11] into service - https://phabricator.wikimedia.org/T336042 (10JArguello-WMF) [13:44:46] 10Data-Platform-SRE: Provide common spark config for spark jobs - https://phabricator.wikimedia.org/T332913 (10JArguello-WMF) [13:44:49] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Data Pipelines: Add support for Iceberg to the Spark Docker Image - https://phabricator.wikimedia.org/T336012 (10JArguello-WMF) [13:44:51] 10Data-Platform-SRE: Provide common hive config for spark jobs - https://phabricator.wikimedia.org/T332912 (10JArguello-WMF) [13:44:55] 10Data-Engineering-Planning, 10Data-Platform-SRE: Upgrade the druid-analytics cluster to bullseye - https://phabricator.wikimedia.org/T332604 (10JArguello-WMF) [13:44:56] 10Data-Platform-SRE, 10Patch-For-Review: Provide common hadooop config for spark jobs - https://phabricator.wikimedia.org/T332909 (10JArguello-WMF) [13:44:58] 10Data-Engineering-Planning, 10Data-Platform-SRE: 
Upgrade the druid-public cluster to bullseye - https://phabricator.wikimedia.org/T332589 (10JArguello-WMF) [13:45:01] 10Data-Platform-SRE, 10Patch-For-Review: Spark-deploy need to create secret object in spark namespace - https://phabricator.wikimedia.org/T332908 (10JArguello-WMF) [13:45:04] 10Data-Engineering-Planning, 10Data-Platform-SRE: Upgrade an-launcher1002 to bullseye - https://phabricator.wikimedia.org/T332580 (10JArguello-WMF) [13:45:06] 10Data-Engineering-Planning, 10Data-Platform-SRE: Upgrade hadoop standby master to bullseye - https://phabricator.wikimedia.org/T332578 (10JArguello-WMF) [13:45:09] 10Data-Engineering-Planning, 10Data-Platform-SRE: Upgrade hadoop master to bullseye - https://phabricator.wikimedia.org/T332573 (10JArguello-WMF) [13:45:11] 10Data-Platform-SRE: Cleanup HDFS folders for departed users - https://phabricator.wikimedia.org/T332321 (10JArguello-WMF) [13:45:13] 10Data-Engineering-Planning, 10Data-Platform-SRE: Refresh hadoop coordinators an-coord100[1-2] with an-coord[3-4] - https://phabricator.wikimedia.org/T332572 (10JArguello-WMF) [13:45:15] 10Data-Engineering-Planning, 10Data-Platform-SRE: Upgrade hadoop workers to bullseye - https://phabricator.wikimedia.org/T332570 (10JArguello-WMF) [13:45:16] 10Data-Platform-SRE: Deploy timeline server - https://phabricator.wikimedia.org/T331133 (10JArguello-WMF) [13:45:20] 10Data-Engineering-Planning, 10Data-Platform-SRE: Configure load-balancing approriate for ceph radosgw services on the data-engineering cluster - https://phabricator.wikimedia.org/T330153 (10JArguello-WMF) [13:45:22] 10Data-Platform-SRE, 10cloud-services-team: ceph: introduce puppet logic to purge stale keyfiles - https://phabricator.wikimedia.org/T328010 (10JArguello-WMF) [13:45:24] 10Data-Platform-SRE: Getting the Metrics API (K8) functioning to support Auto Scaling - https://phabricator.wikimedia.org/T318925 (10JArguello-WMF) [13:45:26] 10Data-Engineering, 10Data-Platform-SRE, 10Data Pipelines: Airflow scheduler and 
webserver logs should be readable by airflow instance admins - https://phabricator.wikimedia.org/T304615 (10JArguello-WMF) [13:45:28] 10Data-Engineering-Planning, 10Data-Platform-SRE: Replace db1108 with db1208 - https://phabricator.wikimedia.org/T334055 (10JArguello-WMF) [13:45:30] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Data-Catalog: Review and improve the build process for DataHub containers - https://phabricator.wikimedia.org/T303381 (10JArguello-WMF) [13:45:32] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Patch-For-Review: Pageview definition relies on X-Analytics to determine special pages - https://phabricator.wikimedia.org/T304362 (10JArguello-WMF) [13:45:34] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10DC-Ops, 10SRE, 10ops-eqiad: Q3:rack/setup/install an-worker11[49-56] - https://phabricator.wikimedia.org/T327295 (10JArguello-WMF) [13:45:37] 10Data-Engineering-Planning, 10Data-Platform-SRE: Refactor analytics-meta MariaDB layout to use an-mariadb100[12] - https://phabricator.wikimedia.org/T284150 (10JArguello-WMF) [13:45:39] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Event-Platform Value Stream, 10EventStreams: Implement server side filtering for EventStreams (if we should) - https://phabricator.wikimedia.org/T152731 (10JArguello-WMF) [13:45:40] 10Data-Engineering, 10Data-Platform-SRE, 10Patch-For-Review: Add a presto query logger - https://phabricator.wikimedia.org/T269832 (10JArguello-WMF) [13:45:43] 10Data-Engineering, 10Data-Platform-SRE, 10SRE, 10User-MoritzMuehlenhoff: Hadoop MapReduce port range cannot be configured to a fixed range - https://phabricator.wikimedia.org/T111433 (10JArguello-WMF) [13:45:46] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Infrastructure-Foundations: Also intake Network Error Logging events into the Analytics Data Lake - https://phabricator.wikimedia.org/T304373 (10JArguello-WMF) [13:46:02] 10Data-Platform-SRE, 10API Platform, 10Anti-Harassment, 10Cloud-Services, and 18 others: 
Migrate PipelineLib repos to GitLab - https://phabricator.wikimedia.org/T332953 (10JArguello-WMF) [13:49:20] 10Data-Platform-SRE, 10Patch-For-Review: Upgrade Airflow to version 2.6.1 - https://phabricator.wikimedia.org/T336286 (10BTullis) We've paused all of the DAGs on the analytics instance manually, then merged the puppet patch and it has applied cleanly. ` btullis@an-launcher1002:~$ sudo run-puppet-agent Info: Usi... [13:49:24] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Patch-For-Review: Upgrade Hadoop test cluster to Bullseye - https://phabricator.wikimedia.org/T329363 (10BTullis) I tried downgrading again, from `21.80.8` to `21.60.16`, but that didn't help either. {F37123155,width=80%} [13:51:54] 10Data-Platform-SRE, 10Patch-For-Review: Upgrade Airflow to version 2.6.1 - https://phabricator.wikimedia.org/T336286 (10BTullis) We applied the migrations by running `sudo -u analytics airflow-analytics db upgrade` ` btullis@an-launcher1002:~$ sudo -u analytics airflow-analytics db check /usr/lib/airflow/lib... 
[13:57:07] 10Data-Engineering-Planning, 10Event-Platform Value Stream, 10Epic: [Event Platform] Design and Implement realtime enrichment pipeline for MW page change with content - https://phabricator.wikimedia.org/T307959 (10JArguello-WMF) [13:57:09] 10Data-Engineering, 10Event-Platform Value Stream (Sprint 14 B), 10Patch-For-Review: mw-page-content-change-enrich should enable HA with k8s ConfigMaps - https://phabricator.wikimedia.org/T338233 (10JArguello-WMF) 05Open→03Resolved [13:57:11] 10Data-Engineering-Planning, 10Event-Platform Value Stream, 10Epic: [Event Platform] Design and Implement realtime enrichment pipeline for MW page change with content - https://phabricator.wikimedia.org/T307959 (10JArguello-WMF) [13:57:13] 10Data-Engineering-Planning, 10Event-Platform Value Stream (Sprint 14 B), 10Patch-For-Review: [Event Platform] Understand, document, and implement error handling and retry logic when fetching data from the MW api - https://phabricator.wikimedia.org/T309699 (10JArguello-WMF) 05Open→03Resolved [13:57:15] 10Data-Platform-SRE, 10Patch-For-Review: Upgrade Airflow to version 2.6.1 - https://phabricator.wikimedia.org/T336286 (10BTullis) We unpaused all DAGs. So far it looks good, but we will keep an eye out for any failures. [14:00:13] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Patch-For-Review: Upgrade Hadoop test cluster to Bullseye - https://phabricator.wikimedia.org/T329363 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1001 for host an-test-worker1003.eqiad.wmnet with OS bullseye executed... 
[14:34:38] (03PS1) 10Nmaphophe: Escaped the quotes in the comments in the hql files [analytics/refinery] - 10https://gerrit.wikimedia.org/r/933676 [14:35:30] 10Analytics-Radar, 10Data-Engineering: stat1005: failing systemd job - https://phabricator.wikimedia.org/T330671 (10Aklapper) (Adding #Data-Engineering project tag for re-triage, as #Analytics-Radar is an inactive project tag after #Analytics got archived) [14:35:33] 10Analytics-Radar, 10Data-Engineering, 10Growth-Team, 10Browser-Support-Apple-Safari: Investigate how Safari in iOS 17 and macOS Sonoma will impact URLs generated in Wikimedia sites - https://phabricator.wikimedia.org/T338571 (10Aklapper) (Adding #Data-Engineering project tag for re-triage, as #Analytics-R... [14:35:55] 10Analytics-Radar, 10Data-Engineering, 10Infrastructure-Foundations, 10netops: Errors for ifup@ens5.service after rebooting Ganeti VMs - https://phabricator.wikimedia.org/T273026 (10Aklapper) (Adding #Data-Engineering project tag for re-triage, as #Analytics-Radar is an inactive project tag after #Analytic... [14:36:26] 10Analytics-Radar, 10Data-Engineering, 10Patch-For-Review, 10User-TheDJ: Remove old origin-when-crossorigin Safari misspelling of referrer policy - https://phabricator.wikimedia.org/T338183 (10Aklapper) (Adding #Data-Engineering project tag for re-triage, as #Analytics-Radar is an inactive project tag afte... [14:45:10] (03CR) 10Mforns: [V: 03+2 C: 03+2] "LGTM!" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/933676 (owner: 10Nmaphophe) [15:11:54] FYI, Cassandra restarts on AQS hosts are completed [15:13:35] moritzm: Many thanks indeed! 
[15:18:05] Looks like our airflow migration was successful :) Thanks btullis, mforns and aqu :) [15:18:19] yay [15:18:42] Cool [15:41:13] 10Data-Engineering, 10Epic: Data Catalog MVP - https://phabricator.wikimedia.org/T299910 (10JArguello-WMF) [15:42:48] 10Data-Engineering-Planning: Ingest feature Hive schema into datahub - https://phabricator.wikimedia.org/T326598 (10JArguello-WMF) [15:42:51] 10Data-Engineering-Planning, 10Event-Platform Value Stream: Event Platform and DataHub Integration - https://phabricator.wikimedia.org/T318863 (10JArguello-WMF) [15:42:53] 10Data-Engineering-Planning: Re-enable Public Druid metadata ingestion - https://phabricator.wikimedia.org/T311547 (10JArguello-WMF) [15:42:57] 10Data-Engineering, 10Product-Analytics: Propagate field descriptions from event schemas to Hive event tables - https://phabricator.wikimedia.org/T307040 (10JArguello-WMF) [15:42:59] 10Data-Engineering, 10Data Pipelines: Integrate Airflow with DataHub - https://phabricator.wikimedia.org/T306977 (10JArguello-WMF) [15:43:01] 10Data-Engineering-Planning: DataHub rights assignment is case-sensitive - https://phabricator.wikimedia.org/T309382 (10JArguello-WMF) [15:43:03] 10Data-Engineering-Planning, 10Data-Platform-SRE: Datahub user records are not being created after login - https://phabricator.wikimedia.org/T327884 (10JArguello-WMF) [15:43:05] 10Data-Engineering-Planning, 10Data Pipelines: Spike: Integrate Spark with DataHub - https://phabricator.wikimedia.org/T306896 (10JArguello-WMF) [15:43:08] 10Data-Engineering-Planning, 10Metrics-Platform-Planning, 10Product-Analytics: [Metrics Platform] Catalog/dictionary of available standard fields - https://phabricator.wikimedia.org/T267251 (10JArguello-WMF) [15:43:10] 10Data-Platform-SRE: null shown in the user profile dropdown in datahub - https://phabricator.wikimedia.org/T327969 (10JArguello-WMF) [15:43:12] 10Data-Engineering, 10Data-Platform-SRE, 10CAS-SSO, 10Infrastructure-Foundations: Switch DataHub authentication to 
OIDC - https://phabricator.wikimedia.org/T305874 (10JArguello-WMF) [15:43:16] 10Data-Engineering-Planning, 10Data-Platform-SRE: Review and improve the build process for DataHub containers - https://phabricator.wikimedia.org/T303381 (10JArguello-WMF) [15:43:45] 10Data-Engineering-Planning, 10Data-Catalog: Document the Pageviews Dataset - https://phabricator.wikimedia.org/T308047 (10JArguello-WMF) 05Open→03Resolved [15:43:58] 10Data-Engineering-Planning, 10Data-Catalog: Adding Datasets: MediaWiki History - https://phabricator.wikimedia.org/T307701 (10JArguello-WMF) 05Open→03Resolved a:03JArguello-WMF [15:45:53] 10Data-Engineering-Planning: Establish a Business Glossary - https://phabricator.wikimedia.org/T311524 (10JArguello-WMF) [15:46:06] 10Data-Engineering: Connect Kafka to the MVP [Mile Stone 5] - https://phabricator.wikimedia.org/T299899 (10JArguello-WMF) [15:46:35] joal: Great! I can send out some comms to teams about upgrading their instances next week. [15:46:39] 10Data-Engineering-Planning: Data Catalog Documentation Style Guide - https://phabricator.wikimedia.org/T310229 (10JArguello-WMF) [15:46:41] 10Data-Engineering-Planning: Data Catalog Demo - https://phabricator.wikimedia.org/T310203 (10JArguello-WMF) [15:46:45] 10Data-Engineering-Planning: Document Two Additional Canonical Datasets - https://phabricator.wikimedia.org/T308048 (10JArguello-WMF) [15:46:48] 10Data-Engineering-Planning, 10Patch-For-Review: Create Airflow Pipeline for Ingesting/Updating Superset Data - https://phabricator.wikimedia.org/T309622 (10JArguello-WMF) [15:49:08] 10Data-Engineering-Planning, 10Data Pipelines: Dataset Tagging: Curated dataset tag - https://phabricator.wikimedia.org/T307706 (10JArguello-WMF) [15:49:24] 10Data-Engineering: Adding Datasets: Druid Datasets - https://phabricator.wikimedia.org/T307702 (10JArguello-WMF) [15:50:02] 10Data-Engineering: Spike: Evaluate data steward assignment - https://phabricator.wikimedia.org/T307719 (10JArguello-WMF) [15:50:05] 
10Data-Engineering: Tagging Policy - Strategy - https://phabricator.wikimedia.org/T307710 (10JArguello-WMF) [15:50:09] 10Data-Engineering: Dataset Tagging: Implement "PII" tag - https://phabricator.wikimedia.org/T307708 (10JArguello-WMF) [15:51:56] 10Quarry: Upgrade quarry os - https://phabricator.wikimedia.org/T340762 (10rook) [15:55:30] (03PS1) 10Btullis: Update the kafka-setup conainer of datahub [analytics/datahub] (wmf) - 10https://gerrit.wikimedia.org/r/934375 (https://phabricator.wikimedia.org/T329514) [16:07:08] 10Data-Engineering, 10Event-Platform Value Stream: jsonschema-tools test should fail if fields are removed in new (non major) version - https://phabricator.wikimedia.org/T340765 (10Ottomata) [16:11:31] 10Data-Engineering, 10Event-Platform Value Stream: jsonschema-tools test should fail if fields are removed in new (non major) version - https://phabricator.wikimedia.org/T340765 (10Ottomata) @tchin can you go ahead and do this one along with T300404? [16:13:17] 10Data-Engineering-Kanban, 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Cassandra, and 2 others: Properly add aqsloader user (w/ secrets) - https://phabricator.wikimedia.org/T305600 (10JArguello-WMF) [16:13:22] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Shared-Data-Infrastructure, 10Patch-For-Review: Decommission analytics10[58-69] - https://phabricator.wikimedia.org/T317861 (10JArguello-WMF) [16:13:41] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Shared-Data-Infrastructure, 10Patch-For-Review: Pageview definition relies on X-Analytics to determine special pages - https://phabricator.wikimedia.org/T304362 (10JArguello-WMF) [16:13:58] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Event-Platform Value Stream, 10EventStreams, 10Shared-Data-Infrastructure: Implement server side filtering for EventStreams (if we should) - https://phabricator.wikimedia.org/T152731 (10JArguello-WMF) [16:14:23] 10Data-Platform-SRE: Add optional TLS encryption to the druid-public-broker - 
https://phabricator.wikimedia.org/T331631 (10JArguello-WMF) [16:14:25] 10Data-Platform-SRE: Configure load-balancing approriate for ceph radosgw services on the data-engineering cluster - https://phabricator.wikimedia.org/T330153 (10JArguello-WMF) [16:14:27] 10Data-Platform-SRE: Upgrade Stats clients to bullseye - https://phabricator.wikimedia.org/T329360 (10JArguello-WMF) [16:14:30] 10Data-Platform-SRE: Move archiva to private IPs + CDN - https://phabricator.wikimedia.org/T317182 (10JArguello-WMF) [16:14:35] 10Data-Platform-SRE: Deploy ceph radosgw processes to data-engineering cluster - https://phabricator.wikimedia.org/T330152 (10JArguello-WMF) [16:14:37] 10Data-Platform-SRE, 10Shared-Data-Infrastructure, 10Patch-For-Review: Pageview definition relies on X-Analytics to determine special pages - https://phabricator.wikimedia.org/T304362 (10JArguello-WMF) [16:14:39] 10Data-Platform-SRE: Research and test methods for accessing kerberized services from spark running on the DSE K8S cluster - https://phabricator.wikimedia.org/T330162 (10JArguello-WMF) [16:14:41] 10Data-Platform-SRE, 10Epic: Upgrade the Data Engineering infrastructure to Debian Bullseye - https://phabricator.wikimedia.org/T288804 (10JArguello-WMF) [16:14:43] 10Data-Platform-SRE: Datahub user records are not being created after login - https://phabricator.wikimedia.org/T327884 (10JArguello-WMF) [16:14:45] 10Data-Platform-SRE, 10Epic: Install Ceph Cluster for Data Engineering - https://phabricator.wikimedia.org/T324660 (10JArguello-WMF) [16:14:47] 10Data-Engineering-Kanban, 10Data-Platform-SRE, 10Cassandra, 10Shared-Data-Infrastructure, 10User-Eevans: Properly add aqsloader user (w/ secrets) - https://phabricator.wikimedia.org/T305600 (10JArguello-WMF) [16:14:51] 10Data-Platform-SRE, 10DC-Ops, 10SRE, 10ops-eqiad: Q3:rack/setup/install an-worker11[49-56] - https://phabricator.wikimedia.org/T327295 (10JArguello-WMF) [16:14:57] 10Data-Platform-SRE, 10Patch-For-Review: Upgrade the spark YARN shuffler service 
on Hadoop workers from version 2 to 3 - https://phabricator.wikimedia.org/T332765 (10JArguello-WMF) [16:15:01] 10Data-Platform-SRE, 10Patch-For-Review: Upgrade Hadoop test cluster to Bullseye - https://phabricator.wikimedia.org/T329363 (10JArguello-WMF) [16:15:05] 10Data-Platform-SRE: Refactor analytics-meta MariaDB layout to use an-mariadb100[12] - https://phabricator.wikimedia.org/T284150 (10JArguello-WMF) [16:15:45] 10Data-Engineering, 10Data-Platform-SRE, 10Event-Platform Value Stream, 10EventStreams, 10Shared-Data-Infrastructure: Implement server side filtering for EventStreams (if we should) - https://phabricator.wikimedia.org/T152731 (10JArguello-WMF) [16:15:52] 10Data-Platform-SRE, 10Discovery-Search (Current work): Ensure prometheus-blazegraph-exporter-wdqs-* services can start in Bullseye or later - https://phabricator.wikimedia.org/T336540 (10bking) This appears to be finished, based on the last few data transfers to wdqs2022 (as of this writing, our sole Bullseye... [16:17:54] (03CR) 10Btullis: [C: 03+2] Update the kafka-setup conainer of datahub [analytics/datahub] (wmf) - 10https://gerrit.wikimedia.org/r/934375 (https://phabricator.wikimedia.org/T329514) (owner: 10Btullis) [16:29:26] 10Data-Engineering-Planning, 10Data Pipelines (Sprint 14): Setup config to allow lineage instrumentation - https://phabricator.wikimedia.org/T333004 (10JArguello-WMF) 05Open→03Resolved [16:29:28] 10Data-Engineering-Planning, 10Data Pipelines, 10Epic: Support for Product Analytics Data Pipelines Migration to Airflow - https://phabricator.wikimedia.org/T332997 (10JArguello-WMF) [16:29:32] 10Data-Engineering-Planning, 10Data Pipelines (Sprint 14), 10Patch-For-Review: Deprecate old mobile datasets - https://phabricator.wikimedia.org/T329310 (10JArguello-WMF) 05Open→03Resolved [16:29:38] 10Data-Engineering-Planning, 10Data Pipelines (Sprint 14), 10Patch-For-Review: Add Python Linter Checks to CI - https://phabricator.wikimedia.org/T318346 (10JArguello-WMF) 05Open→03Resolved 
[16:33:49] 10Data-Platform-SRE, 10Patch-For-Review: Upgrade Airflow to version 2.6.1 - https://phabricator.wikimedia.org/T336286 (10BTullis) The next step is to schedule the upgrades for other teams' instances. I will look to do this early next week, unless there are any requests to defer them. [16:34:25] 10Data-Platform-SRE, 10Patch-For-Review: Upgrade Datahub to v0.10.0 - https://phabricator.wikimedia.org/T329514 (10BTullis) p:05Medium→03High [20:13:34] 10Data-Platform-SRE, 10Discovery-Search (Current work): Investigate performance differences between wdqs2022 and older hosts - https://phabricator.wikimedia.org/T336443 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin1001 for host wdqs2021.codfw.wmnet with OS bullseye [21:03:50] 10Data-Engineering, 10Data Pipelines: Refine: Use Spark SQL instead of Hive JDBC - https://phabricator.wikimedia.org/T209453 (10JArguello-WMF) [21:06:53] 10Data-Engineering, 10Product-Analytics: Propagate field descriptions from event schemas to Hive event tables - https://phabricator.wikimedia.org/T307040 (10JArguello-WMF) p:05Triage→03Medium [21:11:26] 10Data-Engineering, 10AQS2.0: Finalize the multi-dc configuration of AQS (nodejs) in codfw - https://phabricator.wikimedia.org/T331115 (10JArguello-WMF) [21:11:51] 10Data-Engineering, 10Data-Platform-SRE, 10AQS2.0: Finalize the multi-dc configuration of AQS (nodejs) in codfw - https://phabricator.wikimedia.org/T331115 (10JArguello-WMF) [21:13:43] 10Data-Platform-SRE: Decide on installation details for new ceph cluster - https://phabricator.wikimedia.org/T326945 (10JArguello-WMF) [21:15:41] 10Data-Engineering: Late events in wdqs-external.sparql-query? 
- https://phabricator.wikimedia.org/T310790 (10JArguello-WMF) [21:19:19] 10Data-Engineering: Cleanup User Hive Databases - https://phabricator.wikimedia.org/T323884 (10JArguello-WMF) [21:19:40] 10Data-Engineering, 10Documentation: User-centric documentation links - https://phabricator.wikimedia.org/T329550 (10JArguello-WMF) [21:21:47] 10Data-Engineering-Planning: Data Catalog Demo - https://phabricator.wikimedia.org/T310203 (10JArguello-WMF) Data catalog presentation is in shared drive> data plat eng> data eng >data catalog [21:21:51] 10Data-Engineering-Planning: Data Catalog Demo - https://phabricator.wikimedia.org/T310203 (10JArguello-WMF) 05Open→03Resolved [21:22:55] 10Data-Engineering: Check home/HDFS leftovers of echetty - https://phabricator.wikimedia.org/T330834 (10JArguello-WMF) [21:22:57] 10Data-Engineering: Check home/HDFS leftovers of cmacholan - https://phabricator.wikimedia.org/T330121 (10JArguello-WMF) [21:22:59] 10Data-Engineering, 10conftool: an-launcher1002: failed services - https://phabricator.wikimedia.org/T330652 (10JArguello-WMF) [21:23:01] 10Data-Engineering: Check home/HDFS leftovers of akhatun - https://phabricator.wikimedia.org/T326157 (10JArguello-WMF) [21:23:03] 10Data-Engineering: Check home/HDFS leftovers of toddleroux / ryanmax / afandian2 - https://phabricator.wikimedia.org/T325527 (10JArguello-WMF) [21:23:05] 10Data-Engineering: Check home/HDFS leftovers of bmansurov - https://phabricator.wikimedia.org/T320367 (10JArguello-WMF) [21:24:48] 10Data-Engineering-Planning, 10Data-Catalog: Ingest feature Hive schema into datahub - https://phabricator.wikimedia.org/T326598 (10JArguello-WMF) [21:24:51] 10Data-Engineering-Planning, 10Data-Catalog, 10Event-Platform Value Stream: Event Platform and DataHub Integration - https://phabricator.wikimedia.org/T318863 (10JArguello-WMF) [21:24:53] 10Data-Engineering-Planning, 10Data-Catalog: Re-enable Public Druid metadata ingestion - https://phabricator.wikimedia.org/T311547 (10JArguello-WMF) 
[21:24:55] 10Data-Engineering-Planning, 10Data-Catalog: Establish a Business Glossary - https://phabricator.wikimedia.org/T311524 (10JArguello-WMF) [21:24:57] 10Data-Engineering-Planning, 10Data-Catalog: Data Catalog Documentation Style Guide - https://phabricator.wikimedia.org/T310229 (10JArguello-WMF) [21:25:00] 10Data-Engineering-Planning, 10Data-Catalog: Document Two Additional Canonical Datasets - https://phabricator.wikimedia.org/T308048 (10JArguello-WMF) [21:25:02] 10Data-Engineering-Planning, 10Data Pipelines, 10Data-Catalog: Spike: Integrate Spark with DataHub - https://phabricator.wikimedia.org/T306896 (10JArguello-WMF) [21:25:04] 10Data-Engineering-Planning, 10Data-Catalog: DataHub rights assignment is case-sensitive - https://phabricator.wikimedia.org/T309382 (10JArguello-WMF) [21:25:06] 10Data-Engineering-Planning, 10Data-Catalog, 10Patch-For-Review: Create Airflow Pipeline for Ingesting/Updating Superset Data - https://phabricator.wikimedia.org/T309622 (10JArguello-WMF) [21:25:08] 10Data-Engineering-Planning, 10Data-Catalog: Emit lineage information about Airflow jobs to DataHub - https://phabricator.wikimedia.org/T312566 (10JArguello-WMF) [21:25:10] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Data-Catalog: Review and improve the build process for DataHub containers - https://phabricator.wikimedia.org/T303381 (10JArguello-WMF) [21:25:12] 10Data-Engineering-Planning, 10Data-Catalog, 10Metrics-Platform-Planning, 10Product-Analytics: [Metrics Platform] Catalog/dictionary of available standard fields - https://phabricator.wikimedia.org/T267251 (10JArguello-WMF) [21:28:20] 10Data-Engineering, 10Data-Catalog: Ingest feature Hive schema into datahub - https://phabricator.wikimedia.org/T326598 (10JArguello-WMF) [21:28:22] 10Data-Engineering, 10Data-Catalog: Re-enable Public Druid metadata ingestion - https://phabricator.wikimedia.org/T311547 (10JArguello-WMF) [21:28:24] 10Data-Engineering, 10Data-Catalog: Establish a Business Glossary - 
https://phabricator.wikimedia.org/T311524 (10JArguello-WMF) [21:28:26] 10Data-Engineering, 10Data-Catalog: Data Catalog Documentation Style Guide - https://phabricator.wikimedia.org/T310229 (10JArguello-WMF) [21:28:29] 10Data-Engineering, 10Data-Catalog: Document Two Additional Canonical Datasets - https://phabricator.wikimedia.org/T308048 (10JArguello-WMF) [21:28:31] 10Data-Engineering, 10Data-Catalog: DataHub rights assignment is case-sensitive - https://phabricator.wikimedia.org/T309382 (10JArguello-WMF) [21:28:33] 10Data-Engineering, 10Data-Catalog: Emit lineage information about Airflow jobs to DataHub - https://phabricator.wikimedia.org/T312566 (10JArguello-WMF) [21:28:35] 10Data-Engineering, 10Data-Catalog, 10Patch-For-Review: Create Airflow Pipeline for Ingesting/Updating Superset Data - https://phabricator.wikimedia.org/T309622 (10JArguello-WMF) [21:39:41] 10Data-Engineering-Planning, 10Data-Platform-SRE, 10Data Pipelines: Replace db1108 with db1208 - https://phabricator.wikimedia.org/T334055 (10JArguello-WMF) [21:40:15] 10Data-Engineering-Planning: --NEWLY ADDED ABOVE -- - https://phabricator.wikimedia.org/T304618 (10JArguello-WMF) 05Stalled→03Declined [21:40:28] 10Data-Engineering-Planning: NEW FEATURE REQUEST: - https://phabricator.wikimedia.org/T322423 (10JArguello-WMF) 05Open→03Declined [21:41:03] 10Data-Platform-SRE: Upgrade the druid-analytics cluster to bullseye - https://phabricator.wikimedia.org/T332604 (10JArguello-WMF) [21:41:06] 10Data-Platform-SRE: Upgrade the druid-public cluster to bullseye - https://phabricator.wikimedia.org/T332589 (10JArguello-WMF) [21:41:10] 10Data-Platform-SRE: Upgrade an-launcher1002 to bullseye - https://phabricator.wikimedia.org/T332580 (10JArguello-WMF) [21:41:12] 10Data-Platform-SRE: Upgrade hadoop standby master to bullseye - https://phabricator.wikimedia.org/T332578 (10JArguello-WMF) [21:41:14] 10Data-Platform-SRE: Upgrade hadoop master to bullseye - https://phabricator.wikimedia.org/T332573 (10JArguello-WMF) 
[21:41:16] 10Data-Platform-SRE: Refresh hadoop coordinators an-coord100[1-2] with an-coord[3-4] - https://phabricator.wikimedia.org/T332572 (10JArguello-WMF) [21:41:18] 10Data-Platform-SRE: Upgrade hadoop workers to bullseye - https://phabricator.wikimedia.org/T332570 (10JArguello-WMF) [21:41:20] 10Data-Platform-SRE: Make YARN web interface work with both primary and standby resourcemanager - https://phabricator.wikimedia.org/T331448 (10JArguello-WMF) [21:41:40] 10Data-Platform-SRE, 10Data-Catalog: Review and improve the build process for DataHub containers - https://phabricator.wikimedia.org/T303381 (10JArguello-WMF) [21:42:43] 10Data-Engineering-Planning, 10Data Pipelines, 10Epic: [Iceberg] Epic: Icebergify event_sanitized database - https://phabricator.wikimedia.org/T311743 (10JArguello-WMF) [21:43:29] 10Data-Engineering-Planning, 10Data Pipelines: Documentathon - https://phabricator.wikimedia.org/T311413 (10JArguello-WMF) 05Open→03Invalid [21:44:39] 10Data-Engineering, 10MediaWiki-extensions-EventLogging, 10Metrics Platform Icebox, 10Epic: [EPIC] Deprecate mw.eventLog.logEvent() - https://phabricator.wikimedia.org/T317874 (10JArguello-WMF) [21:44:45] 10Data-Engineering, 10Growth-Team, 10MediaWiki-extensions-EventLogging, 10Metrics Platform Icebox, and 4 others: [EPIC] Deprecate EventLogging::logEvent() - https://phabricator.wikimedia.org/T318263 (10JArguello-WMF) [21:44:51] 10Analytics, 10Metrics Platform Icebox, 10Product-Infrastructure-Team-Backlog-Deprecated, 10Epic: Event Platform Client Libraries - https://phabricator.wikimedia.org/T228175 (10JArguello-WMF) [21:44:57] 10Data-Engineering, 10MediaWiki-extensions-EventLogging, 10Metrics Platform Icebox, 10Epic: [EPIC] Deprecate EventLogging::schemaValidate - https://phabricator.wikimedia.org/T317793 (10JArguello-WMF) [21:47:33] 10Data-Engineering, 10Event-Platform Value Stream, 10Epic: Event Platform Value Stream Documentation Tasks - https://phabricator.wikimedia.org/T329628 (10JArguello-WMF) [21:47:46] 
10Data-Engineering, 10Event-Platform Value Stream, 10MediaWiki-Vagrant: EventBus should not blackhole undeclared streams - https://phabricator.wikimedia.org/T329480 (10JArguello-WMF) [21:47:49] 10Data-Engineering, 10Event-Platform Value Stream: Automated event stream throughput alerting for important state change streams - https://phabricator.wikimedia.org/T329070 (10JArguello-WMF) [21:47:51] 10Data-Engineering, 10Event-Platform Value Stream: Refactor Image Suggestions Feedback > Cassandra Flink Job and Deploy to DSE k8s - https://phabricator.wikimedia.org/T329524 (10JArguello-WMF) [21:47:58] 10Data-Engineering, 10Event-Platform Value Stream: Support topics without a schema in Flink Catalog - https://phabricator.wikimedia.org/T328232 (10JArguello-WMF) [21:48:00] 10Data-Engineering, 10Event-Platform Value Stream: [Flink Operations] Automate Replay of Failed Events - https://phabricator.wikimedia.org/T328565 (10JArguello-WMF) [21:48:02] 10Data-Engineering, 10Data-Platform-SRE, 10Event-Platform Value Stream, 10Discovery-Search (Current work), 10Epic: Flink Operations - https://phabricator.wikimedia.org/T328561 (10JArguello-WMF) [21:48:07] 10Data-Engineering, 10Event-Platform Value Stream: Support NULL values in RowData in eventutilities - https://phabricator.wikimedia.org/T328211 (10JArguello-WMF) [21:48:19] 10Data-Engineering, 10Event-Platform Value Stream, 10SRE-OnFire, 10serviceops: Incident: 2022-12-09 api appserver worker starvation - https://phabricator.wikimedia.org/T324994 (10JArguello-WMF) [21:48:23] 10Data-Engineering, 10Event-Platform Value Stream: [NEEDS GROOMING] Integrate Flink Table API in eventutils-python - https://phabricator.wikimedia.org/T324953 (10JArguello-WMF) [21:48:26] 10Data-Engineering, 10Event-Platform Value Stream: [EPIC] Flink Applications on Kubernetes - https://phabricator.wikimedia.org/T324578 (10JArguello-WMF) [21:48:29] 10Data-Engineering, 10Event-Platform Value Stream: [EPIC] Streaming and event driven Python services - 
https://phabricator.wikimedia.org/T324689 (10JArguello-WMF) [21:48:31] 10Data-Engineering, 10Event-Platform Value Stream: [SPIKE] Use Flink for batch backfilling - https://phabricator.wikimedia.org/T324108 (10JArguello-WMF) [21:48:35] 10Data-Engineering, 10Event-Platform Value Stream: Spark Streaming Dumps POC: Backfill content table - https://phabricator.wikimedia.org/T323641 (10JArguello-WMF) [21:48:43] 10Data-Engineering, 10Event-Platform Value Stream: Spark Streaming Dumps POC: Update iceberg tables - https://phabricator.wikimedia.org/T323645 (10JArguello-WMF) [21:48:48] 10Data-Engineering, 10Edit-Review-Improvements-Integrated-Filters, 10Event-Platform Value Stream, 10Growth-Team, and 2 others: Integration of Revert Risk Scores to Recent Changes as a filter - https://phabricator.wikimedia.org/T329071 (10JArguello-WMF) [21:48:54] 10Data-Engineering, 10Beta-Cluster-Infrastructure, 10Event-Platform Value Stream, 10MW-1.41-notes (1.41.0-wmf.12; 2023-06-06): cirrusSearchCheckerJob JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322491 (10JArguello-WMF) [21:48:58] 10Data-Engineering, 10Event-Platform Value Stream: Add schema diffing support to jsonschema-tools and run diff in CI - https://phabricator.wikimedia.org/T321850 (10JArguello-WMF) [21:49:02] 10Data-Engineering, 10Event-Platform Value Stream, 10MW-1.40-notes (1.40.0-wmf.8; 2022-10-31): EventBus' stream config destination_event_service setting should move into producers.mediawikI_eventbus specific settings. 
- https://phabricator.wikimedia.org/T321557 (10JArguello-WMF) [21:49:06] 10Data-Engineering, 10Event-Platform Value Stream: Refactor EventBus extension Hooks to use new hook system - https://phabricator.wikimedia.org/T320655 (10JArguello-WMF) [21:49:10] 10Data-Engineering, 10Event-Platform Value Stream, 10MediaWiki-Core-Hooks: Add $comment and $performer to ArticleRevisionVisibilitySet params - https://phabricator.wikimedia.org/T321411 (10JArguello-WMF) [21:49:20] 10Data-Engineering, 10Event-Platform Value Stream: Document and Promote Image Suggestions Feedback > Cassandra Flink Job - https://phabricator.wikimedia.org/T316112 (10JArguello-WMF) [21:49:24] 10Data-Engineering, 10Event-Platform Value Stream, 10Data-Catalog: Event Platform and DataHub Integration - https://phabricator.wikimedia.org/T318863 (10JArguello-WMF) [21:49:28] 10Data-Engineering, 10Event-Platform Value Stream: Declare webrequest as an Event Platform stream - https://phabricator.wikimedia.org/T314956 (10JArguello-WMF) [21:49:32] 10Data-Engineering, 10Event-Platform Value Stream, 10Patch-For-Review: Move Spark JsonSchemaConverter out of analytics/refinery/source and into wikimedia-event-utilities - https://phabricator.wikimedia.org/T321854 (10JArguello-WMF) [21:49:36] 10Data-Engineering, 10Event-Platform Value Stream, 10Epic: [Event Platform] Design and Implement realtime enrichment pipeline for MW page change with content - https://phabricator.wikimedia.org/T307959 (10JArguello-WMF) [21:49:40] 10Analytics-Radar, 10Data-Engineering, 10Event-Platform Value Stream: Introduce EventBusSendUpdate - https://phabricator.wikimedia.org/T292123 (10JArguello-WMF) [21:49:45] 10Analytics, 10Data-Engineering, 10Event-Platform Value Stream, 10Platform Engineering: EventStreams sending same data over and over (page links change) - https://phabricator.wikimedia.org/T290211 (10JArguello-WMF) [21:49:58] 10Data-Engineering, 10Event-Platform Value Stream, 10MediaWiki-libs-HTTP, 10Beta-Cluster-reproducible, 
10Wikimedia-production-error: PHP Warning: curl_multi_remove_handle(): supplied resource is not a valid cURL Multi Handle resource - https://phabricator.wikimedia.org/T288624 (10JArguello-WMF) [21:50:12] 10Analytics, 10Data-Engineering, 10Event-Platform Value Stream, 10MW-1.41-notes (1.41.0-wmf.15; 2023-06-27), and 2 others: Adopt conventions for server receive and client/event timestamps in non analytics event schemas - https://phabricator.wikimedia.org/T267648 (10JArguello-WMF) [21:50:15] 10Analytics-Radar, 10Data-Engineering, 10ChangeProp, 10Event-Platform Value Stream, and 6 others: Run EventBus tests in MediaWiki core CI - https://phabricator.wikimedia.org/T257583 (10JArguello-WMF) [21:50:22] 10Analytics, 10Data-Engineering, 10Event-Platform Value Stream, 10Metrics-Platform-Planning: Source geolocation directly rather than using IP in schema - https://phabricator.wikimedia.org/T290014 (10JArguello-WMF) [21:50:26] 10Data-Engineering, 10Event-Platform Value Stream, 10MW-1.40-notes (1.40.0-wmf.24; 2023-02-20), 10MW-1.41-notes (1.41.0-wmf.2; 2023-03-27), and 2 others: Remove StreamConfig::INTERNAL_SETTINGS logic from EventStreamConfig and do it in EventLogging client instead - https://phabricator.wikimedia.org/T286344 (... 
[21:50:32] 10Analytics, 10Data-Engineering, 10Event-Platform Value Stream: Enable canary events for all streams - https://phabricator.wikimedia.org/T266798 (10JArguello-WMF) [21:50:37] 10Data-Engineering, 10Event-Platform Value Stream, 10SRE-OnFire, 10SRE-Sprint-Week-Sustainability-March2023, and 2 others: Uneven CPU throttling of eventgate-analytics under load - https://phabricator.wikimedia.org/T325068 (10JArguello-WMF) [21:50:43] 10Analytics-Radar, 10Data-Engineering, 10Event-Platform Value Stream, 10Platform Team Workboards (Clinic Duty Team): Duplicated revision_create events - https://phabricator.wikimedia.org/T262203 (10JArguello-WMF) [21:50:50] 10Analytics-Radar, 10Data-Engineering, 10Event-Platform Value Stream: mw.user.generateRandomSessionId should return a UUID - https://phabricator.wikimedia.org/T266813 (10JArguello-WMF) [21:50:52] 10Analytics, 10Data-Engineering, 10Event-Platform Value Stream, 10Metrics-Platform-Planning: Document in-schema who sets which fields - https://phabricator.wikimedia.org/T253392 (10JArguello-WMF) [21:51:20] 10Data-Engineering, 10Event-Platform Value Stream, 10Browser-Support-Microsoft-Edge, 10Performance-Team (Radar): Problem with delay caused by intake-analytics.wikimedia.org - https://phabricator.wikimedia.org/T295427 (10JArguello-WMF) [21:51:29] 10Analytics-Kanban, 10Data-Engineering, 10Event-Platform Value Stream, 10MediaWiki-extensions-EventLogging, and 3 others: Modern Event Platform - https://phabricator.wikimedia.org/T185233 (10JArguello-WMF) [21:51:33] 10Analytics, 10Data-Engineering, 10Event-Platform Value Stream: Deploy schema repos to analytics cluster and use local uris for analytics jobs - https://phabricator.wikimedia.org/T280017 (10JArguello-WMF) [21:51:37] 10Analytics, 10Data-Engineering, 10Event-Platform Value Stream, 10MW-1.41-notes (1.41.0-wmf.10; 2023-05-23), 10User-Elukey: Port architecture of irc-recentchanges to Kafka - https://phabricator.wikimedia.org/T234234 (10JArguello-WMF) [21:51:48] 
10Analytics, 10Data-Engineering, 10Event-Platform Value Stream, 10Goal: Event Platform: Stream Connectors - https://phabricator.wikimedia.org/T214430 (10JArguello-WMF) [21:51:50] 10Analytics, 10Data-Engineering, 10Event-Platform Value Stream, 10Performance-Team (Radar): Avoid extra HTTPS connections for most Event Platform beacons - https://phabricator.wikimedia.org/T263049 (10JArguello-WMF) [21:51:53] 10Analytics, 10Data-Engineering, 10Event-Platform Value Stream, 10Release-Engineering-Team (Radar): Stop using puppet + git pull for auto deployment of schema repos - https://phabricator.wikimedia.org/T274901 (10JArguello-WMF) [21:52:09] 10Data-Engineering, 10Event-Platform Value Stream, 10EventStreams: EventStreams (via KafkaSSE) does not consume from newly added partitions in topic - https://phabricator.wikimedia.org/T173006 (10JArguello-WMF) [21:52:13] 10Analytics-Radar, 10Data-Engineering, 10Event-Platform Value Stream, 10SRE, 10serviceops-radar: Consider Julie for managing Kafka settings, perhaps even integrating with Event Stream Config - https://phabricator.wikimedia.org/T276088 (10JArguello-WMF) [21:53:27] 10Data-Platform-SRE, 10cloud-services-team: Review and fix any bugs found in the automated bootstrap process for a ceph mon/mgr server - https://phabricator.wikimedia.org/T332987 (10JArguello-WMF) [21:54:56] 10Data-Engineering-Planning: Data Engineering Pairing system - https://phabricator.wikimedia.org/T327790 (10JArguello-WMF) 05Open→03Resolved [21:58:59] 10Data-Platform-SRE, 10Data Pipelines: Add support for Iceberg to the Spark Docker Image - https://phabricator.wikimedia.org/T336012 (10JArguello-WMF) [21:59:18] 10Analytics, 10Data Pipelines: Add cawiki to clickstream dataset - https://phabricator.wikimedia.org/T327982 (10JArguello-WMF) [22:00:42] 10Data-Engineering, 10Data Pipelines, 10Pageviews-API: pageviews.wmcloud.org shows "Error querying Pageviews API - Not found" for some pages on ko.wp - https://phabricator.wikimedia.org/T316967 
(10JArguello-WMF) [22:00:54] 10Data-Engineering, 10Data-Engineering-Wikistats, 10Data Pipelines: Wikistats in Uzbek - https://phabricator.wikimedia.org/T314477 (10JArguello-WMF) [22:01:30] 10Data-Engineering, 10Data-Engineering-Wikistats, 10Data Pipelines: "Pages to date" not loading with "daily" metric - https://phabricator.wikimedia.org/T312717 (10JArguello-WMF) [22:01:42] 10Data-Engineering, 10Data-Engineering-Wikistats, 10Data Pipelines, 10I18n: Merge Ks-Arab and Ks-Deva to ks - https://phabricator.wikimedia.org/T314476 (10JArguello-WMF) [22:02:18] 10Data-Engineering-Kanban, 10Data Pipelines, 10Patch-For-Review: Improvements of artifacts cache - https://phabricator.wikimedia.org/T307115 (10JArguello-WMF) [22:02:26] 10Data-Platform-SRE, 10Data Pipelines: Replace db1108 with db1208 - https://phabricator.wikimedia.org/T334055 (10JArguello-WMF) [22:02:30] 10Data-Engineering-Kanban, 10Data Pipelines: Investigate why airflow sensor tasks fail without sending errors - https://phabricator.wikimedia.org/T311976 (10JArguello-WMF) [22:02:36] 10Analytics-Radar, 10Data-Engineering, 10Data Pipelines, 10Editing-team, and 4 others: WikiEditor records all edits as platform = desktop in EventLogging - https://phabricator.wikimedia.org/T249944 (10JArguello-WMF) [22:02:42] 10Data-Engineering-Kanban, 10Data Pipelines: Investigate Gobblin dataloss during namenode failure - https://phabricator.wikimedia.org/T311263 (10JArguello-WMF) [22:03:26] 10Data-Platform-SRE, 10Infrastructure-Foundations: Also intake Network Error Logging events into the Analytics Data Lake - https://phabricator.wikimedia.org/T304373 (10JArguello-WMF) [22:03:34] 10Data-Engineering, 10Data-Engineering-Wikistats, 10Data Pipelines: Non-mobile UAs on mobile (2g/gprs, etc) IP-blocks - https://phabricator.wikimedia.org/T58628 (10JArguello-WMF) [22:03:46] 10Data-Engineering, 10Pageviews-API: Provide a mechanism to notify subscribers when page view data is available - https://phabricator.wikimedia.org/T326229 
(10JArguello-WMF) [22:04:00] 10Data-Engineering, 10MediaWiki-extensions-EventLogging, 10Metrics-Platform-Planning: Allow JavaScript errors to fail CI builds - https://phabricator.wikimedia.org/T318902 (10JArguello-WMF) [22:04:06] 10Analytics-Radar, 10Metrics-Platform-Planning, 10CSS: Schema code samples popup appears under the JSON table - https://phabricator.wikimedia.org/T272857 (10JArguello-WMF) [22:04:10] 10Analytics-Kanban, 10Pageviews-Anomaly: Article on Carles Puigdemont has inflated pageviews in many projects - https://phabricator.wikimedia.org/T263908 (10JArguello-WMF) [22:04:14] 10Analytics-Radar, 10Data-Engineering, 10GLAM-Tech, 10Pageviews-API: WMF pageview API (404 error) when requesting statistics over around 1000 files on GLAMorgan - https://phabricator.wikimedia.org/T145197 (10JArguello-WMF) [22:04:18] 10Analytics-Radar, 10Data-Engineering, 10MediaWiki-extensions-EventLogging: SearchSatisfaction has validation errors for event.query - https://phabricator.wikimedia.org/T257331 (10JArguello-WMF) [22:04:24] 10Analytics-Radar, 10Data-Engineering, 10Data-Engineering-Wikistats: Negative total number of bytes for German Wikipedia in 2001? 
- https://phabricator.wikimedia.org/T203906 (10JArguello-WMF) [22:04:34] 10Analytics-Radar, 10Data-Engineering, 10Pageviews-API, 10Tool-Pageviews: 429 Too Many Requests hit despite throttling to 100 req/sec - https://phabricator.wikimedia.org/T219857 (10JArguello-WMF) [22:04:40] 10Analytics-Radar, 10Data-Engineering, 10Pageviews-API, 10RESTBase-API, and 2 others: views error in mostread feed - https://phabricator.wikimedia.org/T267624 (10JArguello-WMF) [22:04:44] 10Analytics, 10Data-Engineering, 10Event-Platform Value Stream: mediawiki/page/properties-change schema should use map type for added and removed page properties - https://phabricator.wikimedia.org/T281483 (10JArguello-WMF) [22:04:48] 10Data-Engineering, 10Pageviews-API: Endpoint for average view rate in Pageview API - https://phabricator.wikimedia.org/T162933 (10JArguello-WMF) [22:04:52] 10Analytics, 10Data-Engineering, 10MediaWiki-extensions-EventLogging, 10Research-Freezer: 20K events by a single user in the span of 20 mins - https://phabricator.wikimedia.org/T202539 (10JArguello-WMF) [22:04:58] 10Analytics-Radar, 10Data-Engineering, 10Event-Platform Value Stream, 10Internet-Archive, 10The-Wikipedia-Library: Store page-links-change data in a database table and make available through a Special page - https://phabricator.wikimedia.org/T221397 (10JArguello-WMF) [22:05:06] 10Analytics, 10Data-Engineering, 10Event-Platform Value Stream, 10Platform Team Workboards (Clinic Duty Team): Add expiry info to mediawiki.page-restrictions-change stream - https://phabricator.wikimedia.org/T282057 (10JArguello-WMF) [22:05:10] 10Analytics, 10Data-Engineering, 10Event-Platform Value Stream, 10Metrics-Platform-Planning: Client-side error logging should use Elastic Common Schema (ECS) fields when possible - https://phabricator.wikimedia.org/T267602 (10JArguello-WMF) [22:05:14] 10Data-Engineering, 10API Platform, 10GraphQL, 10Pageviews-API: Responses on pageview API should be lighter - 
https://phabricator.wikimedia.org/T145935 (JArguello-WMF)
[22:05:18] Analytics, Data-Engineering, MediaWiki-Action-API, PageViewInfo, Pageviews-API: API Analytics - page views by country - https://phabricator.wikimedia.org/T213221 (JArguello-WMF)
[22:05:24] Analytics, Data-Engineering, Pageviews-API: Filter top pages by namespace/category - https://phabricator.wikimedia.org/T182975 (JArguello-WMF)
[22:05:28] Analytics-Radar, Data-Engineering, MediaWiki-extensions-EventLogging, Product-Infrastructure-Team-Backlog-Deprecated, Epic: Explore an API for logging events sampled by session - https://phabricator.wikimedia.org/T168380 (JArguello-WMF)
[22:05:36] Data-Engineering, Data-Engineering-Wikistats: Determine total number of external links in all Wikipedias - https://phabricator.wikimedia.org/T137984 (JArguello-WMF)
[22:05:40] Analytics, Data-Engineering, Pageviews-API: Yearly endpoint for the /pageviews/top API - https://phabricator.wikimedia.org/T154381 (JArguello-WMF)
[22:05:46] Analytics, Data-Engineering, Pageviews-API, RESTBase-API: Pageviews Data : removes 1000 limit in the most viewed articles for a given project and timespan API - https://phabricator.wikimedia.org/T153081 (JArguello-WMF)
[22:06:47] Data-Engineering: FYI: Other changes to the CheckUser tables - https://phabricator.wikimedia.org/T327447 (JArguello-WMF)
[22:06:50] Data-Engineering: Missconfigured proxies on data-engineering hosts - https://phabricator.wikimedia.org/T326302 (JArguello-WMF)
[22:06:52] Data-Engineering: 503 on Superset (reproducible) - https://phabricator.wikimedia.org/T322525 (JArguello-WMF)
[22:06:54] Data-Engineering: wmf.webrequest: 'presto error: Corrupted statistics for column "[user_agent] optional binary " in Parquet file ...' - https://phabricator.wikimedia.org/T320926 (JArguello-WMF)
[22:06:57] Data-Engineering, Product-Analytics: Suspicious user pageview activity in India during June from Android mobile web browsers - https://phabricator.wikimedia.org/T315267 (JArguello-WMF)
[22:06:59] Data-Engineering: Document destination_event_service Event Platform stream configuration - https://phabricator.wikimedia.org/T313859 (JArguello-WMF)
[22:07:01] Data-Engineering, Traffic: varnishkafka / ATSkafka should support setting the kafka message timestamp - https://phabricator.wikimedia.org/T277553 (JArguello-WMF)
[22:07:05] Data-Engineering, Patch-For-Review: Create conda .deb and docker image - https://phabricator.wikimedia.org/T304450 (JArguello-WMF)
[22:07:09] Data-Engineering, Patch-For-Review: Migrate pagecounts-ez generation to hadoop - https://phabricator.wikimedia.org/T192474 (JArguello-WMF)
[22:08:08] Data-Engineering, Projects-Cleanup: Clean up wikimetrics - https://phabricator.wikimedia.org/T318193 (JArguello-WMF)
[22:13:59] Data-Platform-SRE, KaiOS-Wikipedia-app (Discovery): Implement depool (source only) and keep-downtime options on data-transfer cookbook - https://phabricator.wikimedia.org/T340793 (bking)
[22:36:40] Data-Platform-SRE: Upgrade Airflow instances to Bullseye - https://phabricator.wikimedia.org/T335261 (JArguello-WMF)
[22:37:38] Data-Engineering, Event-Platform: eventutilities-python EventProcessFunction throws NPE if user func returns None - https://phabricator.wikimedia.org/T335706 (JArguello-WMF)
[22:56:45] Data-Engineering, Discovery-Search, Event-Platform: Set up multi DC Kafka stretch cluster - https://phabricator.wikimedia.org/T340492 (JArguello-WMF)
[22:56:47] Data-Engineering: spark3 in yarn master mode exhibits warnings when the HDFS namenodes are in the failed over state - https://phabricator.wikimedia.org/T338137 (JArguello-WMF)
[22:56:49] Data-Engineering: Send a critical alert to data-engineering if produce_canary_events isn't running correctly - https://phabricator.wikimedia.org/T337055 (JArguello-WMF)
[22:56:51] Data-Engineering, Observability-Alerting: Reduce IRC/alert noise associated with monitor_refine_ systemd timers from alertmanager - https://phabricator.wikimedia.org/T337052 (JArguello-WMF)
[22:56:54] Data-Engineering, Patch-For-Review: Add checksumming of miniconda installer - https://phabricator.wikimedia.org/T337271 (JArguello-WMF)
[22:56:56] Data-Engineering: Decommission kafka-jumbo100[1-6] - https://phabricator.wikimedia.org/T336044 (JArguello-WMF)
[22:56:58] Data-Engineering: Bring an-coord100[3-4] into service - https://phabricator.wikimedia.org/T336045 (JArguello-WMF)
[22:57:00] Data-Engineering: Decommission druid100[4-6] - https://phabricator.wikimedia.org/T336043 (JArguello-WMF)
[22:57:02] Data-Engineering: Bring druid10[09-11] into service - https://phabricator.wikimedia.org/T336042 (JArguello-WMF)
[22:57:04] Data-Engineering: Bring kafka-jumbo10[09-15] into service - https://phabricator.wikimedia.org/T336041 (JArguello-WMF)
[22:57:06] Data-Engineering: Bring stat1010 into service with GPU from stat1005 - https://phabricator.wikimedia.org/T336040 (JArguello-WMF)
[22:57:08] Data-Engineering: Bring stat1009 into service - https://phabricator.wikimedia.org/T336036 (JArguello-WMF)
[22:57:10] Data-Engineering, Discovery-Search (Current work), Epic, Event-Platform: Flink Operations - https://phabricator.wikimedia.org/T328561 (JArguello-WMF)
[22:57:30] Data-Engineering, AQS2.0: Finalize the multi-dc configuration of AQS (nodejs) in codfw - https://phabricator.wikimedia.org/T331115 (JArguello-WMF)
[22:57:36] Data-Engineering, SRE, User-MoritzMuehlenhoff: Hadoop MapReduce port range cannot be configured to a fixed range - https://phabricator.wikimedia.org/T111433 (JArguello-WMF)
[22:57:38] Data-Engineering, EventStreams, Shared-Data-Infrastructure, Event-Platform: Implement server side filtering for EventStreams (if we should) - https://phabricator.wikimedia.org/T152731 (JArguello-WMF)
[23:05:31] Data-Engineering, Data-Platform-SRE, Data-Catalog: DataHub rights assignment is case-sensitive - https://phabricator.wikimedia.org/T309382 (BTullis)
[23:07:07] Data-Engineering, Data-Platform-SRE, Data-Catalog, Patch-For-Review: Create Airflow Pipeline for Ingesting/Updating Superset Data - https://phabricator.wikimedia.org/T309622 (BTullis)
[23:09:26] Data-Platform-SRE, Data-Catalog, Patch-For-Review: Create Airflow Pipeline for Ingesting/Updating Superset Data - https://phabricator.wikimedia.org/T309622 (JArguello-WMF)
[23:09:28] Data-Platform-SRE, Epic: Data Infrastructure as a Service MVP - https://phabricator.wikimedia.org/T308317 (JArguello-WMF)
[23:09:30] Data-Platform-SRE, Data Pipelines: Airflow scheduler and webserver logs should be readable by airflow instance admins - https://phabricator.wikimedia.org/T304615 (JArguello-WMF)
[23:09:32] Data-Platform-SRE, Data-Catalog: DataHub rights assignment is case-sensitive - https://phabricator.wikimedia.org/T309382 (JArguello-WMF)
[23:09:39] Data-Platform-SRE, Patch-For-Review: Add a presto query logger - https://phabricator.wikimedia.org/T269832 (JArguello-WMF)
[23:09:54] Data-Engineering, Data-Platform-SRE, Event-Platform, Platform Team Workboards (Clinic Duty Team): Avoid accepting Kafka messages with whacky timestamps - https://phabricator.wikimedia.org/T282887 (JArguello-WMF)
[23:23:03] Data-Engineering: --NEWLY ADDED ABOVE -- - https://phabricator.wikimedia.org/T304609 (JArguello-WMF) Stalled→Invalid
[23:34:43] Data-Engineering: --NEWLY ADDED ABOVE -- - https://phabricator.wikimedia.org/T306754 (JArguello-WMF) Stalled→Invalid
[23:36:58] Data-Engineering: --NEWLY ADDED ABOVE -- - https://phabricator.wikimedia.org/T304611 (JArguello-WMF) Stalled→Declined
[23:41:06] Data-Engineering, Product-Analytics: Investigate easier methods for WMF staff to access Superset - https://phabricator.wikimedia.org/T258962 (Dzahn) > the large majority of WMF staff Is it really true that almost everyone needs access to private data to do their job? I used to think we keep that access...
[23:43:05] Data-Platform-SRE, CAS-SSO, Infrastructure-Foundations: Switch DataHub authentication to OIDC - https://phabricator.wikimedia.org/T305874 (JArguello-WMF)
[23:46:46] Data-Engineering: --NEWLY ADDED ABOVE -- - https://phabricator.wikimedia.org/T306756 (JArguello-WMF) Open→Declined
[23:46:48] Data-Engineering: --NEWLY ADDED ABOVE -- - https://phabricator.wikimedia.org/T306755 (JArguello-WMF) Stalled→Declined
[23:46:50] Data-Engineering: --NEWLY ADDED ABOVE -- - https://phabricator.wikimedia.org/T304620 (JArguello-WMF) Stalled→Declined
[23:46:52] Data-Engineering: --NEWLY ADDED ABOVE -- - https://phabricator.wikimedia.org/T304612 (JArguello-WMF) Stalled→Declined
[23:46:54] Data-Engineering: --NEWLY ADDED ABOVE -- - https://phabricator.wikimedia.org/T304610 (JArguello-WMF) Stalled→Declined