[02:05:57] PROBLEM - Checks that the local airflow scheduler for airflow @research is working properly on an-airflow1002 is CRITICAL: CRITICAL: /usr/bin/env AIRFLOW_HOME=/srv/airflow-research /usr/lib/airflow/bin/airflow jobs check --job-type SchedulerJob --hostname an-airflow1002.eqiad.wmnet did not succeed https://wikitech.wikimedia.org/wiki/Analytics/Systems/Airflow [02:20:29] 10Data-Engineering: Can't deploy airflow-dags/research anymore - https://phabricator.wikimedia.org/T311336 (10bmansurov) Solved by running the following commands on `airflow1002`: `lang=bash cd /srv/deployment/airflow-dags/research-cache/cache/ sudo -u analytics-research git fetch --tags -f ` [02:20:43] 10Data-Engineering: Can't deploy airflow-dags/research anymore - https://phabricator.wikimedia.org/T311336 (10bmansurov) 05Open→03Resolved a:03bmansurov [07:54:41] 10Data-Engineering, 10Data-Engineering-Kanban, 10Patch-For-Review: Upgrade to latest PrestoDB and enable iceberg support - https://phabricator.wikimedia.org/T311525 (10JAllemandou) I confirm it works for me! Let's maybe give it a try on the prod cluster and ask our end-users to check their queries/dashboards? [08:02:25] 10Data-Engineering, 10Gerrit: Remove unused Gerrit repository mediawiki/services/aqs/deploy - https://phabricator.wikimedia.org/T309731 (10Aklapper) 05Resolved→03Open The repository [still exists and has content](https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/services/aqs/deploy/+/refs/heads/mast... [09:42:15] 10Analytics, 10API Platform: Establish testing procedure for Druid-based endpoints - https://phabricator.wikimedia.org/T311190 (10JAllemandou) >>! In T311190#8035757, @BPirkle wrote: > # I think I need a "spec" and matching data to ingest Yes! We have examples of spec as well as data for you. The hadoop-ingest... 
[10:38:45] 10Data-Engineering-Kanban, 10Airflow: Create cassandra loading HQL files from their oozie definition - https://phabricator.wikimedia.org/T311507 (10NOkafor-WMF) [12:47:58] I'm not sure what the Icinga alert above about the airflow job scheduler failure actually means. https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=an-airflow1002&service=Checks+that+the+local+airflow+scheduler+for+airflow+%40research+is+working+properly [12:48:30] I thought I'd run the command that the Icinga check runs and it shows this: [12:48:35] https://www.irccloud.com/pastebin/gOfxogWF/ [12:49:36] Ah, the systemd unit for the scheduler service is running and the check is green, but it seems to show an error. [12:49:41] https://www.irccloud.com/pastebin/ipVkfCqq/ [12:50:16] Should we reach out to the research team, or should we just try restarting the airflow-scheduler service? [12:52:35] 10Data-Engineering, 10Projects-Cleanup, 10Patch-For-Review: Remove unused Gerrit repository mediawiki/services/aqs/deploy - https://phabricator.wikimedia.org/T309731 (10hashar) The archiving of repository is usually done via #cleanup with a placeholder template https://phabricator.wikimedia.org/maniphest/tas... [13:14:13] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q4:(Need By: TBD) rack/setup/install an-presto10[06-15].eqiad.wmnet - https://phabricator.wikimedia.org/T306835 (10BTullis) Hi @Cmjohnson - that's really interesting. I think that you're one step closer to a working system than I am, but ultimately I think... [13:17:29] btullis: yeah that looks like a dag parse error maybe? [13:17:38] cc fab [13:20:39] It's interesting that the systemd unit for the scheduler itself still says 👍 I'm not currently certain whether that's the right thing to do here or not. At least we have the other check as well. 
[13:23:28] 10Analytics, 10API Platform: Establish testing procedure for Druid-based endpoints - https://phabricator.wikimedia.org/T311190 (10JAllemandou) I forgot to add on this: > I was able to execute an canned example druid query via the instructions on wikitech but was not able to query the mediawiki_history_reduced... [13:24:11] 10Data-Engineering, 10Product-Analytics, 10SDAW-MediaSearch, 10Structured-Data-Backlog (Current Work): [M] No data from ptwikinews in event.mediawiki_mediasearch_interaction table - https://phabricator.wikimedia.org/T308815 (10mfossati) @cchen, this query should count the ptwikinews events available in `ev... [13:28:10] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q3:(Need By: TBD) rack/setup/install stat1009 - https://phabricator.wikimedia.org/T299466 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host stat1009.eqiad.wmnet with OS buster [13:34:02] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q3:(Need By: TBD) rack/setup/install stat1009 - https://phabricator.wikimedia.org/T299466 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host stat1009.eqiad.wmnet with OS buster executed with errors: - stat100... [13:34:26] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q3:(Need By: TBD) rack/setup/install stat1009 - https://phabricator.wikimedia.org/T299466 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host stat1009.eqiad.wmnet with OS bullseye [13:40:04] 10Data-Engineering, 10DBA, 10Data-Services: Make linktarget table visible on cloud wiki replicas - https://phabricator.wikimedia.org/T305064 (10Lucas_Werkmeister_WMDE) >>! In T305064#7821932, @Ladsgroup wrote: > That is definitely in the medium-term work (=in a couple of months) to avoid bloating the table b... 
[13:48:40] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q3:(Need By: TBD) rack/setup/install stat1009 - https://phabricator.wikimedia.org/T299466 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host stat1009.eqiad.wmnet with OS bullseye executed with errors: - stat1... [13:48:51] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q3:(Need By: TBD) rack/setup/install stat1009 - https://phabricator.wikimedia.org/T299466 (10Cmjohnson) [13:53:34] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q3:(Need By: TBD) rack/setup/install stat1009 - https://phabricator.wikimedia.org/T299466 (10Cmjohnson) @BTullis @RobH @Papaul I set the raid up so the raid 1 ssds were first and used the install script for buster. Buster fails to see to the disks, so I... [13:54:53] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q3:(Need By: TBD) rack/setup/install stat1009 - https://phabricator.wikimedia.org/T299466 (10Cmjohnson) @BTullis just read your response on an-presto and see that you're experiencing this with stat1010. Thank you for digging into it more. [14:02:56] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q3:(Need By: TBD) rack/setup/install stat1009 - https://phabricator.wikimedia.org/T299466 (10BTullis) Thanks @Cmjohnson - yes I think that this is very likely to be the same issue. That's useful that you've experienced exactly the same outcome on this as I... 
[14:04:03] RECOVERY - Check unit status of analytics-dumps-fetch-clickstream on clouddumps1002 is OK: OK: Status of the systemd unit analytics-dumps-fetch-clickstream https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [14:17:09] PROBLEM - Check unit status of analytics-dumps-fetch-clickstream on clouddumps1002 is CRITICAL: CRITICAL: Status of the systemd unit analytics-dumps-fetch-clickstream https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [14:28:27] 10Data-Engineering, 10Data-Engineering-Kanban, 10Patch-For-Review: Upgrade to latest PrestoDB and enable iceberg support - https://phabricator.wikimedia.org/T311525 (10Ottomata) Do we have a way to do little bit more superset testing? I'm worried that some edge case in the latest presto version will break so... [14:30:26] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q3:(Need By: TBD) rack/setup/install stat1009 - https://phabricator.wikimedia.org/T299466 (10Ottomata) We will have to rebuild hadoop for bullsye, eh? {T310643} [14:38:26] (03CR) 10Milimetric: [V: 03+2 C: 03+2] "This is running in production now, so merging (no need for deploy)." [analytics/refinery] - 10https://gerrit.wikimedia.org/r/792215 (https://phabricator.wikimedia.org/T307714) (owner: 10Milimetric) [14:40:01] 10Data-Engineering, 10MediaViewer, 10MediaWiki-extensions-EventLogging, 10MW-1.39-notes (1.39.0-wmf.18; 2022-06-27): Decommission the MediaViewer and MultimediaViewer* instruments - https://phabricator.wikimedia.org/T310890 (10phuedx) [14:42:40] ottomata, I noticed this too. There is a notice 'The scheduler does not appear to be running. Last heartbeat was received 8 hours ago. [14:42:40] The DAGs list may not update, and new tasks will not be scheduled.', I think the dag parse error should not impact this. [14:42:52] cc btullis [14:43:52] hm I did try to run a development airflow instance on that host, maybe that broke the airflow scheduler? 
[14:45:51] the kerberos auth propagation from cache/keytab to airflow to skein to spark is a pain. aka it doesn't work for me. [14:52:25] RECOVERY - Checks that the local airflow scheduler for airflow @research is working properly on an-airflow1002 is OK: OK: /usr/bin/env AIRFLOW_HOME=/srv/airflow-research /usr/lib/airflow/bin/airflow jobs check --job-type SchedulerJob --hostname an-airflow1002.eqiad.wmnet succeeded https://wikitech.wikimedia.org/wiki/Analytics/Systems/Airflow [14:52:54] fab: btw you should be able to restart the research airflow instance [14:53:04] how would I go about starting airflow/scheduler? `airflow-research scheduler &` seems ad hoc but works [14:54:29] (03PS8) 10Snwachukwu: Add projectview hql scripts to analytics/refinery/hql path. [analytics/refinery] - 10https://gerrit.wikimedia.org/r/797240 (https://phabricator.wikimedia.org/T309023) [14:55:06] 10Data-Engineering, 10MediaViewer, 10MediaWiki-extensions-EventLogging, 10MW-1.39-notes (1.39.0-wmf.18; 2022-06-27), 10Patch-For-Review: Decommission the MediaViewer and MultimediaViewer* instruments - https://phabricator.wikimedia.org/T310890 (10phuedx) [14:55:06] sudo -u analytics-research systemctl restart airflow-scheduler@research [14:55:09] fab ^^ [14:55:23] 10Data-Engineering, 10MediaViewer, 10MediaWiki-extensions-EventLogging, 10MW-1.39-notes (1.39.0-wmf.18; 2022-06-27), 10Patch-For-Review: Decommission the MediaViewer and MultimediaViewer* instruments - https://phabricator.wikimedia.org/T310890 (10phuedx) [14:55:31] re kerb auth propagation, what do you mean? in your research instance? or in a dev instance? [14:56:32] using the SkeinHook, I can't get it to work.. neither on the dev instance on a stat machine or the research instance. [14:57:21] cc milimetric who has done this ^ ? [14:57:33] probably need to try it together [14:57:48] re the restart: 'Failed to restart airflow-scheduler@research.service: Access denied' [14:57:52] hmmm [14:58:00] with sudo -u analytics-research ? 
[14:58:25] yes [14:58:28] 10Data-Engineering, 10MediaViewer, 10MediaWiki-extensions-EventLogging, 10MW-1.39-notes (1.39.0-wmf.18; 2022-06-27), 10Patch-For-Review: Decommission the MediaViewer and MultimediaViewer* instruments - https://phabricator.wikimedia.org/T310890 (10phuedx) I confirmed with @MarkTraceur (who then confirmed... [14:59:20] huh. [14:59:39] that should work but i see it is not. will investigate. meetings starting now tho... [14:59:49] I saw this from milimetric, https://wikitech.wikimedia.org/wiki/Analytics/Systems/Airflow/Airflow_testing_instance_tutorial#Kerberos_Considerations. What confuses me is that I also can't get it to work on the airflow instance where the user has a keytab [15:00:06] yeah, maybe we can look at this after the meetings [15:02:24] 10Data-Engineering-Kanban, 10Data-Catalog, 10Data Engineering Planning: Document the Pageviews Dataset - https://phabricator.wikimedia.org/T308047 (10EChetty) [15:06:03] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning: Migrate the projectview jobs - https://phabricator.wikimedia.org/T305844 (10EChetty) [15:06:13] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning: [Airflow] Refactor HDFSArchiveOperator to run in Skein - https://phabricator.wikimedia.org/T310542 (10EChetty) [15:07:59] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning, 10Documentation: [Airflow] Kick off documentation in wikitech - https://phabricator.wikimedia.org/T302400 (10EChetty) [15:08:25] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning: Create cassandra loading HQL files from their oozie definition - https://phabricator.wikimedia.org/T311507 (10EChetty) [15:10:15] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning (Sprint 01): Create cassandra loading HQL files from their oozie definition - https://phabricator.wikimedia.org/T311507 (10EChetty) [15:10:27] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning (Sprint 01): [Airflow] 
Refactor HDFSArchiveOperator to run in Skein - https://phabricator.wikimedia.org/T310542 (10EChetty) [15:11:41] 10Data-Engineering, 10Event-Platform, 10SRE, 10serviceops: eventstreams chart should use latest common_templates - https://phabricator.wikimedia.org/T310721 (10JArguello-WMF) [15:11:51] 10Data-Engineering-Kanban, 10Event-Platform, 10SRE, 10serviceops, and 2 others: eventgate chart should use common_templates - https://phabricator.wikimedia.org/T303543 (10JArguello-WMF) [15:12:11] 10Data-Engineering-Kanban, 10Data Engineering Planning: Build Bigtop 1.5 Hadoop packages for Bullseye - https://phabricator.wikimedia.org/T310643 (10JArguello-WMF) [15:12:48] 10Data-Engineering-Kanban, 10Data Engineering Planning: Gobblin dataloss during namenode failure - https://phabricator.wikimedia.org/T311263 (10EChetty) [15:13:18] 10Data-Engineering-Kanban, 10Data Engineering Planning: HDFS Namenode failover failure - https://phabricator.wikimedia.org/T310293 (10EChetty) [15:13:20] 10Analytics-Wikistats, 10Data-Engineering-Kanban, 10Data Engineering Planning: Use dedicated Phabricator bug report / feature request forms - https://phabricator.wikimedia.org/T308610 (10JArguello-WMF) [15:13:30] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning, 10Patch-For-Review: Improvements of artifacts cache - https://phabricator.wikimedia.org/T307115 (10JArguello-WMF) [15:13:56] 10Data-Engineering-Kanban, 10Data Engineering Planning: Update ua-parser library for traffic data - https://phabricator.wikimedia.org/T306829 (10JArguello-WMF) [15:14:19] 10Data-Engineering-Kanban, 10Cassandra, 10Data Engineering Planning, 10User-Eevans: Properly add aqsloader user (w/ secrets) - https://phabricator.wikimedia.org/T305600 (10JArguello-WMF) [15:14:30] 10Data-Engineering-Kanban, 10Phabricator, 10Product-Analytics, 10wmfdata-python, 10Data Engineering Planning: Herald rule to add Product Analytics and Data Engineering tags to Wmfdata-Python tasks - 
https://phabricator.wikimedia.org/T304572 (10JArguello-WMF) [15:14:34] 10Data-Engineering-Kanban, 10Event-Platform, 10Data Engineering Planning: Remove StreamConfig::INTERNAL_SETTINGS logic from EventStreamConfig and do it in EventLogging client instead - https://phabricator.wikimedia.org/T286344 (10JArguello-WMF) [15:14:52] 10Data-Engineering-Kanban, 10DC-Ops, 10SRE, 10ops-eqiad, and 2 others: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10JArguello-WMF) [15:14:58] 10Data-Engineering-Kanban, 10Data Engineering Planning: Build and install spark3 assembly - https://phabricator.wikimedia.org/T310578 (10JArguello-WMF) [15:15:11] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning: Migrate the referrer job - https://phabricator.wikimedia.org/T305842 (10JArguello-WMF) [15:15:30] 10Analytics-Kanban, 10Data-Engineering-Kanban, 10Data Engineering Planning, 10Patch-For-Review: Add logic to purging scripts that requires admin action if it's about to delete a lot of data - https://phabricator.wikimedia.org/T270433 (10JArguello-WMF) [15:15:31] ottomata: with external researchers we are looking at the eventstream of links being added and removed to articles (page-links-change). one thing they observed is that these events can not always be found in the article's revision history (where the link was added/removed). is that an expected behaviour? do you have any thoughts why that might be the case (I could think of links being added to a template instead of the article)? they [15:15:31] wrote more details here: https://github.com/tlarock/pywikibot/issues/1 Thanks! 
[15:15:36] 10Data-Engineering-Kanban, 10Data-Catalog, 10Data Engineering Planning: Connect MVP to Hive metastore [Mile Stone 4] - https://phabricator.wikimedia.org/T299897 (10JArguello-WMF) [15:15:55] 10Data-Engineering-Kanban, 10Product-Analytics, 10wmfdata-python, 10Data Engineering Planning, 10GitLab (Project Migration): Move Wmfdata-Python from Github to Gitlab - https://phabricator.wikimedia.org/T304544 (10JArguello-WMF) [15:16:09] 10Data-Engineering-Kanban, 10SRE, 10Traffic, 10Data Engineering Planning: Spike: Investigate creating robust alerts to notify that caching nodes are not sending traffic data - https://phabricator.wikimedia.org/T304651 (10JArguello-WMF) [15:16:17] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning: Projectviews by country Airflow job - https://phabricator.wikimedia.org/T303193 (10JArguello-WMF) [15:16:36] 10Analytics, 10Data-Engineering-Kanban, 10Event-Platform, 10Wikidata, and 4 others: Migrate WikibaseTermboxInteraction EventLogging Schema to new EventPlatform thingy - https://phabricator.wikimedia.org/T290303 (10JArguello-WMF) [15:16:47] 10Data-Engineering-Kanban, 10Data Engineering Planning, 10Patch-For-Review: Upgrade to latest PrestoDB and enable iceberg support - https://phabricator.wikimedia.org/T311525 (10JArguello-WMF) [15:22:05] 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01): Create conda-base-env with last pyspark - https://phabricator.wikimedia.org/T309227 (10JArguello-WMF) [15:23:52] mforns: can you update the hacky docs I put up about testing airflow with keytabs: https://wikitech.wikimedia.org/w/index.php?title=Analytics%2FSystems%2FAirflow%2FAirflow_testing_instance_tutorial&type=revision&diff=1993401&oldid=1983465 [15:24:08] (or tell me and I'll do it) [15:24:28] 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01): Analyze officewiki requests - https://phabricator.wikimedia.org/T306136 (10JArguello-WMF) [15:24:39] 10Data-Engineering: [Iceburg] Create corresponding 
event_sanitized tables with Iceberg - https://phabricator.wikimedia.org/T311737 (10EChetty) [15:25:12] PROBLEM - Checks that the local airflow scheduler for airflow @research is working properly on an-airflow1002 is CRITICAL: CRITICAL: /usr/bin/env AIRFLOW_HOME=/srv/airflow-research /usr/lib/airflow/bin/airflow jobs check --job-type SchedulerJob --hostname an-airflow1002.eqiad.wmnet did not succeed https://wikitech.wikimedia.org/wiki/Analytics/Systems/Airflow [15:26:35] 10Analytics-Wikistats, 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01): Use dedicated Phabricator bug report / feature request forms - https://phabricator.wikimedia.org/T308610 (10JArguello-WMF) [15:27:56] 10Data-Engineering-Kanban, 10DC-Ops, 10SRE, 10ops-eqiad, and 2 others: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10JArguello-WMF) [15:29:14] 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01): HDFS Namenode failover failure - https://phabricator.wikimedia.org/T310293 (10JArguello-WMF) [15:31:07] 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01): Gobblin dataloss during namenode failure - https://phabricator.wikimedia.org/T311263 (10JArguello-WMF) [15:32:56] 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01): Build and install spark3 assembly - https://phabricator.wikimedia.org/T310578 (10JArguello-WMF) [15:33:15] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning (Sprint 01): Migrate the projectview jobs - https://phabricator.wikimedia.org/T305844 (10JArguello-WMF) [15:33:28] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning (Sprint 01): Migrate the referrer job - https://phabricator.wikimedia.org/T305842 (10JArguello-WMF) [15:33:48] 10Analytics-Kanban, 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01), 10Patch-For-Review: Add logic to purging scripts that requires admin action if it's about to delete a lot of data - 
https://phabricator.wikimedia.org/T270433 (10JArguello-WMF) [15:34:06] 10Data-Engineering-Kanban, 10Data-Catalog, 10Data Engineering Planning (Sprint 01): Connect MVP to Hive metastore [Mile Stone 4] - https://phabricator.wikimedia.org/T299897 (10JArguello-WMF) [15:34:21] 10Data-Engineering-Kanban, 10Product-Analytics, 10wmfdata-python, 10Data Engineering Planning (Sprint 01), 10GitLab (Project Migration): Move Wmfdata-Python from Github to Gitlab - https://phabricator.wikimedia.org/T304544 (10JArguello-WMF) [15:34:46] 10Data-Engineering-Kanban, 10SRE, 10Traffic, 10Data Engineering Planning (Sprint 01): Spike: Investigate creating robust alerts to notify that caching nodes are not sending traffic data - https://phabricator.wikimedia.org/T304651 (10JArguello-WMF) [15:34:51] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning (Sprint 01): Projectviews by country Airflow job - https://phabricator.wikimedia.org/T303193 (10JArguello-WMF) [15:35:18] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning (Sprint 01), 10Documentation: [Airflow] Kick off documentation in wikitech - https://phabricator.wikimedia.org/T302400 (10JArguello-WMF) [15:35:46] 10Analytics, 10Data-Engineering-Kanban, 10Event-Platform, 10Wikidata, and 4 others: Migrate WikibaseTermboxInteraction EventLogging Schema to new EventPlatform thingy - https://phabricator.wikimedia.org/T290303 (10JArguello-WMF) [15:36:25] 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01), 10Patch-For-Review: Upgrade to latest PrestoDB and enable iceberg support - https://phabricator.wikimedia.org/T311525 (10JArguello-WMF) [15:43:43] 10Data-Engineering-Kanban, 10DC-Ops, 10SRE, 10ops-eqiad, and 2 others: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1001 for host stat1010.eqiad.wmnet with OS bullseye completed: - stat1010 (*... 
[15:46:38] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning (Sprint 01), 10Patch-For-Review: Update and add copies of projectview hql script to analytics/refinery/hql path - https://phabricator.wikimedia.org/T309023 (10JArguello-WMF) [15:47:54] 10Data-Engineering-Kanban, 10Generated Data Platform, 10Data Engineering Planning (Sprint 01), 10Patch-For-Review: [Shared Event Platform] - Research Flink Changelog semantics to inform POC MW schema design - https://phabricator.wikimedia.org/T310082 (10JArguello-WMF) [15:48:53] 10Data-Engineering-Kanban, 10Data-Engineering-Radar, 10Event-Platform, 10Generated Data Platform, 10Data Engineering Planning (Sprint 01): Add better support for using Event Platform streams with the Flink DataStream API - https://phabricator.wikimedia.org/T310302 (10JArguello-WMF) [15:50:10] 10Data-Engineering-Kanban, 10Event-Platform, 10Data Engineering Planning (Sprint 01), 10Patch-For-Review: [BUG] jsonschema-tools materializes fields in yaml in a different order than in json files - https://phabricator.wikimedia.org/T308450 (10JArguello-WMF) [15:51:20] 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01), 10Patch-For-Review: Pageview definition relies on X-Analytics to determine special pages - https://phabricator.wikimedia.org/T304362 (10JArguello-WMF) [15:52:13] 10Data-Engineering, 10Data³: Audit JSON schemas for Gerrit events - https://phabricator.wikimedia.org/T311615 (10hashar) I am using the Java library `com.github.victools:jsonschema-generator:4.25.0` https://victools.github.io/jsonschema-generator/ . Reading the source code it apparently has support for multip... 
[15:52:17] 10Data-Engineering-Kanban, 10MediaWiki-extensions-EventLogging, 10Data Engineering Planning (Sprint 01), 10Patch-For-Review: Generate $wgEventLoggingSchemas from $wgEventStreams - https://phabricator.wikimedia.org/T303602 (10JArguello-WMF) [15:54:14] 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01), 10Patch-For-Review: Upgrade to latest PrestoDB and enable iceberg support - https://phabricator.wikimedia.org/T311525 (10Ottomata) [15:54:29] 10Data-Engineering, 10Data³: Audit JSON schemas for Gerrit events - https://phabricator.wikimedia.org/T311615 (10hashar) If the draft-07 schemas at https://people.wikimedia.org/~hashar/T304947/schemas-3.4.4-draft-07/ looks kind of okish, is there a git repo to which I should propose a change to add them? Wha... [16:18:20] 10Data-Engineering-Kanban, 10Event-Platform, 10Data Engineering Planning (Sprint 01), 10Patch-For-Review: [BUG] jsonschema-tools materializes fields in yaml in a different order than in json files - https://phabricator.wikimedia.org/T308450 (10Ottomata) [16:19:26] 10Data-Engineering-Kanban, 10Event-Platform, 10Data Engineering Planning (Sprint 01), 10Patch-For-Review: [BUG] jsonschema-tools materializes fields in yaml in a different order than in json files - https://phabricator.wikimedia.org/T308450 (10Ottomata) Need to do some testing to see if rematerializing all... [16:24:38] (03PS9) 10Snwachukwu: Add projectview hql scripts to analytics/refinery/hql path. [analytics/refinery] - 10https://gerrit.wikimedia.org/r/797240 (https://phabricator.wikimedia.org/T309023) [16:30:03] (03CR) 10Joal: [V: 03+2 C: 03+2] "LGTM! 
Thanks a lot Sandra for the persistence in making this awesome :)" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/797240 (https://phabricator.wikimedia.org/T309023) (owner: 10Snwachukwu) [16:34:04] 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01), 10Patch-For-Review: Upgrade to latest PrestoDB and enable iceberg support - https://phabricator.wikimedia.org/T311525 (10JAllemandou) No, superset staging doesn't use presto-test - there is almost no data nor computation power under that one... [16:37:31] 10Data-Engineering-Kanban, 10Data Engineering Planning: HDFS Namenode failover failure - https://phabricator.wikimedia.org/T310293 (10EChetty) [16:43:21] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning (Sprint 01): Create cassandra loading HQL files from their oozie definition - https://phabricator.wikimedia.org/T311507 (10EChetty) [16:47:58] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning (Sprint 01): [Airflow] Refactor HDFSArchiveOperator to run in Skein - https://phabricator.wikimedia.org/T310542 (10EChetty) [16:49:52] 10Analytics-Wikistats, 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01): Use dedicated Phabricator bug report / feature request forms - https://phabricator.wikimedia.org/T308610 (10EChetty) a:03Milimetric [16:51:24] 10Analytics-Wikistats, 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01): Use dedicated Phabricator bug report / feature request forms - https://phabricator.wikimedia.org/T308610 (10Milimetric) my bad, this is done [16:56:19] 10Data-Engineering-Kanban, 10DC-Ops, 10SRE, 10ops-eqiad, and 2 others: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10EChetty) [17:01:44] 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01), 10Patch-For-Review: Upgrade to latest PrestoDB and enable iceberg support - https://phabricator.wikimedia.org/T311525 (10EChetty) [17:06:03] 10Analytics, 10Data-Engineering-Kanban, 
10Event-Platform, 10Wikidata, and 4 others: Migrate WikibaseTermboxInteraction EventLogging Schema to new EventPlatform thingy - https://phabricator.wikimedia.org/T290303 (10EChetty) [17:07:37] 10Data-Engineering-Kanban, 10Airflow: Projectviews by country Airflow job - https://phabricator.wikimedia.org/T303193 (10EChetty) [17:09:41] 10Data-Engineering, 10Data-Engineering-Kanban, 10SRE, 10Traffic: Spike: Investigate creating robust alerts to notify that caching nodes are not sending traffic data - https://phabricator.wikimedia.org/T304651 (10EChetty) [17:11:29] 10Data-Engineering-Kanban, 10Product-Analytics, 10wmfdata-python, 10GitLab (Project Migration): Move Wmfdata-Python from Github to Gitlab - https://phabricator.wikimedia.org/T304544 (10EChetty) [17:17:11] 10Analytics-Kanban, 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01), 10Patch-For-Review: Add logic to purging scripts that requires admin action if it's about to delete a lot of data - https://phabricator.wikimedia.org/T270433 (10EChetty) [17:19:03] 10Data-Engineering-Kanban, 10Airflow, 10Data Engineering Planning (Sprint 01): Migrate the projectview jobs - https://phabricator.wikimedia.org/T305844 (10EChetty) [17:21:52] 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01): Build and install spark3 assembly - https://phabricator.wikimedia.org/T310578 (10EChetty) [17:25:04] 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01): Investigate Gobblin dataloss during namenode failure - https://phabricator.wikimedia.org/T311263 (10EChetty) [17:27:53] 10Analytics-Wikistats, 10Data-Engineering-Kanban, 10Data Engineering Planning (Sprint 01): Use dedicated Phabricator bug report / feature request forms - https://phabricator.wikimedia.org/T308610 (10Milimetric) there are so many boards now! 
[17:29:46] 10Data-Engineering-Kanban: Investigate Gobblin dataloss during namenode failure - https://phabricator.wikimedia.org/T311263 (10EChetty) [17:31:17] 10Data-Engineering-Kanban, 10Event-Platform, 10Patch-For-Review: [BUG] jsonschema-tools materializes fields in yaml in a different order than in json files - https://phabricator.wikimedia.org/T308450 (10EChetty) [18:26:31] great btullis i am glad you liked it, more interesting things here: https://www.usenix.org/conference/pepr22/conference-program [19:30:59] 10Data-Engineering, 10Data³: Audit JSON schemas for Gerrit events - https://phabricator.wikimedia.org/T311615 (10Ottomata) > is there a git repo to which I should propose a change to add them? https://gerrit.wikimedia.org/r/plugins/gitiles/schemas/event/secondary/+/refs/heads/master It will def be easier to r... [20:14:35] 10Data-Engineering-Kanban, 10Data Engineering Planning, 10Data-Catalog: Data Catalog Demo - https://phabricator.wikimedia.org/T310203 (10Milimetric) [21:52:12] 10Data-Engineering, 10Product-Analytics, 10SDAW-MediaSearch, 10Structured-Data-Backlog (Current Work): [M] No data from ptwikinews in event.mediawiki_mediasearch_interaction table - https://phabricator.wikimedia.org/T308815 (10cchen) 05In progress→03Resolved @mfossati I see the data is available now! t... [22:10:06] (03PS1) 10Cwhite: Add logging/sal/1.0.0 schema [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/810115 (https://phabricator.wikimedia.org/T222826)