[00:15:24] RECOVERY - Check systemd state on an-launcher1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:16:43] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[00:21:32] PROBLEM - Check systemd state on an-launcher1002 is CRITICAL: CRITICAL - degraded: The following units failed: monitor_refine_eventlogging_analytics.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:21:43] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:56:44] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[02:01:43] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:26:43] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:31:43] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:36:41] Quarry, superset.wmcloud.org, cloud-services-team (FY2022/2023-Q4): Move Quarry to be an installation of Superset - https://phabricator.wikimedia.org/T169452 (EpicPupper) I completely disagree that Superset is a “very bad tool”. IMO, the design is so much better, and there’s more features.
[03:37:28] Quarry, superset.wmcloud.org, cloud-services-team (FY2022/2023-Q4): Move Quarry to be an installation of Superset - https://phabricator.wikimedia.org/T169452 (EpicPupper) “now we are going to lose them” the requests will still be archived on Phabricator; I don’t see the point of your argument…
[06:33:34] (CR) Kosta Harlan: Add section title and ordinal in the image_suggestion_interaction schema (2 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/924879 (https://phabricator.wikimedia.org/T335716) (owner: Sergio Gimeno)
[07:31:44] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:52:16] Data-Engineering, Event-Platform Value Stream, Discovery-Search (Current work), Patch-For-Review: Add support for redirects in CirrusSearch - https://phabricator.wikimedia.org/T325315 (dcausse) @pfischer since we might want to change the current behavior in CirrusSearch (stop treating double redi...
[09:26:44] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:31:43] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:43:05] Quarry, superset.wmcloud.org, cloud-services-team (FY2022/2023-Q4): Move Quarry to be an installation of Superset - https://phabricator.wikimedia.org/T169452 (Stuartyeates) > Exactly. These ones are already there, by request or spoken need, and now we are going to lose them. As explained above, no...
[09:52:22] Data-Engineering-Planning, Event-Platform Value Stream, Wikidata, Wikidata-Query-Service, and 2 others: Upgrade the WDQS streaming updater to latest flink (1.16) - https://phabricator.wikimedia.org/T289836 (Gehel) Open→Resolved
[09:56:44] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:01:43] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:04:29] Data-Engineering, DBA, Data-Persistence, Data-Services, and 2 others: Investigate if maintain-replica-indexes is still needed - https://phabricator.wikimedia.org/T337734 (Ladsgroup) >>! In T337734#8895852, @MusikAnimal wrote: >>>! In T337734#8895190, @Ladsgroup wrote: >> So I repooled the ones th...
[10:09:19] Data-Engineering, DBA, Data-Services, TaxonBot, and 3 others: Rebuild sanitarium hosts - https://phabricator.wikimedia.org/T337446 (Ladsgroup)
[10:10:10] Data-Engineering, DBA, Data-Persistence, Data-Services, and 2 others: Investigate if maintain-replica-indexes is still needed - https://phabricator.wikimedia.org/T337734 (Ladsgroup) Open→Resolved a:Ladsgroup
[10:14:48] Data-Engineering, DBA, Data-Services, TaxonBot, and 3 others: Rebuild sanitarium hosts - https://phabricator.wikimedia.org/T337446 (Ladsgroup) Open→Resolved The hosts have been fully rebuilt and working as expected without any major replag anymore. The indexes have been added too. So I'm...
[10:20:38] Heya mforns - I'm sorry I was gone yesterday evening
[10:20:46] Thank you for fixing those issues<3
[10:23:12] Quarry, superset.wmcloud.org, cloud-services-team (FY2022/2023-Q4): Move Quarry to be an installation of Superset - https://phabricator.wikimedia.org/T169452 (IKhitron) >>! In T169452#8897787, EpicPupper wrote: > I completely disagree that Superset is a “very bad tool”. IMO, the design is so much bet...
[10:52:42] Data-Engineering, Event-Platform Value Stream, Machine-Learning-Team: Create new mediawiki.page_links_change stream based on fragment/mediawiki/state/change/page - https://phabricator.wikimedia.org/T331399 (dcausse) @pfischer thanks for compiling all this! Some fields like `to` can be of different me...
[10:55:02] (PS3) Sergio Gimeno: Add section title and ordinal in the image_suggestion_interaction schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/924879 (https://phabricator.wikimedia.org/T335716)
[10:55:11] (CR) Sergio Gimeno: Add section title and ordinal in the image_suggestion_interaction schema (2 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/924879 (https://phabricator.wikimedia.org/T335716) (owner: Sergio Gimeno)
[11:29:31] (PS1) Sergio Gimeno: Add section title and ordinal in image suggestions submission events [schemas/event/secondary] - https://gerrit.wikimedia.org/r/926468 (https://phabricator.wikimedia.org/T335716)
[11:29:58] (CR) CI reject: [V: -1] Add section title and ordinal in image suggestions submission events [schemas/event/secondary] - https://gerrit.wikimedia.org/r/926468 (https://phabricator.wikimedia.org/T335716) (owner: Sergio Gimeno)
[11:36:44] (PS2) Sergio Gimeno: Add section title and ordinal in image suggestions submission events [schemas/event/secondary] - https://gerrit.wikimedia.org/r/926468 (https://phabricator.wikimedia.org/T335716)
[12:26:44] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:31:44] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:00:20] Data-Engineering, Event-Platform Value Stream, Machine-Learning-Team: Create new mediawiki.page_links_change stream based on fragment/mediawiki/state/change/page - https://phabricator.wikimedia.org/T331399 (Ottomata) Thanks! > no way to distinguish local uploads from media uploaded to commons for me...
[13:13:03] Data-Engineering, DBA: Clean up clouddb1021 - https://phabricator.wikimedia.org/T337961 (Andrew) @Ladsgroup can I assign this to you?
[13:14:46] Data-Engineering, Event-Platform Value Stream (Sprint 14 A): eventutilities-python: review and clean up in preparation for a GA release. - https://phabricator.wikimedia.org/T336488 (CodeReviewBot) gmodena updated https://gitlab.wikimedia.org/repos/data-engineering/eventutilities-python/-/merge_requests/6...
[13:21:27] Data-Engineering, Event-Platform Value Stream, Discovery-Search (Current work), Patch-For-Review: Add support for redirects in CirrusSearch - https://phabricator.wikimedia.org/T325315 (Ottomata) Thanks for the data modeling work yall! Now that we are looking at the options in T331399#8898566, I...
[13:45:18] Data-Engineering, DBA, Data-Services, TaxonBot, and 3 others: Rebuild sanitarium hosts - https://phabricator.wikimedia.org/T337446 (MusikAnimal) >>! In T337446#8898231, @Ladsgroup wrote: > The hosts have been fully rebuilt and working as expected without any major replag anymore. The indexes have...
[13:49:49] Data-Engineering, Event-Platform Value Stream, Machine-Learning-Team: Create new mediawiki.page_links_change stream based on fragment/mediawiki/state/change/page - https://phabricator.wikimedia.org/T331399 (dcausse) >>! In T331399#8898566, @Ottomata wrote: > Thanks! > >> no way to distinguish local...
[14:06:44] (SystemdUnitFailed) firing: (3) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:07:34] Data-Engineering, Event-Platform Value Stream, Discovery-Search (Current work), Patch-For-Review: Add support for redirects in CirrusSearch - https://phabricator.wikimedia.org/T325315 (pfischer) @Ottomata, I'm not sure about redirects always being local. According to [[ https://www.mediawiki.org/...
[14:11:41] Analytics, Data-Engineering, Data-Engineering-Wikistats: Wikistats Bug: Small countries not displayed on the map - https://phabricator.wikimedia.org/T338033 (Robertsky)
[14:13:58] Data-Engineering, DBA, Data-Services, TaxonBot, and 3 others: Rebuild sanitarium hosts - https://phabricator.wikimedia.org/T337446 (TheDJ) Also much thanks to especially @Marostegui from my side. >>! In T337446#8898661, @MusikAnimal wrote: > I wanted to ask something I've genuinely been curious...
[14:21:43] (SystemdUnitFailed) firing: (3) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:37:28] Data-Engineering, Event-Platform Value Stream, Machine-Learning-Team: Create new mediawiki.page_links_change stream based on fragment/mediawiki/state/change/page - https://phabricator.wikimedia.org/T331399 (pfischer) > I wonder if we can just reuse page entity in the link fragment schema. If we excl...
[14:56:43] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:01:43] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:23:19] Data-Engineering, Event-Platform Value Stream, Discovery-Search (Current work), Patch-For-Review: Add support for redirects in CirrusSearch - https://phabricator.wikimedia.org/T325315 (daniel) >>! In T325315#8898688, @pfischer wrote: > @Ottomata, I'm not sure about redirects always being local. A...
[16:09:52] no problemo joal, it was more of an informative ping, we already had discussed the code!
[16:20:34] Data-Engineering, Event-Platform Value Stream, Machine-Learning-Team: Create new mediawiki.page_links_change stream based on fragment/mediawiki/state/change/page - https://phabricator.wikimedia.org/T331399 (Ottomata) > Why did you $ref: /fragment/mediawiki/state/change/page/1.0.0 instead /fragment/me...
[16:21:35] Data-Engineering, Event-Platform Value Stream, Machine-Learning-Team: Create new mediawiki.page_links_change stream based on fragment/mediawiki/state/change/page - https://phabricator.wikimedia.org/T331399 (Ottomata) And > fragment/mediawiki/state/change/page_link Should have been fragment/mediawiki...
[16:24:55] Data-Engineering, Event-Platform Value Stream, Discovery-Search (Current work), Patch-For-Review: Add support for redirects in CirrusSearch - https://phabricator.wikimedia.org/T325315 (Ottomata) @daniel, do you know if we can get a canonical name (wiki_id?) of the wiki for interwiki link? If so,...
[16:31:14] Data-Engineering, Event-Platform Value Stream, Discovery-Search (Current work), Patch-For-Review: Add support for redirects in CirrusSearch - https://phabricator.wikimedia.org/T325315 (daniel) >>! In T325315#8899114, @Ottomata wrote: > @daniel, do you know if we can get a canonical name (wiki_id?...
[16:35:42] Data-Engineering, Event-Platform Value Stream, Discovery-Search (Current work), Patch-For-Review: Add support for redirects in CirrusSearch - https://phabricator.wikimedia.org/T325315 (Ottomata) > But it should be possible for "local" interwiki links Does "local" here mean local to the current wi...
[17:51:47] Data-Engineering, Event-Platform Value Stream, Discovery-Search (Current work), Patch-For-Review: Add support for redirects in CirrusSearch - https://phabricator.wikimedia.org/T325315 (daniel) >>! In T325315#8899158, @Ottomata wrote: >> But it should be possible for "local" interwiki links > Does...
[17:52:26] Data-Engineering, Event-Platform Value Stream, Discovery-Search (Current work), Patch-For-Review: Add support for redirects in CirrusSearch - https://phabricator.wikimedia.org/T325315 (daniel) >>! In T325315#8899402, @daniel wrote: >>>! In T325315#8899158, @Ottomata wrote: >> Does "local" here me...
[17:55:54] Data-Engineering, Event-Platform Value Stream, Discovery-Search (Current work), Patch-For-Review: Add support for redirects in CirrusSearch - https://phabricator.wikimedia.org/T325315 (Ottomata) Phewf okay thank you!
[18:12:05] Data-Engineering, Product-Analytics: Check home/HDFS leftovers of xihua - https://phabricator.wikimedia.org/T337711 (mpopov) a:mpopov→None Can confirm it is safe to delete all instances of /home/xihua on the stat* machines, including on stat1005 where there are files. They are not needed.
[18:12:48] Data-Engineering, Product-Analytics: Check home/HDFS leftovers of xihua - https://phabricator.wikimedia.org/T337711 (mpopov) p:Medium→Triage
[18:13:14] Data-Engineering, Product-Analytics: Remove home/HDFS leftovers of xihua - https://phabricator.wikimedia.org/T337711 (mpopov)
[18:13:37] Data-Engineering, Product-Analytics: Remove home/HDFS leftovers of xihua - https://phabricator.wikimedia.org/T337711 (mpopov)
[18:53:15] Data-Engineering-Planning, Event-Platform Value Stream (Sprint 14 A): Improve Event Platform and MediaWiki Event Enrichment wikitech documentation - https://phabricator.wikimedia.org/T329629 (Ottomata) Updated https://wikitech.wikimedia.org/wiki/Event_Platform/Stream_Processing/Flink more today, reorged...
[19:01:43] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:06:08] Data-Engineering, DBA: Clean up clouddb1021 - https://phabricator.wikimedia.org/T337961 (Ladsgroup) I prefer if someone else check what clean be cleaned up first. I can do the compression of the tables but cleaning up logs and such should be rather easy. Still, if noone does it by early next week, I can...
[19:27:11] Data-Engineering-Planning, Epic, Event-Platform Value Stream (Sprint 14 A), Patch-For-Review: Deploy mediawiki-page-content-change-enrichment to wikikube k8s - https://phabricator.wikimedia.org/T325303 (CodeReviewBot) otto opened https://gitlab.wikimedia.org/repos/data-engineering/eventutilities-...
[19:31:07] Data-Engineering, Event-Platform Value Stream (Sprint 14 A), Patch-For-Review: Release mediawiki.page_change.v1 stream - https://phabricator.wikimedia.org/T336817 (Ottomata)
[19:52:30] Data-Engineering-Planning, Data Pipelines, Shared-Data-Infrastructure: [Iceberg] Debianize and install iceberg support for Spark, Presto, and optionally Hive - https://phabricator.wikimedia.org/T311738 (xcollazo) In progress→Resolved
[19:52:34] Data-Engineering-Planning, Epic: [Iceberg] Epic: Icebergify event_sanitized database - https://phabricator.wikimedia.org/T311743 (xcollazo)
[20:20:20] Data-Engineering, Event-Platform Value Stream, Machine-Learning-Team: Create new mediawiki.page_links_change stream based on fragment/mediawiki/state/change/page - https://phabricator.wikimedia.org/T331399 (pfischer) Alright, so if we reduce the scope from general link to page link -- that is a link...
[20:24:36] Data-Engineering, Event-Platform Value Stream, Discovery-Search (Current work), Patch-For-Review: Add support for redirects in CirrusSearch - https://phabricator.wikimedia.org/T325315 (pfischer) If we do not resolve non-local page IDs, would it suffice to pass the iw_prefix for now (to indicate t...
[21:11:36] Data-Engineering, Event-Platform Value Stream, Machine-Learning-Team: Create new mediawiki.page_links_change stream based on fragment/mediawiki/state/change/page - https://phabricator.wikimedia.org/T331399 (Ottomata) Based on all the info you've gathered, and comments in T325315, I think we can avoid...
[21:15:39] Data-Engineering, Data-Engineering-Wikistats, I18n: Fill in missing qqq.json and enforce in future - https://phabricator.wikimedia.org/T338072 (Reedy)
[21:17:18] Data-Engineering, Data-Engineering-Wikistats, I18n: Fill in missing qqq.json and enforce in future - https://phabricator.wikimedia.org/T338072 (Reedy)
[21:17:21] Data-Engineering, Event-Platform Value Stream, Discovery-Search (Current work), Patch-For-Review: Add support for redirects in CirrusSearch - https://phabricator.wikimedia.org/T325315 (Ottomata) So I guess: `lang=yaml redirect_target_page: type: object properties: page_id: $ref: '...
[22:43:58] Data-Engineering, Data-Engineering-Wikistats: Wikistats Bug: Small countries not displayed on the map - https://phabricator.wikimedia.org/T338033 (Reedy)
[23:01:44] (SystemdUnitFailed) firing: (2) monitor_refine_eventlogging_analytics.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed