[00:02:40] FIRING: [2x] SystemdUnitFailed: generate_vrts_aliases.service on mx-in1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[00:07:25] FIRING: [2x] SystemdUnitFailed: generate_vrts_aliases.service on mx-in1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:27:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:32:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:37:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[02:02:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[02:07:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:02:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:07:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:32:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:37:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[05:32:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[05:37:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:27:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:32:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:37:25] FIRING: [3x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:19:52] Mail, Infrastructure-Foundations, SRE, vrts, Znuny: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10325722 (revi) (Adding SRE and infra-foundations based on tasks at #mail.) For those without vrt-wiki access but have WMF-NDA, you have P710...
[09:24:45] Mail, Infrastructure-Foundations, SRE, vrts, Znuny: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10325754 (revi) > Additionally, it appears to be routed via Google., which perhaps has never been correct. If I recall correctly, `mx{1001|20...
[09:33:25] Mail, Infrastructure-Foundations, SRE, vrts, Znuny: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10325767 (taavi) for some reason the alias generator script thinks the alias is handled by google and does not route it to VRTS: ` Nov 15 09:0...
[09:36:06] Mail, Infrastructure-Foundations, SRE, vrts, Znuny: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10325776 (revi) I can definitely say it did not work that way yesterday: there was an incoming info-ko@wikimedia.org ticket at `2024-11-14T02:...
[09:43:53] Mail, Infrastructure-Foundations, SRE, vrts, Znuny: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10325786 (revi) It seems like it's entire `@wikimedia.org` that is refusing to route to VRTS. My test email to `oversight-ko-wp@wikimedia.org`...
[09:47:09] Mail, Infrastructure-Foundations, SRE, vrts, Znuny: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10325804 (Krd) That actually make VRTS fubar. Unbreak now please.
[10:16:21] Mail, collaboration-services, Infrastructure-Foundations, SRE, and 2 others: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10325873 (Jelto)
[10:18:10] Mail, collaboration-services, Infrastructure-Foundations, SRE, and 2 others: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10325882 (revi)
[10:19:05] Mail, collaboration-services, Infrastructure-Foundations, SRE, and 2 others: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10325876 (Ladsgroup) p:Triage→Unbreak! Nothing in recently merged patches of puppet stands out neither anything i...
[10:35:19] Mail, collaboration-services, Infrastructure-Foundations, SRE, and 2 others: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10325930 (Ladsgroup) Asking ITS if anything changed on their side recently.
[10:44:08] Mail, collaboration-services, Infrastructure-Foundations, SRE, and 2 others: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10325964 (Ladsgroup) As a fast fix, we can put vrt transport rule before gmail rule to make sure it gets checked first.
[10:46:59] Mail, collaboration-services, Infrastructure-Foundations, SRE, and 2 others: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10325989 (Krd) Can't we just check what was changed yesterday, and undo that?
[10:48:57] Mail, collaboration-services, Infrastructure-Foundations, SRE, and 2 others: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10326004 (taavi) >>! In T380009#10325964, @Ladsgroup wrote: > As a fast fix, we can put vrt transport rule before gmail r...
[10:59:28] Mail, collaboration-services, Infrastructure-Foundations, SRE, and 2 others: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10326046 (Ladsgroup) My postfix knowledge is not really good but what I mean is this order: ` transport_maps = regexp:/et...
[11:00:18] Mail, collaboration-services, Infrastructure-Foundations, SRE, and 2 others: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10326051 (Ladsgroup) >>! In T380009#10325989, @Krd wrote: > Can't we just check what was changed yesterday, and undo that...
[11:12:08] Mail, collaboration-services, Infrastructure-Foundations, SRE, and 2 others: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10326074 (eoghan) So the issue is coming from the vrts_aliases.py cron job. Something has changed in how gmail is respond...
[11:37:25] FIRING: [2x] SystemdUnitFailed: generate_vrts_aliases.service on mx-in1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:42:25] FIRING: [2x] SystemdUnitFailed: generate_vrts_aliases.service on mx-in1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:56:39] Mail, collaboration-services, Infrastructure-Foundations, SRE, and 3 others: VRTS e-mail address unreachable / e-mail routing issue - https://phabricator.wikimedia.org/T380009#10326236 (eoghan) p:Unbreak!→High We've made a change to the aliases routing script which we believe has fixed th...
[12:07:25] FIRING: [2x] SystemdUnitFailed: generate_vrts_aliases.service on mx-in1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:09:02] hi, can i pm a member of the infrastructure foundations team about something potentially private?
[12:12:25] FIRING: [3x] SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:15:16] A_smart_kitten: Sure, what topic?
[12:15:27] slyngs: a phab task
[12:15:34] i'll send you a pm
[12:15:58] Okay, I'll see if it's something I can do, otherwise I'll route you in the right direction
[12:17:25] FIRING: [3x] SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:37:25] RESOLVED: SystemdUnitFailed: generate_vrts_aliases.service on mx-in2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:46:09] netops, cloud-services-team, Cloud-VPS, Infrastructure-Foundations, SRE: CloudVPS: IPv6 in codfw1dev - https://phabricator.wikimedia.org/T245495#10326497 (aborrero)
[12:48:48] netops, cloud-services-team, Cloud-VPS, Infrastructure-Foundations, SRE: openstack: work out IPv6 and designate integration - https://phabricator.wikimedia.org/T374715#10326506 (aborrero) Stalled→Resolved a:aborrero this was done by means of {T378192}
[12:51:53] netops, cloud-services-team, Cloud-VPS, Infrastructure-Foundations, SRE: dns: integrate PTR support for 2a02:ec80:a100::/48 - https://phabricator.wikimedia.org/T376462#10326522 (aborrero) In progress→Resolved
[12:52:19] netops, cloud-services-team, Cloud-VPS, Infrastructure-Foundations, SRE: CloudVPS: IPv6 in codfw1dev - https://phabricator.wikimedia.org/T245495#10326524 (aborrero)
[12:53:57] netops, cloud-services-team, Cloud-VPS, Infrastructure-Foundations, SRE: CloudVPS: IPv6 in codfw1dev - https://phabricator.wikimedia.org/T245495#10326527 (aborrero) Open→Resolved a:aborrero I think we can consider IPv6 to be fully working on codfw1dev.
[13:01:22] SRE-tools, Infrastructure-Foundations, Spicerack, Patch-For-Review: gNMI module in Spicerack - https://phabricator.wikimedia.org/T344325#10326578 (ayounsi) Stalled→Declined Going to close that task as we're not planning on using gNMI for automation any further, due to various shortcom...
[13:02:26] SRE-tools, Infrastructure-Foundations: Package pyGNMI and dictdiffer to be used by cookbooks - https://phabricator.wikimedia.org/T340045#10326590 (ayounsi) Open→Declined Thanks for dictdiffer, because of a change in priorities and current limitations in pyGNMI, there is no more need to packag...
[13:03:48] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Add Dell switches support to Homer/Cookbooks - https://phabricator.wikimedia.org/T320638#10326595 (ayounsi) Stalled→Declined Because of the various limitations listed in {T340045} we're not going to proceed any further on Del...
[13:04:53] netops, Infrastructure-Foundations, SRE: Put Dell SONiC switches in production - https://phabricator.wikimedia.org/T335028#10326604 (ayounsi) Stalled→Declined Because of the various limitations listed in {T342673} (plus the ones from pygnmi) we're not going to proceed any further on Dell...
[14:04:54] Puppet: Keepalived Puppet module: Support IPv6 - https://phabricator.wikimedia.org/T380057 (taavi) NEW
[14:32:12] Puppet, IPv6, Patch-For-Review: Keepalived Puppet module: Support IPv6 - https://phabricator.wikimedia.org/T380057#10326995 (taavi)
[14:34:17] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 2 others: Decom prod infra side of the ulsfo-office link - https://phabricator.wikimedia.org/T379778#10326998 (RobH) > "Created by: cbissell Cross connect has been disconnected, per our policy it will be removed after 48 hours" I'...
[15:47:04] netops, Infrastructure-Foundations, SRE: Manange fundraising network elements from Netbox - https://phabricator.wikimedia.org/T377996#10327189 (cmooney)
[15:47:05] netops, fundraising-tech-ops, Infrastructure-Foundations, SRE: Manage frack switches with Netbox - https://phabricator.wikimedia.org/T268802#10327190 (cmooney)
[15:48:22] jhathaway: o/ didn't manage to complete the provision of the thanos nodes since we need to upgrade their firmware first :(
[15:49:30] ahh firmware upgrades, it is really unfortunate that server class hardware hasn't embraced fwupd, it works so well on the desktop side
[15:49:51] when do they expect to finish?
[17:29:38] jhathaway: In theory we can upgrade the firmware from the BMC Web UI, but for some reason I wasn't able to reach/use it :(
[17:30:03] otherwise I'd have already completed it, but maybe before provisioning we need the manual/cart console
[17:30:14] got it, thanks
[19:56:45] SRE-tools, cloud-services-team, Infrastructure-Foundations, IPv6: Some WMCS clusters apparently do not support IPv6 - https://phabricator.wikimedia.org/T271139#10328185 (taavi) >>! In T271139#10151973, @Volans wrote: > I guess that the clouddb are expected and they **all** don't have the AAAA rec...
[21:36:52] stupid question, how do I run a cookbook from another cookbook? I thought it'd just be a `cookbook_instance.get_runner().run()` or something like that, but I don't see such a thing anywhere in the code