[00:04:25] (SystemdUnitFailed) firing: (2) prometheus_lvs_realserver_mss.service on ncredir1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:04:25] (SystemdUnitFailed) firing: (2) prometheus_lvs_realserver_mss.service on ncredir1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:04:25] (SystemdUnitFailed) firing: (2) prometheus_lvs_realserver_mss.service on ncredir1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:31:46] Traffic, MW-on-K8s, serviceops, SRE: Move 100% of external traffic to Kubernetes (excluding Votewiki and Commons) - https://phabricator.wikimedia.org/T362323#9717001 (jijiki)
[09:35:43] Traffic, MoveComms-Support, MW-on-K8s, serviceops, SRE: Move 100% of external traffic to Kubernetes (excluding Votewiki and Commons) - https://phabricator.wikimedia.org/T362323#9717014 (jijiki)
[11:31:25] netops, Infrastructure-Foundations: mr1-eqsin performance issue - https://phabricator.wikimedia.org/T362522#9717511 (cmooney) FWIW I changed the key-exchange algo configured on mr1-eqsin to see if it would make any difference, from some brief searching the ec21159 one seems to use less cpu than dh group-...
[12:04:25] (SystemdUnitFailed) firing: (2) prometheus_lvs_realserver_mss.service on ncredir1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:29:25] (SystemdUnitFailed) firing: (2) prometheus_lvs_realserver_mss.service on ncredir1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:39:25] (SystemdUnitFailed) resolved: (2) prometheus_lvs_realserver_mss.service on ncredir1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:02:44] netops, Infrastructure-Foundations, ops-codfw, SRE: codfw: use old asw switches from row A and B as msw switches in row C and D - https://phabricator.wikimedia.org/T361871#9717872 (Papaul) Open→Resolved Since Monday I setup in rack D1 and D2 the juniper switch as management switch and...
[13:03:55] Traffic, DC-Ops, ops-codfw, ops-eqiad, SRE-swift-storage: Reimage cookbook on new eqiad hosts stuck at PXE booting - https://phabricator.wikimedia.org/T350179#9717882 (Papaul) @ssingh unfortunately using the fs DAC didn't fix the issue. So we are back to zero. I am still working on it
[13:54:22] Traffic, MW-on-K8s, serviceops, SRE, Release-Engineering-Team (Seen): Rename X-Wikimedia-Debug k8s-experimental option - https://phabricator.wikimedia.org/T362662 (Clement_Goubert) NEW
[13:54:30] Traffic, MW-on-K8s, serviceops, SRE, Release-Engineering-Team (Seen): Rename X-Wikimedia-Debug k8s-experimental option - https://phabricator.wikimedia.org/T362662#9718155 (Clement_Goubert) p:Triage→Low
[15:59:40] (VarnishPrometheusExporterDown) firing: Varnish Exporter on instance cp1115:9331 is unreachable - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/000000304/varnish-dc-stats?viewPanel=17 - https://alerts.wikimedia.org/?q=alertname%3DVarnishPrometheusExporterDown
[16:00:27] downtime expired
[16:00:31] renewing, host is depooled
[22:38:28] Traffic, DC-Ops, ops-magru: Q4:rack/setup/install cp70[01-16] - https://phabricator.wikimedia.org/T362729 (RobH) NEW
[22:38:47] Traffic, DC-Ops, ops-magru: Q4:rack/setup/install cp70[01-16] - https://phabricator.wikimedia.org/T362729#9720811 (RobH)
[22:42:39] netops, DC-Ops, Infrastructure-Foundations, ops-magru: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730 (RobH) NEW
[22:42:42] netops, DC-Ops, Infrastructure-Foundations, ops-magru: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9720829 (RobH)
[22:43:36] netops, Traffic, DC-Ops, Infrastructure-Foundations, ops-magru: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9720831 (RobH)