[13:51:49] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9595953 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=63abb5d8-03a7-48ae-abcc-214900c13c28) set by akosiaris@cumin1002 for 2:00:0...
[14:10:10] Traffic: Reimage one of each Traffic hosts before magru - https://phabricator.wikimedia.org/T359053 (ssingh)
[14:12:08] Traffic: Reimage one of each Traffic hosts before magru - https://phabricator.wikimedia.org/T359053#9596040 (Fabfur) a:Fabfur
[14:13:09] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9596041 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse1010.eqiad.wmnet with OS bullseye
[14:45:35] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9596165 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse1010.eqiad.wmnet with OS bullseye comp...
[15:29:30] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9596436 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse1011.eqiad.wmnet with OS bullseye
[16:03:21] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9596708 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse1011.eqiad.wmnet with OS bullseye comp...
[16:05:34] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9596718 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse1020.eqiad.wmnet with OS bullseye
[16:20:50] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate hosts from codfw row A/B ASW to new LSW devices - https://phabricator.wikimedia.org/T355544#9596886 (cmooney)
[16:38:59] Traffic: Investigate why Traffic SLO Grafana dashboard has negative values on combined SLI - https://phabricator.wikimedia.org/T341606#9597018 (BCornwall) In progress→Resolved
[16:39:21] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9597019 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse1020.eqiad.wmnet with OS bullseye comp...
[16:41:10] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9597042 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse1021.eqiad.wmnet with OS bullseye
[16:47:38] (LVSRealserverMSS) firing: (4) Unexpected MSS value on 198.35.26.98:443 @ ncredir4001 - TODO - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=ulsfo&var-cluster=ncredir - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[16:55:25] (SystemdUnitFailed) firing: nginx.service on ncredir4001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:55:55] Hm, I've downtimed it, not sure why it's complaining
[17:14:49] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9597298 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse1021.eqiad.wmnet with OS bullseye comp...
[17:16:36] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9597325 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse1022.eqiad.wmnet with OS bullseye
[17:20:25] (SystemdUnitFailed) resolved: nginx.service on ncredir4001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:22:38] (LVSRealserverMSS) resolved: (4) Unexpected MSS value on 198.35.26.98:443 @ ncredir4001 - TODO - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=ulsfo&var-cluster=ncredir - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[17:36:55] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9597456 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse1012.eqiad.wmnet with OS bullseye
[17:47:38] (LVSRealserverMSS) firing: (4) Unexpected MSS value on 198.35.26.98:443 @ ncredir4001 - TODO - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=ulsfo&var-cluster=ncredir - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[17:47:55] stahppp
[17:49:30] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9597506 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse1022.eqiad.wmnet with OS bullseye comp...
[17:52:58] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9597521 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse1023.eqiad.wmnet with OS bullseye
[17:53:25] (SystemdUnitFailed) firing: nginx.service on ncredir4001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:02:14] Traffic, Patch-For-Review: Simplify maintenance of DNS/NTP hosts to reduce toil around reboots, reimages, and other work - https://phabricator.wikimedia.org/T347054#9597579 (ssingh)
[18:09:35] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9597590 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse1012.eqiad.wmnet with OS bullseye comp...
[18:23:25] (SystemdUnitFailed) resolved: nginx.service on ncredir4001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:26:28] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9597622 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse1023.eqiad.wmnet with OS bullseye comp...
[18:27:05] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9597623 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse1024.eqiad.wmnet with OS bullseye
[18:39:25] (SystemdUnitFailed) firing: nginx.service on ncredir4001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:40:52] Traffic, Content-Transform-Team, MW-on-K8s, SRE, and 3 others: Reimage parse* hosts as kubernetes nodes - https://phabricator.wikimedia.org/T358752#9597703 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse1024.eqiad.wmnet with OS bullseye exec...
[18:56:19] Traffic, Patch-For-Review: Simplify maintenance of DNS/NTP hosts to reduce toil around reboots, reimages, and other work - https://phabricator.wikimedia.org/T347054#9597754 (ssingh) Open→Resolved a:ssingh == Final Update == We have finished rolling the changes today, so all state management...
[20:07:38] (LVSRealserverMSS) resolved: (4) Unexpected MSS value on 198.35.26.98:443 @ ncredir4001 - TODO - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=ulsfo&var-cluster=ncredir - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS