[00:06:25] RESOLVED: SystemdUnitFailed: haproxy_stek_job.service on cp2039:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[00:51:43] hmm
[00:53:12] pystemd.dbusexc.DBusTimeoutError: [err -110]: b'Connection timed out'
[00:53:20] seems to be transient and resolved so not looking any further
[00:53:48] ah, also matches the reboot
[06:41:52] netops, Infrastructure-Foundations: Capirca setup for routed Ganeti VMs - https://phabricator.wikimedia.org/T367265 (Volans) NEW
[07:39:55] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-f6-eqiad - https://phabricator.wikimedia.org/T365983#9883256 (ABran-WMF)
[07:40:38] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-f7-eqiad - https://phabricator.wikimedia.org/T365984#9883270 (ABran-WMF)
[07:40:59] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e5-eqiad - https://phabricator.wikimedia.org/T365986#9883271 (ABran-WMF)
[07:41:36] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e6-eqiad - https://phabricator.wikimedia.org/T365987#9883272 (ABran-WMF)
[07:41:48] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e7-eqiad - https://phabricator.wikimedia.org/T365988#9883273 (ABran-WMF)
[09:29:34] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e1-eqiad - https://phabricator.wikimedia.org/T365993#9883482 (jcrespo) backup1010 is in intermittent usage to support mediabackups disk space, but mostly idle at the t...
[09:35:09] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e3-eqiad - https://phabricator.wikimedia.org/T365995#9883497 (jcrespo) backup1009 is the main backup node for bacula on eqiad. Most backups happen during the night- so...
[09:36:12] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-f1-eqiad - https://phabricator.wikimedia.org/T365996#9883498 (jcrespo) backup1011 is a mediabackups storage server. Ideally, mediabackups are paused during the mainten...
[09:40:59] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 -lsw1-f3-eqiad - https://phabricator.wikimedia.org/T365998#9883516 (jcrespo) db1205 is the secondary media backups metadata db server, usually just a standby to db1204. Unles...
[10:14:31] Wikimedia-Apache-configuration, serviceops: 2030.wikimedia.org is a double redirect - https://phabricator.wikimedia.org/T367013#9883619 (akosiaris) Open→Declined Given the above, I am gonna close this as `Declined` in the interest of not having lingering tasks, but feel free to reopen.
[13:31:55] Traffic: LVSRealserverMSS alert is broken for ferm based hosts - https://phabricator.wikimedia.org/T367204#9884292 (Vgutierrez) a:Vgutierrez→CDobbins
[14:04:20] Traffic, Patch-For-Review: Use IPIP encapsulation on lvs<-->text cluster - https://phabricator.wikimedia.org/T366466#9884490 (Vgutierrez) Open→Resolved
[14:04:35] 🎉 🍻
[14:18:40] 🎉
[14:18:42] very very cool
[14:22:36] Traffic, CirrusSearch, Discovery-Search, Elasticsearch, Infrastructure-Foundations: Migrate services behind high-traffic2 LVS to IPIP encapsulation - https://phabricator.wikimedia.org/T367312 (Vgutierrez) NEW
[14:22:58] Traffic, CirrusSearch, Discovery-Search, Elasticsearch, Infrastructure-Foundations: Migrate services behind high-traffic2 LVS to IPIP encapsulation - https://phabricator.wikimedia.org/T367312#9884670 (Vgutierrez) p:Triage→High
[14:41:47] Traffic, DC-Ops, ops-ulsfo, SRE, Patch-For-Review: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9884904 (BCornwall)
[16:32:59] FIRING: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[16:33:59] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[16:43:59] RESOLVED: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[16:45:18] Traffic, DC-Ops, ops-ulsfo, SRE: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9885645 (BCornwall)
[16:46:47] Traffic, DC-Ops, ops-ulsfo, SRE: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9885674 (BCornwall)
[16:47:58] RESOLVED: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[16:52:50] Traffic, DC-Ops, ops-ulsfo, SRE: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9885722 (BCornwall)
[16:54:25] FIRING: SystemdUnitFailed: haproxy_stek_job.service on cp2038:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:58:28] hello! just submitted [Configurably remove varnish handling of /beacon/event (1042278)](https://gerrit.wikimedia.org/r/c/operations/puppet/+/1042278)
[16:58:49] I added bblack as reviewer, please let me know if there is a better person.
[16:59:01] There are also probably better ways to do this, do let me know!
[17:03:08] Traffic, DC-Ops, ops-ulsfo, SRE: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9885761 (BCornwall)
[17:04:36] Traffic, DC-Ops, ops-ulsfo, SRE: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9885762 (BCornwall)
[17:13:21] Traffic, DC-Ops, ops-ulsfo, SRE: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9885810 (CDobbins)
[17:18:24] Traffic, DC-Ops, ops-ulsfo, SRE: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9885829 (CDobbins)
[17:19:09] Traffic, DC-Ops, ops-ulsfo, SRE: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9885830 (BCornwall)
[17:22:25] Traffic, DC-Ops, ops-ulsfo, SRE: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9885854 (BCornwall)
[17:23:38] Traffic, DC-Ops, ops-ulsfo, SRE: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9885856 (BCornwall)
[17:23:45] Traffic, DC-Ops, ops-ulsfo, SRE: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9885860 (CDobbins)
[17:29:51] Traffic, DC-Ops, ops-ulsfo, SRE: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9885889 (CDobbins)
[17:43:26] Traffic, DC-Ops, ops-ulsfo, SRE: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9885932 (BCornwall)
[19:27:44] FIRING: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[19:30:19] FIRING: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[19:32:44] RESOLVED: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[19:35:19] RESOLVED: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[19:50:30] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:00:30] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:01:39] Traffic, DC-Ops, ops-ulsfo: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9886475 (BCornwall) a:RobH→BCornwall
[20:02:03] Traffic, DC-Ops, ops-ulsfo: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9886473 (BCornwall)
[20:02:15] Traffic, DC-Ops, ops-ulsfo: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9886476 (BCornwall) Open→In progress
[20:02:39] FIRING: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:05:30] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:07:39] RESOLVED: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:10:30] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:15:30] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:19:44] FIRING: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:20:30] FIRING: [3x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:25:30] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:30:30] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:35:30] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:40:30] RESOLVED: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:51:27] FIRING: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:53:39] ^I went ahead and created a blanket site=magru,name=SLOMetricAbsent silence
[20:53:42] 12h
[20:54:40] FIRING: SystemdUnitFailed: haproxy_stek_job.service on cp2038:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:56:27] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[20:57:57] well, fat lot of good that did
[20:59:22] Why do some of these not have the site?
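One possible reading of the silence exchange above, offered as a sketch and not as a diagnosis: Alertmanager applies a silence only when every one of its matchers matches the alert's labels, and a label that is absent from an alert is compared as an empty value. Under that rule, a silence containing `site=magru` would skip any SLOMetricAbsent alert that carries no `site` label. The Python below models equality matchers only. The function name and the sample label sets are hypothetical, and it assumes the alert name lives in the standard `alertname` label; none of it is taken from the Wikimedia configuration.

```python
def silence_matches(matchers: dict[str, str], labels: dict[str, str]) -> bool:
    """Return True if every equality matcher matches the alert's labels.

    A label missing from the alert is compared as the empty string,
    which mirrors how Alertmanager treats absent labels.
    """
    return all(labels.get(name, "") == value for name, value in matchers.items())


# Hypothetical silence modelled on the one mentioned in the chat
# (assumption: the alert name is stored in the `alertname` label).
silence = {"site": "magru", "alertname": "SLOMetricAbsent"}

# Hypothetical alerts: one carries a site label, the other does not.
alert_with_site = {"alertname": "SLOMetricAbsent", "site": "magru"}
alert_without_site = {"alertname": "SLOMetricAbsent"}

print(silence_matches(silence, alert_with_site))
print(silence_matches(silence, alert_without_site))
```

If this is what happened, either the alerts would need a `site` label or the silence would need to drop the `site` matcher; which of the two fits this setup is not stated in the log.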
[21:04:39] Traffic, DC-Ops, ops-ulsfo, Patch-For-Review: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9886731 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp4037.ulsfo.wmnet with OS bullseye
[21:05:25] Traffic, DC-Ops, ops-ulsfo, Patch-For-Review: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9886733 (BCornwall)
[21:06:27] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[21:11:27] RESOLVED: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[21:18:45] FIRING: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[21:19:48] Traffic, DC-Ops, ops-ulsfo, Patch-For-Review: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9886776 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp4037.ulsfo.wmnet with OS bullseye execu...
[21:19:58] Traffic, DC-Ops, ops-ulsfo, Patch-For-Review: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9886778 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp4037.ulsfo.wmnet with OS bullseye
[21:23:45] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[21:33:45] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[21:38:45] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[21:43:45] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[21:53:45] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[22:07:30] Traffic, DC-Ops, ops-ulsfo: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9886850 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp4037.ulsfo.wmnet with OS bullseye completed: - cp4037 (**PASS...
[22:07:48] yay
[22:13:45] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[22:14:51] Traffic, DC-Ops, ops-ulsfo: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9886858 (BCornwall)
[22:18:45] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[22:19:14] Traffic, DC-Ops, ops-ulsfo: Q4: install PCIe NVMe SSDs into ulsfo text cp40(3[789]|4[01234] - https://phabricator.wikimedia.org/T364891#9886862 (BCornwall)
[22:38:45] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[22:43:45] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[23:27:37] Traffic, Phabricator (Upstream), Release-Engineering-Team (Priority Backlog 📥), Upstream: Consider using preconnect for https://phab.wmfusercontent.org CDN - https://phabricator.wikimedia.org/T367290#9887003 (Aklapper) Would be trivial four lines in Phabricator code but would love to hear from #T...