[00:34:00] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9760024 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host lvs7001.magru.wmnet with OS bullseye
[01:26:55] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9760118 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host lvs7001.magru.wmnet with OS bullseye compl...
[01:37:49] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9760148 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host lvs7002.magru.wmnet with OS bullseye
[01:39:27] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9760151 (ssingh)
[01:45:25] FIRING: SystemdUnitFailed: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[02:05:25] RESOLVED: SystemdUnitFailed: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[02:29:02] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9760193 (ssingh)
[02:31:07] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9760197 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host lvs7002.magru.wmnet with OS bullseye compl...
[07:44:05] netops, Infrastructure-Foundations, sre-alert-triage: Alert in need of triage: BGP status (instance cr2-eqord) - https://phabricator.wikimedia.org/T363895 (LSobanski) NEW
[09:59:58] netops, Infrastructure-Foundations, sre-alert-triage: Alert in need of triage: BGP status (instance cr2-eqord) - https://phabricator.wikimedia.org/T363895#9760669 (cmooney) p:Triage→Low These are direct peerings to Equinix themselves over their own exchange. We are waiting on them to complet...
[10:01:40] netops, Infrastructure-Foundations: Alert in need of triage: BGP status (instance cr2-eqord) - https://phabricator.wikimedia.org/T363895#9760673 (cmooney)
[10:02:26] netops, Infrastructure-Foundations: BGP status (instance cr2-eqord) - April 2024 - Equinix peering AS15830 - https://phabricator.wikimedia.org/T363895#9760676 (cmooney)
[10:13:45] I'm getting some funny results from debmonitor, affecting cephosd100[1-5]. I'd be grateful if someone could help me check what is wrong, please.
[10:15:29] When I run `btullis@cephosd1001:~$ apt list --upgradable` I see 20 binary packages, all of which come from the `ceph` source package.
[10:16:23] The `apt-update` post-invoke trigger says:
[10:16:29] https://www.irccloud.com/pastebin/cGG7QE8P/
[10:17:44] The debmonitor web interface only shows 4 upgradable packages. https://debmonitor.wikimedia.org/hosts/cephosd1001.eqiad.wmnet
[10:19:27] I tried a `sudo debdeploy deploy -u 2024-05-01-ceph.yaml -s cephosd` (specifying `bullseye: 18.2.2-1~bpo12+1`) and it said that all hosts were up to date. What am I doing wrong? Thanks.
[10:22:13] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9760695 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host lvs7003.magru.wmnet with OS bullseye
[10:30:49] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9760717 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host lvs7003.magru.wmnet with OS bullseye execu...
[10:30:59] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9760718 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host lvs7003.magru.wmnet with OS bullseye
[11:19:12] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9760772 (ssingh)
[11:24:18] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9760774 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host lvs7003.magru.wmnet with OS bullseye compl...
[11:26:51] Problem solved. It was an issue with uninstallable packages being held back.
[13:30:03] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9760937 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host dns7001.wikimedia.org with OS bookworm
[15:15:18] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9761225 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host dns7001.wikimedia.org with OS bookworm exe...
[15:34:08] btullis: glad you figured it out! apt error messages are often cryptic
[17:12:43] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9761576 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host dns7001.wikimedia.org with OS bookworm
[18:15:33] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9761743 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host dns7001.wikimedia.org with OS bookworm com...
[18:16:27] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9761747 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host dns7002.wikimedia.org with OS bookworm
[18:31:03] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9761762 (ssingh)
[18:36:35] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9761770 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host dns7002.wikimedia.org with OS bookworm exe...
[18:36:43] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9761774 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin1002 for host dns7002.wikimedia.org with OS bookworm
[19:00:25] FIRING: [2x] SystemdUnitFailed: postfix@-.service on mx-out1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:10:25] FIRING: [4x] SystemdUnitFailed: postfix@-.service on mx-out1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:25:25] FIRING: [4x] SystemdUnitFailed: postfix@-.service on mx-out1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:30:25] FIRING: [4x] SystemdUnitFailed: postfix@-.service on mx-out1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:40:50] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9762008 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin1002 for host dns7002.wikimedia.org with OS bookworm com...
[19:42:23] netops, DC-Ops, Infrastructure-Foundations, ops-magru, Traffic: Q4:rack/setup/install magru misc servers - https://phabricator.wikimedia.org/T362730#9762017 (ssingh)
[20:27:49] FIRING: PuppetFailure: Puppet has failed on mx-out1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[20:32:49] FIRING: [2x] PuppetFailure: Puppet has failed on mx-out1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[21:07:48] FIRING: [2x] PuppetFailure: Puppet has failed on mx-out1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[21:12:48] RESOLVED: [2x] PuppetFailure: Puppet has failed on mx-out1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[23:30:25] FIRING: [2x] SystemdUnitFailed: prometheus-postfix-exporter.service on mx-out1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed