[02:20:47] (SystemdUnitFailed) firing: generate_os_reports.service Failed on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[05:53:35] netops, DBA, Infrastructure-Foundations, SRE, ops-codfw: Migrate servers in codfw rack A3 from asw-a3-codfw to lsw1-a3-codfw - https://phabricator.wikimedia.org/T355862 (Marostegui) The databases are ready to be moved any time.
[06:20:47] (SystemdUnitFailed) firing: generate_os_reports.service Failed on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:04:09] netops, Infrastructure-Foundations, SRE, SRE-swift-storage, ops-codfw: Migrate servers in codfw rack A2 from asw-a2-codfw to lsw1-a2-codfw - https://phabricator.wikimedia.org/T355861 (MoritzMuehlenhoff) I've kicked off a rebalance of ganeti/A now that the maintenance is over.
[09:38:35] soo for sso-debmon.sso.eqiad1.wikimedia.cloud should I just kill the instance?
[09:43:25] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate hosts from codfw row A/B ASW to new LSW devices - https://phabricator.wikimedia.org/T355544 (cmooney)
[09:44:09] netops, Infrastructure-Foundations, SRE, SRE-swift-storage, ops-codfw: Migrate servers in codfw rack A2 from asw-a2-codfw to lsw1-a2-codfw - https://phabricator.wikimedia.org/T355861 (cmooney) Open→Resolved a:cmooney >>! In T355861#9523826, @MoritzMuehlenhoff wrote: > I've kicked...
[10:20:47] (SystemdUnitFailed) firing: generate_os_reports.service Failed on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:32:26] Mail, Infrastructure-Foundations, User-notice: Stop sending change notification email if edit is done by a bot - https://phabricator.wikimedia.org/T356984 (Ladsgroup)
[12:40:47] (SystemdUnitFailed) firing: (3) generate_os_reports.service Failed on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:45:47] (SystemdUnitFailed) firing: (3) generate_os_reports.service Failed on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:23:44] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff) >>! In T349619#9521720, @Volans wrote: > We could either catch the exception and retry or acquire a lock for all puppetserver ca operatio...
[14:24:36] volans: didn't see it, +1 to killing sso-debmon in cloud
[14:24:50] ok will do :)
[15:20:03] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate hosts from codfw row A/B ASW to new LSW devices - https://phabricator.wikimedia.org/T355544 (ssingh) moss-be* hosts should be @MatthewVernon unless I am mistaken, in which case, please accept my apologies in advance :)
[15:21:35] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate hosts from codfw row A/B ASW to new LSW devices - https://phabricator.wikimedia.org/T355544 (cmooney) >>! In T355544#9525282, @ssingh wrote: > moss-be* hosts should be @MatthewVernon unless I am mistaken, in which case, please accept my...
[15:35:26] SRE-tools, Infrastructure-Foundations, Toolforge, cloud-services-team: spicerack: introduce GridEngine controller - https://phabricator.wikimedia.org/T300032 (taavi) Stalled→Declined
[15:40:52] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate hosts from codfw row A/B ASW to new LSW devices - https://phabricator.wikimedia.org/T355544 (MatthewVernon) >>! In T355544#9525282, @ssingh wrote: > moss-be* hosts should be @MatthewVernon unless I am mistaken, in which case, please acce...
[15:43:23] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate hosts from codfw row A/B ASW to new LSW devices - https://phabricator.wikimedia.org/T355544 (cmooney)
[15:48:24] netops, Infrastructure-Foundations, SRE, SRE-swift-storage, ops-codfw: Migrate servers in codfw rack A7 from asw-a7-codfw to lsw1-a7-codfw - https://phabricator.wikimedia.org/T355867 (cmooney) >>! In T355867#9498001, @MatthewVernon wrote: > Once complete I'll want to check the backends, but t...
[15:50:20] netops, DBA, Infrastructure-Foundations, SRE, ops-codfw: Migrate servers in codfw rack A3 from asw-a3-codfw to lsw1-a3-codfw - https://phabricator.wikimedia.org/T355862 (cmooney) >>! In T355862#9523604, @Marostegui wrote: > The databases are ready to be moved any time. Great, thanks!
[16:09:39] netops, DBA, Infrastructure-Foundations, SRE, ops-codfw: Migrate servers in codfw rack A3 from asw-a3-codfw to lsw1-a3-codfw - https://phabricator.wikimedia.org/T355862 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=a24ae7f4-1952-434f-9ee8-3ff0973f1444) set by cmooney@cumin...
[16:10:21] netops, DBA, Infrastructure-Foundations, SRE, ops-codfw: Migrate servers in codfw rack A3 from asw-a3-codfw to lsw1-a3-codfw - https://phabricator.wikimedia.org/T355862 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=06c4fbb3-382e-4660-b308-79bf9f5106d5) set by cmooney@cumin...
[16:21:38] moritzm, slyngs FYI sso-debmon.sso.eqiad1.wikimedia.cloud is no more \o/ :D
[16:26:18] nice :-)
[16:28:00] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate hosts from codfw row A/B ASW to new LSW devices - https://phabricator.wikimedia.org/T355544 (ssingh)
[16:30:12] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate hosts from codfw row A/B ASW to new LSW devices - https://phabricator.wikimedia.org/T355544 (ssingh) As discussed in [[ https://gerrit.wikimedia.org/r/c/operations/puppet/+/998431 | 998431 ]], Traffic will be taking care of `conf2004`, s...
[16:36:45] netops, DBA, Infrastructure-Foundations, SRE, ops-codfw: Migrate servers in codfw rack A3 from asw-a3-codfw to lsw1-a3-codfw - https://phabricator.wikimedia.org/T355862 (cmooney) Work completed! No errors to report all working well.
[16:39:16] netops, DBA, Infrastructure-Foundations, SRE, ops-codfw: Migrate servers in codfw rack A3 from asw-a3-codfw to lsw1-a3-codfw - https://phabricator.wikimedia.org/T355862 (Marostegui) Thanks - I am starting to repool the databases.
[16:45:47] (SystemdUnitFailed) firing: generate_os_reports.service Failed on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:53:51] Mail, Infrastructure-Foundations, SRE: Access to DMARCIAN - https://phabricator.wikimedia.org/T356920 (Dzahn) @DBu-WMF I think that other ticket I linked would be valuable info for you but I realized you currently don't have access to that. We can look into that.
[20:45:47] (SystemdUnitFailed) firing: generate_os_reports.service Failed on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:55:31] Mail, Infrastructure-Foundations, Patch-For-Review, User-notice: Stop sending change notification email if edit is done by a bot - https://phabricator.wikimedia.org/T356984 (Ladsgroup) Hi, something along the lines of: > If you have "Email me when a page or a file on my watchlist is changed " opt...