[00:35:25] netops, Infrastructure-Foundations, ops-codfw, SRE: Decom lsw1-a1-codfw - https://phabricator.wikimedia.org/T364097#9779420 (Papaul)
[00:37:00] netops, Infrastructure-Foundations, ops-codfw, SRE: Decom lsw1-a1-codfw - https://phabricator.wikimedia.org/T364097#9779435 (Papaul) Open→Resolved
[00:42:22] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: codfw row C/D upgrade racking task - https://phabricator.wikimedia.org/T360789#9779440 (Papaul)
[09:22:20] netops, Infrastructure-Foundations, Traffic: mgmt ssh access for prometheus hosts in magru - https://phabricator.wikimedia.org/T364454 (fgiunchedi) NEW
[10:58:37] netops, Infrastructure-Foundations, ops-codfw, SRE: Comms to msw-d2-codfw down - https://phabricator.wikimedia.org/T364464 (cmooney) NEW p:Triage→High
[11:03:46] SRE-tools, Cloud-VPS, Infrastructure-Foundations, Spicerack: spicerack.puppet.PuppetHostsError: Unable to find CSR fingerprints for all hosts, detected errors are: Another puppet instance is already running and the waitforlock setting is set to 0; e... - https://phabricator.wikimedia.org/T361218#9780281
[11:14:33] netops, Infrastructure-Foundations, ops-codfw, SRE: Comms to msw-d2-codfw down - https://phabricator.wikimedia.org/T364464#9780328 (cmooney)
[12:34:55] o/
[12:52:55] netops, Infrastructure-Foundations, ops-codfw, SRE: Comms to msw-d2-codfw down - https://phabricator.wikimedia.org/T364464#9780633 (Papaul) @cmooney I think this is just a human error issue. We were racking all the lsw1-d* yesterday and maybe we accidentally bumped into the cable. We will check o...
[13:32:34] netops, Infrastructure-Foundations, ops-codfw, SRE: Comms to msw-d2-codfw down - https://phabricator.wikimedia.org/T364464#9780842 (cmooney) >>! In T364464#9780633, @Papaul wrote: > @cmooney I think this is just a human error issue. We were racking all the lsw1-d* yesterday and maybe we accidenta...
[13:55:52] SRE-tools, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9780908 (MoritzMuehlenhoff)
[14:04:53] netops, Infrastructure-Foundations, Traffic: mgmt ssh access for prometheus hosts in magru - https://phabricator.wikimedia.org/T364454#9780923 (ssingh) On https://librenms.wikimedia.org/alerts, I see the following for `mr1-magru`: ` #1: last_polled => '2024-05-08 14:01:37' last_polled_timetaken =...
[14:11:50] netops, Infrastructure-Foundations, ops-codfw, SRE: Comms to msw-d2-codfw down - https://phabricator.wikimedia.org/T364464#9780928 (Jhancock.wm) Open→Resolved a:Jhancock.wm port 47 on the maw was going up and down on it's own. replaced the rj-45 terminator. remained steady.
[14:32:29] netops, Infrastructure-Foundations, Traffic: mgmt ssh access for prometheus hosts in magru - https://phabricator.wikimedia.org/T364454#9780995 (ssingh) On `mr1-magru`, I see `10.140.1.18` (`prometheus7001`) and `denied by policy`, which makes me wonder if we need to run `https://netbox.wikimedia.org/...
[15:02:50] netops, Infrastructure-Foundations, SRE: Extend BGP peer automation via Netbox to include VMs - https://phabricator.wikimedia.org/T364480 (cmooney) NEW p:Triage→Medium
[15:34:09] netops, Infrastructure-Foundations, Traffic: mgmt ssh access for prometheus hosts in magru - https://phabricator.wikimedia.org/T364454#9781207 (cmooney) >>! In T364454#9780995, @ssingh wrote: > On `mr1-magru`, I see `10.140.1.18` (`prometheus7001`) and `denied by policy`, which makes me wonder if we...
[16:06:18] netops, Infrastructure-Foundations, Traffic: mgmt ssh access for prometheus hosts in magru - https://phabricator.wikimedia.org/T364454#9781281 (cmooney) Open→Resolved a:cmooney Sorry for the delay, the capirca script times out a lot for some reason will need to look at that. Working...
[17:38:46] SRE-tools, DC-Ops, Infrastructure-Foundations: MD Raid monitoring: add the failed disk physical location to the auto-generated task - https://phabricator.wikimedia.org/T364496 (Volans) NEW p:Triage→Medium