[05:43:47] netbox, Infrastructure-Foundations, SRE, Patch-For-Review: Grant cn=nda some sort of read only access to Netbox - https://phabricator.wikimedia.org/T302870 (ayounsi) After talking to Willy I also granted `dcim | device`, the reasoning is that it would take considerable efforts to be able to pull...
[06:52:51] netops, Infrastructure-Foundations: Lumen link between cr2-eqiad and cr2-esams down - July 2022 - https://phabricator.wikimedia.org/T313783 (ayounsi)
[06:57:37] netops, Infrastructure-Foundations: Lumen link between cr2-eqiad and cr2-esams down - July 2022 - https://phabricator.wikimedia.org/T313783 (ayounsi) From Lumen diagnostic tool: > SERVICE ALARMS - NEEDS ATTENTION - We have detected equipment alarms. Further Investigation is required. Opened Repair Ticke...
[08:30:58] there is ganeti1020 down since 4 days and AFAICT also cuminunpriv1001 (VM on the same ganeti row)
[08:30:58] I can't find anything on Phabricator, should I assume no one had a look yet?
[08:31:06] * volans having a look
[08:31:44] host is up, seems a network problem
[08:36:23] volans: uh?
[08:36:32] but link is up on the switch
[08:36:34] ge-6/0/10 up up ganeti1020 {#3769}
[08:36:42] I'm checking dmesg
[08:37:29] just drbd complaining
[08:38:00] checking other logs
[08:39:27] volans: I think I know what's up
[08:39:43] I'm all ears
[08:41:02] you fixed it!
[08:41:16] volans: https://netbox.wikimedia.org/dcim/interfaces/26423/changelog/
[08:41:31] then Cookbook sre.network.configure-switch-interfaces for host ganeti1020
[08:41:49] why the vlan was null?
[08:42:50] probably misconfigured after https://phabricator.wikimedia.org/T308331
[08:43:30] netops, Infrastructure-Foundations, SRE, cloud-services-team (Kanban): asw2-c5-eqiad crash - https://phabricator.wikimedia.org/T313382 (Marostegui)
[08:45:29] netops, Infrastructure-Foundations, SRE, cloud-services-team (Kanban): asw2-c5-eqiad crash - https://phabricator.wikimedia.org/T313382 (Marostegui) pc1013 is no longer a master in pc3. All the tasks owned by #dba team have been completed. Though, we'd appreciate a heads up before the maintenance t...
[08:48:17] XioNoX: weird that it failed 4 days ago not earlier
[09:01:12] XioNoX: given you know it all... we also have 3 mgmt hosts down since 6d5h42m (all 3, same time) mw2376, ores1004, thumbor2004
[09:01:16] any idea for those too?
[09:02:22] volans: 302 dcops
[10:13:39] netbox, Infrastructure-Foundations, SRE, Patch-For-Review: Grant cn=nda some sort of read only access to Netbox - https://phabricator.wikimedia.org/T302870 (ayounsi) a:ayounsi @taavi let me know if that works as expected on netbox.wikimedia.org and feel free to close the task if so.
[10:15:47] netbox, Infrastructure-Foundations, SRE, Patch-For-Review: Grant cn=nda some sort of read only access to Netbox - https://phabricator.wikimedia.org/T302870 (taavi) Open→Resolved looks good, thank you!
[13:48:15] netops, Infrastructure-Foundations: Add descriptions to BGP peers - https://phabricator.wikimedia.org/T313805 (ayounsi) p:Triage→Low
[14:08:23] jobo: jbond just did his first spicerack release with my custom script... yay!!! :)
[14:16:01] Woohoo
[17:43:47] netops, Infrastructure-Foundations, SRE: Lumen link between cr2-eqiad and cr2-esams down - July 2022 - https://phabricator.wikimedia.org/T313783 (ayounsi) Circuit back up as of 2022-07-26 12:18:32 UTC (05:19:01 ago). Lumen got back to me saying it's working as expected for them. I asked for an RFO a...
[18:19:59] netops, Infrastructure-Foundations, SRE: Lumen link between cr2-eqiad and cr2-esams down - July 2022 - https://phabricator.wikimedia.org/T313783 (Volans) p:Triage→Medium