[08:45:44] netbox, Infrastructure-Foundations, Patch-For-Review: Netbox: Remove leftovers of CAS auth - https://phabricator.wikimedia.org/T371892#10308478 (SLyngshede-WMF) Open→Resolved
[10:49:56] CAS-SSO, Data-Engineering, Data-Engineering-Jupyter, Data-Platform-SRE, Epic: Improve the JupyterHub services and use CAS/SSO - https://phabricator.wikimedia.org/T260386#10309027 (BTullis)
[10:54:15] CAS-SSO, Data-Engineering, Data-Engineering-Jupyter, Data-Platform-SRE, Epic: Improve the JupyterHub services and use CAS/SSO - https://phabricator.wikimedia.org/T260386#10309041 (BTullis) a:BTullis→None Unassigning myself from this epic, for now. It looks like phase 1 is going to go...
[11:45:50] netops, Infrastructure-Foundations, SRE: Extend sre.network.configure-switch-interfaces cookbook to add sflow and qos config - https://phabricator.wikimedia.org/T379549 (cmooney) NEW p:Triage→Low
[11:48:07] netops, Infrastructure-Foundations, SRE: Extend sre.network.configure-switch-interfaces cookbook to add sflow and qos config - https://phabricator.wikimedia.org/T379549#10309316 (cmooney) a:cmooney
[12:14:02] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 3 others: Frack eqiad network upgrade: design, installation and configuration - https://phabricator.wikimedia.org/T377381#10309416 (Jclark-ctr) @cmooney thanks for the list I have populated both new switches up to port 27 wit...
[13:57:25] FIRING: SystemdUnitFailed: netbox_ganeti_codfw_sync.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:12:00] moritzm, slyngs - if you have a min https://gerrit.wikimedia.org/r/c/operations/alerts/+/1089714
[14:12:25] RESOLVED: SystemdUnitFailed: netbox_ganeti_codfw_sync.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:12:25] looking
[14:15:42] Slightly wondering if these should be downgraded to warning
[14:16:41] do we have anything else that alerts if irc.w.org doesn't work?
[14:16:49] I am kinda lost now about those alerts :D
[14:17:07] the old irc.w.o was paging IIRC
[14:17:26] but arguably with just 10ish remaining bots should it really
[14:17:36] Yes it was. We also have the TCP probe for if it's completely down
[14:18:27] My concern is false positives at insanely slow times, like Christmas Eve, New Year or my birthday.
[14:18:53] we can always tune them down later on
[14:19:32] oh, how cool: several people updated the list at https://wikitech.wikimedia.org/wiki/Ircstream#Bots_still_using_the_legacy_setup following my wikitech announcement
[14:19:46] now the list seems far more complete than the initial tallying I did
[14:21:12] and it mentions wikimon seems behind http://listen.hatnote.com/
[14:21:13] We're up to 12 now
[14:22:00] minus the two defunct, yet running Twitter bots
[14:29:47] The Danish Tax office needs to learn to use Diff. "We marked the changes to your tax with a 'B' in this PDF"... Good luck searching for a "B".
[14:47:06] ahahahah
[17:24:18] netops, Infrastructure-Foundations, SRE: Add per-output queue monitoring for Juniper network devices - https://phabricator.wikimedia.org/T326322#10310260 (Reedy)
[19:04:35] netops, Infrastructure-Foundations, SRE: Manange fundraising network elements from Netbox - https://phabricator.wikimedia.org/T377996#10310435 (cmooney)
[19:04:48] netops, Infrastructure-Foundations, SRE: Manange fundraising network elements from Netbox - https://phabricator.wikimedia.org/T377996#10310436 (cmooney) a:cmooney