[03:22:05] (PyBalBGPUnstable) firing: (2) PyBal BGP sessions on instance lvs2013 are failing - https://wikitech.wikimedia.org/wiki/PyBal#Alerts - https://alerts.wikimedia.org/?q=alertname%3DPyBalBGPUnstable
[06:09:40] (VarnishHighThreadCount) firing: (10) Varnish's thread count on cp1102:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[06:14:40] (VarnishHighThreadCount) firing: (32) Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[06:19:40] (VarnishHighThreadCount) firing: (32) Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[06:24:40] (VarnishHighThreadCount) firing: (36) Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[06:29:40] (VarnishHighThreadCount) firing: (38) Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[06:34:40] (VarnishHighThreadCount) firing: (42) Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[06:39:40] (VarnishHighThreadCount) firing: (47) Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[06:44:40] (VarnishHighThreadCount) firing: (46) Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[06:49:40] (VarnishHighThreadCount) firing: (45) Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[06:54:40] (VarnishHighThreadCount) firing: (41) Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[06:57:20] Traffic: Connection failed for a few minutes - https://phabricator.wikimedia.org/T360982#9660408 (Bugreporter)
[06:59:40] (VarnishHighThreadCount) firing: (36) Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[07:04:40] (VarnishHighThreadCount) firing: (29) Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[07:09:40] (VarnishHighThreadCount) firing: (24) Varnish's thread count on cp1104:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[07:14:40] (VarnishHighThreadCount) resolved: (14) Varnish's thread count on cp1104:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[07:22:05] (PyBalBGPUnstable) firing: (2) PyBal BGP sessions on instance lvs2013 are failing - https://wikitech.wikimedia.org/wiki/PyBal#Alerts - https://alerts.wikimedia.org/?q=alertname%3DPyBalBGPUnstable
[10:33:36] Traffic, Wikimedia-production-error: Connection failed for a few minutes - https://phabricator.wikimedia.org/T360982#9660786 (SunAfterRain)
[11:22:05] (PyBalBGPUnstable) firing: (2) PyBal BGP sessions on instance lvs2013 are failing - https://wikitech.wikimedia.org/wiki/PyBal#Alerts - https://alerts.wikimedia.org/?q=alertname%3DPyBalBGPUnstable
[13:23:50] Traffic, DC-Ops, ops-esams, SRE: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9661385 (RobH)
[13:24:04] Traffic, DC-Ops, ops-esams, SRE: esams text cp nvme upgrade - https://phabricator.wikimedia.org/T360430#9661386 (RobH) Remote work task is via CS1553796, remote hands has confirmed receipt of the SSDs and work to take place on March 27th @ 11AM CET.
[13:54:42] netops, SRE, Patch-For-Review, SRE Observability (FY2023/2024-Q3): Icinga BFD check failing - https://phabricator.wikimedia.org/T359198#9661583 (fgiunchedi) Open→Resolved This is fixed, I've undone my symlink bandaid. I've also reported the issue at https://bugs.debian.org/cgi-...
[14:41:40] Traffic, Data-Persistence, SRE, SRE-swift-storage, and 5 others: Change default image thumbnail size - https://phabricator.wikimedia.org/T355914#9661806 (Ladsgroup) We are also considering implementing {T360589} to allow for improved storage and caching which would in turn enable SREs to change t...
[14:47:42] sukhe: seems like we're seeing many "PyBal BGP sessions on instance lvs2013 are failing" errors since I restarted pybal on the host yesterday. Is there anything I can help with?
[14:57:32] brouberol: thanks for checking
[14:57:43] those were due to a host that was marked as pooled but down so I depooled it and it should be fine
[14:57:56] related to AQS?
[14:57:57] it's the elastic hosts
[14:58:02] no, not related to the AQS change
[14:58:03] oh, ok, gotcha
[14:58:17] ????
[14:58:52] inflatador: hi :) see PM
[14:59:53] I heard you were out yesterday so I didn't ping you
[15:22:05] (PyBalBGPUnstable) firing: (2) PyBal BGP sessions on instance lvs2013 are failing - https://wikitech.wikimedia.org/wiki/PyBal#Alerts - https://alerts.wikimedia.org/?q=alertname%3DPyBalBGPUnstable
[15:22:29] yeah we need to check what's up here
[15:51:58] omg, aqs1 service is being deleted?
[15:52:01] thank you all <3
[15:53:17] Yep, we took care of it yesterday, from popular demand :D
[16:18:34] Traffic, collaboration-services, Data-Persistence, DC-Ops, and 4 others: ☂️ Northward Datacentre Switchover (March 2024) - https://phabricator.wikimedia.org/T357547#9662223 (jijiki)
[16:41:50] (PyBalBGPUnstable) firing: (2) PyBal BGP sessions on instance lvs2013 are failing - https://wikitech.wikimedia.org/wiki/PyBal#Alerts - https://alerts.wikimedia.org/?q=alertname%3DPyBalBGPUnstable
[16:46:50] (PyBalBGPUnstable) resolved: (2) PyBal BGP sessions on instance lvs2013 are failing - https://wikitech.wikimedia.org/wiki/PyBal#Alerts - https://alerts.wikimedia.org/?q=alertname%3DPyBalBGPUnstable
[18:37:41] Traffic, Data-Engineering, Observability-Logging, Event-Platform, Patch-For-Review: Remove extra fields currently sent to Kafka - https://phabricator.wikimedia.org/T360642#9662826 (Ottomata) >> meta.id > Do you know who set these fields with the current webrequest flow? It isn't set for curr...
[18:48:48] netops, Infrastructure-Foundations, ops-codfw, SRE: Connect two hosts in codfw row A/B for switch migration testing - https://phabricator.wikimedia.org/T345803#9662842 (Papaul) Open→Resolved