[10:22:25] (SystemdUnitFailed) firing: benthos@haproxy_cache.service on cp4037:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:27:25] (SystemdUnitFailed) resolved: benthos@haproxy_cache.service on cp4037:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:28:36] ^^ me sorry
[13:44:36] Traffic, Data Products, Data-Engineering, Observability-Logging, Patch-For-Review: Move analytics log from Varnish to HAProxy - https://phabricator.wikimedia.org/T351117#9614889 (Fabfur) Update: yesterday we modified the HAProxy log destination to send them into Benthos and repooled cp4037 fo...
[13:48:29] Traffic: Benthos: add specific unit tests for different logs - https://phabricator.wikimedia.org/T359626 (Fabfur) NEW
[13:48:49] Traffic, Data Products, Data-Engineering, Observability-Logging: Benthos: better management for unparsable logs - https://phabricator.wikimedia.org/T359627 (Fabfur) NEW
[13:50:10] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Add per-output queue monitoring for Juniper network devices - https://phabricator.wikimedia.org/T326322#9614966 (cmooney) @ayounsi finally got back to this for a closer look. Really great work, I tried to make a device-centric dashboard...
[14:08:52] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Add per-output queue monitoring for Juniper network devices - https://phabricator.wikimedia.org/T326322#9615146 (cmooney) @fgiunchedi wondering if you'd any thoughts on the above suggestion to allow more series through from the gnmic pipe...
[15:34:48] Traffic, Data Products, Data-Engineering, Observability-Logging, Patch-For-Review: Move analytics log from Varnish to HAProxy - https://phabricator.wikimedia.org/T351117#9615392 (Fabfur) Note: I've repooled cp4037 for the next days as I'll be busy on the SRE Summit to work on it. All modifi...
[16:26:50] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Add per-output queue monitoring for Juniper network devices - https://phabricator.wikimedia.org/T326322#9615636 (fgiunchedi) Yeah having some ballpark numbers will be a great help @cmooney, unless we're talking hundreds of thousands more...
[19:37:02] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Add per-output queue monitoring for Juniper network devices - https://phabricator.wikimedia.org/T326322#9616215 (cmooney) >>! In T326322#9615636, @fgiunchedi wrote: > Yeah having some ballpark numbers will be a great help @cmooney, unless...