[09:37:06] Some of the long logs are also access logs (very long URLs, either API calls or broken clients) and get truncated by rsyslog, I increased the limit a while ago but I don't think it was all the way to 64k, I can find that change if relevant
[14:46:16] hello folks
[14:46:24] I think that titan1001 is having troubles
[14:46:40] I was checking istio metrics in the thanos UI and I noticed it being very slow
[14:46:48] and now no host metrics are published for 1001
[14:47:18] any maintenance ongoing?
[14:48:25] from the mgmt console I don't get a tty :(
[14:48:52] anything against me rebooting it?
[14:57:05] elukey: no maintenance no, we're in the team meeting, rebooting sounds good, thank you
[14:57:46] ack proceeding
[14:58:01] <3
[15:02:25] one thing that I noticed: https://config-master.wikimedia.org/pybal/eqiad/thanos-web
[15:02:32] titan1002 is not pooled for thanos-web
[15:02:52] that explains why the thanos ui wasn't available
[15:06:13] iirc thats our manual sticky session implementation, but I think soon-ish we'll be able to enable both when cdn can stick users to a consistent backend. maybe there's a better option now though, not sure?
[15:15:06] ahh right okok
[19:06:25] herron et al: There's a ton of changes for the slo dashboard when running 'grr diff slo_dashboards.jsonnet'. Should I be waiting for a larger deployment schedule or is it okay for me to apply all of them?
[19:06:50] lets see
[19:07:41] oh, lord
[19:07:45] I'm sorry, my repo was out of date
[19:07:50] * brett bows his head in shame
[19:08:04] brett: ahh that would explain it, was gonna say it looks good to deploy
[19:08:15] Sorry for the noise ._.
[19:08:28] no worries at all better safe than sorry