[02:06:55] <jinxer-wm>	 FIRING: [3x] SystemdUnitFailed: cassandra-a.service on sessionstore1006:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:01:55] <jinxer-wm>	 FIRING: [4x] SystemdUnitFailed: cassandra-a.service on sessionstore1004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:36:55] <jinxer-wm>	 FIRING: [4x] SystemdUnitFailed: cassandra-a.service on sessionstore1004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:41:55] <jinxer-wm>	 RESOLVED: [4x] SystemdUnitFailed: cassandra-a.service on sessionstore1004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:48:34] <volans>	 federico3: good morning! Answering here some questions you had on Friday in another channel regarding support for multiple hosts when calling list_host_instances(). What elu.key suggested is the way to go if you need it right now. As for general support, I suggested a multi-host implementation in the original CR in [1] (see the comment) that would return something like [2] (expand the comment), but it was deemed not necessary at the time. It can surely be added if needed.
[07:48:46] <volans>	 [1] https://gerrit.wikimedia.org/r/c/operations/software/spicerack/+/1005531/43..78/spicerack/mysql_legacy.py#b122
[07:48:49] <volans>	 [2] https://gerrit.wikimedia.org/r/c/operations/software/spicerack/+/1005531/32..78/spicerack/mysql_legacy.py#b108
[08:01:29] <federico3>	 thanks, I put together a workaround for now, in the long term the mysql module might need some tweaks
[08:04:25] <volans>	 everything needs continuous improvement indeed. Last year's sprint effort on the mysql module added a lot of features to it, and the work was supposed to continue after that effort, but some team changes got in the way
[08:21:51] <federico3>	 volans: BTW thanks for https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/1130977 - can I merge it?
[08:34:40] <volans>	 At your will, it's all yours. I didn't live-test it as I didn't know if there was a host I could use, and I also didn't want to cause conflicts if there were other CRs for the same cookbook.
[08:44:29] <federico3>	 ok, merging it, thanks
[08:46:43] <volans>	 anytime
[11:16:40] <icinga-wm>	 PROBLEM - MariaDB sustained replica lag on s8 on db1211 is CRITICAL: 935.8 ge 10 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1211&var-port=9104
[11:20:40] <icinga-wm>	 RECOVERY - MariaDB sustained replica lag on s8 on db1211 is OK: (C)10 ge (W)5 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1211&var-port=9104
[13:22:31] <_joe_>	 i might be a couple minutes late to the meeting
[13:32:40] <marostegui>	 sobanski: you joining our meeting today?
[14:21:58] <_joe_>	 urandom: tbh, I think this bot might be the cause of the issues https://logstash.wikimedia.org/goto/c2d778382850a85389191ee174cceb79
[14:22:10] <_joe_>	 the timing is striking
[14:24:23] <urandom>	 _joe_: so... this would manifest as a session being overwritten at a high rate?
[14:24:57] <urandom>	 and the high storage utilization then being unreclaimed/tombstoned data?
[14:25:16] <_joe_>	 yes that's kind of where I'm going
[14:25:35] <_joe_>	 or maybe we write multiple sessions for the same user!
[14:25:53] <_joe_>	 I've written to the bot author