[02:03:29] * bd808 off
[09:06:09] morning
[09:06:27] o/
[09:40:49] o/
[10:45:56] there has been an increase in the response time for nova_api starting earlier today, has anyone changed anything?
[10:46:37] not me!
[10:50:04] https://usercontent.irccloud-cdn.com/file/WckBcq1g/image.png
[10:51:16] definitely an increase
[11:23:49] * arturo sorry for not being more helpful, I'm multitasking on other stuff at the moment.
[11:29:13] it's ok, I'm not focused myself either
[11:30:34] there are a lot of errors trying to connect to mysql on openstack.codfw1dev.wikimediacloud.org for backups
[11:38:53] dcaro: on cloudbackup100[12]-dev? or somewhere else?
[11:39:03] cloudcontrol1005
[11:39:08] let me double check
[11:39:18] why is cloudcontrol1005 trying to connect to the codfw1dev database?
[11:39:28] oh yes cloudbackup1001-dev
[11:39:40] I was wondering that myself
[11:40:12] I was wondering about those hosts earlier today myself, as according to T344065 those should not exist
[11:40:13] the openstack logstash dashboard also shows the 100* backup nodes (though technically they are from the codfw setup), note taken
[11:40:13] T344065: Replace cinder-backup process with backy2 - https://phabricator.wikimedia.org/T344065
[11:40:42] and we indeed don't have any hosts running cinder-backups for the eqiad1 deployment, which matches that task
[11:41:13] so I wonder if those two VMs (in eqiad, but for the codfw1dev deployment) were just forgotten, or what's going on with them
[11:41:37] I think they might be forgotten yep
[11:46:19] ok, filed T358855 so we don't forget. I'm happy to decom those once andrewbogott confirms they should not exist
[11:46:20] T358855: Maybe decom cloudbackup100[12]-dev - https://phabricator.wikimedia.org/T358855
[11:46:30] where did you see the mysql errors?
[11:46:42] ah cloudcontrol1005 sorry
[11:47:20] no, the mysql errors are on cloudbackup100[12]-dev
[11:47:47] if there are also mysql errors on cloudcontrol1005, that is both news to me and also much more worrying than the mysql errors on cloudbackup100[12]-dev
[11:48:19] only on the backup nodes yes
[11:48:52] however, if I understand things correctly, the nova api on cloudcontrol1005 is currently slow to respond?
[11:50:09] yep, it started to get slow ~5am UTC today
[11:50:18] that's why I'm looking at logs around that time
[11:50:51] there are some sqlalchemy errors on cloudcontrol1005 but they're related to keystone and not too frequent
[11:51:39] dcaro: were you looking at logs across all instances? I'm curious to understand how you spotted the backup errors
[11:52:03] yep, cluster-wide
[11:52:16] (eqiad cluster, though it's the eqiad site, not cluster)
[11:52:23] in logstash
[11:53:29] yes
[11:53:34] sorry for the lack of context
[11:55:00] np, thanks for clarifying :)
[13:59:02] * dcaro off for a bit
[16:06:55] dcaro: when you are around, could you please help me sort out this python type checking problem? https://gerrit.wikimedia.org/r/c/cloud/wmcs-cookbooks/+/1006529
[16:11:20] on my way back from the airport
[16:12:59] no rush!
[16:15:36] arturo: so the thing it's complaining about is that the output type of `KubernetesController.get_object` depends on the value of the `missing_ok` parameter, so the use of a parameter as `missing_ok` in `is_pod_running` (line 395) makes it confused
[16:16:43] yes, ok
[16:17:12] might be missing an overload on one of the get_object defs
[16:22:24] https://gerrit.wikimedia.org/r/c/cloud/wmcs-cookbooks/+/1007942 seems to fix it
[16:25:30] thanks, yes, I was writing the exact same thing here
[16:25:37] let's merge yours
[16:30:09] also left a comment on your patch, once that's fixed I'm happy to +1
[16:32:07] the blank color and "false" username thing in etherpad is apparently https://github.com/ether/etherpad-lite/issues/5401. Somebody in -sre said they will look into patching.
[16:39:01] bd808: I'm glad things are a bit better wrt. etherpad maintenance nowadays
[16:40:13] * taavi off
[17:13:08] * arturo off
[18:36:32] * bd808 lunch
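
(Editor's note: a minimal sketch of the kind of `@overload` fix discussed around 16:17:12. The names and signatures below are simplified placeholders, not the actual wmcs-cookbooks API; the real fix is the linked Gerrit patch. The idea is that `Literal[True]`/`Literal[False]` overloads alone don't cover a call site that passes a plain `bool` through, as `is_pod_running` does, so a `bool` fallback overload is needed.)

```python
# Sketch only: simplified stand-in for KubernetesController, not the real class.
from typing import Any, Literal, Optional, overload


class KubernetesController:
    @overload
    def get_object(self, kind: str, name: str, *, missing_ok: Literal[False] = False) -> dict[str, Any]: ...

    @overload
    def get_object(self, kind: str, name: str, *, missing_ok: Literal[True]) -> Optional[dict[str, Any]]: ...

    # Fallback overload: without this, passing a plain `bool` variable as
    # missing_ok (as a caller like is_pod_running does) confuses the checker,
    # since the value matches neither Literal[True] nor Literal[False].
    @overload
    def get_object(self, kind: str, name: str, *, missing_ok: bool) -> Optional[dict[str, Any]]: ...

    def get_object(self, kind: str, name: str, *, missing_ok: bool = False) -> Optional[dict[str, Any]]:
        # Placeholder implementation; the real code would query the Kubernetes API.
        obj: Optional[dict[str, Any]] = None
        if obj is None and not missing_ok:
            raise KeyError(f"{kind}/{name} not found")
        return obj

    def is_pod_running(self, name: str, missing_ok: bool = False) -> bool:
        # The bool fallback overload applies here, so the return type is
        # Optional[...] and the None case must be handled explicitly.
        pod = self.get_object("pods", name, missing_ok=missing_ok)
        return bool(pod and pod.get("status", {}).get("phase") == "Running")
```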