[07:34:01] toolforge is now serving web traffic over ipv6
[07:44:13] \o/
[08:10:52] hmm, got failing probe for the toolforge api on v4
[08:11:08] I can curl though
[08:11:18] [#wikimedia-cloud-admin-feed] ^ just deleted instances, will clear soon, sorry
[08:11:39] ack
[08:11:44] the remove instance cookbook tries to silence alerts for that host but that does not work for blackbox probes for some reason
[08:12:11] there's no labels with the instance name
[08:12:17] just the external fqdn I think
[08:12:46] I guess it just blindly uses `instance=hostname` for the silence
[08:12:47] huh
[08:12:53] in that case, let me have a look
[08:13:13] I started adding `service=ceph,mgr,...` labels to some alerts, so we could
[08:13:29] filter alerts not specific to that instance but to derived services too
[08:15:01] anyway, the alert seems to have cleared for now
[08:15:17] yep :)
[08:15:51] i think the issue might be that the instance= label has a port number appended on the blackbox metrics that's not there for the rest
[08:36:08] quick review https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/32
[08:36:54] (just checked the email) the instance for the api had `instance = api.svc.toolforge.org:443`, nothing related to the proxies :/
[08:56:50] hmm... how does svc.beta.toolforge.org show up in the list of domains for webproxies in toolsbeta? (toolsbeta.org does not show up), is it a config somewhere?
[08:57:35] probably https://gerrit.wikimedia.org/g/cloud/instance-puppet/+/e7cc9efe6897d7ebedbe8c46ce9190d869700e5f/project-proxy/proxy.yaml#43
[08:58:13] https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Web_proxy#Enable_per-project_subdomain_delegation
[08:59:57] thanks, I think we might not have followed that process for `toolsbeta.org` (or missed some steps I guess)
[09:00:02] I'll add a task to clean up
[11:38:05] * dcaro lunch
[12:16:49] well I found where all the toolforge NFS space went: https://phabricator.wikimedia.org/T395020
[13:10:51] dhinus: chuckonwu I have left a few tasks flagged with 'good first task' that you can pick up when you are done with the current one, can you give them a review and let me know if they are clear/easy to understand?
[13:15:38] dcaro: sure, is that on the toolforge phab board?
[13:16:24] yep
[13:16:54] https://phabricator.wikimedia.org/project/view/7905/
[13:18:32] found them, they look good thanks!
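For context on the silence problem discussed above (08:11–08:15): the remove-instance cookbook's silence reportedly misses blackbox probe alerts because their `instance` label carries a port suffix (e.g. `api.svc.toolforge.org:443`). A minimal sketch of a silence covering both label shapes, posted to the Alertmanager v2 API, could look like the following; the Alertmanager URL, duration, and `createdBy` value are assumptions, not the cookbook's actual behaviour.

```python
# Sketch only: silence alerts for a removed instance, matching the "instance"
# label both with and without a ":port" suffix (blackbox probes append one).
# The Alertmanager endpoint and createdBy value are illustrative guesses.
import datetime
import re

import requests

ALERTMANAGER = "http://alertmanager.example.org:9093"  # assumed endpoint
instance = "api.svc.toolforge.org"                     # host whose alerts to silence

now = datetime.datetime.now(datetime.timezone.utc)
silence = {
    "matchers": [
        {
            "name": "instance",
            # Matches both "api.svc.toolforge.org" and "api.svc.toolforge.org:443".
            "value": rf"{re.escape(instance)}(:\d+)?",
            "isRegex": True,
        }
    ],
    "startsAt": now.isoformat(),
    "endsAt": (now + datetime.timedelta(hours=2)).isoformat(),
    "createdBy": "remove-instance-cookbook",  # placeholder author
    "comment": "instance deleted; silencing leftover probe alerts",
}

resp = requests.post(f"{ALERTMANAGER}/api/v2/silences", json=silence, timeout=10)
resp.raise_for_status()
print("silence id:", resp.json()["silenceID"])
```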
the ones we identified with arturo were: T384251, T394276, T349775
[13:18:33] T384251: [jobs-cli] If the pod exists and it has no logs, read the message status from it and output that - https://phabricator.wikimedia.org/T384251
[13:18:33] T394276: [components-api] Add basic prometheus metrics - https://phabricator.wikimedia.org/T394276
[13:18:33] T349775: [toolforge,jobs] "toolforge jobs logs" fails when job has not started yet - https://phabricator.wikimedia.org/T349775
[13:18:40] the prometheus one might be a bit too big
[13:19:29] I think that Raymond_Ndibe might have started it
[13:21:03] the other two look ok, I'd start with the components-api ones though, as they would help with the hypothesis work ( so more reportable™ :) )
[13:21:20] dcaro: good point
[13:22:00] maybe starting from T394994 and T394990
[13:22:00] T394994: [components-cli] make `toolforge components deployment show` show the latest deployment if no id passed - https://phabricator.wikimedia.org/T394994
[13:22:01] T394990: [components-api] add `GET` endpoint `/v1/tool//deployments/latest` - https://phabricator.wikimedia.org/T394990
[13:22:27] sounds good yes, note that the cli depends on the api one
[13:47:09] dcaro, can I assign T394333 to you to double check racking balance? That's for the first order of jumbo-sized osds.
[13:47:09] T394333: Q4:rack/setup/install cloudcephosd10[48-51] & relocate cloudcephosd1039 - https://phabricator.wikimedia.org/T394333
[13:47:36] andrewbogott: ack, I'll try to give it a look, when is it needed?
[13:48:26] hm, you're about to leave for two weeks aren't you?
[13:49:28] It's possible but unlikely that the hardware will show up before you're back.
[13:49:46] If you don't have time I can make an attempt, I just know you have a plan already :)
[13:50:23] I'll let you know then if I have time to get to it :)
[13:51:01] thx
[14:30:25] taavi: do you have thoughts or a task about the recent increase in DNS leaks? If not I'll open a task.
[14:30:45] no!
[14:34:18] andrewbogott: https://phabricator.wikimedia.org/T395020#10848314 asks if our NFS mounts support `atime`. do you happen to know that already?
[14:37:11] I responded on the task
[14:37:23] they don't.
[14:37:55] thanks!
[14:39:50] * andrewbogott makes T395037
[14:40:13] um... T395037
[14:40:14] T395037: new, frequent DNS record leaks in wmcs - https://phabricator.wikimedia.org/T395037
[15:37:41] Is there any HTTPS frontend for the Cloud VPS S3 storage stuff? I'm wondering if a project that uses S3 storage needs to build its own web ui to look at stored things or if buckets can be marked as public and then just browsed/deep linked into.
[15:39:50] bd808: if you mark a bucket as public you can access individual files directly via https, but there isn't a graphical browser like mod_index
[15:40:06] (you may or may not have an xml index listing all the files, don't remember the exact details)
[15:40:38] ack. Thanks taavi. The possible use case here is Zuul job logs and I think that direct URL access is what it would need.
[15:41:46] yeah, as long as you know the bucket and file name you can construct the URL manually
[15:45:16] I guess we could expand openstack browser to consume swift apis? But that seems like a lot of scope creep.
[15:45:40] listing buckets maybe, but otherwise i don't think that's in scope for that tool
[15:47:50] I think I agree...
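On the `atime` question at 14:34: one quick client-side check is to look at the mount options of the NFS mounts, though mount options alone don't tell the whole story for NFS. A rough sketch, assuming a standard `/proc/mounts` layout:

```python
# Rough sketch: report atime-related mount options for NFS mounts on this host
# by parsing /proc/mounts. Purely illustrative; run on the NFS client in question.
ATIME_FLAGS = {"noatime", "relatime", "strictatime", "nodiratime"}

with open("/proc/mounts") as mounts:
    for line in mounts:
        device, mountpoint, fstype, options = line.split()[:4]
        if fstype.startswith("nfs"):
            flags = set(options.split(",")) & ATIME_FLAGS
            print(f"{mountpoint}: {', '.join(sorted(flags)) or 'no explicit atime option'}")
```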
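On the public-bucket question at 15:37–15:45: a sketch of what "mark a bucket public and deep-link into it" could look like through the S3 API, using boto3. The endpoint URL, bucket name, and object key are hypothetical, and credentials are assumed to come from the usual boto3 config/environment.

```python
# Sketch, not a confirmed workflow: make a bucket publicly readable and build
# direct HTTPS (path-style) URLs to its objects via the S3 API.
import boto3

ENDPOINT = "https://object.example.wmcloud.org"  # assumed radosgw/S3 endpoint
bucket = "zuul-job-logs"                         # hypothetical bucket name

s3 = boto3.client("s3", endpoint_url=ENDPOINT)

# Allow anonymous reads on the whole bucket.
s3.put_bucket_acl(Bucket=bucket, ACL="public-read")

# Once public, knowing the bucket and key is enough to construct a URL.
key = "123456/console.log"  # hypothetical object key
print(f"{ENDPOINT}/{bucket}/{key}")
```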
[15:48:01] it would be easy enough to expose objects but then that has me thinking about scrapers :(
[15:50:05] Something like https://github.com/rufuspollock/s3-bucket-listing might be a better thing than adding file viewing to openstack-browser.
[18:14:02] Just got an alert about tools nfs
[18:14:32] ...and a recovery
[18:15:36] I think tools-static might have gotten borked, restarting nginx
[19:12:45] andrewbogott: yep, there's a few workers now with processes stuck on nfs :/, can you handle it? I'm on an uncomfortable platform (not laptop)
[19:13:04] yep, I was waiting for them to show up
[19:19:42] I'm rebooting 8 stuck workers -- need to step out but will take care of any new/remaining workers when I'm back
[20:35:00] andrewbogott: thanks!
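On the stuck workers at 19:12: processes hung on a dead NFS mount typically sit in uninterruptible sleep ("D" state). A small illustrative sketch for spotting them on a worker, assuming a standard Linux `/proc` layout:

```python
# Illustrative sketch: list processes in uninterruptible sleep ("D" state),
# the usual symptom of a hung NFS mount, by reading /proc/<pid>/stat.
import os

for pid in filter(str.isdigit, os.listdir("/proc")):
    try:
        with open(f"/proc/{pid}/stat") as statfile:
            stat = statfile.read()
    except OSError:
        continue  # process exited while we were iterating
    # Format is "pid (comm) state ..."; comm may contain spaces, so split on the last ")".
    comm = stat[stat.index("(") + 1:stat.rindex(")")]
    state = stat[stat.rindex(")") + 2]
    if state == "D":
        print(f"pid {pid} ({comm}) is in uninterruptible sleep")
```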