[08:33:40] leloiandudu: What tool is it? Also, how does your tool connect to wikipedia, does it keep a connection open or does it open a new one every time you want to connect? (if you can show the code that does it that could help too)
[09:10:55] dcaro: it opens a new HTTP connection and fails. I don't think the code will help. this only started happening recently. I'm not sure if it's related to k8s migration or not, but it definitely coincided https://github.com/Leloiandudu/ChieBot/blob/master/Browser.cs
[09:11:47] rarely I see in the logs that 1-2 retries is enough and it works again. most of the time 5 retries after 5 minutes still can't connect
[09:17:38] Are you setting the user-agent according to https://meta.wikimedia.org/wiki/User-Agent_policy ?
[09:17:42] (covering bases)
[09:18:13] https://github.com/Leloiandudu/ChieBot/blob/master/Program.cs#L68 I do
[09:18:27] everything worked perfectly for years
[09:20:58] Can you open a task with the chat info? we'll have to debug more in depth, but usually a connection reset is sent by the other side (wikipedia servers in this case) or some firewall in-between, but we don't really have any firewalls for outgoing traffic, and the fact that it works sometimes makes me think that it might be some rate-limiting going on
[09:21:29] (as opposed to being a network issue, as we only have one path outside of the network, if it was broken, it'd be broken all the time)
[09:21:47] of course, strange stuff happens :), so yep, needs looking into
[09:23:26] alright! thanks! on phabricator?
[09:24:19] yes please, can you tag it with 'toolforge' please?
[09:24:43] actually, just saw https://phabricator.wikimedia.org/T356160
[09:24:46] seems to be the same issue
[09:25:06] yeah sounds like exactly the same issue!
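(editor's note) The retry pattern discussed above, where a connection reset sometimes clears after one or two retries, can be sketched roughly as below. This is an illustrative Python sketch only, not the ChieBot code (which is C#). The names retry_with_backoff and fetch, the delay values, and the User-Agent string are all hypothetical; the header just follows the general shape the User-Agent policy asks for (tool name plus contact information).

```python
import random
import time
import urllib.error
import urllib.request

# Hypothetical value: replace with your own tool name and contact details.
USER_AGENT = "ExampleBot/1.0 (https://example.org/examplebot; examplebot@example.org)"


def retry_with_backoff(action, attempts=5, base_delay=1.0, max_delay=300.0,
                       sleep=time.sleep):
    """Call action() until it succeeds or the attempts run out.

    Retries only on ConnectionError (ConnectionResetError is a subclass),
    doubling the wait each time and adding a little random jitter.
    """
    for attempt in range(attempts):
        try:
            return action()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))


def fetch(url):
    """Open a fresh HTTP connection with an identifying User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.read()
    except urllib.error.URLError as err:
        # urllib wraps socket-level failures in URLError; surface connection
        # errors so the retry loop can recognise them.
        if isinstance(err.reason, ConnectionError):
            raise err.reason
        raise
```

Usage would look like retry_with_backoff(lambda: fetch(some_url)). Backing off between attempts would not fix a reset that comes from the remote side, but if rate-limiting is involved it at least avoids hammering the servers while the cause is investigated.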
[09:26:56] feel free to create the task and link it there (or pass me the task-id and I'll link it, probably will create a more 'generic' parent one)
[09:33:26] !log tools cleaning up old schedules on harbor (T356037)
[09:33:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[09:33:31] T356037: [harbor] cleanup execution + task tables - https://phabricator.wikimedia.org/T356037
[09:34:08] dcaro: https://phabricator.wikimedia.org/T356163
[09:37:51] thanks!
[09:41:08] thank you too!
[09:49:04] leloiandudu: the logs you passed are from yesterday, right?
[10:16:57] !log tools restarting harbor and flushing redis to regenerate cache data (T356037)
[10:17:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:17:01] T356037: [harbor] cleanup execution + task tables - https://phabricator.wikimedia.org/T356037
[10:24:18] dcaro: yes, 29 Jan 2024 23:50 UTC
[10:24:47] I have more times if you need
[10:27:31] leloiandudu: yes please, that's very helpful
[10:30:58] today: 9:40, 8:15, 6:30, 6:20, 6:13, 4:15, 3:10, 2:15 (utc)
[10:34:00] looks like from 6:13 to 6:30 was a single episode (my tool runs periodically)
[12:39:20] !log tools rebuilding all the toolforge images (T354320)
[12:39:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[12:39:25] T354320: [webservice] php 7.4 containers don't pass through the environment variables to the scripts - https://phabricator.wikimedia.org/T354320
[12:43:26] On a buildpack service, can I write to a directory where the code is stored? (example a public directory storing commons files)
[12:43:56] Or should that be redirected to TOOL_DATA_DIR ?
[12:44:36] Context, trying to fix croptool :) (re @sohom_datta: On a buildpack service, can I write to a directory where the code is stored? (example a public directory storing commons files))
[12:46:39] sohom_datta the code directory is not persisted, you should be able to write to it, but on the next restart of the container it will disappear, if you want persistent storage, use TOOL_DATA_DIR yep
[12:49:57] or include the files with your code/download at build time
[12:50:13] (depends on what they are and how you want to use them)
[12:51:34] It's files that are only accessed during an editing session, so non-persistent storage works
[12:52:01] For logs and such, I'll probably redirect to TOOL_DATA_DIR
[13:08:31] !log tools create no-op DMARC record T354112
[13:08:35] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[13:08:36] T354112: Ensure Toolforge and Cloud VPS comply with Google's new email sender guidelines - https://phabricator.wikimedia.org/T354112
[13:24:47] Is there a chance we can cache build steps for buildpacks (similar to how docker does it)? (Aptfile builds take a pretty long while)
[13:37:21] sohom_datta currently it's not possible, we are working on implementing persistent volumes on k8s, that would allow us to keep those caches around, but until then we can't cache in the same way
[15:14:07] !log tools restart harbor now that the db is clean (T3543)
[15:14:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:14:12] T3543: Please create the Võro Wikipedia - https://phabricator.wikimedia.org/T3543
[15:16:30] oops, wrong task
[15:16:53] !log tools restart harbor now that the db is clean (T356037)
[15:16:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:16:56] T356037: [harbor] cleanup execution + task tables - https://phabricator.wikimedia.org/T356037
[15:22:31] Heads up on cloudelastic: I'll be merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/993764 shortly. No impact is expected, but there's a small chance that CE would be inaccessible for as long as it takes me to revert the patch (~30m)
[16:24:38] bd808: re https://wikitech.wikimedia.org/w/index.php?diff=2143862, 'should' was in the original. https://wikitech.wikimedia.org/w/index.php?diff=2139979 was when 'must' was introduced.
[16:28:49] well, fuck. how did we get that so wrong in 2016? I was being overly conservative I guess.
[16:30:28] https://wikitech.wikimedia.org/w/index.php?title=Help:Toolforge/Right_to_fork_policy&direction=next&oldid=1205852 was apparently the first "official" version of the policy that we should compare things to.
[16:55:52] I made a request for a Cloud VPS instance to work around the url length limit. https://phabricator.wikimedia.org/T356195 (re @dpriskorn: I have an issue with my new app.
[16:55:53] I suspect a limit on url length is the cause, can someone tell me what the limit imposed is?
[16:55:55] ht...)
[17:00:02] @dpriskorn: I think folks have tried to figure that out before without finding a clear answer. As I recall any limit in place is about nginx default values and possibly nginx compile time constraints.
[17:01:30] I have to hop into a meeting, but I will try to find the past ticket later today to see if it has clues
[17:02:22] wasn't the issue in the request between your tool and mix-n-match? how would moving to cloud vps solve that?
[18:22:30] is there a way to set up a cname in horizon?
[18:38:35] roy649: if you are a projectadmin there would be a 'DNS -> Zones' in the menu on the left. Under that, "Create record set", and I see CNAMEs in the list to pick from, so I think yes
[18:40:22] ah, cool, yeah, that looks like what I need. Thanks.
[18:42:15] yw
[20:19:08] Re: my earlier message: the first cloudelastic host migrated to private IPs successfully, the maintenance is over for now
[20:23:04] inflatador: is it back in the load balancer configs too already, or will that follow later?
[20:30:30] taavi it won't be added back...elastic already handles routing internally, so as long as there's 1 host left in the pool it will work (although we don't ever plan on getting that low)
[20:31:06] detailed plan is at https://etherpad.wikimedia.org/p/cloudelastic-T355617 if you're curious
[20:31:06] T355617: Migrate cloudelastic from public to private IPs - https://phabricator.wikimedia.org/T355617
[20:31:24] subject to change ofc ;)
[21:07:28] inflatador: I'm a bit confused by your plan - how are you planning to handle the non-standard ports through the CDN?
[21:19:24] taavi good point, one I haven't considered. Would we need a separate domain name per port in hieradata/common/profile/trafficserver/backend.yaml ? Or if there's a better way to do it LMK
[21:20:17] I'd prefer not to add new domain names if possible
[21:21:09] I can confirm there are non standard ports in backend.yaml, such as replacement: https://etherpad.discovery.wmnet:7443
[21:21:56] I don't see an example of the same name with 2 different ports though
[21:22:51] Maybe it's possible to specify a port in the "target" section? Don't see anyone doing it now though
[21:22:53] I suppose you could though with the port being part of the replacement line
[21:23:25] I'll ask in traffic
[21:23:40] that is the destination port and has nothing to do with the source port. I don't think you can do anything else than the default https 443 there which won't work with cloudelastic
[21:26:26] I also don't see what's wrong with the current separate LVS service solution
[21:29:11] Could I get you to elaborate on the ticket or the etherpad? I'm happy to do it a different way, but I need more details
[21:29:43] taavi, dcaro do you guys need more times when connection fails or do you have enough data now?
[21:38:00] inflatador: https://phabricator.wikimedia.org/T355720#9500329
[21:40:09] taavi thanks. I guess I'm overthinking it. Will adjust plan accordingly
[21:41:31] leloiandudu: we're still missing a way to reproduce it on-demand but I don't think any more times of when it happened in the past will be very helpful.
[21:48:06] taavi can the private IP hosts use the public LVS IP? I see it's bound to my canary as expected, just wondering if it will "just work" or there needs to be other LVS changes
[21:53:08] inflatador: you can have backends for a 'public' LVS VIP in a 'private' VLAN and it'll just work
[21:56:15] taavi excellent, thanks again...should make things a lot easier
[22:21:19] !log library-upgrader Restarted libup-db02 instance via Horizon to try to fix "Instance 7d785002-371c-4a72-973f-629a6a4f3084 is not currently available for an action to be performed (instance status was ACTIVE)."
[22:21:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Library-upgrader/SAL