[08:49:53] <wm-bot>	 !log lucaswerkmeister-wmde@tools-bastion-13 tools.phpunit-results-cache webservice restart # clear cache, maybe it fixes https://gerrit.wikimedia.org/r/1164312 (cc hashar)
[08:49:55] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.phpunit-results-cache/SAL
[20:43:20] <FastLizard4>	 Hey folks! I'm seeing some pretty serious clock skew on one of the instances in account-creation-assistance; it looks like NTP is not working because it's not able to reach any of the internal NTP servers: https://app.warp.dev/block/PmaF4FfaWVYtAJE73bcKfD
[20:44:30] <FastLizard4>	 I thought this was maybe due to security groups, but I tried adding rules to allow ingress and egress on UDP port 123 to the internal CIDR and that didn't help either; plus, another instance with the same base security group (and the extra security groups don't do anything with UDP) is able to talk to the NTP servers just fine
[20:45:25] <FastLizard4>	 So I'm a little stumped as to what could be the cause. Maybe I've just missed something because dum, but would appreciate any insights/thoughts. The clock drift is causing failures in validating JWTs and OAuth identity tickets from Wikipedia
[20:47:02] <FastLizard4>	 (I did just reboot the instance which improved the situation, but even just minutes after the reboot the clock is behind the RTC by a noticeable amount)
[22:15:44] <bd808>	 FastLizard4: hmmm... I see that log showing that it is trying to call the same NTP relays that all Cloud VPS instances should be using.
[22:22:20] <FastLizard4>	 Indeed
[22:38:22] <FastLizard4>	 Would this be something I should open a Phabricator ticket for? I'm genuinely quite stumped as to what could be causing this
[22:39:28] <wm-bb>	 <jeremy_b> reduce a test case? tcpdump the NTP?
[22:40:03] <bd808>	 FastLizard4: yeah, it is worth a ticket. I'm trying something on accounts-appserver7 -- I added the WMCS managed "default" security group and then `sudo timedatectl set-ntp true` to restart things.
[22:40:10] <wm-bb>	 <jeremy_b> try a different NTP server instead of official?
[22:41:25] <bd808>	 `timedatectl` is still not saying "System clock synchronized: yes" as hoped, but I'm not seeing connection failures in `journalctl -u systemd-timedated.service --no-pager --follow` yet
[22:42:46] <wm-bb>	 <jeremy_b> also can block NTP in iptables on a host that's not already broken and see what the logs look like
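
The suggestions above (reduce a test case, try a different NTP server, compare against a host that works) could be sketched as a small standalone SNTP probe. This is only a sketch under assumptions, not what anyone in the channel ran: the function names are made up for illustration, the server name is supplied by the caller, it uses IPv4 only, and the offset is a rough estimate taken from the server's transmit timestamp rather than the full NTP offset calculation.

```python
import socket
import struct
import sys
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_request() -> bytes:
    """Build a minimal 48-byte SNTP client request."""
    # First byte: LI=0, VN=4, Mode=3 (client) -> 0b00100011 = 0x23.
    return b"\x23" + 47 * b"\x00"


def parse_transmit_time(packet: bytes) -> float:
    """Return the server's transmit timestamp as Unix time in seconds."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    # The transmit timestamp occupies bytes 40..47: 32-bit seconds + 32-bit fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def probe(server: str, timeout: float = 3.0) -> float:
    """Query `server` on UDP port 123 and return a rough (server - local) offset.

    Raises an OSError subclass (a timeout) if no reply arrives, which is the
    symptom of interest when traffic to the NTP server is being dropped.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sent = time.time()
        sock.sendto(build_request(), (server, 123))
        packet, _ = sock.recvfrom(512)
        received = time.time()
    # Compare the server's clock with the midpoint of the local round trip.
    return parse_transmit_time(packet) - (sent + received) / 2


if __name__ == "__main__":
    # The server to test is passed on the command line; none is assumed here.
    if len(sys.argv) != 2:
        sys.exit("usage: ntp_probe.py <ntp-server>")
    try:
        print(f"offset vs {sys.argv[1]}: {probe(sys.argv[1]):+.3f} s")
    except OSError as exc:
        print(f"no usable reply from {sys.argv[1]}: {exc}")
```

Running the same script on the broken instance and on a working one, against both the internal servers and an outside one, would show whether the failure is specific to one path; a `tcpdump` on UDP port 123 alongside it would show whether the request leaves the host at all.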
[22:42:50] <FastLizard4>	 I've only been seeing the connection failures in the logs for systemd-timesyncd, not timedated
[22:43:55] <FastLizard4>	 Just restarted it, still seeing the timeouts alas
[22:44:24] <bd808>	 FastLizard4: yeah. I was looking in the wrong place. This is really weird
[22:44:58] <wm-bb>	 <jeremy_b> like movie plot weird? :)
[22:45:02] <bd808>	 https://serverfault.com/a/972336 has the troubleshooting tips I was looking at
[22:47:14] <bd808>	 I took the "default" Security Group back off that instance. I think you probably should be using that one to make sure you catch changes to the monitoring servers and such, but I don't want to mess up your opentofu setup.