[11:24:47] lunch
[13:03:19] o/
[13:31:56] \o
[13:33:03] o/
[13:52:24] .o/
[14:28:15] opensearch-eqiad is currently scaling up, will do codfw next
[14:35:51] inflatador: not sure if it's related to the problems from before, but a pod came up on dse-k8s-worker1013.eqiad.wmnet and is showing the same networking problems we saw before
[14:36:01] as in, it can't find the cluster
[14:48:14] Does anyone feel a strong need to retro today?
[14:49:11] not really
[14:54:17] same
[14:54:45] dcausse: oh, i found the mjolnir problem yesterday...it's the same problem as glent. Should have checked more carefully for other occurrences: https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/2075/diffs
[14:55:25] unix_timestamp returned null, it propagated its way through and made session ids null
[14:56:25] :/
[14:56:27] looking
[14:57:05] * ebernhardson sometimes wishes sql-like things were less null-safe and blew up instead of passing nulls through everything
[14:57:29] it's when we moved to spark-sql instead of hql?
[14:57:46] dcausse: no, the events started emitting millisecond precision on timestamps, but that was declaring a specific timestamp format
[14:58:21] ah ok, yes strict date parsing like this could not work well :/
[14:58:24] cast'ing avoids that since it doesn't require a specific format, it accepts a variety of formats
[15:00:12] will have to check source data, hopefully we can re-run the past few months
[15:04:32] sure
[15:05:43] opensearch-semantic-search-eqiad is now scaled up, 16 pods, ~600GB of memory, ~2.8TB disk. I'm still restarting masters to pick up the new settings, but it should be good for indexing
[15:22:08] {◕ ◡ ◕}
[15:27:07] all restarted, doing codfw now
[15:49:58] codfw done now too, both clusters should be ready for loading data
[15:56:19] thanks!
[15:56:57] I could start shipping pt on both
[16:08:27] hmm, i tried to reset query_clicks_hourly tasks via kubectl exec...and all the pods restarted :S
[16:08:31] maybe it was coincidence...
[16:11:04] well, worked the second time
[18:46:29] dinner
[19:07:39] almost amazing, checkout out opensearch 2.19 branch and ran `./gradlew clean`...took 4m31s
[19:07:45] s/checkout/checked/
[19:08:06] I hate to ask, but is that good or bad?
[19:13:07] that's bad :P Most things it takes a couple seconds, but opensearch (and elasticsearch) have deep integration with the build system
[19:15:39] (╯°□°)╯︵ ┻━┻
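
Note on the 14:55-14:58 exchange: the failure mode described there (a parser tied to one timestamp format yields null once events gain millisecond precision, and the null flows silently into derived session ids) can be sketched in plain Python. This is only an analogy, not the Spark SQL from the linked merge request. The format string, the function names, and the 1800-second window are all hypothetical, chosen for illustration.

```python
from datetime import datetime, timezone

# Assumed strict format: seconds precision only (hypothetical, for illustration).
STRICT_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def strict_epoch(ts):
    """Mimic a null-on-mismatch parser: return None instead of raising."""
    try:
        parsed = datetime.strptime(ts, STRICT_FORMAT)
    except ValueError:
        return None
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def lenient_epoch(ts):
    """Accept ISO 8601 timestamps with or without fractional seconds."""
    parsed = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return int(parsed.timestamp())


def session_bucket(epoch, window=1800):
    """Null-propagating derived value, like a SQL expression over a null column."""
    return None if epoch is None else epoch // window


old_event = "2024-01-01T00:00:00Z"      # seconds precision
new_event = "2024-01-01T00:00:00.123Z"  # millisecond precision

# The strict parser handles the old shape but yields None for the new one,
# and the None carries through to the derived value without any error.
print(session_bucket(strict_epoch(old_event)), session_bucket(strict_epoch(new_event)))

# The lenient parser gives the same bucket for both shapes.
print(session_bucket(lenient_epoch(old_event)), session_bucket(lenient_epoch(new_event)))
```

Raising instead of returning None in `strict_epoch` would give the "blew up instead of passing nulls through" behaviour wished for at 14:57:05.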