[10:39:59] Lunch
[11:27:06] lunch
[13:52:13] inflatador: you should probably name the db airflow_search
[13:52:14] see
[13:52:47] https://github.com/wikimedia/operations-puppet/blob/production/modules/profile/manifests/airflow.pp#L125
[13:53:09] steve is working on adapting this for new airflow and postgres now, but the default db name will be the same
[13:53:32] that allows us to use the same convention for all wmf airflow hosts.
[13:53:42] by setting use_wmf_defaults: true in hiera
[14:18:00] ottomata :eyes
[14:18:30] o/
[14:46:38] sigh "No space left on device" on an-airflow1001 while deploying :/
[14:51:16] scheduler logs were 14M/day vs 280M/day since 2023-02-07
[14:51:44] oops my bad, it's compression
[14:54:46] * dcausse wonders if we can cleanup scap cache /srv/deployment/wikimedia/discovery/analytics-cache
[14:58:11] dcausse ~5m late to mtg
[15:01:24] np
[15:52:41] inflatador: it seems OK to remove the scap cache
[15:52:50] (cd /srv/deployment/wikimedia/discovery/analytics-cache/revs && rm -rf 1d3ba411524aa8c7e6750cd4c84109f1bd2d8ca7 5a19b9d11e7017414c209c736b7e07623404e51d b4d31fbd9b12051a4f631284496aa89a36187b27 dc3cd56b553aa10b6ffaed994945e293fe4a92a2 e988b5e91b5f64acd58155176b13d0dae2fcab24)
[15:53:15] these are old revisions
[15:53:52] then we can decrease the number of cached revs
[15:53:59] e.g. cache_revs = 3
[15:54:00] dcausse cool, I can delete it if you like. As I said I'm pretty cavalier about deleting stuff with 'cache' in the name ;)
[15:54:23] well there's a single one that needs to stay
[15:55:12] the one that's pointed to by /srv/deployment/wikimedia/discovery/analytics-cache/current
[15:55:57] understood. I won't delete anything unless you or Erik say it's OK. I don't know where the cached_revs setting lives either
[15:56:11] currently 99a3e6f99cbe985b88385d4245090476491c8502 is active
[15:56:18] will update the scap config
[16:01:56] dcausse: ryankemper, mpham: retrospective coming up: https://meet.google.com/eki-rafx-cxi
[16:59:07] errand
[17:01:08] hmm...all the tables in mjolnir database of hive have no partitions
[17:02:44] hmm, not all. but some
[17:09:22] inflatador: are the various restarts and switch changes done in eqiad? I'd like to start the wikidata / commonswiki reindexes but those will take a few days and potentially fail if a node it's using is lost
[17:09:57] ebernhardson eqiad doesn't start until Mar 1, so we should be good if they'll only take a few days
[17:11:58] yea i think it's usually a week or so, i'll kick it off now so it mostly runs on the weekend when hopefully people aren't breaking things :)
[17:12:28] ACK
[17:15:18] hmm, actually the page_id doc values change is still pending. I guess need to do a full-cluster reindex
[18:56:19] back, (sorry, was at lunch)
[18:56:49] Draft version of annual planning is out: https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2023-2024/Draft/Product_%26_Technology
[18:58:31] The part that concerns our team the most is the last objective in https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2023-2024/Draft/Product_%26_Technology#Objectives_2
[19:00:06] reindexing is started, wikidata and commonswiki are running by themselves, and then a big loop for the rest of the wikis (*3 clusters, =9 processes in parallel)
[19:02:01] ryankemper, inflatador: I'm still in meetings and haven't had dinner yet. I'll skip our pairing session :/
[19:02:02] sorry!
[19:02:20] gehel np, enjoy your food
[19:02:21] gehel: understood! go get some calories :)
[19:02:26] thanks!
[20:56:00] can't decide if mjolnir should have a separate deploy repo, or contain a scap dir in the mjolnir repo. In the past we used a deploy repo because it needed to use git-fat and have various artifacts which would pollute the main repo. Now though it only needs a URL from which to fetch the environment.
[20:57:00] But it ends up a bit awkward because you would release a new tag of mjolnir, which would trigger the environment build, but then the tag wouldn't have the correct url in it since the url doesn't exist until after tagged
[20:57:17] maybe we do still need a separate deploy repo then
[20:57:49] and i suppose the deploy doesn't need the main repo anyways, it doesn't want to sync all that code out. it wants to download the env with that code embedded already
[20:58:36] another awkward part, each search-loader will separately download the environment instead of having it synced from the deployment server...but for only 2 instances maybe not worth thinking about
[21:17:32] in that case we could continue using the existing deploy repo, although it's in gerrit...move to gitlab purely for consistency purposes?
[21:35:41] just as an FYI, relforge elastic logging is in a broken state. ryankemper and I are working on it and we'll roll back if we can't fix by tomorrow
[21:35:55] re https://phabricator.wikimedia.org/T324335
[21:39:39] break relforge all you want, it's what it's there for :)
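Editorial sketch, not part of the log: the scap cache cleanup discussed between 15:52 and 15:56 (remove old revisions, always keep the one `current` points to) could be expressed roughly as below. This is only an illustration under stated assumptions. It assumes `current` is a symlink that resolves to a directory directly under `revs/`, and it uses directory modification time as a stand-in for revision age. The function name `prune_cached_revs` and its `keep` parameter are hypothetical; this is not how scap itself applies its `cache_revs` setting.

```python
import os
import shutil


def prune_cached_revs(cache_dir, keep=3):
    """Delete old revision directories under <cache_dir>/revs.

    The revision that <cache_dir>/current resolves to is always kept.
    The remaining slots (up to `keep` in total) go to the most recently
    modified revisions. Returns the sorted names of removed revisions.

    Assumptions (hypothetical, not taken from scap): `current` is a
    symlink into revs/, and mtime is a good enough proxy for age.
    """
    revs_dir = os.path.join(cache_dir, "revs")
    current = os.path.basename(
        os.path.realpath(os.path.join(cache_dir, "current"))
    )

    revs = [
        d for d in os.listdir(revs_dir)
        if os.path.isdir(os.path.join(revs_dir, d))
    ]
    if current not in revs:
        # Refuse to delete anything if the active revision is unclear.
        raise RuntimeError("cannot determine the active revision")

    # Newest first, by directory modification time.
    revs.sort(
        key=lambda d: os.path.getmtime(os.path.join(revs_dir, d)),
        reverse=True,
    )

    keepers = {current}
    for rev in revs:
        if len(keepers) >= keep:
            break
        keepers.add(rev)

    removed = []
    for rev in revs:
        if rev not in keepers:
            shutil.rmtree(os.path.join(revs_dir, rev))
            removed.append(rev)
    return sorted(removed)
```

Refusing to run when `current` cannot be resolved mirrors the caution in the log: nothing is deleted unless the active revision is known.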