[08:49:30] dcausse: thanks for your maven-resources-plugin bump/fix, I just noticed it’s flaky (with symlinks?) but didn’t look deeper into it.
[08:51:04] ^ is this something we should fix in our parent pom? Or at least document there?
[08:52:42] Well, it would not hurt to bump versions regularly. I could run the dependency plugin to check for updates.
[08:53:10] Not a bad idea! I do that every now and then, but I don't have a regular schedule for it. And it's been a while!
[08:53:37] No worries, I’ll do that.
[08:54:46] dcausse: I started looking at https://phabricator.wikimedia.org/T341227 (dedupe uploaded files in ES) but I’m not sure I found the right places in CirrusSearch where the deduplication is happening at the moment. Would you have time for a pairing session some time today?
[08:55:39] pfischer: sure, would 3pm work for you?
[08:57:53] Yes, I’ll schedule a meeting. Thanks.
[09:00:00] I see TestcontainersConfiguration:370 - Attempted to read Testcontainers configuration file at file:/nonexistent/.testcontainers.properties but the file was not found.
Exception message: FileNotFoundException: /nonexistent/.testcontainers.properties
[09:00:54] in CI (https://gitlab.wikimedia.org/dcausse/cirrus-streaming-updater/-/commits/page-rerender-wip) if you get a chance to have a quick look, I'm not sure I understand what I've messed up that's causing this
[09:01:52] related to this branch, I had to give up on some final fields in https://gitlab.wikimedia.org/dcausse/cirrus-streaming-updater/-/commit/3a219b7254d0d69c67acca408bb322d2bfe3ef60
[09:02:28] I'm not super happy with this but that's a requirement of the flink PojoTypeInfo
[09:03:05] without PojoTypeInfo we would rely on GenericTypeInfo which would be a mistake for us
[09:03:30] Unmeeting if anyone is interested
[09:04:00] so please let me know what you think, I believe that by writing more boilerplate and custom TypeInformation we could keep InputEvent as it was (keep Set and keep the final fields)
[09:04:10] meet.google.com/hvn-zxxd-xrb
[09:05:21] sorry can't make the unmeeting today :/
[09:23:57] dcausse: I’ll have a look.
[09:45:51] hm.. cindy mysql dbs do not seem to be cleared up and it's failing trying to create the bot password because it already exists...
[09:51:58] Error response from daemon: error while removing network: network mwcli-mwdd-default_dps id 453e7a8ea6819bb92980a487297c83d4a4ca9497fad7dc67ed3aa23d6380b2f9 has active endpoints
[09:52:22] and perhaps it's causing subsequent cleanups (volumes?) to fail
[09:52:55] I vaguely remember Erik saying that sometimes docker gets into a weird state and a reboot could help, will try that
[09:58:42] yes now mw docker destroy worked and all volumes are gone
[10:01:51] seems to work now
[10:26:12] dcausse: regarding page-re-renders: I added some debug logging + CI artifacts in a personal fork, let's see if that sheds some light
[10:29:09] lunch
[10:31:57] A more general question regarding cirrus search: Do we try to be search engine agnostic or is ES the de facto default without alternatives?
I’d like to know so I can reason about how to name fields in event schemas. If we want to be agnostic, we’d have to prepare the schema for alternative engines (and sets of parameters) to coexist (at least schema-wise). If we’re happy with ES and assume it will be the only option in the near future, I would name things accordingly.
[11:04:31] dcausse: I rebased on main and the CI build passes: https://gitlab.wikimedia.org/pfischer/cirrus-streaming-updater/-/jobs/126226
[12:10:50] pfischer: being engine agnostic would be great but sadly we're carrying these noop settings which are pretty much engine specific
[12:11:18] thanks for testing the build, I thought I was already on top of main but I might have missed something
[12:33:23] pfischer: regarding the build that fails, it's on my wip branch (https://gitlab.wikimedia.org/dcausse/cirrus-streaming-updater/-/tree/page-rerender-wip), I tried to rerun the build to see
[12:53:56] well.. now it passes, go figure...
[14:17:14] semi stupid question (only because after 7 years, I should probably know): where is the code for Cindy?
[14:28:43] dcausse re: nginx 404 https://phabricator.wikimedia.org/T342762#9047088, I'm guessing that could happen to wdqs as well? Just thinking about where to document this type of failure
[14:57:29] inflatador: not sure it could happen on wdqs unless we mess up this file manually, the reason it happened on wcqs is probably that we used that aliases.map early in the life of wcqs but we got rid of that after enabling real-time updates
[14:58:21] gehel: it's now here https://gitlab.wikimedia.org/repos/search-platform/cirrus-integration-test-runner but previously it was mainly a hacked python script on the wmcs host
[14:59:49] and the tests themselves are in https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/CirrusSearch/+/refs/heads/master/tests/selenium/ ?
[15:00:42] gehel: close, https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/CirrusSearch/+/refs/heads/master/tests/integration/features/
[15:01:06] damn, almost!
[15:01:08] the selenium folder is supposed to be the folder for the browser tests run by mw
[15:01:20] s/mw/jenkins
[15:03:51] ryankemper: retrospective https://meet.google.com/eki-rafx-cxi
[15:54:41] have friends at home, going offline early
[16:38:08] ryankemper or anyone else, if you have time to take a look at https://gerrit.wikimedia.org/r/c/operations/puppet/+/942457 , it's for getting metrics from the new flink-zk cluster
[16:41:02] ryankemper I've checked a few things and I don't think the firewall is the issue...hmm
[16:42:23] we probably need to define monitoring stuff in puppet somewhere
[16:42:47] inflatador: don't think firewall is the issue meaning https://gerrit.wikimedia.org/r/c/operations/puppet/+/942457/1/hieradata/role/common/zookeeper/flink.yaml is not necessary? or are ya saying something else
[16:44:43] ryankemper I mean that PR I just sent is probably unnecessary. I confirmed that the prom host can get to the flink-zk host, it's just not trying
[16:44:57] in other words, network connectivity works, it's just not configured to scrape the new hosts
[16:46:51] I'm going to leave it open for the moment just in case I missed something, but I'm looking at monitoring configs in puppet instead. If you have suggestions LMK
[16:47:55] Hitting up lunch, back in ~1h
[17:14:56] Yeah feels like prometheus target config or something is wrong maybe
[17:15:01] breakfast, will take a look in an hr
[17:46:37] back
[17:55:19] re: monitoring flink-zk, I just pinged in observability IRC
[18:25:00] ryankemper, inflatador: I'll be 10' late for our pairing session
[18:25:07] gehel ACK
[18:30:32] gehel: inflatador: same for me actually, need to talk to the construction guys outside real quick
[18:31:47] np
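
Note on the PojoTypeInfo discussion (09:01 to 09:04): the trade-off between final fields and Flink's POJO handling can be illustrated with a rough sketch. The code below does not use Flink and is not Flink's actual type extraction; it is a simplified reflection check that approximates the commonly documented POJO shape (public class, public no-arg constructor, fields that are public or reachable through getter and setter), and it treats final fields as disqualifying because that is what the discussion reports. The class names `PojoShapeCheck`, `ImmutableEvent`, `MutableEvent` and the fields `pageTitle` and `revisionId` are hypothetical and are not taken from the real InputEvent.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: approximates POJO-shape rules, NOT Flink's real TypeExtractor.
final class PojoShapeCheck {

    /** Immutable style: final fields and no no-arg constructor (hypothetical example). */
    public static class ImmutableEvent {
        private final String pageTitle;
        private final long revisionId;

        public ImmutableEvent(String pageTitle, long revisionId) {
            this.pageTitle = pageTitle;
            this.revisionId = revisionId;
        }

        public String getPageTitle() { return pageTitle; }
        public long getRevisionId() { return revisionId; }
    }

    /** POJO style: no-arg constructor, non-final fields, getters and setters. */
    public static class MutableEvent {
        private String pageTitle;
        private long revisionId;

        public MutableEvent() { }

        public String getPageTitle() { return pageTitle; }
        public void setPageTitle(String pageTitle) { this.pageTitle = pageTitle; }
        public long getRevisionId() { return revisionId; }
        public void setRevisionId(long revisionId) { this.revisionId = revisionId; }
    }

    /** Lists the reasons a class does not match the assumed POJO shape. */
    static List<String> violations(Class<?> clazz) {
        List<String> problems = new ArrayList<>();
        if (!Modifier.isPublic(clazz.getModifiers())) {
            problems.add("class is not public");
        }
        try {
            clazz.getConstructor(); // only finds a public no-arg constructor
        } catch (NoSuchMethodException e) {
            problems.add("no public no-arg constructor");
        }
        for (Field field : clazz.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods) || field.isSynthetic()) {
                continue; // not part of the serialized state
            }
            if (Modifier.isFinal(mods)) {
                // Assumption taken from the discussion: final fields are not acceptable.
                problems.add("field '" + field.getName() + "' is final");
            } else if (!Modifier.isPublic(mods) && !hasAccessors(clazz, field)) {
                problems.add("field '" + field.getName() + "' lacks getter/setter");
            }
        }
        return problems;
    }

    /** True if the class has a public setter and a public getter (get/is) for the field. */
    private static boolean hasAccessors(Class<?> clazz, Field field) {
        String name = field.getName();
        String suffix = Character.toUpperCase(name.charAt(0)) + name.substring(1);
        try {
            clazz.getMethod("set" + suffix, field.getType());
        } catch (NoSuchMethodException e) {
            return false;
        }
        for (String prefix : new String[] {"get", "is"}) {
            try {
                clazz.getMethod(prefix + suffix);
                return true;
            } catch (NoSuchMethodException ignored) {
                // try the next prefix
            }
        }
        return false;
    }

    static boolean looksLikePojo(Class<?> clazz) {
        return violations(clazz).isEmpty();
    }

    public static void main(String[] args) {
        System.out.println("ImmutableEvent: " + violations(ImmutableEvent.class));
        System.out.println("MutableEvent:   " + violations(MutableEvent.class));
    }
}
```

A check like this only describes the class shape. Whether a given field type (for example a Set) gets a dedicated serializer or falls back to generic serialization is a separate question that would have to be verified against Flink itself.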