[09:20:08] sigh... just realized that image_suggestions_weekly is past dependent and thus stuck since one month :/ [09:28:43] skipped the last 3 runs by forcing a success [11:17:47] lunch [11:28:12] dcausse: Do you know if we rely on reaching the empty “complete” task in our DAGs? I just learned about the option of conditionally short-circuiting a DAG and thought, that might be an alternative. [13:22:40] pfischer: hm... hard to tell, might need to ask Erik, but I guess it matters if there are ExternalTaskSensor waiting for it on other dags, regarding depends_on_past I don't know if it has an impact, it might prevent the next "complete" task to complete [13:24:28] I feel that depend_on_past is working at the task level, such that if the previous run for this same task is not green the "current" run is going to wait [13:28:12] so bypassing "complete" on some runs might not work well for depends_on_past dags [13:29:32] depends_on_past dags are capricious anyways, we should probably avoid them wherever possible [14:27:32] actually if the task is marked "skipped" it might mean "success" so depends_on_past should still work, for ExternalTaskSensor if allowed_states = ["success", "skipped"] that might work but this downstream dag probably needs to be aware that it's skipped to avoid hiding this in a success [15:06:04] \o [15:06:51] o/ [15:07:01] pfischer: the 'complete' things...i dunno maybe we could drop them. The general idea was we needed to chain various dags together, so if all dags had a 'complete' node then the other dag didn't have to understand how it was structured, just to wait for 'complete' [15:30:51] o/ thanks for your explanations. [15:31:44] ebernhardson: is there an easy way to check if short-circuit blows up existing dags? [15:33:03] pfischer: hmm, I can't think of a great way. To test i would probably put together a fresh dag using dummy operators and see how it behaves when running airflow locally [15:33:30] or maybe PythonOperator since it's easier to fail than DummyOperator i guess [15:38:47] * ebernhardson finds it annoying that in an immutable object, holding an ImmutableMap, sonarqube wants the empty constructor to use java.util.Map.of(), even though it wont fit in the place it's going :P [15:42:40] ebernhardson: do you have a sonarqube URLß [15:42:41] ? [15:43:19] pfischer: https://sonarcloud.io/project/issues?resolved=false&sinceLeakPeriod=true&branch=work%2Febernhardson%2Fsaneitizer-source-65a17263&id=wmftest_cirrus-streaming-updater&open=AY5obY-u2WBQbc1bYts- [15:43:57] sonar is also saying i don't understand the interaction between Object.wait, Object.notifyAll, and InterruptedException. it's probably right :) [15:44:33] I suspect in my use case the complaints can be ignored, but still [15:46:12] Hm, regarding the ImmutableMap: I would suggest to stick with the flink helpers for argument checking and use the plain java Map.of() (which is immutable). Or is the state exposed publicly so we must communicate its immutability? [15:46:45] (Together, that would get rid of a dependency on guava) [15:47:51] pfischer: the main thing is that flink has two variants of the class, an immutable version and a mutable version, i suppose i wanted to remove the possibility of getting it wrong when converting and having state reach between classes [15:48:34] but i guess if just enforce the constructor to take a copy its ~ same [15:49:01] well, no guarantee against mutation via type system, but as long as the constructors always using an immutable impl i suppose thats fine [15:49:13] Regarding the synchronized/wait: I ran into the same complaint when writing an IT with a dummy flink source. Solved it by calling CompletableFuture: https://gitlab.wikimedia.org/repos/search-platform/cirrus-streaming-updater/-/merge_requests/107/diffs#782d9b8a4e597d7dfee62ba01f975bc78ad69146_0_540 [15:49:31] in the code linked, you enforce an empty map, so Collections.emptyMap() would do the same, be slightly more explicit and remove that dependency on Guava (which is what I think that Sonar is complaining about). [15:50:20] pfischer: ok thanks, i'll poke that and see if can do similar [15:50:44] ebernhardson: problem is that wait can be suriously awaken and that's why it wants to always check for an "out" condition in a loop [15:51:05] s/suriously/spuriously [15:52:00] dcausse: i suppose i can handle that, but I suppose i was allowing that to defer to the while loop a couple frames up in the stack that will keep re-running our fetch [15:52:23] but that is certainly less obvious [15:54:15] ebernhardson: I think the code works well (except the blind catch on InterruptedException) since you don't need to wait precisely for millis, IIUC it's just to avoid burning too much cpu [15:55:15] i really dunno what to do with that InterruptedException, by my reading it should never happen (Thread.interrupt() has to be explicitly called somewhere) but we have to do something with it [15:57:03] what this ends up doing is passing control back to flink, incase it wanted to do something. But flink should have woken us up with the explicit wakeUp() method [15:57:24] if you own (created it manually) this thread then probably nothing is going to interrupt it for you, if it's from flink then it might happen [15:57:34] flink does create it [15:58:26] * gehel adds this on the stack of things to take a deeper look at... [15:59:36] ebernhardson: generally you want to propagate the InterruptedException as it means something has to stop, if you can't what can be done is reset the Interrupt flag so something else in the call stack will fail again [16:00:20] but ignoring InterruptedException might prevent flink from e.g. shutting down properly [16:00:56] I think gehel has some material about this [16:05:36] dcausse: weekly meeting: https://meet.google.com/eki-rafx-cxi [16:05:41] oops [19:40:56] * ebernhardson realizes saneitize is configured by domain names, but the wikis configuration arg is wiki-ids [20:39:11] ebernhardson: I added a question about the map capacity (https://gitlab.wikimedia.org/repos/search-platform/cirrus-streaming-updater/-/merge_requests/110#note_75221). Feel free to ignore if you think it's obvious enough.