[00:09:52] Also, I just want to share that normal form can be parsed in Rust using only 12 lines of serde, and I think that's really amazing
[00:12:47] I'm a bit confused by the variability of test outcomes recently. For example, when I looked this morning, I thought Z36287 was doing quite well, but now I see 2/7.
[00:23:10] these unspecified errors are quite suspicious 🧐 could it be backend-related?
[07:47:45] 🤷‍♂️ Back to 6/7 again now. What’s quite interesting from the Provenance is that some of the validation call results are from three days ago, suggesting that the successful results from the tests are themselves stable. But then… it’s not clear why they are not all from three days ago. 🤔 (re @u99of9: I'm a bit confused by the variability of test outcomes recently. For example, when I looked this morning, I thought Z36287 was d...)
[07:55:56] And since it is 6/7, why isn't it promoted, since recursive is currently showing 5/7? (re @Al: 🤷‍♂️ Back to 6/7 again now. What’s quite interesting from the Provenance is that some of the validation call results are from th...)
[07:57:49] https://tools-static.wmflabs.org/bridgebot/1a134a2b/67_6767.mp4
[08:00:14] Because the sixth is a WASI resource issue, so the comparison is stalled? (re @u99of9: And since it is 6/7 why isn't it promoted since recursive is currently showing 5/7?)
[08:01:58] So it doesn't demonstrate passing, but shows as passed? (re @Al: Because the sixth is a WASI resource issue, so the comparison is stalled?)
[08:13:47] No. Z34975 fails in Z13466 with a Z576 resource error, so the comparison is aborted or deferred and the list remains unaltered. I think 🤷‍♂️ (re @u99of9: So it doesn't demonstrate passing, but shows as passed?)
[08:34:09] Should they be given a standard time penalty, or a fail in the comparison? What at our end should I envisage causes WASI? (re @Al: No. Z34975 fails in Z13466 with a Z576 resource error, so the comparison is aborted or deferred and the list remains unaltered. ...)
[09:08:56] I don’t think it’s anything specific at our end. Simple calls require only one runner, so they are less likely to fail to acquire that within the time limit than a complex call that requires many. Reducing unnecessarily multiplied calls within test cases is probably “our” only available weapon. But if one implementation is particularly likely to encounter this type of error whilst a different implementation is not, the multiplied calls are what should be a relevant factor when it comes to ranking them.
[09:08:58] Thinking about it, it may be that v2’s lazy evaluation has a negative impact in some cases, because the deferred evaluations might create a spike in the demand for runners, where previously (particularly in a recursive implementation) the demand was spread. That is pure conjecture, however! (re @u99of9: Should they be given a standard time penalty, or a fail in the comparison? What at our end should I envisage causes WASI?)
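
A rough way to picture the runner argument above, as a minimal sketch only. It assumes, purely for illustration, that each runner acquisition fails independently with the same fixed probability within the time limit; nothing in the discussion confirms that the backend behaves this way, and the function name and numbers are hypothetical.

```python
def failure_probability(n_runners: int, p_single: float) -> float:
    """Chance that a call needing n_runners fails to acquire at least one.

    Assumption (hypothetical model): every acquisition fails independently
    with probability p_single within the time limit.
    """
    if n_runners < 0:
        raise ValueError("n_runners must be non-negative")
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    # The call succeeds only if all n_runners acquisitions succeed.
    return 1.0 - (1.0 - p_single) ** n_runners


if __name__ == "__main__":
    # Compare a simple call (one runner) with a call that multiplies its
    # sub-calls (ten runners), using an illustrative 5% failure rate.
    for n in (1, 10):
        print(n, round(failure_probability(n, 0.05), 4))
```

Under that assumed 5% per-runner failure rate, a one-runner call fails about 5% of the time while a ten-runner call fails about 40% of the time, which is why trimming multiplied calls inside test cases would reduce resource errors in this model.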