[04:03:41] very cool! It does not cover the funny thing I just found :) But I'm really enjoying diving into MW, thank you. (what I found: null edits are published via PageSaveComplete -> EventBus -> Kafka just the same as any other edit, and apparently this happens a *lot*... trying to find a way to identify this that's universal now...) [04:03:41] https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/refs/heads/master/includes/Storage/PageUpdater.php#1627 [10:58:00] milimetric: do you mean revisions that have an empty diff (with entry on history page), or edit attempts that ended up no-oping (no entry in history page). [10:58:40] Both are sometimes called null edits. The former happens usually via non-edit actions such as protect, import, etc [11:11:27] milimetric: looks like you can use EditResult::isNullEdit, similar to what WikimediaEvents does already for the same hook. https://gerrit.wikimedia.org/g/mediawiki/extensions/WikimediaEvents/+/306cf12610fb167da4387f83ae6743989803f13b/includes/WikimediaEventsHooks.php#167 [11:13:18] The docs of isNullEdit threw me off at first since it says "revision is the same as parent" is ambiguous (same rev ID, or textually same content?), but indeed it is about same revision ID, I'll update the docs to clarify. Cases that force a no-change revision to be added are not considered null edits. https://gerrit.wikimedia.org/g/mediawiki/core/+/a2289ef5a4c5f7e80ea217e5d6ade4a13d9ce5ec/includes/Storage/EditResultBuilder.php#120 [11:15:18] It might be easier to use the onRevisionFromEditComplete hook instead which avoids the null edit cases I believe. But it's worth checking more widely to make sure it's fired in otherwise all the same circumstances. I'm guessing for your use case you don't need events for each historically imported revision during a batch import? [11:33:59] thanks Krinkle, looks like I have my research for today. I think we might be ok catching all edits as long as we can flag these weird cases so consumers of the events know what to do. But indeed, some consumers would want the no-op edits and some would not, while I can't think of anyone that would want the null edit that doesn't make an entry in the history. And there are soo many of them [11:34:39] A lot of bots seem to trigger them too, which makes me feel like there should be a better api interface to do whatever they're trying [11:37:35] milimetric: yeah, null edits in the sense of EditResult only serve for purging and links update regeneration. They should not leave the system in any way into public streams I think. [11:38:22] Yeah, people would just subscribe to link updates if they needed those [11:38:49] Also in part for privacy and correctness as it would become the only recorded place detailing them, either attributing to a user that left no on wiki trail, or to the previous edit and incorrrctkt suggesting the person who edited it weeks ago is active now when they weren't [11:39:24] Afaik event bus uses to hook into revision insert [11:39:43] I wonder why that changed. Maybe be worth some git digging [11:40:36] There is also the PageUpdater refactor that maybe led to a deprecation mass change updating it incorrectly to a slightly different hook [15:26:06] well, they're using https://www.mediawiki.org/wiki/Manual:Hooks/PageSaveComplete because I think they want to capture all page state changes that affect the current version of a page, so that would include historically imported revisions and most weird cases, but definitely not null edits. [15:26:11] I'm sending a change to https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/EventBus/+/refs/heads/master/includes/HookHandlers/MediaWiki/PageChangeHooks.php#246 [16:03:18] milimetric: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventBus/+/821776 [16:03:31] https://gerrit.wikimedia.org/g/mediawiki/extensions/EventBus/+/refs/changes/76/821776/30/includes/EventBusHooks.php#295 [16:03:40] This is how it used to be, checking isNullEdit() [16:04:13] the patch adds PageChangeHooks.php which doesn't perform that check [16:04:43] not sure if that was intentional on Andrew's part, I'm guessing not :) [18:25:58] Krinkle: I sent this patch https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventBus/+/939317 and I'm not sure who usually reviews these things since the history shows lots of different people. Mostly I wonder if people agree with Andrew that maybe we should log them or me that we should just completely ignore, otherwise it seems like a safe change to make and deploy [18:35:26] milimetric: I don't know. We've had have a dozen reorgs since the extension was created with not one manager at any level in the affected chains still employed here. Seems like perhaps a fairly small and standalone extension for a team outside MSG (MediaWiki Services Group) to pick up. Could be yours. [18:36:33] I'm happy to help with any Phab automation as needed to ensure triage of new tasks is easy and centralised for your team. And whatever else you might need. [18:37:19] The alternative is to raise to VP level with a case for why you don't want to see it undeployed. [18:39:23] oh, good point, this should go through proper process [19:42:19] EventBus could also be merged into core, as was promised multiple teams ago :) (not that that would address the maintainership issue) [19:58:22] legoktm: What kind of benefits would you expect that to bring? I worry that would either hardcoding, or add significant complexity/flexibility/indirection, as it would need to manage its schemes somehow and format/dispatch events. We could make that pluggable and keep the schemas wmf-specific separately based on a documented PHP value format. But that's basically what we have now via hooks and their params. [19:59:23] We could potentially formalise it a bit and recognise which hooks we consider event-like, by using EventRelayer (as for purges already), or through the Action/Filter hooks RFC.