[16:20:02] https://github.com/miraheze/RequestSSL/pull/40#pullrequestreview-2063857578 a:EyesSHAKE: [16:20:30] That's actually not a bad review from a bot... [16:24:19] lol it caught a typo of mine in a method name... [16:31:26] AI code review? [16:31:29] 🧐 [16:32:03] Yep I quite like it lol [16:32:43] Hmmm [16:32:48] Can you run it on my PR? [16:33:06] I was just about to lol [16:33:21] It might blow up [16:33:26] 😉 [16:33:39] If I haven't hit the hourly rate limit trying to run on a bunch of old PRs yet lol [16:33:54] It’s my code. It’ll do that anyway [16:34:06] Is it specifically geared towards MediaWiki? [16:35:03] Nope, but it reads the entire repo to find out what to do. [16:38:08] Hm, what’s the price and limit [16:38:36] It's free for public repos [16:40:13] [1/2] https://github.com/coderabbitai/ai-pr-reviewer [16:40:13] [2/2] Is a slightly older version of the code [16:40:29] I assume ChatGPT limits apply since it pipes the review through there [16:41:59] Hm. Wonder if they could use Copilot, idk if it tends to be better or not [16:43:07] I wish it had options to choose which model [16:43:13] Claude > any other model [16:43:47] Hm? [16:46:55] @cosmicalpha it makes a poem with its code reviews [16:46:58] I love this thing [16:47:29] [1/8] >>> 🐰✨ [16:47:30] [2/8] In the land of wikis, where comments flow, [16:47:30] [3/8] A new rule appears to help them grow. [16:47:30] [4/8] No more repeats, no echoes in the hall, [16:47:30] [5/8] Each word unique, standing tall. [16:47:31] [6/8] So let your thoughts be fresh, anew, [16:47:31] [7/8] In the wiki world, for me and you. [16:47:31] [8/8] 🌟📜 [16:50:15] What? [16:50:28] What is that from? [16:53:25] This is the review on CA’s PR for CW to prevent duplicate comments [16:54:02] Erm [17:30:27] can code review by an AI be truly trusted? [17:31:12] No [17:50:42] No one said it’s the only thing we are relying on [17:53:48] [1/62] On a dark office evening, [17:53:48] [2/62] Sat down in my chair. 
[17:53:49] [3/62] Sharp smell of stale coffee [17:53:49] [4/62] Circling round in the air. [17:53:49] [5/62] Suddenly on the webpage [17:53:50] [6/62] There came a flickering light. [17:53:50] [7/62] My head grew heavy, and my sight grew dim; [17:53:50] [8/62] I had to stop for the night. [17:53:50] [9/62] There it was in the link list: [17:53:51] [10/62] "Edit page; you'll do well" [17:53:51] [11/62] And I was thinking to myself: [17:53:52] [12/62] This could be Heaven or this could be Hell! [17:53:52] [13/62] Then it lit up the quickbar, [17:53:53] [14/62] And it showed me the way. [17:53:53] [15/62] There were pages begging for clean-up; [17:53:54] [16/62] I thought I heard them say: [17:53:54] [17/62] Welcome to the Hotel Wikipedia [17:53:55] [18/62] Such a lovely place [17:53:55] [19/62] So much empty space [17:53:56] [20/62] Plenty of work at the Hotel Wikipedia [17:53:56] [21/62] Any time of year [17:53:57] [22/62] You can find us here... [17:53:57] [23/62] Its structure's maze-passage twisted; [17:53:58] [24/62] No one knows where it ends. [17:53:58] [25/62] It's got a lot of money mirror sites, [17:53:59] [26/62] That it calls friends. [17:53:59] [27/62] And in the dance of the pages [17:54:00] [28/62] Editors sweat - [17:54:00] [29/62] Some change to remember, [17:54:01] [30/62] Some change to forget. [17:54:01] [31/62] So I chose Contributions, [17:54:02] [32/62] Tell me, what have I done? [17:54:02] [33/62] And it said: [17:54:03] [34/62] This is all that you've been good for, here, [17:54:03] [35/62] since two thousand and one. [17:54:04] [36/62] And still those pages beg changes [17:54:04] [37/62] From far away, [17:54:05] [38/62] Keep you up in the middle of the night [17:54:05] [39/62] Just to hear them say... 
[17:54:06] [40/62] Welcome to the Hotel Wikipedia [17:54:06] [41/62] Such a lovely place [18:00:44] I LOVE HOTEL WIKIPEDIA [18:02:10] https://meta.wikimedia.org/wiki/Hotel_Wikipedia [18:08:26] One song Miraheze karaoke [18:16:12] Time to set up Hotel Miraheze for a quick extra source of income [18:16:16] Who wants to be our first investor? 😄 [18:31:22] Me [18:33:15] Yeah they definitely won't be relied upon but it does provide some pretty decent feedback [18:33:32] https://github.com/miraheze/CreateWiki/pull/511#pullrequestreview-2064141701 @pixldev [18:33:38] Oh boy [18:34:03] nah the feedback on my RequestSSL PR was pretty bad [18:34:13] "make sure you actually are using the services" [18:35:10] The comment about the User::newSystemUser doesn't apply [18:35:39] https://github.com/miraheze/CreateWiki/pull/511#discussion_r1605409202 but I did though.. :( the XSS one is good, assuming it’s right that MediaWiki doesn't escape stuff [18:35:44] huh? It is a step by step. It does apply actually and something I'd recommend myself. Duplicating code is bad. Using variables to store it once is good. [18:36:03] Also, is canEditRequest() a real function I just missed? [18:36:14] due to how the logic works it will only ever be called once anyway [18:36:27] No it is saying to make it [18:36:32] so there are no multiple instantiations [18:36:56] the code also is checking for DNS failures [18:36:57] Ah [18:37:01] 1/10 review [18:37:02] Well yes that isn't the problem. Readability is what it was asking about. [18:37:24] Also it seems confused on the purpose. It thinks it’s for all users not just reviewers [18:37:33] "to avoid multiple instantiations" https://github.com/miraheze/RequestSSL/pull/40#discussion_r1605255611 [18:37:53] Tell it that in comments, it will learn in the org btw [18:38:01] I assume it’ll become more familiar with the MediaWiki codebase as we yell at it yeah [18:38:49] Yes it can learn and I'm building custom instructions for it also to train it on MW standards. 
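The "avoid multiple instantiations" suggestion being debated above is the usual store-it-once refactor: call the lookup one time, keep the result in a local, and reuse it. A generic sketch (hypothetical names for illustration only, not the actual RequestSSL code):

```python
# Hypothetical illustration of the review suggestion: repeated service
# lookups scattered through a method, versus one lookup stored in a local.
class Services:
    """Stand-in for a service container; counts how often a factory is fetched."""
    def __init__(self):
        self.calls = 0

    def get_user_factory(self):
        self.calls += 1
        return "UserFactory"

services = Services()

# Before: each use repeats the lookup inline.
a = services.get_user_factory()
b = services.get_user_factory()
assert services.calls == 2  # two lookups for two uses

# After: one lookup, stored once, reused everywhere; easier to read too.
services.calls = 0
user_factory = services.get_user_factory()
a, b = user_factory, user_factory
assert services.calls == 1  # single lookup regardless of uses
```

As noted in the chat, the win here is mostly readability rather than performance when the call only ever happens once anyway.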
[18:39:05] Just like me [18:39:51] "and improve code clarity." that is the part I agree with. I agree with you that it isn't 100% right on the other parts. [18:40:53] [1/8] >>> 🐇 [18:40:53] [2/8] A threshold now we set with care, [18:40:53] [3/8] To manage wikis, fair and square. [18:40:54] [4/8] A warning comes when counts are high, [18:40:54] [5/8] To keep our requests from reaching the sky. [18:40:54] [6/8] With thoughtful code and messages clear, [18:40:54] [7/8] Our system runs with less to fear. [18:40:55] [8/8] 📝✨ [18:40:58] This alone is enough to keep it ngl [18:45:44] Hmm the XSS thing it found seems to be valid though @pixldev since it uses raw => true, it would be hard to produce but still technically valid I believe. [18:47:15] Can a username trigger XSS? [18:47:31] But ye better to be safe [19:57:40] yeah I pretty much have a zero tolerance policy for LLMs so forgive me for being extremely firm in expressing my dissent here [19:58:38] LLMs waste an enormous amount of physical resources and have literally no ability to understand any subject matter, only spew out tokens, and any good feedback is hidden among a sea of bad or subtly wrong feedback [19:59:17] testing on small PRs that are easy to review is not a great test; LLMs are best when problems are within their extremely limited knowledge set. go outside that even slightly and I'm sure it'll spew nonsense [20:00:14] have genuinely wanted to get more involved on the tech side but had an enormous amount of stuff crop up IRL which is why I haven't even really paid attention to these channels >< [20:00:49] so sorry for starting off being firm and negative [20:01:59] The tool is simply that, a tool. It won't replace developer due diligence, but it will help in pointing out where things could be improved. 
[20:03:08] I think that labelling it a tool is incredibly misleading because it is categorically unable to do what it claims to do [20:03:27] LLMs cannot analyse anything, end of story [20:03:50] They can analyse, the issue imo is with the comprehension of it [20:04:10] although at this level the exact term doesn't matter [20:05:02] no, saying that LLMs can comprehend or analyse anything is categorically misunderstanding what LLMs do. they spew out a probabilistic sequence of tokens that immediately follow a previous sequence of tokens. just because these tokens resemble suggestions doesn't mean that they have any understanding of the actual context at hand [20:05:04] that is your opinion. [20:05:11] nobody asked it to analyse everything. [20:05:26] that's literally what it is claiming to do, analyse the code and provide feedback [20:05:39] like what is it doing if not analysis [20:05:41] there's a difference between analysing a commit and "analysing everything" [20:05:52] apparently ChatGPT has been taught to generate valid wikitext templates... [20:06:07] or at least the code works [20:06:10] Like I've just said, it's to assist with picking out small things that could be improved, not write MediaWiki 1.44 [20:06:42] I mean, you have to understand the project and everything it does in order to provide useful feedback. maybe not everything, all the time, but if you want it to provide feedback for any arbitrary PR, it will have to do that over the course of every PR submitted [20:07:24] [1/2] > it will have to do that over the course of every PR submitted [20:07:24] [2/2] good job it doesn't need sleep then. [20:07:35] I don't see how that nullifies any of my points [20:07:50] a single human who knows nothing of the project could provide feedback to every PR but I would say that's more harmful than helpful [20:08:19] Feedback that's wrong can be ignored [20:08:39] Because developers make mistakes sometimes. 
I really don't know how you are not understanding, it's to pick up on small things that can be changed. [20:08:55] I knew like nothing about the CW codebase and still managed to catch an issue that flew by both the Director and Deputy director of tech [20:08:58] my entire point is it literally cannot pick up on the small things. the small things are exactly the things it misses [20:09:05] it picks up big things only [20:09:16] We use tools all of the time to gain insight [20:09:36] All have to be trialed [20:09:43] that's your opinion but I'm clearly talking to myself so I'm not engaging with the conversation anymore [20:09:48] Some we scrap, some we keep [20:10:18] We also need to see if it gets any better when we feed it more data about the codebase [20:10:22] I don't think dismissing it because it includes AI is a good idea [20:10:36] it's not my opinion that LLMs cannot do things. it is an objective fact about how they work. the fact that people claim that they're "analysing" or "understanding" or "comprehending" or "responding" to anything is an anthropomorphism done intentionally to market them as a valuable tool [20:11:32] if it were simply a harmless thing, I would not be responding so strongly. but Microsoft alone was responsible for 30% of all global water usage in 2023 cooling server racks in data centres training AI. all for just, ChatGPT to maybe, sometimes provide feedback to your code [20:11:40] I agree it has its limitations [20:11:45] clearly has not been listening, the tool has already caught things that were missed... [20:12:12] But it has provided additional insight and detected things that have been missed [20:12:35] were they missed? or did everyone just ignore the PR because it was labelled as incomplete, waiting for it to be finished before providing feedback? [20:12:56] like, that's literally why I didn't even bother looking over it. 
it was labelled as partial and incomplete [20:13:07] They were missed [20:13:31] I think that's a strong assertion, since in the absence of someone proving they had reviewed the PR already, we cannot possibly know whether someone would have caught the mistakes [20:13:32] The code on @reception123's PR is mostly duplicated from another script [20:13:44] I have reviewed it before [20:13:53] okay, I did not know this, so, that's fair [20:13:56] [1/4] To separate out intertwined lines of debate here, your complaint seems to be threefold: [20:13:56] [2/4] * LLMs are resource-intensive and wasteful [20:13:56] [3/4] * LLMs should not be used as a substitute for human oversight [20:13:57] [4/4] * LLMs underperform human review/give limited benefit [20:14:14] I have been meaning to move it to a Python package to avoid the duplication [20:14:17] thank you, NotAracham. that summarises my points nicely [20:14:37] And @pixldev's was reviewed [20:14:49] It did catch a valid XSS [20:14:52] again, I'm not denying the fact that it can provide good feedback sometimes. I am denying the claim that it will provide more benefit than harm over the course of its usage [20:14:54] yeah [20:15:11] We can't decide that on a single PR [20:15:20] Your view is prejudicial [20:15:27] I am categorically stating that all LLM/ChatGPT-based tools will provide more harm than benefit over the course of their usage [20:15:35] Yes you are [20:15:41] And I disagree [20:15:46] They have a place [20:16:03] you're saying that 30% of all global water usage is enough to justify its current, extremely limited benefits? [20:16:05] We can't stop big tech from running all these AIs and there's not really any harm directly to us. 
I am not going to argue if AI may have a net negative on the world but frankly we can't do much about it here [20:16:26] Where did you read that [20:16:59] The water is being wasted anyways, if anything it would be more wrong to let it be wasted for nothing instead of using it, even if we'd rather it not be wasted at all [20:17:08] that's a pretty defeatist mentality [20:17:18] if something doesn't provide a lot of benefit you can choose to just not use it on principle [20:18:07] Ethics and sustainability are worth considering [20:18:10] [1/5] I'm very much agreed on the first two points of debate, but my thoughts [20:18:10] [2/5] * For most models, the performance per watt is improving substantially but there's absolutely a long road to go towards anything I'd say is ethical consumption [20:18:10] [3/5] * I would like more information on this water usage element, not quite aligned with my understanding of resource intensiveness... [20:18:11] [4/5] * Non-profit status is at least negating misuse of funds, as our usage today is free to my understanding [20:18:11] [5/5] * At least at present, intended use is not as a replacement for human oversight (or wholesale code creation) but as another tool to ensure safe and performant code goes out the door. [20:18:20] But this is a very black and white debate at the moment [20:18:32] okay, I definitely was wrong there, I should be using proper statistics [20:18:39] Microsoft increased its own water usage 30%, not globally [20:18:48] CodeRabbit Pro is free for open source [20:18:55] that's far less horrific [20:19:17] however, they did use 22 billion litres of water in 2022, which is still an extremely large amount [20:19:31] Still worth keeping in mind, though I'd like to understand nature of use, e.g. if it's for cooling purposes is it actually destructive use? 
[20:19:38] global levels, however, are on the order of 10^15 L/year [20:19:44] whereas 22 billion is 10^9 [20:19:52] Destructive use of water isn't really possible [20:20:06] uhhhh, thermodynamics would like to have a word [20:20:26] freshwater is definitely expendable [20:21:01] and note that this is also used to offload heat, which ultimately ends up in the atmosphere, contributing to climate change [20:21:07] Poor choice of terms on my part, but what I mean is things like industrial usages where heavy remediation is required to return output to something approaching potable. [20:21:37] right, and note that heat is arguably more concerning since 22 billion litres of water has… a massive thermal capacity [20:23:12] [1/2] again, I've been arguing so strongly on this because all of the large companies have been undergoing a massive campaign to make people believe that their exponentially increasing resource use on LLMs is creating "AI", something that can actually think and process things, which is categorically untrue. and I do not want to cede any ground on this, because all it does is feed into their [20:23:13] [2/2] scams and profits and resource waste [20:25:20] [1/3] Agreed that most AI claims are crypto-level scammy on what it can actually do versus claims, but I have to get back to other things this afternoon. [20:25:20] [2/3] Neat relatively-recent paper on resource usage of LLMs for those that find the topic interesting and aren't scared off by studies. [20:25:21] [3/3] https://arxiv.org/pdf/2310.03003 [20:26:19] (notwithstanding Microsoft increasing water usage, it's not like it's going to waste, the water will be cooled and used again) [20:26:25] the LLM isn't drinking the water [20:26:39] The bits, they are thirsty. 
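The orders of magnitude quoted above can be sanity-checked with a quick calculation. This is a rough sketch: the 22 billion litre figure is the one cited in the chat, while the ~4 × 10^15 L/year global freshwater withdrawal and the 4186 J/(kg·K) specific heat of water are assumed ballpark constants, not figures from the conversation:

```python
# Rough sanity check of the magnitudes discussed above.
MS_WATER_L = 22e9            # ~22 billion litres (Microsoft, 2022), per the chat
GLOBAL_WITHDRAWAL_L = 4e15   # assumed global freshwater withdrawal, ~4 x 10^15 L/year

share = MS_WATER_L / GLOBAL_WITHDRAWAL_L
print(f"Share of global withdrawal: {share:.6%}")
# a few ten-thousandths of a percent, consistent with the 10^9 vs 10^15 point

# Thermal capacity: energy needed to raise that much water by 1 K.
# Assumes 1 L of water is ~1 kg and specific heat ~4186 J/(kg*K).
SPECIFIC_HEAT_J_PER_KG_K = 4186
joules_per_kelvin = MS_WATER_L * 1.0 * SPECIFIC_HEAT_J_PER_KG_K
print(f"Energy per 1 K temperature rise: {joules_per_kelvin:.2e} J")
# on the order of 10^14 J per kelvin, i.e. "a massive thermal capacity"
```

This supports both sides of the exchange: the volume is tiny relative to global withdrawal, yet the heat that volume can carry is still enormous in absolute terms.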
[20:26:58] [1/2] don't wanna interrupt this but just to check, where would the ->escaped() go, idk if it may mess with the parse() [20:26:58] [2/2] https://cdn.discordapp.com/attachments/1006789349498699827/1241124827025375302/XJ0Vi6q.png?ex=66490f11&is=6647bd91&hm=2eb10638ab915bab3df4edf2915ed4cefae2ddbe156e45e76d071f74dee97685& [20:27:05] nah, you can interrupt [20:27:27] you wouldn't use parse and escape together [20:27:27] shouldn't even tbh [20:27:28] one or the other [20:27:47] hmmm [20:28:14] what would the right way to parse but also escape any possible bad chars in a username be? [20:29:18] You would generally use "parse" on `Html::rawElement` [20:29:45] which would take care of passing it through the sanitizer [20:29:50] oh [20:30:18] https://cdn.discordapp.com/attachments/1006789349498699827/1241125668755079188/8CXtc2h.png?ex=66490fda&is=6647be5a&hm=8c7ad9abf88e1ae56f54cab8f4526e65d9433e7e60e710d8ae397bbec413524a& [20:31:01] [1/4] you only generally need `->escaped()` if you're constructing your own html like: [20:31:01] [2/4] ```$html = "<pre>This is a text block, hello" . wfMessage('whatever', $user)->escaped() . "</pre>"; [20:31:01] [3/4] ``` [20:31:02] [4/4] or whatever, that is a horrid example [20:31:29] also, sorry for the debate that was more heated than productive [20:31:57] although I will still stand by my points, I don't think it's productive to argue them more here, especially at the moment [20:40:52] [1/2] also as an aside, I should probably hold myself more accountable here instead of just putting this off, but one of the things I was hoping to go about doing is a bigger deep-dive into the tech stack for MH/WT and actually putting stuff up on the tech wiki in more detail. we have a lot of sparse documentation and it's not really linked directly to the code, and there's also a lot of stuff [20:40:52] [2/2] that's probably not documented at all, and I was hoping to try and make a dent on what we have so it's easier for folks to contribute to stuff [20:41:00] that's a pretty large project though, which is why I've hesitated to get started [20:49:42] there's a certain point where that can become actually counterproductive in terms of what should be public information [20:50:03] how so? I thought effectively all of the wiki stack was public information, since even the Puppet config is open-source [20:50:26] like I'm not gonna put admin passwords on the wiki, especially since I don't have them [20:50:46] I disagree tbh, obviously there's private documentation and credentials [20:50:53] but overall I see no harm [20:51:03] (I am completely omitting credentials here, which feels obvious to say, but will say regardless) [20:51:37] tech stack info should honestly just be for tech folks [20:52:16] (that is why it goes in the tech namespace :p) [20:55:59] Absolutely a worthy endeavor, it's a hard thing to get volunteer hours for but makes lives easier for all [20:56:20] yeah, although my excuse right now is I'm still unemployed, for better or worse [20:56:22] :p [21:01:00] That’s your opinion. 
I disagree, and as a farm and technology team we do it differently [21:01:35] This goes against our fundamental principles [21:01:46] yeah, as mentioned, most of the stack's info is made public; the means to modify the stack is still gated behind trusted individuals, but what it does is pretty open [21:02:49] not to mention that if things were more documented, it would help individuals hosting their own wikis learn more about some of the things required for scaling up to a larger size, even though the jump to a wiki farm is not a jump many can make [21:09:37] @pixldev @rhinosf1 noted [21:12:35] https://tenor.com/bVclm.gif [22:41:24] [1/2] @notaracham ... Can you have a look at this? [22:41:24] [2/2] https://meta.miraheze.org/wiki/Community_portal#Wiki_request_contains_%22invalid_character%22 [23:51:14] @koreirose @notaracham what is the issue there? [23:51:39] gm? [23:52:08] Not sure why you were pinging collei, but the linked user above was getting an "invalid character" error any time they tried to submit a wiki request, no matter which way they went about it. [23:53:07] I've created the wiki on their behalf as they've already explained what they wanted sufficiently, but that mostly just works around the problem. [23:54:37] lol sorry darn auto select [23:55:45] Sorry for random ping, Collei
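The escape-then-wrap ordering discussed earlier (around `->escaped()`, `Html::rawElement`, and the username XSS concern) can be sketched in a minimal Python analogue. The names here are hypothetical stand-ins, not MediaWiki's actual API; the point being illustrated is only the ordering: escape the untrusted value first, then wrap it in trusted markup, and never feed unescaped user input into a raw-HTML helper:

```python
import html

def raw_element(tag: str, inner_html: str) -> str:
    """Analogue of a raw-HTML helper: trusts inner_html as already-safe markup."""
    return f"<{tag}>{inner_html}</{tag}>"

def render_username(username: str) -> str:
    # Escape the untrusted value FIRST, then wrap it in trusted markup.
    safe = html.escape(username)
    return raw_element("b", safe)

# A malicious "username" stays inert instead of injecting a tag:
print(render_username('<img src=x onerror=alert(1)>'))
# -> <b>&lt;img src=x onerror=alert(1)&gt;</b>
print(render_username('alice'))
# -> <b>alice</b>
```

Doing it in the other order (wrapping first, escaping the whole string afterwards) would mangle the trusted markup, which is why the chat's advice is one or the other per value, never both on the same string.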