[13:22:11] Looks like there's an outage ongoing at WMF https://grafana.wikimedia.org/d/1T_4O08Wk/ats-backends-origin-servers-overview?orgId=1
[13:23:37] see -operations
[13:25:46] It is a known issue and being investigated.
[17:31:45] I've a weird question, not sure if on-topic. Say I host a wiki whose content is available under a CC license, and a company is meaning to use the wiki's content and do something with it. They are fully within their rights to scrape the content off of the wiki and use it in accordance with the license terms, right?
[17:32:31] On one hand it feels wrong, on the other hand scraping isn't illegal I guess. :D
[17:33:21] Depends which CC license, but generally yes, that is allowed (and kind of the point) under a CC license
[17:34:20] That said, you probably don't necessarily have to help them do this if you don't want to
[17:34:42] e.g. I don't think you have any legal obligation (IANAL) to allow their robots access
[17:35:12] but generally it would be seen as contrary to the purpose to block a scraper of someone who plans to reuse CC content
[17:36:03] unless they're abusing you (e.g. not throttling their scraping and effectively DOSing your server in the process)
[17:37:00] but yeah entire point of CC is that other people can reuse the content in accordance with whatever terms are set on it (e.g. attribution, sharing any modifications under the same license, ...)
[17:37:21] you don't necessarily need to make this *easy* for them to do, although MW does so via Special:Export
[17:38:08] Makes sense. I guess the only reason it feels wrong to me is because the company in question is a massive corporation, and we're a wee little fan wiki. :D Well, not so little any more I guess, but entirely fan-made and non-profit.
[17:38:44] The license is BY-NC-SA though so they can't commercialize the content... Or maybe they can via an ML loophole, we'll see.
[17:39:20] "non-commercial" just puts restrictions on *how* it can be re-used, not *who* can re-use it
[17:39:41] right
[17:39:42] ML is pretty wild west
[17:40:16] But generally, no matter what copyright license you chose, the company might still have fair use rights
[17:40:28] although where those start and end can be very ambiguous
[17:41:24] people training LLMs generally just ignore the copyright assuming that their training is transformative enough and doesn't retain enough of the original data to qualify under "fair use" in US copyright law. Other copyright regimes have no concept of fair use, and even in the US this interpretation is being challenged in courts, so the legal landscape is still evolving around them
[17:42:35] There's also the pragmatic aspect - small fan wiki is probably not going to sue big company, even if they are in the right
[17:44:12] eh, in the US you can probably get a lawyer to do it with legal fees on contingency if the target is big enough
[17:45:22] if only because the US has the absolutely bonkers statutory damages of up to $250k per infringement (meaning that damages award is completely divorced from how much money the plaintiff actually lost due to the infringement)
[17:46:32] but even then, lot of time/effort/headache