Blocking AI crawlers on the fediverse

cecep@fedia.io · 9 months ago

ArbitraryValue@sh.itjust.works · 9 months ago

I don’t think that would make much of a difference. Training AI on copyright-protected data appears to be fair use.

FaceDeer@kbin.social · 9 months ago

Yup. There are dumps of Reddit’s entire archive of comments and posts available via torrent. I suspect the only reason Reddit’s getting paid for that stuff right now is that paying is comparatively cheap legal ass-covering. Anyone who’s a little daring could use the dumps to train an LLM, and if they prepped the data well enough it’d be hard to even notice.