If somebody wants to use my online content to train their AI without my consent I want to at least make it difficult for them. Can I somehow “poison” the comments and images and stuff I upload to harm the training process?
If you control the server or platform serving the content, you could look into “robots.txt” and “tarpits.” There are a few tarpits out there, but one example is Nepenthes: https://zadzmo.org/code/nepenthes/
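For the robots.txt part, here is a rough Python sketch of what a file that turns away AI crawlers could look like, plus a quick check with the standard library's parser. The user-agent tokens in `AI_CRAWLERS` are just examples I picked and that list will go stale, and `build_robots_txt` and `is_allowed` are made-up helper names, so treat it as a starting point rather than a complete blocklist.

```python
from urllib.robotparser import RobotFileParser

# Example crawler tokens only (an assumption, not a complete or current list).
AI_CRAWLERS = ["GPTBot", "CCBot", "Google-Extended", "ClaudeBot"]


def build_robots_txt(agents):
    """Return robots.txt text that disallows every path for each listed agent."""
    blocks = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    # Everyone not listed above stays allowed.
    blocks.append("User-agent: *\nAllow: /")
    return "\n\n".join(blocks) + "\n"


def is_allowed(robots_txt, agent, url):
    """Check how a well-behaved crawler would interpret the rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


ROBOTS_TXT = build_robots_txt(AI_CRAWLERS)

if __name__ == "__main__":
    # Save this output as /robots.txt at the root of your site.
    print(ROBOTS_TXT)
```

Keep in mind robots.txt is only a request. Crawlers that respect it will stay out, but ones that ignore it are not stopped by it, which is where a tarpit comes in.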
If you just own the domain and it’s hosted elsewhere, you could set it up to go through Cloudflare DNS. They have a one-button scrape-stopper: https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click/