“I think the weakness with this and [Creative Commons’] similar proposal for ‘preference signals’ is that they rely on scrapers to respect these signals out of some desire to be good actors,” White continued. “We’ve already seen some of these companies blow right past robots.txt or pirate material to scrape.”
There’s the problem. Anything that’s publicly available on the internet either has already been scraped, or will be scraped, even Lemmy.
Exactly! As far as I’m concerned, robots.txt should be enough: I tell your bot to stay the hell away, or not, and your bot obeys. What it scrapes for doesn’t matter, IMO.
We don’t need more standards and rules for assholes to ignore, we need assholes to adhere to the rules.
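For reference, the "your bot obeys" part is cheap for a crawler to implement, which is why ignoring it reads as a choice rather than an oversight. Here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the bot names `ExampleScraperBot` and `FriendlyBot`, the `example.com` URLs, and the `may_fetch` helper are all made up for illustration, and nothing in it forces a scraper to run the check in the first place.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is told to stay away entirely,
# every other bot is allowed everywhere (an empty Disallow means "allow all").
ROBOTS_TXT = """\
User-agent: ExampleScraperBot
Disallow: /

User-agent: *
Disallow:
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for bot in ("ExampleScraperBot", "FriendlyBot"):
        print(bot, may_fetch(bot, "https://example.com/post/1"))
```

A well-behaved crawler would call something like `may_fetch` before every request and skip the URL when it returns `False`. The check is purely voluntary on the crawler's side.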