Well, you’ve got a timestamped copy of much of the Web as it existed up until latent-diffusion models came along, over at archive.org. That may not give you access to newer information, but it’s a pretty whopping big chunk of data to work with.
Hopefully archive.org have measures in place to stop people from yanking all their data too quickly. At least not without a hefty donation or something. As a user, I find the site can chug a bit, and I’m hoping that’s the rate-limiting I’m talking about and not that they’re swamped.
That would go against the principle of the archive imo, but regardless, if you take away all means of acquiring data freely, you are just giving companies like OpenAI and Google, who already have copies of it, an insane advantage.
AI isn’t going away. We need to make sure we have free access to it so as not to give our whole economy to a handful of companies.