- cross-posted to:
- reddit@lemmy.ml
- technology@beehaw.org
Generative AI has really become a poison. It’ll be worse once the generative AI is trained on its own output.
You’re two years late.
Maybe not for the reputable ones, that’s 2026, but these shysters have been digging out the bottom of the swimming pool for years.
New models already train on synthetic data. It’s already a solved problem.
Is it really a solution, though, or is it just GIGO?
For example, GPT-4 is about as biased as the medical literature it was trained on, not less biased than its training input, and thereby more inaccurate than humans:
https://www.thelancet.com/journals/landig/article/PIIS2589-7500(23)00225-X/fulltext
Here’s my prediction. Over the next couple decades the internet is going to be so saturated with fake shit and fake people, it’ll become impossible to use effectively, like cable television. After this happens for a while, someone is going to create a fast private internet, like a whole new protocol, and it’s going to require ID verification (fortunately automated by AI) to use. Your name, age, and country and state are all public to everybody else and embedded into the protocol.
The new ‘humans only’ internet will be the new streaming and eventually it’ll take over the web (until they eventually figure out how to ruin that too). In the meantime, they’ll continue to exploit the infested hellscape internet because everybody’s grandma and grampa are still on it.
I would rather wade with bots than exist on a fully doxxed Internet.
Yup. I have my own prediction - that humanity will finally understand the wisdom of PGP web of trust, and using that for friend-to-friend networks over Internet. After all, you can exchange public keys via scanning QR codes, it’s very intuitive now.
That would be cool. No bots. Unfortunately, corps, govs and other such mythical demons really want to be able to automate influencing public opinion. So this won’t happen until the potential of the Web for such influence is sucked dry. That is, until nobody in their right mind would use it.
That sounds very reasonable as a prediction. I could see it being a pretty interesting black mirror episode. I would love it to stay as fiction though.
I called this shit out like a year ago. It’s the end of any viable online searching having much truth to it. All we’ll have left to trust is YouTube videos from Project Farm.
I ran into this issue while researching standing desks recently. There are very few places on the internet where you can find verifiably human-written comparisons between standing desk brands. Comments on Reddit all seem to be written by bots or people affiliated with the brands. Luckily I managed to find a YouTube reviewer who did some real comparisons.
It kinda seems like the end of the Google era. What will we search Google for when the results are all crap? These are the death gasps of the internet I/we grew up with.
Maybe web rings of the 90s were not such a bad idea! Let’s bring 'em back!
Gemini webrings are the future?
They would poison that shit as well unfortunately. The concept is great though.
Eh, how’d you do that?
Do what? Webrings?
How do you poison them.
Create sites that look like legit websites, then slowly ramp up the bullshit. Same tactic as always.
Remember when you could type a vague plot of a film you’d heard about into Google and it’d be the first result?
Nah doesn’t work anymore
Saw a trailer for a French film so I searched “french film 2024 boys live in woods seven years”
Google - 2024 BEST FRENCH FILMS/TOP TEN FRENCH FILMS YOU MUST SEE THIS YEAR/ALL TIME BEST FRENCH MOVIES
Absolute fucking gash
I’ve not been too impressed with Kagi search, but at least the top result there was “Frères 2024”
Remember when you could type a vague plot of a film you’d heard about into Google and it’d be the first result?
I honestly don’t remember this at all. I remember priding myself on my “google-fu” and knowing how to search to get what I, or other people, needed. That usually required understanding the precise language you needed to use, not something vague. But over the years it’s gotten harder and harder, and now I get frustrated with how hard it has become to find something useful. I’ve had to go back to finding places I trust for information and looking through them.
Although, ironically, I can do what you’re talking about with AI now.
I honestly don’t remember this at all.
It was absolutely a thing and one of the reasons Google became wildly popular at first
When?
TUESDAY
I feel old, and I’m 28.
Cause in my early childhood in 2003-2007 we would resort to search engines only when we couldn’t find something by better (but more manual and social) means.
Because - mwahahaha - most of the results were machine-generated crap.
So I actually feel quite uplifted by people predicting that the Web will get back to that norm.
When the internet is eventually oversaturated with smartbots, where will the humans go?
I think there will be captchas everywhere.
The usefulness of Captchas is being destroyed by “AI” too. And ironically they were used to train certain types of Machine Learning.
Peer-to-peer systems? Systems where you have to physically be at the location to get data, maybe, so cyber-cafe-like things. Or back to the old system: go to the regular bars, repair cafés or hobby places.
Group chats.
To a new social media platform where you have to send in a DNA sample to create an account.
That creates a market for morticians and midwives creating preauthenticated accounts to sell to bot farms
( ͡°╭͜ʖ╮͡° )
Synchronous spaces.
Social VR does not have a lot of the ills of social media. You only have to deal with people much like you would IRL.
It’s gross, but also inevitable. If there’s an untapped niche to make money from, somebody’s going to try it – plus if they want to waste their money on generating accounts only to have them be banned, then so be it.
Makes me kinda thankful that this community is smaller and less likely to be targeted by this sort of crap.
What’s funny is I think it would be profitable for maybe, like, a year, before everyone starts doing it and then even normal people stop trusting reddit comments.
It’s like pissing in a pool to sell people soap. What’s the plan once people stop using the pool?
Buy a new pool and piss in it again to sell new soaps.
By the time that the cow is bled dry, someone is stuck holding the bag while some people made out like bandits.
That is the stock market for you. Create no value, just wealth transfer.
Create no value, just wealth transfer.
In this case it’s creating a kind of anti-value - harm, I guess.
Also I bow to your superior and brazen use of mixed metaphors. You got double what I did. “Bleeding” a cow dry? It adds impact over the usual “milking” even!
Milking assumes that you don’t kill the cow, which isn’t the case here.
Some people specialize in getting hired at startups to prop them up so they can be sold for a quick buck.
Then they move on to the next startup: wash, rinse, and repeat. It says a lot about the state of innovation.
Innovation’t 😒
Doesn’t mean that the fediverse is immune.
News stories and narratives are still fought over by actors on all sides and sometimes by entities that might be bots. And there are a lot of auto-generating content bots that post stuff or repost old content from other sites like Reddit.
Especially since being immune to censorship is kind of the point of the fediverse.
If you’re even a tiny bit smart about it, you can start hundreds of sock puppet instances and flood other instances with bullshit.
Can’t some instances make some sort of agreement and have a whitelist of instances to not block? People would need to register to add their instances to the list, and some common measures would be applied to restrict someone from registering several instances at once, and banning people who misuse the system.
That wouldn’t solve the problem, but perhaps would make things more manageable.
You can’t block people. How would you know who registered the domain?
What you’re proposing is pretty similar to the current state of email. It’s almost impossible to set up your own small mail server and have it communicate with the “mailiverse”, since everyone will just assume you’re spam. And that led to a situation where 99% of people are with one of the huge mail providers.
You’re right, the matter is more complicated than I thought…
It’s extremely complicated and I don’t really see a solution.
You’d need gigantic resources and trust in those resources to vet accounts, comments, instances. Or very in depth verification processes, which in turn would limit privacy.
What I actually found interesting was bluesky’s invite system. Each user got a limited number of invite links and if a certain amount of your invitees were banned, you’d be banned/flagged too. That creates a web of trust, but of course also makes anonymous accounts impossible.
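For anyone curious how that kind of invite accountability could work, here is a rough sketch in Python. It is not Bluesky's actual implementation, which I have not seen; the class name `InviteGraph`, the invite limit, and the 50% threshold are all made up for illustration.

```python
from collections import defaultdict


class InviteGraph:
    """Toy invite tree (hypothetical, not any real platform's code)."""

    def __init__(self, max_invites=5, ban_threshold=0.5, min_invitees=2):
        self.max_invites = max_invites      # assumed per-user invite limit
        self.ban_threshold = ban_threshold  # assumed share of banned invitees
        self.min_invitees = min_invitees    # don't judge on a single invitee
        self.inviter = {}                   # user -> who invited them
        self.invitees = defaultdict(list)   # user -> users they invited
        self.banned = set()
        self.flagged = set()

    def add_root(self, user):
        # Seed accounts have no inviter.
        self.inviter[user] = None

    def invite(self, inviter, new_user):
        if inviter not in self.inviter or inviter in self.banned:
            raise ValueError("inviter is unknown or banned")
        if new_user in self.inviter:
            raise ValueError("user already exists")
        if len(self.invitees[inviter]) >= self.max_invites:
            raise ValueError("no invites left")
        self.inviter[new_user] = inviter
        self.invitees[inviter].append(new_user)

    def ban(self, user):
        # Ban the user, then check whether their inviter crosses the threshold.
        self.banned.add(user)
        parent = self.inviter.get(user)
        if parent is None:
            return
        kids = self.invitees[parent]
        bad = sum(1 for k in kids if k in self.banned)
        if len(kids) >= self.min_invitees and bad / len(kids) >= self.ban_threshold:
            self.flagged.add(parent)


# Usage: alice invites three accounts; two of them turn out to be spammers.
graph = InviteGraph(max_invites=3)
graph.add_root("alice")
for name in ("bob", "carol", "dave"):
    graph.invite("alice", name)
graph.ban("bob")    # 1 of 3 banned: below the threshold
graph.ban("carol")  # 2 of 3 banned: alice gets flagged for review
```

The sketch only flags the inviter rather than banning them outright, and it only looks one level up the tree. The privacy trade-off is visible in the data structure itself: every account is permanently linked to whoever vouched for it.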
I try to avoid talking about how indefensibly terrible Lemmy’s anti-spam and anti-brigading measures are for fear of someone doing something with the information. I imagine the only thing keeping subtle disinfo and spam from completely overtaking Lemmy is how small its reach would be. Doing the same thing to Reddit is a hundred times more effective, and systemically accepted. Reddit’s admins like engagement.
Put in those tickets. It’s a community effort y’know.
It’s an arms race and Lemmy is only a small player right now so no one really pays attention to our little corner. But as soon as we get past a certain threshold, we’ll be dealing with the same problems as well.
I feel the same about a lot of Fediverse apps right now. They’re kinda just coasting on the fact that they’re not big enough for most spammers to care about. But they need to put in solid defenses and moderation tools before that changes.
Another reason to block federation with Threads.
Meta will likely actually moderate against spambots because they want you to fucking pay them for that service. The problem is, they aren’t too interested in moderating hate speech.
So, you’re suggesting that it is better that they are profiting from helping state actors and hate groups?
I don’t think I made a value statement whatsoever. I think calling it a problem and hate speech would’ve been enough of a clue as to how I felt about it, however.
It’s actually why I support most instances defederating from them
Meta has the most resources to combat spam and abuse.
And the least demonstrated desire to do so.
The only reason reddit was valuable was because it was from real people who weren’t paid off. Well that’s ruined now.
I wanted to figure out what game hosting sites were good and Google pointed me to reddit…every thread was full of boilerplate ads for different sites. The comments were the most obvious, marketing-approved sentences I’ve ever seen
Everything I can find online seems to be advertisements or paid reviews (Also advertisements) when looking for anything anymore. Businesses are terrified of an open honest conversation about what is good and what is not
I so don’t understand how to run a business.
- Spend $Billions shoving advertising down everyone’s throats? Absolutely!
- Just make a good product and provide good customer support? It will never work!
Option 1 is easy and any idiot can throw money at it to solve the problem. Option 2 requires talented people and real effort.
- If you’re terrified of honest conversations, your product is probably shit.
Marques Brownlee had a video recently about the question “do bad reviews kill products?” that highlights the issue well
Exactly. Every company is terrified of honest conversation since it makes putting out shit harder.
Yeah, I’ve noticed that a bit lately anyways. Maybe I’m looking up stuff that has less of a community on Reddit, and thus has less discussion, but I have absolutely noticed some comments have a single product name-drop with little clarity for why they liked the product. It starts to feel like they’re just ads (generated or otherwise) meant to trick you into thinking Reddit users are liking the product.
AI is just going to make it worse, and cause Reddit to stop being a good go-to for actual reviews and discussion of pros/cons.
The first obvious wave of this stuff, to me, was the video conversion ripoff software and similar. They had people looking around for questions their software was possibly a solution for. Sometimes they would act like users, other times it was more neutral info, but still clear it was self promotion because of what was recommended.
Exactly. Usually there’s a conversation or a quick consensus on one or two things. But I’ve been seeing lots of single answers or just ads
There’s an excellent chance that even some of the “authentic” discussions you see are word-for-word reposts of old posts and comments, created by bots to build up karma in order to be sold to spammers and influence peddlers down the line.
If the rumor is true that a reddit/google training deal is what led to reddit getting boosted in search results, this would be a direct result of reddit’s own actions.
If only people moved to an open and federated platform. I mean I don’t have to say that I hate reddit since I’m here but still whenever I Google a problem reddit answers are one of the most useful places. Especially about something local.
This isn’t a problem that can be solved with a technical solution that isn’t itself extremely dystopian in nature.
This is a problem that requires legislation and criminal liability, or genuine punitive civil liability that pierces the corporate legal shields.
Don’t hold your breath for a serious solution to present itself.
Do you think legislation and laws would be reasonable for trolls who ban evade and disrupt and destroy synchronous online social spaces too?
The same issue happens there. Zero repercussions, ban evasion is almost always possible, and the only foolproof solutions seem to quickly turn dystopian too.
Ban evasion and cheating are becoming a bigger and bigger issue in online games/social spaces. And all the nerds will agree it’s impossible to fix. And many feel it’s just normal culture. But it’s not sustainable, and with AI and an ever escalating cat and mouse game, it’s going to continue to get worse.
Can anyone suggest a solution that is on the horizon?
No, I’m a free speech absolutist when it comes to private citizens. Be they communists, Nazis, Democrats, trolls, assholes or furries, the government should have no role in regulating their speech outside of reasonable exceptions, e.g. yelling fire in a crowded theater, threats of physical violence, etc.
My moral conviction on relative free speech absolutism ends at the articles of incorporation, or other nakedly profit driven speech e.g. market manipulation.
So if the trolls and ban evaders are acting on behalf of a company, or for profit driven interests, their speech should be regulated. If they’re just assholes or trolls, that’s a problem for the website and mod teams.
That’s just for small players. Big corps have probably been doing it for years.
I was about ready to downvote out of pure annoyance lol.
Well, that was the last bit of usefulness I used to get out of google. I’ve been on yahoo for a while now
I see the yahoo ai bot is working well. /s
Absolutely, I am definitely not human
Yahoo is still alive?
The creator of the company, Alexander Belogubov, has also posted screenshots of other bot-controlled accounts responding all over Reddit. Belogubov has another startup called “Stealth Marketing” that also seeks to manipulate the platform by promising to “turn Reddit into a steady stream of customers for your startup.” Belogubov did not respond to requests for comment.
What an absolute piece of shit. Just a general trash person to even think of this concept.
His surname translates from Russian as ‘white lips’. No wonder he is a ghoul.
I don’t understand how Lemmy/Mastodon will handle similar problems. Spammers crafting fake accounts to give AI generated comments for promotions
The only thing we reasonably have is security through obscurity. We are something bigger than a forum but smaller than Reddit, in terms of active user size. If such a thing were to happen here, mods could handle it more easily probably (like when we had the spammer of the Japanese text back then), but if it were to happen on a larger scale than what we have it would be harder to deal with.
There’s one advantage on the fediverse. We don’t have corporations like reddit manipulating our feeds, censoring what they dislike, and promoting shit. This alone makes using the fediverse worth it for me.
When it comes to problems involving the users themselves, things aren’t that different, and there isn’t much we can do.
We don’t have corporations manipulating our feeds
yet. Once we have enough users that it’s worth their effort to target, the bullshit will absolutely come.
They can perhaps create instances, pay malicious users, try some embrace, extend, extinguish approach or something, but they can’t manipulate the code running on the instances we use, so they can’t have direct power over it. Or am I missing something? I’m new to the fediverse.
There’s very little to prevent them just pretending to be average users and very little preventing someone from just signing up a bunch of separate accounts to a bunch of separate instances.
No great automated way to tell whether someone is here legitimately.
Federation means if you are federated then sure you get some BS. Otherwise, business as usual. Now, making sure there is no paid user or corporate bot is another matter entirely since it relies on instance moderators.
I think the real danger here is subtlety. What happens when somebody asks for recommendations on a printer, or complains about their printer being bad, and all of a sudden some long established account recommends a product they’ve been happy with for years? And it turns out it’s just an AI bot shilling for Brother.
For one, well established brands have less incentive to engage in this.
Second, in this example, the account in question being a “long established user” would seem to indicate you think these spam companies are going to be playing a long game. They won’t. That’s too much effort and too expensive. They will do all of this on the cheap, and it will be very obvious.
This is not some sophisticated infiltration operation with cutting edge AI. This is just auto generated spam in a new upgraded form. We will learn to catch it, like we’ve learned to catch it before.
I mean, it doesn’t have to be expensive. And it also doesn’t have to be particularly cutting edge. Start throwing some credits into an LLM API and have it randomly read and help people out in different groups. Once it reaches some amount of reputation, have it quietly shill for them. Pull out posts that contain keywords. Have the AI consume the posts and figure out if they have to do with what they sound like they do. Have it subtly do product placement. None of this is particularly difficult or groundbreaking. But it could help shape our buying habits.
mods could handle it more easily probably
I kind of feel the opposite. For a lot of instances, ‘mods’ are just a few guys who check in sporadically, whereas larger companies can mobilize full teams in times of crisis. It might take them a bit of time to spin things up, but there are existing processes to handle it.
I think spam might be what kills this.
If a community is so small that the mod team can be so inactive, there’s no incentive for the company to put any effort into spamming it like you’re suggesting.
And if they do end up getting a shit ton of spam in there, and it sits around for a bit until a moderator checks in, so what? They’ll just clean it up and keep going.
I’m not sure why people are so worried about this. It’s been possible for bad actors to overrun small communities with automated junk for a very long time, across many different platforms, some that predate Reddit. It just gets cleaned up and things keep going.
It’s not like if they get some AI produced garbage into your community, it infects it like a virus that cannot be expelled.
Hmm, good point.
Mostly it seems to be handled here with that URL blacklist automod.
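For reference, a URL blocklist check of that kind is simple to sketch. This is a generic Python illustration, not the actual automod code used on any Lemmy instance; the domain names, the regex, and the function names are all assumptions.

```python
import re
from urllib.parse import urlsplit

# Hypothetical blocklist; a real one would be maintained by the mod team.
BLOCKED_DOMAINS = {"spam.example", "shady-reviews.example"}

# Deliberately simple URL matcher, good enough for a sketch.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+", re.IGNORECASE)


def extract_hosts(text):
    """Return the lowercased hostnames of all http(s) links in the text."""
    hosts = []
    for url in URL_RE.findall(text):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL, e.g. a broken IPv6 literal
        if host:
            hosts.append(host.rstrip(".").lower())
    return hosts


def is_blocked(host, blocked=BLOCKED_DOMAINS):
    """True if the host or any parent domain is on the blocklist."""
    parts = host.split(".")
    return any(".".join(parts[i:]) in blocked for i in range(len(parts)))


def should_remove(comment, blocked=BLOCKED_DOMAINS):
    """True if the comment links to at least one blocked domain."""
    return any(is_blocked(h, blocked) for h in extract_hosts(comment))
```

Matching on parent domains means `www.spam.example` is caught while `notspam.example` is not. It only stops known spam domains, so it does nothing against the subtle product-placement accounts discussed above.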
How exactly are they poisoning a pool of toxic waste?
Pissing into an ocean of piss.
Now it’s not only toxic, it’s also acidic and instead of killing you, it’ll also melt you.
I still haven’t seen a use of AI that doesn’t serve state or corporate interests first, before the general public. AI medical diagnostics comes the closest, but that’s being leveraged to justify further staffing reductions, not an additional check.
The AI-captcha wars are on, and no matter who wins we lose.
no matter who wins we lose.
AI is helping me learn and program C++. It’s built into my IDE. Much more efficient than searching Stack Overflow. Whenever it comes up with something I’ve never seen before, I learn what that thing does and mentally store it away for future use. As time goes on, I’m relying on it less and less. But right now it’s amazing. It’s like having a tutor right there with you who you can ask questions anytime, 24/7.
I hope a point comes where my kid can just talk to a computer, tell it the specifics of the program he wants to create, and have the computer just program the entire thing. That’s the future we are headed towards. Ordinary folks being able to create software.
I’ll agree there’s huge potential for ‘assistant’ roles (exactly like you’re using) to give a concise summary for quick understanding. But LLMs aren’t knowledgeable like an accredited professor or tutor is, understanding the context and nuance of the topic. LLMs are very good at scraping together data and presenting the shallowest of information, but their limits get exposed quickly when you try to go into a topic.
For instance, I was working on a project that required very long term storage (10+ years) with intermittent exposure to open air, and was concerned about oxidation and rust. ChatGPT was very adamant that desiccant alone was sufficient (wrong) and that VCI packs would last (also wrong). It did a great job of repackaging corporate ad-copy and industrial white papers written by humans, but not of providing an objective answer to a semi complex question.
I guess it’s not great for things requiring domain knowledge. Programming seems to be easy for it, as programs are very structured, predictable, and logical. That’s where its pattern-matching-and-prediction abilities shine.