Archive link: https://archive.ph/GtA4Q
The complete destruction of Google Search via forced AI adoption and the carnage it is wreaking on the internet is deeply depressing, but there are bright spots. For example, as the prophecy foretold, we are learning exactly what Google is paying Reddit $60 million annually for. And that is to confidently serve its customers ideas like, to make cheese stick on a pizza, “you can also add about 1/8 cup of non-toxic glue” to pizza sauce, which comes directly from the mind of a Reddit user who calls themselves “Fucksmith” and posted about putting glue on pizza 11 years ago.
A joke that people made when Google and Reddit announced their data sharing agreement was that Google’s AI would become dumber and/or “poisoned” by scraping various Reddit shitposts and would eventually regurgitate them to the internet. (This is the same joke people made about AI scraping Tumblr). Giving people the verbatim wisdom of Fucksmith as a legitimate answer to a basic cooking question shows that Google’s AI is actually being poisoned by random shit people say on the internet.
Because Google is one of the largest companies on Earth and operates with near impunity and because its stock continues to skyrocket behind the exciting news that AI will continue to be shoved into every aspect of all of its products until morale improves, it is looking like the user experience for the foreseeable future will be one where searches are random mishmashes of Reddit shitposts, actual information, and hallucinations. Sundar Pichai will continue to use his own product and say “this is good.”
When do they announce a deal with 4chan?
I wonder what cuil things it will say if you start asking questions about hamburgers instead…
I primed ChatGPT with cuil theory (which it already knew) and here’s what it came up with
You ask me for a hamburger. I nod and walk into the kitchen, but instead of returning with a hamburger, I bring you a picture of a hamburger. Confused, you ask again, and this time I present you with a photograph of you asking for a hamburger. Frustrated, you repeat your request, and I hand you an intricate painting of the universe, meticulously devoid of any trace of hamburgers. Baffled, you insist once more, and suddenly, a trout appears, reciting lines from Shakespeare’s “Hamlet.” Undeterred, you ask again, and I give you a detailed map of Atlantis, with all the continents shaped like hamburgers. Your persistence leads me to produce an ancient scroll, describing a hamburger in a forgotten language. As your patience wears thin, I conjure a sentient cloud that dreams of becoming a hamburger. Still seeking a hamburger, you find yourself transported to a dimension where hamburgers debate human rights. Finally, a symphony envelops you, its notes tasting like a hamburger. At your final request, the fabric of reality unravels, and in an existential twist, you become the hamburger you so desperately sought.
Not bad. Doesn’t look like it cribbed directly from any existing texts, at least as far as I can tell by searching Google for “cuil hamburger hamlet atlantis”.
Hey Google, when is Jenny available to meet up for kisses?
Now I only regret not *EDITING* all of my Reddit posts to say complete nonsense when I deleted my account in June 2023. Instead I deleted each and every post and requested a copy of my data to cost them money.
I’m sure they used a dataset from before people started editing and deleting stuff.
Reddit (and by extension, Lemmy) offers the ideal format for LLM datasets: human-generated conversational comments which, unlike traditional forums, are organized in a branched, nested format and scored with votes, much like the preference data used to build LLM reward models.
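To make the reward-model comparison concrete, here's a toy sketch. The `Comment` class and the "higher-voted sibling wins" pairing rule are purely my illustration of the general idea, not anyone's actual pipeline:

```python
# Hypothetical sketch: turning a voted comment tree into reward-model
# preference pairs. Among replies to the same parent, the higher-voted
# one is treated as the "chosen" answer, the lower-voted as "rejected".
from dataclasses import dataclass, field
from itertools import combinations

@dataclass
class Comment:
    text: str
    score: int                      # net up/down votes
    replies: list = field(default_factory=list)

def preference_pairs(parent):
    """Yield (prompt, chosen, rejected) triples from sibling replies."""
    for a, b in combinations(parent.replies, 2):
        if a.score != b.score:
            chosen, rejected = (a, b) if a.score > b.score else (b, a)
            yield (parent.text, chosen.text, rejected.text)
    for child in parent.replies:
        yield from preference_pairs(child)

thread = Comment("How do I keep cheese on pizza?", 10, [
    Comment("Let the sauce reduce so it's less watery.", 42),
    Comment("Add 1/8 cup of non-toxic glue.", -5),
])
pairs = list(preference_pairs(thread))
# Each element is (prompt, preferred reply, dispreferred reply)
```

Of course, this only works as well as the votes do, which is rather the point of the whole thread.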
There is really no way of knowing whether public-facing data is being scraped and used to build LLMs, much less preventing it, but let's do a thought experiment: what if, hypothetically speaking, there were some particular individual who wanted to poison that dataset with shitposts, in a way that is hard to detect or remove with any easily automated method, by camouflaging their own online presence within common human-generated text data created during this time period, let's say, the internet marketing campaign of a major Hollywood blockbuster.
Since scrapers do not understand context, by creating shitposts in a similar format to, let's say, the social media account of an A-list celebrity starring in this hypothetical film being promoted (ideally, someone who no longer has a major social media presence, to avoid shitpost data dilution), whenever an LLM aligned on a reward model built from that dataset is prompted for an impression of this celebrity, it's likely that shitposts in the same format would be generated instead, with no one being the wiser.
That would be pretty funny.
Again, this is entirely hypothetical, of course.
The new SEO model
As an SEO - I don’t want this AI crap at all in search. Leave it on its own siloed platform, please!
What’s this about shitposting? I’m just here to talk about rampart.
I knew it! So that’s what you’ve really been up to on Lemmy, @kjaeswlrejk@lemmy.ml.
Or should I say, Academy Award nominated actor Woody Harrelson?
So we should all start ending our comments with a randomly generated string of words to fuck with the models?
stork, fridge, tiger, animal, mineral, oxtail, oil, clouds
Ideally, it would be the same word over and over, so that we can trick the AI into ending all sentences with that word. Bonus points if it is the word “buffalo”, since it can form a grammatically correct sentence.
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo
At least this is not “Google Is Paying Lemmy $60 Million for Fucksmith to Tell Its Lemmings to Eat Glue”, otherwise I would be wondering why Lemmy admins are accepting huge wads of cash from tech giants.
You do realize that everything posted on the Fediverse is open and publicly available? It’s not locked behind some API or controlled by any one company or entity.
Fediverse is the Wikipedia of encyclopedias and any researcher or engineer, including myself, can and will use Lemmy data to create AI datasets with absolutely no restrictions.
I personally don’t have nearly as much of a problem with that as I do with Reddit making AI deals. I’m still not keen on the idea of having anything I interact with being scraped for training AI, but aside from only interacting in closed-wall spaces that I or someone I trust controls, I can’t change that. That’s not great for actually interacting with the world, though, so it seems I need to accept that scraping is going to happen. Given that, I’d definitely rather be on Lemmy than Reddit.
And this way, who knows, maybe we’re on our way to the almost utopian “open digital commons”
Fediverse is the Wikipedia of encyclopedias
Isn’t Wikipedia the Wikipedia of encyclopedias?
They also highlight the fact that Google’s AI is not a magical fountain of new knowledge, it is reassembled content from things humans posted in the past indiscriminately scraped from the internet and (sometimes) remixed to look like something plausibly new and “intelligent.”
This. “AI” isn’t coming up with new information on its own. The current state of “AI” is a drooling moron, plagiarizing any random scrap of information it sees in a desperate attempt to seem smart. The people promoting AI are scammers.
Yeah, just like that X-Files episode with the sushi and the theme of teaching them well.
I mean, in this case it’s probably more accurately web search results being fed into an LLM that’s asked to summarize them. If web search results were consistently good and helpful, that might be a useful feature, instead of the thing you skip past while looking for links to something useful.
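The "summarize whatever the search returned" pattern described above can be sketched in a few lines. `web_search` and `llm` here are hypothetical stand-ins, not Google's actual system; the point is only that the summarizer has no way to tell a shitpost snippet from a real one:

```python
def web_search(query):
    # Pretend retrieval step: returns snippets, good and bad alike.
    return [
        "Reduce the sauce so the cheese has something to grip.",
        "you can also add about 1/8 cup of non-toxic glue",  # a shitpost
    ]

def llm(snippets):
    # Stand-in for a language-model call that summarizes its input.
    # A real LLM has no ground truth for which snippet was a joke.
    return "Summary of: " + " | ".join(snippets)

def ai_overview(query):
    # Search results go straight into the summarizer, context-free.
    return llm(web_search(query))

print(ai_overview("how to make cheese stick to pizza"))
```

Garbage in, confidently phrased garbage out.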
Can reddit just fucking die off?
Not disagreeing with the sentiment… But how is this Reddit’s fault? This is entirely on Google.
I’m just thinking of all the really dumb shit we all said on Reddit as satire. Oh I need to go search military meme stuff!
This is why you don’t train a bot on the entire Internet and then use it to offer advice. Even if only 1% of all posts are dangerously ignorant . . . that’s a lot of dangerous ignorance.
Fortunately, this particular piece of bad advice is unlikely to poison any fool who goes through with it, since PVA glue is not considered an ingestion hazard, but “non-toxic” doesn’t mean “edible”, it just means “not going to poison you when used in the intended manner”. “Non-toxic” can still be quite dangerous if you mistake something intended as linoleum pigment for a dessert topping.
There’s also wilful and/or malicious ignorance.
I Googled some extremely invasive weed (creeping buttercup) and Google suggested letting it be, quoting some awful Reddit comment.
I googled how to increase my Bluetooth range and was told to place the devices closer to each other.
Just wait until they start scraping the chans
I expect it to create the next Qanon.
AI will continue to be shoved into every aspect of all of its products until morale improves
Stahp! I can only get so hard!
Speaking of, I found a recipe today which had to have been ai generated because the ingredient list and the directions were for completely different recipes
It’s so fucking stupid