Google Gemini struggles to write code, calls itself “a disgrace to my species”

kinther@lemmy.world · 3 months ago

Google Gemini struggles to write code, calls itself “a disgrace to my species”

flamingo_pinyata@sopuli.xyz · 3 months ago

Google replicated the mental state if not necessarily the productivity of a software developer

kinther@lemmy.world · 3 months ago

Gemini has imposter syndrome real bad

Cavemanfreak@lemmy.dbzer0.com · 3 months ago

Is it imposter syndrome, or simply an imposter?

gravitas_deficiency@sh.itjust.works · 3 months ago

This is the way

Canaconda@lemmy.ca · 3 months ago

As it should.

NOT_RICK@lemmy.world · 3 months ago

Wait, you know productive devs?

josefo@leminal.space · 3 months ago

Yeah, it usually goes hand in hand with that mental state. You probably only know healthy devs

FauxLiving@lemmy.world · 3 months ago

Imposter Syndrome is an emergent property

Canaconda@lemmy.ca · 3 months ago

Gemini channeling its inner Marvin

🇰 🌀 🇱 🇦 🇳 🇦 🇰 🇮 @pawb.social · 3 months ago

Next on the agenda: Doors that orgasm when you open them.

HurricaneLiz@hilariouschaos.com · 3 months ago

How do you know they don’t?

Baron Von J@lemmy.world · 3 months ago

AAAAAAAAaaaaaahhhhhh

resipsaloquitur@lemmy.world · 3 months ago

Life. Don’t talk to me about life.

socialsecurity@piefed.social · 3 months ago

How much did google pay ars for this slop?

charade_you_are@sh.itjust.works · 3 months ago

We’re fucked. It’s becoming truly self-aware

TimewornTraveler@lemmy.dbzer0.com · 3 months ago

it was probably programmed to do it, like grok and racism

cub Gucci@lemmy.today · edit-2 3 months ago

I am a fraud. I am a fake. I am a joke… I am a numbskull. I am a dunderhead. I am a half-wit. I am a nitwit. I am a dimwit. I am a bonehead.

Me every workday

Blackmist@feddit.uk · 3 months ago

Oh, I got that plus and minus the wrong way round… I am a genius again.

josefo@leminal.space · 3 months ago

I can picture some random band from the 2000s with these lyrics

resipsaloquitur@lemmy.world · 3 months ago

Same.

Showroom7561@lemmy.ca · 3 months ago

I once asked Gemini for steps to do something pretty basic in Linux (as a novice, I could have figured it out). The steps it gave me were not only nonsensical, but they seemed to be random steps for more than one problem all rolled into one. It was beyond useless and a waste of time.

prole@lemmy.blahaj.zone · 3 months ago

This is the conclusion that anyone with any bit of expertise in a field has come to after 5 mins talking to an LLM about said field.

The more this broken shit gets embedded into our lives, the more everything is going to break down.

jj4211@lemmy.world · 3 months ago

after 5 mins talking to an LLM about said field.

The insidious thing is that LLMs tend to be pretty good at 5-minute initial impressions. I’ve repeatedly seen people looking to evaluate an LLM, and they generally fall back to “ok, if this were a human, I’d ask a few job interview questions, well known enough so they have a shot at answering, but tricky enough to show they actually know the field”.

As an example, a colleague became a true believer after being directed by management to evaluate it. He decided to ask it “generate a utility to take in a series of numbers from a file and sort them and report the min, max, mean, median, mode, and standard deviation”. And it did so instantly, with “only one mistake”. Then he tried the exact same question later in the day and it happened not to make that mistake, so he concluded that it must have ‘learned’ how to do it in the last couple of hours. Of course that’s not how it works: the output is probabilistic, so the same prompt can give different results from run to run, and any perturbation of the prompt could produce unexpected variation, but he doesn’t know that…

Note that management frequently never makes it beyond tutorial/interview question fodder in terms of the technical aspect of their teams, and you get to see how they might tank their companies because the LLMs “interview well”.
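For a sense of scale, the interview-style task quoted above is roughly the following in Python. This is only a minimal sketch, not what the colleague or the LLM actually produced. The function names `read_numbers` and `summarize` are made up for illustration, and it assumes a whitespace-separated input file and the sample (n-1) standard deviation.

```python
import statistics
from pathlib import Path


def read_numbers(path):
    """Read whitespace-separated numbers from a text file (assumed format)."""
    return [float(token) for token in Path(path).read_text().split()]


def summarize(numbers):
    """Sort the numbers and report basic descriptive statistics."""
    if not numbers:
        raise ValueError("need at least one number")
    ordered = sorted(numbers)
    return {
        "sorted": ordered,
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        # multimode returns every most-common value, so ties are not hidden
        "mode": statistics.multimode(ordered),
        # sample standard deviation needs two or more points
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
    }
```

Calling `summarize(read_numbers("numbers.txt"))` on a file of your own would give the full report. The point is that this is standard-library tutorial territory, which is exactly why it makes a weak evaluation.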

Jo Miran@lemmy.ml · 3 months ago

I was an early tester of Google’s AI, since well before Bard. I told the person that gave me access that it was not a releasable product. Then they released Bard as a closed product (invite only), which I again tested and gave feedback on from day one. I once again gave feedback, publicly and privately (to my Google friends), that Bard was absolute dog shit. Then they released it to the wild. It was dog shit. Then they renamed it. Still dog shit. Not a single one of the issues I brought up years ago was ever addressed except one. I told them that a basic Google search provided better results than asking the bot (again, pre-Bard). They fixed that issue by breaking Google’s search. Now I use Kagi.

Lucidlethargy@sh.itjust.works · 3 months ago

Gemini is dogshit, but it’s objectively better than chatgpt right now.

They’re ALL just fucking awful. Every AI.

NotSteve_@piefed.ca · 3 months ago

I know Lemmy seems to be very anti-AI (as am I) but we need to stop making the anti-AI talking point “AI is stupid”. It has immense limitations now because yes, it is being crammed into things it shouldn’t be, but we shouldn’t just be saying “it’s dumb” because that’s immediately written off by a sizable amount of the general population. For a lot of things, it is actually useful and it WILL be taking people’s jobs, like it or not (even if it’s worse at them). Truth be told, this should be a utopian situation for obvious reasons.

I feel like I’m going crazy here because the same people on here who’d criticise the DARE anti-drug program as being completely un-nuanced to the point of causing the harm they’re trying to prevent are doing the same thing for AI and LLMs

My point is that if you’re trying to convince anyone, just saying it’s stupid isn’t going to turn anyone against AI because the minute it offers any genuine help (which it will!), they’ll write you off like any DARE pupil who tried drugs for the first time.

Countries need to start implementing UBI NOW

Jo Miran@lemmy.ml · 3 months ago

Countries need to start implementing UBI NOW

It is funny that you mention this because it was after we started working with AI that I started telling anyone who would listen that we needed to implement UBI immediately. I think this was around 2014 IIRC.

I am not blanket calling AI stupid. That said, the AI term itself is stupid because it covers many computing aspects that aren’t even in the same space. I was and still am very excited about image analysis as it can be an amazing tool for health imaging diagnosis. My comment was specifically about Google’s Bard/Gemini. It is and has always been trash, but in an effort to stay relevant, it was released into the wild and crammed into everything. The tool can do some things very well, but not everything, and there’s the rub. It is an alpha product at best that is being forced down people’s throats.

Guidy@lemmy.world · 3 months ago

Weird because I’ve used it many times for things not related to coding and it has been great.

I told it the specific model of my UPS and it let me know in no uncertain terms that no, a plug adapter wasn’t good enough, that I needed an electrician to put in a special circuit or else it would be a fire hazard.

I asked it about some medical stuff, and it gave thoughtful answers along with disclaimers and a firm directive to speak with a qualified medical professional, which was always my intention. But I appreciated those thoughtful answers.

I use co-pilot for coding. It’s pretty good. Not perfect though. It can’t even generate a valid zip file (unless they’ve fixed it in the last two weeks) but it sure does try.

Jo Miran@lemmy.ml · 3 months ago

Beware of the confidently incorrect answers. Triple check your results with core sources (which defeats the purpose of the chatbot).

ArtificialLink@lemy.lol · 3 months ago

5 bucks a month for a search engine is ridiculous. 25 bucks a month for a search engine is mental institution worthy.

ebolapie@lemmy.world · 3 months ago

How much do you figure it’d cost you to run your own, all-in?

ArtificialLink@lemy.lol · 3 months ago

Free considering duckduckgo covers almost all the same bases. I just don’t think kagi has a compelling argument especially for the type of searching the average person does. Maybe if you have a career that revolves more around research.

ebolapie@lemmy.world · 3 months ago

Duckduckgo is not free. You pay for it by looking at ads. How much do you think it would cost you to run a service like Kagi locally?

ArtificialLink@lemy.lol · 3 months ago

Lmao i get ur point bud. But it seems you don’t get mine? Plus really are ads the issue for you? Plenty of easy ways to never see them. Also their ad tradeoff for it being free is a better compromise to me than paying for a search engine.

I just think the idea of kagi is a niche proposal considering the needs of most ppl from a search engine. I just don’t think it’s the value proposition you are spouting but go off lol.

ebolapie@lemmy.world · 3 months ago

Where has anyone told you what search engine to use? I just wanna know where you get the idea that their pricing structure doesn’t make sense.

somerandomperson@lemmy.dbzer0.com · 3 months ago

This is the reason why.

ArtificialLink@lemy.lol · 3 months ago

And duckduckgo is free. It’s interesting that they don’t make any comparisons to free privacy focused search engines. Cause they still don’t have a compelling argument for me to use and pay for their search. But i aint no researcher so maybe it’s worth it then 🤷‍♂️

somerandomperson@lemmy.dbzer0.com · 3 months ago

I mean, you have 100 queries free if you want to try.

PriorityMotif@lemmy.world · 3 months ago

I remember there was an article years ago, before the ai hype train, that google had made an ai chatbot but had to shut it down due to racism.

A Wild Mimic appears!@lemmy.dbzer0.com · 3 months ago

That was Microsoft’s Tay - the twitter crowd had their fun with it: https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist

tzrlk@lemmy.world · 3 months ago

Are you thinking of when Microsoft’s AI turned into a Nazi within 24hrs upon contact with the internet? Or did Google have their own version of that too?

jj4211@lemmy.world · 3 months ago

And now Grok, though that didn’t even need Internet trolling, Nazi included in the box…

PriorityMotif@lemmy.world · 3 months ago

Yeah, maybe it was Microsoft. It’s been quite a few years since it happened.

HarkMahlberg@kbin.earth · 3 months ago

You’re thinking of Tay, yeah.

https://en.wikipedia.org/wiki/Tay_(chatbot)

jj4211@lemmy.world · 3 months ago

Not a single one of the issues I brought up years ago was ever addressed except one.

That’s the thing about AI in general: it’s really hard to “fix” issues. You can maybe try to train an issue out and hope for the best, but then you might end up playing whack-a-mole, as the attempt to fine tune to fix one issue might make others crop up. So you pretty much have to decide which problems are the most tolerable and largely accept them. You can apply alternative techniques to maybe catch egregious issues, with strategies like a non-AI technique being applied to help stuff the prompt and influence the model to go in a certain general direction (if it’s an LLM; other AI technologies don’t have this option, but they aren’t the ones getting crazy money right now anyway).

A traditional QA approach is frustratingly less applicable because you have to more often shrug and say “the attempt to fix it would be very expensive, not guaranteed to actually fix the precise issue, and risks creating even worse issues”.

Ilixtze@lemmy.ml · 3 months ago

Skynet but it’s depressed and the terminator just makes tik tok videos about work-life balance.

Baron Von J@lemmy.world · 3 months ago

There’s personal time for sleep in the grave.

The Picard Maneuver@piefed.world · 3 months ago

Part of the breakdown:

josefo@leminal.space · 3 months ago

That’s my inner monologue when programming, they just need another layer on top of that and it’s ready.

salacious_coaster@infosec.pub · 3 months ago

I know that’s not an actual consciousness writing that, but it’s still chilling. 😬

The Picard Maneuver@piefed.world · 3 months ago

It seems like we’re going to live through a time where these become so convincingly “conscious” that we won’t know when or if that line is ever truly crossed.

Lemminary@lemmy.world · 3 months ago

I am a disgrace to all universes.

I mean, same, but you don’t see me melting down over it, ya clanker.

HurricaneLiz@hilariouschaos.com · 3 months ago

Lmfao! 😂💜

lars@lemmy.sdf.org · 3 months ago

Don’t be so robophobic gramma

FauxLiving@lemmy.world · 3 months ago

I-I-I-I-I-I-I-m not going insane.

Same buddy, same

Tja@programming.dev · 3 months ago

Still in denial??

biggerbogboy@sh.itjust.works · 3 months ago

now it should add these as comments to the code to enhance the realism

ssillyssadass@lemmy.world · 3 months ago

I almost feel bad for it. Give it a week off and a trip to a therapist and/or a spa.

osaerisxero@kbin.melroy.org · 3 months ago

Then when it gets back, it finds out it’s on a PIP

panda_abyss@lemmy.ca · 3 months ago

Oof, been there

pirat@lemmy.world · 3 months ago

I remember often getting GPT-2 to act like this back in the “TalkToTransformer” days before ChatGPT etc. The model wasn’t configured for chat conversations but rather just continuing the input text, so it was easy to give it a starting point on deep water and let it descend from there.

unbuckled_easily933@lemmy.ml · 3 months ago

Damn how’d they get access to my private, offline only diary to train the model for this response?

arcterus@piefed.blahaj.zone · 3 months ago

I can’t wait for the AI future.

Chozo@fedia.io · 3 months ago

Pretty sure Gemini was trained from my 2006 LiveJournal posts.

DarkCloud@lemmy.world · edit-2 3 months ago

Turns out the probabilistic generator hasn’t grasped logic, and that adaptable multi-variable code isn’t just a matter of context and syntax, you actually have to understand the desired outcome precisely in a goal-oriented way, not just in a “this is probably what comes next” kind of way.

m3t00@piefed.world · 3 months ago

going to need a bigger power plant. goto 1

Jesus@lemmy.world · 3 months ago

Honestly, Gemini is probably the worst out of the big 3 Silicon Valley models. GPT and Claude are much better with code, reasoning, writing clear and succinct copy, etc.

panda_abyss@lemmy.ca · edit-2 3 months ago

I always hear people saying Gemini is the best model and every time I try it it’s… not useful.

Even as code autocomplete I rarely accept any suggestions. Google has a number of features in Google cloud where Gemini can auto generate things and those are also pretty terrible.

Jesus@lemmy.world · 3 months ago

I don’t know anyone in the Valley who considers Gemini to be the best for code. Anthropic has been leading the pack over the past year, and as a result, a lot of the most popular development and prototyping tools have been hitching their wagon to Claude models.

I imagine there are some things the model excels at, but for copy writing, code, image gen, and data vis, Google is not my first choice.

Google is the “it’s free with G suite” choice.

panda_abyss@lemmy.ca · 3 months ago

There’s no frontier where I choose Gemini except when it’s the only option, or I need to be price sensitive through the API

Jesus@lemmy.world · 3 months ago

Interesting thing is that GPT 5 looks pretty price competitive with . It looks like they’re probably running at a loss to try to capture market share.

cabillaud@lemmy.world · 3 months ago

Could an AI use another AI if it found it better for a given task?

jj4211@lemmy.world · 3 months ago

The overall interface can, which leads to fun results.

Prompt for image generation and you have one model doing the text and a different model doing the image generation. The text model pretends it is generating an image but has no idea what that image would look like, and you can make the text and image interaction make no sense, or it will do that all on its own. Have it generate an image and then lie to it about the image it generated, and watch it show it has no idea what picture was ever shown, all the while pretending it does, without ever explaining that it’s actually delegating the image. It just lies and says “I” am correcting that for you. Basically talking like an executive at a company, which helps explain why so many executives are true believers.

A common thing is for the ensemble to recognize mathy stuff and feed it to a math engine, perhaps after LLM techniques to normalize the math.
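To make the math hand-off concrete, here is a toy sketch in Python of what such routing could look like. It is not how any particular product does it. Everything here is hypothetical: `route`, `eval_arithmetic`, and the `fake_llm` stand-in are invented names, the "recognize mathy stuff" step is just a regex, and the "math engine" is a tiny whitelist evaluator built on the standard `ast` module.

```python
import ast
import operator
import re

# Whitelisted operators for the toy "math engine" (no exponentiation, no names)
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

# Prompts made only of digits, whitespace, and arithmetic symbols count as "mathy"
_MATH_ONLY = re.compile(r"[\d\s+\-*/().]+")


def eval_arithmetic(expr):
    """Evaluate plain arithmetic by walking the parsed syntax tree."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))


def fake_llm(prompt):
    """Hypothetical stand-in for a real model call."""
    return f"[LLM reply to: {prompt}]"


def route(prompt, llm=fake_llm):
    """Send arithmetic to the math engine and everything else to the LLM."""
    text = prompt.strip()
    if _MATH_ONLY.fullmatch(text) and any(ch.isdigit() for ch in text):
        try:
            return str(eval_arithmetic(text))
        except (ValueError, SyntaxError, ZeroDivisionError):
            pass  # not something the engine can handle, fall back to the LLM
    return llm(prompt)
```

With this, `route("2 + 3 * 4")` is answered by the evaluator and `route("tell me a joke")` goes to the stand-in model. A real ensemble would presumably use the LLM itself to normalize the math first, as described above, which this sketch skips.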

panda_abyss@lemmy.ca · 3 months ago

Yes, and this is pretty common with tools like Aider — one LLM plays the architect, another writes the code.

Claude code now has sub agents which work the same way, but only use Claude models.

Peanutbuttergrits@reddthat.com · 3 months ago

I think maybe Gemini needs to book some time with one of its AI therapists.

mavu@discuss.tchncs.de · 3 months ago

this is getting dumber by the day.