It’s not like the companies train one model and use it for months until they need a new version. They train new models all the time to update them and test new ideas.
They don’t use small models. The typical LLMs offered by ChatGPT or Claude are the big ones.
They process thousands of queries per second, so their GPUs are maxed out all the time, not just for a few seconds.
Wouldn’t it then help to run the smaller ones locally instead of using the big ones like ChatGPT?
I read that one called DeepSeek or something in China took a lot less to train and is just as strong. Is that true?
What do people usually use LLMs for? I know they suck at most of the things people use them for, like coding. But what do people use them for that justifies all the hype?
Again, please don’t think I am trying to justify it. I just don’t know that much about them.