DeepSeek launched a free, open-source large language model in late December, claiming it was developed in just two months at a cost of under $6 million.
Does it really need less power? I’m playing around with it now and I’m pretty impressed so far. It can do math, at least.
That’s the claim: it has apparently been trained using a fraction of the compute power of the GPT models and achieves similar results.
Fascinating. My boss really bought into the tech bro bullshit. Every time we get coffee as a team, he’s always going on and on about how ChatGPT will be the savior of humanity, increase productivity so much that we’ll have a 2-day work week, blah blah blah.
I’ve been on his shit list lately because I had to take some medical leave and didn’t deliver my project on time.
Now that this thing is open-sourced, I can bring it to him, tell him it outperforms even ChatGPT o1 or whatever it is, and tell him we can run it locally. I’ll be off the shit list and back in his good graces, and maybe even get a raise.
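If he wants proof that it runs locally, something like this rough sketch is what I’d show him. It assumes you’re serving one of the distilled R1 models through Ollama’s OpenAI-compatible endpoint on the default port; the model tag and the prompt are placeholders for whatever your own setup uses.

# Rough sketch: querying a locally hosted DeepSeek model through
# Ollama's OpenAI-compatible API (assumed to be listening on the
# default port 11434). The model tag is an assumption; use whichever
# distilled R1 variant you actually pulled.
from openai import OpenAI

# The API key is required by the client but ignored by a local Ollama server.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

reply = client.chat.completions.create(
    model="deepseek-r1:7b",  # assumed tag; substitute your local model name
    messages=[{"role": "user", "content": "Summarize why running an LLM locally matters, in two sentences."}],
)
print(reply.choices[0].message.content)

No API bill and nothing leaves the building; that’s usually the part that lands with management.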
Your boss sounds like he buys into bullshit for a living. Maybe that’s what drew him to the job, lol.
I think believing in our corporate AI overlords is outdone only by believing that those same corporations would pass the productivity gains on to their employees.
But I feel like that will just lead to more training on the same (or more) hardware with a more efficient model. Bitcoin mining didn’t slow down just because it got harder. However, I don’t know enough about the training process. I assume more efficient use of the hardware would allow larger models to be trained on the same hardware and training data?
They’ll probably do that, but that’s assuming we aren’t past the point of diminishing returns.
The current LLMs are pretty basic in how they work, and it could be that with the current training approach we’re near the limit of what they’ll ever be capable of. They’ll of course invest a billion in training a new generation, but if it’s only marginally better than the current one, they won’t keep pouring billions into it.