I’m looking at people running DeepSeek’s 671B R1 locally off NVMe drives at around 2 tokens a second. So why not skip the flash entirely and give me a 1 TB drive that uses an NVMe controller with nothing but DRAM behind it? The total transistor count on the silicon side would be lower. It would be slower than typical system memory, but about the same as hitting the DRAM cache on any existing NVMe drive. The controller logic for safe page writes in flash, plus the internal boost circuitry for pulsing each page, is around the same complexity as a DRAM controller with its constant refresh and power-stability requirements. Heck, DDR5 and up already puts power regulation on the memory module IIRC. Anyone know why this isn’t a thing, or where to get one?
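For a rough sense of the ceiling, here is a back-of-the-envelope sketch in Python. The ~37B active parameters per token for R1 and the bandwidth figures are my assumptions, and real runs beat the raw streaming math because hot experts stay in the page cache and fewer experts can be activated:

    # Tokens/s if every active weight had to be streamed from storage on each token.
    active_params = 37e9        # assumed active parameters per token for R1
    bits_per_weight = 2.0       # 2-bit quantization
    bytes_per_token = active_params * bits_per_weight / 8   # ~9.25 GB per token

    for name, gbps in [("SATA SSD", 0.5), ("PCIe 4.0 NVMe", 7.0),
                       ("4x NVMe RAID-0", 25.0), ("dual-channel DDR5", 80.0)]:
        tokens_per_s = gbps * 1e9 / bytes_per_token
        print(f"{name:18s} ~{gbps:5.1f} GB/s -> ~{tokens_per_s:5.2f} tokens/s")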

  • UberKitten@lemmy.blahaj.zone · 19 days ago

    a ram drive is volatile memory. you can get higher performance out of DRAM chips attached to the CPU memory controller than out of DRAM sitting behind the PCIe bus on an NVMe controller. for applications that only work with file systems, a RAM drive works around that limitation. so, why bother?
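    if you want to measure the gap yourself, a rough sketch like this works (paths are placeholders; /dev/shm is tmpfs, i.e. RAM-backed, on most Linux distros, and you would want to drop caches between runs for the NVMe number to mean anything):

        # Crude sequential-read throughput check: RAM-backed path vs NVMe path.
        import os, time

        def read_throughput(path, block=16 * 1024 * 1024):
            size = os.path.getsize(path)
            start = time.perf_counter()
            with open(path, "rb", buffering=0) as f:
                while f.read(block):
                    pass
            return size / (time.perf_counter() - start) / 1e9  # GB/s

        for p in ("/dev/shm/model.gguf", "/mnt/nvme/model.gguf"):  # placeholder paths
            if os.path.exists(p):
                print(f"{p}: ~{read_throughput(p):.1f} GB/s sequential read")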

    • 𞋴𝛂𝛋𝛆@lemmy.worldOP · 19 days ago

      Enormous model access. It is too slow for real time, but it is possible to load a 671-billion-parameter model in 2-bit quantization with 4 active experts at barely over 2 tokens a second. That is the hyperbolic extreme. In practice it means my current ceiling of around 70B roughly doubles.

      The more interesting aspect is what becomes possible with a larger mixture-of-experts model. The new Llama 4 models are exploring this concept. I already run an 8×7B model most of the time because it is so fast and nearly on par with a 70B. One area that is really opening up right now is a MoE built out of something like 3B experts run at full model weights on a 16 GB GPU. It would be possible to do a lot with a model like that, because fine-tuning individual 3B models is accessible on that same 16 GB GPU. Then it becomes possible to start storing your own questions and answers and using them for training in niche subjects (see the sketch below), then stitch the pieces together into your own FrankenMoE. The main training set used to teach a model to think like DeepSeek R1 is only 600 questions long, and if you actually read most training datasets of this kind, half of them are junk. If a person who knows how to prompt well saves their own dataset, much better results will follow.

      A very large secondary nonvolatile store makes it much more reasonable to load and unload 200-400 GB a few dozen times a day. With an extensive agentic toolset, raise that by an order of magnitude. If the toolset is automated with several models being swapped in and out, raise it another order of magnitude.
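      Building that dataset does not need to be fancy. A minimal sketch, assuming the common chat-style JSONL layout that most fine-tuning tools accept (the file name and the example pair are just placeholders):

          # Append your own prompt/answer pairs to a JSONL file for later fine-tuning.
          import json
          from pathlib import Path

          DATASET = Path("my_niche_dataset.jsonl")  # placeholder file name

          def log_pair(question, answer, system="You are a concise expert assistant."):
              record = {"messages": [
                  {"role": "system", "content": system},
                  {"role": "user", "content": question},
                  {"role": "assistant", "content": answer},
              ]}
              with DATASET.open("a", encoding="utf-8") as f:
                  f.write(json.dumps(record, ensure_ascii=False) + "\n")

          # example: keep a pair you were happy with
          log_pair("Which quant of a 70B model fits in 24 GB of VRAM?",
                   "Roughly 2-bit; the 4-bit weights alone are about 35 GB.")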