I’m looking at people running DeepSeek’s 671B R1 locally from NVMe drives at 2 tokens a second. So why not skip the flash and give me a 1TB drive with an NVMe controller and only DRAM behind it? The total transistor count is lower on the silicon side. It would be slower than typical system memory, but about as fast as any NVMe drive serving reads from its DRAM cache. The controller architecture for safe page writes in flash, plus the internal boost circuitry for pulsing each page, is around the same complexity as the memory management unit and constant refresh that DRAM needs, along with its power stability requirements. Heck, DDR5 and up is putting power regulation on the memory modules already, IIRC. Anyone know why this is not a thing, or where to get this thing?
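For a rough sense of where the ceiling sits, here is a back-of-the-envelope sketch in Python. It assumes the worst case, where every active weight has to be streamed over the drive's link for each token, so the link and not the storage medium sets the limit. The function name and every number in the example call are my assumptions, not measurements: roughly 37B active parameters per token for R1, 4 bits per weight, and about 7 GB/s for the link.

```python
def tokens_per_second(link_bytes_per_s: float,
                      active_params: float,
                      bits_per_weight: float) -> float:
    """Upper bound on tokens/s if every active weight is read over the link once per token.

    Hypothetical helper for illustration only. It ignores compute time and
    any weights that are already cached in system RAM.
    """
    bytes_per_token = active_params * bits_per_weight / 8
    return link_bytes_per_s / bytes_per_token


# Assumed figures, not measurements: ~7 GB/s link, ~37e9 active
# parameters per token, 4-bit quantized weights.
estimate = tokens_per_second(7e9, 37e9, 4)
print(f"streaming-bound estimate: {estimate:.2f} tokens/s")
```

Real setups can beat this bound because frequently used weights stay in system RAM, but the same arithmetic would apply to a DRAM-only device sitting behind the same NVMe link.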
I don’t know if this is what OP is going for, but I’ve wanted to do something similar to what he’s talking about myself, to exceed the maximum amount of memory that a motherboard supports. Basically, I wanted to stick more memory on a system – and I was fine with access to it being slower than to the on-motherboard memory – to act as a very large read cache.
A RAM drive will let you use memory that your motherboard supports as a drive. But it won’t let you stick even more physical DRAM into a system, above and beyond what the motherboard can handle.
You’re describing swap.