I’m looking at people running DeepSeek’s 671B R1 locally from NVMe drives at 2 tokens a second. So why not skip the flash and give me a 1 TB drive using an NVMe controller with only DRAM behind it? The ultimate transistor count is lower on the silicon side. It would be slower than typical system memory but the same as any NVMe drive with a DRAM cache. The controller architecture for safe page writes in flash and the internal boost circuitry for pulsing each page are around the same complexity as the memory management unit and constant refresh of DRAM and its power stability requirements. Heck, DDR5 and up is putting power regulation on the system memory already, IIRC. Anyone know why this is not a thing, or where to get this thing?

  • Carobu@lemmy.world
    2 months ago

    I THINK you could accomplish something like this by making your swap file absurdly huge (like 1 TB) and then setting your application to use /tmp or any of the other folders that are technically in RAM on Linux. The only issue is I don’t know if you could tell it to only use the M.2, and it would obviously be somewhat random where it places data on the M.2 vs. actual RAM. Maybe if you put your swap on a different device and ONLY told it to use that?

    I suspect that still probably wouldn’t be exactly what you want. What you actually want is Intel Optane.
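
For the swap idea above, here is a rough Python sketch for checking how much of a working set would end up on the swap device rather than in RAM. It parses text in the `/proc/meminfo` format; the function names and the 400 GiB working-set figure in the usage note are made up for illustration, and the estimate ignores kernel overhead, the page cache, and whatever else is running.

```python
def parse_meminfo(text):
    """Return {field: bytes} from /proc/meminfo-style text."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            # Sized fields are reported in kB (1024-byte units);
            # plain counters such as HugePages_Total have no unit.
            scale = 1024 if len(parts) > 1 and parts[1] == "kB" else 1
            info[name.strip()] = int(parts[0]) * scale
    return info


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live meminfo file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())


def spill_to_swap(working_set_bytes, info):
    """Estimate bytes that do not fit in RAM and whether swap can hold them.

    Assumption: the whole of MemTotal is available to the application,
    which is optimistic on a real system.
    """
    ram = info.get("MemTotal", 0)
    swap = info.get("SwapTotal", 0)
    spill = max(0, working_set_bytes - ram)
    return spill, spill <= swap
```

On a Linux box, `spill_to_swap(400 * 1024**3, read_meminfo())` would return the estimated spill in bytes and a flag saying whether the configured swap is big enough. Everything in that spill runs at swap-device speed, which is why the swap approach probably won't feel like a DRAM-backed drive.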