I’m looking at people running DeepSeek’s 671B R1 locally from NVMe drives at 2 tokens a second. So why not skip the flash and give me a 1TB drive with an NVMe controller and only DRAM behind it? The total transistor count is lower on the silicon side. It would be slower than typical system memory, but as fast as any NVMe drive with a DRAM cache. The controller architecture for safe page writes in flash, plus the internal boost circuitry for pulsing each page, is around the same complexity as the memory management unit and constant refresh of DRAM and its power stability requirements. Heck, DDR5 and up is already putting power regulation on the system memory, IIRC. Anyone know why this is not a thing, or where to get this thing?
What on earth are you on about? What negativity? I’m trying to suggest solutions and point out that maybe the solution lies somewhere other than your initial idea.
This is a negative and personal statement without any basis. It is rude to make any unsolicited personal inference about my thoughts.
I explained my personal constraints and methodologies, along with the packages that enable the workflow. You then wholly dismissed that, without reason or understanding, to maintain this position as if it were the only path, and repeated your original take with a dismissive attitude.
You have no idea what I am thinking, and I did not invite your opinion on that. I told you what is possible and works, along with enough supporting information for you, or anyone else who reads this, to find the details. You responded to that with an unsolicited comment and dogma. I find that offensive.