I’m looking at people running DeepSeek’s 671B R1 locally off NVMe drives at 2 tokens a second. So why not skip the flash and give me a 1TB drive with an NVMe controller and only DRAM behind it? The ultimate transistor count is lower on the silicon side. It would be slower than typical system memory, but the same as any NVMe drive with a DRAM cache.

The controller architecture for safe page writes in flash, plus the internal boost circuitry for pulsing each page, is around the same complexity as DRAM’s memory management, constant refresh, and power-stability requirements. Heck, DDR5 and up already puts power regulation on the memory module IIRC.

Anyone know why this is not a thing, or where to get this thing?
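
For a rough sense of why NVMe-speed storage is even usable here, a back-of-envelope sketch of the MoE bandwidth math (all figures are my own placeholder assumptions, not benchmarks of any specific setup):

```python
# Back-of-envelope MoE inference arithmetic. Quantization level, drive
# speed, and cache-hit rate below are rough assumptions, not measurements.
active_params   = 37e9   # DeepSeek R1: ~37B of 671B params active per token
bytes_per_param = 0.55   # ~4.5-bit quantization, approximate
nvme_gbps       = 7.0    # PCIe 4.0 x4 NVMe sequential read, GB/s

bytes_per_token = active_params * bytes_per_param          # ~20 GB
floor_tps = nvme_gbps * 1e9 / bytes_per_token
print(f"worst case, every weight read from disk: {floor_tps:.2f} tok/s")

# Hot experts and shared layers stay in the OS page cache, so only a
# fraction of each token's weights actually hits the drive; ~85% reuse
# already lands near the ~2 tok/s people report.
cache_reuse = 0.85
print(f"with {cache_reuse:.0%} page-cache reuse: "
      f"{floor_tps / (1 - cache_reuse):.1f} tok/s")
```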

  • Shadow@lemmy.ca · 19 days ago

    Modern flash is already faster than your PCIe bus, and it’s cheaper than DRAM. Using RAM doesn’t add anything.

    It used to be a thing before modern flash chips; you’d have battery-backed DRAM PCIe cards.
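
    For scale, a quick sketch of the standard PCIe payload rates that cap a consumer x4 SSD (published spec figures, after encoding overhead):

    ```python
    # Standard PCIe per-lane payload rates (GB/s) after encoding overhead.
    per_lane_gbps = {"PCIe 3.0": 0.985, "PCIe 4.0": 1.969, "PCIe 5.0": 3.938}
    for gen, gbps in per_lane_gbps.items():
        print(f"{gen} x4 link ceiling: {gbps * 4:5.1f} GB/s")
    ```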

    • MHLoppy@fedia.io · 19 days ago

      > Using RAM doesn’t add anything.

      It would improve access latency vs flash though, even if the difference in raw bandwidth is smaller.
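
      Ballpark figures for a small random read (typical published numbers, not measurements of specific parts) that illustrate the gap:

      ```python
      # Rough order-of-magnitude latencies for a small random read.
      # Typical published ballparks, not benchmarks of specific hardware.
      latency_ns = {
          "local DDR5 DRAM":           100,     # ~100 ns
          "CXL/PCIe-attached DRAM":    350,     # ~250-400 ns
          "NVMe TLC flash, 4K random": 60_000,  # ~20-100 us
      }
      for medium, ns in latency_ns.items():
          print(f"{medium:>27}: {ns / 1000:8.2f} us")
      ```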

    • 𞋴𝛂𝛋𝛆@lemmy.world (OP) · 19 days ago

      It adds an additional memory controller on a different bus, and effectively unlimited read/write cycling for loading much larger AI models.

      • Shadow@lemmy.ca · 19 days ago

        Memory connected to the CPU via the PCIe bus would be too slow for application use like that.

        Apple had to use soldered-in RAM for their unified memory because the trace lengths on the motherboard need to be so tightly controlled. PCIe is way too slow comparatively.

        • 𞋴𝛂𝛋𝛆@lemmy.world (OP) · 19 days ago

          Not at all. An NVMe drive already works, as I clearly stated in the post. The speed is irrelevant with very large models: they are MoEs, so they get loaded and moved around in large blocks once per inference. The only issue is cycling an NVMe drive. It would still work; it would just be nice to not worry about the limited write-cycle life. I am setting up agentic toolsets where models will get loaded and offloaded a lot. I already do this regularly with 40-50GB models, and I want to double or quadruple that amount.
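
          As a rough illustration of how a drive’s write-endurance budget plays out (the rating and workload below are made-up numbers, and model loads are reads, which wear flash far less than writes):

          ```python
          # Illustrative TBW (terabytes-written) budget arithmetic. The
          # rating and workload are assumptions, not a real drive's specs.
          tbw_rating_tb  = 1200   # e.g. a 2 TB TLC drive rated for 1200 TBW
          writes_per_day = 20     # hypothetical model-sized writes per day
          write_size_gb  = 50     # roughly a 40-50 GB model

          gb_per_day = writes_per_day * write_size_gb          # 1000 GB/day
          years = tbw_rating_tb * 1000 / gb_per_day / 365
          print(f"{gb_per_day} GB/day -> ~{years:.1f} years to rated TBW")
          ```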

        • MHLoppy@fedia.io · 19 days ago

          > Memory connected to the CPU via the PCIe bus would be too slow for application use like that.

          https://www.intel.com/content/www/us/en/content-details/842211/optimizing-system-memory-bandwidth-with-micron-cxl-memory-expansion-modules-on-intel-xeon-6-processors.html

          > The experimental results presented in this paper demonstrate that Micron’s CZ122 CXL memory modules used in software level ratio based weighted interleave configuration significantly enhance memory bandwidth for HPC and AI workloads when used on systems with Intel’s 6th Generation Xeon processors.

          Found via Wendell: YouTube

          edit: typo
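
          A minimal sketch (mine, not from the paper) of what that weighted-interleave configuration looks like on Linux, assuming a 6.9+ kernel with MPOL_WEIGHTED_INTERLEAVE support; the node numbers and the 3:1 ratio are placeholders, not measured values:

          ```python
          # Set per-NUMA-node weights for Linux's weighted page interleave,
          # so allocations split between local DRAM (node 0) and a CXL
          # expander (node 1) in a bandwidth-based ratio. Requires root;
          # sysfs paths come from the 6.9+ weighted-interleave feature.
          from pathlib import Path

          SYSFS = Path("/sys/kernel/mm/mempolicy/weighted_interleave")

          def set_node_weight(node: int, weight: int) -> None:
              (SYSFS / f"node{node}").write_text(str(weight))

          if __name__ == "__main__":
              set_node_weight(0, 3)  # 3 pages on local DRAM...
              set_node_weight(1, 1)  # ...per 1 page on the CXL node
              # A workload then opts in via
              # set_mempolicy(MPOL_WEIGHTED_INTERLEAVE, ...) or
              # `numactl -w 0,1 <cmd>` on recent numactl releases.
          ```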