I’m looking at people running DeepSeek’s 671B R1 locally using NVMe drives at 2 tokens a second. So why not skip the flash and give me a 1TB drive using an NVMe controller with only DRAM behind it? The ultimate transistor count is lower on the silicon side. It would be slower than typical system memory but the same as any NVMe drive with a DRAM cache. The controller architecture for safe page writes in flash and the internal boost circuitry for pulsing each page are around the same complexity as the memory management unit and constant refresh of DRAM and its power stability requirements. Heck, DDR5 and up is putting power regulation on the system memory already IIRC. Anyone know why this is not a thing or where to get this thing?
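A rough back-of-envelope sketch of why the storage tier dominates the token rate: if every active weight has to be streamed from one tier for each token, tokens per second is capped by that tier's bandwidth divided by the bytes read per token. The 37B active-parameter figure is DeepSeek's published number for R1's mixture-of-experts design; the 4-bit quantization and both bandwidth figures are illustrative assumptions, not measurements.

```python
# Upper bound on decode speed when the active weights are streamed from a
# single memory/storage tier for every generated token.
# All numbers below are assumptions for illustration, not benchmarks.

ACTIVE_PARAMS = 37e9     # R1 activates roughly 37B of its 671B parameters per token
BYTES_PER_PARAM = 0.5    # assumed 4-bit quantization


def max_tokens_per_second(bandwidth_gb_s: float) -> float:
    """Bandwidth-bound ceiling: bytes per second divided by bytes per token."""
    bytes_per_token = ACTIVE_PARAMS * BYTES_PER_PARAM
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Assumed sequential-read bandwidths in GB/s for each tier.
tiers = {
    "PCIe 4.0 x4 NVMe": 7.0,
    "dual-channel DDR5": 80.0,
}

for name, bandwidth in tiers.items():
    print(f"{name}: at most {max_tokens_per_second(bandwidth):.2f} tokens/s")
```

Real setups keep hot weights cached in RAM and VRAM, so observed speeds can exceed the NVMe-only ceiling. Under these assumptions, a DRAM-backed drive sitting behind the same PCIe x4 link would likely hit the same bandwidth ceiling as a flash drive; its gains would mostly be in latency and write endurance.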

  • whaleross@lemmy.world
    19 days ago

    There are server motherboards that have slots for 1TB of RAM, if that would help? They are not cheap but maybe you could find one second hand.

    • 𞋴𝛂𝛋𝛆@lemmy.worldOP
      19 days ago

      Nearly 2 years ago, the cheapest option I found for a good GPU was a 2022 laptop with a 3080Ti. That got me a 16GB GPU because the mobile version has better specs than the discrete 3080. Unfortunately I max out at 64GB of addressable sysmem. I have dual NVMe drives though.

      Most people here probably don’t know about DeepSpeed and ZeRO-3. I looked it up to try and share the reference, but the DeepSpeed package has expanded so much in what it does that the functionality is obscured. I only know about it as an option in Oobabooga Textgen WebUI/llama.cpp. There, DeepSpeed is how I can offload larger models onto the NVMe drive when they do not fit across GPU and sysmem.
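For anyone trying to find the feature, here is a minimal sketch of what a ZeRO stage 3 configuration with parameter offload to NVMe can look like, built as a Python dict and printed as JSON. The key names follow DeepSpeed's config schema as I understand it; the path `/mnt/nvme0/ds_offload` is a hypothetical placeholder and every numeric value is an untuned assumption. This is not the configuration the WebUI generates, just an illustration of where the NVMe option lives.

```python
import json

# Sketch of a DeepSpeed config enabling ZeRO stage 3 with parameters
# offloaded to an NVMe drive. Path and sizes are placeholder assumptions.
ds_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_param": {
            "device": "nvme",                      # "cpu" would offload to sysmem instead
            "nvme_path": "/mnt/nvme0/ds_offload",  # hypothetical mount point
            "pin_memory": True,
            "buffer_count": 5,
            "buffer_size": 100_000_000,
            "max_in_cpu": 1_000_000_000,
        },
    },
    # Asynchronous I/O settings used for the NVMe reads and writes.
    "aio": {
        "block_size": 1048576,
        "queue_depth": 8,
        "thread_count": 1,
        "single_submit": False,
        "overlap_events": True,
    },
}

# Serialize to the JSON form DeepSpeed reads from a config file.
config_text = json.dumps(ds_config, indent=2)
print(config_text)
```

The printed JSON can be saved to a file and passed wherever a DeepSpeed config is expected. Buffer sizes and queue depth would need tuning for a specific drive.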

      • whaleross@lemmy.world
        19 days ago

        I’m not sure where your mind went while writing this, but my comment was a suggestion for a possible solution to your original question about massive sets of fast volatile memory for storage. Maybe you need to consider changing platform and adapt your project to what is accessible if you want to make this happen. Unless you can find that exotic device that may or may not exist in the first place and afford it. I mean, is running it on your existing laptop really the show stopper requirement?

        • 𞋴𝛂𝛋𝛆@lemmy.worldOP
          19 days ago

          What is your mindset, going from casual neutral conversation to personal negativity? This isn’t tech support. I know more than 90% of people here. I am sharing ideas because such information can be hard to find in search results. If such casual conversations offend you or you find it difficult to talk without making things personal, feel free to block me. I’m just some physically disabled guy in involuntary social isolation, and this is my only place for external human contact. I expect everyone to behave like they would in a public commons. I do not appreciate random negativity from strangers in a casual conversation.

          • surewhynotlem@lemmy.world
            19 days ago

            You’re coming off like a wacko. I’m being objective, not mean, since I have no stake in this conversation. Just FYI

          • whaleross@lemmy.world
            19 days ago

            What on earth are you on about? What negativity? I’m trying to suggest solutions and point out that maybe the solution is somewhere else than your initial idea?

            • 𞋴𝛂𝛋𝛆@lemmy.worldOP
              19 days ago

              "I’m not sure where your mind went while writing this"

              This is a negative and personal statement without any basis. It is rude to make any unsolicited personal inference about my thoughts.

              "Maybe you need to consider changing platform"

              I explained my personal constraints and methodologies, along with the packages that enable the workflow. You then wholly dismiss that without reason or understanding, maintain this position as if it is the only path, and repeat your original take with a dismissive attitude.

              You have no idea what I am thinking, and I did not invite your opinion on that. I told you what is possible and works, along with enough supporting information to find this information, or for anyone else to find such information if they read this. You responded to that with an unsolicited comment and dogma. I find that offensive.

    • Cocodapuf@lemmy.world
      19 days ago

      At this moment, you can buy consumer motherboards with 8 RAM slots. Filling them with 64GB DIMMs, you can reach half a TB. It’s definitely more affordable than server parts, but it’s still pricey. I guess you’re looking at a $1500 system rather than a $20,000 system.
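A quick sketch of whether half a TB is enough for the 671B model from the original post, assuming the weights dominate memory use. The quantization levels are illustrative assumptions, and the estimate ignores the KV cache, runtime overhead, and the difference between decimal and binary gigabytes.

```python
# Rough weight footprint of a 671B-parameter model at different precisions,
# compared against 8 DIMM slots filled with 64GB modules.
# Ignores KV cache, activations, and OS overhead (assumption).

TOTAL_PARAMS = 671e9


def weights_gb(bits_per_param: float) -> float:
    """Size of the weights alone, in decimal gigabytes."""
    return TOTAL_PARAMS * bits_per_param / 8 / 1e9


ram_gb = 8 * 64  # eight slots of 64GB each

for bits in (16, 8, 4):
    size = weights_gb(bits)
    verdict = "fits" if size < ram_gb else "does not fit"
    print(f"{bits}-bit weights: {size:.1f} GB, {verdict} in {ram_gb} GB")
```

By this estimate, only a quantization of roughly 4 bits per weight leaves room in 512GB; 8-bit weights alone would already exceed it.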