• Ironfacebuster@lemmy.world
    6 months ago

    Rockstar making GTA Online be like: “Computer, here is a 512 MB JSON file, please download it from the server and then do nothing with it”

  • jballs@sh.itjust.works
    6 months ago

    I have the same problem with XML too. Notepad++ has a plugin that can format a 50 MB XML file in a few seconds. But my current client won’t allow plugins to be installed. So I have to use VS Code, which chokes on anything bigger than what I could do myself manually if I was determined.
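
    One plugin-free option, as a minimal sketch: Python's standard library can pretty-print XML outside the editor. `pretty_xml` is a hypothetical helper name, and `minidom` loads the whole document into memory, so how it copes with a 50 MB file is untested here.

```python
from xml.dom import minidom


def pretty_xml(xml_text: str, indent: str = "  ") -> str:
    """Return xml_text re-indented with nested elements on their own lines."""
    document = minidom.parseString(xml_text)
    return document.toprettyxml(indent=indent)


sample = "<root><item id='1'>a</item><item id='2'>b</item></root>"
formatted = pretty_xml(sample)
print(formatted)
```

    For JSON, `python -m json.tool input.json` does the equivalent job from the command line.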

  • BaardFigur@lemmy.world
    6 months ago

    I’ve never had any problems with 4,2 MB (and bigger) JSON files. What languages/libraries/editors choke on it?

  • Xyloph@lemmy.ca
    6 months ago

    That is sometimes the issue when your code editor is a disguised web browser 😅

  • AusatKeyboardPremi@lemmy.world
    6 months ago

    Given that it is the CPU limiting the parsing of the file, I wonder how a GPU-based editor like Zed would handle it.

    Been wanting to test out the editor ever since it was partially open sourced, but I am too lazy to get around to doing it.

    • icesentry@lemmy.ca
      6 months ago

      That’s not how this works. GPUs are fast because the kind of work they do is embarrassingly parallel and they have hundreds of cores. Loading a JSON file is not something that can be trivially parallelized. Also, Zed uses the GPU for rendering, not for reading files.
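
      A small sketch of why splitting the input is not trivial: whether a `{` is structural depends on whether the scanner is currently inside a string, and that depends on everything before it. `count_structural_open_braces` is an illustrative name, not from any library.

```python
def count_structural_open_braces(text: str) -> int:
    """Count '{' characters that open an object, ignoring ones inside strings."""
    in_string = False
    escaped = False
    count = 0
    for ch in text:
        if in_string:
            if escaped:
                escaped = False      # this character was escaped; skip it
            elif ch == "\\":
                escaped = True       # the next character is escaped
            elif ch == '"':
                in_string = False    # closing quote
        elif ch == '"':
            in_string = True         # opening quote
        elif ch == "{":
            count += 1
    return count


doc = '{"a": "}{ \\" {", "b": {"c": 1}}'
naive = doc.count("{")                       # counts braces inside the string too
structural = count_structural_open_braces(doc)
print(naive, structural)
```

      A worker handed only the second half of the file has no way to know its starting `in_string` state without information from the first half, which is the dependency that gets in the way of naive chunking.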

      • You999@sh.itjust.works
        6 months ago

        I’d like to point out, for those who aren’t in the weeds of silicon architecture, that ‘embarrassingly parallel’ is a type of computational workload. It’s just named that because the solution was an embarrassingly easy one.

        • Kevin@programming.dev
          6 months ago

          Huh, I was about to correct you on the use of embarrassment in that the intent was to mean a large amount, but it seems a Wiki edit reverted it to your meaning a year ago, thanks for making me check!

    • agelord@lemmy.world
      6 months ago

      As far as my understanding goes, Zed uses the GPU only for rendering things on screen. And from what I’ve heard, most editors do that. I don’t understand why Zed uses that as a key marketing point.

  • Skull giver@popplesburger.hilciferous.nl
    6 months ago

    Fifty million polygons processed by over 7 thousand processing cores (Intel iGPU), versus 4 million tokens processed by a single execution unit (with some instruction reordering trickery).

    • AdrianTheFrog@lemmy.world
      6 months ago

      Doesn’t a 3070 have fewer than 7k cores? A UHD 750 (relatively recent iGPU) only has 256.

      And I don’t know the structure of JSON that well, but can’t tokens be made of multiple chars?

      • Skull giver@popplesburger.hilciferous.nl
        6 months ago

        You’re right, I looked up the highest Intel GPU count but forgot that they released desktop cards. Intel iGPUs “only” have 768 cores, it’s the Ampere cards that have thousands of cores.

        JSON is UTF-8, so it can theoretically be up to four bytes per character. Depends on the language you’re processing, I guess.
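
        For reference, a quick check of UTF-8 byte lengths, plus the point that a JSON token is a separate unit from a character (the literal `true` is one token spanning four characters). The sample characters are arbitrary picks.

```python
# One sample character from each UTF-8 length class.
samples = {
    "a": "a",                 # ASCII letter
    "e-acute": "\u00e9",      # Latin-1 range
    "euro": "\u20ac",         # elsewhere in the Basic Multilingual Plane
    "emoji": "\U0001F605",    # outside the Basic Multilingual Plane
}
byte_lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}
print(byte_lengths)

# A single JSON token can span several characters, and so several bytes.
token = "true"
token_bytes = len(token.encode("utf-8"))
print(token_bytes)
```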

      • vvvvv@lemmy.world
        6 months ago

        106 Gbps

        They get to this result on 0.6 MB of data (paper, page 5)

        They even say:

        Moreover, there is no need to evaluate our design with datasets larger than the ones we have used; we achieve steady state performance with our datasets

        This requires an explanation. I do see the need: if you promise 100 Gbps, you need to process at least a few Tb.
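
        A back-of-the-envelope sketch of how short that measurement is, assuming decimal units (1 MB = 10**6 bytes, 1 Gb = 10**9 bits) and that the 0.6 MB sample is processed once at the quoted rate:

```python
SAMPLE_BYTES = 0.6e6        # 0.6 MB, assuming 1 MB = 10**6 bytes
RATE_BITS_PER_S = 106e9     # 106 Gbps, assuming 1 Gb = 10**9 bits

seconds = SAMPLE_BYTES * 8 / RATE_BITS_PER_S
microseconds = seconds * 1e6
print(f"{microseconds:.1f} microseconds")
```

        Under those assumptions the whole sample goes through in roughly 45 microseconds.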

        • neatchee@lemmy.world
          6 months ago

          Imagine you have a car powered by a nuclear reactor with enough fuel to last 100 years and a stable output of energy. Then you put it on a 5 mile road that is comprised of the same 250 small segments in various configurations, but you know for a fact that it starts and ends at the same elevation. You also know that this car gains exactly as much performance going downhill as it loses going uphill.

          You set the car driving and determine that it takes 15 minutes to travel 5 miles. You reconfigure the road, same rules, and do it again. Same result, 15 minutes. You do this again and again and again and always get 15 minutes.

          Do you need to test the car on a 20 mile road of the same configuration to know that it goes 20mph?

          JSON is a text-based, uncompressed format. It has very strict rules and a limited number of data types and structures. Further, it cannot contain computational logic on its own. The contents can be interpreted after being read to extract logic, but the JSON itself cannot change its own computational complexity. As such, it’s simple to express every possible form and complexity a JSON object can take within just 0.6 MB of data. And once they know they can process that file in however-the-fuck-many microseconds, they can extrapolate to Gbps from there.

          • vvvvv@lemmy.world
            6 months ago

            Based on your analogy, they drive the car for 7.5 inches (614.4 Kb by 63360 inches by 20 divided by 103179878.4 Kb) and promise, based on that, that the car travels at 20 mph. That might be true, yes, but the scale disproportion is too considerable to not require tests. This is not maths, this is a real physical device; how it would behave on larger real data remains to be seen.
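
            The 7.5 inch figure can be reproduced directly. The constants below are the numbers from the parenthesis above, taken as given; the variable names are only labels for this sketch.

```python
SAMPLE_KB = 614.4            # the 0.6 MB sample (0.6 * 1024)
REFERENCE_KB = 103_179_878.4 # denominator used above, taken as given
INCHES_PER_MILE = 63_360
SPEED_MPH = 20

# Scale an hour of driving at 20 mph by the sample's share of the reference size.
inches_driven = SAMPLE_KB * INCHES_PER_MILE * SPEED_MPH / REFERENCE_KB
print(f"{inches_driven:.2f} inches")
```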

        • frezik@midwest.social
          6 months ago

          No, that’s not what RISC is about. There were some early attempts to keep the number of instructions low–originally, ARM didn’t have a multiply instruction, and there’s still a bunch of microcontrollers you can buy that don’t have a divide instruction–but it was quickly abandoned as it’s just not that useful. It only holds back instructions that optimize common cases. Your compiler can implement multiplication by doing addition in a loop, but that’s not very efficient.

          What really worked about it was keeping memory access separate from computation. You don’t have an ADD instruction that can fetch from either registers or main memory. You have a MOV instruction that can fetch from memory into a register, and you have an ADD instruction that can work on registers.

          ARM still does this just fine.
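
          A toy sketch of that separation, not modelled on any real ISA: the `LOAD`/`STORE`/`ADD` mnemonics, the tuple encoding, and the `run` helper are all made up for illustration. Only `LOAD` and `STORE` touch memory; `ADD` sees registers only.

```python
def run(program, memory):
    """Execute a toy load-store program; returns (registers, memory)."""
    regs = {}
    for op, *args in program:
        if op == "LOAD":        # LOAD rd, addr : memory -> register
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "STORE":     # STORE rs, addr : register -> memory
            rs, addr = args
            memory[addr] = regs[rs]
        elif op == "ADD":       # ADD rd, ra, rb : operands are registers only
            rd, ra, rb = args
            regs[rd] = regs[ra] + regs[rb]
        else:
            raise ValueError(f"unknown op {op!r}")
    return regs, memory


# memory[2] = memory[0] + memory[1], spelled out as load, load, add, store
program = [
    ("LOAD", "r1", 0),
    ("LOAD", "r2", 1),
    ("ADD", "r3", "r1", "r2"),
    ("STORE", "r3", 2),
]
regs, memory = run(program, [40, 2, 0])
print(memory)
```

          A register-memory design would allow the same work in fewer instructions by letting `ADD` take a memory operand; the load-store version pays in instruction count for a simpler pipeline.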

          • DumbAceDragon@sh.itjust.works
            6 months ago

            I’m a computer engineering major (still a student tbf), I’m well aware of the difference between CISC and RISC, I was making a joke.

            Also, I understand your point, but you should know though that a load-store architecture and a RISC instruction set are not the same thing. The vast majority of RISC ISAs are load-store, but not all load-store architectures are RISC.

            • frezik@midwest.social
              6 months ago

              http://www.quadibloc.com/arch/sriscint.htm

              The RISC architecture contains several common elements. Some of them are no longer present in most chips that still call themselves RISC:

              • All instructions execute in a single cycle.
              • Floating-point operations, specifically, are therefore excluded.

              But most of the defining characteristics of RISC do remain in force:

              • All instructions occupy the same amount of space in memory.
              • Only load, store, and jump instructions directly address memory. Calculations are performed only between operands in registers.

              https://groups.google.com/g/comp.arch/c/IZP5KUJprHw?pli=1

              MOST RISCs:
              3a) Have 1 size of instruction in an instruction stream
              3b) And that size is 4 bytes
              3c) Have a handful (1-4) addressing modes) (* it is VERY hard to count these things; will discuss later).
              3d) Have NO indirect addressing in any form (i.e., where you need one memory access to get the address of another operand in memory)
              4a) Have NO operations that combine load/store with arithmetic, i.e., like add from memory, or add to memory. (note: this means especially avoiding operations that use the value of a load as input to an ALU operation, especially when that operation can cause an exception. Loads/stores with address modification can often be OK as they don’t have some of the bad effects)
              4b) Have no more than 1 memory-addressed operand per instruction
              5a) Do NOT support arbitrary alignment of data for loads/stores
              5b) Use an MMU for a data address no more than once per instruction
              6a) Have >=5 bits per integer register specifier
              6b) Have >= 4 bits per FP register specifier

              Note that none of this has to do with reducing the number of instructions, which is what people tend to think of when they hear the name.

              • barsoap@lemm.ee
                6 months ago

                All instructions occupy the same amount of space in memory.

                Both ARM and RISC-V have compressed instructions. Dunno how ARM works, but with RISC-V the 16-bit instruction set is freely interspersable with the 32-bit one, whose instructions also get their alignment reduced to 16 bits. Gets like 95% of the space reduction possible with full variable-width instructions without overcomplicating the insn decoder.

                As to addressing and loads and arithmetic: No such instructions, but all CPUs except the tiniest ones are expected to do macro-op fusion for things like indexed loads. Here’s an overview.

                The MMU thing… well the vector extension can do gather/scatter, I guess it could stay within the letter of “use the MMU once” but definitely not the spirit.
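
                A minimal sketch of why mixing 16-bit and 32-bit RISC-V instructions stays cheap to decode: the low bits of the first halfword give the length. `rvc_insn_length` is a hypothetical helper, and it only covers the standard 16-bit and 32-bit encodings.

```python
def rvc_insn_length(first_halfword: int) -> int:
    """Length in bytes of a RISC-V instruction, given its first 16 bits.

    Covers only the standard 16-bit and 32-bit encodings.
    """
    if (first_halfword & 0b11) != 0b11:
        return 2          # compressed (C extension) instruction
    if (first_halfword & 0b11100) != 0b11100:
        return 4          # standard 32-bit instruction
    raise ValueError("longer-than-32-bit encoding, not handled in this sketch")


C_NOP = 0x0001            # c.nop
NOP = 0x00000013          # addi x0, x0, 0

print(rvc_insn_length(C_NOP), rvc_insn_length(NOP & 0xFFFF))
```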

        • ChaoticNeutralCzech@feddit.de
          6 months ago

          The website title says “Arm Developer”, not “ARM Developer”, in a clearly non-acronym way so it’s a guide for making prosthetic hardware. Of course you want a cyborg arm to parse JS natively, why else even get one?

        • barsoap@lemm.ee
          6 months ago

          Nope it’s still a register-register op, that’s very much load-store architecture.

          It’s reduced, not minimalist, otherwise every RISC CPU out there would only have one instruction like decrement and branch if nonzero. RISC-V would not have an extension mechanism. The instruction exists because it makes things faster because you don’t have to do manual bit-fiddling over 10 instructions to achieve a thing already-existing ALU logic can do in a single cycle. A thing that isn’t even JavaScript-specific (or terribly relevant to JSON), it’s a specific float to int cast with specific rounding and overflow mode. Would it be more palatable to your tastes if the CPU were to do macro-op fusion on 10(!) instructions to get the same result?
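
          To make "specific rounding and overflow mode" concrete, here is a sketch of the conversion in software. It assumes the cast in question follows ECMAScript's ToInt32 (which is my reading of the instruction being discussed, presumably Arm's FJCVTZS); `js_to_int32` is an illustrative name.

```python
import math


def js_to_int32(x: float) -> int:
    """Convert a double to a signed 32-bit int the way ECMAScript's ToInt32 does."""
    if math.isnan(x) or math.isinf(x):
        return 0                      # NaN and infinities map to 0
    n = math.trunc(x)                 # round toward zero
    n &= 0xFFFF_FFFF                  # wrap modulo 2**32 instead of saturating
    if n >= 0x8000_0000:
        n -= 0x1_0000_0000            # reinterpret the top half as negative
    return n


for value in (3.7, -3.7, float(2**32 + 5), float(2**31)):
    print(value, js_to_int32(value))
```

          The wrap-around on overflow is the part that differs from the usual saturating float-to-int conversion, which is why doing it without a dedicated instruction takes extra fix-up steps.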

      • ramble81@lemm.ee
        6 months ago

        My thoughts on software in general over the past 20 years. So many programs are inefficiently written, and in 4th level languages, which just eats up any CPU/memory gain. (Less a soapbox and more a curious what-if about how fast things would be if we still wrote highly optimized programs.)

        • raspberriesareyummy@lemmy.world
          6 months ago

          I fully concur. There’s tons of really inefficient software out there that wastes resources just because for a long time, available resources grew fast enough to just keep using more of them without the net speed of an application slowing down. If we didn’t have so many lazy SW devs, I suspect the reduction in needed CPU cycles would have a measurable positive effect on climate change.

        • masterspace@lemmy.ca
          6 months ago

          Answer: there’d be far less software in the world, it would all be more archaic and less useful, and our phones and laptops would just sit at 2% utilization most of the time.

          There’s an opportunity cost to everything, including fussing over whether that value can be stored as an int instead of a double to save 4 bytes of space. High level languages let developers express their feature and business logic faster, with fewer bugs, and much lower ongoing maintenance costs.