• silence7@slrpnk.net (OP)
    1 day ago

    You can produce a remarkably good estimate by reading CPU and GPU utilization from procfs, then profiling the power use of a handful of similar machines under similar utilization and workloads.

    Networking is less than 5% of power use for non-GPU loads, and probably an even smaller share for GPU loads.
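
A rough sketch of the utilization-based estimate described above, assuming a simple linear model between idle and full-load power. The `idle_watts` and `peak_watts` figures are hypothetical placeholders that would come from profiling similar machines; GPU utilization usually has to be read from a vendor tool and is left out here.

```python
import time


def read_cpu_times(path="/proc/stat"):
    """Return (busy, total) jiffies from the aggregate 'cpu' line."""
    with open(path) as f:
        fields = f.readline().split()
    values = [int(v) for v in fields[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    # Guest time is already counted in user/nice, so sum only the first 8 fields.
    total = sum(values[:8])
    return total - idle, total


def utilization(before, after):
    """Fraction of time the CPUs were busy between two (busy, total) samples."""
    busy = after[0] - before[0]
    total = after[1] - before[1]
    return busy / total if total > 0 else 0.0


def estimate_watts(util, idle_watts, peak_watts):
    """Linear interpolation between profiled idle and peak power (an assumption)."""
    util = min(max(util, 0.0), 1.0)
    return idle_watts + (peak_watts - idle_watts) * util


if __name__ == "__main__":
    first = read_cpu_times()
    time.sleep(1.0)
    second = read_cpu_times()
    util = utilization(first, second)
    # Hypothetical profiled values for this class of machine.
    print(f"utilization={util:.2f} est_power={estimate_watts(util, 120.0, 400.0):.0f} W")
```

Real machines are not perfectly linear in utilization, so the profiled curve from comparable hardware would replace `estimate_watts` in practice.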

    • ddh@lemmy.sdf.org
      1 day ago

      Sure, you can do that at an aggregate level, but then how do you divide it by customer? And even then, some setups will be more efficient than others, so you’d only get that setup’s usage.

      And even if you do that and can narrow it down to a single user and a single prompt, you can still only roughly predict how long it will think and how long the response will be.
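
For illustration only, one crude way to split an aggregate figure by customer is in proportion to metered GPU-seconds. The function name and the per-customer GPU-seconds input are hypothetical, and the split inherits the limits raised above: it reflects one setup's efficiency and says nothing in advance about how long a given prompt will run.

```python
def split_energy(total_kwh, gpu_seconds_by_customer):
    """Divide measured energy across customers by their share of GPU-seconds."""
    total_seconds = sum(gpu_seconds_by_customer.values())
    if total_seconds <= 0:
        # Nothing metered: attribute no energy rather than divide by zero.
        return {customer: 0.0 for customer in gpu_seconds_by_customer}
    return {
        customer: total_kwh * seconds / total_seconds
        for customer, seconds in gpu_seconds_by_customer.items()
    }


if __name__ == "__main__":
    # Hypothetical numbers: 10 kWh measured, two customers metered.
    print(split_energy(10.0, {"a": 30, "b": 10}))
```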