• Mojave@lemmy.world
    33 upvotes · 11 months ago

    DeepSeek claimed the model training took 2,788 thousand H800 GPU hours, which, at a cost of $2/GPU hour, comes out to a mere $5.576 million.

    That seems impossibly low.

    DeepSeek is clear that these costs are only for the final training run, and exclude all other expenses.
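
For what it's worth, the arithmetic behind the quoted figure does check out. Here is a quick sketch in Python using only the numbers from the quote. The function name is made up for illustration, and the $2/hour rate is the assumed rental price from the quote, not a measured cost.

```python
# Figures as quoted in the comment above.
GPU_HOURS = 2_788_000        # "2,788 thousand" H800 GPU hours
RATE_USD_PER_GPU_HOUR = 2    # assumed rental price per GPU hour, per the quote


def training_run_cost(gpu_hours: int, rate_usd: int) -> int:
    """Cost of a single training run: GPU hours times hourly rate."""
    return gpu_hours * rate_usd


cost = training_run_cost(GPU_HOURS, RATE_USD_PER_GPU_HOUR)
print(f"${cost / 1e6:.3f} million")  # $5.576 million
```

This only covers the final run at the assumed rate. Earlier experiments, failed runs, hardware purchases, and salaries would all have to be added on top.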

    • flamingo_pinyata@sopuli.xyz
      1 upvote · 11 months ago

      The claim is only that the final training run came out to about $5.6M. It doesn't include previous versions (there had to have been multiple attempts before the final one) or the salaries of the people working on it.