• xthexder@l.sw0.com
    7 hours ago

    I think part of the difference is the amount of output being measured. Maybe a single statement has a 10% chance of being wrong, but over the course of a whole response the likelihood of at least one incorrect statement goes up. After only 5 statements at 10% error each, and assuming the errors are independent, that’s roughly a 41% chance (1 − 0.9^5) that the response is wrong in some way.
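
    For anyone who wants to check the arithmetic, here is a rough Python sketch. It assumes every statement has the same error rate and that errors are independent, which real model output probably doesn't satisfy. The function name and the statement counts (1, 5, 12) are just picked for illustration.

```python
def chance_of_any_error(per_statement_error: float, statements: int) -> float:
    """Probability that at least one of `statements` statements is wrong.

    Assumes every statement has the same error rate and that errors are
    independent of each other (a simplification, not a measured fact).
    """
    # P(at least one wrong) = 1 - P(all correct)
    return 1.0 - (1.0 - per_statement_error) ** statements


if __name__ == "__main__":
    for n in (1, 5, 12):
        print(f"{n:>2} statements: {chance_of_any_error(0.10, n):.0%}")
    # prints:
    #  1 statements: 10%
    #  5 statements: 41%
    # 12 statements: 72%
```

    So the same 10% per-statement rate can show up as roughly 10%, 40%, or 70% depending on how much output gets counted as one "answer".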

    I don’t have any real numbers, just personal experience using AI for programming at work, and all of these numbers (10%, 40%, 70%) seem plausible depending on exactly what you’re measuring.