• brucethemoose@lemmy.world
    13 hours ago

    Not just them. GLM, Qwen, Kimi, StepFun, Baidu’s models. Z-Image. Small finetuners, Huawei’s prototype. There’s even a Chinese fast food chain that trains a ridiculously good audio/text mixed model (LongCat).

    I actually thought the recent DeepSeek preview was a little underwhelming and “deep fried” compared to the competition, though maybe it’s just underbaked. And the architecture is interesting.

    Gemma is great, too, if Google would actually unrestrain it and give it Gemini’s architecture.

    Europe is struggling, though. Mistral (and everyone else) basically can’t do anything because the EU left regulation ambiguous. However strictly they regulate AI (and it should be pretty strict), anything is better than “we have no idea if we’ll get litigated, the law is clear as mud and might change.” They have at least one communal training project too, but everything I’ve seen is weirdly dated, architecture-wise, like they’re living two years in the past.