Sips'@slrpnk.net to Selfhosted@lemmy.world, English · 1 year ago
Can't relate at all. (slrpnk.net)
209 comments
brucethemoose@lemmy.world · 1 year ago
Depends which 14B. Arcee's 14B SuperNova Medius model (which is a Qwen 2.5 with some training distilled from larger models) is really incredible, but old Llama 2-based 13B models are awful.

Hackworth@lemmy.world · 1 year ago
I'll try it out! It's been a hot minute, and it seems like there are new options all the time.

brucethemoose@lemmy.world · 1 year ago
Try a new quantization as well! Like an IQ4-M, depending on the size of your GPU, or even better, a 4.5bpw exl2 if you can manage to set up TabbyAPI.
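For anyone curious what "setting up TabbyAPI" for an exl2 quant actually involves: it is mostly editing its config.yml before launching the server. A minimal sketch, assuming the stock config.yml layout; the model directory name below is a placeholder for whichever exl2 quant you downloaded, and the exact keys may differ between TabbyAPI versions:

```yaml
# Minimal TabbyAPI config.yml sketch (keys assumed from the stock example
# config; model_name is a placeholder, not a real repo).
network:
  host: 127.0.0.1
  port: 5000

model:
  model_dir: models                          # folder holding exl2 quant directories
  model_name: SuperNova-Medius-4.5bpw-exl2   # placeholder directory name
  max_seq_len: 8192                          # lower this if you run out of VRAM
```

The bpw (bits per weight) figure is what brucethemoose is referring to: 4.5bpw is roughly comparable in size to an IQ4-M GGUF, but exl2 tends to run faster on a single GPU.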