Memory prices tipped to fall as China starts flooding the market with DRAM and NAND chips

sanitation@lemmy.radio · 2 days ago

ImmersiveMatthew@sh.itjust.works · 16 hours ago

I am using llama.cpp with Qwen 3.6 27B MTP and a 64k context window on a 4090. OpenCode talks to it, and it in turn talks to the Unity game engine via MCP. I am getting 80 to 112 tokens/second, with about 90 on average, which is shocking to me as it really does feel as fast as cloud AI (well, faster for me, as I am in Vietnam and round trips to US data centers really add up in a session). The only real issue is that you pretty much have to one-shot prompts, as follow-up prompts will easily go over the context window size. If I cannot one-shot a prompt, then I use cloud AI, but that is very rare for my use case: maybe 1 in 50 or so, and only when the task touches a lot of large scripts and scenes.
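
To put the context limit in concrete terms, here is a rough Python sketch of checking whether a conversation still fits in a 64k window before sending a follow-up. It is only an illustration: the 4-characters-per-token ratio is a crude assumption rather than the model's real tokenizer, and `estimate_tokens`, `fits_in_context`, and the 4096-token reply reserve are made-up names and values, not part of llama.cpp or OpenCode.

```python
CONTEXT_WINDOW = 64 * 1024  # 64k-token window, as in the setup described above
CHARS_PER_TOKEN = 4         # assumption: crude average, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(messages: list[str], reply_reserve: int = 4096) -> bool:
    """True if all messages plus a reserved reply budget fit in the window."""
    used = sum(estimate_tokens(m) for m in messages)
    return used + reply_reserve <= CONTEXT_WINDOW
```

If `fits_in_context` returns False for the history plus the follow-up, that would be the point to start a fresh session or hand the task to a cloud model.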
