

I'm currently running a quant of Qwen 35B A3B (Qwen3.6-35B-A3B-GGUF:UD_Q4_K_XL) with Opencode and llama.cpp. I’m getting useful work out of it, but of course it’s not Claude. My hardware is a 5060Ti with 16GB VRAM, and roughly 20GB of system memory gets used as well.
It’s important to put boundaries on less capable models though, so I also use two Opencode plugins that really make a big difference to the results: @tarquinen/opencode-dcp@latest and superpowers+https://github.com/obra/superpowers.git.
I want to work in small steps with good control over what the models do, so it’s not very similar to what you describe, where you just let them run away for half an hour and do everything.












The first batch of the new Jolla phone is coming in the summer. I have high hopes.