I was pleasantly surprised by many models of the Deepseek family. Verbose, but in a good way? At least that was my experience. Love to see it mentioned here.
Mistral seems to be the popular choice. I think it’s the most open-source-friendly of the bunch. I will keep function calling in mind as I design some of our models! Thanks for bringing that up.
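For anyone else reading along, here's roughly what a function-calling request looks like. This is a sketch of the common JSON-schema tool format that OpenAI-compatible chat APIs use (Mistral's function calling follows a similar shape) — the field names, model name, and `get_weather` tool here are illustrative assumptions, not any vendor's exact spec:

```python
import json

# Hypothetical tool definition in the JSON-schema style used by
# OpenAI-compatible chat APIs. Mistral's function calling uses a
# similar shape; treat field names here as an assumption, not a spec.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

# A request payload the model would see; "mistral-small" is a placeholder.
request = {
    "model": "mistral-small",
    "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
    "tools": [get_weather_tool],
}

print(json.dumps(request, indent=2))
```

The model then replies with a structured tool call (name plus JSON arguments) instead of free text, and your code runs the function and feeds the result back.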
After finally having a chance to test some of the new Llama-2 models, I think you’re right. There’s still some work to be done to get them tuned up… I’m going to dust off some of my notes and get a new index of those other popular gen-1 models out there later this week.
I’m very curious to try out some of these Docker images, too. Thanks for sharing those! I’ll check them out when I can. I could also make a post about them if you feel like featuring some of your work. Just let me know!
What sort of tokens per second are you seeing with your hardware? Mind sharing some notes on what you’re running there? Super curious!
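For comparison, here's roughly how I time throughput on my end. This is just a sketch: `generate` is a stand-in for whatever callable your runtime exposes (llama.cpp bindings, HF `model.generate`, etc.), not any specific library's API, and the dummy generator exists only so the snippet runs on its own:

```python
import time

def tokens_per_second(generate, prompt, n_tokens=128):
    """Rough throughput measure: time a fixed-length generation.

    `generate` is whatever callable your runtime exposes -- a
    stand-in here, not a specific library's API.
    """
    start = time.perf_counter()
    tokens = generate(prompt, n_tokens)
    elapsed = time.perf_counter() - start
    return len(tokens) / elapsed

# Dummy generator so the sketch runs standalone; swap in your model.
def fake_generate(prompt, n):
    return ["tok"] * n

rate = tokens_per_second(fake_generate, "hello", n_tokens=64)
print(f"{rate:.1f} tok/s")
```

One caveat: the first generation after loading is usually slower (warm-up, cache population), so I throw away the first run and average a few after that.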