Automatic Speech Recognition in Linux - Seeking Experiences and Recommendations

njordomir@lemmy.world · 2 months ago


Maxy@lemmy.blahaj.zone · 2 months ago

I’ve had good experiences with whisper.cpp (should be in the AUR). I used the large model on my GPU (3060), and it filled 11.5 of the 12 GB of VRAM, so you might have to settle for a smaller model. The speed was pretty much real time on my GPU, so it might be quite a bit slower on your CPU, unless the smaller models are also a lot faster (I never tested them because I didn’t need to).

The large model had pretty much perfect accuracy (only 5 or so mistakes in ~40 pages of transcriptions), and that was with Dutch audio recorded on a smartphone. If it can handle my pretty horrible conditions, your audio should (hopefully) be no problem to transcribe.

njordomir@lemmy.world · 2 months ago

I used the base model and it ran at a very acceptable speed with CPU only. Decent accuracy considering the recording was mediocre quality at best. Thank you for the suggestion.
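
For anyone finding this thread later, here is a rough sketch of the CPU-only base-model workflow described above, wrapped in Python. It is not the exact setup either poster used. It assumes ffmpeg is installed, that the whisper.cpp binary is on your PATH as `whisper-cli` (older builds call it `main`), and that the model sits at `models/ggml-base.bin`; adjust those to match your install. The helper names are made up for illustration.

```python
import os
import subprocess
from pathlib import Path


def ffmpeg_cmd(src, wav):
    """Build an ffmpeg command that converts src to 16 kHz mono 16-bit PCM WAV,
    the input format whisper.cpp has traditionally expected."""
    return [
        "ffmpeg", "-y", "-i", str(src),
        "-ar", "16000",       # resample to 16 kHz
        "-ac", "1",           # downmix to mono
        "-c:a", "pcm_s16le",  # 16-bit PCM
        str(wav),
    ]


def whisper_cmd(wav, model="models/ggml-base.bin", language="en",
                binary="whisper-cli", threads=None):
    """Build a whisper.cpp command line.

    The binary name and model path are assumptions; change them to fit
    your installation.
    """
    threads = threads or os.cpu_count() or 4
    return [
        binary,
        "-m", str(model),    # ggml model file
        "-f", str(wav),      # input audio
        "-l", language,      # spoken language, e.g. "nl" for Dutch
        "-t", str(threads),  # number of CPU threads
        "-otxt",             # also write the transcript to a text file
    ]


def transcribe(src, **kwargs):
    """Convert src to WAV, then run whisper.cpp on the result."""
    src = Path(src)
    wav = src.with_name(src.stem + "_16k.wav")
    subprocess.run(ffmpeg_cmd(src, wav), check=True)
    subprocess.run(whisper_cmd(wav, **kwargs), check=True)
```

Something like `transcribe("interview.m4a", language="nl")` would then handle a Dutch smartphone recording. Pointing `model` at a larger ggml file trades speed for accuracy, as discussed above.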