Read To Me - epub to TTS mp3s

Wed March 29 2023 by Christopher Aedo

For a long time I've wanted to start knitting, but I never end up putting in the time to learn and practice. One thing that held me back was knowing I couldn't do anything else during all the hours I would spend getting the hang of it. Recently I realized it would be a perfect time to listen to audiobooks! Then almost immediately I found that the audio version of the book I was reading had a long wait at the library, and I didn't want to spend $15+ for a different version of a book I had already purchased. What if I could make my own audiobook version?

I had heard some pretty amazing speech synthesisers that used machine learning to develop the voices. I was especially impressed with Coqui AI TTS, so I started playing around with the sample voices that were already developed. I did look into building my own voice model, but that takes a fair bit of effort. After listening to text read by the included VITS models, I thought this could really work out well.

I threw together a simple Python script, epub2tts, and pointed it at an epub I had. Initially I ran into lots of little problems that were pretty easy to sort out. For instance, some chapters were just too long and would cause Coqui to crash, so I picked a size that I knew was consistently causing issues and split any chapter that reached it into a new "chapter". Other than a few other minor tweaks, there wasn't much left to do before it was working reliably.
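
The chapter-splitting idea can be sketched roughly like this. This is only an illustration, not the actual code from epub2tts: the function name `split_chapter`, the `MAX_CHARS` value, and the choice to break at sentence ends are all assumptions made for the example.

```python
import re

# Hypothetical limit: the size that really causes trouble depends on the
# TTS model and is not stated in this post.
MAX_CHARS = 10000


def split_chapter(text, max_chars=MAX_CHARS):
    """Split text into parts of at most max_chars characters.

    Breaks fall at sentence ends where possible, so the audio does not
    stop mid-sentence.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    parts = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                parts.append(current)
                current = ""
            parts.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would exceed the limit: start a new part.
            parts.append(current)
            current = sentence
    if current:
        parts.append(current)
    return parts
```

Each returned part could then be handed to the speech synthesiser as its own "chapter" and saved as a separate audio file.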

I'm really happy with the result, and find it does sound great. Of course it's not at the level of having a real human read, but it's far better than I expected. It's been really easy to listen to and forget that it was all entirely computer generated.

Also, it's made practicing knitting even more fun!