I am doing some coding experiments with AI. This one is now public, although I don’t know how long I will run it in this form. Anyway, I found it pretty useful for myself, so it might be useful for you as well.
Problem: I don’t have time to read
Lately, I have found myself unable to dedicate a lot of time to reading. When I can be in front of a screen, I am usually working, or I don’t have enough time to read longer texts. The only time I actually read is before sleeping, and that usually means a few paragraphs before I am in dreamland.
But I have plenty of time to listen. Podcasts and audiobooks are my thing: driving, loading the dishwasher, cooking and walking are all times when I can listen to some interesting content. But a lot of content exists only in written form.
I started solving this problem for my readers with my project rss2podcast, where I can publish my blogs, including this one, as a podcast. You can listen to this podcast by subscribing to Juraj’s blogs RSS feed. You can also search for “Juraj’s blogs” in your favourite podcasting app. It should already know about it. Or get it via Apple Podcasts or Spotify.
I have also done it for friends; for example, you can listen to the Liberation.travel newsletters.
But what about me? I also want to read custom blogs from the internet!
Solution: Your own private podcast
We have a great technology for delivering audio content with updates. It is based on RSS and it’s called Podcasts. There used to be RSS readers (well, there still are RSS readers, but we mostly use social networks now instead of them).
You would subscribe to interesting blogs and you would see new articles.
Combining RSS and audio files gives us the technology of podcasts. Good news is there are plenty of amazing apps to listen to podcasts, such as Pocket Casts, or my current favourite Fountain.fm.
You can subscribe to a private RSS feed.
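To make the idea concrete, here is a minimal sketch of what such a feed boils down to: an ordinary RSS document where each item carries an `enclosure` pointing at an audio file. This is only an illustration, not the code behind loaditfor.me; the function name `build_feed`, the episode fields and the example URLs are all made up for this sketch.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(title, link, episodes):
    """Return a minimal RSS 2.0 podcast feed as a string.

    `episodes` is a list of dicts with the (hypothetical) keys
    title, audio_url, size_bytes and published (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Articles converted to audio"

    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        # The audio URL doubles as a stable identifier for the episode.
        ET.SubElement(item, "guid").text = ep["audio_url"]
        # RSS expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(ep["published"])
        # The enclosure is what turns a plain RSS item into a podcast episode.
        ET.SubElement(
            item,
            "enclosure",
            url=ep["audio_url"],
            length=str(ep["size_bytes"]),
            type="audio/mpeg",
        )

    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(
        build_feed(
            "My private feed",
            "https://example.com/feed.xml",
            [
                {
                    "title": "Some article",
                    "audio_url": "https://example.com/audio/some-article.mp3",
                    "size_bytes": 1234567,
                    "published": datetime(2024, 1, 1, tzinfo=timezone.utc),
                }
            ],
        )
    )
```

A podcasting app polls this document, notices new `item` entries and downloads whatever the `enclosure` points at. Real feeds usually add more metadata (cover art, episode duration), but the enclosure is the essential part.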
How to use it
Go to loaditfor.me. Create an account: no e-mails or anything, just set a username and password and you are good to go. That means no spam and no fuss, but also no password resets.
Then just paste the URL of an article into the form:
Then take “Your personalized RSS feed URL” and add it to your podcasting app. Note that the first episode needs to be processed first, which can take a few minutes, and the feed only starts working after that. Give it an hour; I don’t have your e-mail, so I can’t notify you. Now you can add articles, they will be converted to podcast episodes and you can listen to them. Your podcasting app will remember which episodes you’ve heard and where you stopped, and many apps can sync that state across devices.
After you give it some time, this is how you add it to Fountain, for example:
Now you are reading blogs again! Enjoy!
Add to home screen
The app is also a progressive web app, so you can add it to your home screen and it will act like a native app. But basically, you just paste a URL and that’s it; everything else happens on the backend and in your podcasting app.
Technology behind it
This sounds pretty simple. So let me tell you what happens behind it:
- The page is downloaded and the main article text is extracted. This only works with article-based webpages (not PDFs!). I use the amazing trafilatura library.
- It is converted using a large language model (currently gemma2:27b through locally-run ollama) to a pronounceable, readable form. It’s not perfect, but it is usually better than passing the raw text straight to text to speech.
- Of course hallucinations are a problem, so I check that the output is not significantly longer than the input, and then a second LLM pass checks that the original and the rewritten form have the same meaning. If anything is awry, I just pass the text chunk verbatim to text to speech.
- I do text to speech using StyleTTS2 (my fork with some bugfixes). If you know of a better model without audio hallucinations that is easy to understand and that I can run locally (no APIs), let me know!
- I do speech to text using pywhispercpp to verify that the synthesized audio matches the input. If not, I change some settings and regenerate.
- I stitch the intro, page title, text chunks and outro together, generate the RSS feed and upload it to a server.
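The two safety nets in the steps above (fall back to the original text when the rewrite looks suspicious, and regenerate audio when the transcript does not match) can be sketched roughly like this. It is a simplified illustration and not the actual code: the thresholds `MAX_GROWTH` and `MIN_MATCH`, the word-level comparison with `difflib`, and the callables `same_meaning`, `synthesize` and `transcribe` are all assumptions standing in for the LLM, StyleTTS2 and pywhispercpp calls.

```python
import difflib
import re

MAX_GROWTH = 1.5  # assumed: a rewrite more than 50% longer counts as suspicious
MIN_MATCH = 0.9   # assumed: minimum word-level similarity for the transcript


def normalize(text):
    """Lowercase the text, drop punctuation and return the list of words."""
    return re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split()


def choose_text_for_tts(original, rewritten, same_meaning):
    """Return the rewritten chunk only if it passes both guards.

    `same_meaning` is a hypothetical callable wrapping the second LLM pass;
    it takes (original, rewritten) and returns True or False.
    """
    if len(rewritten) > MAX_GROWTH * len(original):
        return original  # much longer output: the model probably invented text
    if not same_meaning(original, rewritten):
        return original  # the checker disagrees: read the chunk verbatim
    return rewritten


def transcript_matches(expected, transcript, threshold=MIN_MATCH):
    """Compare what should have been said with what speech to text heard."""
    ratio = difflib.SequenceMatcher(
        None, normalize(expected), normalize(transcript)
    ).ratio()
    return ratio >= threshold


def synthesize_with_retry(text, synthesize, transcribe, settings_list):
    """Try each settings variant until the round trip matches the input.

    `synthesize(text, settings)` and `transcribe(audio)` are placeholders
    for the text to speech and speech to text steps. Returning None when
    every attempt fails is a choice made for this sketch.
    """
    for settings in settings_list:
        audio = synthesize(text, settings)
        if transcript_matches(text, transcribe(audio)):
            return audio
    return None
```

The nice property of this design is that every failure mode degrades to something safe: a bad rewrite falls back to the original chunk, and a bad synthesis is retried with different settings instead of ending up in the episode.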
I use all local models on my Mac Mini, so this does not use fancy GPUs. It is running, mostly on solar power, next to my infrared sauna. So be gentle, don’t hammer it too much, otherwise I would need to turn it on.
What’s the business plan?
So far it’s an experiment. If you find it valuable, you can contribute some value back in the value4value spirit. I make no guarantees that this will continue working, it might break completely tomorrow, but maybe some people find value in it now and use it while it’s there.
For me it has been mostly a learning experience, learning about the limits of current models. I have some more projects with AI coming up!
Head over to loaditfor.me and try it out!