Article summary
Lately, I spend a lot of time traveling to work, school, etc. I’m strapped for free time, and my reading list (lots of technical articles) continues to grow and gather dust. When I heard about NotebookLM, a Google Labs project with a feature that transforms your notes into a podcast, I decided to experiment by turning some of my to-read list into a podcast queue.
How I Did It
My use case: online technical articles covering a broad range of computer science topics, with general code examples to illustrate a point.
Generating the Notebook
NotebookLM is a free, experimental project you can use if you have a Google account. Its AI-generated audio overview is one feature among many. Since my purposes were pretty straightforward, I ignored the others and focused on this one.
You start by adding a new notebook, choosing how you want to upload your notes (paste, link, etc.), customizing the podcast (optional), and then clicking generate. For my purposes, I used minimal customization so I could see how NotebookLM handles the content unprompted. With additional prompting, especially when the source contains code, you can achieve a semi-tailored experience.
Creating a Podcast List with Overcast
Once the notebook is generated, you can download the resulting .wav file. To compile these .wav files into a playlist, I uploaded them to Overcast.fm in a new playlist dedicated to my custom podcasts. I should note here that I’m using Overcast’s premium subscription for the upload capability (a whopping $10 / year — it’s worth it). And, voilà, I have a custom playlist of select blogs and articles I haven’t had time to read yet.
Thoughts on the Outcome
I tested NotebookLM on three sources:
- Source code (basic React component with a useEffect) pictured below.
- Elmish documentation interspersed with code snippets.
- A GeeksforGeeks article that discusses the difference between MQTT and HTTP (no code involved).
Reading Code
useEffect Audio
I don’t have too many complaints about the reading and dissecting of code. I gave it simple examples that are well documented online, so it didn’t struggle much. NotebookLM did stumble over the name userId, which likely means it needs to be trained on a dataset that covers camelCase conventions and their pronunciations.
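For reference, the component was along these lines. This is a representative sketch rather than the exact snippet from the screenshot; the prop name, endpoint, and state handling here are placeholders I’m using to show the shape of it.

```tsx
import { useEffect, useState } from "react";

// A minimal component with a useEffect keyed on userId --
// the kind of simple, well-documented example I fed NotebookLM.
function UserGreeting({ userId }: { userId: string }) {
  const [name, setName] = useState<string | null>(null);

  useEffect(() => {
    let cancelled = false;

    // Fetch the user's profile whenever userId changes.
    fetch(`/api/users/${userId}`)
      .then((res) => res.json())
      .then((user) => {
        if (!cancelled) setName(user.name);
      });

    // Cleanup guards against setting state after unmount.
    return () => {
      cancelled = true;
    };
  }, [userId]);

  return <p>{name ? `Hello, ${name}!` : "Loading…"}</p>;
}

export default UserGreeting;
```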
I also hoped for a real deep dive (maybe a mention of the virtual DOM in the React example), though I appreciated the analogies (my bad; I didn’t specify how deep). The podcast felt contrived when the hosts seemingly congratulated themselves on their own analogy.
Overall, I like this better than expecting a text reader like Instapaper to read my code. I felt that the hosts’ back-and-forth while reading the code helped energize the analysis. It allowed for natural mental pauses where the hosts would explain a keyword (like async), and the analogies used made sense.
Adding Context
F# Audio
In a different notebook comparing F# pipeline operators and composition operators, I noticed that the hosts talked about performance, which was definitely not mentioned in the article and wasn’t prompted. To me, that suggests it was trained on datasets that included concerns over performance. Does it mention it to fill the space or to seem more conversational? I’m unsure, but I’m not opposed to it taking leaps when I gave it zero prompting, as long as those leaps aren’t hallucinations. My main gripes here are the inability to read identifiers letter by letter (“msg types” sometimes came out as “mg types”, and “ofMsg” became “ofFSG”) and bouts of superfluous language.
Redundancy Department Department
MQTT vs HTTP
It covered the topic well. However, about halfway through, the podcast turned into a repetitive nightmare. It began repeating itself and every topic it had covered previously, leaving me, the listener, feeling like the podcast was completely off the rails. Note to self: make sure to add DRY to the podcast customization.
Note: I had to trim this because it was 22 minutes long, hence the abrupt ending. Listening past the 10-minute mark, however, is useless anyway. For those interested in the power word I used to trim the .wav file: ffmpeg -i MQTTHTTP.wav -t 960 MQTTvsHTTP.wav (-t 960 keeps the first 960 seconds, i.e. 16 minutes).
Voice
I’m an avid reader, and I can be moved and motivated by a writer’s voice. One thing I noticed immediately after transforming a co-worker’s blog into a podcast was how much I didn’t want to hear other voices discussing the topic. The internalized voice of an author can energize the content and make it more meaningful. Transforming an article into a podcast does little to capture the voice of the writer. If that’s not important to you, the podcast should be fine. For me, I wished I hadn’t made the notebook and had instead read the post. So this isn’t an argument against NotebookLM, but against the medium. If you value the voice of the author or the originality of a human podcast host, consider leaving the piece on the to-read list, or find or request a related podcast.
Final Thoughts
Take a good look at this AI-generated image. The prompt was “Mortal Kombat cats.” For context, this was for funsies, displayed during a DJ set through an audio visualizer. See anything missing? This picture embodies a lot of the feelings I have about interacting with LLMs and image or audio generation: they produce content quickly, approximately. I liken it to the demon cat in Adventure Time with “approximate knowledge of many things.” It can get so many details right and still be so incredibly wrong. In the case of Mortal Kombat cats, I don’t care about the missing limbs, because the overall aesthetic is delightful. If I had wanted this to be completely accurate, it would’ve failed.
In that vein, I would say the above podcasts delivered. Does it feel a bit contrived? Yeah, but the conversational quality, the use of analogies, and the ability to provide more detail on a well-documented topic are what I was looking for. I like that the hosts sometimes stuttered or broke a sentence the way a real host would. Did I roll my eyes when the host declared their analogy the “perfect analogy”? Yeah. Am I resoundingly annoyed when they use the phrase “deep dive” for everything and then continue to use the word “deep” throughout the podcast? Sure, but I could customize it to not say that. Overall, I got approximately what I was hoping for.