You find a huge list of Markdown files from some awesome repo.

Those files contain dozens of links, and you’ve timeboxed your research on the topic.

You have two options:

  • Slow down time so you can carefully read all the links.
  • Delegate the task to NotebookLM.

But there’s a catch before you can delegate:

  • You first need to extract the URL list and possibly exclude some links (like GitHub repos).
  • Or you might only be interested in one kind of link, such as YouTube videos, and that’s fine too.

Here’s a CyberChef recipe (which I learned from my wife) to solve this problem:

Extract_URLs(false,false,false)
Filter('Line feed','github.com',true)
Filter('Line feed','youtube.com',true)
Sort('Line feed',false,'Alphabetical (case sensitive)')
Unique('Line feed',false)

The recipe:

  1. Extracts all links from Markdown.
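  2. Excludes GitHub and YouTube links (the final true in each Filter inverts the match; set it to false to keep only the matching links instead).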
  3. Sorts the links alphabetically.
  4. Removes duplicates.
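
If you would rather script the same four steps, here is a minimal Python sketch. It is an approximation, not a port of the recipe: the URL pattern, the extract_links name, and the plain substring matching for exclusions are my own assumptions, whereas CyberChef's Filter takes a regular expression. The sort order may also differ slightly from CyberChef's.

```python
import re

# Rough URL pattern (an assumption, not CyberChef's own): stops at whitespace
# and at characters that usually close a Markdown link or an autolink.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]\"']+")


def extract_links(markdown, excluded=("github.com", "youtube.com")):
    """Return sorted, de-duplicated URLs from Markdown text.

    URLs containing any of the `excluded` substrings are dropped.
    """
    # Step 1: extract candidate URLs and trim trailing sentence punctuation.
    urls = (match.rstrip(".,;:!?") for match in URL_PATTERN.findall(markdown))
    # Steps 2 and 4: drop excluded hosts; the set removes duplicates.
    kept = {url for url in urls if not any(part in url for part in excluded)}
    # Step 3: sort (case-sensitive code point order).
    return sorted(kept)


if __name__ == "__main__":
    sample = """
    - [Docs](https://example.org/docs)
    - [Repo](https://github.com/some/repo)
    - [Talk](https://www.youtube.com/watch?v=abc123)
    - [Docs again](https://example.org/docs)
    - Plain link: https://example.com/post.
    """
    for url in extract_links(sample):
        print(url)
```

To keep only YouTube links instead, you could swap the `not any(...)` test for `"youtube.com" in url`.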

Now you’re ready to give these links to NotebookLM to boost your research process.

Homework: Since some links may have disappeared, develop a simple shell script to exclude broken links and keep your notebook clean.

At the time of writing, there’s no built-in way to remove broken links from a NotebookLM notebook.
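
One possible starting point for the homework, sketched in Python rather than shell, using only the standard library. The is_reachable and keep_working_links names, the HEAD request, the 10-second timeout, and the User-Agent string are all illustrative assumptions. Some servers reject HEAD requests or block scripted clients, so a link reported as broken here may still open fine in a browser.

```python
import http.client
import urllib.request


def is_reachable(url, timeout=10):
    """Return True if the URL answers a HEAD request with a status below 400."""
    try:
        request = urllib.request.Request(
            url,
            method="HEAD",
            headers={"User-Agent": "link-check/0.1"},  # hypothetical agent string
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers URLError/HTTPError (4xx, 5xx), DNS failures and
        # timeouts; ValueError covers strings that are not valid URLs.
        return False


def keep_working_links(urls, check=is_reachable):
    """Return only the URLs for which `check` reports success, in order."""
    return [url for url in urls if check(url)]
```

Feed it the output of the recipe, one URL per line, for example `keep_working_links(open("links.txt").read().split())`, where links.txt is a hypothetical file holding the extracted list. The check parameter is there so you can swap in a different strategy (say, a GET fallback) without touching the filtering.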

Direct CyberChef links: