00:00:00Are AI agents getting weaker or are they just working with bad information?
00:00:03The main problem with agents is their context.
00:00:06It's not that agents don't have information or can't remember things,
00:00:09but that they aren't grounded in a controlled source of truth.
00:00:12Working from bad information is why they perform so poorly.
00:00:15Now you might know about Google's NotebookLM,
00:00:18a tool that excels at research and can also generate podcasts.
00:00:22But what if it were much more than that?
00:00:23Our team tried to take this research tool and test it from various angles
00:00:27to find a way to make it fit into our development workflows
00:00:30and honestly, we didn't expect it to fit in that well.
00:00:32Throughout the video, our team used NotebookLM through its CLI tool.
00:00:36It's an interface to the product that gives you full control
00:00:39over managing your notebooks, sources, and audio overviews generated from those sources.
00:00:44The installation is straightforward, just one command and it was done.
00:00:47Now once it's installed, you can verify the installation by running the help command.
00:00:51This shows all the available commands for managing NotebookLM sources,
00:00:56handling multimodal inputs, and everything else you can do with the tool.
00:01:00But before using it, authenticate the CLI with your Google account using the nlm auth command.
00:01:05Once you run it, a Chrome window opens and you sign in.
00:01:08After that, nlm saves your credentials for future use.
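As a rough terminal sketch (the video only names the auth command explicitly; treat the rest as placeholders and check the tool's README for the exact interface):

```shell
# Install the CLI with the one-line command from the tool's README,
# then verify the installation and list available commands:
nlm --help

# Authenticate with your Google account (opens a Chrome window
# and saves credentials for future runs):
nlm auth
```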
00:01:11NotebookLM can be accessed through both a CLI and an MCP server,
00:01:15both built by the same developer, so you can use whichever you prefer.
00:01:18We chose the CLI because it's token-efficient
00:01:21and holds up well on long-horizon tasks.
00:01:24We can use NotebookLM as a second brain for AI agents
00:01:27by giving it information about the codebase and letting the agent document things as it goes.
00:01:31Now to do this, we added instructions to the CLAUDE.md file
00:01:35stating that all project knowledge, architectural decisions,
00:01:38and other documentation should live in the notebook.
00:01:41That notebook became the single source of truth.
00:01:43We had Claude create the notebook using the CLI tool and save its ID in CLAUDE.md.
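As a sketch, the CLAUDE.md instructions might look something like this (the wording, notebook ID placeholder, and reliance on the nlm CLI are illustrative, not the exact file from the video):

```markdown
## Project knowledge base (NotebookLM)

- All project knowledge, architectural decisions, and documentation
  live in the NotebookLM notebook below. It is the single source of truth.
- Notebook ID: <your-notebook-id>
- Query the notebook with the nlm CLI before searching the codebase or the web.
- After a feature is implemented and the build passes, update the
  notebook with the implementation details and the decisions made.
```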
00:01:49So when we were working on a feature for the app, we used plan mode to plan it out first.
00:01:53After implementation, when the build passed,
00:01:55it updated the notebook with the feature implementation as instructed.
00:01:59The notebook it created contained all of the decisions Claude made along the way.
00:02:03Setting this up as a second brain means Claude doesn't need to search through piles of documents on its own,
00:02:08reading them by pattern matching and bloating its context with irrelevant information.
00:02:12Instead, it relies on NotebookLM's RAG capabilities to get exactly what it needs.
00:02:16So Claude gets synthesized answers from Gemini, not raw dumps,
00:02:20and can focus more on development and implementation.
00:02:23You can also share the notebook with anyone,
00:02:25and they can use NotebookLM's capabilities to make sure the implementation is on par with what they need,
00:02:31even if they're non-technical, letting them understand the technical details at their own pace.
00:02:35NotebookLM is designed for research across multiple sources.
00:02:39Since we already use Claude Code a lot for research,
00:02:42we provided the research topic we were working on and asked Claude to find the sources,
00:02:47create a new notebook and upload them there.
00:02:49It identified all the sources and uploaded them to the notebook it had created for this task.
00:02:53Research with Claude alone takes up a lot of context because it also looks through links it later identifies as unrelated.
00:02:59Splitting the research into two parts and letting a tool designed for the job handle it saved both time and tokens.
00:03:05Once the sources were in the notebook, we cleared the context so the research was no longer in it
00:03:11and asked Claude to look up the information on NotebookLM using the CLI,
00:03:15find the notebook with the RAG pipeline research, and get the key findings from it through the NotebookLM chat.
00:03:20Claude used the CLI tool to fetch the notebooks, sent a chat message to get the key findings and returned the output.
00:03:26This happened much faster than normal Claude research.
00:03:29And the benefit we get out of using the notebook is that if we want more information from the same research,
00:03:34we can go back to the notebook because the sources are saved in it.
00:03:37So Claude doesn't have to search for them again, because the research now lives externally.
00:03:41If we were doing it with Claude alone, we wouldn't be able to refer back to the sources
00:03:45unless we repeated the research and Claude found and queried them all over again.
00:03:49But this allows us to reuse them in future runs.
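The two-step research flow can be sketched as a pair of CLI calls (subcommand names like create, add, and chat here are assumptions, as are the URL and ID placeholders; substitute the tool's real commands):

```shell
# Step 1: create a notebook for the topic and add the sources
# Claude found (hypothetical subcommands):
nlm create "RAG Pipeline Research"
nlm add <notebook-id> https://example.com/rag-survey

# Step 2: later, with a cleared context, pull synthesized findings
# instead of re-reading every source:
nlm chat <notebook-id> "What are the key findings on RAG pipelines?"
```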
00:03:52Understanding a codebase you didn't write is one of the toughest parts of development work.
00:03:57To simplify that, we used NotebookLM here as well.
00:04:00For this, we asked Claude to clone the repository using the GitHub CLI.
00:04:04Once the repo had been cloned, we asked it to use Repomix to generate a document for the repo.
00:04:09Now, Repomix is a tool that packs a codebase into an AI-friendly format.
00:04:14There's a web interface that converts code into documents in multiple formats,
00:04:18which AI can use to understand the codebase easily in a token-efficient manner,
00:04:23but we used the Repomix CLI instead.
00:04:25We installed it using npm,
00:04:26and once that was done, the Repomix CLI was available globally.
00:04:29So we asked Claude to create a notebook on NotebookLM using the CLI tool
00:04:34and add the formatted document as a source for that notebook.
00:04:37Once it had cloned the repo, it used the Repomix CLI to convert the code into a token-efficient document,
00:04:44then created a new notebook and added the source in TXT format.
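The clone-pack-upload flow might look like this (the repository path, output flags, and nlm subcommands are illustrative; check `repomix --help` and the nlm README for the real options):

```shell
# Clone the repo and pack it into one AI-friendly file:
gh repo clone owner/repo && cd repo
npx repomix --style plain -o codebase.txt   # flags may differ by version

# Upload the packed file as a notebook source (hypothetical subcommands):
nlm create "Codebase Atlas"
nlm add <notebook-id> codebase.txt
```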
00:04:47Now that the source had been added,
00:04:49we asked Claude to use the notebook tools to visualize the codebase
00:04:52and create diagrams that would help us understand what's in it.
00:04:56It ran a series of visualization commands.
00:04:58And once the diagrams were completed, we could view them in the studio on NotebookLM.
00:05:03It created an atlas that acts as a guide for the project's key workings.
00:05:07It created a proper mind map for each aspect of the app
00:05:09and allowed us to chat about each individually.
00:05:12There were also infographics created where we could see the different aspects visualized,
00:05:16making it easier to understand the codebase visually
00:05:19instead of relying on textual responses from Claude.
00:05:21Now, before we move forward, let's hear a word from our sponsor, Make,
00:05:25the platform that empowers teams to realize their full potential
00:05:28by building and accelerating their business with AI.
00:05:31We all know the biggest risk with autonomous agents is the black box problem.
00:05:35You deploy them, but you can't verify their decisions.
00:05:37Make has solved this, combining AI-assisted no-code capabilities
00:05:41with over 3,000 pre-built applications to give you a true glass box approach.
00:05:46For this video, I'm using their pre-built market research analyst agent
00:05:49to show how you can finally scale with control.
00:05:52Alongside powerful tools like MakeGrid, MCP, and advanced analytics,
00:05:56the game changer here is the reasoning panel.
00:05:58It lets you watch the agent's logic step by step,
00:06:01ground its responses using the knowledge feature,
00:06:03and debug live with the chat tool directly in the canvas.
00:06:06It's the transparency developers have been waiting for.
00:06:09Stop guessing and start scaling with control.
00:06:11Click the link in the pinned comment to experience the new Make agents today.
00:06:15Whenever AI hits an issue that's not in its knowledge base,
00:06:18it uses web searches and narrows down resources to find a solution.
00:06:22So we wondered whether we could skip the web searches entirely
00:06:25and replace it with a knowledge base.
00:06:27The problem with web search is that Claude pulls in a bunch of sources,
00:06:30but only a few of them actually matter.
00:06:32The rest just waste tokens.
00:06:33So we asked Claude to create a new notebook on NotebookLM
00:06:37and add sources from documentation, communities,
00:06:40and solutions across platforms
00:06:41that could make this notebook a go-to place for debugging.
00:06:44It created the notebook and started looking for sources to add.
00:06:48By the end, the notebook had official documentation,
00:06:50community forums, GitHub repos, blogs, and other relevant references
00:06:55that could act as a knowledge base for debugging-related issues.
00:06:58We added the notebook's ID to the CLAUDE.md file
00:07:01and told Claude to use it as the source for all the debugging issues it might face.
00:07:05We also added the instruction that whenever it hits a bug,
00:07:08it should rely on the notebook first before searching the web.
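In CLAUDE.md, that instruction might be as simple as this (the wording and ID placeholder are illustrative):

```markdown
## Debugging knowledge base

- Debugging notebook ID: <debug-notebook-id>
- When you hit a bug or an unfamiliar error, query this notebook first
  with a specific question; it contains official documentation,
  community forums, GitHub repos, and blog posts.
- Only fall back to web search if the notebook has no answer.
```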
00:07:11With this in place, whenever it came across an error,
00:07:13for example, the deprecated middleware it had used in the project,
00:07:16it handled it differently.
00:07:18Had it resolved this normally,
00:07:19it would have first fetched the documentation and then used it to fix the issue.
00:07:23Instead, it queried the notebook with a specific question
00:07:26about how to migrate to the latest proxy,
00:07:28getting a structured response back from the notebook alone
00:07:31instead of fetching results from across the web.
00:07:33Now, this CLAUDE.md, along with all the other resources,
00:07:36is available in AI Labs Pro.
00:07:38For those who don't know, it's our recently launched community
00:07:41where you get ready-to-use templates, prompts,
00:07:43and all the commands and skills you can plug directly into your projects,
00:07:47from this video and all previous ones.
00:07:49If you've found value in what we do and want to support the channel,
00:07:52this is the best way to do it.
00:07:53Links in the description.
00:07:55We always start the AI development process by writing documentation,
00:08:02so we thought about pushing those documents to NotebookLM as well.
00:08:02When we were working on an application,
00:08:04we created documents and once they were ready,
00:08:06we asked Claude to create another notebook on NotebookLM
00:08:09and push all the documents as sources for that notebook.
00:08:12So it created a notebook and added all of the documents as sources on NotebookLM.
00:08:16Once we had these sources, they became organized and reliable,
00:08:19helping Claude understand things about the project.
00:08:21And if we're working with non-technical people,
00:08:24we can just share this notebook and let anyone with access chat with it
00:08:27and understand things on their own.
00:08:28And this notebook doesn't only help Claude.
00:08:30If you're using other tools like Cursor or Gemini CLI,
00:08:34or if anyone else is building alongside you,
00:08:36this notebook can work as a knowledge base for them as well.
00:08:39Because with the notebook chat,
00:08:40each agent can get information that's specific to what they need
00:08:44instead of relying on file tools to search through files.
00:08:46This way, Claude or any other agent can just query the notebook through the nlm CLI,
00:08:51ask for what's relevant to the task at hand,
00:08:53and build its context from that.
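Because the interface is just a CLI, any agent that can run shell commands can pull context the same way (the chat subcommand name and the question are illustrative):

```shell
# Any agent (Claude Code, Cursor, Gemini CLI, ...) asks for exactly
# what it needs instead of grepping through project files:
nlm chat <notebook-id> "How does the payment retry logic work?"
```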
00:08:55Also, if you are enjoying our content, consider pressing the hype button
00:08:58because it helps us create more content like this
00:09:00and reach more people.
00:09:02Now, we already saw how we can use it to onboard ourselves onto a codebase,
00:09:06but we wanted to see if those same visualizations could help agents too.
00:09:10So we asked Claude to create another notebook
00:09:12and create visualizations that would help the agent find its way around the code.
00:09:16So it created a notebook, added several sources to NotebookLM,
00:09:20generated mind maps, infographics, and data tables,
00:09:22and downloaded them into the visualizations folder in the project.
00:09:25It had several formats for the agent's understanding,
00:09:28including tables in CSV and Markdown files,
00:09:30and it also contained JSON files for the mind maps.
00:09:33So it created mind maps for all of these features;
00:09:36these were the ones it had exported as JSON files.
00:09:40It also created a full slide deck to aid visual understanding.
00:09:43Whenever it ran into anything it needed to check,
00:09:46it consulted the respective mind map instead of crawling through the file system,
00:09:50found the exact flow there, and queried the notebook for what it needed.
00:09:54Similarly, it checked endpoints and analyzed flows
00:09:56by querying the notebook and the JSON-exported mind maps
00:10:00instead of navigating around the codebase to do it.
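The video doesn't show the schema of the exported mind-map JSON, but assuming a simple nested node structure (a hypothetical stand-in, not NotebookLM's actual export format), an agent-side lookup could be as small as a depth-first search:

```python
# Hypothetical mind-map structure; the real export schema from
# NotebookLM may differ. In practice you'd json.load the exported file.
mind_map = {
    "title": "Auth Feature",
    "children": [
        {"title": "Login Flow", "children": [
            {"title": "POST /api/login", "children": []},
            {"title": "Session cookie", "children": []},
        ]},
        {"title": "Password Reset", "children": []},
    ],
}

def find_node(node, query):
    """Depth-first search for the first node whose title contains query."""
    if query.lower() in node["title"].lower():
        return node
    for child in node.get("children", []):
        found = find_node(child, query)
        if found:
            return found
    return None

print(find_node(mind_map, "login")["title"])  # -> Login Flow
```

The point of the structure is exactly what the video describes: the agent locates the relevant flow in the map first, then asks the notebook a targeted question about it.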
00:10:03Another way we can use NotebookLM
00:10:05is for handling all the security-related issues we commonly face
00:10:08with AI-generated websites, by grounding them in proper sources.
00:10:12So we asked Claude to create a notebook using the CLI tool
00:10:15and add feature specs and all the relevant sources related to security.
00:10:19The purpose of this notebook is to act as a security handbook for Claude
00:10:22so that whenever it runs into any issues, it can refer to this for help.
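The corresponding CLAUDE.md entry might look like this (wording and ID placeholder are illustrative):

```markdown
## Security handbook

- Security notebook ID: <security-notebook-id>
- Sources: OWASP cheat sheets, GitHub security advisories for our
  tech stack, CVE databases, and our own feature specs.
- For any security question or audit, ground the answer in this
  notebook before making changes.
```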
00:10:26It created the notebook and added all the sources.
00:10:28It included custom security guides and cheat sheets from OWASP,
00:10:32GitHub security advisories for the tech stack we're using,
00:10:35CVE databases, and other resources needed to ensure the security of the app.
00:10:39The notebook it created had 61 sources, all in different files,
00:10:43containing security advisories from several sources.
00:10:45Using this, when we asked Claude to do a quick security check,
00:10:49it used the handbook, generated a security report,
00:10:51and identified several issues of varying severity,
00:10:54like a floating-point error in transaction handling
00:10:58that could be severe for high-value transactions.
00:11:00It was able to do this because the check was grounded in research from NotebookLM.
00:11:04That brings us to the end of this video.
00:11:06If you'd like to support the channel and help us keep making videos like this,
00:11:10you can do so by using the super thanks button below.
00:11:13As always, thank you for watching and I'll see you in the next one.