00:00:00 This is AnythingLLM, an open-source alternative to Google's NotebookLM.
00:00:04 It's a self-hosted AI workspace that lets you chat with your codebase, documents, and internal data.
00:00:10 Plus, it's completely private, and unlike most local LLM setups,
00:00:14 you don't need to stitch together Ollama, LangChain, a vector database, and some cheap UI just to make it usable.
00:00:22 Over the next few minutes, I'll show you exactly how it replaces that entire stack, and whether it's actually worth switching to.
00:00:30 So,
00:00:32 here's the real issue: local models are easy now, we get it, but the workflow isn't always that easy.
00:00:38 You've got Ollama running in one terminal, LangChain scripts in another, your vector database somewhere else, and a UI you just threw together temporarily.
00:00:47 Yes, it does work.
00:00:49 But AnythingLLM collapses all of that into one workspace: you get drag-and-drop RAG, a visual
00:00:56 no-code agent builder, a full developer API with an embed widget, and you can bring your own providers like Ollama, LM Studio, Groq, and
00:01:04 xAI. So we get fewer moving parts, which leads to faster shipping. If you enjoy this kind of content, with tools that speed up
00:01:11 your dev workflow, be sure to subscribe to the Better Stack channel. Now, let me run through this.
00:01:16 I'll just install the desktop app here.
00:01:18 Then I can connect my local Ollama instance, with LanceDB as the default vector database,
00:01:24 so there's nothing extra to configure here.
00:01:27 Now I'm just gonna drag in a Python repo and a PDF with documentation.
00:01:31 AnythingLLM will automatically chunk, embed, and index all of this for me.
00:01:36 Now I can ask "explain this FastAPI endpoint and cite the exact file," and it answers with citations pointing to the real file paths,
00:01:43 with all of this leading to fewer hallucinations.
00:01:47 Now I'll create a quick agent to summarize top Hacker News posts daily. I add the web search tool, and that's it.
00:01:54 One click, and there's none of that Docker Compose jargon we'd otherwise have to deal with.
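For context, here's a rough sketch of the kind of script that one-click agent replaces: fetching top story IDs from Hacker News' public Firebase API, pulling each item, and building a digest for a model to summarize. The HN API endpoints are real; the final summarization step is left as a comment, since it depends on whichever model provider you've connected.

```python
"""Manual equivalent of the one-click "summarize Hacker News" agent."""
import json
from urllib.request import urlopen

HN_API = "https://hacker-news.firebaseio.com/v0"

def fetch_json(url: str):
    # Small helper around urllib for the HN Firebase API.
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)

def format_story(story: dict) -> str:
    # One digest line per story: score, title, and link.
    return f"[{story.get('score', 0)} pts] {story['title']} - {story.get('url', 'n/a')}"

def top_stories_digest(limit: int = 5) -> str:
    # Top story IDs, then full items, joined into a plain-text digest.
    ids = fetch_json(f"{HN_API}/topstories.json")[:limit]
    stories = [fetch_json(f"{HN_API}/item/{i}.json") for i in ids]
    return "\n".join(format_story(s) for s in stories)

if __name__ == "__main__":
    try:
        digest = top_stories_digest()
        print(digest)
        # Next step the agent handles for you: send `digest` to your
        # connected LLM with a "summarize these posts" prompt.
    except OSError:
        print("(network unavailable -- run with internet access)")
```

The point of the comparison: everything above, plus scheduling it daily, is what collapses into one click in the agent builder.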
00:01:58 This is where it starts feeling like a productivity layer on top.
00:02:02 Workspaces are isolated projects, which means client work stays separate from your side project,
00:02:09 which in turn stays separate from your internal wiki. There's a full REST API, so you can embed private RAG into your own
00:02:16 SaaS or internal dashboards, and there's even a VS Code extension.
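As a minimal sketch of what embedding private RAG via that REST API can look like: the snippet below posts a question to a workspace's chat endpoint. The endpoint path, payload shape, and `textResponse` field are assumptions based on AnythingLLM's built-in API docs; verify them against the Swagger page your own instance exposes, and treat the base URL, API key, and workspace slug as hypothetical placeholders.

```python
"""Sketch: querying an AnythingLLM workspace from your own app."""
import json
from urllib.request import Request, urlopen

def chat_request(base_url: str, api_key: str, slug: str, message: str) -> Request:
    # Build an authenticated chat request against one workspace.
    # Path and payload are assumptions -- check your instance's API docs.
    url = f"{base_url}/api/v1/workspace/{slug}/chat"
    payload = json.dumps({"message": message, "mode": "chat"}).encode()
    return Request(
        url,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

def ask(base_url: str, api_key: str, slug: str, message: str) -> str:
    # Send the request and pull the model's answer out of the response.
    req = chat_request(base_url, api_key, slug, message)
    with urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    return body.get("textResponse", "")

if __name__ == "__main__":
    try:
        # Hypothetical local instance, key, and workspace slug.
        print(ask("http://localhost:3001", "MY-API-KEY", "docs", "Explain the auth flow"))
    except OSError:
        print("(no AnythingLLM instance reachable)")
```

Because the workspace already holds your indexed documents, your app gets grounded, cited answers without shipping any documents to a third party.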
00:02:20 This is great because with AnythingLLM you're not locked into some interface.
00:02:24 The visual agent builder lets you wire up tools like SQL queries, web search through SerpAPI, file operations, and even
00:02:32 MCP servers. And if you want more control, yeah,
00:02:34 you can still use LangChain inside an agent. LanceDB is the default vector store,
00:02:40 but you can switch to PGVector or Qdrant in one click.
00:02:43 There's also a drop-in chat widget you can embed into your own product, and you can switch model providers
00:02:50 mid-conversation without restarting or even re-indexing. So how is this any different from the other tools
00:02:55 we're already using, like NotebookLM or Open WebUI? That last one is great
00:03:00 if you mainly want an Ollama chat interface with plugins,
00:03:03 but AnythingLLM adds stronger built-in RAG, agent workspaces, and a desktop app.
00:03:08 You have PrivateGPT, which works well for simple document Q&A,
00:03:12 but AnythingLLM adds agents and a full API on top of that.
00:03:16 There's a tool called Dify that I spoke about in another video. Dify and Langflow are powerful if you love heavy visual workflows,
00:03:23 but they are really heavy overall. AnythingLLM
00:03:26 is lighter for document-heavy RAG use cases. LangChain gives us more flexibility, but you're building everything yourself.
00:03:33 Now let's talk about what devs actually like and don't like, based on going through X, Reddit, and other resources.
00:03:40 People consistently praise the API, because it makes embedding private RAG into real applications a lot easier.
00:03:46 The desktop version makes onboarding simpler than others: a new team member, if you have a team, could install, connect, and just
00:03:54 get started really quickly.
00:03:55 Plus, the ability to swap models mid-chat without breaking context is huge, and because it's open source we can
00:04:01 self-host it, which means you can demo to clients and others without worrying about your data leaving the environment. Now, on the downside:
00:04:09 RAG sometimes needs document pinning for perfect recall, and large collections, like 500 or more documents,
00:04:16 are gonna eat up RAM on smaller laptops. Agent flows can still feel a bit beta in edge cases,
00:04:22 so it's not going to be perfect. But for most real-world workflows, it's one of the least painful options we have right now,
00:04:28 especially being an open-source one. So, is this worth it? I mean, if you're building internal tools or client-facing private AI systems,
00:04:37 yeah, of course. Or if you want production-grade RAG without writing it all yourself,
00:04:41 this is gonna be great. If you need agents that actually ship, this is also a huge bonus:
00:04:46 we're not stitching everything together.
00:04:47 But if you require ultra fine-tuning for every single detail, or you prefer building everything from scratch with raw LangChain,
00:04:55 hey, that's fun,
00:04:56 I get it,
00:04:57 but this is not gonna be for you. If you're running on very low-end hardware and you need something extremely lightweight, again,
00:05:03 this is not going to be that. The desktop download and the repo are linked below.
00:05:07 If you enjoy these types of tools that speed up and change your workflow, be sure to subscribe to the Better Stack channel.
00:05:13 We'll see you in another video.