This Open Source Repo Just Solved Claude Code's #1 Problem

Chase AI
Computers/Software, Startups/Entrepreneurship, AI/Future Technology

Transcript

00:00:00Graphify just solved Claude Code's memory problem.
00:00:03It's able to take any repository and turn it into a wild knowledge graph,
00:00:06just like the one you see here.
00:00:08And in the process, it allows Claude Code to give you more accurate answers
00:00:12at a fraction of the token costs.
00:00:14It's able to do this by traversing your entire code base,
00:00:17mapping all the connections, and discerning the why behind the connections.
00:00:21And the best part is, it's also open source and completely free.
00:00:24And so today, I'm going to show you how you can get this working yourself
00:00:27and what's actually going on under the hood,
00:00:30so you can start leveraging it right away.
00:00:32So Graphify came out a couple months ago.
00:00:34It's at nearly 60,000 stars.
00:00:36And what it does is it allows your AI coding assistant,
00:00:39doesn't have to be Claude Code, but that's what we'll be using today,
00:00:41to map your entire project, code, docs, PDFs, images, and videos
00:00:45into a knowledge graph that you can query instead of grepping through the files.
00:00:49So we are able to take Graphify and point it at any sort of repo we want,
00:00:54and it creates this sort of knowledge graph.
00:00:55The reason we care about this is when we create a knowledge graph,
00:01:00it allows Claude Code to more easily answer questions about that repository
00:01:04because everything's already mapped out.
00:01:06It's very clear how A connects to B, how B connects to C,
00:01:09and why those connections matter.
00:01:11This is in contrast to grepping through files,
00:01:13which is how AI coding assistants like Claude Code normally work.
00:01:16Kind of a simplistic analogy, but it's as if it's just doing Control-F
00:01:19and trying to search for it versus having a clearly mapped out path of how everything's going, right?
00:01:25This gives Claude Code a map, while grepping through files doesn't at all.
00:01:29So because of that, it costs less tokens to get more accurate answers with something like Graphify.
00:01:35Now, how significant are those token savings?
00:01:37Well, some people are claiming up to 70x, which I found to be a little on the high side.
00:01:41And as you'll see when we demo it today,
00:01:42it's a bit lower than 70x, but still significant.
00:01:45So that's the why you should care.
00:01:47Now let's talk about how it actually works.
00:01:48How do we go from a code base to some sort of knowledge graph like this,
00:01:51which looks very, very similar to something like a graph RAG knowledge base.
00:01:56Are they the same?
00:01:56How does this relate to RAG?
00:01:57We'll talk about that.
00:01:58Well, the way it works is through three different passes.
00:02:00On the first pass, we are looking at the code structure,
00:02:03and this is completely free.
00:02:05Everything you see right here, this is just through pass one.
00:02:09This is deterministic.
00:02:10This isn't an AI doing a guessing game.
00:02:12It is literally going through the code itself and saying,
00:02:15this piece of code relates to this second piece of code.
00:02:18And that's literally how the code base is written out.
00:02:20These are established connections.
00:02:22As it says here, tree-sitter parses your code files and extracts classes,
00:02:26functions, imports, call graphs, and inline comments.
00:02:29This runs locally with no LLM involved.
00:02:31On pass number two, it's looking at video and audio,
00:02:34if those files exist at all.
00:02:36And if they do exist, they're going to be transcribed with faster-whisper.
00:02:39And so once they're actually broken down into text,
00:02:41they will also be injected into the knowledge graph.
00:02:44Lastly, it does a third pass on docs, papers, and images.
00:02:47So if your code base includes things that aren't true code,
00:02:50whether that's just like PDF files, documentations, images, whatever,
00:02:54this gets hit on pass number three.
00:02:56And this is where the large language model actually comes in
00:02:58and does some sort of like semantic analysis,
00:03:00aka what does this document actually mean
00:03:03and where should it fit in this larger knowledge graph?
00:03:06This third pass is kind of similar without true embedding
00:03:10to what a RAG system does.
00:03:12Once it does all that,
00:03:13it then begins to create the actual knowledge graph itself.
00:03:17It goes into a little bit more technical detail in here,
00:03:19but all you need to understand is it's going to create nodes,
00:03:23nodes, which are these little circles, right?
00:03:26Each one of these circles is a node.
00:03:28We then have edges, which are the line between two nodes,
00:03:33two things that are connected, and then communities.
00:03:35Communities are simply large groupings of nodes
00:03:38that are similar in nature.
00:03:39What you see here are 486 communities.
00:03:43So that's kind of the overview of how the data is actually extracted
00:03:46and turned into a graph.
00:03:47And remember, we care about turning into a graph
00:03:49because for all intents and purposes,
00:03:51it's a map for Claude Code,
00:03:52so it can more quickly answer questions.
00:03:54Now, you probably have a few questions at this point.
00:03:56One, what if there is no code structure?
00:03:58What if I'm pointing at a repository full of markdown files?
00:04:01It's just like a bunch of documents
00:04:02that I want to create a knowledge graph of
00:04:03and I don't want to go full RAG.
00:04:05Can I do that?
00:04:05Yes.
00:04:06In fact, you can actually turn it into an Obsidian vault
00:04:08through Graphify.
00:04:09We'll talk about that a little bit at the end.
00:04:11The second question you probably have is,
00:04:13yeah, this actually does look super similar
00:04:15to something like GraphRAG.
00:04:17What's actually the difference
00:04:18and when should I use one or the other?
00:04:21Well, the biggest difference between Graphify
00:04:23and a GraphRAG system like LightRAG
00:04:25or RAG-Anything or Microsoft GraphRAG
00:04:28is really going to be the embeddings, right?
00:04:29Graphify isn't using any embedding system whatsoever.
00:04:33The second biggest difference is the use cases.
00:04:35So Graphify is best and we get the most out of it
00:04:37when we're talking about code bases.
00:04:39But if we see some sort of huge repo,
00:04:40whether it's a new one or one we've been working on
00:04:42and we want to figure out how it's wired,
00:04:44Graphify is perfect for that.
00:04:46GraphRAG, on the other hand,
00:04:48is great for something that's more unstructured.
00:04:50Let's say you have tens of thousands of documents
00:04:52that are all PDF files or Markdown files
00:04:55and you just want to ask about them.
00:04:57You know, imagine they're all policy documents
00:04:58and you're asking like,
00:04:59what does the policy say about X, right?
00:05:01It could be anywhere amongst any of these documents.
00:05:04They aren't necessarily connected.
00:05:05It's very unstructured.
00:05:06That's where GraphRAG or really any RAG system shines.
00:05:09That being said, the division between those two here
00:05:13is kind of murky
00:05:14because like I mentioned on that third pass,
00:05:16we can kind of do that with Graphify.
00:05:18It's almost like a RAG-lite system in that sense.
00:05:21So that's what Graphify is,
00:05:22how it works and why you should care.
00:05:24Now let's talk about actually installing this thing
00:05:27and using it for real.
00:05:27But before we jump into that demo,
00:05:29a quick word from today's sponsor, me.
00:05:32So not too long ago,
00:05:33I released the Claude Code Masterclass
00:05:35and it is the number one way to go from zero to AI dev,
00:05:37no matter your technical background.
00:05:39This course gets updated weekly
00:05:40and it also includes additional masterclasses
00:05:43like the Codex Masterclass
00:05:45and the Cloud OS Masterclass.
00:05:48So if you're someone who wants to take this
00:05:49a little more seriously,
00:05:51definitely check it out.
00:05:52You can find it inside of Chase AI+.
00:05:53There is a link in the pinned comment.
00:05:55So installing Graphify is relatively simple.
00:05:58We have a few prerequisites
00:05:59as well as instructions for how to install it.
00:06:02If you're using Claude Code,
00:06:03I suggest you make it very easy on yourself.
00:06:06Just go to the Graphify GitHub link.
00:06:08I'll put that down below.
00:06:09Copy it, paste it into Claude Code
00:06:11and just tell it,
00:06:12hey, install Graphify for me.
00:06:14But if you want to do it manually,
00:06:15you can just follow the steps
00:06:16as they are laid out.
00:06:18And again, Graphify is platform agnostic
00:06:20and it works with any coding agent out there.
00:06:22And once you have Graphify installed,
00:06:23the next question becomes,
00:06:24okay, how do I use this?
00:06:25What are the commands?
00:06:27Well, there are quite a few commands
00:06:30and there's so many commands.
00:06:31In fact, you are not going to
00:06:32remember any of these.
00:06:33Luckily, when you install Graphify,
00:06:35it's going to come with a Graphify skill.
00:06:38The skill is going to teach Claude Code
00:06:39how to use Graphify
00:06:41and when it should use which commands
00:06:42depending on the natural language you use.
00:06:45So that being said,
00:06:47I suggest you take a look at the GitHub repo,
00:06:49somewhat familiarize yourself
00:06:50with what is possible
00:06:51because there is a lot.
00:06:52But understand,
00:06:53you don't have to have this memorized.
00:06:54Claude Code understands what to do.
00:06:56But there are a few
00:06:58we should be aware of.
00:06:59If I do forward slash Graphify,
00:07:00that's going to run the whole thing
00:07:02on whatever directory I'm currently on.
00:07:04There are also Graphify commands
00:07:05for querying the knowledge graph.
00:07:07So if I do Graphify query
00:07:09or Graphify explain,
00:07:10it's going to explicitly tell Claude Code
00:07:12or whatever coding agent you're using
00:07:13to, hey,
00:07:14take a look at the knowledge graph
00:07:16when you answer this question.
00:07:17Don't be lazy
00:07:17and just try to answer it on your own.
00:07:19Furthermore,
00:07:19we have commands
00:07:20to make sure it's always on.
00:07:21So if I do graphify claude install,
00:07:23that means it's always going
00:07:25to use Graphify
00:07:26to answer the questions.
00:07:27I don't have to be explicit.
00:07:28It literally becomes a hook.
00:07:29And there are some other
00:07:30interesting flags
00:07:31like the Obsidian flag,
00:07:32which will,
00:07:33with one command,
00:07:34create an entire Obsidian vault
00:07:35for you
00:07:36and fill it with
00:07:37whatever Graphify comes up with.
00:07:39But again,
00:07:40remember the skill is installed.
00:07:41So if you ever get confused
00:07:42about what makes sense,
00:07:43just ask Claude Code.
00:07:44It will understand.
00:07:45So now let's actually run this.
00:07:47For the demo,
00:07:47we are going to be pointing
00:07:49Claude Code at OpenDesign,
00:07:51which is a relatively large code base.
00:07:53If you've never used OpenDesign,
00:07:55it's essentially Claude Design,
00:07:57but open sourced.
00:07:59So I've cloned it on my machine
00:08:00and I'm going to open Claude Code
00:08:02inside that directory.
00:08:03So we're inside the directory
00:08:04and all I'm going to do
00:08:05is forward slash Graphify
00:08:07and then dot.
00:08:08It's now going to run Graphify
00:08:10on this entire folder.
00:08:12So after running for six minutes,
00:08:13this is what we got.
00:08:15It took a look at 203 files.
00:08:17We got 1,907 nodes,
00:08:203,447 edges in 109 communities
00:08:24and output tokens
00:08:25was just under 120K.
00:08:27So it lists the God nodes.
00:08:29The God nodes are pretty much
00:08:30like the most prominent nodes,
00:08:32the most prominent connections
00:08:33inside whatever it traversed.
00:08:36We have surprising connections
00:08:37that I didn't expect
00:08:39and suggested questions.
00:08:42So if we want to take a look
00:08:42at the graph,
00:08:43I can say,
00:08:44go ahead and bring up
00:08:47the graph for me.
00:08:49So here's a look
00:08:50at the knowledge graph
00:08:51it built
00:08:52and you can kind of see
00:08:53the communities there.
00:08:54It created 109 communities
00:08:56and that's really just
00:08:56all of these clusters.
00:08:58As we scroll in on them,
00:09:00we can see the nodes
00:09:01which are the actual dots
00:09:03and then the edges
00:09:05are the connections between them.
00:09:06When I click on the node,
00:09:07you can see over here
00:09:08on the top right,
00:09:10its type,
00:09:11so it's a code node,
00:09:12its community,
00:09:13its source,
00:09:14as well as its neighbors.
00:09:15But remember,
00:09:16as cool as this visualization is
00:09:17and it does look neat,
00:09:19the real value here
00:09:20isn't the knowledge graph.
00:09:21This is cool looking,
00:09:23but the actual value
00:09:24is the fact that
00:09:25now we have handed
00:09:26Claude Code a map
00:09:27to the OpenDesign repository
00:09:29and I can now ask questions
00:09:31about it
00:09:31and get accurate responses.
00:09:33So what we'll test now
00:09:34is we'll ask it a question
00:09:35about something to do
00:09:36with the repo
00:09:37and we're going to have it
00:09:38use Graphify,
00:09:39so have it actually
00:09:40use the knowledge graph
00:09:41and then we'll ask
00:09:42pretty much the same question
00:09:43not using Graphify,
00:09:44so just have it like
00:09:45grep the answer
00:09:46and we'll take a look
00:09:47at what the token difference
00:09:48looks like.
00:09:49So to take a look
00:09:49at the token difference
00:09:50with and without Graphify,
00:09:51we're going to ask
00:09:52the same question
00:09:53to Claude Code
00:09:54about the repo.
00:09:55The first one is
00:09:56trace how a design request
00:09:58flows from the web app
00:09:59to a coding agent
00:10:00and back.
00:10:00So we're trying to understand
00:10:01how this application
00:10:03actually works
00:10:03and in the first tab
00:10:04we're going to say
00:10:05use Graphify
00:10:06and in the second tab
00:10:07with the same question
00:10:08we're saying
00:10:09do not use Graphify.
00:10:10So we can see
00:10:11the Graphify skill
00:10:11being loaded right away
00:10:13and then we can see
00:10:14commands like
00:10:15graphify query
00:10:16asking the question
00:10:17we just gave Claude Code.
00:10:18Over here
00:10:19on the non-graphify side
00:10:20we see that Claude Code
00:10:21has spawned
00:10:22two explore agents
00:10:23to take a look
00:10:25at the code base
00:10:25and right off the rip
00:10:27we've already used
00:10:27100,000 tokens
00:10:28between them.
00:10:29Now in terms of
00:10:30the actual answers
00:10:30we got
00:10:31they were the same
00:10:32they both identified
00:10:32how this app
00:10:34actually works
00:10:35but with the
00:10:36non-graphify version
00:10:37we needed to run
00:10:38those explore agents
00:10:39so we were looking
00:10:40at about
00:10:40150,000 tokens
00:10:42give or take
00:10:43with the explore agents
00:10:44plus an additional
00:10:4550,000 tokens
00:10:46on the main session
00:10:47so you know
00:10:48about 200,000 tokens
00:10:50total
00:10:50versus over here
00:10:52on the Graphify version
00:10:54we only used
00:10:55about 80,000
00:10:58so about
00:10:5840%
00:11:00of the total cost
00:11:01of the non-Graphify version
00:11:02which is significant savings.
00:11:03Now since
00:11:04this non-graphify version
00:11:06has now sort of
00:11:07crawled through
00:11:08the repo itself
00:11:09if I ask additional questions
00:11:11the token cost
00:11:12won't be as far
00:11:13off
00:11:14however
00:11:14since we have
00:11:16the knowledge graph
00:11:16built
00:11:17whenever we want
00:11:18to ask questions
00:11:18about it
00:11:19via graphify
00:11:20well we're not
00:11:21going to have to
00:11:21deal with that
00:11:22token cost
00:11:22of going through
00:11:23it again and again
00:11:24and that kind of
00:11:25leans into the
00:11:26whole memory piece
00:11:26like we've built
00:11:27it out already
00:11:28we can always
00:11:28query it for cheap
00:11:29now the question
00:11:30then becomes
00:11:31if this is a
00:11:31living breathing repo
00:11:32what happens
00:11:33when we make
00:11:34updates to the repo
00:11:35will this knowledge graph
00:11:35also be updated
00:11:36well the answer
00:11:37is yes
00:11:38we see this spelled
00:11:39out in the workflow
00:11:40in the readme
00:11:40if we run
00:11:41graphify hook install
00:11:42it's going to
00:11:43auto rebuild
00:11:44after each commit
00:11:45and that is the
00:11:45AST only
00:11:46there's no API
00:11:47cost associated
00:11:48with that
00:11:48it's literally
00:11:49just looking at
00:11:50what actually
00:11:51changed
00:11:51what is it now
00:11:52connected to
00:11:53and it rebuilds
00:11:53that tree
00:11:54but it's at no
00:11:54cost to you
00:11:55like this is
00:11:56all done
00:11:56in a deterministic
00:11:57way
00:11:58furthermore
00:11:59this also works
00:12:00in a team
00:12:00setup
00:12:01so if you had
00:12:01two devs
00:12:02working on
00:12:02the same repo
00:12:03in parallel
00:12:04it also deals
00:12:04with that situation
00:12:05so in the end
00:12:06you get this
00:12:07persistent yet
00:12:08living map
00:12:09of whatever repo
00:12:09you want
00:12:10that you can give
00:12:10to Claude Code
00:12:11so you can get
00:12:12more efficient
00:12:13answers
00:12:14and lastly
00:12:14we hinted at it
00:12:15a little bit here
00:12:16with the Obsidian flag
00:12:17we can do all this
00:12:18with the repo
00:12:19that is not code based
00:12:19it's a little bit
00:12:20different and we are
00:12:21actually going to do
00:12:22that in another video
00:12:23where we drill down
00:12:23on Graphify and Obsidian
00:12:25and sort of what
00:12:26that connection looks like
00:12:27but just understand
00:12:28we aren't pigeonholed
00:12:29into code only
00:12:30this is a pretty
00:12:31flexible tool
00:12:32but that is where
00:12:33I'm going to leave
00:12:33you guys for today
00:12:34I think this is a
00:12:35really cool tool
00:12:36and when you look
00:12:37at the spectrum
00:12:37of sort of these
00:12:39like memory adjacent
00:12:40applications and plugins
00:12:42that we can use
00:12:43alongside things
00:12:43like Claude Code
00:12:44and Codex
00:12:44I think graphify
00:12:45sort of falls
00:12:46somewhere in between
00:12:47Obsidian
00:12:48and a true RAG system
00:12:49and I think that's great
00:12:50the more options we have
00:12:52the more tools we have
00:12:53at our disposal
00:12:53the better we can choose
00:12:54the right one for the job
00:12:55we don't have to only
00:12:56use Obsidian
00:12:57you know we might not
00:12:58just be doing something
00:12:59in markdown
00:12:59and we don't have to go
00:13:00crazy and generate
00:13:02some huge RAG
00:13:03infrastructure
00:13:04this is again
00:13:04it's a cool little
00:13:05middle ground
00:13:05that I think
00:13:06is worth exploring
00:13:06so as always
00:13:08let me know
00:13:08what you thought
00:13:09make sure to check out
00:13:10Chase AI Plus
00:13:11if you want to get your
00:13:11hands on the
00:13:12Claude Code Masterclass
00:13:13speaking of Obsidian
00:13:14I'm actually going to be
00:13:15running a free
00:13:16live webinar next week
00:13:17about Obsidian
00:13:18and Claude Code
00:13:19I'll put a link to that
00:13:19down there as well
00:13:21and besides that
00:13:22I'll see you around

Key Takeaway

Graphify improves AI coding assistant accuracy and reduces token costs by converting repositories into persistent knowledge graphs, which provide a structured map for querying codebases.

Highlights

  • Graphify maps codebases into a knowledge graph to provide AI coding assistants with a structural map instead of relying solely on file-based searches.

  • Using Graphify to query an OpenDesign repository reduced token usage to approximately 80,000 tokens, compared to 200,000 tokens without it.

  • The tool performs three distinct passes to build its graph: deterministic code structure parsing, transcription of audio/video files, and semantic analysis of documentation.

  • Graphify creates nodes for code components, edges for connections, and community clusters to organize data, with the demo generating 109 communities from 203 files.

  • An automatic rebuild hook allows the knowledge graph to stay updated after every commit without additional API costs by only parsing changed files.

  • The tool functions as a middle ground between Obsidian vaults and full-scale RAG systems, operating without complex embedding requirements.

Timeline

Knowledge Graph Solution for AI Coding Assistants

  • Graphify maps entire project files into a queryable knowledge graph.
  • AI agents gain a structural map of code connections rather than relying on basic file grep searches.
  • Token usage decreases significantly while answer accuracy improves.

Standard AI coding assistants function like a simple Control-F search through files. Graphify upgrades this process by traversing codebases to map classes, functions, and imports. This approach creates a clear map of how A connects to B, allowing assistants like Claude Code to retrieve context more efficiently.
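
The "how A connects to B" idea can be illustrated with a small graph lookup. The following is a minimal Python sketch, not Graphify's implementation: the node names and the `find_path` helper are hypothetical, and the graph is a hand-written dictionary rather than anything extracted from a real repository.

```python
from collections import deque

# Hypothetical toy graph: each key lists the nodes it connects to.
EDGES = {
    "A": ["B"],
    "B": ["C"],
    "C": [],
    "D": ["C"],
}

def find_path(graph, start, goal):
    """Breadth-first search; returns the shortest list of node names or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

if __name__ == "__main__":
    print(find_path(EDGES, "A", "C"))
```

With the connections already stored, answering "how does A reach C" is a lookup over a small structure rather than a text search across every file.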

Data Extraction and Graph Construction

  • Data extraction occurs through three passes: code parsing, media transcription, and document semantic analysis.
  • The graph structure consists of nodes, edges for connections, and community groupings for similar elements.
  • Unlike traditional RAG systems, Graphify does not use embeddings.

The first pass is deterministic and uses tree-sitter to parse code structures locally without an LLM. The second pass processes video and audio using Faster Whisper, while the third pass uses an LLM to analyze documentation, images, and PDFs. This results in a graph where nodes represent specific components, edges show relationships, and communities represent large logical groupings.
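
To give a feel for what a deterministic first pass might produce, here is a rough Python sketch. It swaps in Python's standard `ast` module as a stand-in for tree-sitter, handles Python source only, and uses a made-up `SAMPLE` module and `extract_graph` helper. None of this is taken from Graphify's code.

```python
import ast

# Hypothetical sample module used only for illustration.
SAMPLE = '''
import json

def load(path):
    return json.loads(read(path))

def read(path):
    with open(path) as handle:
        return handle.read()
'''

def extract_graph(source, module_name="sample"):
    """Return (nodes, edges) describing imports, definitions and direct calls."""
    tree = ast.parse(source)
    nodes = {module_name}
    edges = set()
    for item in ast.walk(tree):
        if isinstance(item, ast.Import):
            for alias in item.names:
                nodes.add(alias.name)
                edges.add((module_name, "imports", alias.name))
        elif isinstance(item, (ast.FunctionDef, ast.ClassDef)):
            nodes.add(item.name)
            edges.add((module_name, "defines", item.name))
            # Record calls to plain names made anywhere inside this definition.
            for inner in ast.walk(item):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    edges.add((item.name, "calls", inner.func.id))
    return nodes, edges

if __name__ == "__main__":
    nodes, edges = extract_graph(SAMPLE)
    for edge in sorted(edges):
        print(edge)
```

Every edge here comes straight from the syntax tree, so running it twice on the same file gives the same result and no model is involved. Attribute calls such as `json.loads` are deliberately skipped to keep the sketch short.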

Installation and Usage

  • Graphify can be installed manually or by asking the coding agent to install it; running /graphify . then builds the graph for the current directory.
  • The Graphify skill integrates directly into coding agents to manage query logic automatically.
  • Specific commands allow users to query the graph, explain code, or ensure the tool is used for every interaction.

Users can install Graphify manually or via their coding agent. Once installed, the Graphify skill teaches the AI when to use the knowledge graph versus standard search methods. Key commands include query and explain, which force the agent to consult the mapped structure, and a hook command to automate graph updates.

Performance Demo and Workflow Integration

  • Querying the OpenDesign repository with Graphify used 80,000 tokens compared to 200,000 without it.
  • Auto-rebuild hooks update the graph after every commit based on code changes.
  • The tool supports parallel team workflows and offers an Obsidian export feature.
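
The token comparison in the first bullet can be checked with a few lines of Python. The figures are the approximate ones quoted in the video (about 150,000 tokens for the explore agents, 50,000 for the main session, and 80,000 with Graphify); the variable names are only for illustration.

```python
# Approximate token counts quoted in the demo.
explore_agents = 150_000
main_session = 50_000
with_graphify = 80_000

without_graphify = explore_agents + main_session
share_of_cost = with_graphify / without_graphify
reduction_factor = without_graphify / with_graphify

if __name__ == "__main__":
    print(f"Without Graphify: {without_graphify:,} tokens")
    print(f"With Graphify:    {with_graphify:,} tokens")
    print(f"Share of original cost: {share_of_cost:.0%}")
    print(f"Reduction factor: {reduction_factor:.1f}x")
```

On these numbers the saving is roughly 2.5x for a single question, well below the 70x figure mentioned earlier in the video.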

Testing on the OpenDesign codebase demonstrated a significant reduction in token usage for complex questions. The system maintains a living map that stays current through commit hooks, which parse only the changed files locally. This provides a persistent, low-cost reference point for coding assistants, even in evolving projects.
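
The video does not explain how the commit hook decides which files changed. One common approach, shown here purely as an assumption, is to compare content hashes from the last build against the current files. The `fingerprint` and `files_to_reparse` helpers and the file names are hypothetical.

```python
import hashlib

def fingerprint(contents: bytes) -> str:
    """Return a stable content hash for one file."""
    return hashlib.sha256(contents).hexdigest()

def files_to_reparse(previous_hashes, current_files):
    """List paths whose contents are new or differ from the last build."""
    return sorted(
        path
        for path, contents in current_files.items()
        if previous_hashes.get(path) != fingerprint(contents)
    )

if __name__ == "__main__":
    # Hashes recorded at the previous build (illustrative data).
    last_build = {
        "app.py": fingerprint(b"print('v1')"),
        "util.py": fingerprint(b"x = 1"),
    }
    # File contents after a new commit (illustrative data).
    working_tree = {
        "app.py": b"print('v2')",
        "util.py": b"x = 1",
        "new.py": b"y = 2",
    }
    print(files_to_reparse(last_build, working_tree))
```

Only the paths returned would need to go through the local parsing pass again, which is why a rebuild of this kind involves no API cost.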
