This Tool Fixes AI Coding at Scale with 70x Fewer Tokens (Graphify)


Transcript

00:00:00This might be one of the most insane ways to bring your code base to life.

00:00:04If you're using Claude Code or Cursor on a real project, you think the hard part is writing code.

00:00:09Well, it's not. The hard part is just understanding your own repo.

00:00:13You ask one question and your AI is burning through your tokens just to figure out what's going on.

00:00:18It's slow, expensive, and half the time still wrong.

00:00:22What if instead of sending your whole project every time, you gave the AI a map of it?

00:00:27That's exactly what Graphify does, and it can cut token usage by over 70x.

00:00:32Let me show you how all this works.

00:00:34Right now, your AI sees your project like this. Just a pile of files.

00:00:44There's no real connections. There's no structure. There's no memory.

00:00:48So every time you ask a question, it has to relearn everything from scratch.

00:00:53That's why answers feel close, but not quite right.

00:00:56And yeah, this is exactly what Karpathy pointed out with the raw folder problem.

00:01:01Graphify showed up right after that. It's more of a memory layer.

00:01:06If you enjoy coding tools and tips like this, be sure to subscribe.

00:01:09We have videos coming out all the time.

00:01:11Alright, now let me show you. I've got a small repo here. Code, docs, diagrams.

00:01:16Now, normally I'd have to explain all this to AI every time.

00:01:20Instead, I run one command, Graphify, right here. Give it a second. Now look at this.

00:01:27After Claude executes Graphify, this isn't just files anymore. It's an actual graph.

00:01:33Everything is connected. I can click through and dissect exactly what's going on

00:01:38and what is linked together, right here within the HTML file that it generated.

00:01:42Then instead of asking AI to read everything again, I can now ask it what connects to the API layer.

00:01:50And now it answers using relationships, using the MD file that it generated with this call.

00:01:56It's not guesses, it's relationships. And here's the part that surprised me.

00:02:00Before this, the question used around 14,000 tokens.

00:02:04After this, after it executes the first time, we drop that down to maybe a couple hundred.

00:02:09Same question, completely different cost. All because of this generated map.

00:02:14So what is this actually doing? Graphify is basically like Google Maps for your code base.

00:02:20Instead of raw text, you get nodes and connections.

00:02:24Under it all, it uses Tree-sitter to understand the structure, then an LLM to extract the meaning.

00:02:30Then it can group everything into clusters, and it's not just code.

00:02:35It reads PDFs, diagrams, even audio and video. All locally, nothing leaves the machine.

00:02:41What you get from this is simple. We get a visual graph, a written report,

00:02:46and a knowledge base we can actually explore.

00:02:49This visual graph is huge for a lot of us as we can see how things connect.

00:02:54Now here's where this changes how AI coding usually works.

00:02:57Most tools use RAG, which basically means find similar chunks of text.

00:03:03Well, Graphify doesn't do that. It builds real relationships.

00:03:07This function calls that one. This module depends on that.

00:03:11This idea came from this document, and it even tells you how confident it is.

00:03:16So instead of "this looks related," we get something like "this is actually connected,"

00:03:21in an actual visual representation of what is connected.

00:03:24And the biggest difference here: it remembers, too, since it generated that MD file

00:03:30it can look back on. We're not starting from zero every time.

00:03:33It updates only what changed so your AI finally has context that sticks.

00:03:38All right, now I actually thought all this was pretty sweet.

00:03:42But what are the good and the bad things here and now?

00:03:44First up to the plate, the efficiency compounds.

00:03:47Every question gets cheaper. And because it connects code,

00:03:51docs, diagrams, you start finding relationships you didn't even know existed.

00:03:56That's huge for onboarding for these messy projects that we get dumped into.

00:04:00That's great. Now the drawbacks to all this are this.

00:04:03The first run can be slow and cost tokens, especially with a lot of documents.

00:04:08After that, it's cached. But yeah, that first hit is real.

00:04:12It's also early, so long-term support is still an unknown.

00:04:17When you install this, it's graphifyy with two Y's, not one.

00:04:20So check your spelling on that. The relationships aren't always perfect,

00:04:23but it labels them clearly. Extracted, inferred, ambiguous,

00:04:28so you know what you can actually trust. And if your repo is tiny,

00:04:32this is going to be overkill. So is it worth it?

00:04:35I mean, yeah, if you're using AI on anything real, this is cool.

00:04:38I thought it was worth it. Because your biggest problem isn't running the code,

00:04:42it's actually understanding it across files, across time, across context.

00:04:46And that's exactly what this fixes. The token savings alone make it worth trying,

00:04:51but the bigger win is this. Your AI stops guessing and starts reasoning.

00:04:56If you're working solo, doing research, or have all these big systems, this is a serious upgrade.

00:05:01If you're just working on smaller scripts, this is probably overkill,

00:05:04so you don't really need to try it. But for most devs who try this,

00:05:07this is going to be an awesome tool. If you enjoy coding tools and tips

00:05:10that speed up your workflow, be sure to subscribe to the Better Stack channel.

00:05:14We'll see you in another video.

Key Takeaway

Graphify acts as a persistent memory layer for AI coding tools by building a relational graph of code, dependencies, and documentation, cutting per-query token usage by roughly 70x in the demo while replacing guesswork with explicit relationships.

Highlights

Graphify reduces AI token usage by roughly 70x in the demo by providing a structural map instead of sending raw code files.
Tree-sitter and an LLM work together to convert raw text into a local knowledge base of nodes and connections.
The tool processes code, PDFs, diagrams, and audio/video files entirely on the local machine for privacy.
Initial analysis of a small repository dropped token consumption from 14,000 to a few hundred for identical queries.
Relationships are labeled as extracted, inferred, or ambiguous to show how far each link can be trusted.
Graphify writes an MD file that acts as a memory layer and updates only what changed, so the AI does not relearn the entire repo for every question.
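
The video does not say how changed files are detected. One common approach, shown here purely as an assumption and not as Graphify's actual mechanism, is to cache a content hash per file and reprocess only the files whose hash differs. The file names and contents below are hypothetical.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a stable content hash for one file's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_files(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """List paths whose content hash is new or differs from the cached one.

    `previous` maps path -> cached hash; `current` maps path -> file text.
    """
    return sorted(
        path for path, text in current.items()
        if previous.get(path) != fingerprint(text)
    )


# Hypothetical repo state: only api.py was edited since the last run.
cache = {
    "api.py": fingerprint("def get(): ..."),
    "db.py": fingerprint("def query(): ..."),
}
repo = {
    "api.py": "def get(): return 1",
    "db.py": "def query(): ...",
}

print(changed_files(cache, repo))  # ['api.py']
```

Only the paths returned would need to be re-analyzed; everything else can be served from the cached map.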

Timeline

The inefficiency of file-based AI context

Project comprehension is the primary bottleneck in AI-assisted coding rather than syntax generation.
Traditional AI tools treat repositories as disconnected piles of files without inherent structure or memory.
Sending entire projects for every query results in slow response times and high operational costs.

Current AI coding workflows suffer from the 'raw folder problem' where context is lost between prompts. This lack of structure forces the model to relearn the repository every time a question is asked. Consequently, answers often lack precision because the model is guessing based on proximity rather than actual code relationships.

Mapping code relationships with Graphify

One command generates an interactive HTML graph and a markdown-based knowledge map.
Querying the AI through the generated map uses specific relationships instead of statistical guesses.
Token usage for a single query dropped from 14,000 to approximately 200 after the first execution.

Running the Graphify command transforms a standard repository into a connected web of dependencies. Users can click through the generated HTML file in a browser to see what is linked together, then ask the AI questions such as what connects to the API layer. This map serves as a reference for the AI, allowing it to navigate the codebase like a GPS rather than reading every file for every prompt.
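
The token figures quoted in the demo can be turned into a reduction factor and a percentage with a few lines of arithmetic. The sketch below uses only the two numbers from the video (about 14,000 tokens before, about 200 after); the function names are illustrative.

```python
def reduction_factor(before: int, after: int) -> float:
    """How many times fewer tokens the second query uses."""
    return before / after


def percent_saved(before: int, after: int) -> float:
    """Share of tokens saved, as a percentage of the original cost."""
    return (1 - after / before) * 100


before_tokens = 14_000  # query without the map (figure from the demo)
after_tokens = 200      # same query with the map ("a couple hundred")

print(f"{reduction_factor(before_tokens, after_tokens):.0f}x fewer tokens")  # 70x fewer tokens
print(f"{percent_saved(before_tokens, after_tokens):.1f}% saved")            # 98.6% saved
```

These are numbers from a single small repository, so they illustrate the scale of the effect and should not be read as a guarantee.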

Technical architecture and data handling

Tree-sitter analyzes code structure while an LLM extracts semantic meaning from the files.
Data processing happens locally and includes non-code assets like PDFs, diagrams, audio, and video.
Graph-based relationships replace standard Retrieval-Augmented Generation (RAG) text chunking.
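
To give a feel for what extracting structure from code means, the sketch below pulls caller-to-callee edges out of a small Python snippet. It deliberately swaps in Python's standard-library `ast` module for Tree-sitter so that it runs without extra dependencies; it is not Graphify's implementation, and the sample functions are hypothetical.

```python
import ast


def call_edges(source: str) -> set[tuple[str, str]]:
    """Return (caller, callee) pairs for direct name calls in top-level functions."""
    edges = set()
    tree = ast.parse(source)
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        # Walk the function body and record every call to a plain name.
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                edges.add((func.name, node.func.id))
    return edges


# Hypothetical source file used only for this illustration.
SAMPLE = '''
def load_user(user_id):
    return query_db(user_id)

def handle_request(user_id):
    user = load_user(user_id)
    return render(user)
'''

for caller, callee in sorted(call_edges(SAMPLE)):
    print(f"{caller} -> {callee}")
```

Each printed pair is an edge of the "this function calls that one" kind; a real parser-based pipeline would also handle methods, imports, and other languages.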

Unlike standard RAG that finds similar text chunks, this system identifies explicit relationships such as which function calls another, which module depends on another, or which document an idea came from. It reports how confident it is in each link by labeling it as extracted, inferred, or ambiguous. It maintains context over time by only updating the map for what has changed.
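
A minimal sketch of how labeled relationships could answer the question "what connects to the API layer?" is shown below. The edge schema, node names, and relation names are all hypothetical; only the three labels (extracted, inferred, ambiguous) come from the video.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation: str
    label: str  # "extracted", "inferred", or "ambiguous"


# Hypothetical edges for illustration only.
EDGES = [
    Edge("routes.py", "api_layer", "calls", "extracted"),
    Edge("auth.py", "api_layer", "depends_on", "extracted"),
    Edge("design.pdf", "api_layer", "describes", "inferred"),
    Edge("notes.md", "api_layer", "mentions", "ambiguous"),
    Edge("routes.py", "db.py", "calls", "extracted"),
]


def connected_to(target: str, edges: list[Edge], trusted=("extracted",)) -> list[Edge]:
    """Return edges pointing at `target` whose label is in `trusted`."""
    return [e for e in edges if e.target == target and e.label in trusted]


for edge in connected_to("api_layer", EDGES):
    print(f"{edge.source} --{edge.relation}--> {edge.target} [{edge.label}]")
```

Filtering on the label is the point of the design: a caller can restrict an answer to extracted links, or widen `trusted` to include inferred and ambiguous ones when exploring.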

Practical constraints and implementation

The first execution incurs a high initial token cost and slow processing speed for large document sets.
Efficiency gains compound over time as the graph is cached and reused for subsequent prompts.
Small scripts and tiny repositories do not benefit from the overhead of a full relationship map.

The tool is most effective for onboarding onto complex, messy projects where cross-file relationships are unclear. While the initial run can be resource-intensive, the subsequent savings in time and tokens justify the setup for professional development. The install name is spelled 'graphifyy', with two Y's, and the tool is designed for local-first workflows where privacy is a priority.
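
Whether the first run pays off depends on how many questions follow it. The sketch below estimates the break-even point. The per-query figures come from the demo, but the first-run cost of 100,000 tokens is an invented placeholder, since the video gives no number for it.

```python
def break_even_queries(first_run_cost: int, before: int, after: int) -> int:
    """Smallest number of queries at which using the map costs fewer total tokens.

    Solves first_run_cost + n * after < n * before for the smallest integer n.
    """
    saved_per_query = before - after
    if saved_per_query <= 0:
        raise ValueError("the map must make each query cheaper")
    return first_run_cost // saved_per_query + 1


# 100_000 is a hypothetical first-run cost; 14_000 and 200 are the demo figures.
print(break_even_queries(100_000, 14_000, 200))  # 8
```

With these assumed numbers the map pays for itself on the eighth question, which is consistent with the point that tiny repositories and one-off scripts see little benefit.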
