This Huge Update Changed The Way I Use Claude Code

Transcript

00:00:00No single Claude model is enough on its own. Opus has the reasoning but burns through your

00:00:04limits. Sonnet is fast but hits a wall on harder decisions. And the answer isn't picking one over

00:00:10the other. It's using all of them together. Now Claude Code already does this to some extent.

00:00:14It orchestrates between models on its own. But Anthropic just released something that

00:00:18not only saves tokens but also makes smaller models almost as capable as the larger ones.

00:00:23Now when building with Claude you might have noticed this. Whenever you hand Opus a task

00:00:28and it determines that it doesn't need that much effort, it hands it off to Sonnet or Haiku and

00:00:32delegates tasks to the smaller models in order to manage token usage properly. But there's a problem

00:00:37with this approach. As we mentioned in our previous video, Anthropic has been lowering the rate limits

00:00:42so during peak hours your 5 hour window fills up faster. And on top of that, Opus consumes a lot

00:00:47of tokens even on simple tasks, which means your context limit fills up faster whenever you use it.

00:00:52Anthropic decided to flip the script on this and they came out with something called the

00:00:55Advisor strategy. The way this strategy works is that you give the role of executor to the Sonnet

00:01:00model and use Opus purely as an advisor that only gets consulted when the executor actually needs

00:01:05it. There are two agents involved. The executor is your main agent running on Sonnet and it handles

00:01:10all tool calls, code changes and user facing output. The advisor runs on Opus and its only

00:01:15job is to guide the executor when it gets stuck. The advisor never writes code or makes any changes.

00:01:20When Anthropic experimented with this approach, they found it outperformed Sonnet alone on the

00:01:25SWE-bench. The combination outdid Sonnet alone in terms of both performance and cost.

00:01:31And it costs significantly less than running Opus as the main agent because Opus only gets invoked

00:01:36when it actually matters, not for every single iteration. Now you might think that we already

00:01:40have a lot of frameworks for building apps that are better and ready to use so why bother with this

00:01:45setup? The reason is that most existing frameworks are not built with cost and token efficiency in

00:01:50mind. Even though they get the job done, they fall short when it comes to making Claude run longer

00:01:54and more efficiently because they are primarily focused on building the app rather than optimizing

00:01:59for token usage. With this setup, you can build a working app using a weaker model, making the

00:02:04whole process far more token efficient. And that connects back to the limits problem we mentioned

00:02:09earlier. We already made a video on Claude's limits and told you to switch to a smaller model to make

00:02:13it last longer. Here's how it connects. Sonnet consumes way fewer tokens and requires less effort

00:02:19than Opus to perform the same task. Opus is a very large and powerful model so it consumes a lot of

00:02:24tokens even for simple tasks. Sonnet is able to handle many of those tasks more efficiently. So

00:02:30using Opus only to bridge the performance gap on harder decisions is where the real impact comes in.

00:02:35You're only invoking that power when you actually need it, not for every single task. This makes the

00:02:40overall usage more token efficient and lets you get more done within the same limits. We share

00:02:45everything we find on building products with AI on this channel so if you want more videos on that

00:02:50subscribe and keep an eye out for future videos. So we wanted to test how this actually plays out on an

00:02:55app that was already built using Sonnet. To use the strategy inside Claude Code we set the advisor

00:03:00command with Opus 4.6 as the advisor model. Our main agent was the executor which I had already

00:03:05set to Sonnet since I built the app using it. The app was supposed to have real-time sync and while

00:03:10moving and resizing elements synced perfectly across sessions deletion wasn't syncing at all. We tried

00:03:16debugging this multiple times with Sonnet on its own but the issue kept persisting no matter how

00:03:20much it tried to fix the issues. So after turning on Opus as the advisor we gave Claude the prompt

00:03:25describing the problem and because Sonnet had already failed multiple times instead of taking

00:03:30another shot on its own it decided to invoke the advisor this time. The advisor reviewed the

00:03:34conversation so far to assess the situation. It provided the exact changes that needed to be made pinpointing

00:03:40where the sync logic was breaking and what specifically needed to be restructured. The executor model took

00:03:45in that advice and it applied those fixes directly without any additional back and forth. We tested it

00:03:50across multiple devices to test the sync and found that the issue was resolved. Both ends were

00:03:55reflecting deletions properly as intended, even when a user had the item selected at one end while

00:04:00it was being deleted at the other, which wasn't the case previously. If we had tried fixing this using Sonnet

00:04:05alone it would have taken more rounds of back and forth prompting because Sonnet inherently is a

00:04:09weaker model and not capable enough to handle complex logic by itself. On the other hand using Opus alone

00:04:15would have consumed far more tokens and likely wouldn't have been this fast. Using Sonnet with Opus

00:04:20as an advisor made the process much more efficient. So overall this strategy helped debug syncing issues

00:04:25much faster than before. But before we move forward, let's have a word from our sponsor, Junie by JetBrains.

00:04:30If you're a developer, you know the struggle: context switching between your terminal, IDE, and CI pipelines

00:04:36just to get stuff done. Most coding agents lock you into one environment or one specific LLM and

00:04:41call it a day. Junie CLI is different. It's an LLM-agnostic coding agent that works everywhere. Your

00:04:47terminal, your IDE, GitHub, CI/CD pipelines, even your task manager. One agent everywhere. Delegate

00:04:54real work to it. Writing tests, building backends, refactoring, automating code reviews on every commit.

00:04:59Right now JetBrains is running a free early access program including $50 in Gemini credits to test the

00:05:04agent plus BYOK support so you can use any model you prefer. Full access to all features, early access

00:05:10to new ones and direct support from the dev team shaping the product. It's simply better with Junie.

00:05:15Click the link in the pinned comment to join for free. Now we wanted to test whether Sonnet actually

00:05:20consults the advisor for major UI changes. We had a previously built application and we wanted to

00:05:25transform its UI to a different library. On top of that we wanted to make multiple UI changes in one

00:05:31go which isn't normally recommended but we wanted to see how the smaller model performs in coordination

00:05:36with the larger one on a bigger task. It first accessed the current UI using the Playwright MCP.

00:05:41Once it understood the layout instead of jumping straight into code changes it consulted the advisor

00:05:46to determine the best approach because it was a major critical change and might break the app if

00:05:50handled wrongly. The advisor reported that the new library we chose and the one that

00:05:55was already used in the project had conflicting versions. So before any UI work could start Claude needed to

00:06:00resolve these first. Sonnet handled those first, ran multiple commands to make sure the dependencies

00:06:04were properly applied then checked the current state of the UI through Playwright to confirm the app was

00:06:09still running correctly with no client side issues. Once the dependencies were sorted it started making

00:06:14the changes as the advisor suggested working through each component one by one and effectively

00:06:18redesigning the app as a whole. The UI it created was much more interactive and looked significantly

00:06:23more polished than before. It still had some issues but the overall improvement was clear. But here's

00:06:27where the limitation showed up. The entire process took around 31 minutes. Opus on its own would have

00:06:32done this much faster because it's better at orchestrating tasks by identifying what can run in

00:06:37parallel and executing them at the same time. Sonnet being a smaller model handled everything sequentially

00:06:43without breaking any of the work into parallel sub-agents. For an app that wasn't even that complex

00:06:4831 minutes is longer than it should have been. It also handles smaller changes on its own without

00:06:53involving the advisor which is the right behavior for minor tweaks. But for large scale changes across

00:06:58an entire app like this you're better off using Opus directly because that will save you significantly

00:07:03more time and effort. Now we wanted to test whether it implements a completely new feature on an

00:07:08existing code base properly. We had an app already built and wanted to add another page with a

00:07:13different feature to it. We gave it a prompt describing what we wanted and this time we fully

00:07:17expected it to use the advisor because it wasn't a simple task but it went ahead and implemented

00:07:22the changes entirely on its own without consulting the advisor at all. It treated the whole thing as

00:07:27routine implementation work which it clearly wasn't given the scope of the feature. When we tested the

00:07:31application we found multiple issues. If we modified something and pressed the run button changes like

00:07:37heading updates or color adjustments were also reflected in components outside the preview pane

00:07:41which shouldn't happen. On top of that we wanted it to sync directly instead of requiring us to press

00:07:46run again after every change. So we prompted it again and told it to use the advisor to fix

00:07:51these issues. Upon our prompt it first invoked the advisor agent. The advisor looked at the

00:07:56implementation and identified what was actually causing both problems. That being the wrong

00:08:00component choice. It laid out what needed to change and why the original approach had introduced those

00:08:06issues in the first place. The executor took that guidance and applied it across the app. When we

00:08:10tested it again streaming worked correctly. All changes reflected immediately as we edited without

00:08:16needing to press run after every modification. The issue of changes bleeding across components

00:08:20was also resolved and everything updated properly within the right boundaries. So there are times

00:08:25when it works exactly as intended but other times the executor assumes a task is small enough and

00:08:30decides not to consult the advisor. In those cases you often have to nudge it yourself so it follows

00:08:35the intended workflow. The model doesn't always judge the complexity of a task the same way you

00:08:40do and when it misjudges you end up with bugs that the advisor would have caught from the start. Also

00:08:44if you are enjoying our content consider pressing the hype button because it helps us create more

00:08:49content like this and reach more people. With real-time distributed state involved this

00:08:54approach still needed multiple rounds of prompting before everything was working correctly. The

00:08:58strategy helped but it has a ceiling you should understand before committing to it for a project.

00:09:02For simpler to medium scale applications the advisor strategy can save you several rounds

00:09:07of back and forth that you'd otherwise spend trying to push Sonnet past its limits on its

00:09:11own. If what you're building requires occasional deep reasoning but mostly straightforward

00:09:16implementation this is a genuinely good structure for it. You can build more within your token limits

00:09:20without having to babysit the model through every decision or fall back to Opus for the whole session.

00:09:25For complex apps with many connected dependencies or multiple failure points you're better off just

00:09:30using Opus directly as your main agent. Even when Sonnet follows the advisor's guidance correctly

00:09:36it can still choose the wrong implementation path because it doesn't have the reasoning depth to

00:09:40evaluate multiple approaches at once and weigh the downstream consequences. The advisor helps close

00:09:45that gap but it doesn't fully close it. In those cases the back and forth can cost you more time

00:09:50than running Opus from the start would have. So this strategy is useful when you're working within

00:09:54tight token limits and the application doesn't require Opus-level reasoning at every step. If

00:09:58both of those conditions are true for what you're building it's worth setting up. That brings us to

00:10:03the end of this video. If you'd like to support the channel and help us keep making videos like this

00:10:08you can do so by using the super thanks button below. As always thank you for watching and I'll

00:10:12see you in the next one.

Key Takeaway

The Advisor strategy improves cost efficiency by using Sonnet for execution and consulting Opus only for harder decisions, but it is slower than running Opus directly on large-scale changes, where Opus can split the work into parallel tasks.

Highlights

The Advisor strategy uses Sonnet as the main executor and Opus (Opus 4.6 in the video's test) exclusively as an advisor, so more work fits within the same token and rate limits.
According to the video, Anthropic's experiments showed this dual-agent setup outperforming Sonnet alone on SWE-bench in both performance and cost.
Opus consumes significantly more tokens than Sonnet even for simple tasks, making it inefficient for full-session usage.
A UI library migration using this strategy took around 31 minutes because Sonnet executed the work sequentially rather than in parallel.
The Advisor strategy resolved a real-time sync deletion bug that Sonnet alone failed to fix after multiple debugging attempts.
Sonnet sometimes misjudges task complexity and skips the Advisor on new feature implementations unless the user explicitly prompts it.

Timeline

The inefficiency of single-model workflows

Opus possesses superior reasoning but exhausts token limits and 5-hour rate windows quickly.
Claude Code already delegates simpler tasks to smaller models, but with Opus as the main agent it still consumes a lot of tokens even on simple tasks.
Anthropic has been lowering rate limits, forcing a shift toward more token-efficient strategies.

Relying on a single model creates a trade-off between reasoning depth and token efficiency. Opus handles harder decisions, but its high token consumption fills the context limit even during routine tasks, while Sonnet is fast but hits a wall on harder decisions. The problem is made worse by lower rate limits, which cause the 5-hour window to fill up faster during peak hours.

The Advisor strategy mechanics and benefits

The executor role belongs to Sonnet for tool calls and code writing, while Opus acts as a non-coding advisor.
Combining Sonnet with Opus as an advisor outperforms Sonnet alone on SWE-bench.
Existing app-building frameworks focus on getting the app built rather than on token and cost efficiency.

The Advisor strategy flips the traditional hierarchy by making the smaller model the primary agent. The executor handles all tool calls, code changes, and user-facing output, and consults the advisor only when it gets stuck on a harder decision. The advisor never writes code or makes changes itself. This configuration costs significantly less than running Opus as the main agent, because Opus is invoked only when it matters rather than on every iteration.
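The control flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Claude Code implements it: `run_task`, `Session`, the stub executor and advisor, and the "consult after two failures" rule are all hypothetical names and assumptions introduced here, and no real model is called.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical signatures: the executor takes a task plus optional advice and
# returns (succeeded, output); the advisor reads the transcript, returns advice.
Executor = Callable[[str, Optional[str]], Tuple[bool, str]]
Advisor = Callable[[List[str]], str]


@dataclass
class Session:
    """Shared conversation plus a count of how often each role was used."""
    transcript: List[str] = field(default_factory=list)
    executor_calls: int = 0
    advisor_calls: int = 0


def run_task(task: str, executor: Executor, advisor: Advisor, session: Session,
             failures_before_advice: int = 2, max_attempts: int = 5) -> bool:
    """Let the executor work alone; consult the advisor only after repeated failures."""
    failures = 0
    advice: Optional[str] = None
    for _ in range(max_attempts):
        session.executor_calls += 1
        ok, output = executor(task, advice)
        session.transcript.append(f"executor: {output}")
        if ok:
            return True
        failures += 1
        if failures >= failures_before_advice:
            # The advisor only reads the conversation and returns guidance;
            # it never edits anything itself.
            session.advisor_calls += 1
            advice = advisor(session.transcript)
            session.transcript.append(f"advisor: {advice}")
    return False


# Stand-in models for demonstration: the stub executor fails until it gets advice.
def stub_executor(task: str, advice: Optional[str]) -> Tuple[bool, str]:
    if advice is None:
        return False, f"could not fix '{task}'"
    return True, f"applied advice: {advice}"


def stub_advisor(transcript: List[str]) -> str:
    return "restructure the deletion sync logic"  # placeholder advice text


session = Session()
solved = run_task("deletion sync bug", stub_executor, stub_advisor, session)
print(solved, session.executor_calls, session.advisor_calls)
```

In this toy run the expensive role is called once and the cheap role three times, which is the shape of the saving the strategy is after.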

Debugging complex state synchronization

Sonnet failed to resolve a real-time deletion sync issue despite multiple independent debugging rounds.
The Advisor pinpointed the exact breakdown in sync logic and provided a restructure plan for the executor.
The combined approach resolved the bug across multiple devices where deletions previously failed during item selection.

Testing the strategy on a real-time sync application revealed that Sonnet struggles with complex logic gaps on its own. After activating the Advisor, the Opus model reviewed the full conversation context to identify the root cause of the sync failure. The executor then applied the specific logic changes without further back-and-forth prompting.

Limitations in large-scale UI transformations

A full UI library migration took 31 minutes because Sonnet lacks the parallel orchestration capabilities of Opus.
The Advisor correctly identified version dependency conflicts before the UI work began.
Directly using Opus is more efficient for large-scale changes across an entire application codebase.

During a UI library overhaul, the Advisor strategy was hindered by Sonnet's sequential execution style. While the advisor accurately spotted library version issues, the executor could not break the workload into parallel sub-tasks. For projects involving interconnected dependencies and massive code changes, the time saved by Opus's speed outweighs the token savings of the Advisor strategy.
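A toy scheduling model shows why sequential execution hurts on this kind of job. The per-component durations and the worker count below are made-up numbers for illustration, not measurements from the video, and the components are assumed to be independent of each other.

```python
import heapq
from typing import List


def sequential_minutes(durations: List[int]) -> int:
    """One agent handles every component in turn, so the times add up."""
    return sum(durations)


def parallel_minutes(durations: List[int], workers: int) -> int:
    """Greedy longest-first assignment of independent tasks to parallel workers.

    Returns the finishing time of the busiest worker.
    """
    loads = [0] * workers
    heapq.heapify(loads)
    for d in sorted(durations, reverse=True):
        lightest = heapq.heappop(loads)  # worker that frees up first
        heapq.heappush(loads, lightest + d)
    return max(loads)


# Hypothetical minutes per UI component (assumed, not from the source).
component_minutes = [5, 4, 3, 3, 2]

print(sequential_minutes(component_minutes))   # 17
print(parallel_minutes(component_minutes, 2))  # 9
```

The gap widens as more independent components are added, which is consistent with the recommendation to use Opus directly for app-wide changes.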

Failures in complexity assessment and implementation

Sonnet occasionally treats large feature implementations as routine work and fails to consult the Advisor.
Manual user intervention is required when the executor misjudges the reasoning depth needed for a task.
The strategy serves medium-scale applications well but has a ceiling for complex apps with many failure points.

In a test adding a new feature page, the executor attempted the build without help, resulting in UI bugs where changes bled across unrelated components. The user had to manually 'nudge' the model to use the advisor, which then identified the wrong component choice as the root cause. This highlights that while the strategy closes the reasoning gap, it does not replace the need for human oversight in complex environments.
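One way to make the nudge systematic is to decide up front which prompts should carry an explicit instruction to consult the advisor. The sketch below is an assumption-laden illustration: `TaskInfo`, `needs_advisor`, `build_prompt`, the three signals, and the file-count threshold are all hypothetical and are not part of Claude Code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskInfo:
    """Hypothetical description of a task as the user sees it."""
    description: str
    failed_attempts: int = 0     # earlier fixes that did not work
    files_touched: int = 1       # rough estimate of the change's scope
    is_new_feature: bool = False


def needs_advisor(task: TaskInfo, max_files: int = 3) -> bool:
    """Assumed heuristic: escalate on prior failures, new features, or wide changes."""
    return (task.failed_attempts > 0
            or task.is_new_feature
            or task.files_touched > max_files)


def build_prompt(task: TaskInfo) -> str:
    """Prepend an explicit nudge so the executor does not skip the advisor."""
    if needs_advisor(task):
        return f"Consult the advisor before making changes. {task.description}"
    return task.description


tweak = TaskInfo("Change the heading color on the settings page.")
feature = TaskInfo("Add a new page with a live preview pane.", is_new_feature=True)

print(build_prompt(tweak))
print(build_prompt(feature))
```

Minor tweaks pass through unchanged, which matches the observation that the executor handles small changes well on its own.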
