The Toolkit from Y Combinator's CEO That Makes Claude Code Amazing

Better Stack

Transcript

00:00:00The CEO of Y Combinator has built his own toolkit for Claude Code called GStack, his secret
00:00:06to crushing almost a hundred PRs in seven days, which includes nine specialised workflows,
00:00:13a headless browsing mode using Playwright, Greptile integration, a diff-aware QA and much
00:00:18much more.
00:00:19But Garry's recent tweet about the future of code has got a lot of developers really
00:00:24annoyed.
00:00:25So what does that mean for the future of GStack?
00:00:28Subscribe and let's find out.
00:00:33Garry Tan has been the CEO of Y Combinator since 2023 and before that he co-founded a venture
00:00:39capital firm in 2011.
00:00:42So he has loads of experience when it comes to going through pitches and finding out what
00:00:46makes a new piece of tech unique.
00:00:49And he's put all of that knowledge into his own Claude Code toolkit, which you can
00:00:53tell by looking at the names he's given to a lot of his workflows.
00:00:57In fact, let's give GStack a spin.
00:00:59So for GStack to work, you'll need to have Claude Code installed as well as Bun, but once
00:01:03you've properly installed it in Claude Code by prompting with this exact text or by
00:01:08just downloading the skills, you should have this information added to your CLAUDE.md file.
00:01:12Mine was empty.
00:01:13That's why this is the only thing here, but if you have some text, then this will be added
00:01:17to it.
00:01:18It also puts all the relevant skills into the skills directory if you want to share it with
00:01:21your teammates and then installs Playwright with the appropriate browser.
00:01:25Now I'm going to use GStack to add a feature to this React Vite application to give the
00:01:30user the ability to download an image of a tweet with a specific URL.
00:01:34Now you may have seen me add this feature in a previous video.
00:01:37I'll have a link to it in the description if you want to see what the results were, but
00:01:41we'll see if GStack can do better than that.
00:01:44So first I'll need to start in plan mode, then use the plan CEO review skill and give GStack
00:01:49some information about the feature.
00:01:51Now I'm going to say add a feature that takes a screenshot of a tweet from the URL provided
00:01:56by the user.
00:01:57I also want the user to customise and download the image and I want Claude to honour the existing
00:02:02layout and styles.
00:02:03So after I hit enter, GStack first checks if there are any updates to that skill and then
00:02:08checks the git log before proceeding.
00:02:10Now this mode rethinks the problem from the perspective of a founder/CEO and tries to think
00:02:16of the best possible version of what we're trying to build and challenges assumptions
00:02:20about the scope and value.
00:02:21So once it's done that, it allows us to choose how much we want to challenge the original
00:02:26scope.
00:02:27And here I'm going to go with the scope expansion because it has the most features.
00:02:30Then it lets us choose a critical architectural decision.
00:02:33I'm going to go with the recommended since it's the easiest.
00:02:36And then it asks a few more questions, which again, I'm going to go with the recommended
00:02:39approach.
00:02:40And now that it's finished, it's come up with a mega plan showing the scope mode selected
00:02:44and everything it's going to do that's in that scope.
00:02:47And it's also written some things that are not in scope for this feature.
00:02:50And then down here we have the implementation plan, which has an architecture diagram, key
00:02:55decisions and different steps.
00:02:57This is an insanely detailed plan similar to something I'd get from Superpowers if I went
00:03:01through the same route.
00:03:02Note, there's also a plan engineering review skill in GStack, which turns Claude into an
00:03:07engineering manager or tech lead to come up with architectural diagrams, lock in the tech
00:03:12stack, define edge cases and so on.
00:03:15But it looks like the plan CEO review skill has gone ahead and done some of that already.
00:03:20So we're jumping to the implementation.
00:03:22And now that it's done, we can run the review slash command to review missing edge cases,
00:03:27find bugs that would have passed CI and basically catch any issues before they hit production.
00:03:32So again, that checks for new updates inside the script, checks the diff.
00:03:36And now it's checking the completeness of the task before giving us a summary saying that
00:03:40no issues have been found.
00:03:41And now we can run the ship slash command, which syncs with the main branch, runs tests
00:03:46and resolves any Greptile reviews if they exist.
00:03:49And here we can see it's gone ahead and created a pull request without me even telling it to.
00:03:54And then at this stage we can run the QA slash command, which will test only the changes we've
00:03:58made based on the diff.
00:03:59And here we can see it started the server locally, and it's going through the website to test
00:04:05the features that have just been implemented using screenshots and much more.
00:04:09It's found some 500 errors from screenshots and has found a bug with JSON parse, which it
00:04:15looks like it's fixed.
00:04:16Here we go.
00:04:17It's verified and pushed the fix.
00:04:20And now it's written a final report with the issues that it solved.
00:04:24This is very cool.
00:04:25Okay.
00:04:26So now it's done.
00:04:27Let's go ahead and try the feature.
00:04:28And now we have a screenshot page.
00:04:30Let's grab a tweet from Tana.
00:04:32So this one, and I'll paste that in here.
00:04:34It's not the most exciting tweet, but it's just a test if this works.
00:04:37And wow, okay, this is super impressive.
00:04:40We have the tweet here.
00:04:42We can pick between lights and it's capturing again.
00:04:44Oh, wow.
00:04:45Okay.
00:04:46So we've got light and dark mode.
00:04:47We'll see if it's cached that.
00:04:49And it has. Very cool.
00:04:51I can hide the actions and here we go.
00:04:53So I can show and hide the images and I can change the background.
00:04:58This is very cool.
00:04:59So you've got LinkedIn, we've got Twitter, blog, gradient purple, and we can even customize
00:05:03it or change the angle of the gradient.
00:05:07Wow.
00:05:08This is super fully fledged and we can change the aspect ratio.
00:05:11So we've got nine by 16, 16 by nine, one by one and so on.
00:05:16Let's now actually download the image.
00:05:18And here we go.
00:05:19If I now click on this, you've seen all my tabs.
00:05:22We have the image here.
00:05:23I'm going to open it in Preview.
00:05:24And this is it.
00:05:25This is the image I just took with the feature I just built with GStack, which is insanely
00:05:29impressive, but there's more that we can do because if we go back to the PR, we can see
00:05:34Greptile has a summary, so it's found some resource exhaustion from the server, race condition,
00:05:40no cache expiry, and so on.
00:05:42And instead of me asking Claude to look at the issues and solve them, we're just going
00:05:47to run the review slash command.
00:05:49It's found all the comments.
00:05:50It's given me some options down here on how to fix them, which I'll go through.
00:05:53And now it's fixed all the issues.
00:05:55Well, apart from one false positive, and it has pushed the code. Greptile seems to be happy.
00:06:00As someone who regularly uses Superpowers, I can already see the benefit of GStack, even
00:06:05though some aspects of it are quite complex.
00:06:08But what about Garry's comment on Twitter saying that Markdown is the new code?
00:06:13Well, I can kind of see where he's coming from.
00:06:15I don't think he's saying someone with a computer science degree has wasted their time purely
00:06:20because you can write Markdown and it will write the code.
00:06:22I think it's more to do with the instructions because newer models are getting better at
00:06:27obeying Markdown instructions. Before, there was a time when I would need to have a Claude
00:06:32Code hook just to make sure it uses Bun to install instead of using npm.
00:06:36But now I can put it in the CLAUDE.md and with a good model like Opus, it tends to obey 90
00:06:42to 95% of the time.
00:06:44So I think what he's trying to say is that if you have a detailed enough and well-structured
00:06:49Markdown file, the model can create a good piece of software based on those instructions.
00:06:55But this isn't to say that GStack is just a bunch of Markdown instructions.
00:06:59Each skill has its own directory, even the ability to upgrade GStack.
00:07:03And if we focus on the browse skill, we can see there's a template file and the actual
00:07:08skill file.
00:07:09And this isn't anything to do with Go templates, regardless of what the GitHub page says.
00:07:14The way this works is if we go to scripts and then we go to genskill, the TypeScript file
00:07:20will read the template files and replace any placeholders inside them with actual Markdown.
00:07:26But I'm not going to focus on each skill individually because they're quite detailed.
00:07:30But what I will focus on is that the browse skill has more than just a SKILL.md file because
00:07:35we have a test directory here and we also have the source directory which contains the actual
00:07:40implementation for browser management and so on.
00:07:42So we can already see that the commands here are fairly involved.
00:07:46But if we take a look at the changelog, this shows some really interesting features like
00:07:49end-to-end observability, incremental eval saves and so on, which are used for developing
00:07:55the app.
00:07:56It shares reviews in a to-do format.
00:07:58It supports screenshot element and region clipping, not to mention all of the integrations it has
00:08:03with Greptile and the fact that it was built with Conductor in mind.
00:08:07So the million dollar question is, will I personally use GStack?
00:08:11And I would say actually I'm going to try it out for 30 days.
00:08:15So I'm going to delete the Superpowers plugin and make GStack my main coding tool for preparing
00:08:21features and fixing bugs and see how it goes.
00:08:23Who knows?
00:08:24I might just clone the next Vercel open source tool and start some more beef on Twitter.

Key Takeaway

GStack is a powerful, founder-centric expansion for Claude Code that automates the entire software lifecycle from architectural planning to automated QA and PR shipping.

Highlights

GStack is a specialized Claude Code toolkit developed by Y Combinator CEO Garry Tan to streamline development workflows.

The toolkit features nine specialized workflows, including "Plan CEO Review" and "Plan Engineering Review" to automate high-level decision making.

GStack includes a headless browsing mode using Playwright and integration with Greptile for automated code reviews and PR management.

According to the video, GStack is Tan's secret to getting through almost 100 pull requests in just seven days.

Garry Tan's philosophy that "Markdown is the new code" emphasizes that well-structured instructions are becoming the primary driver for AI software creation.

The tweet-screenshot feature GStack built in the demo allows deep customization, including light and dark mode, background gradients, and aspect ratio changes, all inside an existing React application.

Timeline

Introduction to GStack and Garry Tan's Toolkit

The video introduces GStack, the toolkit that Y Combinator CEO Garry Tan credits for his coding productivity. The speaker highlights how Tan got through nearly 100 pull requests in one week using its specialized workflows. GStack requires Claude Code and Bun to be installed, after which it adds its skills to the skills directory. This setup phase also appends the necessary instructions to the CLAUDE.md file and installs Playwright with the appropriate browser for automated browser tasks. It sets the stage for a demonstration of how AI can handle planning, review, and coding tasks in one workflow.

The Planning Phase: CEO and Engineering Reviews

The speaker demonstrates building a React feature that captures tweet screenshots using the "Plan CEO Review" skill. This mode challenges the developer's assumptions about scope and value from a founder's perspective and lets the developer choose how far to expand the original scope. The toolkit generates a very detailed implementation plan, complete with an architecture diagram, key decisions, and a list of items out of scope. A separate "Plan Engineering Review" skill turns Claude into an engineering manager or tech lead that locks in the tech stack and defines edge cases, but in the demo the CEO review had already covered some of that, so the speaker skips straight to implementation. This section emphasizes that GStack is designed for high-level planning rather than just simple code generation.

Implementation, QA, and Automated Shipping

Once the plan is set, the speaker uses commands like "review" and "ship" to handle the heavy lifting of development. The "QA" command is particularly impressive, as it starts a local server and uses screenshots to verify that the new features work as intended. During the demo, GStack successfully identifies a JSON parsing bug and a 500 error, fixes them automatically, and pushes the verified code. The toolkit even creates a pull request on GitHub without manual intervention. Integration with Greptile further enhances this by providing automated summaries of resource exhaustion and race conditions within the PR.
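The video says the QA command tests only what changed in the diff but does not show how that selection is made. The following is a minimal TypeScript sketch of one way such a step could work, assuming a hand-written table of rules that maps changed file paths to pages worth re-testing. The names `RouteRule`, `routesToTest`, the paths, and the routes are all hypothetical and are not taken from GStack.

```typescript
// Hypothetical sketch of diff-aware test selection. Not GStack's code.

// A rule says: if a changed file matches `pattern`, re-test `route`.
interface RouteRule {
  pattern: RegExp;
  route: string;
}

// Returns the sorted, de-duplicated routes affected by the changed files.
function routesToTest(changedFiles: string[], rules: RouteRule[]): string[] {
  const routes = new Set<string>();
  for (const file of changedFiles) {
    for (const rule of rules) {
      if (rule.pattern.test(file)) {
        routes.add(rule.route);
      }
    }
  }
  return Array.from(routes).sort();
}

// Assumed project layout, for illustration only.
const rules: RouteRule[] = [
  { pattern: /^src\/pages\/Screenshot\./, route: "/screenshot" },
  { pattern: /^src\/components\//, route: "/" },
  { pattern: /^src\/components\/Tweet/, route: "/screenshot" },
];

// In practice this list could come from `git diff --name-only`.
const changedFiles = ["src/components/TweetCard.tsx", "README.md"];

console.log(routesToTest(changedFiles, rules)); // ["/", "/screenshot"]
```

Files that match no rule, such as `README.md` here, select nothing, so a documentation-only change would leave the browser tests idle.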

Markdown as the New Code and Technical Architecture

The discussion shifts to Garry Tan's controversial claim that Markdown is the new code. The speaker explains that newer models such as Claude Opus tend to obey instructions in CLAUDE.md 90 to 95% of the time, which reduces the need for enforcement mechanisms such as a Claude Code hook. However, the video clarifies that GStack is not just a bunch of Markdown instructions: each skill has its own directory, and a TypeScript script reads template files and replaces their placeholders with actual Markdown. The browse skill also includes its own source and test directories containing the implementation for browser management. This section walks through the underlying file structure that makes these skills work.
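The template step described above can be sketched in TypeScript. This is only an illustration of placeholder substitution in general: the `{{NAME}}` syntax, the `renderTemplate` function, and the sample skill text are assumptions made for the example, since the video does not show the actual placeholder format or the generator's code.

```typescript
// Hypothetical sketch of filling a skill template. Not GStack's generator.

// Replaces every {{NAME}} placeholder in `template` with values[NAME].
// Throws if a placeholder has no value, so a stale template fails loudly
// instead of shipping a half-rendered skill file.
function renderTemplate(
  template: string,
  values: Record<string, string>
): string {
  return template.replace(
    /\{\{\s*([A-Z0-9_]+)\s*\}\}/g,
    (_match: string, key: string): string => {
      const value = values[key];
      if (value === undefined) {
        throw new Error("No value for placeholder " + key);
      }
      return value;
    }
  );
}

// Made-up template text, standing in for a real template file on disk.
const skillTemplate = "# Example skill\n\n{{COMMANDS}}\n";

const rendered = renderTemplate(skillTemplate, {
  COMMANDS: "- `example-command`: placeholder description",
});

console.log(rendered);
```

A real generator would read the template from disk and write the result next to it; keeping the substitution as a pure function makes that part easy to test on its own.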

Final Thoughts and 30-Day Challenge

In the concluding segment, the speaker reviews the GStack changelog, noting features like end-to-end observability and incremental eval saves, along with screenshot element and region clipping. These features suggest that GStack is aimed at serious developers who want to maximize their output. The speaker decides to delete their previous "Superpowers" plugin and commit to using GStack for 30 days to see how it goes. The video ends with a joke about cloning the next Vercel open source tool and starting more beef on Twitter. It serves as a cautious endorsement of Garry Tan's vision for AI-assisted engineering.
