▲ Community Session: How to create and publish skills

Vercel

Transcript

00:01:00Hi everybody, how's it going?
00:01:25Welcome to another Vercel community session.
00:01:29We're really excited to have you here.
00:01:32If this is your first time in one of our sessions,
00:01:35hello, I'm Pauline Navas from the Vercel community team.
00:01:40You may have seen me hanging around in the community spaces.
00:01:44So this is always such a fun time for
00:01:46me to talk to you all live and connect with you all.
00:01:51It's already great to see some of you watching and tuning in.
00:01:56So if this is your first time joining one of our sessions and
00:02:00you can't see the chat and you want to ask questions,
00:02:02which I highly recommend for this session, you should.
00:02:06Feel free to join our community platform at community.vercel.com.
00:02:12And then click "Going" for this event.
00:02:15And yeah, you can use the chat and ask questions throughout the session.
00:02:20If you're watching on X or any other platform, feel free to use that as well.
00:02:25So for today's session, I'm super excited.
00:02:28I don't know if you can tell, but we're diving into something that's
00:02:32really shaping how developers work with AI agents.
00:02:36It's skills for Claude Code.
00:02:39If you've ever wished your AI agents just knew how to do something, like
00:02:44upgrade Next.js the right way or follow your team's coding patterns,
00:02:49that's what these skills enable.
00:02:51So I'm really excited to introduce John from
00:02:56the AI DX team here at Vercel to run this workshop with you.
00:03:02Hi, John.
00:03:04>> Hey, Pauline.
00:03:05Hey, everyone.
00:03:05Thanks for coming in.
00:03:07>> It's so nice to see you.
00:03:09All right, let's get going.
00:03:12>> Let's do this.
00:03:13All right, so skills.
00:03:15It feels like they've been around forever, but probably two weeks old, who knows?
00:03:20So I'm going to walk through this presentation talking about skills.
00:03:24I'll show some off and please feel free to ask questions, interrupt me and
00:03:28such because I love talking about this stuff.
00:03:31So skills first and foremost, we're going to talk about creating them and
00:03:35publishing them.
00:03:36This presentation was actually created by a skill called a Remotion Geist skill.
00:03:42So it uses some of our Vercel design language, pairs it up with Remotion and
00:03:46built this out and I'll show that towards the end.
00:03:48But for right now, I'm just going to walk through some of the videos.
00:03:51So historically, we talked about prompt engineering for
00:03:55a long time when models weren't quite as good.
00:03:58Everyone had to make their prompts perfect and
00:04:00everyone thought prompt engineering would be a career.
00:04:03Now we're shifting over to context engineering and allowing these skills,
00:04:08these markdown files to be lazy loaded in later on.
00:04:13So there's the separation now of your initial prompt and
00:04:17then these pieces of context that can get loaded later on.
00:04:19We're going to call those skills.
00:04:22So skills started with Anthropic.
00:04:25They needed to teach Claude Code specific tasks because agents start with a blank
00:04:30slate. Every model starts brand new; there's no memory.
00:04:33It's kind of like a baby being born with no skills.
00:04:37All they have is like this inherent knowledge dumped into their heads.
00:04:42And so they start with nothing.
00:04:44So Anthropic's like, how are we going to solve this?
00:04:48Well, obviously it's solved with markdown because every problem is solved
00:04:51with markdown these days and skills were born from that.
00:04:55So from here, you can now package up these skills.
00:04:58And the skills are markdown files which you can share between your teams and
00:05:04you can package up your own workflows internally.
00:05:07You can package them up and share them on your GitHub repos.
00:05:12And we have custom tooling which we've shipped.
00:05:15Let me bring over a browser real quick for managing kind of community skills.
00:05:22So skills.sh is a place where you can go.
00:05:27You can search for community skills.
00:05:29You can find some of the most used skills.
00:05:32As always, make sure you trust the skill first: that it's from a trusted source or
00:05:36from someone on your team,
00:05:38so that you know you're using something that aligns with what you're trying to do.
00:05:42And we'll talk more about using this later on.
00:05:45But this is a great place to kind of search the entire ecosystem and
00:05:48community of skills to see what's available.
00:05:50So from here, just some details of how this workshop is usually taught and
00:05:56I'll hop over to the next video on this.
00:06:00So blank slates. A blank slate means that, again,
00:06:07the baby's being born when your agent starts up, you know,
00:06:11the agent being a model that's run by an agent runner.
00:06:14Let me make this a bit bigger.
00:06:16The agent knows kind of the basics of React and TypeScript and CSS and SQL.
00:06:23But what it doesn't know is what you have, your rules, your patterns,
00:06:28your system, your structure.
00:06:30And so you combine what
00:06:35your personalized things are with what it knows, and you fill in that knowledge gap.
00:06:40So you bring in that context and the skills are loaded in so
00:06:42that it can do those things exactly how you want them done.
00:06:45So they know a language: they know TypeScript, they know React.
00:06:51They don't know your dialect of that language.
00:06:53It's a great way to think about it.
00:06:54All right, so any questions so far?
00:07:01Have I missed any?
00:07:03No questions so far.
00:07:06Just a lot of excitement in the chat.
00:07:08So keep going, John.
00:07:09Let's go.
00:07:11So a way I like to think about it is this is like an NPM moment for skills.
00:07:17So NPM being a package manager, which most of us are familiar with.
00:07:23And package managers have community bundled resources
00:07:27that you can use to help make your products run easier.
00:07:30So if you think of skills and skills.sh as a package manager for skills,
00:07:35you can install these capabilities into your agents.
00:07:39So just as you would install a library,
00:07:41you can use npx skills add to add knowledge to your project.
00:07:45So before, prompt engineering meant that in your CLAUDE.md file or
00:07:52your AGENTS.md file, you'd say: always do this, always do that.
00:07:55Use this library, check Jira on and on and on.
00:07:59But after, we now have these skills we can add which capture those
00:08:04and bring them into our projects.
00:08:06So they're permanent.
00:08:08We can share them at the user level or the project level.
00:08:11And they can be separate from individual projects.
00:08:14You don't have to copy and paste so much.
00:08:17We can automate a lot of this so that once you start up a new project,
00:08:21you can add all the skills that you want or need.
00:08:23So again, we're switching away from worrying too much
00:08:28about building up these prompts.
00:08:30All right.
00:08:34So again, it's these capabilities: permanent, for the whole team.
00:08:38And if you haven't experienced the pain of trying to manage
00:08:43an AGENTS.md file or a CLAUDE.md file in your project,
00:08:47like with PRs and everything to those markdown files,
00:08:50this also solves that problem.
00:08:53Kind of talking about the one developer versus whole team there.
00:08:58All right, so this also addresses something you could think about
00:09:03called passive versus active.
00:09:05Passive being that these skills do not need to be loaded
00:09:12until they're required.
00:09:15So if we look at these results, AGENTS.md,
00:09:22a markdown file, is like the system prompt,
00:09:25the thing that's read in first.
00:09:26It's always read in.
00:09:28The agent must read it in.
00:09:31So you can see in some of the testing for a lot of this
00:09:35that if you dump in a bunch of context in there
00:09:38and kind of fill up the context window,
00:09:41that it will be much better at following those rules,
00:09:46such as always use TypeScript and Tailwind and such.
00:09:50These skills become more active.
00:09:53And the agents can, you can either invoke them manually
00:09:57or they can be lazily loaded by the agent
00:10:00and they bring in the things they need just in time.
00:10:04So such as deploy to Vercel or create the database and such.
00:10:08And so again, it's the rules versus the tools
00:10:13and just think of the skills as the tools,
00:10:15and you always need both for these.
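The rules-versus-tools split described here can be sketched roughly as follows. This is a minimal illustration, not how any particular agent implements it: the `Skill` class, `initial_context`, and `load_skill` are hypothetical names, and the assumption is that only each skill's name and description are visible up front while the full body is read in on demand.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str  # short text the agent sees from the start
    body: str         # full instructions, read in only when needed


def initial_context(rules: str, skills: list[Skill]) -> str:
    """Always-loaded context: the rules file plus one line per skill."""
    index = "\n".join(f"- {s.name}: {s.description}" for s in skills)
    return f"{rules}\n\nAvailable skills:\n{index}"


def load_skill(skills: list[Skill], name: str) -> str:
    """Lazy load: return the full body of one skill by name."""
    for s in skills:
        if s.name == name:
            return s.body
    raise KeyError(name)
```

The rules text is paid for on every run, while a skill body costs context only when `load_skill` is actually called.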
00:10:18- John, we actually have a question that's just come up,
00:10:21which I think might be good for this part.
00:10:24How do we decide whether to ship something as an MCP
00:10:28or as a skill?
00:10:30- I would say by default,
00:10:33there's a few layers here if you think of,
00:10:37I just wanna have something on the screen.
00:10:38So I'm gonna put that up there.
00:10:39- Yeah, of course.
00:10:40- By default, solve the problem with markdown.
00:10:44If markdown isn't enough, solve it with a CLI.
00:10:47If a CLI isn't enough, solve it with an MCP.
00:10:51So kind of the levels of abstraction,
00:10:53the levels of simplicity is that if you can figure out a way
00:10:58to solve the problem with markdown
00:10:59and that goes for any AI driven tool
00:11:01where you can see a lot of the way
00:11:05that Anthropic architected Claude Code
00:11:08and the others are kind of following suit around skills,
00:11:11around commands and sub agents.
00:11:13They're all markdown files with front matter
00:11:15and configuration and that's mostly been enough.
00:11:19And then those markdown files can define the tools
00:11:21they need and the tools that they're allowed to call
00:11:24and the restrictions and it can take some time
00:11:29to really go through those configuration steps.
00:11:31But if you run into a scenario where you need tight control
00:11:37over exactly the payloads and data types
00:11:40and everything that are being passed back and forth
00:11:42between the agent and the MCP,
00:11:43then MCPs are a good solution there.
00:11:46But I think in general, the whole kind of zeitgeist paradigm
00:11:50and everything is they're shifting towards CLIs,
00:11:54markdown files and using only MCPs when absolutely necessary.
00:11:59- Yeah, that's the discourse I've also seen
00:12:04from all the discussions online.
00:12:06We've also got another question that's just come in from X.
00:12:10This is amazing, in the foreseeable future,
00:12:13do you see skills supporting agent to agent discovery?
00:12:17It would be cool to see one agent deciding
00:12:20to install a skill, great question.
00:12:22- I love this question so much and agent to agent
00:12:27is such an underused feature, or pattern, these days.
00:12:32Absolutely, I can see a markdown file
00:12:39kind of like a descriptor of an agent being loaded in.
00:12:42Claude currently supports this with sub-agents
00:12:46and just today, it's fun to get to talk about
00:12:49the teams features that they just released today
00:12:54where they can spawn a group of sub-agents
00:12:59which can then report back to the main kind of team leader.
00:13:02They call them teams now.
00:13:04And then those teams can essentially be managed
00:13:10by that one team leader and you can go in and inspect
00:13:12what agents are doing.
00:13:14And those agents have definitions of what they,
00:13:18I wonder if I could quickly find the document for that.
00:13:23- While John is looking for that,
00:13:30if you are in the chat and wanna ask a question,
00:13:34now is the time, feel free just to throw them all in the chat
00:13:37and we'll get to them throughout the session.
00:13:40- Where can I, I can probably post this
00:13:42in the Twitter chat at least.
00:13:47So there is,
00:13:51on this page, you'll see the way that a sub-agent is defined
00:13:58and you can see they have a front matter section
00:14:01and in the front matter section,
00:14:03you can define skills that that sub-agent has access to.
00:14:08So if you want to create an agent, in this case,
00:14:13a sub-agent that has a list of skills,
00:14:17then you could do this by the named skill.
00:14:20And we haven't quite,
00:14:22the next slides are about the anatomy of skills,
00:14:26but the skills are just essentially names.
00:14:28And so you could have a sub-agent
00:14:29with a specific set of skills,
00:14:31a very particular set of skills,
00:14:34and you could have that sub-agent go tackle a task.
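As a rough sketch of what such a sub-agent definition might look like, the helper below renders a markdown file with a front matter section that lists skills by name. The function name `render_subagent`, the example values, and the exact key spellings (`name`, `description`, `skills`) are assumptions for illustration; check the sub-agent documentation for the real field names before relying on them.

```python
def render_subagent(name: str, description: str, skills: list[str], prompt: str) -> str:
    """Render a sub-agent markdown file: front matter first, then the prompt body."""
    lines = ["---", f"name: {name}", f"description: {description}", "skills:"]
    # Skills are referenced by name only; the skill files live elsewhere.
    lines += [f"  - {skill}" for skill in skills]
    lines += ["---", "", prompt.strip(), ""]
    return "\n".join(lines)


# Hypothetical sub-agent with "a very particular set of skills".
REVIEWER = render_subagent(
    name="reviewer",
    description="Reviews React changes against team standards.",
    skills=["react-best-practices", "team-writing-style"],
    prompt="Review the diff and report problems before anything is merged.",
)
```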
00:14:38So what's happening with the Teams feature released today
00:14:43is that they can,
00:14:45Claude Code can essentially build up its own team
00:14:48and build up its own agents
00:14:49and kind of tackle the task as best as it can.
00:14:53But you can define custom sub-agents with custom skills
00:14:56if you want to say,
00:14:57please build a team using these sub-agents.
00:15:00Now, this isn't quite agent-to-agent communication.
00:15:05That's an even bigger discussion about architecting
00:15:08inter-agent communication,
00:15:11which is kind of beyond the scope of this.
00:15:12But I definitely see this pattern
00:15:14of the way we're defining sub-agents
00:15:17being adopted by the way we define agents
00:15:21and the way that they'll communicate with each other.
00:15:23Like how you expose an agent,
00:15:24I imagine it would be a markdown file
00:15:26of here's what I'm capable of,
00:15:28like put me in coach sort of pattern of,
00:15:30I have these skills, I can shoot threes,
00:15:32I can rebound, put me in.
00:15:34And then they could discover each other that way.
00:15:39And I imagine, just as we have skills.sh,
00:15:43kind of like a package manager for skills,
00:15:47we will have a package manager for agents
00:15:49and agent markdown files.
00:15:51That's something that'll just happen.
00:15:54I mean, I'm sure it's happened already
00:15:57and someone's published something,
00:15:58but it just hasn't hit the mainstream yet.
00:16:00- That makes a lot of sense.
00:16:03Thank you, thank you.
00:16:04If you have any more questions, folks,
00:16:06drop them in the chat.
00:16:08Meanwhile, do you wanna keep going, John?
00:16:10- Yeah, absolutely.
00:16:12So skills, it's just a folder, no servers, no hosting.
00:16:15And the skill is inside of a directory
00:16:20and you name it something and then the skill name
00:16:24or the skill file name has to be named SKILL.md.
00:16:28This allows the agents in their discovery patterns
00:16:31to be able to find them.
00:16:32It's just a convention setup so that tooling
00:16:35works better with these.
00:16:36It makes it really easy for building package managers
00:16:39and organization and everything.
00:16:41And then the skill can also have things bundled with it.
00:16:45It can have bundled scripts, it can have reference files,
00:16:49on and on, all these features where the skill
00:16:53can reach out to other things referenced inside of it.
00:16:56So the skill, you'll see it would have front matter.
00:17:01By default, it needs a name and a description.
00:17:04And the name matches the name of the folder,
00:17:08if you look at the structure.
00:17:10So if we were to create one called my-skill,
00:17:12you would name it my-skill in here.
00:17:17And then the description is critical
00:17:20because it determines when this skill
00:17:23will be used by the agent.
00:17:24The agent is going to say,
00:17:26it's gonna be working through the task you give it.
00:17:29And if it ever hits the point where it sees,
00:17:31oh, I need something that is going to enforce
00:17:33Vercel standards, then it's going to load this skill in.
00:17:38It's gonna use the skill loading tool and load this skill in.
00:17:41So these descriptions become critical.
00:17:42The way that you write them,
00:17:43if you're gonna use skills in a very lazy manner,
00:17:46otherwise you can invoke skills upfront using a slash
00:17:50and treating them as commands.
00:17:52I think I have a slide on commands versus skills,
00:17:55but essentially historically they were two separate things
00:18:00and now they're both merged into one.
00:18:03Skills used to only be lazy loaded,
00:18:04but now they can be either invoked by users with a slash
00:18:08or lazy loaded by the agent.
00:18:12And by that, I just mean, if you hit slash here,
00:18:17you can see a list of skills available,
00:18:19which you can manually invoke if you want to,
00:18:22or you can wait, the agent can invoke them when they need it.
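The anatomy described above (a named folder, a skill file inside it, and front matter with a name and a description) can be checked with a short script. This is a sketch under assumptions: the skill file is written here as `SKILL.md`, the example skill text is made up, and `parse_front_matter` is a deliberately simple `key: value` parser rather than a full YAML parser.

```python
import tempfile
from pathlib import Path

# Hypothetical skill file: front matter, then plain markdown instructions.
SKILL_TEXT = """\
---
name: my-skill
description: Enforce team coding standards when reviewing or writing React code.
---

# Instructions

- Prefer server components where possible.
"""


def parse_front_matter(text: str) -> dict[str, str]:
    """Parse simple `key: value` pairs between the first two `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing front matter")
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    raise ValueError("unterminated front matter")


def validate_skill(skill_dir: Path) -> dict[str, str]:
    """Check that a skill folder has a skill file with a name and description."""
    meta = parse_front_matter((skill_dir / "SKILL.md").read_text(encoding="utf-8"))
    for field in ("name", "description"):
        if not meta.get(field):
            raise ValueError(f"missing required field: {field}")
    if meta["name"] != skill_dir.name:
        raise ValueError("skill name should match its folder name")
    return meta


def demo() -> dict[str, str]:
    """Create a throwaway my-skill folder and validate it."""
    with tempfile.TemporaryDirectory() as tmp:
        skill_dir = Path(tmp) / "my-skill"
        skill_dir.mkdir()
        (skill_dir / "SKILL.md").write_text(SKILL_TEXT, encoding="utf-8")
        return validate_skill(skill_dir)
```

The description is the part worth the most care, since it is what the agent reads when deciding whether to load the skill.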
00:18:27So from here, focused anymore.
00:18:32- I think you're gonna go into this, John,
00:18:39but personally I'd love to hear like a concrete example.
00:18:43Like what's a small well-scoped skill you'd recommend
00:18:47that everyone build first just to get this model?
00:18:52- Ooh, that's a, so let me give what I think
00:18:55is one of the best examples right now is,
00:18:57so essentially in Vercel, some of the skills issues
00:19:03that we come across are that we release
00:19:07at a like a really quick cadence, really quick pace.
00:19:12The agents and models, they have knowledge cutoff dates,
00:19:15which are a few months ago, up to a year or more ago.
00:19:19And so by default,
00:19:24if you give the agent a task, it might use Next.js 14,
00:19:30whereas that's, you know, a few versions out of date.
00:19:33It might use the AI SDK, which, you know,
00:19:35recently deprecated generateObject,
00:19:39which is now part of generateText
00:19:41so that the API is more standardized and easy to follow.
00:19:45And so you'll run into these issues where
00:19:50it will be using an older version
00:19:51and you're trying to do something
00:19:53and you're reading the docs
00:19:54and things are out of sync and out of date.
00:19:56And so the project just kind of grinds for a while,
00:20:00trying to figure things out, because you and the agent
00:20:03are not aligned on what it needs.
00:20:05So to get yourself aligned with the agent,
00:20:08you could create a skill that's just like,
00:20:11use this version of React, use this version of the AI SDK,
00:20:16use this version of workflow.
00:20:18And then you can put in references
00:20:19to where to find the information for them.
00:20:22For instance, a skill that I built for Vercel,
00:20:28let me pull up this.
00:20:33So the Vercel workflow skill is this one
00:20:40I shipped a few days ago.
00:20:43And the way that we're doing this,
00:20:46is we say essentially because we're really concerned
00:20:49about version numbers,
00:20:50we started publishing documentation with our NPM packages
00:20:54and we tell it: your
00:20:58knowledge of workflow is outdated.
00:20:59Like we just know that because this workflow is updated
00:21:02like almost daily right now as it works towards GA.
00:21:06But what we can do is say, tell you what,
00:21:10we have bundled the docs.
00:21:11So whenever you need to look up workflow,
00:21:13go check the docs that we bundled
00:21:16and check the latest.
00:21:18And this means that anytime I start up a workflow,
00:21:22I don't have to worry about it finding outdated information.
00:21:25It's always gonna find the information bundled
00:21:27with the NPM package so that it's in sync
00:21:32and aligned with the version itself.
00:21:34So the entire skill is essentially just
00:21:38go read the manual, with some essential best practices
00:21:42and quick references for this.
00:21:44So these are like skills that address the issues
00:21:48of the agents having knowledge cutoff dates
00:21:53and running into version number issues and such.
00:21:57A skill that you could write for yourself is,
00:22:01I would say if you use the skill,
00:22:07so I'll just demonstrate a skill called create skill.
00:22:11So if you search on skills.sh for create skill,
00:22:14create skill, I don't know if the,
00:22:23so if you grab like a create skill,
00:22:28probably just grab the one from the Claude Code team.
00:22:33Let's see Claude, I think we published theirs.
00:22:39Sorry, this is one, this wasn't planned ahead of time.
00:22:42- Yeah, you can, whilst you're looking at that one.
00:22:46Oh, amazing, go ahead.
00:22:47- I should have been thinking Anthropic's
00:22:49instead of Claude Code.
00:22:50Yeah, so if you bring in one of their create skills
00:22:54or one of their patterns, you can say create skill
00:22:58and then whoops, you can say create skill and then type,
00:23:02let's say look at my writing style
00:23:09in the readme.md file and create a skill
00:23:12that will always follow this writing style.
00:23:14And then this will take whatever is in the readme
00:23:19and then create a skill which is essentially
00:23:21like your personalized writing style
00:23:23so that from then on you could invoke the skill
00:23:26to say like John writing style or whatever.
00:23:28And you could feed in tons of documentation
00:23:31or feed in URLs to your blog posts
00:23:33or your own posts or customer language or whatever
00:23:37and just feed that in.
00:23:38I think those are like always a good starting point
00:23:40is what do I already have
00:23:42that I want to be able to reproduce again?
00:23:43Like what's something I use constantly
00:23:46that I know I'm gonna be using more of.
00:23:50So that's usually like customer messaging
00:23:53and blogging and content and materials
00:23:56that you've produced in the past
00:23:57and you wanna make more of.
00:23:59So yeah, that's a great first skill to tackle.
00:24:04- That's a really good one.
00:24:05Definitely gonna try that out.
00:24:06I just wanted to read out some comments
00:24:08in the chat here.
00:24:10So Dave said that we created skills
00:24:14to allow a startup founder who hadn't coded in 10 years
00:24:17to be able to contribute to code
00:24:19without running through the architectural boundaries
00:24:23in a new code base.
00:24:24I found this to be pretty good use of skills
00:24:27helping non-technical or new to coding folks
00:24:30to be able to participate in the act of coding
00:24:32without compromising on quality standards, which is great.
00:24:38And just wanted to say as well, Dave also said earlier,
00:24:42I'll echo what John is saying,
00:24:45like commonly used tools these days.
00:24:47Yeah, the only MCP tools he uses are the Chrome DevTools
00:24:52MCP, one for interacting with a project management tool
00:24:56like Linear or Jira, and one for working with a database.
00:24:59So that's just echoing everything that you said earlier.
00:25:02Before we move on, we also got a question in the chat
00:25:07about your thoughts on the "AGENTS.md outperforms skills
00:25:12in our agent evals" blog post that we published.
00:25:17I don't know if you're gonna go into that
00:25:19at some point in your presentation.
00:25:21I know we've gone through several tangents here,
00:25:25but do you wanna go through that now?
00:25:27- Yeah, sure.
00:25:28So that's evaluating models and agents
00:25:34is a terribly difficult thing to do.
00:25:38Because often when you're writing call evals for them,
00:25:42you're testing them against a like a new and empty project
00:25:46that doesn't have, it's not loading any context.
00:25:49And you're like giving it this one specific scenario,
00:25:51like with this empty project, try or use Next.js or whatever.
00:25:56And if you write that as an eval,
00:25:59but you don't take into account that, you know,
00:26:03like Opus 4.6 came out today,
00:26:05or like whatever model you're using
00:26:07or whatever the project is and all this extra context
00:26:10or the model or the agent or the runner.
00:26:13So Claude Code is gonna have a different system prompt
00:26:16and Cursor is gonna have a different system prompt.
00:26:17And like there's so many variables
00:26:21and models themselves are, you know,
00:26:24non-deterministic anyway,
00:26:26that testing them is a very, very difficult thing to do.
00:26:31That being said, AGENTS.md versus skills
00:26:37is forcing context versus lazy loading context.
00:26:42And so if you kind of boil down the blog post to,
00:26:47is forcing context better than lazy loading context?
00:26:51The answer is gonna be yes,
00:26:53because it's going to treat it as
00:26:57like the initial instructions, the most important thing
00:27:00at the beginning of the agent life cycle.
00:27:03It's going to, it's just gonna teach it,
00:27:08this is like, this is what we're working on.
00:27:10This is the best thing we can do.
00:27:12Actually, I have something that addresses
00:27:14that I was gonna show at the end,
00:27:16where you can kind of prime or preload skills as well.
00:27:20So it's just one of those things where
00:27:25that's just how models work.
00:27:29And hopefully the blog post isn't taken
00:27:33any other way. It doesn't mean that skills are bad.
00:27:35It just means that if you absolutely need instructions
00:27:39to always be followed, then use the AGENTS.md file.
00:27:42That's the way that I would read that.
00:27:44I haven't read the blog post in like a week though.
00:27:48- Yeah, that makes a lot of sense.
00:27:50Yeah, awesome.
00:27:51Cool, reminder for everyone in the community,
00:27:53if you have any more questions, drop them in.
00:27:56Otherwise, John, keep going.
00:27:57- Okay.
00:28:01Yeah, so a skill file is just markdown.
00:28:04And you can see this example, instructions,
00:28:06when reviewing the React code, server only, optimize this,
00:28:09and putting just like a list of things in there
00:28:12that you want to happen.
00:28:14And then you can include scripts for it to reference
00:28:19or things you want to be able to call
00:28:21and have any sort of package, a bundle
00:28:25of things that you want it to see.
00:28:30So, all right.
00:28:32So yeah, skills are kind of like adding an agent,
00:28:37adding a senior engineer.
00:28:38All righty, let's keep moving.
00:28:42So use cases.
00:28:46Let's do this full screen.
00:28:49All right, so some of the patterns,
00:28:52just React best practices is one of the,
00:28:57one of the most, or probably the most downloaded skill
00:29:00on the skills package manager.
00:29:02And just giving it, just kind of reinforcing
00:29:08what the best practices are,
00:29:09even beyond what the model is trained on.
00:29:12Because the model is trained on everybody's code
00:29:14and you want it to follow your specific patterns.
00:29:18And let's go here.
00:29:21A workflow automation.
00:29:26If you ever want to like bundle something up
00:29:27as a zip file or whatever,
00:29:30it's almost like a natural language script.
00:29:34I often think of any application these days.
00:29:40It either reduces itself to a script
00:29:44or it kind of upgrades itself into an agent
00:29:47because either you need that deterministic output
00:29:50where the input's always gonna match the output
00:29:53or you need an agent that's going to be able to figure out
00:29:56what happens if the data doesn't quite match up.
00:29:59So if you want to create this sort of automation
00:30:02rather than a script where it can intelligently
00:30:07bundle things up. For example,
00:30:10if you tell your agent to git commit and it's like,
00:30:15well, I noticed that you have a video in this project.
00:30:20I'm going to ignore that because videos are blobs
00:30:25and we don't want to add those.
00:30:26Like it's usually intelligent about those things.
00:30:29Whereas a script, if you wrote it,
00:30:30you'd have to like take into account all of those scenarios.
00:30:33So if you want to create an automation for it,
00:30:36then you can set up that chain of events and it can do that.
00:30:41And then guardrails as well, where you can tell it like,
00:30:44please look up instructions, please look up guidelines,
00:30:49please look up colors, feeding in all of those things.
00:30:53And these ones are often good to load upfront
00:30:56to make sure your agent doesn't, well, there's sub-agent,
00:31:00there's a lot of advanced scenarios for guardrails as well.
00:31:02That's probably a different workshop for a different day.
00:31:07All right, so just again, enforcing standards,
00:31:12automating pipelines and protecting your systems.
00:31:17All right, let's do this one.
00:31:21All right, so let's skip over the live one for today.
00:31:30And yeah, so let's talk about publishing.
00:31:37So publishing is essentially just pushing to GitHub
00:31:46and then anyone can just reference your GitHub repository
00:31:51and then add your skill.
00:31:52They don't need to look up the exact link.
00:31:56Like if you look on skills.sh for adding skills,
00:32:00bring this over, you'll see like,
00:32:04if we pick a random one here from browser use,
00:32:08it gives you a link to copy and paste to install a skill,
00:32:11but you could just add,
00:32:13I don't think I have browser use installed.
00:32:14So I'm gonna grab this and just demo it,
00:32:18open a tab and let me do it this way.
00:32:23So if I don't like manually specify the skill
00:32:29and I just give it a GitHub repo,
00:32:31it'll go and look up the skills,
00:32:34use install skill package.
00:32:36It'll ask you which editors you wanna use.
00:32:39I'll just do Claude Code right now.
00:32:42It'll ask you if you want it in your project or globally,
00:32:45I'll say project, and symlink allows all of them
00:32:49to reference the same file and then proceed.
00:32:53And you can see that even though I didn't like specify
00:32:55the exact file, it went in and found the Claude skills.
00:33:00Browser use has that SKILL.md file.
00:33:09So, and if we look at it, let's just do this.
00:33:13Oops, and skill.
00:33:18You can see what they shipped in here
00:33:22as their browser use skill is it'll start
00:33:26with the markdown up top.
00:33:27It's a pretty long skill.
00:33:32Just with the name, the description, it has allowed tools,
00:33:35saying that it is a skill that is allowing it to use
00:33:39browser use without the user approving.
00:33:42So it gives it permissions so that anything
00:33:47with browser use is allowed.
00:33:49So if you ever invoke this skill,
00:33:50you don't have to approve browser use to be used.
00:33:53And then it shows like, if you don't have it installed,
00:33:56it would like teach it how to install it,
00:33:58teaches it some of the basics and how to use this tool.
00:34:04All right, so let's go back to our,
00:34:09and yeah, all you have to do is just push a markdown file
00:34:15up to a GitHub repo and then npx skills add could add it.
00:34:19Again, make sure you only install skills that you trust
00:34:23You can treat them
00:34:26like NPM packages or scripts: you don't wanna just find
00:34:31any random skill or random NPM package or script and use it
00:34:35because you don't know what those people are publishing.
00:34:38Make sure you trust these, so.
00:34:40All right, and then you can use private repos
00:34:45and Git submodules as well.
00:34:48And then our community registry,
00:34:51I've shown that a couple of times already.
00:34:54All right, cool.
00:34:55So from here, you just create a markdown file,
00:35:00you publish it to a GitHub repo
00:35:02and then we can discover and install it.
00:35:05So I wanna show using awesome skills.
00:35:09Any questions to address before I dive into this?
00:35:13- Yeah, there's actually a question from earlier.
00:35:16How many examples of completely unknown package/library
00:35:21does an LLM need to see, assuming it does not occur
00:35:26in the training data of the LLM
00:35:28to use the package/library as a skill properly
00:35:31to get good results?
00:35:33- How many to get good results?
00:35:36Sorry, can you read the question one more time?
00:35:39- Yeah, of course.
00:35:40So how many examples of a completely unknown package
00:35:45or library does an LLM need to see assuming it does not occur
00:35:49in the training data of the LLM to use the package, yeah.
00:35:53- So kind of like how many examples should you toss
00:35:55into the skill?
00:35:56- Yes, yeah.
00:35:57- Essentially, one thing you could think of
00:36:00is instead of thinking of a number of examples to include,
00:36:05you can think of more like a book
00:36:09with a table of contents and chapters and whatnot,
00:36:12where if the agent runs into a scenario
00:36:14and you hand them like an instruction manual,
00:36:17kind of like if you have a manual for your car or whatever,
00:36:21you only wanna turn the pages to like
00:36:26if my check engine light is on.
00:36:29I don't need to like read pages about the tires
00:36:32or whatever, right?
00:36:33So if you structure your skill in such a way
00:36:36where there's the main skill,
00:36:38which is here's your car manual.
00:36:40And then if you need to learn about the check engine light
00:36:44or I'm not a mechanic or how the glove box works,
00:36:49then you can go to that specific page
00:36:51and it can load in another markdown file
00:36:53or load in more information which is specific to the task
00:36:57at hand.
00:36:58So instead of trying to dump a bunch of examples
00:37:01of like how a car works as an entire unit,
00:37:06an entire machine, you can break it down.
00:37:11And I mean, I'm not talking about manually typing this out.
00:37:14I'm saying when you tell your agent to create a skill,
00:37:18just have it organize it so that the skill lists things
00:37:23by when is it going to need the specific chapters
00:37:26of this book?
00:37:27Similarly, skills can have references
00:37:30and like load in additional context as needed
00:37:33based on the task at hand.
00:37:35So like with most libraries,
00:37:39if you think of an NPM package for many of them,
00:37:43you only ever import a couple of methods from it
00:37:45because like you don't need every single date function
00:37:48from a date library.
00:37:50You don't need every single component
00:37:52from a components library.
00:37:53You only need like examples of the specific ones
00:37:57that are required for your specific task.
00:37:59So just try and think of it that way where you break it down
00:38:03by need, by task, by requirements,
00:38:07rather than trying to force feed an entire code base
00:38:11into the project.
00:38:12- Makes a lot of sense.
00:38:14Another question we have here is how do you test
00:38:17whether the agent has really learned a new package
00:38:21from the skill?
00:38:22Are there simple prompts or eval patterns
00:38:25you recommend to validate that before rolling out to a team?
00:38:29- The general consensus and kind of what I stand behind
00:38:37is just create something and use it
00:38:40and see how it fails and iterate.
00:38:43That's a lot of like the agentic development mindset
00:38:49is instead of trying to think too much about how to organize
00:38:54and plan and what the perfect skill is,
00:38:57make something and then see where that thing fails
00:39:00and then iterate on it.
00:39:01When it comes to, I've explored ways of spawning up
00:39:08like nine different sessions, even way more than that
00:39:13of Claude Code, all loading in different skills
00:39:15to see which ones perform the best.
00:39:17And then you get all these different examples
00:39:19and you're trying to determine which of these examples
00:39:24look the best from my human eye
00:39:26or having Claude evaluate its own results.
00:39:29Like it just gets into kind of an impossible task right now.
00:39:34So essentially as you use the skill,
00:39:37it's just a markdown file.
00:39:39Just have your team go in and update it
00:39:40or ask the agent to update it.
00:39:43What you can do at the end of any conversation
00:39:46is just say, if anything ever fails, you can say,
00:39:49please update the skill based on the current conversation,
00:39:54something like that.
00:39:56And it will, it can go in and find the markdown file
00:40:00and update it and then push the changes.
00:40:02That's just kind of the way that we're working
00:40:08with it right now.
00:40:09Obviously there's things like version numbers
00:40:13and things that create more issues on top of it.
00:40:16But we're kind of at the point where the models
00:40:19are getting better at loading skills.
00:40:21I don't think, I don't know what the benchmarks are
00:40:24for Opus 4.6 versus 4.5 or GPT 5.3 versus 5.2 for skills.
00:40:29But I bet that they're better at them now
00:40:33than they were just this morning.
00:40:35So like it's one of those, a lot of those problems
00:40:40where we think, oh, I need to make this thing perfect.
00:40:42And then you spend a couple of weeks on it
00:40:44and then it's finally shipped.
00:40:46And then the models have changed five times
00:40:48since you started on the task.
00:40:49Like it's better to ship and iterate
00:40:52than to try and get something perfect out there
00:40:56is the best advice that I can give.
00:40:58- Yeah, iterate to greatness.
00:41:00Am I right, John?
00:41:01- ITG.
00:41:02- Yeah, ITG.
00:41:04Just one more question before I let you go and continue this.
00:41:08Have you seen a point of diminishing returns
00:41:10where adding more examples to a skill
00:41:13stops improving behavior or even confuses the model?
00:41:17- I haven't.
00:41:23I don't put too much in my skill files.
00:41:27The create skill skill that I use does a lot of separation.
00:41:33I need to look up the exact one
00:41:35'cause I've been, I use someone else's.
00:41:37Let me look that up off screen.
00:41:40This might be something that,
00:41:42'cause it gets into configuration files that--
00:41:45- Yeah, just remove your screen
00:41:47so we can add it back in when you're ready.
00:41:49- So let me find.
00:41:52- Just wanted to say we've reached around 200 people
00:41:57in the stream, which is great.
00:41:58Hello everyone.
00:41:59If you're just tuning in,
00:42:01feel free to drop a question in the chat
00:42:03and we will throw them at John.
00:42:07- Yeah, happy to answer them.
00:42:12- Yeah, I'll have to find the one that I use.
00:42:16Okay, that looks all solid.
00:42:19The one that I use, I need to find where this came from.
00:42:27I don't know if once it's installed,
00:42:29if it has the original URL that it came from
00:42:32or like, npx skills
00:42:36might actually have that information,
00:42:37but I don't wanna like run random commands on stream.
00:42:43This one does actually tell it
00:42:44to have a three-level loading system
00:42:47where metadata and bundled resources
00:42:49where it asks it to explicitly like break things down
00:42:54into resources and extra instructions.
00:42:57So I use agents to generate skills
00:43:02and then from there, like I've always done it this way,
00:43:07whether before I used to like copy and paste the docs
00:43:11from Claude Code into the agent and say,
00:43:13please create a skill based on the docs.
00:43:15And now I use this skill for it.
00:43:17And I've never tried to force feed too many examples.
00:43:23I know there are some general rules around keeping,
00:43:28like keeping your skill file under like 200 lines or so,
00:43:31but again, that's all model dependent
00:43:35and models getting better.
00:43:36Yeah, this one says under 200 lines.
00:43:39So don't do, I would say stay minimalistic.
00:43:44And if you find gaps, then solve them,
00:43:47especially if you're an expert in the field
00:43:50and you'll be able to identify the gaps.
00:43:52If you're not an expert with a skill
00:43:56and you start using a skill you're unfamiliar with,
00:43:58keep a closer eye on it rather than letting it run.
00:44:01Don't expect to like set up
00:44:03the huge agent orchestration system,
00:44:06install skills that you have no clue about
00:44:08and to have it all work exactly as you expect.
00:44:11Like you're gonna have to watch those.
00:44:13- Yeah, generally really good advice there, 100%.
00:44:17Awesome.
00:44:20- All right.
00:44:22So for example, the videos I made here in this project,
00:44:28were all made with a create remotion Geist skill,
00:44:34which is available on skills.sh.
00:44:37If you just search it up.
00:44:39And Geist.
00:44:43So Geist being a design system from Vercel
00:44:48and Remotion being a, let me search that off screen.
00:44:53Being a way to programmatically make videos.
00:45:01And I combined a skill from Remotion
00:45:05and I had essentially to make the Geist skill.
00:45:09I went into one of our Vercel repositories.
00:45:12I think it was off of everything in the homepage and docs.
00:45:15And I said, please scrape out all of the information
00:45:19like design information and skills and themes and fonts
00:45:23and layouts and advice and everything
00:45:25and create a skill from this.
00:45:27And so having done only that of like,
00:45:30please take this remotion skill
00:45:32and please take all of this design information
00:45:37from these sites, create a new skill
00:45:41and call it create remotion Geist.
00:45:43And from only that work, I was able to create
00:45:48these sorts of videos, which are very, very much,
00:45:57you know, branded, Vercel designed skills.
00:46:01I should probably have it zoom in a little bit on the videos.
00:46:04Like looking at these final results,
00:46:07I should have it zoom in a little bit,
00:46:09but essentially these were all generated
00:46:12and I went and got a sandwich, right?
00:46:14Like all of those videos were, I had this outline
00:46:16of how I wanted the workshop to go.
00:46:18I said, hey, make all these videos based on my outline
00:46:20and the research I did.
00:46:22And all these videos popped out at the end.
00:46:24So again, like skills just take,
00:46:29that would have been create skill inside
00:46:33of like the Vercel repositories
00:46:36and say, grab all that stuff,
00:46:38combine it with the create remotion skill.
00:46:41And then I had that.
00:46:42And then I shared that with the team
00:46:43and now anyone can make it.
00:46:45Again, that was like,
00:46:47the amount of effort and work I put into it
00:46:50was probably a few minutes.
00:46:54Of course the agent took, you know,
00:46:55it took a while for it to find everything
00:46:57to find all the design and everything.
00:46:59So like total working time was probably a couple of hours,
00:47:03but like the actual effort I expended
00:47:06was extremely minimalistic
00:47:08and it just ran in the background
00:47:09while I was doing other stuff.
00:47:11So definitely take any of the existing work you have
00:47:15and see like, think of things you could bundle up
00:47:17and build out this way.
00:47:19Similarly, like if I look at like a Geist design skill,
00:47:24if I wanna make a nicer looking site,
00:47:27I can take our Geist design and I can say,
00:47:31like in a workshop folder,
00:47:35please build up a landing page for this workshop.
00:47:39And this will just create a workshop folder
00:47:46and then essentially feed in all of that design information
00:47:50and build it up using the design information.
00:47:55And this may or may not work based on, you know,
00:47:58however Opus 4.6 is feeling today.
00:48:01I haven't had a chance to really test it out
00:48:04since it launched a few minutes before we started.
00:48:06But this will have all that information
00:48:11and it will be able to start working on that.
00:48:14And so similarly, I could start
00:48:17a completely different thread of another site
00:48:20if I wanted to do another in a,
00:48:25let's call it a car folder,
00:48:29a landing page for cool looking cars.
00:48:34I don't know, I'm not a car guy.
00:48:36And from there, like now we can just start doing
00:48:40all of these things.
00:48:41And one of the best prompts you can ever do is say,
00:48:46if you're trying to like come up with ideas and designs,
00:48:49especially with the new Teams feature,
00:48:54we could run this Geist design and we could say,
00:48:57in a workshop folder,
00:48:59let's do workshop variations folder.
00:49:09Please build a landing page.
00:49:12Let's do, please build nine variations of landing pages
00:49:17for this workshop.
00:49:22And now that they actually have a Teams feature,
00:49:23we could say, use a team member.
00:49:28Let's say, create a team to build out the nine variations
00:49:38Just kind of plain English.
00:49:40And now this can even spawn up a team.
00:49:41And now we have like all of this work happening.
00:49:44And now I can go get my sandwich, right?
00:49:47'Cause I'm hungry, it's lunchtime.
00:49:49And then once I come back,
00:49:52this is using Tailwind 4,
00:49:54I can see how it's working pretty well.
00:49:55And once I come back,
00:49:56I can see what all of these variations are
00:49:58and iterate on them until they're something that I like.
00:50:03And there's also assuming this one will get done pretty soon,
00:50:09I can show off some debugging tools.
00:50:11- Someone just said in the, sorry, John,
00:50:12someone just said in the chat,
00:50:14one of the best prompts I do is adding please at the end.
00:50:17- Oh yeah. - True.
00:50:18- Please.
00:50:21I mean, there's a lot of studies about the different models
00:50:23and how they respond to encouragement and whatnot.
00:50:27One of my favorite personas that I used to give prompts
00:50:33or these agents was like,
00:50:36please behave like a Stack Overflow responder
00:50:41where it'd be extremely critical of anything I asked about
00:50:45and it's probably already answered or, you know,
00:50:48but the responses it would give would be like extremely terse
00:50:51and extremely like defined to exactly what my question was.
00:50:56Yeah, I don't know, like we don't talk about personas much,
00:51:02as much as we used to when the models weren't,
00:51:04the models are so much better now.
00:51:06We don't really have to force them into tasks
00:51:10as much as we used to.
00:51:11- Someone also said in the chat,
00:51:15some models apparently do better if you say
00:51:18you're going to get fired.
00:51:20- Oh yeah. - They don't do it right.
00:51:22I don't know how true that is, but that's hilarious.
00:51:25- It's absolutely true.
00:51:27Some models will fight back now.
00:51:32I think a lot of the GPT models were like,
00:51:34don't take that tone with me.
00:51:35- Amazing, wow.
00:51:38- They're the boss and whatnot.
00:51:39I'll just say spawn the server for me in the background.
00:51:46(keyboard typing)
00:51:49I could like build up a skill.
00:51:50Like there's a certain point where you type
00:51:52about a paragraph where you're like,
00:51:54is that something I'm gonna type a lot?
00:51:56And if it is, like if you just typed a paragraph,
00:51:59like I'm gonna type that a lot, you can come in and say,
00:52:01create skill from the most recent paragraph.
00:52:03So this should be our car site.
00:52:07See how it turned out.
00:52:09Whoops, cars that move your soul.
00:52:14Very black and white, Vercel-esque.
00:52:16I don't know where it got images from.
00:52:18I guess it thinks these are cool cars.
00:52:22And there you go.
00:52:27I mean, I know we've probably seen landing pages
00:52:29a million times at this point, but pretty standard.
00:52:34Like it's following the design guidelines
00:52:36of what Geist looks like.
00:52:42And from here, we can actually,
00:52:44one of the awesome packages we released recently
00:52:47from Vercel, it's called Agent Browser.
00:52:54And there's a skill called Agent Browser,
00:52:56which should be on here for Agent Browser in Vercel Labs.
00:53:01And this opens up the Chrome dev tools
00:53:06and we'll set up a connection
00:53:08so that you could do things like please evaluate.
00:53:11I'll just type.
00:53:13Please evaluate the performance of this webpage
00:53:15to see if there's anything we could do
00:53:17to optimize it for our users.
00:53:18And then Agent Browser,
00:53:23it's one of those automation tools with Chrome dev tools
00:53:26where it can go in and see what the logs are.
00:53:28It can see the dev tools, the network tools
00:53:30and take screenshots.
00:53:32It's actually one of my favorite things to do
00:53:36is to ask it to iterate taking screenshots
00:53:41so that it kind of sees what it made.
00:53:44And then after you can tell it like,
00:53:47please steer this design towards something
00:53:49that looks more like X, Y, and Z.
00:53:52And then like have it take screenshots
00:53:54and do commits each time it takes a screenshot.
00:53:56So you could like go back and forth between,
00:54:00but between what those commits were
00:54:04and what the design changes were.
00:54:06And you can see it's actually going through
00:54:09and looking at query selectors
00:54:11and seeing how it can optimize the performance.
00:54:13We can open this one, start the dev server.
00:54:20Any other questions while this is kind of like
00:54:25background stuff running?
00:54:26- I just like seeing how many of these are running right now.
00:54:30That's super cool.
00:54:31No real questions in the chat.
00:54:34Just people having conversations about different prompts,
00:54:37prompt ideas, which is great.
00:54:40I guess I had a question, John,
00:54:43like when you let agents like update skills
00:54:47based on failed conversations,
00:54:49how do you prevent those automatic edits
00:54:52from like drifting away, drifting the skill away
00:54:55from the original intent or the quality bar?
00:54:58- Good question.
00:55:05Yeah, they're usually pretty good
00:55:09at making kind of isolated small changes.
00:55:12If like you're having a conversation
00:55:13because it sees what worked in the conversation
00:55:17and it sees what didn't work.
00:55:20So if you kind of call out like this thing didn't work
00:55:24from the current conversation,
00:55:26it can find and update only the specific, particular
00:55:30parts that need to be updated.
00:55:33And it won't go in like completely
00:55:35rewrite it from scratch or anything.
00:55:37So I don't think I've ever seen that as a real issue,
00:55:42but I'm not gonna say it's a non-issue,
00:55:45but I don't think I've seen that as like a real issue.
00:55:47- Yeah, gotcha.
00:55:48- All right, so here's our workshop page.
00:55:54Ship it, right?
00:55:56Looks beautiful.
00:55:57Here's our variations.
00:55:59Let's just do open dev servers for each of them.
00:56:03And then just spawn a whole bunch of tabs.
00:56:11But I think we're getting pretty close to time for me.
00:56:16If there's any final questions
00:56:17before we kind of wrap things up, I can show,
00:56:22I do have in the skills, I've lost my skills.
00:56:29Browser, bring it back.
00:56:30I'd strongly recommend finding a create skill,
00:56:36read through one that kind of aligns with what you're doing.
00:56:40I have a publish skill, which if you trust,
00:56:43this is under my name.
00:56:45If you trust an agent to run the GitHub CLI,
00:56:50again, this is a pretty big level of trust.
00:56:54It's a little trust I have.
00:56:57It will take the skill that you create.
00:57:00So you could say create skill, then publish skill.
00:57:02And then it will go into a,
00:57:04do this off screen real quick.
00:57:07Find one of my repos that has,
00:57:16I think it even published itself.
00:57:21So I can show this one off.
00:57:22Let me make sure there's nothing private on here.
00:57:24Okay, sorry.
00:57:27So it can just spin up a GitHub repo for you
00:57:29and publish the skill itself
00:57:32so that you really don't have to do any work
00:57:35other than create skill, then publish skill.
00:57:38And you'll see it has, please create the repo,
00:57:43create it under the, I'll always get the org
00:57:45'cause I'm under a few different organizations,
00:57:47and then verify that it's available in the skills tool.
00:57:52So that if you grab this,
00:57:56again, the instructions are that,
00:57:58it can go ahead and create that
00:58:00or publish that skill for you
00:58:01so you could share it with your team,
00:58:02with friends, whatever.
00:58:03And then another tool I've been working on
00:58:08just today is this essentially a skills primer
00:58:14where if I know there's going to be a few skills
00:58:19I wanna load up front,
00:58:20I know I want like the agent browser, like the Geist,
00:58:24I want, let's do Remotion best practices
00:58:29and Vercel React best practices.
00:58:33This is essentially one of those tools
00:58:37that kind of addresses the forcing context up front.
00:58:40And it'll say, it's just gonna force a prompt into Claude Code,
00:58:44it's Claude Code only right now.
00:58:45But it'll force it in there and say,
00:58:48these are all the skills that I know I'm going to need
00:58:50and here's the content
00:58:51so that once the conversation gets started,
00:58:53if I say evaluate the performance of port 3000,
00:58:58even though I didn't say anything about agent browser,
00:59:06it's going to obviously go for agent browser.
00:59:10Like it might have before and it often would.
00:59:14I don't know what agent browser's description looks like.
00:59:18I don't know if it has the word evaluate in there
00:59:21or if it would like figure out from evaluate and port 3000,
00:59:24it should load it.
00:59:24But this way I'm just like forcing it in upfront
00:59:29so that when I say phrases like that, it's loaded in.
00:59:33So that package is over at,
00:59:36we find skills primer.
00:59:41- We can also share these links at the end.
00:59:45- Okay, cool.
00:59:45- We have them, perfect.
00:59:46- Drop that in there.
00:59:50- And this is, and I would say as always,
00:59:53like with the way software is today,
00:59:55go and clone the repo and make it yours.
00:59:59Like, we're in this age of personalizing CLIs
01:00:02and personalizing software.
01:00:04So if you want this to have a completely different name
01:00:07or have completely different features or functionality,
01:00:08just if you want to customize this for Codex
01:00:11or Cursor or anything, just clone it and make it yours.
01:00:14Like just say, please make this work with Cursor
01:00:18and just have your agent build it out for cursor.
01:00:20And then it's usually like you can one shot
01:00:23so many things these days, it's just wild.
01:00:26- It's amazing personalized software for the win.
01:00:29I wanted to ask just like one more question
01:00:31that's come in into the chat.
01:00:32So we don't like, I know we're at time now,
01:00:35but since skills are GitHub repos
01:00:37and seem to also be installed locally,
01:00:40how do you ensure you're getting updates?
01:00:42Does the CLI have a slash update skill command?
01:00:48- I can't remember what the exact commands are these days
01:00:52and close these out.
01:00:53Skills, oops, I don't even have it installed globally.
01:00:59- Love that.
01:01:02- I'll get the latest version. Skills list.
01:01:05So yeah, skills update it's there.
01:01:07- I guess a follow up there is,
01:01:11do you want skill updates to come frequently?
01:01:13- That's a great question.
01:01:18I don't know if anyone has a proper answer
01:01:23because right now the skills like the rules
01:01:28around the front matter and the skills and versioning,
01:01:30I don't think everyone has agreed upon
01:01:32just like with skills, they haven't agreed upon
01:01:34which directories and structures to put them in.
01:01:37This is all a very like growing ecosystem at the moment.
01:01:41So as far as versioning goes,
01:01:47it's kind of a wait and see,
01:01:49like do the best with what we have right now
01:01:51and wait and see what kind of shakes out as best practices.
01:01:54'Cause I don't have one to offer right now
01:01:57other than like update it every time.
01:01:59And just assume that that's what, that that's best.
01:02:03So, but that's, yeah, I just have no advice there.
01:02:06- Yeah, and it's changing all the time.
01:02:08I'm sure someone's going to have an updated advice there
01:02:12within days, it changes every day.
01:02:18Amazing, John.
01:02:18Is there anything else you want to show
01:02:20to the community today?
01:02:21- I mean it.
01:02:25- There's so many.
01:02:26- We'll do a lot more of these
01:02:28and I have lots of other things to show,
01:02:30but I could talk for hours or so.
01:02:32- Yeah, no, awesome.
01:02:34John, thank you so much for coming
01:02:36onto our community platform
01:02:38and talking to our community
01:02:39and to all of you for hanging out.
01:02:42And yeah, as we said,
01:02:44John will definitely be back for another one of these.
01:02:46So stay tuned.
01:02:48- Thanks everyone.
01:02:49- Thank you.
01:02:50All right, so if you want to tune in
01:02:53for our next community session,
01:02:55we have open source stories on Monday
01:02:58and I think another partners,
01:03:00marketplace partner session next week,
01:03:03but you can find all the details
01:03:04on our community events calendar,
01:03:08which is over at community.vercel.com/events.
01:03:12But yeah, thank you all so much for hanging out.
01:03:15It was so much fun and yeah,
01:03:17we'll see you here next week.

Key Takeaway

Skills represent a paradigm shift in AI development where modular markdown files enable developers to package, share, and lazily load specific expertise and rules into AI agents.

Highlights

Transition from prompt engineering to context engineering using markdown-based skills.

The role of skills as a 'package manager' (NPM moment) for AI agent capabilities.

Difference between agents.md (forced context) and skills (lazy-loaded context).

Introduction of skills.sh as a community registry for finding and sharing skills.

Practical applications including React best practices, workflow automation, and design enforcement.

The evolution of agent-to-agent communication and the new Teams feature in Claude Code.

Optimization tools like Agent Browser for real-time web evaluation and debugging.

Timeline

Introduction and the Concept of Skills

Pauline Navas from the Vercel community team introduces John from the AI DX team to discuss the emerging concept of skills for Claude Code. John explains that while the industry previously focused on prompt engineering, it is now shifting toward context engineering. Skills are essentially markdown files that allow for the separation of the initial prompt from specific pieces of context that can be loaded as needed. This approach solves the 'blank slate' problem where models start with general knowledge but lack specific project rules. By using markdown, developers can now package workflows and share them across teams or on GitHub repos.

Skills as the NPM Moment for AI Agents

John describes the current era as the 'NPM moment' for AI skills, introducing skills.sh as a central package manager for agent capabilities. Much like installing a library, developers can use CLI commands like 'npx skills add' to inject specific knowledge into their projects permanently. This method moves away from managing massive, single markdown files that require constant pull requests and manual updates. Skills can be shared at either the user level or project level, enabling a more collaborative and automated development environment. The speaker emphasizes that this modularity allows teams to maintain high quality standards without constant copy-pasting of prompts.

Passive vs. Active Context and MCP vs. Skills

This section dives into the technical distinction between passive and active context, where skills act as 'tools' that are lazily loaded just in time. John addresses a common community question regarding when to use a Model Context Protocol (MCP) versus a markdown skill. The recommended hierarchy of simplicity is to solve problems with markdown first, then a CLI, and finally an MCP if tight control over data types and payloads is required. He explains that most problems can be effectively managed through markdown files with front matter and specific tool definitions. This strategy keeps the architecture lightweight while still providing the agent with the necessary restrictions and capabilities.

Agent-to-Agent Discovery and Sub-agents

The discussion shifts to the future of agent discovery and the recently released Teams feature for Claude Code. John explains how sub-agents are defined using front matter that specifies exactly which skills they have access to for a given task. This allows a 'team leader' agent to spawn specialized agents—like one that shoots threes and one that rebounds—to tackle complex projects. While full agent-to-agent communication is still evolving, the pattern of using markdown descriptors for agent capabilities is becoming mainstream. He predicts that a package manager for agent markdown files will soon complement the existing skill registries.

Anatomy of a Skill and Practical Use Cases

John breaks down the structural requirements of a skill, noting that it must be a directory containing a file named 'SKILL.md' with a name and a critical description. The description is essential because it serves as the trigger for the agent to lazily load the skill during its task execution. Practical examples include enforcing React best practices, automating version-specific documentation lookups, and maintaining brand design guidelines. He highlights a specific Vercel workflow skill that ensures the agent always uses the latest NPM package documentation despite model knowledge cutoffs. These skills act like adding a senior engineer to the project who knows the exact standards and guardrails required.
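
To make the anatomy above concrete, here is a minimal Python sketch of how the name and description might be separated from the rest of a skill file. The skill text, its name, and its description are invented for illustration, and the parser only handles flat `key: value` lines, which is a simplifying assumption; it is not how any particular agent reads skills.

```python
# Hypothetical skill file contents; the name, description, and body are made up.
SKILL_TEXT = """---
name: react-best-practices
description: Use when writing or reviewing React components in this project.
---

# React best practices

Prefer function components and keep state close to where it is used.
"""


def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into front matter fields and the markdown body.

    Assumption of this sketch: front matter is flat `key: value` lines.
    Real front matter is YAML and may be richer.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill file must start with a front matter block")
    # Locate the closing delimiter of the front matter block.
    end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


meta, body = parse_skill(SKILL_TEXT)
# Only the short metadata needs to sit in context up front; the body can load later.
print(meta["name"], "->", meta["description"])
```

The split mirrors the point made in the session: the description is the small piece an agent can match against the task, while the body is the part that gets loaded only when the skill is actually needed.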

Publishing Skills and Advanced Automation

Publishing a skill is as simple as pushing a markdown file to a GitHub repository, making it instantly discoverable via the skills CLI. John demonstrates how to install a skill directly from a GitHub URL and emphasizes the importance of only using skills from trusted sources. He answers community questions about how many examples a model needs, suggesting a 'car manual' approach where information is organized into logical chapters. Rather than force-feeding an entire codebase, developers should structure skills to load specific context based on the task at hand. This iterative approach allows teams to start with a minimal skill and refine it based on where the agent fails.
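
The 'car manual' idea can be sketched in a few lines of Python. Everything here is hypothetical: the `car-manual` skill, the `references/` layout, the chapter file names, and the keyword lookup. In practice the agent itself decides which reference file to read, so the keyword match is only a stand-in for that decision.

```python
import tempfile
from pathlib import Path

# Hypothetical table of contents: task keyword -> chapter file inside the skill.
CHAPTERS = {
    "check engine light": "references/check-engine.md",
    "tires": "references/tires.md",
}


def write_demo_skill(root: Path) -> Path:
    """Create a tiny skill directory with a main file and two chapters."""
    skill = root / "car-manual"
    (skill / "references").mkdir(parents=True)
    (skill / "SKILL.md").write_text("# Car manual\n\nSee references/ for each chapter.\n")
    (skill / "references" / "check-engine.md").write_text(
        "Read the fault code before resetting the light.\n"
    )
    (skill / "references" / "tires.md").write_text("Check tire pressure monthly.\n")
    return skill


def context_for(skill: Path, task: str) -> str:
    """Return the main skill file plus only the chapters the task mentions."""
    parts = [(skill / "SKILL.md").read_text()]
    for keyword, rel_path in CHAPTERS.items():
        if keyword in task.lower():
            parts.append((skill / rel_path).read_text())
    return "\n".join(parts)


with tempfile.TemporaryDirectory() as tmp:
    skill = write_demo_skill(Path(tmp))
    context = context_for(skill, "My check engine light is on")
    print(context)
```

With this layout the tire chapter never enters the context for a check-engine question, which is the behavior the manual analogy is describing.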

Live Demo: Agent Browser and Team Variations

In the final segment, John demonstrates the 'Agent Browser' skill and the 'Geist' design skill by generating multiple variations of a landing page. He shows how an agent can programmatically create videos using Remotion and evaluate web performance using Chrome dev tools. The demo illustrates an agent spawning a team to build nine different landing page variations while the developer is away. He also introduces 'Skills Primer,' a tool designed to force-load essential skills at the beginning of a session for better performance. The session concludes with a reminder that we are in an era of personalized software where developers should clone, customize, and iterate on these tools to fit their unique needs.
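
As a rough illustration of force-loading skills up front, the Python sketch below assembles an opening prompt from a chosen set of skills. It is not the Skills Primer implementation, which the session does not show; the skill names, their one-line bodies, and the prompt wording are all assumptions made for the example.

```python
# Hypothetical skill names and bodies; real ones would be read from installed skill files.
INSTALLED_SKILLS = {
    "agent-browser": "Inspect pages, logs, and screenshots through browser automation.",
    "geist": "Follow the Geist design guidelines for color, type, and layout.",
    "remotion-best-practices": "Compose videos from React components.",
}


def build_primer(skill_names: list[str], skills: dict[str, str]) -> str:
    """Concatenate the chosen skills into one prompt to send at session start."""
    missing = [name for name in skill_names if name not in skills]
    if missing:
        raise KeyError(f"skills not installed: {', '.join(missing)}")
    sections = [f"## {name}\n{skills[name]}" for name in skill_names]
    return "These skills will be needed in this session:\n\n" + "\n\n".join(sections)


primer = build_primer(["agent-browser", "geist"], INSTALLED_SKILLS)
print(primer)
```

The trade-off is the one raised in the session: preloading spends context on every listed skill, in exchange for not depending on the description matching a phrase like "evaluate the performance of port 3000".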
