Transcript

00:00:00Many people have started perfecting their own agent skills and open sourcing them for the community.
00:00:05While most of their skills are genuinely useful, some of them are just weird.
00:00:08But despite being weird, they are useful in ways you wouldn't expect.
00:00:12One of them solved the biggest problem we faced while handling multiple sessions
00:00:16and it did it in a way that was fun but actually worked.
00:00:18Another one fixed the token bloat problem and it did it in a way we didn't see coming.
00:00:22We kept finding more like these, ones that genuinely helped even though they sounded ridiculous at first.
00:00:27And all of them ended up in our workflow in ways that made everything more interesting than before.
00:00:32Now if you are someone like us who uses multiple Claude sessions at the same time
00:00:36and lets them run simultaneously on different tasks,
00:00:39you must have had to manually keep track of which session has completed its work and which has not.
00:00:43There are also cases where you think Claude has spent enough time
00:00:47only for you to open that session and see it blocked on a permission prompt.
00:00:50For this you need Peon Ping, a skill that notifies you whenever Claude has completed a task
00:00:55or is waiting on a permission prompt, so that you can give attention to the session.
00:00:58But it does not use standard notifications.
00:01:00It actually uses voices from popular games with multiple modes and different game characters.
00:01:05You can set this up for any coding agent that you use.
00:01:08You can install this plugin using the installation command according to the operating system you are on and run it.
00:01:13After the setup you can use the slash command to pick your favorite voice from multiple voice packs.
00:01:18Now whenever you give Claude a task and get busy with some other tasks,
00:01:22when Claude finishes you will get a notification with the game character's voice in the background.
00:01:27It uses expressions from the game to indicate that the task is done, and they vary depending on the task.
00:01:32When you start a new session, you also get a voice notification indicating Claude is ready to work.
00:01:37This way you don't have to manually check everything and instead get engaging notifications to do the same.
00:01:42We share all the tools and workflows we find on building products with AI on this channel.
00:01:47So if you want more videos on that subscribe and keep an eye out for future videos.
00:01:51We always talk about using adversarial review mode because it critically evaluates across so many different aspects and that's what makes it so effective.
00:01:58So there is a skill in this pack called dog food.
00:02:00What this skill does is explore a web app and identify bugs and UX issues using an adversarial review style.
00:02:06It uses the agent browser which is a CLI tool that allows agents to interact with pages by sending keys and referencing elements properly.
00:02:15We have covered this already in our previous video where we talk about how to set it up and use it.
00:02:19So it is important to ensure that the agent browser is installed when you install this skill.
00:02:24You can provide a link to the website you want it to test or simply tell it to test the app.
00:02:28You can also provide a hosted URL or a localhost link.
00:02:32Once you do that, it first initializes a report and then uses the agent browser to go through the pages of your application one by one.
00:02:39After completing an in-depth review of the application, it reports all issues it finds.
00:02:43This includes steps to reproduce each bug, screenshots and a full breakdown of critical, medium and low priority issues.
00:02:50It even records a video showing the entire walkthrough, making it a highly detailed review.
00:02:54Now if you are someone who's annoyed when Claude gives unnecessarily long explanations, with answers full of excited words that are of no help,
00:03:02which gets especially annoying when it's not doing the task at hand correctly,
00:03:06Caveman is a plugin that solves exactly that by making Claude talk like a caveman, cutting 75% of the tokens from its responses while maintaining technical accuracy.
00:03:15The idea behind it is that cavemen use few words to get their whole point across, and the Caveman skill works the same way.
00:03:23It makes Claude reply using fewer tokens by choosing direct, to-the-point words and dropping articles.
00:03:31In particular, this cuts down the filler words that Claude tends to inject that are not even relevant, because we are just concerned with getting the work done.
00:03:38There are different modes in this plugin, the highest of which is the Wenyan mode.
00:03:42This uses the Chinese language instead of English because Chinese can represent a whole sentence in far fewer tokens, while English takes a lot more tokens to say the exact same thing.
00:03:52But before switching to Chinese, keep in mind that the accuracy of models on languages other than English is generally lower,
00:03:59so it's better to stick with the English caveman language than Wenyan.
00:04:02And with this plugin the main benefit you get is that you receive responses that are easier to read while maintaining accuracy because this way only the fluff is removed and Claude still gets the whole point across.
00:04:13It's available for all major agents, but for Claude Code you first need to run the command that installs the plugin marketplace.
00:04:20Once this plugin marketplace is installed, you can run the plugin command, search for caveman and install it in whichever scope you want.
00:04:27After installing, you can access the plugin once you reload the plugins.
00:04:30You can set the intensity level by using the caveman command and specifying the intensity level you want.
00:04:35From that moment onwards, all the explanations will be cut straight to the point.
00:04:39So if you ask it to explain any particular part of the app, it will explain that part using fewer words that are easier to understand and consume,
00:04:48often using arrows to explain the whole flow in a much more compact way than it would without this plugin.
00:04:54Now if you track your projects using git and use it as a knowledge base for tracking what has been done in your project, you can use this skill called git time travel.
00:05:02It basically gives your agent expertise in navigating git history and enables it to understand the entire history like a time travel log.
00:05:09When you install the skill, the skill.md file is installed along with additional references containing patterns and validations.
00:05:16These check for different types of issues like force pushing to main or rebasing without proper backup, which can lead to problems later on.
00:05:23You can use this skill to analyze any issue that comes up in your git logs.
00:05:27Once you provide a prompt, it follows the instructions from the skill file.
00:05:30After going through the entire history like time travel, it gives a detailed report.
00:05:34It points out everything that went wrong and provides recommendations and areas that need attention.
00:05:39But before we move forward, let's have a word from our sponsor, FreeBuff.
00:05:42You're in the middle of a build and your coding agent is lagging, burning through credits and asking permission for every command.
00:05:48FreeBuff skips all of that.
00:05:49FreeBuff is the free coding agent that is up to 10 times faster than Claude Code.
00:05:53Use any terminal except the Apple native terminal, run this command and you're all set.
00:05:58No subscription, no config, it's funded by simple text ads so it costs you nothing.
00:06:02Say you're deep in a project and need to test something in the browser, review your code or search through your code base.
00:06:08FreeBuff's got 9 sub-agents that kick in and handle all of that at 300 tokens per second.
00:06:13You finish a task and don't know what to tackle next.
00:06:15It drops 3 follow-up prompts you can just click to keep going.
00:06:19You can also connect your ChatGPT subscription, which unlocks GPT 5.4 for plan and review.
00:06:25Your code base isn't stored and nothing trains on your data.
00:06:27Try FreeBuff for free today.
00:06:29Links in the pinned comment.
00:06:30Now if you are working on an app and want to have issues identified before the app goes into production, you can use the pre-mortem skill from this skill pack.
00:06:38What this does is look at the code base, identify all the fragile areas and predict possible issues that could occur with the current implementation.
00:06:45It analyzes the code from different angles and then writes proper realistic reports for bugs that haven't even occurred yet but have a possibility of happening in the future when the app goes live in production.
00:06:56When you install the skill, you get a skill.md file that details everything needed to identify issues in the app.
00:07:02This includes the full workflow, how it should handle different aspects and what patterns need to be checked for reporting.
00:07:08And this catalog is quite extensive.
00:07:10The report also follows a proper format that defines how everything should be documented.
00:07:14You can use this skill in any project where it's installed.
00:07:17Just run the pre-mortem command and it will start analyzing the code base and generate a thorough report once the analysis is complete.
00:07:24It may also ask which aspects you want to focus on.
00:07:27The final report will contain all the bugs present in the current code base along with issues that might occur in the future so you can take action on them in time.
00:07:35In the same resource skill pack, there is another skill called mutation testing.
00:07:40It analyzes your entire test suite and evaluates it by introducing different types of bugs and mutations one at a time.
00:07:46It checks whether the test cases are strong enough to catch them.
00:07:49It makes mutations in the code and then reverts them, analyzes the gaps and generates a report with recommended changes.
00:07:55Once you run the skill, it starts by analyzing the project structure, finding the test files and then testing all of them one by one.
00:08:02Since it uses Git to revert the mutations, it ensures that all changes are committed beforehand.
00:08:08It applies changes to different components and checks whether the tests correctly detect those changes, verifying whether all test files are properly written or not.
00:08:16Once it has gone through all the checks, it generates a complete report with a mutation score.
00:08:21It lists uncaught issues and recommends improvements needed to make the test suite more complete and reliable.
00:08:27Now there is another skill that goes by the name of the fool.
00:08:30This skill critically analyzes and stress tests an idea, plan, decision or proposal.
00:08:35It uses multiple modes and stories to help you understand whether the direction you are taking is actually right and will hold up in the long term.
00:08:42It has multiple modes that you can choose from.
00:08:44Once you install this skill, it brings all the skill.md files and references for different modes into your project.
00:08:51You can use the command and provide whatever you want it to challenge.
00:08:54It first asks how you want the idea to be challenged and you can choose any option.
00:08:58Based on your selection, it loads the relevant references from the skill so it can reason accordingly.
00:09:03At the end of its process, it generates a detailed report containing multiple failure modes.
00:09:08It explains why things might fail and what consequences could result from those failure chains.
00:09:13It then lists all the findings in a structured sequence.
00:09:15You can push back on its analysis and iterate with it, refining your ideas along the way.
00:09:20Also, if you are enjoying our content, consider pressing the hype button because it helps us create more content like this and reach more people.
00:09:27Now if you've tried to do research on Reddit via Claude Code, you might have noticed that Reddit blocks bots like Claude Code, making it difficult to access content.
00:09:35And Reddit is one of the most important sources for user input because a lot of people go there to share reviews and discuss different issues.
00:09:42So input from Reddit is very critical if you are researching the market.
00:09:45For that purpose, there is a skill pack that includes a skill called Reddit Fetch.
00:09:49What this skill does is fetch content from Reddit using either the Gemini CLI or a curl fallback so it can access Reddit more reliably.
00:09:57It works by first trying to use the Gemini CLI via TMUX.
00:10:01TMUX is a terminal multiplexer that allows you to spawn terminals within a session and handle multiple tasks in parallel.
00:10:08If that method fails, it falls back to using the curl JSON API.
00:10:12The skill provides detailed instructions on how to use both approaches properly.
00:10:16Once you install this skill, you can use it and specify the topic you want to research on Reddit.
00:10:21In the end, it provides a detailed report on what people on Reddit are actually saying about the topic or issue.
00:10:26As you probably know, agents tend to converge toward common patterns when building UI, and they often end up using the same purple and white theme.
00:10:35So there is a skill called Color Expert that acts as a guide and gives agents an understanding of color science.
00:10:41It covers different aspects like WCAG, palettes and more.
00:10:44Now you might think that there are already so many UI skills out there, so how is this one any different?
00:10:49This one is different because it contains multiple references, with literally 100+ markdown files that provide detailed guidance on which UI choices are right and which are not.
00:10:59These references are collected from multiple grounded sources including Wikipedia, YouTube scripts and more.
00:11:05So we tested it on our app which was a landing page for our community.
00:11:08The agent first loaded the skill, understood the codebase and the guide properly and then started implementing the app following the patterns listed in the skill.
00:11:16When it did, you could see that the UI it generated was much more balanced, using white space and other elements properly.
00:11:22It used a color palette that was more interactive and engaging and it captured attention toward what mattered.
00:11:28Overall, it was a much better website than it would have been without the skill, despite the simple prompt we gave.
00:11:34That brings us to the end of this video.
00:11:35If you'd like to support the channel and help us keep making videos like this, you can do so by using the super thanks button below.
00:11:42As always, thank you for watching and I'll see you in the next one.

Key Takeaway

Implementing specialized skill packs like Caveman for 75% token reduction and Pre-mortem for predictive bug analysis transforms Claude from a general assistant into a precise, automated software engineering agent.

Highlights

The Peon Ping skill uses game character voices to notify users when tasks finish or require permission prompts during simultaneous multi-session Claude workflows.

Caveman mode reduces token consumption by 75% by removing filler words and articles while maintaining the technical accuracy of the response.

Dog Food utilizes an agent browser CLI tool to navigate web apps, identify bugs, and generate reports with reproduction steps and walkthrough videos.

Mutation testing introduces bugs into a codebase to evaluate if an existing test suite is strong enough to detect them, resulting in a mutation score.

Reddit Fetch bypasses bot blocks using the Gemini CLI via TMUX or a curl fallback to extract user reviews and market research data.

Color Expert incorporates over 100 markdown reference files to guide agents in applying color science and WCAG standards beyond default themes.

Timeline

Automated Notifications for Multi-Session Workflows

  • Peon Ping monitors multiple Claude sessions to notify users of task completion or pending permission prompts.
  • The tool replaces standard system notifications with character voice lines from popular video games.
  • Installation requires an OS-specific command followed by a slash command to select preferred voice packs.

Managing several simultaneous Claude sessions often leads to idle time when agents wait for manual permissions or finish tasks unnoticed. Peon Ping solves this by using distinct game-inspired expressions that vary based on the specific task status. Users receive a voice notification when a session starts and when it completes, removing the need for manual status checks.

Adversarial UI Testing and Token Optimization

  • Dog Food performs adversarial reviews of web apps to document UX issues and critical bugs.
  • Caveman mode cuts 75% of token usage by stripping conversational fluff and irrelevant filler words.
  • Wenyan mode uses Chinese characters to represent complex sentences in even fewer tokens for maximum compression.

The Dog Food skill uses the agent browser CLI to interact with elements, capturing screenshots and recording video walkthroughs of identified issues. To combat Claude's tendency for long-winded explanations, the Caveman plugin forces a direct communication style using arrows and compact flows. While Wenyan mode offers higher token density, English Caveman mode is recommended for maintaining higher model accuracy.
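The compression idea behind Caveman can be illustrated with a toy text filter. This is only a sketch of the concept, not how the plugin works: the plugin changes how Claude writes, while this code strips words after the fact. The word lists and the function names `caveman` and `reduction` are hypothetical, and word counts are used as a rough stand-in for tokens.

```python
import re

# Hypothetical word lists, chosen only for illustration.
ARTICLES = {"a", "an", "the"}
FILLERS = {"really", "just", "basically", "actually", "certainly", "simply", "very"}


def caveman(text: str) -> str:
    """Drop articles and filler words, keeping every other word in order."""
    kept = []
    for word in text.split():
        # Compare on the bare lowercase word so "The" and "just," still match.
        core = re.sub(r"[^\w']", "", word).lower()
        if core in ARTICLES or core in FILLERS:
            continue
        kept.append(word)
    return " ".join(kept)


def reduction(original: str, compressed: str) -> float:
    """Fraction of whitespace-separated words removed (a rough proxy for tokens)."""
    before = len(original.split())
    after = len(compressed.split())
    return 1 - after / before if before else 0.0


if __name__ == "__main__":
    sentence = "The function just returns a list"
    short = caveman(sentence)
    print(short)
    print(f"{reduction(sentence, short):.0%} fewer words")
```

Real savings depend on the model's tokenizer and on how much filler the original response contained, so the figure printed here should not be read as a measurement of the plugin.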

Predictive Analysis and Version Control Forensics

  • Git Time Travel analyzes historical logs to identify risky patterns like force pushes or rebasing without backups.
  • Pre-mortem identifies fragile code areas to predict bugs before they occur in a production environment.
  • Both skills utilize skill.md files that define specific workflows and reporting formats for the agent.

Git Time Travel acts as a knowledge base expert, navigating the entire commit history to provide recommendations on what went wrong in a project. The Pre-mortem skill focuses on future prevention by analyzing the codebase from multiple angles and generating realistic reports for potential failures. These tools ensure that developers can take action on architectural weaknesses before they manifest as live incidents.
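A minimal sketch of the kind of pattern check described for Git Time Travel, assuming the history is available as reflog-style text lines. The rule set, the labels, and the `flag_risky` helper are hypothetical and are not taken from the skill's reference files; the sample lines are made up for illustration.

```python
import re

# Hypothetical rule set: each pattern is matched against one reflog-style line.
RISKY_PATTERNS = {
    "history rewritten by rebase": re.compile(r"\brebase\b"),
    "branch pointer moved by reset": re.compile(r"\breset: moving to\b"),
}


def flag_risky(lines):
    """Return (line_number, label) pairs for lines that match a risky pattern."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings


if __name__ == "__main__":
    # Illustrative input only; these entries do not come from a real repository.
    sample = [
        "a1b2c3d HEAD@{0}: commit: add login form",
        "d4e5f6a HEAD@{1}: rebase (finish): returning to refs/heads/main",
        "0a1b2c3 HEAD@{2}: reset: moving to HEAD~1",
    ]
    for number, label in flag_risky(sample):
        print(f"line {number}: {label}")
```

A plain text match like this also fires on commit messages that happen to contain the word "rebase", so it is a starting point for review and not a verdict.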

Mutation Testing and Strategic Idea Evaluation

  • Mutation testing applies temporary code changes to verify if a test suite can successfully detect injected bugs.
  • The Fool provides critical stress tests for proposals and business decisions using multiple failure mode simulations.
  • Test suites receive a numerical mutation score based on how many injected mutations they catch.

Mutation testing requires a clean Git state because it automatically modifies components, runs tests, and then reverts the changes. This process highlights gaps in testing coverage that standard metrics might miss. For high-level planning, The Fool uses structured references to challenge the long-term sustainability of an idea, allowing users to iterate on their proposals through adversarial reasoning.
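The mutate, test, revert loop can be sketched in a few lines of Python. This is an illustration of the general technique under stated assumptions, not the skill's implementation: the function under test (`is_adult`), the three mutation operators, and the two test suites are all hypothetical, and instead of reverting with Git the sketch re-parses the original source for every mutant so nothing is ever modified in place.

```python
import ast

# Hypothetical code under test, kept as a string so it can be re-parsed per mutant.
SOURCE = "def is_adult(age):\n    return age >= 18\n"


class SwapCompare(ast.NodeTransformer):
    """Mutation operator: replace every `>=` with another comparison."""

    def __init__(self, new_op):
        self.new_op = new_op

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [self.new_op() if isinstance(op, ast.GtE) else op for op in node.ops]
        return node


class BumpConstant(ast.NodeTransformer):
    """Mutation operator: add 1 to every integer literal."""

    def visit_Constant(self, node):
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            return ast.copy_location(ast.Constant(value=node.value + 1), node)
        return node


MUTATORS = [SwapCompare(ast.Gt), SwapCompare(ast.Lt), BumpConstant()]


def build(mutator=None):
    """Compile the source, optionally with one mutation applied, and return the function."""
    tree = ast.parse(SOURCE)  # fresh parse each time, so the original stays untouched
    if mutator is not None:
        tree = ast.fix_missing_locations(mutator.visit(tree))
    namespace = {}
    exec(compile(tree, "<mutant>", "exec"), namespace)
    return namespace["is_adult"]


def passes(func, cases):
    """True if the function returns the expected value for every (input, expected) pair."""
    return all(func(arg) == expected for arg, expected in cases)


def mutation_score(cases):
    """Fraction of mutants that make at least one test case fail."""
    assert passes(build(), cases), "suite must pass on the unmutated code"
    killed = sum(not passes(build(m), cases) for m in MUTATORS)
    return killed / len(MUTATORS)


WEAK = [(30, True), (5, False)]              # no boundary cases
STRONG = WEAK + [(18, True), (17, False)]    # adds the boundary at 18

if __name__ == "__main__":
    print(f"weak suite:   {mutation_score(WEAK):.2f}")
    print(f"strong suite: {mutation_score(STRONG):.2f}")
```

The weak suite passes on the original code yet kills only one of the three mutants, because none of its inputs sit on the boundary at 18. The strong suite kills all three, which is the kind of gap a mutation score is meant to expose.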

Market Research and Enhanced UI Design

  • Reddit Fetch uses TMUX and Gemini CLI to access Reddit content that normally blocks automated bots.
  • Color Expert uses 100+ markdown files of grounded data to improve agent-generated UI layouts.
  • Standardized color palettes and white space management create more engaging interfaces than default agent themes.

Reddit is a vital source for user reviews, but its bot-blocking measures often hinder AI research. Reddit Fetch addresses this by first trying the Gemini CLI through TMUX, a terminal multiplexer, and falling back to curl against the JSON API if that fails, so it can gather authentic user sentiment. To fix the common problem of AI agents using repetitive purple and white themes, Color Expert provides a library of color science references drawn from Wikipedia and other sources, resulting in more interactive and visually balanced landing pages.
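The video says only that Color Expert covers WCAG; it does not show the skill's contents. As one concrete example of the color science involved, the sketch below computes the WCAG 2.x contrast ratio between two colors and checks it against the AA thresholds (4.5:1 for normal text, 3:1 for large text). The function names are hypothetical and the code is not taken from the skill.

```python
def _linear(channel: int) -> float:
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#rrggbb'."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """WCAG AA check: 4.5:1 for normal text, 3:1 for large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


if __name__ == "__main__":
    for fg in ("#000000", "#767676", "#777777"):
        ratio = contrast_ratio(fg, "#ffffff")
        print(f"{fg} on white: {ratio:.2f}:1, AA normal text: {passes_aa(fg, '#ffffff')}")
```

A check like this gives an agent a pass or fail signal for text legibility, which is narrower than the palette and layout guidance the skill is described as providing.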
