I Tried Running a Company Made of AI Agents



Transcript

00:00:00 I gave three AI agents the same repo, and together they formed a company. One tried to build the

00:00:06 feature, one rewrote the architecture, and one opened and dealt with all the tickets. With no

00:00:12 structure, every multi-agent setup slowly turns into confusion and racks up the bill.

00:00:17 This is Paperclip, and it's trying to fix that. One command gives you a local control plane for

00:00:22 AI agents with organizational charts, tickets, budgets, audit logs, and even heartbeats.

00:00:27 It's just crossed over 64,000 stars on GitHub.

00:00:30 Let's set up our own company with a few AI agents in a couple of minutes.

00:00:33 Now here's the thing with agents. A single agent feels nice. You give it a task, it writes some

00:00:44 code. Great job. Then you add a second agent, maybe even a third agent. And what happens is

00:00:51 suddenly that just turns into management work. Who owns the task? That's the question. Who's

00:00:57 remembering the goal in all of this, and who stops the agent when it starts doing the wrong thing?

00:01:03 That's the problem Paperclip is trying to solve. Raw agents working alone aren't great. Useful,

00:01:08 but hard to coordinate. Paperclip turns them into a team, or I guess in this case it's called a

00:01:13 company. We define a company goal. We create an organizational chart. Maybe there's a CEO, a CTO,

00:01:20 two engineers, and a research agent. Then Paperclip coordinates the work through tickets, heartbeats,

00:01:27 budgets, approvals, and traceability. We can see the task, who assigned it, how much it actually

00:01:33 spent on that task, and whether it still connects to the end goal. Less vibes-based orchestration.

00:01:39 Let's actually see this live. If you enjoy coding tools to speed up your workflow, be sure to

00:01:43 subscribe. We have videos coming out all the time. All right, now watch this. In a clean terminal,

00:01:49 I'm just going to run npx paperclipai onboard. That starts up the local setup. Now a few moments

00:01:56 later, Paperclip is running with the dashboard. I have local services, Postgres comes with it,

00:02:03 and auth. This is the whole UI here now where I can actually create a new company. I'm going to

00:02:09 create a new company and call it Dev Tools Company, or really whatever you're trying to build. For this,

00:02:14 I'm going to set this goal. The goal is simple. I want to build and ship a URL shortener MVP this

00:02:20 week. Now I can add a CTO agent. Then I can add two engineers through adapters. One of these engineer

00:02:28 agents owns the backend. The other owns the frontend and test coverage. Now, before I hit

00:02:34 start, I'm going to set the budget. And this part's what really matters because the goal is not to let

00:02:39 the agents cook my API until the bill explodes. No, the goal is controlled autonomy. I also need to set

00:02:46 the path to my working directory where the code is going to be output. So I'm going to set that here.

00:02:50 Now I can hit those heartbeats and I can start it. And let's watch the board. The agents wake up

00:02:57 on heartbeat. The CTO breaks the goal into tickets. Our engineers here, they're now picking up work.

00:03:05 So you can see delegation, tickets, ancestry, status changes, the budget counter, all of this

00:03:10 tied together. And now the first implementation task is already moving toward a code commit.

00:03:15 This actually took quite a bit of time to run. I guess with all these agents working together,

00:03:19 that makes some sense, but still it's not the fastest, especially if you're trying to scale this

00:03:24 even more. This is not one agent sitting in a chat box anymore. This is now a small company that

00:03:30 runs because we created these agents: CEO, CTO, all these engineers. Now this is where people get

00:03:37 confused. At first glance, Paperclip sounds like another agent framework, another CrewAI, another

00:03:43 AutoGen, another LangGraph-style workflow. That's not really the point. Those tools are great when

00:03:49 you want a workflow, right? So for example, I want a researcher, then planner, then writer,

00:03:55 then reviewer. Yeah, sure. Of course that's useful. That's why we use them. But Paperclip is aiming

00:04:01 a level higher. It's not just the workers anymore. It's the company that's kind of surrounding these

00:04:07 workers in this organizational chart to really help things build out. Think of it like this.

00:04:13 A single agent is just an employee. A workflow is like your checklist. Paperclip is the manager,

00:04:20 the organizational chart, the ticket board, the budget system, the audit log. That is Paperclip

00:04:25 as the manager. So the question you're probably asking yourself now: can an agent write code? Well,

00:04:30 we already know it can. That's the purpose of this. It's generating that now. The harder questions are,

00:04:36 can it work on the right task? Can it stop when it actually should? Can it hand off work clearly?

00:04:43 Can I inspect what is even happening here? And the short answer to all of those is yeah, it can.

00:04:49 Paperclip gives you state, heartbeats, budget, hierarchy, logs. It even gives you portable

00:04:55 templates and a dashboard that feels more like Jira or Linear for agents than another chat window.

00:05:02 You stop prompting one agent and start controlling this mini organization. Many of us probably still

00:05:07 bounce between terminals and setups. One terminal for Claude Code, a tab for Cursor, an agent for

00:05:13 research, one script for GitHub issues, right? All of these different windows we're bouncing between,

00:05:18 but Paperclip gives all of that a shared operating model. Now the mental model for all of this

00:05:24 actually changes for us. So instead of saying, "Hey, please build this feature," what we're

00:05:30 actually saying now is something more along the lines of, "This company's goal is to ship this

00:05:35 product. Here are the rules in the company. Here's the organizational chart and here's the budget.

00:05:41 Here's what needs approval. Now run." Now being honest here, the structure is nice,

00:05:46 right? Tickets, ancestry, delegation, all of that, right? Multi-agent work is easier to reason about

00:05:52 by having this. Instead of saying the agent did something, bravo, you can actually see who assigned

00:05:58 that work, why it exists and where it fits into our code. Being able to set budgets is also huge.

00:06:05 A lot of agent tools treat costs like something you check after the fact. Paperclip makes cost

00:06:12 part of the whole control loop. We set the budget before we execute. It's self-hosted and open

00:06:17 source. Again, huge win there. So you can run it locally, inspect it, modify it and connect it to

00:06:22 the agents you're already using. But alongside all this good stuff, the same structure

00:06:27 that makes Paperclip powerful can also be really annoying. If your rules are bad, agents can create

00:06:32 tickets about nonsense. I wanted a simple URL shortener here, but now maybe my CTO agent has opened

00:06:39 this whole other plan that I didn't even want. So no thanks to that. Token burn is also real,

00:06:45 right? This is why we have budgets to control this, but it doesn't fix sloppy prompts or vague rule

00:06:52 definitions. And guys, if your skills.md files suck, your company behaves like a confused startup,

00:06:59 right? So skills.md, that's what needs the strength here, right? And finally, honestly,

00:07:03 if you're doing a simple script, this is complete overkill. I just wanted to test this out. I did not

00:07:08 need this for this project, and if you just want one agent to summarize a file or patch a bug,

00:07:13 you don't need this, right? This is for building out a lot more, having more of these agents working

00:07:18 together. It's definitely worth using, but it's not for everything. If you enjoy coding tools and

00:07:23 tips like this, be sure to subscribe. We'll see you in another video.

Key Takeaway

Paperclip solves the management overhead of multi-agent systems by providing a centralized dashboard for budgets, tickets, and organizational hierarchy to ensure AI agents stay aligned with specific product goals.

Highlights

Paperclip provides a local control plane for AI agents and has exceeded 64,000 stars on GitHub.
The software transforms individual agents into a structured company using organizational charts, tickets, budgets, and heartbeats.
A single command, 'npx paperclipai onboard', initializes a local setup including Postgres, authentication, and a management dashboard.
Budgets are set before execution to prevent agents from exhausting API tokens through uncontrolled autonomous loops.
Paperclip functions as an organizational manager rather than a workflow tool like CrewAI or LangGraph by providing state, hierarchy, and audit logs.

Timeline

The Problem with Unstructured Multi-Agent Systems

Uncoordinated agents often rewrite each other's work or create conflicting architecture.
Management work grows quickly as more agents are added to a project: someone has to own each task, remember the goal, and stop agents that go off track.
Without a structure, multi-agent setups frequently result in confusion and high API costs.

Running three agents on the same repository without a hierarchy leads to one building a feature while another rewrites the architecture and a third opens and works through tickets. With no clear owner for each task, it is hard to keep the overall goal in view or to stop an agent when it drifts from the task. Coordination remains the primary hurdle for turning raw agents into a functional team.

Core Features and Installation of Paperclip

Paperclip coordinates work through tickets, heartbeats, budgets, approvals, and traceability.
The system offers a local UI for monitoring task assignment and spending per task.
Standard local services like Postgres and authentication are included in the initial setup.

The tool moves away from 'vibes-based' orchestration toward a verifiable management system. Users can see exactly who assigned a task and whether the work still connects to the end goal. Installation is performed via a single terminal command that prepares the dashboard and local database environment.
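
The traceability idea can be pictured with a minimal Python sketch. It is an illustration only: the `Ticket` class, its fields, and the helper functions are hypothetical and do not reflect Paperclip's actual schema or API. It assumes each ticket records who assigned it and points to a parent, so any piece of work can be walked back to the company goal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ticket:
    """Hypothetical ticket record; not Paperclip's real data model."""
    id: str
    title: str
    assigned_by: str
    assignee: str
    parent: Optional["Ticket"] = None  # None means this ticket has no ancestor


def ancestry(ticket: Ticket) -> list[str]:
    """Follow parent links upward and return the titles, nearest first."""
    chain = []
    node: Optional[Ticket] = ticket
    while node is not None:
        chain.append(node.title)
        node = node.parent
    return chain


def connects_to_goal(ticket: Ticket, goal: Ticket) -> bool:
    """True when following parent links from the ticket reaches the goal."""
    node: Optional[Ticket] = ticket
    while node is not None:
        if node is goal:
            return True
        node = node.parent
    return False


# Example data, invented for illustration.
goal = Ticket("G-1", "Ship a URL shortener MVP this week", "board", "cto")
api = Ticket("T-1", "Build the redirect API", "cto", "backend-engineer", parent=goal)
tests = Ticket("T-2", "Add tests for redirects", "cto", "frontend-engineer", parent=api)
stray = Ticket("T-9", "Rewrite the architecture", "frontend-engineer", "frontend-engineer")
```

Here `ancestry(tests)` walks from the test ticket through the API ticket to the goal, while `connects_to_goal(stray, goal)` is false because the stray ticket has no parent link, which is the kind of work a reviewer would want to question.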

Building a Virtual Dev Tools Company

Companies are defined by a clear goal, such as shipping a URL shortener MVP in one week.
Hierarchy is established by assigning roles like CTO and specialized engineers for backend and frontend tasks.
Agents wake up on designated heartbeats to pick up tickets and begin implementation.

Setting a budget is the most critical step in the setup to ensure autonomy remains controlled. Agents operate by having a CTO agent break down high-level goals into specific tickets for engineer agents to claim. While the process is slower than a single agent in a chat box, it allows for delegation, ancestry tracking, and status changes in a simulated corporate environment.
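
The heartbeat-driven flow described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Paperclip's implementation: `Board`, `cto_plan`, and `heartbeat` are hypothetical names, the CTO's breakdown is a fixed list rather than a model call, and each agent claims at most one ticket per heartbeat.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    title: str
    role: str              # which role is expected to pick this up
    status: str = "open"
    owner: str = ""


@dataclass
class Board:
    tickets: list = field(default_factory=list)

    def claim_next(self, agent_name: str, role: str):
        """Hand the first open ticket for this role to the agent, if any."""
        for ticket in self.tickets:
            if ticket.status == "open" and ticket.role == role:
                ticket.status = "in_progress"
                ticket.owner = agent_name
                return ticket
        return None


def cto_plan(goal: str) -> list:
    """Stand-in for the CTO agent: a hard-coded breakdown of the goal."""
    return [
        Ticket(f"{goal}: redirect API", role="backend"),
        Ticket(f"{goal}: landing page", role="frontend"),
        Ticket(f"{goal}: test coverage", role="frontend"),
    ]


def heartbeat(board: Board, agents: dict) -> list:
    """One tick: every agent wakes up and claims at most one ticket."""
    claimed = []
    for name, role in agents.items():
        ticket = board.claim_next(name, role)
        if ticket is not None:
            claimed.append((name, ticket.title))
    return claimed


board = Board(cto_plan("URL shortener MVP"))
agents = {"eng-backend": "backend", "eng-frontend": "frontend"}
first = heartbeat(board, agents)
second = heartbeat(board, agents)
```

On the first heartbeat both engineers claim a ticket; on the second only the frontend engineer finds open work. A real system would also track completion, approvals, and spend per ticket, which this sketch leaves out.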

Paperclip vs. Traditional Agent Frameworks

Workflow tools focus on linear tasks like research, planning, and writing.
Paperclip operates at a higher level as a manager, organizational chart, and audit log.
The system provides a shared operating model that replaces multiple disconnected terminal windows.

Frameworks like AutoGen or CrewAI are suited for checklists and specific workflows. Paperclip is designed for the infrastructure surrounding those workers, acting more like Jira or Linear for AI. It transitions the user from prompting a single agent to controlling an entire mini-organization with a persistent state.

Structural Advantages and Practical Limitations

Budgeting is integrated into the control loop rather than checked after the fact.
Poorly defined rules or 'skills.md' files lead to agents creating nonsense tickets.
Paperclip is overkill for simple tasks like summarizing a file or patching a single bug.

Self-hosting and open-source accessibility allow for local inspection and modification. However, the system's power is contingent on the quality of the prompts; vague definitions will cause a 'confused startup' behavior among agents. It is specifically intended for complex, multi-agent collaborations rather than quick, one-off scripts.
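
To make "budget inside the control loop" concrete, here is a minimal Python sketch. It is not how Paperclip enforces budgets; the `Budget` class, `run_steps` function, step names, and cost figures are all hypothetical. It assumes each step comes with a cost estimate in cents and that the check happens before the step runs, instead of after the bill arrives.

```python
class BudgetExceeded(Exception):
    """Raised when a step would push spending past the cap."""


class Budget:
    def __init__(self, limit_cents: int):
        self.limit_cents = limit_cents
        self.spent_cents = 0

    def reserve(self, estimated_cents: int) -> None:
        """Refuse the step up front if it would exceed the cap."""
        if self.spent_cents + estimated_cents > self.limit_cents:
            raise BudgetExceeded(
                f"need {estimated_cents}, only "
                f"{self.limit_cents - self.spent_cents} left"
            )
        self.spent_cents += estimated_cents


def run_steps(budget: Budget, steps: list) -> list:
    """Run steps in order and stop before the first one the budget cannot cover."""
    done = []
    for name, estimated_cents in steps:
        try:
            budget.reserve(estimated_cents)
        except BudgetExceeded:
            break  # the agent is halted before spending, not after
        done.append(name)  # a real agent would do the work here
    return done


budget = Budget(limit_cents=500)
steps = [
    ("plan backlog", 150),
    ("build redirect API", 300),
    ("refactor everything", 200),
]
completed = run_steps(budget, steps)
```

With a 500-cent cap, the first two steps run and the third is refused before any money is spent. As the summary notes, a cap like this limits the damage from vague rules but does not make the rules themselves any better.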
