Observe. Govern. Improve. Repeat.

The operating system for AI-assisted engineering

Your team is shipping with AI agents, but nobody knows what they're doing, whether they follow standards, or if they're getting better. crewkit gives you the marketplace, observability, and experimentation layer to turn ad-hoc AI usage into a managed, measurable engineering practice.

curl -fsSL https://crewkit.io/install.sh | sh

Start Building | Read the Docs

Observe

Every session, cost, and outcome across the team

Govern

Playbooks, roles, and enforceable conventions

Improve

A/B test agents and iterate with real data

terminal

$ crewkit code

crewkit v0.1.17

ready acme-corp/customer-portal

synced 3 agents, 5 skills, 2 playbooks (7 conventions)

context 12 rules, 2 artifacts loaded

launching main* (session #284)

From zero to optimized in four steps

Get your team up and running in minutes. crewkit handles discovery, configuration, and measurement so you can focus on shipping.

Install & Connect

One command to install. crewkit auto-detects your org, project, and git context.

Equip Your Team

Browse the marketplace. Pick agents, skills, and playbooks. They sync to every developer automatically.

Observe & Measure

Every session is tracked: tokens, costs, success rates, and quality scores across your entire team.

Iterate & Optimize

A/B test resource versions. Deploy what works. Your agents get better every week.

Resource Marketplace

Pre-built agents, skills, and playbooks

Browse a growing library of battle-tested resources. Pick what fits your stack, customize for your team, and deploy instantly. Every resource is versioned, measurable, and continuously improved by the community.

Agents

Specialized AI personas with deep domain expertise

rails-expert

Rails conventions, migrations, testing

frontend-expert

React, TypeScript, accessibility

api-designer

REST/GraphQL API design patterns

security-reviewer

OWASP, auth, vulnerability scanning

+ dozens more in the marketplace

Skills

Reusable capabilities that extend what agents can do

create-pr

Draft PRs with proper descriptions

review-code

Structured code review with checklist

run-tests

Execute and analyze test suites

plan-feature

Break features into tasks and subtasks

+ dozens more in the marketplace

Playbooks

Org-wide conventions and coding standards, enforced automatically

Rails 8 Standards

Rubocop, RSpec, service objects

React + TypeScript

Strict mode, hooks, testing library

API Security

Auth, rate limiting, input validation

CI/CD Pipeline

GitHub Actions, deploy checks, rollbacks

+ dozens more in the marketplace

Platform network effects: more teams, better resources

Every team that uses crewkit contributes performance data that makes resources better for everyone. Session analytics reveal which agent configurations, skill patterns, and conventions actually work -- then the community iterates. This data flywheel is what turns a tool into a platform.

Platform -- curated base resources

Organization -- team standards & conventions

Project -- repo-specific context

65+ resources available

3-tier inheritance model
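A minimal sketch of how a three-tier model like this could resolve settings, assuming the more specific tier wins (project over organization over platform). The override order, the `resolve` helper, and the setting names are illustrative assumptions, not crewkit's documented behavior.

```python
from collections import ChainMap

# Hypothetical settings at each tier; names and values are illustrative only.
platform = {"test_framework": "minitest", "review_checklist": "default"}
organization = {"test_framework": "rspec"}
project = {"review_checklist": "customer-portal"}


def resolve(platform: dict, organization: dict, project: dict) -> dict:
    """Merge tiers so project overrides organization, which overrides platform."""
    # ChainMap looks keys up left to right, so the most specific tier goes first.
    return dict(ChainMap(project, organization, platform))


effective = resolve(platform, organization, project)
print(effective)
```

Under these assumptions the organization's test framework replaces the platform default, while the project only overrides the one key it sets.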

Explore the Marketplace

Platform Features

Govern, experiment, and optimize -- all in one platform

Once your team has the right resources from the marketplace, crewkit gives you the governance, experimentation, and observability to turn them into a measurable engineering advantage.

Governance That Scales With Your Team

Enforce coding standards and best practices automatically. Junior developers get coaching mode and seniors get full autonomy -- all controlled through role-based agent configuration.

Role-based behavior -- Coaching, collaborative, and autonomous modes adapt agent output per developer
Playbook enforcement -- Conventions auto-inject into every session, no manual setup per project
Convention tracking -- Monitor adherence and overrides across your organization

.claude/agents/rails-expert.md

# Rails Expert

**Role detected: Junior Developer**

## Coaching Mode Active

CRITICAL: You are in COACHING MODE.

- DO NOT write code for them

- GUIDE them step-by-step

- ASK clarifying questions

- EXPLAIN concepts

## Organization Standards

- Use RSpec (not MiniTest)

- Follow Rubocop Shopify guide
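To make the role-based idea concrete, here is a hedged sketch of how an agent file like the one above might be rendered per developer role. The three mode names come from this page; the role keys, the "mid" mapping, the non-coaching rule text, and the `render_agent` function are all hypothetical.

```python
# Mode names come from the page; the role keys and the "mid" mapping are assumptions.
MODE_BY_ROLE = {"junior": "coaching", "mid": "collaborative", "senior": "autonomous"}

# Hypothetical per-mode instructions; only the coaching lines mirror the example file.
MODE_RULES = {
    "coaching": [
        "DO NOT write code for them",
        "GUIDE them step-by-step",
        "ASK clarifying questions",
        "EXPLAIN concepts",
    ],
    "collaborative": ["PROPOSE changes and confirm before editing"],
    "autonomous": ["IMPLEMENT changes directly"],
}


def render_agent(name: str, role: str, standards: list[str]) -> str:
    """Build the markdown body of an agent file for one developer role."""
    mode = MODE_BY_ROLE[role]
    lines = [f"# {name}", "", f"## {mode.capitalize()} Mode Active", ""]
    lines += [f"- {rule}" for rule in MODE_RULES[mode]]
    lines += ["", "## Organization Standards", ""]
    lines += [f"- {standard}" for standard in standards]
    return "\n".join(lines)


print(render_agent("Rails Expert", "junior", ["Use RSpec (not Minitest)"]))
```

Keeping the organization standards separate from the mode rules means the same conventions can be appended regardless of which mode a developer gets.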

A/B Test Agent Configurations Like You Test Code

Stop guessing which prompts work best. Run controlled experiments, measure real outcomes, and deploy winning configurations with statistical confidence.

Controlled experiments -- Split traffic between resource versions with automatic assignment
Real-time metrics -- Track accuracy, cost, token usage, and quality scores live
Statistical significance -- Deploy only when the measured difference is statistically significant

dashboard -- experiment results

Experiment: swift-amber-falcon

Resource: rails-expert (agent)

Status: Running | Created: 3 days ago

Metric      Control   Variant   Delta

Sessions    47        53        +6

Accuracy    87.2%     92.1%     +4.9 pts

Avg Cost    $0.034    $0.029    -14.7%

Avg Turns   8.3       6.1       -26.5%

Statistically significant (p=0.023)

Recommendation: Deploy variant
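The two mechanics described here, automatic assignment and a significance check, can be sketched as follows. The page does not say how crewkit assigns sessions or which statistical test it runs, so this uses a hash-based 50/50 split and a standard pooled two-proportion z-test purely as stand-ins. The function names and the counts are made up and are not the dashboard figures.

```python
import hashlib
import math


def assign_variant(experiment: str, session_id: str) -> str:
    """Deterministically place a session in 'control' or 'variant' (50/50 split)."""
    digest = hashlib.sha256(f"{experiment}:{session_id}".encode()).digest()
    return "variant" if digest[0] % 2 else "control"


def two_proportion_p_value(success_a: int, n_a: int, success_b: int, n_b: int) -> float:
    """Two-sided p-value for a difference in success rates (pooled z-test)."""
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (success_b / n_b - success_a / n_a) / se
    # For a standard normal, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up counts for illustration only.
p = two_proportion_p_value(success_a=60, n_a=100, success_b=80, n_b=100)
print(assign_variant("swift-amber-falcon", "session-284"), f"p={p:.4f}")
```

Hashing the experiment name together with the session ID keeps assignment stable across retries without storing any state. A z-test is only a reasonable approximation when each arm has enough sessions.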

Full Visibility Into Every AI Session

Track sessions, costs, and outcomes across your entire organization. Drill into any conversation to see exactly what happened, what it cost, and how it performed.

Session telemetry -- Turns, tokens, costs, tool usage, and duration per conversation
Team analytics -- Timeseries, per-agent, per-project, and cost-breakdown dashboards
AI-powered analysis -- Automatic session summaries, coaching tips, and quality scores

dashboard -- analytics overview

acme-corp | Last 7 days

Sessions

147 total | 89.1% success rate

Avg duration: 12m 34s | Avg turns: 8.2

Cost

$18.42 total | $0.125 avg/session

1.2M input tokens | 89K output tokens

By Agent

rails-expert      52 sessions   91% success

frontend-expert   38 sessions   87% success

api-designer      31 sessions   93% success
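As a rough illustration of where headline numbers like these could come from, the sketch below rolls per-session records up into totals and a per-agent breakdown. The `Session` fields and both helper functions are assumptions for illustration; the page does not describe crewkit's actual telemetry schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical record shape; field names are not taken from crewkit.
    agent: str
    succeeded: bool
    cost_usd: float
    turns: int


def summarize(sessions: list[Session]) -> dict:
    """Roll sessions up into overall count, success rate, cost, and turns."""
    total = len(sessions)
    if total == 0:
        return {"total": 0}
    total_cost = sum(s.cost_usd for s in sessions)
    return {
        "total": total,
        "success_rate": sum(s.succeeded for s in sessions) / total,
        "total_cost": round(total_cost, 4),
        "avg_cost": round(total_cost / total, 4),
        "avg_turns": sum(s.turns for s in sessions) / total,
    }


def success_by_agent(sessions: list[Session]) -> dict[str, tuple[int, float]]:
    """Map each agent to (session count, success rate)."""
    outcomes = defaultdict(list)
    for s in sessions:
        outcomes[s.agent].append(s.succeeded)
    return {agent: (len(o), sum(o) / len(o)) for agent, o in outcomes.items()}


# Made-up sample data.
sessions = [
    Session("rails-expert", True, 0.10, 8),
    Session("rails-expert", False, 0.20, 12),
    Session("api-designer", True, 0.12, 7),
    Session("api-designer", True, 0.18, 5),
]
summary = summarize(sessions)
print(summary)
print(success_by_agent(sessions))
```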

Built for the full AI development lifecycle

Rich Context Injection

Artifacts, rules, and project context are automatically loaded into every session -- so agents always know your codebase.

Playbook-Driven Consistency

Define conventions once at the org level. They propagate to every project and every developer -- no drift, no exceptions.

Continuous Improvement Loop

Configure resources, sync to your team, measure performance, and iterate. Your agents get better every sprint.

Your agents should get better every week

Teams using crewkit have a curated marketplace, real-time observability, and experiment-driven optimization -- the infrastructure layer every AI-adopting engineering team needs. Set up in under 60 seconds.

$ brew install crewkit-io/tap/crewkit

Start Building | Talk to Sales
