Platform Features
Everything you need to manage AI-assisted engineering
Observe every session. Govern agent behavior across your team. Continuously improve with data-driven experimentation. Three pillars, one platform.
A Marketplace of Agents, Skills, and Playbooks
Stop building everything from scratch. crewkit ships with a growing library of production-ready resources -- agents tuned for specific frameworks, skills that codify repeatable workflows, and playbooks that encode your team's engineering standards.
Every resource uses 3-tier inheritance: Platform resources provide a curated baseline, organizations layer on team-specific guidelines, and projects add repository context. Change a resource at one tier and it propagates to every downstream consumer automatically.
Resources are versioned with immutable snapshots. Roll back to any previous version, fork a platform resource to customize it, or publish your own to the organization catalog. Network effects mean the best configurations rise to the top.
- Curated resource library -- Agents for Rails, React, DevOps, security, database migrations, and more, all ready to install
- 3-tier inheritance -- Platform, Organization, and Project tiers compose and override cleanly. One change propagates everywhere.
- Immutable versioning -- Every save creates a version snapshot. Compare diffs, roll back instantly, or fork to customize.
- Publish and share -- Share proven configurations across your org. Publish to the marketplace. Network effects compound.
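To make the 3-tier inheritance concrete, here is a minimal sketch of how the tiers could compose. It assumes each tier is a flat dictionary of settings and that the most specific tier wins on conflicts. The function name and the setting keys are hypothetical, not crewkit's actual schema.

```python
from collections import ChainMap


def resolve_resource(platform: dict, organization: dict, project: dict) -> dict:
    """Compose three tiers into one effective configuration.

    ChainMap searches its maps left to right, so the most specific
    tier (project) is listed first and overrides the tiers after it.
    """
    return dict(ChainMap(project, organization, platform))


# Hypothetical settings, for illustration only.
platform = {"test_framework": "rspec", "style": "platform-default"}
organization = {"style": "acme-guidelines"}
project = {"repo_context": "billing service"}

effective = resolve_resource(platform, organization, project)
```

In this sketch a change to the platform dictionary shows up in every resolved configuration that does not override the same key, which is the propagation behavior described above.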
Pre-Built Agents for Every Part of Your Stack
Agents are the core of crewkit. Each agent is a specialized AI configuration tuned for a specific domain -- Rails backend work, React frontend architecture, security reviews, database optimization, API design, and dozens more.
When a developer starts a session, crewkit syncs the right agents for the project. The agent receives the full context chain: platform-level best practices, organization coding standards, and project-specific conventions. No manual configuration. No copy-pasting prompts.
Agents adapt to each developer's role. Junior engineers get coaching mode -- the agent guides them step by step, explains concepts, and asks clarifying questions instead of writing code directly. Senior engineers get autonomous mode with full agency. Collaborative mode sits between, pairing on solutions together.
- Domain-specific agents -- rails-expert, frontend-expert, security-expert, api-designer, devops-engineer, and more
- Role-based behavior -- Coaching, collaborative, and autonomous modes adapt agent output per developer seniority
- Task delegation -- Agents delegate subtasks to specialized agents. Every SubagentSession is tracked with its resource version.
- Context injection -- Project context, conventions, artifacts, and team standards are loaded automatically into every session
Reusable Skills That Codify How Work Gets Done
Skills are reusable capability modules that extend what AI agents can do. Instead of ad-hoc prompting, skills encode repeatable workflows -- committing code, reviewing pull requests, planning features, running QA checks, writing documentation, and more.
When a developer invokes a skill, crewkit tracks the invocation with the exact resource version used. This creates an auditable trail of which skill version produced which output, enabling teams to measure skill effectiveness and iterate on the workflow definition.
Skills compose with agents. A developer working with the rails-expert agent can invoke the /commit skill, which follows team conventions for message format, runs pre-commit checks, and stages the right files. Skills turn tribal knowledge into automated workflows anyone on the team can use.
- Pre-built skill library -- /commit, /review-pr, /plan-feature, /qa, /refactor, /document, and dozens more
- Invocation tracking -- Every skill use is recorded with resource_id and version_hash for full traceability
- Composable with agents -- Skills work alongside any agent. The agent provides domain context, the skill provides the workflow.
- Custom skills -- Build org-specific skills for your deployment process, migration patterns, or review standards
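As an illustration of invocation tracking, the sketch below records each skill use with a resource_id and a version_hash. Only those two field names come from the description above. The hashing scheme (a truncated SHA-256 of the skill definition), the session_id field, and the function names are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillInvocation:
    resource_id: str   # which skill was invoked
    version_hash: str  # which version of its definition was used
    session_id: str    # hypothetical: the session that invoked it


def version_hash(definition: str) -> str:
    """Derive a short, stable identifier from the skill definition text."""
    return hashlib.sha256(definition.encode("utf-8")).hexdigest()[:12]


def record_invocation(resource_id: str, definition: str,
                      session_id: str, log: list) -> SkillInvocation:
    """Append an immutable invocation record to the audit log."""
    invocation = SkillInvocation(resource_id, version_hash(definition), session_id)
    log.append(invocation)
    return invocation


audit_log = []
first = record_invocation("commit", "v1: stage files, run checks", "s-1", audit_log)
second = record_invocation("commit", "v2: stage files, run checks, sign", "s-2", audit_log)
```

Because the hash depends only on the definition text, two sessions that used the same skill version produce the same version_hash, so outputs can be grouped by the version that produced them.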
Playbooks That Enforce Your Engineering Standards
Playbooks codify your team's engineering standards into enforceable conventions that agents follow automatically. Define your testing strategy, naming conventions, architecture patterns, and code style once -- and they propagate to every AI session across the organization.
crewkit's AI-powered convention extraction can analyze your existing codebase and generate playbook conventions from the patterns already present. Instead of writing standards from scratch, let crewkit discover what your team already does well and codify it. Stack-based matching ensures Rails projects get Ruby conventions and React projects get TypeScript conventions.
When an agent deviates from a convention, crewkit tracks the override. Over time, this data reveals which conventions are working and which need revision -- turning static standards into a living, data-driven engineering practice.
- AI-powered extraction -- Analyze your codebase to discover and codify existing patterns into playbook conventions
- Stack-based matching -- Conventions auto-match to the right projects by stack: Rails, React, Python, Go, and more
- Convention override tracking -- Monitor when and why agents deviate. Data shows which standards need updating.
- Playbook marketplace -- Browse community playbooks or publish your own. Subscribe to playbooks for automatic updates.
Full Visibility Into Every AI Session
crewkit captures comprehensive telemetry from every AI coding session -- tokens consumed, costs incurred, models used, tools invoked, session duration, turn counts, and outcome quality. All of this data flows into dashboards built for engineering leaders who need answers, not just data.
The analytics suite includes four specialized views: a summary dashboard with KPI cards, timeseries charts showing trends over any time range, per-agent breakdowns revealing which agents perform best, and cost analysis by model showing exactly where your budget goes. Filter by project, developer, agent, or time range.
Sessions are automatically analyzed by AI to score quality, generate coaching tips, identify patterns, and summarize outcomes. This transforms raw telemetry into actionable insights that help teams improve their AI-assisted workflows week over week.
- Session telemetry -- Turns, tokens, costs, tool usage, duration, and model breakdown per conversation
- Four analytics views -- Summary KPIs, timeseries trends, per-agent performance, and cost breakdown by model
- AI-powered analysis -- Automatic session summaries, quality scores, coaching tips, and pattern detection
- Filter and drill down -- Slice data by project, developer, agent, time range, or any combination
A/B Test Agent Configurations Like You Test Code
Tweaking a prompt should not be guesswork. crewkit's experimentation framework lets you run controlled A/B tests on any resource version -- agents, skills, playbooks, or rules. Split traffic between control and variant, then let real usage data determine the winner.
Every experiment tracks session-level metrics: accuracy scores, cost per session, token efficiency, turn counts, and quality ratings. crewkit computes statistical significance using p-values and confidence intervals, so you deploy changes only when the data confirms improvement -- not when it feels right.
Once an experiment has collected enough data, crewkit generates a recommendation: upgrade to the variant if it wins, roll back to control if it loses, or keep collecting data if the results are inconclusive. The winning version can be deployed to the full team with a single action, completing the improvement loop.
- Controlled experiments -- Split traffic between resource versions with automatic, unbiased assignment
- Real-time metrics -- Track accuracy, cost, token efficiency, quality scores, and turn counts live
- Statistical significance -- p-values and confidence intervals ensure you deploy only proven improvements
- Automated recommendations -- crewkit tells you what to do next: upgrade, roll back, or collect more data
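For readers who want to see what a significance check can look like, here is a minimal sketch using a two-sided two-proportion z-test on a per-session success count. It is one common way to compare a control and a variant, not necessarily the test crewkit runs. The success metric, the 0.05 threshold, and the function names are assumptions.

```python
import math


def two_proportion_z_test(success_a: int, n_a: int,
                          success_b: int, n_b: int) -> tuple:
    """Return (z, two-sided p-value) comparing variant B against control A."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # Both groups are all successes or all failures: no evidence either way.
        return 0.0, 1.0
    z = (p_b - p_a) / std_err
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def recommend(success_a: int, n_a: int, success_b: int, n_b: int,
              alpha: float = 0.05) -> str:
    """Map the test result onto the three recommendations described above."""
    z, p_value = two_proportion_z_test(success_a, n_a, success_b, n_b)
    if p_value >= alpha:
        return "collect_more_data"
    return "upgrade" if z > 0 else "roll_back"
```

With 100 sessions per arm, a variant that succeeds in 80 sessions against a control that succeeds in 50 is recommended for upgrade, while two arms with identical success counts yield a request for more data.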
Complete Session Hierarchy With Sharing and Import
Every AI coding session in crewkit follows a three-level hierarchy: Runs represent the CLI process, Conversations capture the primary interaction developers see, and Tasks track work delegated to specialized agents. This structure gives teams both a high-level overview and the ability to drill into any detail.
Sessions can be shared with teammates or stakeholders using access-controlled links. Set visibility to team, organization, or public. Add annotations to highlight specific moments. Managers can review sessions to understand how the team uses AI, while developers can share interesting solutions or ask for feedback.
Teams migrating to crewkit can import their historical sessions from JSONL exports. The import pipeline deduplicates by session ID, parses model metrics, and indexes everything for search. Thread grouping connects related conversations, so multi-session workflows stay coherent.
- Three-level hierarchy -- Run, Conversation, and Task levels give both overview and drill-down capability
- Session sharing -- Share sessions with access controls, visibility settings, and inline annotations
- Historical import -- Import past sessions from JSONL. Deduplication, metric parsing, and search indexing included.
- Thread grouping -- Related conversations are grouped into threads for multi-session workflow tracking
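The deduplication step of a JSONL import can be sketched in a few lines. This example assumes each line is one JSON object carrying a session_id field. That field name and the function name are hypothetical, and the real pipeline also parses metrics and indexes for search, which is omitted here.

```python
import json


def import_sessions(lines, seen_ids=None):
    """Parse JSONL lines and keep the first record for each session ID.

    `seen_ids` may hold IDs from earlier imports so that re-running
    an import does not create duplicates.
    """
    seen = set() if seen_ids is None else seen_ids
    imported = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        record = json.loads(line)
        session_id = record["session_id"]
        if session_id in seen:
            continue  # duplicate: already imported
        seen.add(session_id)
        imported.append(record)
    return imported


export = [
    '{"session_id": "a1", "model": "m-large", "turns": 12}',
    '{"session_id": "b2", "model": "m-small", "turns": 4}',
    '',
    '{"session_id": "a1", "model": "m-large", "turns": 12}',
]
sessions = import_sessions(export)
```

Passing the same set of seen IDs into a later call makes the import idempotent: sessions that were already loaded are skipped.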
Governance That Scales With Your Team
Not every developer should interact with AI the same way. crewkit's role-based agent configuration lets you define how agents behave for different team members. Junior developers get coaching mode -- agents guide them, explain concepts, and ask clarifying questions. Senior developers get autonomous mode with full implementation capability.
Organizations define custom roles mapped to agent behavior modifiers. A "junior" role might activate coaching mode on all agents, while a "tech-lead" role enables autonomous mode with architecture-level prompts. Roles are assigned per member and enforced automatically in every session.
Every modification is tracked through audit trails. Paper Trail records who changed what, when, and why. Convention enforcement data shows how well the team adheres to standards. Security events are logged for compliance. This is the governance layer that regulated industries need for AI-assisted development.
- Role-based behavior -- Coaching, collaborative, and autonomous modes tailored to developer seniority
- Custom roles -- Define organization-specific roles with granular agent behavior modifiers
- Audit trails -- Paper Trail tracks every change. Security events logged for compliance requirements.
- Convention enforcement -- Measure adherence to standards across the organization with override tracking
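A role-to-mode mapping like the one described above might look like the following sketch. The three modes and the "junior" and "tech-lead" roles come from the text. The lookup table, the function name, and the choice of collaborative mode as a fallback for unmapped roles are assumptions.

```python
from enum import Enum


class Mode(Enum):
    COACHING = "coaching"
    COLLABORATIVE = "collaborative"
    AUTONOMOUS = "autonomous"


# Hypothetical organization-defined mapping of roles to agent behavior.
ROLE_MODES = {
    "junior": Mode.COACHING,
    "tech-lead": Mode.AUTONOMOUS,
}


def mode_for(role: str, default: Mode = Mode.COLLABORATIVE) -> Mode:
    """Resolve the agent mode for a member's role.

    Roles without an explicit mapping fall back to `default`
    (an assumption made for this sketch).
    """
    return ROLE_MODES.get(role, default)
```

Keeping the mapping in one table means a role change for a member takes effect in the next session without touching any agent definition.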
Rich Project Context and Artifact Management
AI agents produce better results when they understand the full picture. crewkit's artifact system lets teams upload PRDs, technical specifications, API contracts, design documents, and transcripts. These artifacts are indexed for semantic search and automatically injected into relevant sessions.
When a developer starts a session on a project with uploaded artifacts, the agent receives the relevant context without manual prompting. A developer working on the auth module gets the auth PRD and API contract. A developer refactoring the database layer gets the data model specification. Context is matched by project and topic.
The dashboard provides a management interface for uploading, organizing, and searching artifacts. Teams can manage their knowledge base alongside their agent configurations, ensuring that every session starts with the right information and the agent never asks questions that are already answered in existing documentation.
- Upload anything -- PRDs, transcripts, contracts, design docs, and API specifications are all indexed for search
- Semantic search -- Find relevant artifacts across the knowledge base using natural language queries
- Automatic context injection -- Relevant artifacts are loaded into sessions based on project and topic matching
- Dashboard management -- Upload, organize, tag, and search artifacts from the web interface or CLI
A CLI Built for How Developers Actually Work
crewkit lives where developers work -- the terminal. The CLI is built in Rust for speed and reliability, with a TUI (terminal user interface) that displays session metrics, summaries, and git context in a sidebar alongside the AI conversation. Zero-config setup means one install, then running crewkit code gets you started.
The leader key system (F12 + second key) provides a namespace of keyboard shortcuts that never conflict with the AI assistant or terminal. F12+p opens the command palette for fuzzy-searching all available commands. F12+i shows session info. F12+s opens the share dialog. F12+d opens the dashboard for the current conversation.
Progressive onboarding means crewkit code handles everything inline -- authentication, project initialization, and resource sync happen automatically on first run. No separate setup commands, no configuration files to edit, no dead ends. The CLI detects the git remote, resolves the project, and starts the session with full context.
- Zero-config setup -- Install, run crewkit code, and you are working. Auto-detects org and project from git.
- TUI with sidebar -- Session metrics, AI summary, git context, and cost tracking visible alongside the conversation
- Leader key system -- F12+key shortcuts for command palette, session info, sharing, and dashboard -- no conflicts
- Progressive onboarding -- Auth, init, and sync happen inline on first run. Never dead-ends. Always recoverable.
Fits into your existing workflow
crewkit integrates with the tools your team already uses. No workflow changes required -- just better visibility and control.
Git Integration
Auto-detects repositories and branches. Sessions track git context including branch, SHA, and component.
CI/CD Ready
Resource sync works in CI pipelines. Validate agent configurations as part of your deployment process.
API-First Design
Full REST API for custom integrations. Build dashboards, automations, or connect to your existing tools.
Multi-Repo Support
Workspace projects span multiple repositories. Monorepos and multi-repo architectures both work seamlessly.
Built for the full AI development lifecycle
From first install to production optimization, crewkit covers every stage of AI-assisted engineering.
Real-Time Streaming
Session events stream live. See what your team is working on right now, not after the fact.
Multi-Tenant Isolation
Row-level security ensures organizations never see each other's data. Pundit policies enforce access.
Enterprise Auth
JWT tokens, passkeys, magic links, and device flow. SSO/SAML for team plans. Security events audited.
Keyboard-First UX
Leader keys, command palette, and shortcuts. Designed for developers who live in the terminal.
Command System
Reusable commands that any developer can invoke. Shared across the organization with versioning.
Dashboard + CLI
Power users stay in the terminal. Managers use the web dashboard. Same data, two interfaces.
Open Source CLI
The CLI is open source. Inspect the code, contribute, or run it without the hosted platform.
Continuous Improvement
Configure, sync, measure, iterate. Your agents get measurably better every sprint.
Your agents should get better every week
Set up in under 60 seconds. No credit card required. Start with the free tier and scale as your team grows.
Free tier includes 1 seat, 2 projects, and 500 sessions/month.