Plans and Quotas
IntelliRag offers three plans designed to scale from individual developers to large engineering organizations. All plans include the full platform - MCP tools, CLI, dashboard, and indexer. Plans differ in capacity limits and support level.
| Feature | Free | Pro | Enterprise |
|---|---|---|---|
| Workspaces | 1 | 10 | Unlimited |
| Repositories | 3 | 50 | Unlimited |
| Team members | 1 | 25 | Unlimited |
| LLM credits | 1,000/mo | 50,000/mo | Custom |
| Search queries | 5,000/mo | 100,000/mo | Custom |
| Index minutes | 60/mo | 1,000/mo | Custom |
| Vector storage | 100 MB | 10 GB | Custom |
| API calls | 10,000/mo | 500,000/mo | Custom |
| Support | Community | Dedicated |
Upgrade or downgrade at any time from the dashboard under Settings > Billing. Plan changes take effect immediately. When downgrading, existing data is retained but new writes are blocked if you exceed the lower plan’s limits.
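Before downgrading, it can help to check which limits your current usage would exceed on the lower plan. The following is a minimal sketch, not part of IntelliRag itself: the `PLAN_LIMITS` dictionary, its key names, and the `exceeded_after_downgrade` helper are hypothetical, and the figures are copied from the plan table above (with 10 GB assumed to mean 10 × 1024 MB).

```python
# Capacity limits for the Free and Pro plans, copied from the plan table.
# Enterprise is omitted because its limits are unlimited or custom.
# Key names are illustrative; they are not IntelliRag identifiers.
PLAN_LIMITS = {
    "free": {
        "workspaces": 1,
        "repositories": 3,
        "team_members": 1,
        "vector_storage_mb": 100,
    },
    "pro": {
        "workspaces": 10,
        "repositories": 50,
        "team_members": 25,
        "vector_storage_mb": 10 * 1024,  # assumption: 1 GB = 1024 MB
    },
}


def exceeded_after_downgrade(current: dict[str, float], target_plan: str) -> list[str]:
    """Return the names of limits that current usage would exceed on target_plan."""
    limits = PLAN_LIMITS[target_plan]
    # Usage equal to the limit is allowed; only strictly greater counts as exceeded.
    return sorted(name for name, limit in limits.items() if current.get(name, 0) > limit)


usage = {"workspaces": 1, "repositories": 12, "team_members": 4, "vector_storage_mb": 80}
print(exceeded_after_downgrade(usage, "free"))
```

Any name in the returned list is a limit where new writes would be blocked after the downgrade, while the existing data stays in place.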
Usage dimensions
IntelliRag meters usage across five dimensions. Each dimension is tracked independently, and any single dimension reaching its limit triggers the corresponding quota state.
LLM credits
Consumed by enrichment tasks: module summaries, debt triage, dead code review, schema annotations, contract inference, and event descriptions. Each enrichment job consumes credits based on token usage. Disabling enrichment for non-critical repositories is the most effective way to reduce LLM credit consumption.
Search queries
Each semantic search, symbol lookup, or pattern search counts as one query. This includes queries from the MCP tools, CLI, and API. Health check endpoints are excluded.
Index minutes
Wall-clock time for indexer runs, measured from index start to completion. Incremental indexes consume significantly fewer minutes than full reindexes because they only process changed files.
Vector storage
Total size of embeddings stored across all 7 Qdrant collections (code chunks, module summaries, pattern matches, git archaeology, debt vectors, API contracts, and event catalog). Removing unused repositories frees vector storage immediately.
API calls
All API requests (reads and writes) count toward this dimension. Health check endpoints (/healthz, /readyz) are excluded.
Quota states
Quota enforcement uses a four-state model. Each usage dimension is evaluated independently - one dimension can be in a warning state while others remain healthy.
| State | Threshold | Behavior |
|---|---|---|
| Healthy | Below 80% | Normal operation. |
| Warning | 80% | Dashboard warning banner. Email notification sent. |
| Soft limit | 100% | Write operations blocked. Read operations continue normally. |
| Hard limit | 120% | All operations blocked except reads of existing data. |
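The thresholds in the table can be expressed as a small classifier. This is an illustrative sketch rather than IntelliRag's actual enforcement code: the function names and dimension keys are hypothetical, and the returned strings are the state labels from the table, not necessarily the values sent over the wire.

```python
def quota_state(used: float, limit: float) -> str:
    """Map usage against a limit to one of the four quota states in the table."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Compare scaled values instead of dividing, so integer inputs stay exact.
    if used * 100 >= limit * 120:
        return "Hard limit"
    if used * 100 >= limit * 100:
        return "Soft limit"
    if used * 100 >= limit * 80:
        return "Warning"
    return "Healthy"


def evaluate(usage: dict[str, float], limits: dict[str, float]) -> dict[str, str]:
    """Evaluate every dimension independently, as the quota model describes."""
    return {dim: quota_state(usage.get(dim, 0), limit) for dim, limit in limits.items()}


# Free plan monthly limits from the plan table; key names are illustrative.
free_limits = {"llm_credits": 1000, "search_queries": 5000, "index_minutes": 60, "api_calls": 10000}
usage = {"llm_credits": 850, "search_queries": 1200, "index_minutes": 60, "api_calls": 12000}
print(evaluate(usage, free_limits))
```

Because each dimension is classified on its own, the result can mix states: in this example one dimension is in Warning while another is still Healthy.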
When a quota is exceeded, API responses include the X-Quota-State header indicating the current state. Blocked requests return HTTP 429 with a JSON body describing which dimension triggered the limit.
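A client can inspect the status code, the X-Quota-State header, and the JSON body to report why a request was blocked. The sketch below is hedged: the exact header values and the body schema are not specified here, so the `dimension` field name and the `soft_limit` value in the example are assumptions made for illustration, and `describe_quota_response` is a hypothetical helper.

```python
import json
from typing import Mapping, Optional


def describe_quota_response(status: int, headers: Mapping[str, str], body: str) -> Optional[str]:
    """Return a readable message for a quota-blocked response, or None otherwise."""
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    state = lowered.get("x-quota-state", "unknown")
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    # "dimension" is an assumed field name; adjust it to the real response body.
    dimension = payload.get("dimension", "unknown") if isinstance(payload, dict) else "unknown"
    return f"Blocked by quota: dimension={dimension}, state={state}"


print(describe_quota_response(429, {"X-Quota-State": "soft_limit"}, '{"dimension": "search_queries"}'))
```

The helper takes plain values rather than a specific HTTP library's response object, so it can be called from whichever client you already use.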
Quotas reset on the first of each month at 00:00 UTC. Enterprise plans have custom limits negotiated per contract and may include rollover provisions.
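To schedule work around the monthly reset, you can compute the next reset instant from the current time. This is a small sketch using only the Python standard library; `next_quota_reset` is a hypothetical helper name, and it applies only to the standard monthly reset, not to custom Enterprise terms.

```python
from datetime import datetime, timezone


def next_quota_reset(now: datetime) -> datetime:
    """Return the next first-of-month 00:00 UTC strictly after `now`."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now = now.astimezone(timezone.utc)
    # Roll over to January of the following year when the current month is December.
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


now = datetime.now(timezone.utc)
print("Next reset:", next_quota_reset(now).isoformat())
print("Time remaining:", next_quota_reset(now) - now)
```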