Plans and Quotas

IntelliRag offers three plans that scale from individual developers to large engineering organizations. All plans include the full platform: MCP tools, CLI, dashboard, and indexer. Plans differ in capacity limits and support level.

Plans

Feature	Free	Pro	Enterprise
Workspaces	1	10	Unlimited
Repositories	3	50	Unlimited
Team members	1	25	Unlimited
LLM credits	1,000/mo	50,000/mo	Custom
Search queries	5,000/mo	100,000/mo	Custom
Index minutes	60/mo	1,000/mo	Custom
Vector storage	100 MB	10 GB	Custom
API calls	10,000/mo	500,000/mo	Custom
Support	Community	Email	Dedicated

Upgrade or downgrade at any time from the dashboard under Settings > Billing. Plan changes take effect immediately. After a downgrade, existing data is retained, but new writes are blocked if your usage exceeds the lower plan’s limits.
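As a rough illustration of the downgrade rule, the Python sketch below compares current usage against a target plan's capacity limits taken from the table above. The dictionary keys and the function name are hypothetical and not part of IntelliRag's API, and 10 GB is assumed to mean 10 x 1,024 MB.

```python
# Capacity limits copied from the plan table. Key names are illustrative only.
FREE_LIMITS = {
    "workspaces": 1,
    "repositories": 3,
    "team_members": 1,
    "vector_storage_mb": 100,
}
PRO_LIMITS = {
    "workspaces": 10,
    "repositories": 50,
    "team_members": 25,
    "vector_storage_mb": 10 * 1024,  # assumption: 1 GB = 1,024 MB
}


def over_limit(current: dict, target_limits: dict) -> dict:
    """Return {dimension: (current_value, limit)} for every dimension
    where current usage is strictly above the target plan's limit."""
    exceeded = {}
    for dimension, limit in target_limits.items():
        value = current.get(dimension, 0)
        if value > limit:
            exceeded[dimension] = (value, limit)
    return exceeded


# Example: a Pro account considering a downgrade to Free.
usage = {
    "workspaces": 4,
    "repositories": 3,
    "team_members": 1,
    "vector_storage_mb": 250,
}
blocking = over_limit(usage, FREE_LIMITS)
```

An empty result suggests the downgrade would not block writes; a non-empty one lists what to trim first.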

Usage dimensions

IntelliRag meters usage across five dimensions. Each dimension is tracked independently, and a dimension that crosses a threshold enters the corresponding quota state regardless of the others.

LLM credits

Consumed by enrichment tasks: module summaries, debt triage, dead code review, schema annotations, contract inference, and event descriptions. Each enrichment job consumes credits based on token usage. Disabling enrichment for non-critical repositories is the most effective way to reduce LLM credit consumption.

Search queries

Each semantic search, symbol lookup, or pattern search counts as one query. This includes queries from the MCP tools, CLI, and API. Health check endpoints are excluded.

Index minutes

Wall-clock time for indexer runs, measured from index start to completion. Incremental indexes consume significantly fewer minutes than full reindexes because they only process changed files.

Vector storage

Total size of embeddings stored across all seven Qdrant collections: code chunks, module summaries, pattern matches, git archaeology, debt vectors, API contracts, and event catalog. Removing unused repositories frees vector storage immediately.

API calls

All API requests (reads and writes) count toward this dimension. Health check endpoints (/healthz, /readyz) are excluded.

Quota states

Quota enforcement uses a four-state model. Each usage dimension is evaluated independently, so one dimension can be in a warning state while others remain healthy.

State	Usage (% of limit)	Behavior
Healthy	Below 80%	Normal operation.
Warning	80% to below 100%	Dashboard warning banner. Email notification sent.
Soft limit	100% to below 120%	Write operations blocked. Read operations continue normally.
Hard limit	120% and above	All operations blocked except reads of existing data.
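The thresholds in the table can be sketched as a small classifier. This is an illustrative reading of the table, not IntelliRag's implementation: the enum values, function names, and dimension keys are assumptions, and each threshold is treated as inclusive (exactly 80% counts as Warning).

```python
from enum import Enum


class QuotaState(Enum):
    HEALTHY = "healthy"
    WARNING = "warning"
    SOFT_LIMIT = "soft_limit"
    HARD_LIMIT = "hard_limit"


def quota_state(used: float, limit: float) -> QuotaState:
    """Classify one dimension's usage against its plan limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Compare scaled values instead of dividing, so integer inputs
    # are compared exactly at the 80%, 100%, and 120% boundaries.
    if used * 10 >= limit * 12:
        return QuotaState.HARD_LIMIT
    if used >= limit:
        return QuotaState.SOFT_LIMIT
    if used * 10 >= limit * 8:
        return QuotaState.WARNING
    return QuotaState.HEALTHY


def evaluate(usage: dict, limits: dict) -> dict:
    """Evaluate every dimension independently of the others."""
    return {dim: quota_state(usage.get(dim, 0), lim) for dim, lim in limits.items()}


# Free-plan monthly limits from the plan table (key names are hypothetical).
FREE_MONTHLY_LIMITS = {
    "llm_credits": 1_000,
    "search_queries": 5_000,
    "index_minutes": 60,
    "api_calls": 10_000,
}

states = evaluate(
    {"llm_credits": 850, "search_queries": 5_000, "index_minutes": 10, "api_calls": 12_000},
    FREE_MONTHLY_LIMITS,
)
```

In this example each dimension lands in a different state, which mirrors the independent evaluation described above.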

When a quota is exceeded, API responses include the X-Quota-State header indicating the current state. Blocked requests return HTTP 429 with a JSON body describing which dimension triggered the limit.
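A client might interpret these signals along the following lines. This is a minimal sketch under stated assumptions: the exact header values and the JSON body schema are not specified here, so the "dimension" field name and the function name are hypothetical.

```python
import json
from typing import Mapping, Optional


def describe_quota_response(
    status: int, headers: Mapping[str, str], body: str
) -> Optional[str]:
    """Return a short message if the response carries quota information,
    or None if it does not."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    state = lowered.get("x-quota-state")

    if status == 429:
        # Assumption: the JSON body is an object with a "dimension" field.
        try:
            detail = json.loads(body)
        except json.JSONDecodeError:
            detail = {}
        dimension = detail.get("dimension", "unknown") if isinstance(detail, dict) else "unknown"
        return f"blocked: dimension={dimension}, state={state or 'unknown'}"

    if state is not None:
        # Request succeeded, but the header reports a non-healthy state.
        return f"quota state: {state}"
    return None
```

Keeping the parsing in a pure function like this makes it easy to plug into whichever HTTP client you already use.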

Quotas reset on the first of each month at 00:00 UTC. Enterprise plans have custom limits negotiated per contract and may include rollover provisions.
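For standard monthly quotas, the next reset time can be computed as in the sketch below. The function name is hypothetical, and Enterprise contracts with custom terms may not follow this schedule.

```python
from datetime import datetime, timezone


def next_quota_reset(now: datetime) -> datetime:
    """Return the next first-of-month 00:00 UTC strictly after `now`.

    `now` must be timezone-aware; it is converted to UTC first.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now_utc = now.astimezone(timezone.utc)
    if now_utc.month == 12:
        # December rolls over into January of the following year.
        return datetime(now_utc.year + 1, 1, 1, tzinfo=timezone.utc)
    return datetime(now_utc.year, now_utc.month + 1, 1, tzinfo=timezone.utc)


# Example: time remaining until the next reset.
remaining = next_quota_reset(datetime.now(timezone.utc)) - datetime.now(timezone.utc)
```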