Plans and Quotas

IntelliRag offers three plans designed to scale from individual developers to large engineering organizations. All plans include the full platform: MCP tools, CLI, dashboard, and indexer. The plans differ in capacity limits and support level.

| Feature        | Free      | Pro        | Enterprise |
|----------------|-----------|------------|------------|
| Workspaces     | 1         | 10         | Unlimited  |
| Repositories   | 3         | 50         | Unlimited  |
| Team members   | 1         | 25         | Unlimited  |
| LLM credits    | 1,000/mo  | 50,000/mo  | Custom     |
| Search queries | 5,000/mo  | 100,000/mo | Custom     |
| Index minutes  | 60/mo     | 1,000/mo   | Custom     |
| Vector storage | 100 MB    | 10 GB      | Custom     |
| API calls      | 10,000/mo | 500,000/mo | Custom     |
| Support        | Community | Email      | Dedicated  |

Upgrade or downgrade at any time from the dashboard under Settings > Billing. Plan changes take effect immediately. When downgrading, existing data is retained but new writes are blocked if you exceed the lower plan’s limits.
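Before downgrading, it can help to check which limits your current usage would exceed on the lower plan. The following is a minimal sketch, not an IntelliRag API: the dictionary keys and the function name are hypothetical, the numbers are copied from the plan table, and 10 GB is assumed to mean 10 × 1024 MB.

```python
# Limits copied from the plan table. The key names are illustrative only.
# Assumption: 10 GB of vector storage is treated as 10 * 1024 MB.
PLAN_LIMITS = {
    "free": {
        "workspaces": 1,
        "repositories": 3,
        "team_members": 1,
        "vector_storage_mb": 100,
    },
    "pro": {
        "workspaces": 10,
        "repositories": 50,
        "team_members": 25,
        "vector_storage_mb": 10 * 1024,
    },
}


def exceeded_after_downgrade(usage: dict[str, float], target_plan: str) -> list[str]:
    """Return the names of limits that current usage would exceed on target_plan."""
    limits = PLAN_LIMITS[target_plan]
    return sorted(
        name
        for name, used in usage.items()
        if name in limits and used > limits[name]
    )


current_usage = {
    "workspaces": 2,
    "repositories": 3,
    "team_members": 1,
    "vector_storage_mb": 250,
}
over_limit = exceeded_after_downgrade(current_usage, "free")
print(over_limit)
```

An empty list suggests the downgrade would not block writes on these limits. Any name returned is a candidate for cleanup first, for example removing unused repositories.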

IntelliRag meters usage across five dimensions. Each dimension is tracked independently, and any single dimension reaching its limit triggers the corresponding quota state.

LLM credits are consumed by enrichment tasks: module summaries, debt triage, dead code review, schema annotations, contract inference, and event descriptions. Each enrichment job consumes credits based on its token usage. Disabling enrichment for non-critical repositories is the most effective way to reduce LLM credit consumption.

Search queries are counted per request: each semantic search, symbol lookup, or pattern search counts as one query. This includes queries from the MCP tools, CLI, and API. Health check endpoints are excluded.

Index minutes measure the wall-clock time of indexer runs, from index start to completion. Incremental indexes consume significantly fewer minutes than full reindexes because they process only changed files.

Vector storage is the total size of embeddings stored across all seven Qdrant collections (code chunks, module summaries, pattern matches, git archaeology, debt vectors, API contracts, and event catalog). Removing unused repositories frees vector storage immediately.

API calls cover all API requests: both reads and writes count toward this dimension. Health check endpoints (/healthz, /readyz) are excluded.
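If you estimate API call usage on the client side, the health check exclusion can be mirrored with a small filter. This is a sketch under assumptions: the helper name is hypothetical, the example request path is made up, and treating a trailing slash as equivalent is a guess about how the service matches paths.

```python
from urllib.parse import urlsplit

# The two health check paths named in the documentation.
EXCLUDED_PATHS = {"/healthz", "/readyz"}


def counts_toward_api_quota(url_or_path: str) -> bool:
    """Return True if a request to this URL or path would be metered."""
    # Strip the query string and any trailing slash (assumption) before matching.
    path = urlsplit(url_or_path).path.rstrip("/") or "/"
    return path not in EXCLUDED_PATHS


# Hypothetical request log; only the health check paths come from the docs.
requests_made = ["/healthz", "/v1/search?q=auth", "/readyz", "/v1/search?q=db"]
metered = sum(counts_toward_api_quota(r) for r in requests_made)
print(metered)
```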

Quota enforcement uses a four-state model. Each usage dimension is evaluated independently, so one dimension can be in a warning state while the others remain healthy.

| State      | Threshold | Behavior                                                 |
|------------|-----------|----------------------------------------------------------|
| Healthy    | Below 80% | Normal operation.                                        |
| Warning    | 80%       | Dashboard warning banner. Email notification sent.       |
| Soft limit | 100%      | Write operations blocked. Read operations continue normally. |
| Hard limit | 120%      | All operations blocked except reads of existing data.    |
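The thresholds in the table can be expressed as a small classifier applied to each dimension separately. This is a minimal sketch rather than the service's implementation: the state strings and dimension keys are illustrative names, and the limits in the example are the Free plan's monthly values.

```python
# Thresholds from the table, checked from highest to lowest.
# The state names are illustrative, not confirmed identifiers.
THRESHOLDS = [(120, "hard_limit"), (100, "soft_limit"), (80, "warning")]


def quota_state(used: int, limit: int) -> str:
    """Classify one dimension's usage against its limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    for percent, state in THRESHOLDS:
        # Integer comparison avoids floating-point rounding at the boundaries.
        if used * 100 >= percent * limit:
            return state
    return "healthy"


def evaluate(usage: dict[str, int], limits: dict[str, int]) -> dict[str, str]:
    """Evaluate every dimension independently."""
    return {dim: quota_state(usage.get(dim, 0), limits[dim]) for dim in limits}


free_limits = {"llm_credits": 1000, "search_queries": 5000, "api_calls": 10000}
usage = {"llm_credits": 1200, "search_queries": 4000, "api_calls": 500}
states = evaluate(usage, free_limits)
print(states)
```

In this example LLM credits are at 120% (hard limit) and search queries at 80% (warning), while API calls remain healthy, which illustrates how dimensions are evaluated independently.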

When a quota is exceeded, API responses include the X-Quota-State header indicating the current state. Blocked requests return HTTP 429 with a JSON body describing which dimension triggered the limit.
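A client can inspect the status code and the X-Quota-State header to decide how to react. The sketch below works on an already-received response and makes no network calls. The header values and the `dimension` field in the JSON body are assumptions; the documentation states only that the body describes which dimension triggered the limit, not its exact schema.

```python
import json


def describe_quota_response(status: int, headers: dict[str, str], body: str) -> str:
    """Summarize quota information from an HTTP response."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    state = lowered.get("x-quota-state", "unknown")

    if status != 429:
        return f"ok (quota state: {state})"

    # Assumption: the JSON body carries the dimension under a "dimension" key.
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    dimension = payload.get("dimension", "unknown") if isinstance(payload, dict) else "unknown"
    return f"blocked by {dimension} (quota state: {state})"


message = describe_quota_response(
    429,
    {"X-Quota-State": "soft_limit"},
    '{"dimension": "search_queries"}',
)
print(message)
```

Because quotas reset monthly rather than after a short interval, retrying a blocked request in a tight loop is unlikely to help; surfacing the message to the user is usually the more useful response.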

Quotas reset on the first of each month at 00:00 UTC. Enterprise plans have custom limits negotiated per contract and may include rollover provisions.
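To show users when a blocked dimension will become available again, the next reset time can be computed locally. This is a small sketch using only the standard library; the function name is hypothetical, and it covers only the standard monthly reset, not contract-specific Enterprise rollover terms.

```python
from datetime import datetime, timezone


def next_quota_reset(now: datetime) -> datetime:
    """Return the next first-of-month 00:00 UTC strictly after `now`.

    `now` must be timezone-aware so it converts to UTC unambiguously.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now = now.astimezone(timezone.utc)
    # Roll over to January of the following year when the current month is December.
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


reset_at = next_quota_reset(datetime(2024, 12, 15, 10, 30, tzinfo=timezone.utc))
print(reset_at.isoformat())
```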