Configuration

The indexer is configured through CLI flags and environment variables. Most settings have sensible defaults, so the minimal invocation requires only your API credentials.

    cd /path/to/your/repo
    indexer index \
      --api-url https://api.intellirag.io \
      --api-key your-api-key

That’s it. --repo-path defaults to the current directory (.), so run the command from your repo root. If that directory is not a git repository, the indexer exits with an error.

The indexer reads git remote get-url origin from the repository path and sends it to the API server during authentication. The server resolves:

  • tenant_id - derived from the API key
  • repo_id - matched by the normalized remote URL
  • workspace_id - the workspace containing the matched repository

You do not need to specify these IDs manually. The repository must already exist in the dashboard - the indexer performs a lookup, not an auto-create.
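
As an illustration of the lookup input, here is a minimal Python sketch of reading the origin remote the way the text describes. The function name `read_origin_url` and the injectable `run` parameter are hypothetical, and this is not the indexer's own code; only the `git remote get-url origin` command comes from the documentation above.

```python
import subprocess
from typing import Callable, Optional


def read_origin_url(repo_path: str = ".", run: Callable = subprocess.run) -> Optional[str]:
    """Return the URL of the `origin` remote, or None if it cannot be read.

    `run` is injectable so the function can be exercised without a real
    repository (an assumption made for this sketch).
    """
    try:
        result = run(
            ["git", "-C", repo_path, "remote", "get-url", "origin"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # git is not installed or not on PATH
        return None
    if result.returncode != 0:
        # Not a git repository, or no remote named "origin"
        return None
    return result.stdout.strip() or None
```

A caller that mirrors the documented behavior would treat a `None` result as fatal and exit with an error instead of guessing a repository.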

The indexer normalizes remote URLs before matching:

  • Trims trailing .git suffix
  • Converts to lowercase
  • Converts SSH URLs to HTTPS format

This normalization runs identically in both the indexer and the API server, so git@github.com:Org/Repo.git and https://github.com/org/repo resolve to the same repository.
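
The three rules above can be sketched in Python as follows. This is an approximation under stated assumptions, not the indexer's actual implementation: the function name `normalize_remote_url` and the two regular expressions (covering the scp-like `git@host:path` form and the `ssh://` form) are hypothetical choices for illustration.

```python
import re

# ssh://user@host[:port]/path (assumed to count as an "SSH URL")
_SSH_SCHEME = re.compile(r"^ssh://(?:[^@/]+@)?([^/:]+)(?::\d+)?/(.+)$")
# scp-like form: user@host:path (the lookahead rejects "https://...")
_SCP_LIKE = re.compile(r"^(?:[\w.-]+@)?([\w.-]+):(?!//)(.+)$")


def normalize_remote_url(url: str) -> str:
    """Lowercase, convert SSH forms to HTTPS, and trim a trailing .git."""
    url = url.strip().lower()
    match = _SSH_SCHEME.match(url) or _SCP_LIKE.match(url)
    if match:
        host, path = match.groups()
        url = f"https://{host}/{path}"
    if url.endswith(".git"):
        url = url[: -len(".git")]
    return url


if __name__ == "__main__":
    print(normalize_remote_url("git@github.com:Org/Repo.git"))
    print(normalize_remote_url("https://github.com/org/repo"))
```

Both calls in the usage example produce `https://github.com/org/repo`, which is the equivalence the paragraph above describes.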

| Flag | Env var | Default | Description |
|------|---------|---------|-------------|
| --api-url | RAG_API_URL | (required) | API server URL |
| --api-key | RAG_API_KEY | (required) | Your API key |
| --repo-path | - | . | Path to the git repository |
| --repo-url | - | (auto from git) | Override the git remote URL |
| --repo-id | - | (auto-resolved) | Override the repository ID |
| --workspace-id | - | (auto-resolved) | Override the workspace ID |
| --tenant-id | - | (auto-resolved) | Override the tenant ID |
| --workers | - | NumCPU * 2 | Number of concurrent analyzer workers |
| --batch-size | - | 500 | Records per batch write |
| --embedding-url | RAG_EMBEDDING_URL | (from API) | Embedding service URL |
| --embedding-key | RAG_EMBEDDING_KEY | (from API key) | Embedding service API key |

Flags take precedence over environment variables. Environment variables are useful for CI/CD pipelines where you want to avoid passing secrets as command-line arguments.
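
The precedence rule can be pictured with a small Python sketch. The helper `resolve_setting` and its parameters are hypothetical names used only for illustration; the order it applies (flag, then environment variable, then default) is the one stated above.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    flag_value: Optional[str],
    env_name: Optional[str],
    default: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Pick a setting value: explicit flag first, then env var, then default."""
    if flag_value is not None:
        return flag_value
    if env_name is not None and env.get(env_name):
        # Only settings that have an env var in the table reach this branch
        return env[env_name]
    return default
```

For a setting such as --repo-path, which has no environment variable, `env_name` would be `None` and the result falls straight through to the default.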

For CI and production use, set credentials as environment variables rather than passing them as flags:

    export RAG_API_URL=https://api.intellirag.io
    export RAG_API_KEY=your-api-key
    # Run from your repo root - no --repo-path needed
    indexer index

In a GitHub Actions workflow, the same environment variables are set on the step that runs the indexer:

    name: Index codebase
    on:
      push:
        branches: [main]
    jobs:
      index:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
            with:
              fetch-depth: 0 # Full history for git archaeology
          - name: Run IntelliRag indexer
            run: |
              # -f makes curl fail on HTTP errors instead of saving an error page as the binary
              curl -fsSL https://github.com/intellirag/intellirag.io/releases/latest/download/indexer-linux-amd64 -o indexer
              chmod +x indexer
              ./indexer index
            env:
              RAG_API_URL: https://api.intellirag.io
              RAG_API_KEY: ${{ secrets.RAG_API_KEY }}

Use fetch-depth: 0 to clone the full git history. The indexer uses that history for archaeology analysis (code churn, ownership, change patterns). A shallow clone still works, but the archaeology results are less detailed.
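
If you want to detect a shallow checkout before indexing, one possible pre-flight check is sketched below in Python. It relies on `git rev-parse --is-shallow-repository`, available in reasonably recent Git versions; the function name `is_shallow_clone` and the injectable `run` parameter are hypothetical, and nothing here is part of the indexer itself.

```python
import subprocess
from typing import Callable


def is_shallow_clone(repo_path: str = ".", run: Callable = subprocess.run) -> bool:
    """Return True if git reports the repository as a shallow clone."""
    result = run(
        ["git", "-C", repo_path, "rev-parse", "--is-shallow-repository"],
        capture_output=True,
        text=True,
        check=False,
    )
    # git prints "true" or "false"; any failure is treated as "not shallow"
    return result.returncode == 0 and result.stdout.strip() == "true"
```

A CI script could log a warning when this returns `True`, since indexing still succeeds but with less history to analyze.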

To run the indexer from its Docker image instead of downloading the binary, use a workflow step like this:

    - name: Run IntelliRag indexer
      run: |
        docker run --rm \
          -v ${{ github.workspace }}:/repo \
          -e RAG_API_URL=https://api.intellirag.io \
          -e RAG_API_KEY=${{ secrets.RAG_API_KEY }} \
          ghcr.io/intellirag/indexer:latest \
          index --repo-path /repo

The --workers flag controls the number of concurrent analyzer goroutines. The default (NumCPU * 2) works well for most cases. Increase it on machines with fast I/O but few cores, or decrease it in memory-constrained environments.
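
To make the default concrete, here is a rough Python analogue of a worker pool sized at twice the CPU count. It uses threads, not goroutines, with `os.cpu_count()` standing in for NumCPU; the names `default_workers` and `analyze_all` are hypothetical and do not come from the indexer.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def default_workers() -> int:
    """Mirror the documented default of NumCPU * 2."""
    return (os.cpu_count() or 1) * 2


def analyze_all(
    items: Iterable[T],
    analyze: Callable[[T], R],
    workers: Optional[int] = None,
) -> List[R]:
    """Run `analyze` over all items with a bounded number of workers."""
    with ThreadPoolExecutor(max_workers=workers or default_workers()) as pool:
        # pool.map preserves input order in its results
        return list(pool.map(analyze, items))
```

Each worker holds its own in-flight data, so memory use grows with the worker count, which is why lowering --workers helps in constrained environments.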

The --batch-size flag controls how many records are sent per batch write to the API server. The default of 500 balances throughput with API gateway overhead. Larger values reduce the number of HTTP requests but increase per-request latency.
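
The trade-off is easy to see with a small Python sketch that splits records into batches. The `batches` helper is a hypothetical illustration of batch writes in general, not the indexer's code; only the default of 500 comes from the text above.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batches(records: Sequence[T], batch_size: int = 500) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `batch_size` records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


if __name__ == "__main__":
    records = list(range(1200))
    print([len(b) for b in batches(records)])        # [500, 500, 200]
    print([len(b) for b in batches(records, 1000)])  # [1000, 200]
```

With 1,200 records, the default produces three requests, while a batch size of 1,000 produces two larger ones.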