Configuration
The indexer is configured through CLI flags and environment variables. Most settings have sensible defaults, so the minimal invocation requires only your API credentials.
Minimal invocation
```bash
cd /path/to/your/repo
indexer index \
  --api-url https://api.intellirag.io \
  --api-key your-api-key
```
That’s it. --repo-path defaults to the current directory (.), so just run from your repo root. The indexer reads git remote get-url origin to identify the repository. If the current directory is not a git repository, the indexer exits with an error.
Auto-resolution
The indexer reads git remote get-url origin from the repository path and sends it to the API server during authentication. The server resolves:
- tenant_id - derived from the API key
- repo_id - matched by the normalized remote URL
- workspace_id - the workspace containing the matched repository
You do not need to specify these IDs manually. The repository must already exist in the dashboard - the indexer performs a lookup, not an auto-create.
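The lookup starts from the remote URL. A minimal Python sketch of that first step, assuming the indexer simply shells out to git: the helper name read_origin_url and its injectable run parameter are hypothetical and not part of the indexer.

```python
import subprocess


def read_origin_url(repo_path=".", run=subprocess.run):
    """Return the URL of the 'origin' remote for the repository at repo_path.

    Hypothetical helper: it illustrates the lookup described above and is
    not the indexer's actual code.
    """
    result = run(
        ["git", "-C", repo_path, "remote", "get-url", "origin"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Not a git repository, or no remote named 'origin'.
        raise RuntimeError(f"cannot read origin remote: {result.stderr.strip()}")
    return result.stdout.strip()
```

A non-zero exit status from git corresponds to the error case described under Minimal invocation, where the indexer exits instead of guessing a repository.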
URL normalization
The indexer normalizes remote URLs before matching:
- Trims the trailing .git suffix
- Converts to lowercase
- Converts SSH URLs to HTTPS format
This normalization runs identically in both the indexer and the API server, so git@github.com:Org/Repo.git and https://github.com/org/repo resolve to the same repository.
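The three rules can be sketched in Python as follows. This is an illustration only: the function name normalize_remote_url and the regular expressions are assumptions, and the indexer's real implementation may handle more edge cases (ports, trailing slashes, other URL schemes).

```python
import re

# scp-like SSH form, e.g. git@github.com:Org/Repo.git
_SCP_LIKE = re.compile(r"^(?:[\w.-]+@)?([\w.-]+):(?!//)(.+)$")
# ssh:// URL form, e.g. ssh://git@github.com/Org/Repo.git
_SSH_URL = re.compile(r"^ssh://(?:[\w.-]+@)?([\w.-]+)(?::\d+)?/(.+)$")


def normalize_remote_url(url: str) -> str:
    """Apply the documented rules: SSH to HTTPS, drop .git, lowercase."""
    url = url.strip()
    match = _SSH_URL.match(url) or _SCP_LIKE.match(url)
    if match:
        host, path = match.group(1), match.group(2)
        url = f"https://{host}/{path}"
    if url.lower().endswith(".git"):
        url = url[: -len(".git")]
    return url.lower()


# Both spellings of the same repository normalize to one key.
print(normalize_remote_url("git@github.com:Org/Repo.git"))
print(normalize_remote_url("https://github.com/org/repo"))
```

Because matching happens on the normalized form, it does not matter whether a developer cloned over SSH or HTTPS.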
CLI flags
| Flag | Env var | Default | Description |
|---|---|---|---|
| --api-url | RAG_API_URL | (required) | API server URL |
| --api-key | RAG_API_KEY | (required) | Your API key |
| --repo-path | - | . | Path to the git repository |
| --repo-url | - | (auto from git) | Override the git remote URL |
| --repo-id | - | (auto-resolved) | Override the repository ID |
| --workspace-id | - | (auto-resolved) | Override the workspace ID |
| --tenant-id | - | (auto-resolved) | Override the tenant ID |
| --workers | - | NumCPU * 2 | Number of concurrent analyzer workers |
| --batch-size | - | 500 | Records per batch write |
| --embedding-url | RAG_EMBEDDING_URL | (from API) | Embedding service URL |
| --embedding-key | RAG_EMBEDDING_KEY | (from API key) | Embedding service API key |
Flags take precedence over environment variables. Environment variables are useful for CI/CD pipelines where you want to avoid passing secrets as command-line arguments.
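The precedence rule (flag, then environment variable, then default) can be pictured with a short Python sketch. The helper resolve_setting and the staging URL are hypothetical and only illustrate the lookup order; they are not taken from the indexer.

```python
import os


def resolve_setting(flag_value, env_name, default=None, env=None):
    """Pick a value: explicit flag first, then environment variable, then default."""
    env = os.environ if env is None else env
    if flag_value is not None:
        return flag_value
    if env_name and env.get(env_name):
        return env[env_name]
    return default


env = {"RAG_API_URL": "https://api.intellirag.io"}

# Flag given: it wins over the environment (hypothetical staging URL).
print(resolve_setting("https://staging.example.test", "RAG_API_URL", env=env))
# No flag: fall back to the environment variable.
print(resolve_setting(None, "RAG_API_URL", env=env))
# No flag and no env var (as with --repo-path): use the default.
print(resolve_setting(None, None, default="."))
```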
Environment variables
For CI and production use, set credentials as environment variables rather than passing them as flags:
```bash
export RAG_API_URL=https://api.intellirag.io
export RAG_API_KEY=your-api-key

# Run from your repo root - no --repo-path needed
indexer index
```
Running in CI
GitHub Actions
```yaml
name: Index codebase
on:
  push:
    branches: [main]

jobs:
  index:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Full history for git archaeology

      - name: Run IntelliRag indexer
        run: |
          curl -sL https://github.com/intellirag/intellirag.io/releases/latest/download/indexer-linux-amd64 -o indexer
          chmod +x indexer
          ./indexer index
        env:
          RAG_API_URL: https://api.intellirag.io
          RAG_API_KEY: ${{ secrets.RAG_API_KEY }}
```
Use fetch-depth: 0 to clone full git history. The indexer uses git history for archaeology analysis (code churn, ownership, change patterns). A shallow clone still works but produces less rich results.
Docker in CI
```yaml
      - name: Run IntelliRag indexer
        run: |
          docker run --rm \
            -v ${{ github.workspace }}:/repo \
            -e RAG_API_URL=https://api.intellirag.io \
            -e RAG_API_KEY=${{ secrets.RAG_API_KEY }} \
            ghcr.io/intellirag/indexer:latest \
            index --repo-path /repo
```
Tuning
Workers
The --workers flag controls the number of concurrent analyzer goroutines. The default (NumCPU * 2) works well for most cases. Increase it on machines with fast I/O but fewer cores, or decrease it in memory-constrained environments.
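As a rough analogy, the default and the effect of the flag can be sketched in Python. The indexer itself uses goroutines, so the thread pool here is only a stand-in, and the names default_workers and analyze_all are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def default_workers(cpu_count=None):
    """Mirror the documented default: twice the number of CPUs."""
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return cpus * 2


def analyze_all(paths, analyze, workers=None):
    """Run analyze(path) over paths with a bounded pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers or default_workers()) as pool:
        return list(pool.map(analyze, paths))
```

On an 8-core runner this default yields 16 workers. Each worker holds its own in-flight file, which is why lowering the count reduces peak memory.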
Batch size
The --batch-size flag controls how many records are sent per batch write to the API server. The default of 500 balances throughput with API gateway overhead. Larger values reduce the number of HTTP requests but increase per-request latency.
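The trade-off is easy to see with a little arithmetic. The Python sketch below shows how records might be split into batches and how the request count changes with the batch size. The helper names are hypothetical, and the record count of 12,000 is an invented example, not a measurement.

```python
import math


def batches(records, batch_size=500):
    """Yield consecutive slices of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def request_count(total_records, batch_size=500):
    """Number of batch writes needed to send total_records."""
    return math.ceil(total_records / batch_size)


# 12,000 records: 24 requests at the default size, 12 at --batch-size 1000.
print(request_count(12_000))
print(request_count(12_000, batch_size=1000))
```

Doubling the batch size halves the number of requests, but each request carries twice the payload.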
Next steps
- Review supported languages to understand what the indexer extracts
- See framework detection for framework-specific intelligence