> ## Documentation Index
> Fetch the complete documentation index at: https://docs.rightnowai.co/llms.txt
> Use this file to discover all available pages before exploring further.

# AI Providers

> Configure OpenRouter BYOK and local AI models

## OpenRouter BYOK (Free Tier)

**OpenRouter is the only supported BYOK provider** - all other AI providers route through OpenRouter's unified API.

### Setup Steps

1. **Create Account**: Sign up at [openrouter.ai](https://openrouter.ai)
2. **Get API Key**: Visit [openrouter.ai/settings/keys](https://openrouter.ai/settings/keys)
3. **Configure RightNow AI**:
   * Go to **Settings** → **AI Providers** → **OpenRouter**
   * Enter your OpenRouter API key
   * Test connection

### Available Models

Access 200+ models through OpenRouter's unified API:

**Free Models** (with your API key):

* `google/gemini-2.0-flash-exp:free`
* `mistralai/mistral-small-3.1-24b-instruct:free`

**Premium Models** (with your API key):

* **OpenAI**: GPT-4, GPT-4 Turbo, GPT-3.5 Turbo
* **Anthropic**: Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku
* **DeepSeek**: R1 series, Chat models
* **Mistral**: Large, Codestral 2501
* **Google**: Gemini 2.0 Flash

### Provider Routing

All cloud providers automatically route through OpenRouter:

* **OpenAI** → OpenRouter → OpenAI
* **Anthropic** → OpenRouter → Anthropic
* **DeepSeek** → OpenRouter → DeepSeek
* **Mistral** → OpenRouter → Mistral
* **Google** → OpenRouter → Google

## RightNow Pro (Managed Service)

**No API key setup required** - fully managed OpenRouter integration.

### Benefits

* **Curated Models**: Optimized selection for CUDA development
* **Usage Tracking**: Comprehensive analytics and billing
* **Priority Access**: Faster response times and premium models
* **Seamless Experience**: No API key management needed

### Available Models

**Chat Models**:

* `anthropic/claude-sonnet-4`
* `google/gemini-2.5-flash`
* `deepseek/deepseek-chat-v3-0324`

**FIM Models** (Autocomplete):

* `codestral-2501`
* `deepseek-r1-distill-qwen-7b`

### Upgrade

Ready to upgrade? Visit [rightnowai.co/pricing](https://www.rightnowai.co/pricing) to get started with RightNow Pro.

## Local Models (Privacy-First)

Complete offline capability with no data leaving your machine.

### Ollama

**Setup**:

1. Install [Ollama](https://ollama.ai) on your system
2. Pull a model: `ollama pull codellama`
3. Configure RightNow AI:
   * **Settings** → **AI Providers** → **Ollama**
   * Set endpoint: `http://localhost:11434`
   * Select your model and test connection

**Benefits**:

* Easy local model management
* CUDA acceleration support
* Automatic model updates

### vLLM

**Setup**:

1. Install vLLM: `pip install vllm`
2. Start server: `python -m vllm.entrypoints.api_server --model codellama/CodeLlama-7b-Instruct-hf`
3. Configure RightNow AI:
   * **Settings** → **AI Providers** → **vLLM**
   * Set endpoint and model
   * Test connection

**Benefits**:

* High-performance inference server
* Optimized for CUDA GPUs
* Excellent throughput for large models

### LM Studio

**Setup**:

1. Download and install [LM Studio](https://lmstudio.ai)
2. Download a CUDA-compatible model
3. Start local server in LM Studio
4. Configure RightNow AI:
   * **Settings** → **AI Providers** → **LM Studio**
   * Configure endpoint and test connection

**Benefits**:

* User-friendly interface
* GPU acceleration support
* Easy model management

<Tip>
  Use local models for privacy-sensitive projects where code cannot leave your machine.
</Tip>