OpenRouter BYOK (Free Tier)

OpenRouter is the only BYOK provider RightNow AI supports directly; all other cloud AI providers are reached through OpenRouter’s unified API.

Setup Steps

  1. Create Account: Sign up at openrouter.ai
  2. Get API Key: Visit openrouter.ai/settings/keys
  3. Configure RightNow AI:
    • Go to Settings → AI Providers → OpenRouter
    • Enter your OpenRouter API key
    • Test connection (a standalone key check is sketched below)
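
If you want to confirm the key works before wiring it into the editor, a minimal request against OpenRouter’s OpenAI-compatible chat endpoint is enough. A sketch in Python, assuming the requests package and the key in an OPENROUTER_API_KEY environment variable (the model ID is one of the free models listed below):

    import os
    import requests

    # OpenRouter exposes an OpenAI-compatible chat completions endpoint.
    resp = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json={
            "model": "mistralai/mistral-small-3.1-24b-instruct:free",
            "messages": [{"role": "user", "content": "Say hello in one word."}],
        },
        timeout=30,
    )
    resp.raise_for_status()  # a 401 here means the key is invalid or revoked
    print(resp.json()["choices"][0]["message"]["content"])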

Available Models

Access 200+ models through OpenRouter’s unified API; a usage sketch follows the lists below.

Free Models (with your API key):
  • google/gemini-2.0-flash-exp:free
  • mistralai/mistral-small-3.1-24b-instruct:free
Premium Models (with your API key):
  • OpenAI: GPT-4, GPT-4 Turbo, GPT-3.5 Turbo
  • Anthropic: Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku
  • DeepSeek: R1 series, Chat models
  • Mistral: Large, Codestral 2501
  • Google: Gemini 2.0 Flash
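
All of these models are addressed through one API; only the model string changes. A sketch using the official openai Python package pointed at OpenRouter (the base-URL override is standard OpenRouter usage; the model choice is illustrative):

    from openai import OpenAI

    # The OpenAI SDK works against OpenRouter once the base URL is overridden.
    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="sk-or-...",  # your OpenRouter key
    )

    reply = client.chat.completions.create(
        model="anthropic/claude-3.5-sonnet",  # any free or premium ID above
        messages=[{"role": "user", "content": "Explain warp divergence briefly."}],
    )
    print(reply.choices[0].message.content)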

Provider Routing

All cloud providers automatically route through OpenRouter, as the sketch after this list illustrates:
  • OpenAI → OpenRouter → OpenAI
  • Anthropic → OpenRouter → Anthropic
  • DeepSeek → OpenRouter → DeepSeek
  • Mistral → OpenRouter → Mistral
  • Google → OpenRouter → Google
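
In practice, switching providers is just switching the model prefix; the endpoint and key stay the same. A hypothetical sketch (the model IDs are illustrative):

    import os
    import requests

    # One endpoint, one key, three upstream providers.
    for model in [
        "openai/gpt-4-turbo",
        "anthropic/claude-3.5-sonnet",
        "deepseek/deepseek-chat",
    ]:
        resp = requests.post(
            "https://openrouter.ai/api/v1/chat/completions",
            headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
            json={"model": model, "messages": [{"role": "user", "content": "ping"}]},
            timeout=30,
        )
        print(model, resp.status_code)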

RightNow Pro (Managed Service)

No API key setup required - fully managed OpenRouter integration.

Benefits

  • Curated Models: Optimized selection for CUDA development
  • Usage Tracking: Comprehensive analytics and billing
  • Priority Access: Faster response times and premium models
  • Seamless Experience: No API key management needed

Available Models

Chat Models:
  • anthropic/claude-sonnet-4
  • google/gemini-2.5-flash
  • deepseek/deepseek-chat-v3-0324
FIM Models (Autocomplete):
  • codestral-2501
  • deepseek-r1-distill-qwen-7b
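
FIM (fill-in-the-middle) models receive the code before and after the cursor and generate only the span in between, which is what makes them suitable for inline autocomplete. RightNow Pro issues these calls internally; purely as a conceptual illustration, a FIM-style request against Mistral's public codestral endpoint looks roughly like this (endpoint and parameters follow Mistral's documented fim API, not RightNow's internal interface):

    import os
    import requests

    # Fill-in-the-middle: the model completes the gap between prompt and suffix.
    resp = requests.post(
        "https://api.mistral.ai/v1/fim/completions",
        headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
        json={
            "model": "codestral-2501",
            "prompt": "__global__ void scale(float* x, float s, int n) {\n    int i = ",
            "suffix": "\n    if (i < n) x[i] *= s;\n}",
            "max_tokens": 32,
        },
        timeout=30,
    )
    print(resp.json())  # response mirrors the chat completion schema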

Upgrade

Ready to upgrade? Visit rightnowai.co/pricing to get started with RightNow Pro.

Local Models (Privacy-First)

Complete offline capability with no data leaving your machine.

Ollama

Setup:
  1. Install Ollama on your system
  2. Pull a model: ollama pull codellama
  3. Configure RightNow AI:
    • Settings → AI Providers → Ollama
    • Set endpoint: http://localhost:11434
    • Select your model and test connection (a standalone check is sketched after the benefits list)
Benefits:
  • Easy local model management
  • CUDA acceleration support
  • Automatic model updates
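
To confirm the server is reachable before configuring the editor, you can hit Ollama's generate endpoint directly. A minimal sketch, assuming the requests package and the codellama model pulled in step 2:

    import requests

    # Ollama's native API listens on port 11434 by default.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "codellama",
            "prompt": "Write a CUDA kernel that adds two float vectors.",
            "stream": False,  # return a single JSON object instead of a stream
        },
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["response"])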

vLLM

Setup:
  1. Install vLLM: pip install vllm
  2. Start server: python -m vllm.entrypoints.openai.api_server --model codellama/CodeLlama-7b-Instruct-hf
  3. Configure RightNow AI:
    • Settings → AI Providers → vLLM
    • Set endpoint (e.g. http://localhost:8000) and model
    • Test connection (a standalone check is sketched after the benefits list)
Benefits:
  • High-performance inference server
  • Optimized for CUDA GPUs
  • Excellent throughput for large models
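
Because vLLM's server speaks the OpenAI protocol, the same client code used for cloud providers works locally. A sketch, assuming the server from step 2 on vLLM's default port 8000:

    from openai import OpenAI

    # vLLM's OpenAI-compatible server defaults to port 8000; the key is unused
    # unless the server was started with --api-key.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

    reply = client.chat.completions.create(
        model="codellama/CodeLlama-7b-Instruct-hf",
        messages=[{"role": "user", "content": "Rewrite this loop for coalesced access."}],
    )
    print(reply.choices[0].message.content)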

LM Studio

Setup:
  1. Download and install LM Studio
  2. Download a CUDA-compatible model
  3. Start local server in LM Studio
  4. Configure RightNow AI:
    • Settings → AI Providers → LM Studio
    • Configure endpoint and test connection (a standalone check is sketched after the benefits list)
Benefits:
  • User-friendly interface
  • GPU acceleration support
  • Easy model management
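
LM Studio's local server is also OpenAI-compatible and listens on http://localhost:1234/v1 by default. A sketch, assuming a model is already loaded in LM Studio's server tab (LM Studio typically serves whatever model is loaded, so the model field is a placeholder):

    from openai import OpenAI

    # LM Studio's local server defaults to http://localhost:1234/v1.
    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

    reply = client.chat.completions.create(
        model="local-model",  # placeholder; LM Studio routes to the loaded model
        messages=[{"role": "user", "content": "Summarize shared memory bank conflicts."}],
    )
    print(reply.choices[0].message.content)
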
Use local models for privacy-sensitive projects where code cannot leave your machine.