> ## Documentation Index
> Fetch the complete documentation index at: https://docs.rightnowai.co/llms.txt
> Use this file to discover all available pages before exploring further.

# Advanced Features

> Personalized AI rules and profiling persistence for power users

## .rightnowrules - Personalized AI Guidelines

RightNow AI's `.rightnowrules` feature provides hardware-aware, personalized AI assistance tailored to your CUDA development workflow and GPU architecture.

### Overview

The `.rightnowrules` file contains custom AI guidelines that apply to every interaction, from chat conversations to code completions. It combines:

* **Hardware Detection**: Automatic GPU architecture analysis
* **User Preferences**: Your experience level and optimization focus
* **Project Context**: Workspace-specific CUDA development guidelines

### How It Works

* Automatically creates personalized rules based on detected hardware and user preferences
* Integrates with chat, autocomplete, and code editing features

### Creating .rightnowrules

**Command Palette Method**:

1. Open Command Palette (`Ctrl+Shift+P`)
2. Run: `RightNow: Generate .rightnowrules File`
3. Confirm hardware detection and preferences
4. File created in workspace root

**Auto-Suggestion**:

* RightNow AI detects CUDA files (`.cu`, `.cuh`, `.cuf`) in your workspace
* Shows notification when no `.rightnowrules` exists
* Click "Generate" for instant setup

**Manual Creation**:

Create `.rightnowrules` in your workspace root:

```markdown
# CUDA Development Guidelines
# Hardware: RTX 4090 (Ada Lovelace) | CUDA 12.3
# Profile: Intermediate | Machine Learning | Balanced

**Key Guidelines:**
- Optimize for tensor core utilization on Ada Lovelace
- Balance memory coalescing with occupancy
- Prioritize mixed precision for AI workloads
- Target 70%+ GPU utilization for optimal performance

## My Preferences
- Use explicit memory management over automatic
- Optimize for inference latency < 10ms
- Prefer readable code over micro-optimizations
- Focus on FP16/BF16 tensor operations
```

### Hardware-Aware Generation

RightNow AI automatically detects your GPU and tailors the generated rules to it:

**Supported Architectures**:

* **Pascal** (GTX 10 series): Focus on memory optimization
* **Turing** (RTX 20 series): RT cores and tensor cores
* **Ampere** (RTX 30 series): Structured sparsity on tensor cores
* **Ada Lovelace** (RTX 40 series): 3rd-gen RT cores, shader efficiency
* **Hopper** (H100): Transformer engine, thread block clusters

**Detection Includes**:

* Compute capability and SM count
* Tensor core availability and generation
* Memory bandwidth and capacity
* Multi-GPU configurations

Generated rules also reflect three user preference settings:

**Experience Level**:

* `beginner`: Focus on correctness and learning
* `intermediate`: Balance performance and readability
* `expert`: Aggressive optimizations and advanced features

**Primary Use Case**:

* `machine_learning`: Tensor operations, mixed precision
* `scientific_computing`: Double precision, memory bandwidth
* `graphics`: RT cores, rasterization pipelines
* `general`: Balanced approach across domains

**Optimization Focus**:

* `performance`: Maximum throughput and speed
* `memory`: Minimize memory usage and bandwidth
* `power`: Energy-efficient implementations
* `balanced`: Compromise across all factors

### Example Generated Content

```markdown
# CUDA Development Guidelines
# Auto-generated for Windows on 1/4/2025

**Hardware:** RTX 4090 (Ada Lovelace) | CUDA 12.3
**User Profile:** Intermediate developer | machine_learning focus | balanced optimization

**Key Guidelines:**
- Optimize memory coalescing and shared memory usage
- Balance occupancy with register pressure
- Prioritize tensor operations and mixed precision when available
- Your Ada Lovelace GPU supports 4th-gen tensor cores for AI workloads

## My Preferences
# Add your specific guidelines here:
# - Coding style (e.g., "prefer explicit memory management")
# - Performance targets (e.g., "optimize for latency < 10ms")
# - Project constraints (e.g., "limited to 4GB GPU memory")
# - Architecture preferences (e.g., "focus on Ampere features")
```

### Integration with AI Features

* System messages include your `.rightnowrules` as context for all conversations
* Autocomplete respects your guidelines and coding preferences
* `Ctrl+K` editing follows your optimization priorities and style

### Multi-Workspace Support

* **Per-Workspace Rules**: Each workspace folder can have its own `.rightnowrules`
* **Rule Combination**: When several workspace folders are open, their rules are combined
* **Context Switching**: AI automatically adapts when switching between projects

## Profiling Data Persistence

RightNow AI stores profiling history in `.rightnow/profiling/kernels.json` so that you can track optimization progress across sessions.
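
To make session-to-session persistence concrete, here is a minimal Python sketch that reads the profiling database from a workspace root and falls back to an empty structure when nothing has been profiled yet. The helper name `load_profiling_db` and the empty fallback shape are assumptions for illustration only. RightNow AI manages this file itself, and nothing here is part of its API.

```python
import json
from pathlib import Path

# Relative location of the profiling database, as named in the text above.
PROFILING_DB = Path(".rightnow") / "profiling" / "kernels.json"


def load_profiling_db(workspace_root):
    """Return the parsed profiling database, or an empty one if absent.

    Hypothetical helper: the fallback shape {"kernels": {}} is an assumption.
    """
    db_path = Path(workspace_root) / PROFILING_DB
    if not db_path.is_file():
        return {"kernels": {}}
    with db_path.open(encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    db = load_profiling_db(".")
    print(f"{len(db.get('kernels', {}))} kernel(s) with stored history")
```

Run from a workspace root, this reports how many kernels currently have stored history. Treat the file as read-only in scripts like this so the editor remains the only writer.
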
### File Structure

```
.rightnow/
└── profiling/
    └── kernels.json    # Persistent profiling database
```

### What Gets Stored

**Core Performance Data**:

* Execution time and GPU utilization
* Memory throughput and occupancy
* SM efficiency and warp efficiency
* Cache hit rates (L1/L2) and register usage

**Advanced Analytics**:

* Branch efficiency and instruction replay overhead
* Global/shared memory efficiency
* Temperature and power consumption
* Roofline analysis (compute vs memory bound)

**Session Management**:

* Multiple profiling sessions per kernel
* Timestamps for optimization timeline
* Performance trend analysis
* Before/after comparison data

**Content-Based Keys**:

* Stable kernel identification across code changes
* Preserves history when line numbers change
* Detects real kernel modifications vs. cosmetic edits

**NCU Integration**:

* Official NVIDIA Nsight Compute recommendations
* Architecture-specific optimization suggestions
* Bottleneck identification and solutions
* Performance improvement tracking

### Example Data Structure

```json
{
  "version": "1.0",
  "lastUpdated": "2024-01-31T10:00:00Z",
  "kernels": {
    "file:///C:/projects/matmul.cu:matrixMul:a1b2c3:L45": {
      "kernelName": "matrixMul",
      "sourceFile": "matmul.cu",
      "lineNumber": 45,
      "sessions": [
        {
          "executionTime": 12.5,
          "smEfficiency": 85.2,
          "memoryThroughput": 450.8,
          "occupancy": 68.4,
          "l1CacheHitRate": 89.3,
          "recommendations": [
            "Consider increasing occupancy by reducing register usage",
            "Memory access pattern is well coalesced"
          ],
          "timestamp": 1706698800000,
          "performance": "fast"
        }
      ]
    }
  }
}
```

### Benefits

* View performance improvements over time and identify regression points
* Profiling data survives editor restarts and code modifications
* Share profiling data across team members for collaborative optimization
* Content-based kernel IDs preserve history through code changes

### Configuration

Both features work automatically with minimal setup:

**User Preferences** (Settings → RightNow AI):

* `cudaExperienceLevel`: Beginner, Intermediate, Expert
* `cudaPrimaryUseCase`: ML, Scientific, Graphics, General
* `cudaOptimizationFocus`: Performance, Memory, Power, Balanced

**File Management**:

* `.rightnowrules`: Version control recommended for team guidelines
* `.rightnow/profiling/`: Add to `.gitignore` for personal profiling data

**Pro tip:** Commit `.rightnowrules` to share team coding guidelines, but keep `.rightnow/profiling/` local for individual optimization tracking.
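
Because `kernels.json` is plain JSON, the stored history can also be inspected outside the editor. The sketch below is one hedged way to compare the first and latest session of each kernel, using only the `kernels`, `sessions`, `timestamp`, `executionTime`, and `kernelName` fields from the example data structure above. The function name `execution_time_trend` is hypothetical, the second session in the sample is invented for the demo, and the reading that a lower `executionTime` is better is an assumption.

```python
def execution_time_trend(db):
    """Map each kernel name to (first, latest, percent_change).

    Sessions are ordered by their timestamp field. Kernels with fewer than
    two sessions, or a zero baseline, are skipped because there is nothing
    meaningful to compare. A negative percent_change means the kernel got
    faster, assuming lower executionTime is better.
    """
    trends = {}
    for entry in db.get("kernels", {}).values():
        sessions = sorted(entry.get("sessions", []), key=lambda s: s["timestamp"])
        if len(sessions) < 2:
            continue
        first = sessions[0]["executionTime"]
        latest = sessions[-1]["executionTime"]
        if first == 0:
            continue
        change = (latest - first) / first * 100.0
        trends[entry["kernelName"]] = (first, latest, change)
    return trends


# Illustrative data in the documented shape; the 20.0 session is invented.
sample_db = {
    "kernels": {
        "file:///C:/projects/matmul.cu:matrixMul:a1b2c3:L45": {
            "kernelName": "matrixMul",
            "sessions": [
                {"executionTime": 12.5, "timestamp": 1706698800000},
                {"executionTime": 20.0, "timestamp": 1706612400000},
            ],
        }
    }
}

for name, (first, latest, change) in execution_time_trend(sample_db).items():
    print(f"{name}: {first} -> {latest} ({change:+.1f}%)")
# prints "matrixMul: 20.0 -> 12.5 (-37.5%)"
```

Results are keyed by `kernelName` for readability. If two source files define kernels with the same name, keying by the full database key would avoid collisions.
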