> ## Documentation Index
> Fetch the complete documentation index at: https://docs.rightnowai.co/llms.txt
> Use this file to discover all available pages before exploring further.

# Advanced Features

> Personalized AI rules and profiling persistence for power users

## .rightnowrules - Personalized AI Guidelines

RightNow AI's `.rightnowrules` feature provides hardware-aware, personalized AI assistance tailored to your CUDA development workflow and GPU architecture.

### Overview

The `.rightnowrules` file contains custom AI guidelines that enhance every interaction - from chat conversations to code completions. It combines:

* **Hardware Detection**: Automatic GPU architecture analysis
* **User Preferences**: Your experience level and optimization focus
* **Project Context**: Workspace-specific CUDA development guidelines

### How It Works

<CardGroup cols={2}>
  <Card title="Auto-Generation" icon="wand-magic-sparkles">
    Automatically creates personalized rules based on detected hardware and user preferences
  </Card>

  <Card title="AI Integration" icon="robot">
    Seamlessly integrates with chat, autocomplete, and code editing features
  </Card>
</CardGroup>

### Creating .rightnowrules

<Tabs>
  <Tab title="Automatic Generation">
    **Command Palette Method**:

    1. Open Command Palette (`Ctrl+Shift+P`)
    2. Run: `RightNow: Generate .rightnowrules File`
    3. Confirm hardware detection and preferences
    4. File created in workspace root

    **Auto-Suggestion**:

    * RightNow AI detects CUDA files (`.cu`, `.cuh`, `.cuf`) in your workspace
    * Shows notification when no `.rightnowrules` exists
    * Click "Generate" for instant setup
  </Tab>

  <Tab title="Manual Creation">
    Create `.rightnowrules` in your workspace root:

    ```markdown theme={null}
    # CUDA Development Guidelines
    # Hardware: RTX 4090 (Ada Lovelace) | CUDA 12.3
    # Profile: Intermediate | Machine Learning | Balanced

    **Key Guidelines:**
    - Optimize for tensor core utilization on Ada Lovelace
    - Balance memory coalescing with occupancy
    - Prioritize mixed precision for AI workloads
    - Target 70%+ GPU utilization for optimal performance

    ## My Preferences
    - Use explicit memory management over automatic
    - Optimize for inference latency < 10ms
    - Prefer readable code over micro-optimizations
    - Focus on FP16/BF16 tensor operations
    ```
  </Tab>
</Tabs>

### Hardware-Aware Generation

RightNow AI automatically detects and optimizes for your specific GPU:

<AccordionGroup>
  <Accordion icon="microchip" title="GPU Architecture Detection">
    **Supported Architectures**:

    * **Pascal** (GTX 10 series): Focus on memory optimization
    * **Turing** (RTX 20 series): RT cores and tensor cores
    * **Ampere** (RTX 30 series): Sparse tensors and structural sparsity
    * **Ada Lovelace** (RTX 40 series): 3rd-gen RT cores, shader efficiency
    * **Hopper** (H100): Transformer engine, thread block clusters

    **Detection Includes**:

    * Compute capability and SM count
    * Tensor core availability and generation
    * Memory bandwidth and capacity
    * Multi-GPU configurations
  </Accordion>

  <Accordion icon="user" title="User Preference Integration">
    **Experience Level**:

    * `beginner`: Focus on correctness and learning
    * `intermediate`: Balance performance and readability
    * `expert`: Aggressive optimizations and advanced features

    **Primary Use Case**:

    * `machine_learning`: Tensor operations, mixed precision
    * `scientific_computing`: Double precision, memory bandwidth
    * `graphics`: RT cores, rasterization pipelines
    * `general`: Balanced approach across domains

    **Optimization Focus**:

    * `performance`: Maximum throughput and speed
    * `memory`: Minimize memory usage and bandwidth
    * `power`: Energy-efficient implementations
    * `balanced`: Compromise across all factors
  </Accordion>
</AccordionGroup>

### Example Generated Content

```markdown theme={null}
# CUDA Development Guidelines  
# Auto-generated for Windows on 1/4/2025

**Hardware:** RTX 4090 (Ada Lovelace) | CUDA 12.3

**User Profile:** Intermediate developer | machine_learning focus | balanced optimization

**Key Guidelines:**
- Optimize memory coalescing and shared memory usage
- Balance occupancy with register pressure  
- Prioritize tensor operations and mixed precision when available
- Your Ada Lovelace GPU supports 3rd-gen tensor cores for AI workloads

## My Preferences
# Add your specific guidelines here:
# - Coding style (e.g., "prefer explicit memory management")
# - Performance targets (e.g., "optimize for latency < 10ms")  
# - Project constraints (e.g., "limited to 4GB GPU memory")
# - Architecture preferences (e.g., "focus on Ampere features")
```

### Integration with AI Features

<CardGroup cols={3}>
  <Card title="Chat Assistant" icon="comments">
    System messages include your `.rightnowrules` as context for all conversations
  </Card>

  <Card title="Code Completion" icon="code">
    Autocomplete respects your guidelines and coding preferences
  </Card>

  <Card title="Quick Edit" icon="pen">
    Ctrl+K editing follows your optimization priorities and style
  </Card>
</CardGroup>

### Multi-Workspace Support

* **Per-Workspace Rules**: Each workspace folder can have its own `.rightnowrules`
* **Rule Combination**: Multiple workspace folders combine their rules intelligently
* **Context Switching**: AI automatically adapts when switching between projects

## Profiling Data Persistence

RightNow AI maintains comprehensive profiling history in `.rightnow/profiling/kernels.json` for tracking optimization progress across sessions.

### File Structure

```
.rightnow/
└── profiling/
    └── kernels.json  # Persistent profiling database
```

### What Gets Stored

<AccordionGroup>
  <Accordion icon="database" title="Comprehensive Metrics">
    **Core Performance Data**:

    * Execution time and GPU utilization
    * Memory throughput and occupancy
    * SM efficiency and warp efficiency
    * Cache hit rates (L1/L2) and register usage

    **Advanced Analytics**:

    * Branch efficiency and instruction replay overhead
    * Global/shared memory efficiency
    * Temperature and power consumption
    * Roofline analysis (compute vs memory bound)
  </Accordion>

  <Accordion icon="clock-rotate-left" title="Historical Tracking">
    **Session Management**:

    * Multiple profiling sessions per kernel
    * Timestamps for optimization timeline
    * Performance trend analysis
    * Before/after comparison data

    **Content-Based Keys**:

    * Stable kernel identification across code changes
    * Preserves history when line numbers change
    * Detects real kernel modifications vs. cosmetic edits
  </Accordion>

  <Accordion icon="lightbulb" title="AI Recommendations">
    **NCU Integration**:

    * Official NVIDIA Nsight Compute recommendations
    * Architecture-specific optimization suggestions
    * Bottleneck identification and solutions
    * Performance improvement tracking
  </Accordion>
</AccordionGroup>

### Example Data Structure

```json theme={null}
{
  "version": "1.0",
  "lastUpdated": "2024-01-31T10:00:00Z",
  "kernels": {
    "file:///C:/projects/matmul.cu:matrixMul:a1b2c3:L45": {
      "kernelName": "matrixMul",
      "sourceFile": "matmul.cu", 
      "lineNumber": 45,
      "sessions": [
        {
          "executionTime": 12.5,
          "smEfficiency": 85.2,
          "memoryThroughput": 450.8,
          "occupancy": 68.4,
          "l1CacheHitRate": 89.3,
          "recommendations": [
            "Consider increasing occupancy by reducing register usage",
            "Memory access pattern is well coalesced"
          ],
          "timestamp": 1706698800000,
          "performance": "fast"
        }
      ]
    }
  }
}
```

### Benefits

<CardGroup cols={2}>
  <Card title="Optimization Tracking" icon="chart-line">
    View performance improvements over time and identify regression points
  </Card>

  <Card title="Session Persistence" icon="floppy-disk">
    Profiling data survives editor restarts and code modifications
  </Card>

  <Card title="Team Collaboration" icon="users">
    Share profiling data across team members for collaborative optimization
  </Card>

  <Card title="Smart Identification" icon="fingerprint">
    Content-based kernel IDs preserve history through code changes
  </Card>
</CardGroup>

### Configuration

Both features work automatically with minimal setup:

**User Preferences** (Settings → RightNow AI):

* `cudaExperienceLevel`: Beginner, Intermediate, Expert
* `cudaPrimaryUseCase`: ML, Scientific, Graphics, General
* `cudaOptimizationFocus`: Performance, Memory, Power, Balanced

**File Management**:

* `.rightnowrules`: Version control recommended for team guidelines
* `.rightnow/profiling/`: Add to `.gitignore` for personal profiling data

<Tip>
  Pro tip: Commit `.rightnowrules` to share team coding guidelines, but keep `.rightnow/profiling/` local for individual optimization tracking.
</Tip>
