Hallucination Detection

The hallucination detection analyzer is Vow's core feature. It flags code in which an AI model has fabricated APIs, imports, functions, or other non-existent code elements.

How It Works

The Allowlist Approach

Vow uses an allowlist-based approach to detect hallucinations:

Known Package Database: Maintains a curated list of real packages, APIs, and functions
Import Verification: Checks if imported packages actually exist
API Validation: Verifies that called functions/methods are real
Cross-reference: Compares generated code against known good patterns

# ✅ Real import - will pass
import requests
response = requests.get("https://api.github.com/users/octocat")

# ❌ Hallucinated import - will be flagged
import nonexistent_magic_lib
data = nonexistent_magic_lib.do_impossible_thing()

Detection Mechanisms

1. Import Analysis

# Real imports (in allowlist)
import os                    # ✅ Standard library
import requests              # ✅ Popular package
from flask import Flask      # ✅ Known framework

# Hallucinated imports (not in allowlist)
import magic_ai_lib          # ❌ Doesn't exist
from super_utils import *    # ❌ Vague/fabricated
import openai_v4             # ❌ No such package (invented version suffix)

2. API Endpoint Validation

# Suspicious API patterns
requests.get("https://api.nonexistent.com/v1/data")    # ❌ Fake domain
requests.post("https://api.example.com/secret")        # ❌ Too generic
requests.get("https://internal-api.company.com/admin") # ❌ Assumed internal API
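One way such endpoint checks could work is to pull URL string literals out of the code and classify their hostnames. The sketch below is an assumption about the approach, not a description of Vow's rules: `KNOWN_API_HOSTS`, `classify_url`, and `urls_in_source` are hypothetical, and the only built-in knowledge is that the `example.*` domains are reserved for documentation.

```python
import ast
from urllib.parse import urlparse

# Hypothetical heuristics; the rules Vow actually applies may differ.
RESERVED_HOSTS = {"example.com", "example.org", "example.net"}  # reserved by RFC 2606
KNOWN_API_HOSTS = {"api.github.com"}  # assumed allowlist of real API hosts


def classify_url(url: str) -> str:
    """Label a URL as 'known', 'placeholder', or 'unverified' by its hostname."""
    host = urlparse(url).hostname or ""
    if host in KNOWN_API_HOSTS:
        return "known"
    if any(host == r or host.endswith("." + r) for r in RESERVED_HOSTS):
        return "placeholder"
    return "unverified"


def urls_in_source(source: str) -> list[str]:
    """Collect string literals that look like HTTP(S) URLs."""
    return [
        node.value
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Constant)
        and isinstance(node.value, str)
        and node.value.startswith(("http://", "https://"))
    ]
```

A static check like this cannot tell whether an unlisted domain is real, which is why the sketch reports it as "unverified" rather than "fake".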

3. Function Call Verification

# Real function calls
os.path.exists("/tmp")           # ✅ Standard library
requests.get(url).json()         # ✅ Known method chain

# Hallucinated function calls
requests.get(url).auto_parse()   # ❌ Method doesn't exist
os.smart_cleanup()               # ❌ Function doesn't exist
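For modules that are installed locally, the same question ("does this function exist?") can be answered by runtime introspection. Note that this is a different technique from the allowlist lookup described above, shown only to make the idea concrete; `attribute_exists` is a hypothetical helper. It imports the module, so it should only be used on packages you trust.

```python
import importlib


def attribute_exists(dotted_name: str) -> bool:
    """Check whether a name such as 'os.path.exists' resolves on the installed module."""
    module_name, *attrs = dotted_name.split(".")
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False  # the module itself is missing
    for attr in attrs:
        if not hasattr(obj, attr):
            return False
        obj = getattr(obj, attr)
    return True


print(attribute_exists("os.path.exists"))    # True
print(attribute_exists("os.smart_cleanup"))  # False
```

Introspection cannot follow return values, so a chain like `requests.get(url).auto_parse()` still needs a database of known method chains.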

Supported Languages

Language	Import Detection	API Validation	Function Verification	Coverage
Python	✅ Full	✅ Full	✅ Full	95%+
JavaScript	✅ Full	✅ Partial	✅ Full	85%+
TypeScript	✅ Full	✅ Partial	✅ Full	85%+
Go	✅ Full	❌ Limited	✅ Partial	70%+
Rust	✅ Full	❌ Limited	✅ Partial	65%+
Java	🔄 Coming Soon	🔄 Coming Soon	🔄 Coming Soon	-

Known Package Database

Python Packages

Vow includes knowledge of:

Standard Library: All built-in modules (os, sys, json, etc.)
Popular Packages: Top 1000 PyPI packages by download count
Common Patterns: Typical import styles and usage patterns

# Example Python package definitions
python_packages:
  requests:
    version_range: ">=2.0.0"
    common_imports:
      - "import requests"
      - "from requests import get, post"
    known_methods:
      - "get"
      - "post" 
      - "put"
      - "delete"
    common_patterns:
      - "requests.get().json()"
      - "requests.post(url, json=data)"
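To show how an entry like this might be consulted, here is a small sketch that mirrors the YAML as a plain Python dict. The lookup function `is_known_method` is an assumption made for illustration; the document does not describe how Vow queries its database.

```python
# Mirrors the YAML entry above as a plain dict; the lookup logic is an assumption.
PYTHON_PACKAGES = {
    "requests": {
        "version_range": ">=2.0.0",
        "known_methods": ["get", "post", "put", "delete"],
    }
}


def is_known_method(package: str, method: str) -> bool:
    """True if `method` is listed under `known_methods` for `package`."""
    entry = PYTHON_PACKAGES.get(package)
    return entry is not None and method in entry["known_methods"]


print(is_known_method("requests", "get"))         # True
print(is_known_method("requests", "auto_parse"))  # False
```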

JavaScript/Node.js Packages

Built-ins: All Node.js core modules
NPM Popular: Top 500 most downloaded packages
Browser APIs: DOM, Fetch, etc.

Custom Package Lists

Add your organization's internal packages:

# .vow/known-packages.yaml
custom_packages:
  python:
    - name: "internal_utils"
      versions: ["1.0.0", "1.1.0"]
      imports:
        - "from internal_utils import helper"
    - name: "company_api_client"
      versions: [">=2.0.0"]
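The `versions` field above mixes exact versions with a `>=` bound. How Vow interprets these is not specified here; the sketch below assumes plain dotted numeric versions and supports only exact matches and `>=`. `parse_version` and `version_allowed` are hypothetical helpers. For real-world version strings, the `packaging` library handles the full specifier syntax.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def version_allowed(version: str, specs: list[str]) -> bool:
    """True if `version` matches any exact version or '>=' bound in `specs`."""
    current = parse_version(version)
    for spec in specs:
        if spec.startswith(">="):
            if current >= parse_version(spec[2:].strip()):
                return True
        elif current == parse_version(spec):
            return True
    return False


print(version_allowed("1.1.0", ["1.0.0", "1.1.0"]))  # True
print(version_allowed("1.9.9", [">=2.0.0"]))         # False
```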

Configuration

Basic Configuration

# .vow.yaml
analyzers:
  hallucination_detection:
    enabled: true
    
    # Strictness level
    strictness: medium  # low, medium, high, paranoid
    
    # Package sources to check
    check_sources:
      - pypi          # Python Package Index
      - npm           # NPM Registry
      - crates_io     # Rust Crates
      - custom        # Your custom packages
    
    # What to check
    check_types:
      - imports       # import statements
      - api_calls     # HTTP API endpoints
      - functions     # Function/method calls

Strictness Levels

Low Strictness

Only flags obviously fake packages
Allows common placeholder names
Minimal false positives

# Would NOT be flagged in low strictness
import utils              # Generic but common
from helpers import *     # Vague but acceptable

Medium Strictness (Default)

Balanced approach
Flags suspicious patterns
Some false positives acceptable

# Would be flagged in medium strictness
import magic_helper       # "magic" is suspicious
from ai_utils import *    # AI-related names are suspicious

High Strictness

Very conservative
Flags anything not explicitly known
Higher false positive rate

# Would be flagged in high strictness
import custom_lib         # Not in allowlist
import internal_tool      # Unknown package

Paranoid Mode

Maximum detection
Flags even borderline cases
High false positive rate in exchange for the broadest detection
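One way to picture the four levels is as thresholds on a suspicion score. The model below is an assumption chosen to reproduce the examples in this section; the score values, the keyword list, and the names `suspicion_score` and `should_flag` are all hypothetical and do not describe Vow's internals.

```python
ALLOWLIST = {"os", "requests", "flask"}   # assumed known packages
PLACEHOLDER_NAMES = {"utils", "helpers"}  # generic names tolerated below paranoid
SUSPICIOUS_WORDS = {"magic", "ai", "super"}  # assumed keyword list

# Minimum suspicion score that triggers a flag at each level.
THRESHOLDS = {"low": 4, "medium": 3, "high": 2, "paranoid": 1}


def suspicion_score(module: str) -> int:
    """Score a module name from 0 (known) to 4 (obviously fabricated)."""
    if module in ALLOWLIST:
        return 0
    if module in PLACEHOLDER_NAMES:
        return 1
    hits = sum(word in SUSPICIOUS_WORDS for word in module.lower().split("_"))
    if hits >= 2:
        return 4
    if hits == 1:
        return 3
    return 2  # unknown, but nothing suspicious in the name


def should_flag(module: str, strictness: str = "medium") -> bool:
    return suspicion_score(module) >= THRESHOLDS[strictness]


print(should_flag("utils", "low"))            # False
print(should_flag("magic_helper", "medium"))  # True
print(should_flag("custom_lib", "high"))      # True
```

Under this model, raising the strictness only lowers the threshold, so each level flags everything the previous level flags plus more.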

Limitations

1. Custom/Internal Packages

Vow doesn't know about your internal packages by default:

# Will be flagged even if these are real internal packages
import company_internal_lib
from team_utils import helper

Solution: Add them to your custom package list.

2. Version-Specific APIs

Vow may not track every version of every package:

# Might be flagged if the usage is missing from Vow's package database
import requests
response = requests.get(url, timeout=30.5)  # e.g. an argument form the database does not list

3. Dynamic Imports

Runtime imports are harder to verify:

# Harder to verify statically
module_name = "requests" 
imported_module = __import__(module_name)
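A static analyzer can still locate dynamic import calls, and can resolve the module name when it is passed as a string literal. The sketch below illustrates that split; `dynamic_imports` is a hypothetical helper, and whether Vow does anything similar is not stated. It needs Python 3.9 or later for `ast.unparse`.

```python
import ast
from typing import Optional


def dynamic_imports(source: str) -> list[tuple[str, Optional[str]]]:
    """Return (callee, module name or None) for each dynamic import call."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        callee = ast.unparse(node.func)
        if callee not in ("__import__", "importlib.import_module"):
            continue
        first = node.args[0] if node.args else None
        # Only a string literal can be resolved without running the code.
        literal = (
            first.value
            if isinstance(first, ast.Constant) and isinstance(first.value, str)
            else None
        )
        found.append((callee, literal))
    return found


print(dynamic_imports('m = __import__("requests")'))
# [('__import__', 'requests')]
print(dynamic_imports('name = "requests"\nm = __import__(name)'))
# [('__import__', None)]
```

A `None` result marks an import the analyzer can see but cannot check, which is the case shown in the example above.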

4. Language Coverage

Coverage varies by language; see the Supported Languages table above.

Fine-tuning

Reducing False Positives

1. Custom Allowlist

# .vow/known-packages.yaml
allowlist:
  python:
    - "internal_package"
    - "legacy_tool"
  javascript:
    - "@company/utils"

2. Ignore Patterns

# .vow.yaml
analyzers:
  hallucination_detection:
    ignore_patterns:
      - "test_*"           # Test files often have mock imports
      - "*_mock"           # Mock modules
      - "example_*"        # Example code
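The patterns above look like shell-style globs. Assuming that is how they are interpreted, and that they are matched against a module or file name (the document does not say which), Python's `fnmatch` module shows what each pattern would catch. `is_ignored` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

IGNORE_PATTERNS = ["test_*", "*_mock", "example_*"]


def is_ignored(name: str) -> bool:
    """True if `name` matches any ignore pattern (case-sensitive glob match)."""
    return any(fnmatchcase(name, pattern) for pattern in IGNORE_PATTERNS)


print(is_ignored("test_helpers"))  # True
print(is_ignored("db_mock"))       # True
print(is_ignored("requests"))      # False
```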

3. Confidence Thresholds

# .vow.yaml
analyzers:
  hallucination_detection:
    confidence_threshold: 0.7  # Only flag high-confidence issues
    min_severity: medium       # Skip low-severity issues
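The combined effect of the two settings can be sketched as a filter over findings. The `Finding` shape and the severity ordering are assumptions for illustration; only the two setting names come from the configuration above.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2}  # assumed ordering


@dataclass
class Finding:  # hypothetical shape of a reported issue
    message: str
    confidence: float
    severity: str


def filter_findings(findings, confidence_threshold=0.7, min_severity="medium"):
    """Keep findings that meet both the confidence and the severity floor."""
    floor = SEVERITY_ORDER[min_severity]
    return [
        f
        for f in findings
        if f.confidence >= confidence_threshold
        and SEVERITY_ORDER[f.severity] >= floor
    ]


findings = [
    Finding("Import 'magic_ai_lib' not found", 0.95, "high"),
    Finding("Import 'utils' not found", 0.40, "medium"),
    Finding("Method may be hallucinated", 0.80, "low"),
]
print([f.message for f in filter_findings(findings)])
# ["Import 'magic_ai_lib' not found"]
```

A finding must pass both checks, so raising either setting can only reduce the number of reported issues.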

Handling Special Cases

Commented Code

# This won't be flagged (commented)
# import fake_library

# This WILL be flagged (active code)
import fake_library

Documentation Examples

# Mark documentation files as examples
file_types:
  documentation:
    patterns: ["*.md", "*.rst", "docs/**"]
    relaxed_checking: true

Common Issues and Solutions

Issue: Internal Package Flagged

❌ Import 'company_utils' not found in known packages

Solution: Add the package to your custom package list

custom_packages:
  python:
    - name: "company_utils"

Issue: New Package Version

❌ Method 'requests.Session().mount()' may be hallucinated

Solution: Update the package database or reduce strictness

# Update package database
vow update-packages

# Or reduce strictness for this project
vow check . --strictness low

Issue: Dynamic Code

# This pattern is hard to verify
getattr(requests, 'get')('https://api.example.com')

Solution: Use static imports when possible, or add ignore patterns.

Best Practices

1. Regular Updates

Keep the package database updated:

# Update monthly
vow update-packages --auto-schedule monthly

2. Project-Specific Configuration

Create a .vow.yaml file for each project:

# For a data science project
analyzers:
  hallucination_detection:
    strictness: low  # Many ML packages
    custom_packages:
      - "internal_ml_utils"

3. CI Integration

Run Vow in CI, but keep false positives from blocking the build:

# .github/workflows/vow.yml
- name: Check for hallucinations
  run: |
    vow check . --format sarif --output results.sarif
    # Continue on failure but upload results
  continue-on-error: true
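With `continue-on-error` set, a later step can decide what to do with the results. The sketch below assumes the output file follows the standard SARIF 2.1.0 layout (`runs[].results[].level`); `summarize_sarif` is a hypothetical helper, not part of Vow.

```python
def summarize_sarif(sarif: dict) -> dict[str, int]:
    """Count results per level across all runs of a SARIF 2.1.0 log."""
    counts: dict[str, int] = {}
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # A missing level is treated as "warning" in this sketch.
            level = result.get("level", "warning")
            counts[level] = counts.get(level, 0) + 1
    return counts


sample = {
    "version": "2.1.0",
    "runs": [
        {
            "results": [
                {"ruleId": "hallucinated-import", "level": "error"},
                {"ruleId": "suspicious-endpoint", "level": "warning"},
                {"ruleId": "unknown-method"},
            ]
        }
    ],
}
print(summarize_sarif(sample))  # {'error': 1, 'warning': 2}
```

In a pipeline you would load the file with `json.load` and, for example, fail the job only when the `error` count is non-zero. The rule IDs in `sample` are made up for the example.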

4. Team Coordination

Share package lists across the team:

# Export your package list
vow packages export team-packages.yaml

# Import on other machines
vow packages import team-packages.yaml

Next Steps

Code Analyzer - Related code analysis features
Known Packages - Managing package lists
Writing Rules - Custom detection rules
