# Trust Score Algorithm
The trust score is Vow's quantitative measure of confidence in AI-generated content. It ranges from 0.0 (very low confidence) to 1.0 (high confidence) and helps you prioritize which outputs need human review.
## How Trust Scores Work

### Basic Formula

```
trust_score = weighted_average(analyzer_scores) + score_modifiers
```

Where:

- **Analyzer scores**: individual confidence ratings from each analyzer
- **Weights**: importance weighting applied to each analyzer in the weighted average
- **Score modifiers**: additive bonuses and penalties based on detection certainty (see Score Modifiers below)
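As a concrete illustration, here is a minimal sketch of that calculation using the default weights below, with the modifiers folded into a single additive term (the function is illustrative, not Vow's internal API; clamping the result to the 0.0-1.0 range is an assumption):

```python
# Illustrative sketch only; not Vow's internal code.
DEFAULT_WEIGHTS = {"code": 0.40, "text": 0.35, "security": 0.25}

def trust_score(analyzer_scores: dict[str, float], modifiers: float = 0.0) -> float:
    """Weighted average of analyzer scores plus additive modifiers, clamped to [0, 1]."""
    weighted = sum(DEFAULT_WEIGHTS[name] * s for name, s in analyzer_scores.items())
    return max(0.0, min(1.0, weighted + modifiers))

# Reproduces the breakdown example later on this page:
print(round(trust_score({"code": 0.8, "text": 0.65, "security": 0.9}, modifiers=-0.03), 2))  # 0.74
```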
### Default Weights
| Analyzer | Weight | Rationale |
|---|---|---|
| Code | 40% | Code issues are objective and verifiable |
| Text | 35% | Text analysis has good accuracy but some subjectivity |
| Security | 25% | Security issues are critical but less frequent |
## Analyzer-Specific Scoring

### Code Analyzer Scoring

The code analyzer evaluates several factors:

```yaml
code_factors:
  syntax_correctness: 25%   # Valid syntax and structure
  import_validity: 30%      # All imports are real packages
  api_authenticity: 25%     # Function/method calls exist
  pattern_consistency: 20%  # Follows common coding patterns
```
Examples:

```python
# High trust score (0.9+)
import requests

def get_user(user_id):
    response = requests.get(f"https://api.github.com/users/{user_id}")
    return response.json()
```

```python
# Low trust score (0.3 or below)
import fake_requests_lib
import nonexistent_module

def magic_function():
    data = fake_requests_lib.auto_get_everything()
    return nonexistent_module.process_magically(data)
```
### Text Analyzer Scoring

Text analysis considers:

```yaml
text_factors:
  factual_consistency: 35%   # Statements align with known facts
  reference_validity: 25%    # URLs and citations are real
  writing_naturalness: 20%   # Human-like writing patterns
  internal_consistency: 20%  # No self-contradictions
```
Examples:

```markdown
<!-- High trust score -->
Python was created by Guido van Rossum and first released in 1991.
The latest stable version can be found at https://python.org.
```

```markdown
<!-- Low trust score -->
Python was invented in 1995 by John Smith at Google Corporation.
Download it from https://python-official-new.com/downloads.
```
### Security Analyzer Scoring

Security scoring focuses on:

```yaml
security_factors:
  vulnerability_presence: 40%  # No dangerous patterns detected
  secret_exposure: 30%         # No hardcoded credentials
  permission_safety: 20%       # Safe privilege usage
  injection_resistance: 10%    # No injection vulnerabilities
```
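Each analyzer's overall score can be read as a weighted sum of its factor sub-scores. A minimal sketch of that combination, assuming every factor is scored on a 0.0-1.0 scale (the helper and its name are illustrative, not Vow's internal API; the same pattern applies to the text and security factor tables above):

```python
def combine_factors(sub_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of per-factor sub-scores, each in [0.0, 1.0]."""
    return sum(sub_scores[name] * weight for name, weight in weights.items())

# Code analyzer factor weights from the table above
code_weights = {
    "syntax_correctness": 0.25,
    "import_validity": 0.30,
    "api_authenticity": 0.25,
    "pattern_consistency": 0.20,
}

# Sub-scores matching the breakdown example later on this page
code_score = combine_factors(
    {"syntax_correctness": 1.0, "import_validity": 0.85,
     "api_authenticity": 0.7, "pattern_consistency": 0.6},
    code_weights,
)
print(round(code_score, 2))  # 0.8
```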
## Score Interpretation

### Confidence Levels

| Score Range | Confidence | Color | Meaning | Action |
|---|---|---|---|---|
| 0.8 - 1.0 | High | 🟢 Green | Likely reliable | Use with minimal review |
| 0.6 - 0.8 | Medium | 🟡 Yellow | Some concerns | Review before use |
| 0.3 - 0.6 | Low | 🟠 Orange | Multiple issues | Careful review required |
| 0.0 - 0.3 | Very Low | 🔴 Red | Likely problematic | Significant review needed |
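A hypothetical helper that maps a score to its confidence level, using the boundaries above and assuming each range includes its lower bound (the function and its return labels are illustrative, not part of Vow's API):

```python
def confidence_level(score: float) -> str:
    """Map a trust score to its confidence level; lower bounds assumed inclusive."""
    if score >= 0.8:
        return "high"
    if score >= 0.6:
        return "medium"
    if score >= 0.3:
        return "low"
    return "very_low"

assert confidence_level(0.74) == "medium"
```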
### Score Modifiers

Trust scores can be adjusted by several modifiers:

#### Content Length Bonus

Longer, more detailed content gets a slight bonus:

```
length_bonus = min(0.1, log(content_length) / 100)
```

#### Consistency Bonus

Content that passes multiple analyzers gets reinforcement:

```
if all_analyzers_agree:
    consistency_bonus = 0.05
```

#### Uncertainty Penalty

When analyzers disagree significantly, a penalty is applied:

```
if analyzer_disagreement > 0.3:
    uncertainty_penalty = 0.1
```
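Putting the three modifiers together, a minimal sketch of the adjustment step. Several details are assumptions, since they are not specified above: the log is the natural log, "all analyzers agree" means a score spread under 0.05, and the result is clamped to [0.0, 1.0]:

```python
import math

def apply_modifiers(base_score: float, content_length: int,
                    analyzer_scores: list[float]) -> float:
    """Add length/consistency bonuses, subtract the uncertainty penalty, clamp to [0, 1]."""
    length_bonus = min(0.1, math.log(content_length) / 100)

    spread = max(analyzer_scores) - min(analyzer_scores)
    consistency_bonus = 0.05 if spread < 0.05 else 0.0  # assumed tolerance for "agreement"
    uncertainty_penalty = 0.1 if spread > 0.3 else 0.0

    adjusted = base_score + length_bonus + consistency_bonus - uncertainty_penalty
    return max(0.0, min(1.0, adjusted))

print(round(apply_modifiers(0.7725, 500, [0.8, 0.65, 0.9]), 2))  # ≈ 0.83
```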
## Factors That Increase Trust Score

### ✅ Positive Indicators

**Code:**
- All imports are from well-known packages
- Function calls match documented APIs
- Follows established coding conventions
- Includes proper error handling
- Has realistic variable names
**Text:**
- Contains verifiable facts
- Uses real URLs and references
- Maintains consistent terminology
- Shows natural writing flow
- Includes appropriate caveats/disclaimers
**Security:**
- No hardcoded credentials
- Safe API usage patterns
- Proper input validation
- Appropriate error handling
- Adherence to security best practices
### Examples of High-Trust Content

```python
# Score: 0.92 - Very trustworthy
import logging
from typing import Dict, Optional

import requests

logger = logging.getLogger(__name__)

def fetch_github_user(username: str) -> Optional[Dict]:
    """Fetch user data from the GitHub API."""
    try:
        url = f"https://api.github.com/users/{username}"
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        return response.json()
    except requests.RequestException as e:
        logger.error(f"Failed to fetch user {username}: {e}")
        return None
```
## Factors That Decrease Trust Score

### ❌ Negative Indicators

**Code:**
- Imports from non-existent packages
- Calls to fabricated functions
- Unusual or "magical" variable names
- Missing error handling
- Unrealistic functionality claims
**Text:**
- Contradicts known facts
- Contains broken links/references
- Has unnatural writing patterns
- Makes unsupported claims
- Contains AI-typical phrases
**Security:**
- Hardcoded API keys or passwords
- Dangerous function usage (`eval`, `exec`)
- Missing input validation
- Overly permissive operations
- Injection vulnerability patterns
### Examples of Low-Trust Content

```python
# Score: 0.15 - Very suspicious
import magic_ai_utils
import super_advanced_ml

def solve_everything(problem):
    # This function can solve any problem automatically
    solution = magic_ai_utils.auto_solve(problem)
    enhanced_solution = super_advanced_ml.make_it_perfect(solution)
    return enhanced_solution.get_final_answer()
```
## Customizing Trust Score Calculation

### Adjust Analyzer Weights

```yaml
# .vow.yaml
trust_score:
  weights:
    code: 0.5      # Increase code analyzer importance
    text: 0.3      # Decrease text analyzer importance
    security: 0.2  # Slightly decrease security weight (default: 0.25)
```
### Set Custom Thresholds

```yaml
trust_score:
  thresholds:
    high: 0.85    # Raise the bar for "high confidence"
    medium: 0.65  # Custom medium threshold
    low: 0.35     # Custom low threshold
```
### Domain-Specific Scoring

```yaml
# For data science projects
trust_score:
  domain: data_science
  weights:
    code: 0.3      # Less emphasis on perfect imports
    text: 0.4      # More emphasis on documentation
    security: 0.3  # Higher security concern for data
```
## Understanding Score Components

### Detailed Breakdown

Get detailed scoring information:

```bash
# Show score breakdown
vow check file.py --show-score-breakdown

# Output includes:
# - Individual analyzer scores
# - Weight contributions
# - Applied modifiers
# - Final calculation
```
Example output:

```json
{
  "trust_score": 0.74,
  "breakdown": {
    "code_analyzer": {
      "score": 0.8,
      "weight": 0.4,
      "contribution": 0.32,
      "factors": {
        "import_validity": 0.85,
        "api_authenticity": 0.7,
        "syntax_correctness": 1.0,
        "pattern_consistency": 0.6
      }
    },
    "text_analyzer": {
      "score": 0.65,
      "weight": 0.35,
      "contribution": 0.2275
    },
    "security_analyzer": {
      "score": 0.9,
      "weight": 0.25,
      "contribution": 0.225
    },
    "modifiers": {
      "length_bonus": 0.02,
      "consistency_bonus": 0.0,
      "uncertainty_penalty": -0.05
    }
  }
}
```

Each contribution is `score × weight`; the final trust score is the sum of the contributions plus the modifiers (0.32 + 0.2275 + 0.225 + 0.02 - 0.05 = 0.7425, reported as 0.74).
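If you want to post-process a breakdown, for example to see which analyzer is dragging a file's score down, the JSON is straightforward to consume. A small sketch, assuming the report above has been saved to `breakdown.json` (for instance via `--output`, shown under Best Practices below); the field names follow the example output:

```python
import json

with open("breakdown.json") as f:
    report = json.load(f)

print(f"trust_score: {report['trust_score']}")
for name, data in report["breakdown"].items():
    if name == "modifiers":
        continue  # modifiers carry no per-analyzer score
    print(f"  {name}: score={data['score']}, contribution={data['contribution']}")
```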
## Trust Score in CI/CD

### Setting Thresholds

```yaml
# GitHub Actions example
- name: Check AI output quality
  run: |
    vow check . --min-trust-score 0.7 --format sarif

# Exit codes based on trust score:
#   0: All files meet threshold
#   1: Some files below threshold
#   2: Critical issues found
```
### Gradual Rollout

```yaml
# Gradually increase standards
trust_score:
  thresholds:
    # Week 1: Get baseline
    required: 0.3
    # Week 2: Eliminate worst content
    # required: 0.5
    # Week 3: Raise the bar
    # required: 0.7
```
## Best Practices

### 1. Use Trust Scores as Guidelines
- Don't rely solely on scores for critical decisions
- Combine with human review for important content
- Consider context and domain requirements
### 2. Establish Team Standards

```yaml
# team-standards.yaml
trust_score:
  production_code: 0.8  # High bar for production
  documentation: 0.6    # Medium bar for docs
  examples: 0.4         # Lower bar for examples
  tests: 0.5            # Medium bar for tests
```
### 3. Monitor Score Distribution

```bash
# Get score statistics for your codebase
vow check . --stats --format json | jq '.trust_score_distribution'
```
### 4. Track Improvements

```bash
# Compare scores over time
vow check . --output baseline.json
# ... make improvements ...
vow check . --output improved.json --compare baseline.json
```
## Limitations

### What Trust Scores Can't Tell You
- **Domain Expertise**: Scores can't evaluate domain-specific correctness
- **Business Logic**: Can't verify if code meets business requirements
- **Performance**: Doesn't measure code efficiency or scalability
- **User Experience**: Can't assess UI/UX quality
- **Integration**: Doesn't verify how code works with other systems
### When to Ignore Trust Scores
- **Prototype/Experimental Code**: Lower scores are expected
- **Legacy Code Integration**: May trigger false positives
- **Highly Specialized Domains**: Analyzers may lack the relevant domain knowledge
- **Code Generation Templates**: May be intentionally generic
## Next Steps
- Output Formats - Understanding different output formats
- Configuration - Customize trust score calculation
- CI/CD Integration - Use trust scores in automation