ASIC-RAG-CHIMERA Security Model
ASIC-RAG-CHIMERA Security Model
Overview
ASIC-RAG-CHIMERA implements a comprehensive security model designed to protect sensitive documents while enabling efficient retrieval and AI-powered responses. This document details the security architecture, threat model, and cryptographic implementations.
Security Objectives
- Confidentiality: Document contents are encrypted at rest
- Integrity: Cryptographic verification of document authenticity
- Privacy: Search keywords are hashed before indexing
- Access Control: Time-limited keys for document access
- Auditability: Complete audit trail of all operations
Cryptographic Primitives
Encryption: AES-256-GCM
All document content is encrypted using AES-256 in Galois/Counter Mode (GCM).
Properties:
- 256-bit key size (128-bit security level)
- Authenticated encryption (integrity + confidentiality)
- 12-byte random nonce per encryption
- 16-byte authentication tag
Implementation:
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)
nonce = os.urandom(12)
ciphertext = aesgcm.encrypt(nonce, plaintext, associated_data)
Key Derivation: PBKDF2
Block-specific keys are derived using PBKDF2-HMAC-SHA256.
Parameters:
- Iterations: 100,000 (adjustable)
- Salt: 32 bytes random per derivation
- Output: 32 bytes (256 bits)
Implementation:
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
kdf = PBKDF2HMAC(
algorithm=hashes.SHA256(),
length=32,
salt=salt,
iterations=100000
)
derived_key = kdf.derive(password)
Hashing: SHA-256
All cryptographic hashing uses SHA-256.
Uses:
- Tag/keyword hashing for opaque indexing
- Block hash for integrity verification
- Merkle tree construction
- Session key derivation
Key Management
Key Hierarchy
┌─────────────────────────────────────────────────────────┐
│ Master Key │
│ (User-provided, 256 bits) │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Block Keys │
│ PBKDF2(master_key || block_hash || salt) │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Session Keys │
│ HMAC(master_key, session_id || block_hash || nonce) │
│ (30-second TTL) │
└─────────────────────────────────────────────────────────┘
Temporary Key Lifecycle
- Generation: Key created for specific block access
- Active: Key can be used for encryption/decryption
- Expiration: Key automatically expires after TTL (30s default)
- Revocation: Key can be manually revoked
- Cleanup: Expired keys are removed by background thread
Key Security Properties
| Property | Implementation |
|---|---|
| Secrecy | Keys never stored in plaintext |
| Uniqueness | Each block has unique derived key |
| Forward Secrecy | Session keys expire, limiting exposure |
| Key Rotation | Supported via re-encryption |
Privacy Protection
Opaque Tag Indexing
Keywords are never stored in plaintext. Before indexing:
# Keyword is hashed before any storage
tag_hash = hashlib.sha256(keyword.encode()).digest()
index.add_tag(tag_hash, block_id)
Privacy Properties:
- Original keywords cannot be recovered from index
- Search requires knowing the exact keyword
- Similar keywords produce completely different hashes
Local LLM Processing
All AI processing happens locally:
- Qwen3-0.6B runs on local hardware
- No data is sent to external APIs
- Model can be air-gapped from network
Integrity Verification
Block Chain Structure
Each block contains a hash of the previous block:
Block N-1 Block N Block N+1
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ prev_hash │←────────│ prev_hash │←────────│ prev_hash │
│ content │ │ content │ │ content │
│ block_hash ─┼────────►│ block_hash ─┼────────►│ block_hash │
└─────────────┘ └─────────────┘ └─────────────┘
Merkle Tree Verification
Documents are organized in a Merkle tree for efficient verification:
Root Hash
/ \
Hash AB Hash CD
/ \ / \
Hash A Hash B Hash C Hash D
| | | |
Doc A Doc B Doc C Doc D
Properties:
- O(log n) proof size
- Any document can be verified without accessing others
- Tampering is detectable at any level
Threat Model
Threats Addressed
| Threat | Mitigation |
|---|---|
| Data at rest exposure | AES-256-GCM encryption |
| Key compromise | Time-limited session keys |
| Keyword leakage | SHA-256 hashed tags |
| Document tampering | Merkle tree + block hashes |
| Unauthorized access | Master key requirement |
| Memory snooping | Secure key derivation |
Threats NOT Addressed
| Threat | Reason |
|---|---|
| Compromised master key | User responsibility |
| Side-channel attacks | Requires hardware mitigation |
| Malicious LLM responses | LLM safety is separate concern |
| Coercion attacks | Out of scope |
Attack Resistance
Brute Force Resistance
- AES-256: 2^256 key space (computationally infeasible)
- PBKDF2 100K iterations: ~100ms per attempt
- At 10K attempts/sec: 10^70 years to exhaust space
Dictionary Attacks
- Tag hashes include no salt (by design for searchability)
- Mitigation: Encourage complex, multi-word tags
- Alternative: Searchable encryption (future enhancement)
Replay Attacks
- Session keys include unique nonce
- Keys expire after 30 seconds
- Each encryption uses unique nonce
Security Recommendations
Master Key Management
# Generate secure master key
import secrets
master_key = secrets.token_bytes(32)
# Store securely (example: use system keychain)
# NEVER hardcode in source code
# NEVER store in plain text files
Configuration Hardening
# Recommended security settings
encryption:
pbkdf2_iterations: 100000 # Minimum 100K
key_generator:
default_ttl: 30 # Maximum 60 seconds
audit:
enabled: true
log_level: "INFO"
Deployment Considerations
- Storage: Use encrypted filesystem for additional protection
- Network: Keep system isolated or use TLS
- Access: Implement OS-level access controls
- Backup: Encrypt backups with separate key
- Logging: Enable audit logging for compliance
Cryptographic Agility
The system is designed for cryptographic agility:
# Encryption algorithm is configurable
config = {
"algorithm": "AES-256-GCM", # Current default
# Future options:
# "algorithm": "ChaCha20-Poly1305",
# "algorithm": "AES-256-GCM-SIV",
}
Compliance Considerations
The security model supports compliance with:
- GDPR: Data encryption, access controls
- HIPAA: Encryption at rest, audit trails
- SOC 2: Access logging, key management
- PCI DSS: Strong cryptography, key rotation
Note: Compliance requires proper deployment configuration and operational procedures beyond the software itself.
Security Audit Checklist
- Master key stored securely
- PBKDF2 iterations ≥ 100,000
- Session key TTL ≤ 60 seconds
- Audit logging enabled
- Storage directory has restricted permissions
- No plaintext keys in logs or errors
- Key rotation procedure documented
- Backup encryption verified