Linux Hello Performance Benchmarks
This document describes the performance benchmarks for the Linux Hello face authentication system. These benchmarks are used for optimization and regression testing.
Overview
The benchmark suite measures performance of critical authentication pipeline components:
| Component | Description | Target Metric |
|---|---|---|
| Face Detection | Locate faces in camera frames | Frames per second |
| Embedding Extraction | Extract facial features | Embeddings per second |
| Template Matching | Compare embeddings (cosine similarity) | Comparisons per second |
| Anti-Spoofing | Liveness detection pipeline | Latency (ms) |
| Encryption/Decryption | AES-256-GCM operations | Throughput (MB/s) |
| Secure Memory | Allocation, zeroization, constant-time ops | Overhead (ns) |
Performance Goals
For a responsive authentication experience, the total authentication time should be under 100ms. This breaks down as follows:
| Stage | Target | Notes |
|---|---|---|
| Frame Capture | <33ms | 30 FPS minimum |
| Face Detection | <20ms | Per frame |
| Embedding Extraction | <30ms | Per detected face |
| Anti-Spoofing (per frame) | <15ms | Single frame analysis |
| Template Matching | <5ms | Against up to 100 templates |
| Encryption Round-trip | <10ms | For template storage/retrieval |
| Total Pipeline | <100ms | Single-frame authentication |
Additional Targets
- Multi-frame anti-spoofing: <150ms for 10-frame temporal analysis
- Secure memory operations: <1% overhead vs non-secure operations
- Constant-time comparisons: Timing variance <1% between match/no-match
Running Benchmarks
Prerequisites
Ensure Rust 1.75 or later is installed, then build the project in release mode (this also fetches the dependencies):
cd linux-hello
cargo build --release -p linux-hello-daemon
Run All Benchmarks
cargo bench -p linux-hello-daemon
Run Specific Benchmark Groups
# Face detection only
cargo bench -p linux-hello-daemon -- face_detection
# Template matching
cargo bench -p linux-hello-daemon -- template_matching
# Encryption
cargo bench -p linux-hello-daemon -- encryption
# Secure memory operations
cargo bench -p linux-hello-daemon -- secure_memory
# Full authentication pipeline
cargo bench -p linux-hello-daemon -- full_pipeline
Generate HTML Reports
Criterion automatically generates HTML reports in target/criterion/. Open target/criterion/report/index.html in a browser to view detailed results with graphs.
# After running benchmarks
firefox target/criterion/report/index.html
Compare Against Baseline
To track regressions, save a baseline and compare:
# Save current results as baseline
cargo bench -p linux-hello-daemon -- --save-baseline main
# After changes, compare against baseline
cargo bench -p linux-hello-daemon -- --baseline main
Benchmark Descriptions
Face Detection (face_detection)
Tests the face detection algorithms at common camera resolutions:
- QVGA (320x240)
- VGA (640x480)
- 720p (1280x720)
- 1080p (1920x1080)
What it measures:
- simple_detection: Basic placeholder algorithm
- detector_trait: Full FaceDetect trait implementation
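As a rough sketch of how such a resolution-parameterized Criterion benchmark is typically written (the detect_faces function below is a hypothetical placeholder, not the crate's actual detector):

use criterion::{black_box, BenchmarkId, Criterion};

/// Hypothetical stand-in for the detector; the real FaceDetect implementation
/// returns bounding boxes rather than a count.
fn detect_faces(frame: &[u8]) -> usize {
    frame.iter().filter(|&&p| p > 200).count()
}

fn bench_face_detection(c: &mut Criterion) {
    let mut group = c.benchmark_group("face_detection");
    for (w, h) in [(320u32, 240u32), (640, 480), (1280, 720), (1920, 1080)] {
        // Synthetic grayscale frame; the real benchmarks use camera-like test data.
        let frame = vec![128u8; (w * h) as usize];
        group.bench_with_input(
            BenchmarkId::new("simple_detection", format!("{w}x{h}")),
            &frame,
            |b, frame| b.iter(|| detect_faces(black_box(frame))),
        );
    }
    group.finish();
}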
Embedding Extraction (embedding_extraction)
Tests embedding generation at various input sizes and output dimensions:
- Face sizes: 64x64, 112x112, 160x160, 224x224
- Embedding dimensions: 64, 128, 256, 512
What it measures:
- Time to extract a normalized embedding vector from a face region
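To illustrate what "normalized" means here, a minimal sketch of L2-normalizing an embedding in Rust (the real extractor is model-specific; this is not the project's code):

/// L2-normalize an embedding in place so its Euclidean norm is 1.
/// After normalization, cosine similarity reduces to a plain dot product.
fn l2_normalize(embedding: &mut [f32]) {
    let norm = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > f32::EPSILON {
        for x in embedding.iter_mut() {
            *x /= norm;
        }
    }
}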
Template Matching (template_matching)
Tests comparison operations:
- Cosine similarity at different dimensions
- Euclidean distance calculations
- Matching against databases of 1-100 templates
What it measures:
- Single comparison latency
- Throughput when matching against template databases
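As a point of reference for what these benchmarks exercise, a straightforward (non-SIMD, non-constant-time) cosine similarity and linear-scan match might look like the following; the project's optimized implementation will differ:

/// Cosine similarity between two embeddings of equal dimension.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Linear scan over a template database, returning the index and score
/// of the best match.
fn best_match(probe: &[f32], templates: &[Vec<f32>]) -> Option<(usize, f32)> {
    templates
        .iter()
        .enumerate()
        .map(|(i, t)| (i, cosine_similarity(probe, t)))
        .max_by(|x, y| x.1.total_cmp(&y.1))
}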
Anti-Spoofing (anti_spoofing)
Tests liveness detection components:
- Single frame IR/depth/texture analysis
- Full temporal pipeline (10 frames with movement/blink detection)
What it measures:
- Per-frame analysis latency
- Full pipeline latency for multi-frame analysis
Encryption (encryption)
Tests AES-256-GCM encryption used for template storage:
- Encrypt/decrypt at various data sizes
- Round-trip (encrypt then decrypt)
- PBKDF2 key derivation overhead
What it measures:
- Throughput (bytes/second)
- Latency for template-sized data
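For orientation, an AES-256-GCM round-trip with the aes-gcm crate looks roughly like this (assuming the daemon uses that crate; key handling and the PBKDF2 derivation measured above are simplified away here):

use aes_gcm::{
    aead::{Aead, AeadCore, KeyInit, OsRng},
    Aes256Gcm,
};

/// Encrypt and then decrypt a serialized template, returning the plaintext.
fn roundtrip(template: &[u8]) -> Vec<u8> {
    // In the real daemon the key is derived via PBKDF2; here we just generate one.
    let key = Aes256Gcm::generate_key(OsRng);
    let cipher = Aes256Gcm::new(&key);
    let nonce = Aes256Gcm::generate_nonce(&mut OsRng); // 96-bit, unique per message
    let ciphertext = cipher.encrypt(&nonce, template).expect("encrypt failed");
    cipher.decrypt(&nonce, ciphertext.as_ref()).expect("decrypt failed")
}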
Secure Memory (secure_memory)
Tests security-critical memory operations:
- SecureEmbedding creation (with memory locking)
- Constant-time cosine similarity
- Secure byte comparison (SecureBytes)
- Memory zeroization
What it measures:
- Overhead vs non-secure operations
- Timing consistency (for constant-time operations)
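The constant-time paths build on the subtle crate (see the reference table below). A minimal sketch of a constant-time byte comparison, not the project's SecureBytes type itself:

use subtle::ConstantTimeEq;

/// Compare two byte slices without leaking where they differ.
fn secure_bytes_eq(a: &[u8], b: &[u8]) -> bool {
    // ct_eq returns a Choice; the conversion to bool happens only after the
    // full scan, so timing does not reveal the position of the first mismatch.
    a.ct_eq(b).into()
}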
Full Pipeline (full_pipeline)
Tests complete authentication flows:
- auth_pipeline_no_crypto: Detection + extraction + matching
- auth_pipeline_with_antispoofing: Full pipeline with liveness checks
What it measures:
- End-to-end authentication latency
Reference Results
Expected results on reference hardware (AMD Ryzen 7 5800X, 32GB RAM):
| Benchmark | Expected Time | Notes |
|---|---|---|
| Face detection (VGA) | ~50 us | Placeholder algorithm |
| Embedding extraction (112x112) | ~100 us | Placeholder algorithm |
| Cosine similarity (128-dim) | ~500 ns | SIMD-optimized |
| Template matching (5 templates) | ~3 us | Linear scan |
| Anti-spoofing (single frame) | ~2 ms | VGA resolution |
| AES-GCM encrypt (512 bytes) | ~20 us | With PBKDF2 |
| Secure memory zero (1KB) | ~500 ns | Volatile writes |
| Constant-time eq (256 bytes) | ~300 ns | Using subtle crate |
| Full pipeline (no crypto) | ~200 us | Detection + match |
| Full pipeline (with anti-spoof) | ~2.5 ms | Complete auth |
Note: Production performance with ONNX models will differ significantly. These benchmarks use placeholder algorithms so the benchmarking infrastructure itself can be tested.
Interpreting Results
Understanding Criterion Output
template_matching/cosine_similarity/128
time: [487.23 ns 489.12 ns 491.34 ns]
thrpt: [2.0353 Melem/s 2.0444 Melem/s 2.0523 Melem/s]
change: [-1.2% +0.3% +1.8%] (p = 0.72 > 0.05)
No change in performance detected.
- time: [lower bound, estimate, upper bound] with 95% confidence
- thrpt: Throughput (operations per second)
- change: Comparison vs previous run (if available)
Performance Regressions
Criterion will flag significant regressions:
- Performance degraded: >5% slower with high confidence
- Performance improved: >5% faster with high confidence
Investigate regressions before merging code changes.
Adding New Benchmarks
When adding new functionality, include appropriate benchmarks:
use criterion::{black_box, criterion_group, Criterion, Throughput};

fn bench_new_feature(c: &mut Criterion) {
    let mut group = c.benchmark_group("new_feature");
    // Set throughput for rate-based benchmarks
    group.throughput(Throughput::Elements(1));
    group.bench_function("operation_name", |b| {
        // Setup (not measured per iteration)
        let input = prepare_input();
        b.iter(|| {
            // This code is measured
            black_box(your_function(black_box(&input)))
        });
    });
    group.finish();
}

// Add to criterion_group!
criterion_group!(benches, ..., bench_new_feature);
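For benchmarks whose target metric is a data rate rather than an operation rate (the encryption group, for example), set the group throughput in bytes so Criterion reports MB/s. A sketch reusing the imports above; encrypt_template is a hypothetical placeholder, not the daemon's API:

/// Hypothetical stand-in for the daemon's template encryption call.
fn encrypt_template(data: &[u8]) -> Vec<u8> {
    data.to_vec()
}

fn bench_encrypt_throughput(c: &mut Criterion) {
    let mut group = c.benchmark_group("encryption");
    let template = vec![0u8; 512];
    // Bytes-based throughput makes Criterion print MB/s-style figures.
    group.throughput(Throughput::Bytes(template.len() as u64));
    group.bench_function("encrypt_512B", |b| {
        b.iter(|| black_box(encrypt_template(black_box(&template))))
    });
    group.finish();
}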
Continuous Integration
Benchmarks should be run in CI on performance-critical PRs:
# Example GitHub Actions workflow
benchmark:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Run benchmarks
      run: cargo bench -p linux-hello-daemon -- --noplot
    - name: Store results
      uses: actions/upload-artifact@v4
      with:
        name: benchmark-results
        path: target/criterion/
Troubleshooting
High Variance in Results
If benchmarks show high variance (wide confidence intervals):
- Close other applications
- Disable CPU frequency scaling: sudo cpupower frequency-set -g performance
- Increase sample size: group.sample_size(200);
- Run on an idle system
Benchmarks Too Slow
For slow benchmarks, reduce the sample size and measurement time:
group.sample_size(10); // Default is 100
group.measurement_time(std::time::Duration::from_secs(5));
Memory Issues
If benchmarks fail with OOM or memory errors:
- Reduce iteration count
- Clean up large allocations in benchmark functions
- Check for memory leaks with valgrind
License
These benchmarks are part of the Linux Hello project and are released under the GPL-3.0 license.