# Linux Hello Performance Benchmarks

This document describes the performance benchmarks for the Linux Hello face authentication system. These benchmarks are used for optimization and regression testing.

## Overview

The benchmark suite measures performance of critical authentication pipeline components:

| Component | Description | Target Metric |
|-----------|-------------|---------------|
| Face Detection | Locate faces in camera frames | Frames per second |
| Embedding Extraction | Extract facial features | Embeddings per second |
| Template Matching | Compare embeddings (cosine similarity) | Comparisons per second |
| Anti-Spoofing | Liveness detection pipeline | Latency (ms) |
| Encryption/Decryption | AES-256-GCM operations | Throughput (MB/s) |
| Secure Memory | Allocation, zeroization, constant-time ops | Overhead (ns) |

## Performance Goals

For a responsive authentication experience, the total authentication time should be under 100ms. This breaks down as follows:

| Stage | Target | Notes |
|-------|--------|-------|
| Frame Capture | <33ms | 30 FPS minimum |
| Face Detection | <20ms | Per frame |
| Embedding Extraction | <30ms | Per detected face |
| Anti-Spoofing (per frame) | <15ms | Single frame analysis |
| Template Matching | <5ms | Against up to 100 templates |
| Encryption Round-trip | <10ms | For template storage/retrieval |
| Total Pipeline | <100ms | Single-frame authentication |

### Additional Targets

- Multi-frame anti-spoofing: <150ms for 10-frame temporal analysis
- Secure memory operations: <1% overhead vs. non-secure operations
- Constant-time comparisons: timing variance <1% between match/no-match

## Running Benchmarks

### Prerequisites

Ensure you have Rust 1.75+ installed, then build the project:

```bash
cd linux-hello
cargo build --release -p linux-hello-daemon
```

### Run All Benchmarks

```bash
cargo bench -p linux-hello-daemon
```

### Run Specific Benchmark Groups

```bash
# Face detection only
cargo bench -p linux-hello-daemon -- face_detection

# Template matching
cargo bench -p linux-hello-daemon -- template_matching

# Encryption
cargo bench -p linux-hello-daemon -- encryption

# Secure memory operations
cargo bench -p linux-hello-daemon -- secure_memory

# Full authentication pipeline
cargo bench -p linux-hello-daemon -- full_pipeline
```

### Generate HTML Reports

Criterion automatically generates HTML reports in `target/criterion/`. Open `target/criterion/report/index.html` in a browser to view detailed results with graphs.

```bash
# After running benchmarks
firefox target/criterion/report/index.html
```

### Compare Against Baseline

To track regressions, save a baseline and compare against it:

```bash
# Save current results as baseline
cargo bench -p linux-hello-daemon -- --save-baseline main

# After changes, compare against baseline
cargo bench -p linux-hello-daemon -- --baseline main
```

## Benchmark Descriptions

### Face Detection (`face_detection`)

Tests the face detection algorithms at common camera resolutions:

- QVGA (320x240)
- VGA (640x480)
- 720p (1280x720)
- 1080p (1920x1080)

**What it measures:**

- `simple_detection`: basic placeholder algorithm
- `detector_trait`: full `FaceDetect` trait implementation

### Embedding Extraction (`embedding_extraction`)

Tests embedding generation at various input sizes and output dimensions:

- Face sizes: 64x64, 112x112, 160x160, 224x224
- Embedding dimensions: 64, 128, 256, 512

**What it measures:**

- Time to extract a normalized embedding vector from a face region

### Template Matching (`template_matching`)

Tests comparison operations:

- Cosine similarity at different dimensions
- Euclidean distance calculations
- Matching against databases of 1-100 templates

**What it measures:**

- Single-comparison latency
- Throughput when matching against template databases

### Anti-Spoofing (`anti_spoofing`)

Tests liveness detection components:

- Single-frame IR/depth/texture analysis
- Full temporal pipeline (10 frames with movement/blink detection)

**What it measures:**

- Per-frame analysis latency
- Full-pipeline latency for multi-frame analysis

### Encryption (`encryption`)

Tests the AES-256-GCM encryption used for template storage:

- Encrypt/decrypt at various data sizes
- Round-trip (encrypt then decrypt)
- PBKDF2 key derivation overhead

**What it measures:**

- Throughput (bytes/second)
- Latency for template-sized data

### Secure Memory (`secure_memory`)

Tests security-critical memory operations:

- `SecureEmbedding` creation (with memory locking)
- Constant-time cosine similarity
- Secure byte comparison (`SecureBytes`)
- Memory zeroization

**What it measures:**

- Overhead vs. non-secure operations
- Timing consistency (for constant-time operations)

### Full Pipeline (`full_pipeline`)

Tests complete authentication flows:

- `auth_pipeline_no_crypto`: detection + extraction + matching
- `auth_pipeline_with_antispoofing`: full pipeline with liveness checks

**What it measures:**

- End-to-end authentication latency

## Reference Results

Expected results on reference hardware (AMD Ryzen 7 5800X, 32GB RAM):

| Benchmark | Expected Time | Notes |
|-----------|---------------|-------|
| Face detection (VGA) | ~50 us | Placeholder algorithm |
| Embedding extraction (112x112) | ~100 us | Placeholder algorithm |
| Cosine similarity (128-dim) | ~500 ns | SIMD-optimized |
| Template matching (5 templates) | ~3 us | Linear scan |
| Anti-spoofing (single frame) | ~2 ms | VGA resolution |
| AES-GCM encrypt (512 bytes) | ~20 us | With PBKDF2 |
| Secure memory zero (1KB) | ~500 ns | Volatile writes |
| Constant-time eq (256 bytes) | ~300 ns | Using `subtle` crate |
| Full pipeline (no crypto) | ~200 us | Detection + match |
| Full pipeline (with anti-spoof) | ~2.5 ms | Complete auth |

> **Note:** Production performance with ONNX models will differ significantly. These benchmarks use placeholder algorithms to exercise the benchmarking infrastructure.

## Interpreting Results

### Understanding Criterion Output

```text
template_matching/cosine_similarity/128
                        time:   [487.23 ns 489.12 ns 491.34 ns]
                        thrpt:  [2.0353 Melem/s 2.0444 Melem/s 2.0523 Melem/s]
                 change: [-1.2% +0.3% +1.8%] (p = 0.72 > 0.05)
                        No change in performance detected.
```

- `time`: [lower bound, estimate, upper bound] with 95% confidence
- `thrpt`: throughput (operations per second)
- `change`: comparison vs. the previous run (if available)

### Performance Regressions

Criterion flags statistically significant changes in either direction:

- **Performance has regressed**: >5% slower with high confidence
- **Performance has improved**: >5% faster with high confidence

Investigate regressions before merging code changes.

## Adding New Benchmarks

When adding new functionality, include appropriate benchmarks:

```rust
use criterion::{black_box, criterion_group, Criterion, Throughput};

fn bench_new_feature(c: &mut Criterion) {
    let mut group = c.benchmark_group("new_feature");

    // Set throughput for rate-based benchmarks
    group.throughput(Throughput::Elements(1));

    group.bench_function("operation_name", |b| {
        // Setup (not measured)
        let input = prepare_input();

        b.iter(|| {
            // This code is measured
            black_box(your_function(black_box(&input)))
        });
    });

    group.finish();
}

// Add to criterion_group!
criterion_group!(benches, ..., bench_new_feature);
```

## Continuous Integration

Benchmarks should be run in CI on performance-critical PRs:

```yaml
# Example GitHub Actions workflow
benchmark:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Run benchmarks
      run: cargo bench -p linux-hello-daemon -- --noplot
    - name: Store results
      uses: actions/upload-artifact@v4
      with:
        name: benchmark-results
        path: target/criterion/
```

## Troubleshooting

### High Variance in Results

If benchmarks show high variance (wide confidence intervals):

1. Close other applications
2. Disable CPU frequency scaling: `sudo cpupower frequency-set -g performance`
3. Increase the sample size: `group.sample_size(200);`
4. Run on an idle system

### Benchmarks Too Slow

For slow benchmarks, reduce the sample size and measurement time:

```rust
group.sample_size(10); // Default is 100
group.measurement_time(std::time::Duration::from_secs(5));
```

### Memory Issues

If benchmarks fail with OOM or other memory errors:

1. Reduce the iteration count
2. Clean up large allocations in benchmark functions
3. Check for memory leaks with `valgrind`

## License

These benchmarks are part of the Linux Hello project and are released under the GPL-3.0 license.