# Linux Hello Performance Benchmarks

This document describes the performance benchmarks for the Linux Hello face authentication system. These benchmarks are used for optimization and regression testing.

## Overview

The benchmark suite measures performance of critical authentication pipeline components:

| Component | Description | Target Metric |
|-----------|-------------|---------------|
| Face Detection | Locate faces in camera frames | Frames per second |
| Embedding Extraction | Extract facial features | Embeddings per second |
| Template Matching | Compare embeddings (cosine similarity) | Comparisons per second |
| Anti-Spoofing | Liveness detection pipeline | Latency (ms) |
| Encryption/Decryption | AES-256-GCM operations | Throughput (MB/s) |
| Secure Memory | Allocation, zeroization, constant-time ops | Overhead (ns) |

## Performance Goals

For a responsive authentication experience, the total authentication time should be under 100ms. This breaks down as follows:

| Stage | Target | Notes |
|-------|--------|-------|
| Frame Capture | <33ms | 30 FPS minimum |
| Face Detection | <20ms | Per frame |
| Embedding Extraction | <30ms | Per detected face |
| Anti-Spoofing (per frame) | <15ms | Single frame analysis |
| Template Matching | <5ms | Against up to 100 templates |
| Encryption Round-trip | <10ms | For template storage/retrieval |
| Total Pipeline | <100ms | Single-frame authentication |

### Additional Targets

- Multi-frame anti-spoofing: <150ms for 10-frame temporal analysis
- Secure memory operations: <1% overhead vs. non-secure operations
- Constant-time comparisons: timing variance <1% between match/no-match

## Running Benchmarks

### Prerequisites

Ensure you have Rust 1.75+ installed, then build the project:

```bash
cd linux-hello
cargo build --release -p linux-hello-daemon
```

### Run All Benchmarks

```bash
cargo bench -p linux-hello-daemon
```

### Run Specific Benchmark Groups

```bash
# Face detection only
cargo bench -p linux-hello-daemon -- face_detection

# Template matching
cargo bench -p linux-hello-daemon -- template_matching

# Encryption
cargo bench -p linux-hello-daemon -- encryption

# Secure memory operations
cargo bench -p linux-hello-daemon -- secure_memory

# Full authentication pipeline
cargo bench -p linux-hello-daemon -- full_pipeline
```

### Generate HTML Reports

Criterion automatically generates HTML reports in `target/criterion/`. Open `target/criterion/report/index.html` in a browser to view detailed results with graphs.

```bash
# After running benchmarks
firefox target/criterion/report/index.html
```

### Compare Against Baseline

To track regressions, save a baseline and compare against it:

```bash
# Save current results as baseline
cargo bench -p linux-hello-daemon -- --save-baseline main

# After changes, compare against baseline
cargo bench -p linux-hello-daemon -- --baseline main
```

## Benchmark Descriptions

### Face Detection (`face_detection`)

Tests the face detection algorithms at common camera resolutions:

- QVGA (320x240)
- VGA (640x480)
- 720p (1280x720)
- 1080p (1920x1080)

**What it measures:**

- `simple_detection`: basic placeholder algorithm
- `detector_trait`: full `FaceDetect` trait implementation

### Embedding Extraction (`embedding_extraction`)

Tests embedding generation at various input sizes and output dimensions:

- Face sizes: 64x64, 112x112, 160x160, 224x224
- Embedding dimensions: 64, 128, 256, 512

**What it measures:**

- Time to extract a normalized embedding vector from a face region

### Template Matching (`template_matching`)

Tests comparison operations:

- Cosine similarity at different dimensions
- Euclidean distance calculations
- Matching against databases of 1-100 templates

**What it measures:**

- Single-comparison latency
- Throughput when matching against template databases

### Anti-Spoofing (`anti_spoofing`)

Tests liveness detection components:

- Single-frame IR/depth/texture analysis
- Full temporal pipeline (10 frames with movement/blink detection)

**What it measures:**

- Per-frame analysis latency
- Full-pipeline latency for multi-frame analysis

### Encryption (`encryption`)

Tests the AES-256-GCM encryption used for template storage:

- Encrypt/decrypt at various data sizes
- Round-trip (encrypt then decrypt)
- PBKDF2 key derivation overhead

**What it measures:**

- Throughput (bytes/second)
- Latency for template-sized data

### Secure Memory (`secure_memory`)

Tests security-critical memory operations:

- `SecureEmbedding` creation (with memory locking)
- Constant-time cosine similarity
- Secure byte comparison (`SecureBytes`)
- Memory zeroization

**What it measures:**

- Overhead vs. non-secure operations
- Timing consistency (for constant-time operations)

### Full Pipeline (`full_pipeline`)

Tests complete authentication flows:

- `auth_pipeline_no_crypto`: detection + extraction + matching
- `auth_pipeline_with_antispoofing`: full pipeline with liveness checks

**What it measures:**

- End-to-end authentication latency

## Reference Results

Expected results on reference hardware (AMD Ryzen 7 5800X, 32GB RAM):

| Benchmark | Expected Time | Notes |
|-----------|---------------|-------|
| Face detection (VGA) | ~50 us | Placeholder algorithm |
| Embedding extraction (112x112) | ~100 us | Placeholder algorithm |
| Cosine similarity (128-dim) | ~500 ns | SIMD-optimized |
| Template matching (5 templates) | ~3 us | Linear scan |
| Anti-spoofing (single frame) | ~2 ms | VGA resolution |
| AES-GCM encrypt (512 bytes) | ~20 us | With PBKDF2 |
| Secure memory zero (1KB) | ~500 ns | Volatile writes |
| Constant-time eq (256 bytes) | ~300 ns | Using `subtle` crate |
| Full pipeline (no crypto) | ~200 us | Detection + match |
| Full pipeline (with anti-spoof) | ~2.5 ms | Complete auth |

> **Note:** Production performance with ONNX models will differ significantly. These benchmarks use placeholder algorithms to exercise the benchmarking infrastructure.

## Interpreting Results

### Understanding Criterion Output

```text
template_matching/cosine_similarity/128
                        time:   [487.23 ns 489.12 ns 491.34 ns]
                        thrpt:  [2.0353 Melem/s 2.0444 Melem/s 2.0523 Melem/s]
                 change: [-1.2% +0.3% +1.8%] (p = 0.72 > 0.05)
                        No change in performance detected.
```

- `time`: [lower bound, estimate, upper bound] with 95% confidence
- `thrpt`: throughput (operations per second)
- `change`: comparison vs. the previous run (if available)

### Performance Regressions

Criterion flags statistically significant changes in either direction:

- **Performance has regressed**: >5% slower with high confidence
- **Performance has improved**: >5% faster with high confidence

Investigate regressions before merging code changes.

## Adding New Benchmarks

When adding new functionality, include appropriate benchmarks:

```rust
use criterion::{black_box, criterion_group, Criterion, Throughput};

fn bench_new_feature(c: &mut Criterion) {
    let mut group = c.benchmark_group("new_feature");

    // Set throughput for rate-based benchmarks
    group.throughput(Throughput::Elements(1));

    group.bench_function("operation_name", |b| {
        // Setup (not measured)
        let input = prepare_input();

        b.iter(|| {
            // This code is measured
            black_box(your_function(black_box(&input)))
        });
    });

    group.finish();
}

// Add to criterion_group!
criterion_group!(benches, ..., bench_new_feature);
```

## Continuous Integration

Benchmarks should be run in CI on performance-critical PRs:

```yaml
# Example GitHub Actions workflow
benchmark:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Run benchmarks
      run: cargo bench -p linux-hello-daemon -- --noplot
    - name: Store results
      uses: actions/upload-artifact@v4
      with:
        name: benchmark-results
        path: target/criterion/
```

## Troubleshooting

### High Variance in Results

If benchmarks show high variance (wide confidence intervals):

1. Close other applications
2. Disable CPU frequency scaling: `sudo cpupower frequency-set -g performance`
3. Increase the sample size: `group.sample_size(200);`
4. Run on an idle system

### Benchmarks Too Slow

For slow benchmarks, reduce the sample size and measurement time:

```rust
group.sample_size(10); // Default is 100
group.measurement_time(std::time::Duration::from_secs(5));
```

### Memory Issues

If benchmarks fail with OOM or other memory errors:

1. Reduce the iteration count
2. Clean up large allocations in benchmark functions
3. Check for memory leaks with `valgrind`

## License

These benchmarks are part of the Linux Hello project and are released under the GPL-3.0 license.