# Face Recognition ONNX Models
This directory contains ONNX model files for face detection and embedding extraction used by Linux Hello's facial authentication system.
## System Requirements

ONNX support requires glibc 2.38+:

- Ubuntu 24.04 or later
- Fedora 39 or later
- Arch Linux (rolling release)

Check your glibc version:

```sh
ldd --version
```
## Quick Start

```sh
# 1. Download models (see instructions below)
# 2. Place in this directory:
#    models/retinaface.onnx
#    models/mobilefacenet.onnx

# 3. Build with ONNX support
cargo build --release --features onnx

# 4. Test
./target/release/linux-hello detect --image photo.jpg --output detected.jpg
```
## Required Models

### 1. Face Detection Model

**Recommended: RetinaFace**
| Property | Value |
|---|---|
| File | retinaface.onnx |
| Purpose | Face detection with 5-point landmarks |
| Input Shape | [1, 3, 640, 640] (NCHW, RGB) |
| Input Range | [-1, 1] normalized: (pixel - 127.5) / 128.0 |
| Outputs | loc, conf, landm tensors |
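The normalization above can be sketched in Python with NumPy (`preprocess` is an illustrative helper, not part of the project; it assumes the image is already resized and in RGB order):

```python
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    """Turn an HxWx3 RGB uint8 image into the model's input tensor.

    Applies (pixel - 127.5) / 128.0 and reorders HWC -> NCHW.
    """
    x = (image.astype(np.float32) - 127.5) / 128.0  # values in roughly [-1, 1]
    x = np.transpose(x, (2, 0, 1))                  # HWC -> CHW
    return x[np.newaxis, ...]                       # add batch dim -> [1, 3, H, W]

img = np.random.randint(0, 256, (640, 640, 3), dtype=np.uint8)
tensor = preprocess(img)
print(tensor.shape)  # (1, 3, 640, 640)
```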
**Alternative: BlazeFace**
| Property | Value |
|---|---|
| File | blazeface.onnx |
| Purpose | Fast face detection |
| Input Shape | [1, 3, 128, 128] or [1, 3, 256, 256] |
| Use Case | Real-time detection on low-power devices |
### 2. Face Embedding Model

**Recommended: MobileFaceNet**
| Property | Value |
|---|---|
| File | mobilefacenet.onnx |
| Purpose | Face embedding extraction |
| Input Shape | [1, 3, 112, 112] (NCHW, RGB) |
| Input Range | [-1, 1] normalized: (pixel - 127.5) / 128.0 |
| Output Shape | [1, 128] or [1, 512] |
| Output | L2-normalized embedding vector |
**Alternative: ArcFace**
| Property | Value |
|---|---|
| File | arcface.onnx |
| Purpose | High-accuracy face embedding |
| Input Shape | [1, 3, 112, 112] |
| Output Shape | [1, 512] |
| Use Case | Higher accuracy at cost of larger model |
## Download Instructions

### Option 1: From ONNX Model Zoo

```sh
# Face detection model from the ONNX Model Zoo. Note: this URL points to
# UltraFace (version-RFB-640), a lightweight detector, not RetinaFace proper;
# it is saved here under the filename the daemon expects.
wget https://github.com/onnx/models/raw/main/vision/body_analysis/ultraface/models/version-RFB-640.onnx \
    -O retinaface.onnx

# Note: MobileFaceNet may need to be converted from other frameworks
```

### Option 2: From InsightFace

```sh
# Clone the InsightFace model repository
git clone https://github.com/deepinsight/insightface.git
cd insightface/model_zoo

# Download and extract models
# See: https://github.com/deepinsight/insightface/tree/master/model_zoo
```
### Option 3: Convert from PyTorch/TensorFlow

From PyTorch:

```python
import torch
import torch.onnx

# Load your trained model
model = YourFaceModel()
model.load_state_dict(torch.load('model.pth'))
model.eval()

# Export to ONNX
dummy_input = torch.randn(1, 3, 112, 112)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=['input'],
    output_names=['embedding'],
    dynamic_axes={'input': {0: 'batch'}, 'embedding': {0: 'batch'}},
)
```
From TensorFlow:

```sh
pip install tf2onnx
python -m tf2onnx.convert \
    --saved-model ./saved_model \
    --output model.onnx \
    --opset 13
```
## Model Specifications

### RetinaFace Output Format

The RetinaFace model outputs three tensors:

1. `loc` (bounding boxes): `[1, num_anchors, 4]`
   - Format: `[dx, dy, dw, dh]` offsets from anchor boxes
   - Decode: `cx = anchor_cx + dx * 0.1 * anchor_w`
2. `conf` (confidence): `[1, num_anchors, 2]`
   - Format: `[background_score, face_score]`
   - Apply softmax to get probability
3. `landm` (landmarks): `[1, num_anchors, 10]`
   - Format: 5 points x 2 coordinates: `[x0, y0, x1, y1, ..., x4, y4]`
   - Landmark order:
     - 0: Left eye center
     - 1: Right eye center
     - 2: Nose tip
     - 3: Left mouth corner
     - 4: Right mouth corner
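Decoding can be sketched as follows (an illustrative NumPy sketch, assuming the standard SSD-style variances of 0.1 for centers and 0.2 for sizes that RetinaFace implementations commonly use; anchors are in `[cx, cy, w, h]` form):

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def decode_boxes(loc: np.ndarray, anchors: np.ndarray,
                 variances=(0.1, 0.2)) -> np.ndarray:
    """Decode [dx, dy, dw, dh] offsets against [cx, cy, w, h] anchors."""
    cxcy = anchors[:, :2] + loc[:, :2] * variances[0] * anchors[:, 2:]
    wh = anchors[:, 2:] * np.exp(loc[:, 2:] * variances[1])
    return np.hstack([cxcy, wh])  # decoded [cx, cy, w, h]

anchors = np.array([[0.5, 0.5, 0.1, 0.1]])  # one anchor: [cx, cy, w, h]
loc = np.zeros((1, 4))                      # zero offsets leave the anchor unchanged
decoded = decode_boxes(loc, anchors)
probs = softmax(np.array([[0.0, 4.0]]))     # [background, face] logits
print(decoded)       # [[0.5 0.5 0.1 0.1]]
print(probs[0, 1])   # face probability, ~0.982
```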
### Anchor Configuration

RetinaFace uses multi-scale anchors:
| Stride | Feature Map Size (640x640) | Anchor Sizes |
|---|---|---|
| 8 | 80x80 | 16, 32 |
| 16 | 40x40 | 64, 128 |
| 32 | 20x20 | 256, 512 |
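The table fixes the total anchor count for a 640x640 input; a quick sanity check (Python sketch, assuming one anchor per size per feature-map cell):

```python
# Two anchor sizes per cell at each stride (see table above).
strides = {8: [16, 32], 16: [64, 128], 32: [256, 512]}
input_size = 640

total = 0
for stride, sizes in strides.items():
    fm = input_size // stride       # feature map is 80, 40, or 20 cells wide
    total += fm * fm * len(sizes)   # one anchor per size per cell
print(total)  # 16800 anchors in total
```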
### Embedding Normalization

Face embeddings should be L2-normalized for comparison:

```rust
let norm: f32 = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
let normalized: Vec<f32> = embedding.iter().map(|x| x / norm).collect();
```
## Expected File Layout

```
models/
├── README.md           # This file
├── retinaface.onnx     # Face detection model
├── mobilefacenet.onnx  # Face embedding model (128-dim)
├── arcface.onnx        # Alternative embedding model (512-dim, optional)
└── blazeface.onnx      # Alternative detection model (optional)
```
## Testing Models

To verify models work correctly:

```sh
# Run integration tests with models
cd linux-hello-daemon
cargo test --features onnx -- --ignored
```
## Performance Guidelines

### Detection Model Selection
| Model | Input Size | Speed | Accuracy | Memory |
|---|---|---|---|---|
| RetinaFace-MNet0.25 | 640x640 | Fast | Good | ~5MB |
| RetinaFace-R50 | 640x640 | Medium | Excellent | ~100MB |
| BlazeFace | 128x128 | Very Fast | Moderate | ~1MB |
### Embedding Model Selection
| Model | Embedding Dim | Speed | Accuracy | Memory |
|---|---|---|---|---|
| MobileFaceNet | 128 | Fast | Good | ~4MB |
| ArcFace-R50 | 512 | Medium | Excellent | ~120MB |
| ArcFace-R100 | 512 | Slow | Best | ~250MB |
### Recommended Configurations

**Low-power devices (Raspberry Pi, etc.):**
- Detection: BlazeFace 128x128
- Embedding: MobileFaceNet 128-dim
- Expected: ~30ms per frame

**Standard desktop:**
- Detection: RetinaFace-MNet 640x640
- Embedding: MobileFaceNet 128-dim
- Expected: ~15ms per frame

**High-security scenarios:**
- Detection: RetinaFace-R50 640x640
- Embedding: ArcFace-R100 512-dim
- Expected: ~100ms per frame
## License Information

Ensure compliance with model licenses:
| Model | License | Commercial Use |
|---|---|---|
| RetinaFace | MIT | Yes |
| BlazeFace | Apache 2.0 | Yes |
| MobileFaceNet | MIT | Yes |
| ArcFace | MIT | Yes |
| InsightFace models | Non-commercial | Check specific model |
## Troubleshooting

### Model Loading Fails

- Verify the ONNX opset version (opset 11-17 recommended)
- Check that input/output tensor names match those the daemon expects
- Ensure the file is not corrupted:

```sh
python -c "import onnx; onnx.checker.check_model(onnx.load('model.onnx'))"
```
### Poor Detection Results

- Ensure input normalization matches model training
- Check that the image is RGB (not BGR)
- Verify input dimensions match model expectations
- Adjust the confidence threshold (default: 0.5)
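Channel order is a common culprit: OpenCV's `cv2.imread`, for example, returns BGR, which a NumPy slice can flip (illustrative sketch using a synthetic array instead of a real image):

```python
import numpy as np

# Simulate a BGR image whose blue channel is saturated.
bgr = np.zeros((4, 4, 3), dtype=np.uint8)
bgr[..., 0] = 255        # channel 0 is blue in BGR layout
rgb = bgr[..., ::-1]     # reverse the channel axis: BGR -> RGB
print(rgb[0, 0])         # blue is now in the last slot: [  0   0 255]
```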
### Embedding Quality Issues

- Face alignment is critical: ensure landmarks are correct
- Check that the input is 112x112 after alignment
- Verify the embedding is L2-normalized before comparison
- Typical distance threshold: 0.4-0.6 for cosine distance
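The comparison step can be sketched as follows (a minimal NumPy example; the 0.5 threshold is illustrative, picked from the middle of the 0.4-0.6 range above):

```python
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """1 - cosine similarity; embeddings are L2-normalized first."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(1.0 - np.dot(a, b))

e1 = np.random.randn(128)         # stand-in for a stored 128-dim embedding
same = cosine_distance(e1, e1)    # ~0.0 for identical embeddings

threshold = 0.5                   # illustrative; tune within 0.4-0.6
is_match = same < threshold
print(is_match)  # True
```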