# Face Recognition ONNX Models
This directory contains ONNX model files for face detection and embedding extraction used by Linux Hello's facial authentication system.
## System Requirements

ONNX support requires glibc 2.38+:

- Ubuntu 24.04 or later
- Fedora 39 or later
- Arch Linux (rolling release)

Check your glibc version:

```bash
ldd --version
```
## Quick Start

```bash
# 1. Download models (see instructions below)
# 2. Place in this directory:
#      models/retinaface.onnx
#      models/mobilefacenet.onnx

# 3. Build with ONNX support
cargo build --release --features onnx

# 4. Test
./target/release/linux-hello detect --image photo.jpg --output detected.jpg
```
## Required Models

### 1. Face Detection Model

**Recommended: RetinaFace**

| Property | Value |
|---|---|
| File | retinaface.onnx |
| Purpose | Face detection with 5-point landmarks |
| Input Shape | [1, 3, 640, 640] (NCHW, RGB) |
| Input Range | [-1, 1] normalized: (pixel - 127.5) / 128.0 |
| Outputs | loc, conf, landm tensors |
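Both models in this document expect the same `(pixel - 127.5) / 128.0` normalization and a planar NCHW layout. A minimal sketch of that preprocessing (function names are illustrative, not from the Linux Hello codebase):

```rust
/// Map an 8-bit pixel value into the [-1, 1] range expected by
/// RetinaFace and MobileFaceNet: (pixel - 127.5) / 128.0.
fn normalize_pixel(p: u8) -> f32 {
    (p as f32 - 127.5) / 128.0
}

/// Pack an interleaved RGB image (HWC, u8) into a planar NCHW f32
/// tensor of shape [1, 3, height, width], normalizing on the way.
fn to_nchw(rgb: &[u8], width: usize, height: usize) -> Vec<f32> {
    let mut out = vec![0.0f32; 3 * width * height];
    for y in 0..height {
        for x in 0..width {
            for c in 0..3 {
                let src = (y * width + x) * 3 + c;            // interleaved HWC
                let dst = c * width * height + y * width + x; // planar CHW
                out[dst] = normalize_pixel(rgb[src]);
            }
        }
    }
    out
}
```

Resizing to the model's input resolution (640x640 for RetinaFace, 112x112 for MobileFaceNet) happens before this step.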

**Alternative: BlazeFace**

| Property | Value |
|---|---|
| File | blazeface.onnx |
| Purpose | Fast face detection |
| Input Shape | [1, 3, 128, 128] or [1, 3, 256, 256] |
| Use Case | Real-time detection on low-power devices |
### 2. Face Embedding Model

**Recommended: MobileFaceNet**

| Property | Value |
|---|---|
| File | mobilefacenet.onnx |
| Purpose | Face embedding extraction |
| Input Shape | [1, 3, 112, 112] (NCHW, RGB) |
| Input Range | [-1, 1] normalized: (pixel - 127.5) / 128.0 |
| Output Shape | [1, 128] or [1, 512] |
| Output | L2-normalized embedding vector |

**Alternative: ArcFace**

| Property | Value |
|---|---|
| File | arcface.onnx |
| Purpose | High-accuracy face embedding |
| Input Shape | [1, 3, 112, 112] |
| Output Shape | [1, 512] |
| Use Case | Higher accuracy at cost of larger model |
## Download Instructions

### Option 1: From ONNX Model Zoo

```bash
# Lightweight face detector from the ONNX Model Zoo. Note: this is
# UltraFace (version-RFB-640), not RetinaFace, and its output format
# differs from the RetinaFace format described below.
wget https://github.com/onnx/models/raw/main/vision/body_analysis/ultraface/models/version-RFB-640.onnx \
  -O retinaface.onnx

# Note: MobileFaceNet may need to be converted from other frameworks
```
### Option 2: From InsightFace

```bash
# Clone the InsightFace model repository
git clone https://github.com/deepinsight/insightface.git
cd insightface/model_zoo

# Download and extract models
# See: https://github.com/deepinsight/insightface/tree/master/model_zoo
```
### Option 3: Convert from PyTorch/TensorFlow

From PyTorch:
```python
import torch
import torch.onnx

# Load your trained model
model = YourFaceModel()
model.load_state_dict(torch.load('model.pth'))
model.eval()

# Export to ONNX
dummy_input = torch.randn(1, 3, 112, 112)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=['input'],
    output_names=['embedding'],
    dynamic_axes={'input': {0: 'batch'}, 'embedding': {0: 'batch'}},
)
```
From TensorFlow:

```bash
pip install tf2onnx
python -m tf2onnx.convert \
  --saved-model ./saved_model \
  --output model.onnx \
  --opset 13
```
## Model Specifications

### RetinaFace Output Format
The RetinaFace model outputs three tensors:
- `loc` (bounding boxes): `[1, num_anchors, 4]`
  - Format: `[dx, dy, dw, dh]` offsets from anchor boxes
  - Decode: `cx = anchor_cx + dx * 0.1 * anchor_w`
- `conf` (confidence): `[1, num_anchors, 2]`
  - Format: `[background_score, face_score]`
  - Apply softmax to get the face probability
- `landm` (landmarks): `[1, num_anchors, 10]`
  - Format: 5 points x 2 coordinates, `[x0, y0, x1, y1, ..., x4, y4]`
  - Landmark order:
    - 0: Left eye center
    - 1: Right eye center
    - 2: Nose tip
    - 3: Left mouth corner
    - 4: Right mouth corner
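The decode step above can be sketched end to end. This sketch assumes the standard RetinaFace variances of 0.1 for centers and 0.2 for sizes (only the center term is given above; the 0.2 size variance is an assumption), and all names are illustrative rather than taken from the codebase:

```rust
/// One anchor box in center form (all values normalized to [0, 1]).
struct Anchor { cx: f32, cy: f32, w: f32, h: f32 }

/// Decode a [dx, dy, dw, dh] offset against its anchor, assuming the
/// standard RetinaFace variances of 0.1 (center) and 0.2 (size).
/// Returns the box in corner form (x_min, y_min, x_max, y_max).
fn decode_box(a: &Anchor, d: [f32; 4]) -> (f32, f32, f32, f32) {
    let cx = a.cx + d[0] * 0.1 * a.w;
    let cy = a.cy + d[1] * 0.1 * a.h;
    let w = a.w * (d[2] * 0.2).exp();
    let h = a.h * (d[3] * 0.2).exp();
    (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)
}

/// Two-class softmax over [background_score, face_score];
/// returns the face probability.
fn face_probability(background: f32, face: f32) -> f32 {
    // Subtract the max for numerical stability before exponentiating.
    let m = background.max(face);
    let eb = (background - m).exp();
    let ef = (face - m).exp();
    ef / (eb + ef)
}
```

A zero offset decodes back to the anchor itself, which is a convenient sanity check when wiring up a new model.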
### Anchor Configuration
RetinaFace uses multi-scale anchors:
| Stride | Feature Map Size (640x640) | Anchor Sizes |
|---|---|---|
| 8 | 80x80 | 16, 32 |
| 16 | 40x40 | 64, 128 |
| 32 | 20x20 | 256, 512 |
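As a sanity check on the table above, the `num_anchors` dimension of the output tensors can be derived from the strides: each stride `s` yields a `(640/s) x (640/s)` feature map with 2 anchor sizes per cell. A quick illustrative helper:

```rust
/// Total anchor count for a square input: each stride yields a
/// (input/stride)^2 feature map with `anchors_per_cell` anchors.
fn total_anchors(input: usize, strides: &[usize], anchors_per_cell: usize) -> usize {
    strides
        .iter()
        .map(|s| {
            let fm = input / s; // feature map side length
            fm * fm * anchors_per_cell
        })
        .sum()
}
```

For a 640x640 input with strides 8/16/32 this gives 16800 anchors, which should match the `num_anchors` dimension of the `loc`, `conf`, and `landm` tensors.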
### Embedding Normalization

Face embeddings should be L2-normalized for comparison:

```rust
let norm: f32 = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
let normalized: Vec<f32> = embedding.iter().map(|x| x / norm).collect();
```
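Once both embeddings are L2-normalized, cosine similarity reduces to a plain dot product (a small illustrative helper, not part of the Linux Hello API):

```rust
/// Cosine similarity between two L2-normalized embeddings.
/// For unit-length vectors this is simply the dot product;
/// cosine distance is 1.0 minus this value.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

Identical faces score close to 1.0; unrelated embeddings fall toward 0.0.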
## Expected File Layout

```
models/
├── README.md           # This file
├── retinaface.onnx     # Face detection model
├── mobilefacenet.onnx  # Face embedding model (128-dim)
├── arcface.onnx        # Alternative embedding model (512-dim, optional)
└── blazeface.onnx      # Alternative detection model (optional)
```
## Testing Models

To verify models work correctly:

```bash
# Run integration tests with models
cd linux-hello-daemon
cargo test --features onnx -- --ignored
```
## Performance Guidelines

### Detection Model Selection
| Model | Input Size | Speed | Accuracy | Memory |
|---|---|---|---|---|
| RetinaFace-MNet0.25 | 640x640 | Fast | Good | ~5MB |
| RetinaFace-R50 | 640x640 | Medium | Excellent | ~100MB |
| BlazeFace | 128x128 | Very Fast | Moderate | ~1MB |
### Embedding Model Selection
| Model | Embedding Dim | Speed | Accuracy | Memory |
|---|---|---|---|---|
| MobileFaceNet | 128 | Fast | Good | ~4MB |
| ArcFace-R50 | 512 | Medium | Excellent | ~120MB |
| ArcFace-R100 | 512 | Slow | Best | ~250MB |
### Recommended Configurations

**Low-power devices (Raspberry Pi, etc.):**
- Detection: BlazeFace 128x128
- Embedding: MobileFaceNet 128-dim
- Expected: ~30ms per frame
**Standard desktop:**
- Detection: RetinaFace-MNet 640x640
- Embedding: MobileFaceNet 128-dim
- Expected: ~15ms per frame
**High-security scenarios:**
- Detection: RetinaFace-R50 640x640
- Embedding: ArcFace-R100 512-dim
- Expected: ~100ms per frame
## License Information
Ensure compliance with model licenses:
| Model | License | Commercial Use |
|---|---|---|
| RetinaFace | MIT | Yes |
| BlazeFace | Apache 2.0 | Yes |
| MobileFaceNet | MIT | Yes |
| ArcFace | MIT | Yes |
| InsightFace models | Non-commercial | Check specific model |
## Troubleshooting

### Model Loading Fails

- Verify the ONNX format version (opset 11-17 recommended)
- Check that input/output tensor names match what the loader expects
- Ensure the file is not corrupted:

```bash
python -c "import onnx; onnx.checker.check_model(onnx.load('model.onnx'))"
```
### Poor Detection Results
- Ensure input normalization matches model training
- Check image is RGB (not BGR)
- Verify input dimensions match model expectations
- Adjust confidence threshold (default: 0.5)
### Embedding Quality Issues
- Face alignment is critical - ensure landmarks are correct
- Check that input is 112x112 after alignment
- Verify embedding is L2-normalized before comparison
- Distance threshold typically: 0.4-0.6 for cosine distance
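A quick way to confirm the normalization point above is to check that the embedding's norm is ~1.0 before computing distances (illustrative helper, not part of the Linux Hello API):

```rust
/// Returns true if the embedding's L2 norm is within `tol` of 1.0,
/// i.e. the vector has actually been normalized before comparison.
fn is_l2_normalized(embedding: &[f32], tol: f32) -> bool {
    let norm: f32 = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
    (norm - 1.0).abs() <= tol
}
```

An un-normalized embedding silently skews cosine-distance thresholds, so this check is cheap insurance during enrollment debugging.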