# Face Recognition ONNX Models
This directory contains ONNX model files for face detection and embedding extraction
used by Linux Hello's facial authentication system.
## System Requirements
**ONNX support requires glibc 2.38+:**
- Ubuntu 24.04 or later
- Fedora 39 or later
- Arch Linux (rolling release)
Check your glibc version:
```bash
ldd --version
```
## Quick Start
```bash
# 1. Download models (see instructions below)
# 2. Place in this directory:
# models/retinaface.onnx
# models/mobilefacenet.onnx
# 3. Build with ONNX support
cargo build --release --features onnx
# 4. Test
./target/release/linux-hello detect --image photo.jpg --output detected.jpg
```
## Required Models
### 1. Face Detection Model
**Recommended: RetinaFace**
| Property | Value |
|----------|-------|
| File | `retinaface.onnx` |
| Purpose | Face detection with 5-point landmarks |
| Input Shape | `[1, 3, 640, 640]` (NCHW, RGB) |
| Input Range | `[-1, 1]` normalized: `(pixel - 127.5) / 128.0` |
| Outputs | `loc`, `conf`, `landm` tensors |
**Alternative: BlazeFace**
| Property | Value |
|----------|-------|
| File | `blazeface.onnx` |
| Purpose | Fast face detection |
| Input Shape | `[1, 3, 128, 128]` or `[1, 3, 256, 256]` |
| Use Case | Real-time detection on low-power devices |
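Both detection models take NCHW float tensors normalized to `[-1, 1]`, as the tables above specify. A minimal preprocessing sketch, assuming the frame has already been resized to the model's input resolution and is held as interleaved RGB bytes (the `preprocess` helper is illustrative, not part of the linux-hello API):

```rust
/// Convert an RGB image (HWC, u8) into an NCHW f32 tensor normalized
/// to [-1, 1] with (pixel - 127.5) / 128.0, per the tables above.
fn preprocess(rgb: &[u8], width: usize, height: usize) -> Vec<f32> {
    let plane = width * height;
    let mut tensor = vec![0f32; 3 * plane];
    for y in 0..height {
        for x in 0..width {
            let px = (y * width + x) * 3;
            for c in 0..3 {
                // NCHW: one full plane per channel, rows within a plane
                tensor[c * plane + y * width + x] =
                    (rgb[px + c] as f32 - 127.5) / 128.0;
            }
        }
    }
    tensor
}

fn main() {
    // A single white pixel maps to ~0.996 in every channel
    let t = preprocess(&[255, 255, 255], 1, 1);
    println!("{:?}", t);
}
```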
### 2. Face Embedding Model
**Recommended: MobileFaceNet**
| Property | Value |
|----------|-------|
| File | `mobilefacenet.onnx` |
| Purpose | Face embedding extraction |
| Input Shape | `[1, 3, 112, 112]` (NCHW, RGB) |
| Input Range | `[-1, 1]` normalized: `(pixel - 127.5) / 128.0` |
| Output Shape | `[1, 128]` or `[1, 512]` |
| Output | L2-normalized embedding vector |
**Alternative: ArcFace**
| Property | Value |
|----------|-------|
| File | `arcface.onnx` |
| Purpose | High-accuracy face embedding |
| Input Shape | `[1, 3, 112, 112]` |
| Output Shape | `[1, 512]` |
| Use Case | Higher accuracy at cost of larger model |
## Download Instructions
### Option 1: From ONNX Model Zoo
```bash
# Face detection: the ONNX Model Zoo hosts UltraFace (version-RFB-640),
# not RetinaFace proper; note its output format differs from the
# RetinaFace spec documented below
wget https://github.com/onnx/models/raw/main/vision/body_analysis/ultraface/models/version-RFB-640.onnx \
    -O retinaface.onnx

# Note: MobileFaceNet may need to be converted from other frameworks
```
### Option 2: From InsightFace
```bash
# Clone InsightFace model repository
git clone https://github.com/deepinsight/insightface.git
cd insightface/model_zoo
# Download and extract models
# See: https://github.com/deepinsight/insightface/tree/master/model_zoo
```
### Option 3: Convert from PyTorch/TensorFlow
**From PyTorch:**
```python
import torch
import torch.onnx
# Load your trained model
model = YourFaceModel()
model.load_state_dict(torch.load('model.pth'))
model.eval()
# Export to ONNX
dummy_input = torch.randn(1, 3, 112, 112)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=['input'],
    output_names=['embedding'],
    dynamic_axes={'input': {0: 'batch'}, 'embedding': {0: 'batch'}},
)
```
**From TensorFlow:**
```bash
pip install tf2onnx
python -m tf2onnx.convert \
    --saved-model ./saved_model \
    --output model.onnx \
    --opset 13
```
## Model Specifications
### RetinaFace Output Format
The RetinaFace model outputs three tensors:
1. **loc** (bounding boxes): `[1, num_anchors, 4]`
- Format: `[dx, dy, dw, dh]` offsets from anchor boxes
- Decode: `cx = anchor_cx + dx * 0.1 * anchor_w` (center variance 0.1; sizes use `w = anchor_w * exp(dw * 0.2)`)
2. **conf** (confidence): `[1, num_anchors, 2]`
- Format: `[background_score, face_score]`
- Apply softmax to get probability
3. **landm** (landmarks): `[1, num_anchors, 10]`
- Format: 5 points x 2 coordinates `[x0, y0, x1, y1, ..., x4, y4]`
- Landmark order:
- 0: Left eye center
- 1: Right eye center
- 2: Nose tip
- 3: Left mouth corner
- 4: Right mouth corner
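The decode step above can be sketched as follows, assuming the standard RetinaFace variances (0.1 for centers, 0.2 for sizes) and an anchor given as normalized `(cx, cy, w, h)`; the helper names are illustrative:

```rust
/// Decode one anchor's box offsets into corner coordinates, assuming
/// RetinaFace's usual variances of 0.1 (centers) and 0.2 (sizes).
fn decode_box(loc: [f32; 4], anchor: (f32, f32, f32, f32)) -> (f32, f32, f32, f32) {
    let (acx, acy, aw, ah) = anchor;
    let cx = acx + loc[0] * 0.1 * aw;
    let cy = acy + loc[1] * 0.1 * ah;
    let w = aw * (loc[2] * 0.2).exp();
    let h = ah * (loc[3] * 0.2).exp();
    // center/size -> (x_min, y_min, x_max, y_max)
    (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)
}

/// Softmax over the two conf logits; returns the face probability.
fn face_score(background: f32, face: f32) -> f32 {
    let m = background.max(face); // subtract the max for numerical stability
    let eb = (background - m).exp();
    let ef = (face - m).exp();
    ef / (eb + ef)
}

fn main() {
    // Zero offsets reproduce the anchor itself
    println!("{:?}", decode_box([0.0; 4], (0.5, 0.5, 0.2, 0.2)));
    println!("{}", face_score(0.0, 2.0));
}
```

Landmark offsets decode the same way as the box center: each `(x, y)` pair is `anchor_center + offset * 0.1 * anchor_size`.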
### Anchor Configuration
RetinaFace uses multi-scale anchors:
| Stride | Feature Map Size (640x640) | Anchor Sizes |
|--------|---------------------------|--------------|
| 8 | 80x80 | 16, 32 |
| 16 | 40x40 | 64, 128 |
| 32 | 20x20 | 256, 512 |
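The table translates into anchor generation roughly as below: a sketch for the 640x640 input, assuming square anchors centered on each feature-map cell (the usual RetinaFace convention):

```rust
/// Generate anchors for a 640x640 input from the table above:
/// strides 8/16/32, two square anchor sizes per feature-map cell.
/// Returned as (cx, cy, w, h) normalized to [0, 1].
fn generate_anchors() -> Vec<(f32, f32, f32, f32)> {
    const INPUT: f32 = 640.0;
    let configs: [(usize, [f32; 2]); 3] = [
        (8, [16.0, 32.0]),
        (16, [64.0, 128.0]),
        (32, [256.0, 512.0]),
    ];
    let mut anchors = Vec::new();
    for (stride, sizes) in configs {
        let cells = 640 / stride; // 80, 40, or 20 cells per side
        for i in 0..cells {
            for j in 0..cells {
                for size in sizes {
                    // Anchor center sits at the middle of the cell
                    let cx = (j as f32 + 0.5) * stride as f32 / INPUT;
                    let cy = (i as f32 + 0.5) * stride as f32 / INPUT;
                    anchors.push((cx, cy, size / INPUT, size / INPUT));
                }
            }
        }
    }
    anchors
}

fn main() {
    // 2 * (80*80 + 40*40 + 20*20) = 16800, matching num_anchors
    println!("{}", generate_anchors().len());
}
```

The anchor order must match the order the model emits its `loc`/`conf`/`landm` rows in, so verify it against your exported model.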
### Embedding Normalization
Face embeddings should be L2-normalized for comparison:
```rust
// Guard against a zero vector before dividing
let norm: f32 = embedding.iter().map(|x| x * x).sum::<f32>().sqrt().max(f32::EPSILON);
let normalized: Vec<f32> = embedding.iter().map(|x| x / norm).collect();
```
## Expected File Layout
```
models/
├── README.md # This file
├── retinaface.onnx # Face detection model
├── mobilefacenet.onnx # Face embedding model (128-dim)
├── arcface.onnx # Alternative embedding model (512-dim, optional)
└── blazeface.onnx # Alternative detection model (optional)
```
## Testing Models
To verify models work correctly:
```bash
# Run integration tests with models
cd linux-hello-daemon
cargo test --features onnx -- --ignored
```
## Performance Guidelines
### Detection Model Selection
| Model | Input Size | Speed | Accuracy | Memory |
|-------|-----------|-------|----------|--------|
| RetinaFace-MNet0.25 | 640x640 | Fast | Good | ~5MB |
| RetinaFace-R50 | 640x640 | Medium | Excellent | ~100MB |
| BlazeFace | 128x128 | Very Fast | Moderate | ~1MB |
### Embedding Model Selection
| Model | Embedding Dim | Speed | Accuracy | Memory |
|-------|--------------|-------|----------|--------|
| MobileFaceNet | 128 | Fast | Good | ~4MB |
| ArcFace-R50 | 512 | Medium | Excellent | ~120MB |
| ArcFace-R100 | 512 | Slow | Best | ~250MB |
### Recommended Configurations
**Low-power devices (Raspberry Pi, etc.):**
- Detection: BlazeFace 128x128
- Embedding: MobileFaceNet 128-dim
- Expected: ~30ms per frame
**Standard desktop:**
- Detection: RetinaFace-MNet 640x640
- Embedding: MobileFaceNet 128-dim
- Expected: ~15ms per frame
**High-security scenarios:**
- Detection: RetinaFace-R50 640x640
- Embedding: ArcFace-R100 512-dim
- Expected: ~100ms per frame
## License Information
Ensure compliance with model licenses:
| Model | License | Commercial Use |
|-------|---------|----------------|
| RetinaFace | MIT | Yes |
| BlazeFace | Apache 2.0 | Yes |
| MobileFaceNet | MIT | Yes |
| ArcFace | MIT | Yes |
| InsightFace models | Non-commercial | Check specific model |
## Troubleshooting
### Model Loading Fails
1. Verify ONNX format version (opset 11-17 recommended)
2. Check input/output tensor names match expected
3. Ensure file is not corrupted: `python -c "import onnx; onnx.load('model.onnx')"`
### Poor Detection Results
1. Ensure input normalization matches model training
2. Check image is RGB (not BGR)
3. Verify input dimensions match model expectations
4. Adjust confidence threshold (default: 0.5)
### Embedding Quality Issues
1. Face alignment is critical - ensure landmarks are correct
2. Check that input is 112x112 after alignment
3. Verify embedding is L2-normalized before comparison
4. Distance threshold typically: 0.4-0.6 for cosine distance
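The cosine-distance comparison in step 4 can be sketched as below (a hypothetical helper, not part of the linux-hello API); for embeddings that are already L2-normalized it reduces to `1 - dot(a, b)`:

```rust
/// Cosine distance between two embeddings: 1 - cos(angle between them).
/// 0.0 means identical direction, 1.0 means orthogonal.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    // Guard the denominator against zero-length vectors
    1.0 - dot / (na * nb).max(f32::EPSILON)
}

fn main() {
    // Identical embeddings -> distance 0; orthogonal -> distance 1
    println!("{}", cosine_distance(&[1.0, 0.0], &[1.0, 0.0]));
    println!("{}", cosine_distance(&[1.0, 0.0], &[0.0, 1.0]));
}
```

A probe face is accepted as a match when its distance to the enrolled embedding falls below the configured threshold (the 0.4-0.6 range noted above).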
## References
- [ONNX Model Zoo](https://github.com/onnx/models)
- [InsightFace](https://github.com/deepinsight/insightface)
- [RetinaFace Paper](https://arxiv.org/abs/1905.00641)
- [ArcFace Paper](https://arxiv.org/abs/1801.07698)
- [MobileFaceNet Paper](https://arxiv.org/abs/1804.07573)