# Face Recognition ONNX Models

This directory contains ONNX model files for face detection and embedding extraction used by Linux Hello's facial authentication system.

## System Requirements

**ONNX support requires glibc 2.38+:**
- Ubuntu 24.04 or later
- Fedora 39 or later
- Arch Linux (rolling release)

Check your glibc version:

```bash
ldd --version
```

## Quick Start

```bash
# 1. Download models (see instructions below)
# 2. Place in this directory:
#    models/retinaface.onnx
#    models/mobilefacenet.onnx

# 3. Build with ONNX support
cargo build --release --features onnx

# 4. Test
./target/release/linux-hello detect --image photo.jpg --output detected.jpg
```

## Required Models

### 1. Face Detection Model

**Recommended: RetinaFace**

| Property | Value |
|----------|-------|
| File | `retinaface.onnx` |
| Purpose | Face detection with 5-point landmarks |
| Input Shape | `[1, 3, 640, 640]` (NCHW, RGB) |
| Input Range | `[-1, 1]`, normalized as `(pixel - 127.5) / 128.0` |
| Outputs | `loc`, `conf`, `landm` tensors |
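
Both recommended models share the `(pixel - 127.5) / 128.0` input normalization. A minimal preprocessing sketch; the helper names `normalize_pixel` and `to_nchw` are illustrative, not part of Linux Hello's API:

```rust
/// Map an 8-bit channel value into the [-1, 1] range the models expect:
/// (pixel - 127.5) / 128.0.
fn normalize_pixel(p: u8) -> f32 {
    (p as f32 - 127.5) / 128.0
}

/// Rearrange an interleaved RGB buffer (HWC) into normalized NCHW planes.
fn to_nchw(rgb: &[u8], width: usize, height: usize) -> Vec<f32> {
    let mut planes = vec![0.0f32; 3 * width * height];
    for y in 0..height {
        for x in 0..width {
            for c in 0..3 {
                let src = (y * width + x) * 3 + c;
                let dst = c * width * height + y * width + x;
                planes[dst] = normalize_pixel(rgb[src]);
            }
        }
    }
    planes
}

fn main() {
    // Extremes map to +/- 127.5/128 = +/- 0.99609375.
    assert!((normalize_pixel(0) + 0.996_093_75).abs() < 1e-6);
    assert!((normalize_pixel(255) - 0.996_093_75).abs() < 1e-6);
    // A 2x1 RGB image yields 3 planes of 2 values each.
    assert_eq!(to_nchw(&[10, 20, 30, 40, 50, 60], 2, 1).len(), 6);
}
```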

**Alternative: BlazeFace**

| Property | Value |
|----------|-------|
| File | `blazeface.onnx` |
| Purpose | Fast face detection |
| Input Shape | `[1, 3, 128, 128]` or `[1, 3, 256, 256]` |
| Use Case | Real-time detection on low-power devices |

### 2. Face Embedding Model

**Recommended: MobileFaceNet**

| Property | Value |
|----------|-------|
| File | `mobilefacenet.onnx` |
| Purpose | Face embedding extraction |
| Input Shape | `[1, 3, 112, 112]` (NCHW, RGB) |
| Input Range | `[-1, 1]`, normalized as `(pixel - 127.5) / 128.0` |
| Output Shape | `[1, 128]` or `[1, 512]` |
| Output | L2-normalized embedding vector |

**Alternative: ArcFace**

| Property | Value |
|----------|-------|
| File | `arcface.onnx` |
| Purpose | High-accuracy face embedding |
| Input Shape | `[1, 3, 112, 112]` |
| Output Shape | `[1, 512]` |
| Use Case | Higher accuracy at the cost of a larger model |

## Download Instructions

### Option 1: From ONNX Model Zoo

```bash
# UltraFace RFB-640 (face detection). Note: despite the target file name,
# this Model Zoo file is UltraFace, not RetinaFace, so its outputs do not
# match the RetinaFace output format described below.
wget https://github.com/onnx/models/raw/main/vision/body_analysis/ultraface/models/version-RFB-640.onnx \
  -O retinaface.onnx

# Note: MobileFaceNet may need to be converted from other frameworks
```

### Option 2: From InsightFace

```bash
# Clone the InsightFace repository
git clone https://github.com/deepinsight/insightface.git
cd insightface/model_zoo

# Download and extract models
# See: https://github.com/deepinsight/insightface/tree/master/model_zoo
```

### Option 3: Convert from PyTorch/TensorFlow

**From PyTorch:**

```python
import torch
import torch.onnx

# Load your trained model
model = YourFaceModel()
model.load_state_dict(torch.load('model.pth'))
model.eval()

# Export to ONNX (opset 13 falls in the 11-17 range recommended below)
dummy_input = torch.randn(1, 3, 112, 112)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=['input'],
    output_names=['embedding'],
    dynamic_axes={'input': {0: 'batch'}, 'embedding': {0: 'batch'}},
    opset_version=13,
)
```

**From TensorFlow:**

```bash
pip install tf2onnx

python -m tf2onnx.convert \
  --saved-model ./saved_model \
  --output model.onnx \
  --opset 13
```

## Model Specifications

### RetinaFace Output Format

The RetinaFace model outputs three tensors:

1. **loc** (bounding boxes): `[1, num_anchors, 4]`
   - Format: `[dx, dy, dw, dh]` offsets from anchor boxes
   - Decode: `cx = anchor_cx + dx * 0.1 * anchor_w`
2. **conf** (confidence): `[1, num_anchors, 2]`
   - Format: `[background_score, face_score]`
   - Apply softmax to get a probability
3. **landm** (landmarks): `[1, num_anchors, 10]`
   - Format: 5 points x 2 coordinates, `[x0, y0, x1, y1, ..., x4, y4]`
   - Landmark order:
     - 0: Left eye center
     - 1: Right eye center
     - 2: Nose tip
     - 3: Left mouth corner
     - 4: Right mouth corner
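
Taken together, post-processing one anchor's outputs can be sketched as below. The 0.1 center variance comes from the decode formula above; the 0.2 size variance is the usual RetinaFace training setting, an assumption you should confirm against your exported model:

```rust
/// An anchor box in center form, coordinates normalized to [0, 1].
struct Anchor { cx: f32, cy: f32, w: f32, h: f32 }

/// Softmax over [background_score, face_score] -> face probability.
fn face_probability(background: f32, face: f32) -> f32 {
    let m = background.max(face); // subtract the max for numerical stability
    let eb = (background - m).exp();
    let ef = (face - m).exp();
    ef / (eb + ef)
}

/// Decode one `loc` row [dx, dy, dw, dh] against its anchor into
/// corner form (x0, y0, x1, y1).
fn decode_box(a: &Anchor, loc: [f32; 4]) -> (f32, f32, f32, f32) {
    let cx = a.cx + loc[0] * 0.1 * a.w; // center variance 0.1 (as above)
    let cy = a.cy + loc[1] * 0.1 * a.h;
    let w = a.w * (loc[2] * 0.2).exp(); // size variance 0.2 (assumed)
    let h = a.h * (loc[3] * 0.2).exp();
    (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)
}

fn main() {
    // Equal logits give a face probability of 0.5.
    assert!((face_probability(1.0, 1.0) - 0.5).abs() < 1e-6);
    // Zero offsets reproduce the anchor itself.
    let a = Anchor { cx: 0.5, cy: 0.5, w: 0.1, h: 0.1 };
    let (x0, _, x1, _) = decode_box(&a, [0.0, 0.0, 0.0, 0.0]);
    assert!((x0 - 0.45).abs() < 1e-6 && (x1 - 0.55).abs() < 1e-6);
}
```

Each `landm` point decodes like the box center: offset from the anchor center with the same 0.1 variance.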

### Anchor Configuration

RetinaFace uses multi-scale anchors:

| Stride | Feature Map Size (640x640 input) | Anchor Sizes |
|--------|----------------------------------|--------------|
| 8      | 80x80                            | 16, 32       |
| 16     | 40x40                            | 64, 128      |
| 32     | 20x20                            | 256, 512     |
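
The table translates directly into an anchor generator. A sketch, assuming two square anchors per feature-map cell, centered on the cell and normalized to `[0, 1]` (this mirrors the common RetinaFace reference implementation, not code from this repository):

```rust
/// Generate anchors for a square input, following the stride/size table
/// above. Returns (cx, cy, w, h) tuples with normalized coordinates.
fn generate_anchors(input_size: usize) -> Vec<(f32, f32, f32, f32)> {
    let configs: [(usize, [usize; 2]); 3] =
        [(8, [16, 32]), (16, [64, 128]), (32, [256, 512])];
    let mut anchors = Vec::new();
    for (stride, sizes) in configs {
        let fm = input_size / stride; // feature map side length
        for y in 0..fm {
            for x in 0..fm {
                for size in sizes {
                    let cx = (x as f32 + 0.5) * stride as f32 / input_size as f32;
                    let cy = (y as f32 + 0.5) * stride as f32 / input_size as f32;
                    let s = size as f32 / input_size as f32;
                    anchors.push((cx, cy, s, s));
                }
            }
        }
    }
    anchors
}

fn main() {
    // (80*80 + 40*40 + 20*20) * 2 anchors = 16800 for a 640x640 input.
    assert_eq!(generate_anchors(640).len(), 16800);
}
```

The anchor count (16800 for 640x640) must match the `num_anchors` dimension of the model's `loc`, `conf`, and `landm` outputs.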

### Embedding Normalization

Face embeddings should be L2-normalized for comparison:

```rust
let norm: f32 = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
let normalized: Vec<f32> = embedding.iter().map(|x| x / norm).collect();
```

## Expected File Layout

```
models/
├── README.md           # This file
├── retinaface.onnx     # Face detection model
├── mobilefacenet.onnx  # Face embedding model (128-dim)
├── arcface.onnx        # Alternative embedding model (512-dim, optional)
└── blazeface.onnx      # Alternative detection model (optional)
```

## Testing Models

To verify the models work correctly:

```bash
# Run the integration tests with the models in place
cd linux-hello-daemon
cargo test --features onnx -- --ignored
```

## Performance Guidelines

### Detection Model Selection

| Model | Input Size | Speed | Accuracy | Memory |
|-------|------------|-------|----------|--------|
| RetinaFace-MNet0.25 | 640x640 | Fast | Good | ~5 MB |
| RetinaFace-R50 | 640x640 | Medium | Excellent | ~100 MB |
| BlazeFace | 128x128 | Very fast | Moderate | ~1 MB |

### Embedding Model Selection

| Model | Embedding Dim | Speed | Accuracy | Memory |
|-------|---------------|-------|----------|--------|
| MobileFaceNet | 128 | Fast | Good | ~4 MB |
| ArcFace-R50 | 512 | Medium | Excellent | ~120 MB |
| ArcFace-R100 | 512 | Slow | Best | ~250 MB |

### Recommended Configurations

**Low-power devices (Raspberry Pi, etc.):**
- Detection: BlazeFace, 128x128
- Embedding: MobileFaceNet, 128-dim
- Expected: ~30 ms per frame

**Standard desktop:**
- Detection: RetinaFace-MNet, 640x640
- Embedding: MobileFaceNet, 128-dim
- Expected: ~15 ms per frame

**High-security scenarios:**
- Detection: RetinaFace-R50, 640x640
- Embedding: ArcFace-R100, 512-dim
- Expected: ~100 ms per frame

## License Information

Ensure compliance with the model licenses:

| Model | License | Commercial Use |
|-------|---------|----------------|
| RetinaFace | MIT | Yes |
| BlazeFace | Apache 2.0 | Yes |
| MobileFaceNet | MIT | Yes |
| ArcFace | MIT | Yes |
| InsightFace models | Non-commercial | Check the specific model |

## Troubleshooting

### Model Loading Fails

1. Verify the ONNX opset version (opset 11-17 recommended)
2. Check that the input/output tensor names match those expected above
3. Ensure the file is not corrupted: `python -c "import onnx; onnx.load('model.onnx')"`

### Poor Detection Results

1. Ensure input normalization matches what the model was trained with
2. Check that the image is RGB (not BGR)
3. Verify that the input dimensions match the model's expectations
4. Adjust the confidence threshold (default: 0.5)

### Embedding Quality Issues

1. Face alignment is critical: make sure the landmarks are correct
2. Check that the input is 112x112 after alignment
3. Verify the embedding is L2-normalized before comparison
4. Typical distance threshold: 0.4-0.6 for cosine distance
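
For embeddings that are already L2-normalized, cosine distance reduces to `1 - dot(a, b)`, which makes the threshold check a one-liner. A sketch (`cosine_distance` is an illustrative helper, not part of the daemon's API):

```rust
/// Cosine distance between two L2-normalized embeddings: 1 - dot(a, b).
/// 0.0 means identical direction; lower is more similar.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    1.0 - dot
}

fn main() {
    let same = [1.0, 0.0, 0.0];
    let orthogonal = [0.0, 1.0, 0.0];
    assert_eq!(cosine_distance(&same, &same), 0.0);
    assert_eq!(cosine_distance(&same, &orthogonal), 1.0);
    // With a threshold in the 0.4-0.6 range, distance 1.0 is a non-match.
    let threshold = 0.5;
    assert!(cosine_distance(&same, &orthogonal) > threshold);
}
```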
## References

- [ONNX Model Zoo](https://github.com/onnx/models)
- [InsightFace](https://github.com/deepinsight/insightface)
- [RetinaFace Paper](https://arxiv.org/abs/1905.00641)
- [ArcFace Paper](https://arxiv.org/abs/1801.07698)
- [MobileFaceNet Paper](https://arxiv.org/abs/1804.07573)