Development over

This commit is contained in:
2026-01-15 22:40:51 +01:00
parent 2f6b16d946
commit 1e7f296635
63 changed files with 12945 additions and 331 deletions

@@ -1,41 +1,249 @@
# Face Recognition ONNX Models

This directory contains ONNX model files for face detection and embedding extraction
used by Linux Hello's facial authentication system.
## Required Models

### 1. Face Detection Model

**Recommended: RetinaFace**

| Property | Value |
|----------|-------|
| File | `retinaface.onnx` |
| Purpose | Face detection with 5-point landmarks |
| Input Shape | `[1, 3, 640, 640]` (NCHW, RGB) |
| Input Range | `[-1, 1]` normalized: `(pixel - 127.5) / 128.0` |
| Outputs | `loc`, `conf`, `landm` tensors |
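
As a concrete illustration of the `(pixel - 127.5) / 128.0` normalization, here is a minimal dependency-free preprocessing sketch; the function names are illustrative, and a real pipeline would use numpy or an image crate instead of nested lists:

```python
def normalize_pixel(p: int) -> float:
    """Map an 8-bit channel value into the [-1, 1] range the model expects."""
    return (p - 127.5) / 128.0

def to_nchw(image):
    """Convert an HxW image of (r, g, b) tuples into NCHW layout [1, 3, H, W]."""
    h = len(image)
    w = len(image[0])
    planes = [[[normalize_pixel(image[y][x][c]) for x in range(w)]
               for y in range(h)]
              for c in range(3)]
    return [planes]  # leading batch dimension of 1

# Usage: a single pixel; channel values 0 and 255 land near -1 and +1.
tensor = to_nchw([[(0, 128, 255)]])
```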

**Alternative: BlazeFace**

| Property | Value |
|----------|-------|
| File | `blazeface.onnx` |
| Purpose | Fast face detection |
| Input Shape | `[1, 3, 128, 128]` or `[1, 3, 256, 256]` |
| Use Case | Real-time detection on low-power devices |

### 2. Face Embedding Model

**Recommended: MobileFaceNet**

| Property | Value |
|----------|-------|
| File | `mobilefacenet.onnx` |
| Purpose | Face embedding extraction |
| Input Shape | `[1, 3, 112, 112]` (NCHW, RGB) |
| Input Range | `[-1, 1]` normalized: `(pixel - 127.5) / 128.0` |
| Output Shape | `[1, 128]` or `[1, 512]` |
| Output | L2-normalized embedding vector |

**Alternative: ArcFace**

| Property | Value |
|----------|-------|
| File | `arcface.onnx` |
| Purpose | High-accuracy face embedding |
| Input Shape | `[1, 3, 112, 112]` |
| Output Shape | `[1, 512]` |
| Use Case | Higher accuracy at cost of larger model |

## Download Instructions

### Option 1: From ONNX Model Zoo

```bash
# RetinaFace (face detection)
wget https://github.com/onnx/models/raw/main/vision/body_analysis/ultraface/models/version-RFB-640.onnx \
  -O retinaface.onnx

# Note: MobileFaceNet may need to be converted from other frameworks
```

### Option 2: From InsightFace

```bash
# Clone InsightFace model repository
git clone https://github.com/deepinsight/insightface.git
cd insightface/model_zoo
# Download and extract models
# See: https://github.com/deepinsight/insightface/tree/master/model_zoo
```
### Option 3: Convert from PyTorch/TensorFlow
**From PyTorch:**
```python
import torch
import torch.onnx
# Load your trained model
model = YourFaceModel()
model.load_state_dict(torch.load('model.pth'))
model.eval()
# Export to ONNX
dummy_input = torch.randn(1, 3, 112, 112)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=['input'],
    output_names=['embedding'],
    dynamic_axes={'input': {0: 'batch'}, 'embedding': {0: 'batch'}}
)
```
**From TensorFlow:**
```bash
pip install tf2onnx
python -m tf2onnx.convert \
--saved-model ./saved_model \
--output model.onnx \
--opset 13
```
## Model Specifications
### RetinaFace Output Format
The RetinaFace model outputs three tensors:
1. **loc** (bounding boxes): `[1, num_anchors, 4]`
   - Format: `[dx, dy, dw, dh]` offsets from anchor boxes
   - Decode: `cx = anchor_cx + dx * 0.1 * anchor_w`
2. **conf** (confidence): `[1, num_anchors, 2]`
   - Format: `[background_score, face_score]`
   - Apply softmax to get probability
3. **landm** (landmarks): `[1, num_anchors, 10]`
   - Format: 5 points x 2 coordinates `[x0, y0, x1, y1, ..., x4, y4]`
   - Landmark order:
     - 0: Left eye center
     - 1: Right eye center
     - 2: Nose tip
     - 3: Left mouth corner
     - 4: Right mouth corner
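
A sketch of that decode step in pure Python. The 0.1 center variance comes from the formula above; the 0.2 size variance applied to width and height is the usual SSD/RetinaFace default and is an assumption here — verify it against your exported model:

```python
import math

def decode_box(loc, anchor):
    """Decode one [dx, dy, dw, dh] offset against an anchor (cx, cy, w, h)."""
    dx, dy, dw, dh = loc
    acx, acy, aw, ah = anchor
    cx = acx + dx * 0.1 * aw     # center offsets scaled by variance 0.1
    cy = acy + dy * 0.1 * ah
    w = aw * math.exp(dw * 0.2)  # size offsets scaled by variance 0.2 (assumed)
    h = ah * math.exp(dh * 0.2)
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)  # x1, y1, x2, y2

def face_probability(conf):
    """Softmax over [background_score, face_score]; returns the face probability."""
    bg, face = conf
    m = max(bg, face)
    eb, ef = math.exp(bg - m), math.exp(face - m)
    return ef / (eb + ef)
```

A zero offset decodes to the anchor itself, which is a quick sanity check for the anchor configuration below.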
### Anchor Configuration
RetinaFace uses multi-scale anchors:

| Stride | Feature Map Size (640x640) | Anchor Sizes |
|--------|---------------------------|--------------|
| 8 | 80x80 | 16, 32 |
| 16 | 40x40 | 64, 128 |
| 32 | 20x20 | 256, 512 |
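
The table fixes the anchor count for a 640x640 input: (80² + 40² + 20²) × 2 = 16,800, which is the `num_anchors` dimension of the output tensors. A small sketch of the enumeration (the `x + 0.5` cell-center convention is the common one, assumed here):

```python
def generate_anchor_centers(input_size=640, strides=(8, 16, 32), anchors_per_cell=2):
    """Enumerate normalized anchor centers; each feature-map cell hosts two anchor sizes."""
    anchors = []
    for stride in strides:
        fm = input_size // stride  # feature map is 80, 40, 20 for a 640 input
        for y in range(fm):
            for x in range(fm):
                cx = (x + 0.5) * stride / input_size
                cy = (y + 0.5) * stride / input_size
                for _ in range(anchors_per_cell):
                    anchors.append((cx, cy))
    return anchors
```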
### Embedding Normalization
Face embeddings should be L2-normalized for comparison:
```rust
let norm: f32 = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
let normalized: Vec<f32> = embedding.iter().map(|x| x / norm).collect();
```
## Expected File Layout
```
models/
├── README.md # This file
├── retinaface.onnx # Face detection model
├── mobilefacenet.onnx # Face embedding model (128-dim)
├── arcface.onnx # Alternative embedding model (512-dim, optional)
└── blazeface.onnx # Alternative detection model (optional)
```
## Testing Models
To verify models work correctly:
```bash
# Run integration tests with models
cd linux-hello-daemon
cargo test --features onnx -- --ignored
```
## Performance Guidelines
### Detection Model Selection
| Model | Input Size | Speed | Accuracy | Memory |
|-------|-----------|-------|----------|--------|
| RetinaFace-MNet0.25 | 640x640 | Fast | Good | ~5MB |
| RetinaFace-R50 | 640x640 | Medium | Excellent | ~100MB |
| BlazeFace | 128x128 | Very Fast | Moderate | ~1MB |
### Embedding Model Selection
| Model | Embedding Dim | Speed | Accuracy | Memory |
|-------|--------------|-------|----------|--------|
| MobileFaceNet | 128 | Fast | Good | ~4MB |
| ArcFace-R50 | 512 | Medium | Excellent | ~120MB |
| ArcFace-R100 | 512 | Slow | Best | ~250MB |
### Recommended Configurations
**Low-power devices (Raspberry Pi, etc.):**
- Detection: BlazeFace 128x128
- Embedding: MobileFaceNet 128-dim
- Expected: ~30ms per frame
**Standard desktop:**
- Detection: RetinaFace-MNet 640x640
- Embedding: MobileFaceNet 128-dim
- Expected: ~15ms per frame
**High-security scenarios:**
- Detection: RetinaFace-R50 640x640
- Embedding: ArcFace-R100 512-dim
- Expected: ~100ms per frame
## License Information
Ensure compliance with model licenses:

| Model | License | Commercial Use |
|-------|---------|----------------|
| RetinaFace | MIT | Yes |
| BlazeFace | Apache 2.0 | Yes |
| MobileFaceNet | MIT | Yes |
| ArcFace | MIT | Yes |
| InsightFace models | Non-commercial | Check specific model |
## Troubleshooting
### Model Loading Fails
1. Verify ONNX format version (opset 11-17 recommended)
2. Check input/output tensor names match expected
3. Ensure file is not corrupted: `python -c "import onnx; onnx.load('model.onnx')"`
### Poor Detection Results
1. Ensure input normalization matches model training
2. Check image is RGB (not BGR)
3. Verify input dimensions match model expectations
4. Adjust confidence threshold (default: 0.5)
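
Two of these checks are mechanical; here is a sketch of the channel swap (point 2) and the confidence filter (point 4), with illustrative names:

```python
def bgr_to_rgb_row(row):
    """Swap channel order for a row of (b, g, r) pixels; the models here expect RGB."""
    return [(r, g, b) for (b, g, r) in row]

def filter_detections(detections, threshold=0.5):
    """Keep only detections whose face probability clears the threshold."""
    return [d for d in detections if d["score"] >= threshold]
```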
### Embedding Quality Issues
1. Face alignment is critical - ensure landmarks are correct
2. Check that input is 112x112 after alignment
3. Verify embedding is L2-normalized before comparison
4. Distance threshold typically: 0.4-0.6 for cosine distance
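
The last two points can be sketched together: over L2-normalized vectors, cosine distance reduces to 1 minus the dot product, and the 0.5 threshold below is just an illustrative midpoint of the 0.4-0.6 range above:

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine_distance(a, b):
    """1 - cosine similarity; assumes a and b are already L2-normalized."""
    return 1.0 - sum(x * y for x, y in zip(a, b))

def is_match(a, b, threshold=0.5):
    """Accept when the distance falls below the threshold (0.4-0.6 typical)."""
    return cosine_distance(l2_normalize(a), l2_normalize(b)) < threshold
```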
## References
- [ONNX Model Zoo](https://github.com/onnx/models)
- [InsightFace](https://github.com/deepinsight/insightface)
- [RetinaFace Paper](https://arxiv.org/abs/1905.00641)
- [ArcFace Paper](https://arxiv.org/abs/1801.07698)
- [MobileFaceNet Paper](https://arxiv.org/abs/1804.07573)