Development over

This commit is contained in:
2026-01-15 22:40:51 +01:00
parent 2f6b16d946
commit 1e7f296635
63 changed files with 12945 additions and 331 deletions

@@ -1,41 +1,249 @@
# Face Recognition ONNX Models

This directory contains ONNX model files for face detection and embedding extraction
used by Linux Hello's facial authentication system.
## Required Models

### 1. Face Detection Model

**Recommended: RetinaFace**

| Property | Value |
|----------|-------|
| File | `retinaface.onnx` |
| Purpose | Face detection with 5-point landmarks |
| Input Shape | `[1, 3, 640, 640]` (NCHW, RGB) |
| Input Range | `[-1, 1]` normalized: `(pixel - 127.5) / 128.0` |
| Outputs | `loc`, `conf`, `landm` tensors |
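
As a concrete illustration of the `(pixel - 127.5) / 128.0` normalization, here is a minimal dependency-free preprocessing sketch; the function names are illustrative, and a real pipeline would use numpy or an image crate instead of nested lists:

```python
def normalize_pixel(p: int) -> float:
    """Map an 8-bit channel value into the [-1, 1] range the model expects."""
    return (p - 127.5) / 128.0

def to_nchw(image):
    """Convert an HxW image of (r, g, b) tuples into NCHW layout [1, 3, H, W]."""
    h = len(image)
    w = len(image[0])
    planes = [[[normalize_pixel(image[y][x][c]) for x in range(w)]
               for y in range(h)]
              for c in range(3)]
    return [planes]  # leading batch dimension of 1

# Usage: a single pixel; channel values 0 and 255 land near -1 and +1.
tensor = to_nchw([[(0, 128, 255)]])
```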

**Alternative: BlazeFace**

| Property | Value |
|----------|-------|
| File | `blazeface.onnx` |
| Purpose | Fast face detection |
| Input Shape | `[1, 3, 128, 128]` or `[1, 3, 256, 256]` |
| Use Case | Real-time detection on low-power devices |

### 2. Face Embedding Model

**Recommended: MobileFaceNet**

| Property | Value |
|----------|-------|
| File | `mobilefacenet.onnx` |
| Purpose | Face embedding extraction |
| Input Shape | `[1, 3, 112, 112]` (NCHW, RGB) |
| Input Range | `[-1, 1]` normalized: `(pixel - 127.5) / 128.0` |
| Output Shape | `[1, 128]` or `[1, 512]` |
| Output | L2-normalized embedding vector |

**Alternative: ArcFace**

| Property | Value |
|----------|-------|
| File | `arcface.onnx` |
| Purpose | High-accuracy face embedding |
| Input Shape | `[1, 3, 112, 112]` |
| Output Shape | `[1, 512]` |
| Use Case | Higher accuracy at cost of larger model |

## Download Instructions

### Option 1: From ONNX Model Zoo

```bash
# RetinaFace (face detection)
wget https://github.com/onnx/models/raw/main/vision/body_analysis/ultraface/models/version-RFB-640.onnx \
  -O retinaface.onnx

# Note: MobileFaceNet may need to be converted from other frameworks
```

### Option 2: From InsightFace

```bash
# Clone InsightFace model repository
git clone https://github.com/deepinsight/insightface.git
cd insightface/model_zoo
# Download and extract models
# See: https://github.com/deepinsight/insightface/tree/master/model_zoo
```
### Option 3: Convert from PyTorch/TensorFlow
**From PyTorch:**
```python
import torch
import torch.onnx
# Load your trained model
model = YourFaceModel()
model.load_state_dict(torch.load('model.pth'))
model.eval()
# Export to ONNX
dummy_input = torch.randn(1, 3, 112, 112)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=['input'],
    output_names=['embedding'],
    dynamic_axes={'input': {0: 'batch'}, 'embedding': {0: 'batch'}}
)
```
**From TensorFlow:**
```bash
pip install tf2onnx
python -m tf2onnx.convert \
--saved-model ./saved_model \
--output model.onnx \
--opset 13
```
## Model Specifications
### RetinaFace Output Format
The RetinaFace model outputs three tensors:
1. **loc** (bounding boxes): `[1, num_anchors, 4]`
   - Format: `[dx, dy, dw, dh]` offsets from anchor boxes
   - Decode: `cx = anchor_cx + dx * 0.1 * anchor_w`
2. **conf** (confidence): `[1, num_anchors, 2]`
   - Format: `[background_score, face_score]`
   - Apply softmax to get probability
3. **landm** (landmarks): `[1, num_anchors, 10]`
   - Format: 5 points x 2 coordinates `[x0, y0, x1, y1, ..., x4, y4]`
   - Landmark order:
     - 0: Left eye center
     - 1: Right eye center
     - 2: Nose tip
     - 3: Left mouth corner
     - 4: Right mouth corner
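
A sketch of that decode step in pure Python. The 0.1 center variance comes from the formula above; the 0.2 size variance applied to width and height is the usual SSD/RetinaFace default and is an assumption here — verify it against your exported model:

```python
import math

def decode_box(loc, anchor):
    """Decode one [dx, dy, dw, dh] offset against an anchor (cx, cy, w, h)."""
    dx, dy, dw, dh = loc
    acx, acy, aw, ah = anchor
    cx = acx + dx * 0.1 * aw     # center offsets scaled by variance 0.1
    cy = acy + dy * 0.1 * ah
    w = aw * math.exp(dw * 0.2)  # size offsets scaled by variance 0.2 (assumed)
    h = ah * math.exp(dh * 0.2)
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)  # x1, y1, x2, y2

def face_probability(conf):
    """Softmax over [background_score, face_score]; returns the face probability."""
    bg, face = conf
    m = max(bg, face)
    eb, ef = math.exp(bg - m), math.exp(face - m)
    return ef / (eb + ef)
```

A zero offset decodes to the anchor itself, which is a quick sanity check for the anchor configuration below.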
### Anchor Configuration
RetinaFace uses multi-scale anchors:

| Stride | Feature Map Size (640x640) | Anchor Sizes |
|--------|---------------------------|--------------|
| 8 | 80x80 | 16, 32 |
| 16 | 40x40 | 64, 128 |
| 32 | 20x20 | 256, 512 |
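
The table fixes the anchor count for a 640x640 input: (80² + 40² + 20²) × 2 = 16,800, which is the `num_anchors` dimension of the output tensors. A small sketch of the enumeration (the `x + 0.5` cell-center convention is the common one, assumed here):

```python
def generate_anchor_centers(input_size=640, strides=(8, 16, 32), anchors_per_cell=2):
    """Enumerate normalized anchor centers; each feature-map cell hosts two anchor sizes."""
    anchors = []
    for stride in strides:
        fm = input_size // stride  # feature map is 80, 40, 20 for a 640 input
        for y in range(fm):
            for x in range(fm):
                cx = (x + 0.5) * stride / input_size
                cy = (y + 0.5) * stride / input_size
                for _ in range(anchors_per_cell):
                    anchors.append((cx, cy))
    return anchors
```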
### Embedding Normalization
Face embeddings should be L2-normalized for comparison:
```rust
let norm: f32 = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
let normalized: Vec<f32> = embedding.iter().map(|x| x / norm).collect();
```
## Expected File Layout
```
models/
├── README.md # This file
├── retinaface.onnx # Face detection model
├── mobilefacenet.onnx # Face embedding model (128-dim)
├── arcface.onnx # Alternative embedding model (512-dim, optional)
└── blazeface.onnx # Alternative detection model (optional)
```
## Testing Models
To verify models work correctly:
```bash
# Run integration tests with models
cd linux-hello-daemon
cargo test --features onnx -- --ignored
```
## Performance Guidelines
### Detection Model Selection
| Model | Input Size | Speed | Accuracy | Memory |
|-------|-----------|-------|----------|--------|
| RetinaFace-MNet0.25 | 640x640 | Fast | Good | ~5MB |
| RetinaFace-R50 | 640x640 | Medium | Excellent | ~100MB |
| BlazeFace | 128x128 | Very Fast | Moderate | ~1MB |
### Embedding Model Selection
| Model | Embedding Dim | Speed | Accuracy | Memory |
|-------|--------------|-------|----------|--------|
| MobileFaceNet | 128 | Fast | Good | ~4MB |
| ArcFace-R50 | 512 | Medium | Excellent | ~120MB |
| ArcFace-R100 | 512 | Slow | Best | ~250MB |
### Recommended Configurations
**Low-power devices (Raspberry Pi, etc.):**
- Detection: BlazeFace 128x128
- Embedding: MobileFaceNet 128-dim
- Expected: ~30ms per frame
**Standard desktop:**
- Detection: RetinaFace-MNet 640x640
- Embedding: MobileFaceNet 128-dim
- Expected: ~15ms per frame
**High-security scenarios:**
- Detection: RetinaFace-R50 640x640
- Embedding: ArcFace-R100 512-dim
- Expected: ~100ms per frame
## License Information
Ensure compliance with model licenses:

| Model | License | Commercial Use |
|-------|---------|----------------|
| RetinaFace | MIT | Yes |
| BlazeFace | Apache 2.0 | Yes |
| MobileFaceNet | MIT | Yes |
| ArcFace | MIT | Yes |
| InsightFace models | Non-commercial | Check specific model |
## Troubleshooting
### Model Loading Fails
1. Verify ONNX format version (opset 11-17 recommended)
2. Check input/output tensor names match expected
3. Ensure file is not corrupted: `python -c "import onnx; onnx.load('model.onnx')"`
### Poor Detection Results
1. Ensure input normalization matches model training
2. Check image is RGB (not BGR)
3. Verify input dimensions match model expectations
4. Adjust confidence threshold (default: 0.5)
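
Two of these checks are mechanical; here is a sketch of the channel swap (point 2) and the confidence filter (point 4), with illustrative names:

```python
def bgr_to_rgb_row(row):
    """Swap channel order for a row of (b, g, r) pixels; the models here expect RGB."""
    return [(r, g, b) for (b, g, r) in row]

def filter_detections(detections, threshold=0.5):
    """Keep only detections whose face probability clears the threshold."""
    return [d for d in detections if d["score"] >= threshold]
```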
### Embedding Quality Issues
1. Face alignment is critical - ensure landmarks are correct
2. Check that input is 112x112 after alignment
3. Verify embedding is L2-normalized before comparison
4. Distance threshold typically: 0.4-0.6 for cosine distance
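
The last two points can be sketched together: over L2-normalized vectors, cosine distance reduces to 1 minus the dot product, and the 0.5 threshold below is just an illustrative midpoint of the 0.4-0.6 range above:

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine_distance(a, b):
    """1 - cosine similarity; assumes a and b are already L2-normalized."""
    return 1.0 - sum(x * y for x, y in zip(a, b))

def is_match(a, b, threshold=0.5):
    """Accept when the distance falls below the threshold (0.4-0.6 typical)."""
    return cosine_distance(l2_normalize(a), l2_normalize(b)) < threshold
```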
## References
- [ONNX Model Zoo](https://github.com/onnx/models)
- [InsightFace](https://github.com/deepinsight/insightface)
- [RetinaFace Paper](https://arxiv.org/abs/1905.00641)
- [ArcFace Paper](https://arxiv.org/abs/1801.07698)
- [MobileFaceNet Paper](https://arxiv.org/abs/1804.07573)