# Face Recognition ONNX Models
This directory contains ONNX model files for face detection and embedding extraction
used by Linux Hello's facial authentication system.
## Required Models
### 1. Face Detection Model
**Recommended: RetinaFace**
| Property | Value |
|----------|-------|
| File | `retinaface.onnx` |
| Purpose | Face detection with 5-point landmarks |
| Input Shape | `[1, 3, 640, 640]` (NCHW, RGB) |
| Input Range | `[-1, 1]` normalized: `(pixel - 127.5) / 128.0` |
| Outputs | `loc`, `conf`, `landm` tensors |
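As an illustrative sketch of the preprocessing these values imply (assuming the frame has already been resized to 640x640; `preprocess` is a hypothetical helper, not part of Linux Hello):

```python
import numpy as np

def preprocess(rgb: np.ndarray) -> np.ndarray:
    """Map an HxWx3 uint8 RGB image to a [-1, 1] NCHW float tensor."""
    x = (rgb.astype(np.float32) - 127.5) / 128.0  # pixel range -> ~[-1, 1]
    x = np.transpose(x, (2, 0, 1))                # HWC -> CHW
    return np.expand_dims(x, axis=0)              # add batch dim -> [1, 3, H, W]

frame = np.zeros((640, 640, 3), dtype=np.uint8)   # dummy black frame
tensor = preprocess(frame)                        # shape (1, 3, 640, 640)
```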
**Alternative: BlazeFace**
| Property | Value |
|----------|-------|
| File | `blazeface.onnx` |
| Purpose | Fast face detection |
| Input Shape | `[1, 3, 128, 128]` or `[1, 3, 256, 256]` |
| Use Case | Real-time detection on low-power devices |
### 2. Face Embedding Model
**Recommended: MobileFaceNet**
| Property | Value |
|----------|-------|
| File | `mobilefacenet.onnx` |
| Purpose | Face embedding extraction |
| Input Shape | `[1, 3, 112, 112]` (NCHW, RGB) |
| Input Range | `[-1, 1]` normalized: `(pixel - 127.5) / 128.0` |
| Output Shape | `[1, 128]` or `[1, 512]` |
| Output | L2-normalized embedding vector |
**Alternative: ArcFace**
| Property | Value |
|----------|-------|
| File | `arcface.onnx` |
| Purpose | High-accuracy face embedding |
| Input Shape | `[1, 3, 112, 112]` |
| Output Shape | `[1, 512]` |
| Use Case | Higher accuracy at the cost of a larger model |
## Download Instructions
### Option 1: From ONNX Model Zoo
```bash
# Face detection model (UltraFace version-RFB-640; saved as retinaface.onnx)
wget https://github.com/onnx/models/raw/main/vision/body_analysis/ultraface/models/version-RFB-640.onnx \
  -O retinaface.onnx

# Note: MobileFaceNet may need to be converted from other frameworks
```
### Option 2: From InsightFace
```bash
# Clone InsightFace model repository
git clone https://github.com/deepinsight/insightface.git
cd insightface/model_zoo

# Download and extract models
# See: https://github.com/deepinsight/insightface/tree/master/model_zoo
```
### Option 3: Convert from PyTorch/TensorFlow
**From PyTorch:**
```python
import torch
import torch.onnx

# Load your trained model
model = YourFaceModel()
model.load_state_dict(torch.load('model.pth'))
model.eval()

# Export to ONNX
dummy_input = torch.randn(1, 3, 112, 112)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=['input'],
    output_names=['embedding'],
    dynamic_axes={'input': {0: 'batch'}, 'embedding': {0: 'batch'}}
)
```
**From TensorFlow:**
```bash
pip install tf2onnx

python -m tf2onnx.convert \
  --saved-model ./saved_model \
  --output model.onnx \
  --opset 13
```
## Model Specifications
### RetinaFace Output Format
The RetinaFace model outputs three tensors:
1. **loc** (bounding boxes): `[1, num_anchors, 4]`
   - Format: `[dx, dy, dw, dh]` offsets from anchor boxes
   - Decode: `cx = anchor_cx + dx * 0.1 * anchor_w`

2. **conf** (confidence): `[1, num_anchors, 2]`
   - Format: `[background_score, face_score]`
   - Apply softmax to get a probability

3. **landm** (landmarks): `[1, num_anchors, 10]`
   - Format: 5 points x 2 coordinates `[x0, y0, x1, y1, ..., x4, y4]`
   - Landmark order:
     - 0: Left eye center
     - 1: Right eye center
     - 2: Nose tip
     - 3: Left mouth corner
     - 4: Right mouth corner
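A minimal numeric sketch of the decode and softmax steps (pure Python, illustrative only; the `0.1` factor matches the center decode rule above, and real implementations decode all four box terms plus landmarks):

```python
import math

def decode_cx(anchor_cx: float, anchor_w: float, dx: float) -> float:
    # Center-x decode rule from above: cx = anchor_cx + dx * 0.1 * anchor_w
    return anchor_cx + dx * 0.1 * anchor_w

def face_probability(background_score: float, face_score: float) -> float:
    # Softmax over the two confidence logits -> probability of "face"
    e_bg, e_face = math.exp(background_score), math.exp(face_score)
    return e_face / (e_bg + e_face)

cx = decode_cx(anchor_cx=0.5, anchor_w=0.1, dx=0.2)  # 0.5 + 0.2*0.1*0.1 = 0.502
prob = face_probability(background_score=0.0, face_score=2.0)
```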
### Anchor Configuration
RetinaFace uses multi-scale anchors:
| Stride | Feature Map Size (640x640) | Anchor Sizes |
|--------|----------------------------|--------------|
| 8 | 80x80 | 16, 32 |
| 16 | 40x40 | 64, 128 |
| 32 | 20x20 | 256, 512 |
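With two anchor sizes per cell at each stride, the table implies 16,800 anchors total for a 640x640 input. A quick sketch of the count (illustrative helper, not Linux Hello code):

```python
def total_anchors(input_size: int = 640) -> int:
    # stride -> anchor sizes at that stride (from the table above)
    anchor_sizes = {8: (16, 32), 16: (64, 128), 32: (256, 512)}
    total = 0
    for stride, sizes in anchor_sizes.items():
        cells = (input_size // stride) ** 2   # e.g. 80*80 cells at stride 8
        total += cells * len(sizes)           # one anchor per size per cell
    return total

# 80*80*2 + 40*40*2 + 20*20*2 = 16800
```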
### Embedding Normalization
Face embeddings should be L2-normalized for comparison:
```rust
let norm: f32 = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
let normalized: Vec<f32> = embedding.iter().map(|x| x / norm).collect();
```
## Expected File Layout
```
models/
├── README.md            # This file
├── retinaface.onnx      # Face detection model
├── mobilefacenet.onnx   # Face embedding model (128-dim)
├── arcface.onnx         # Alternative embedding model (512-dim, optional)
└── blazeface.onnx       # Alternative detection model (optional)
```
## Testing Models
To verify models work correctly:
```bash
# Run integration tests with models
cd linux-hello-daemon
cargo test --features onnx -- --ignored
```
## Performance Guidelines
### Detection Model Selection
| Model | Input Size | Speed | Accuracy | Memory |
|-------|------------|-------|----------|--------|
| RetinaFace-MNet0.25 | 640x640 | Fast | Good | ~5MB |
| RetinaFace-R50 | 640x640 | Medium | Excellent | ~100MB |
| BlazeFace | 128x128 | Very Fast | Moderate | ~1MB |
### Embedding Model Selection
| Model | Embedding Dim | Speed | Accuracy | Memory |
|-------|---------------|-------|----------|--------|
| MobileFaceNet | 128 | Fast | Good | ~4MB |
| ArcFace-R50 | 512 | Medium | Excellent | ~120MB |
| ArcFace-R100 | 512 | Slow | Best | ~250MB |
### Recommended Configurations
**Low-power devices (Raspberry Pi, etc.):**
- Detection: BlazeFace 128x128
- Embedding: MobileFaceNet 128-dim
- Expected: ~30ms per frame

**Standard desktop:**
- Detection: RetinaFace-MNet 640x640
- Embedding: MobileFaceNet 128-dim
- Expected: ~15ms per frame

**High-security scenarios:**
- Detection: RetinaFace-R50 640x640
- Embedding: ArcFace-R100 512-dim
- Expected: ~100ms per frame
## License Information
Ensure compliance with model licenses:
| Model | License | Commercial Use |
|-------|---------|----------------|
| RetinaFace | MIT | Yes |
| BlazeFace | Apache 2.0 | Yes |
| MobileFaceNet | MIT | Yes |
| ArcFace | MIT | Yes |
| InsightFace models | Non-commercial | Check specific model |
## Troubleshooting
### Model Loading Fails
1. Verify the ONNX opset version (opset 11-17 recommended)
2. Check that the input/output tensor names match the expected names
3. Ensure the file is not corrupted: `python -c "import onnx; onnx.load('model.onnx')"`
### Poor Detection Results
1. Ensure input normalization matches what the model was trained with
2. Check that the image is RGB (not BGR)
3. Verify the input dimensions match the model's expectations
4. Adjust the confidence threshold (default: 0.5)
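On point 2: cameras accessed through OpenCV typically deliver BGR frames; a sketch of the channel flip, assuming a NumPy HxWx3 array:

```python
import numpy as np

bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)  # one pure-blue pixel in BGR order
rgb = bgr[..., ::-1]                             # reverse the channel axis: BGR -> RGB
# the blue value now sits in the last (B) channel, where RGB models expect it
```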
### Embedding Quality Issues
1. Face alignment is critical - ensure the landmarks are correct
2. Check that the input is 112x112 after alignment
3. Verify the embedding is L2-normalized before comparison
4. Typical distance threshold: 0.4-0.6 for cosine distance
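A pure-Python sketch of the comparison in points 3-4 (`is_match` and the 0.5 threshold are illustrative choices within the 0.4-0.6 range above):

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_distance(a, b):
    # For L2-normalized vectors, cosine similarity reduces to a dot product
    return 1.0 - sum(x * y for x, y in zip(a, b))

def is_match(a, b, threshold=0.5):
    return cosine_distance(a, b) < threshold

enrolled = l2_normalize([0.1, 0.2, 0.3])
probe = l2_normalize([0.11, 0.19, 0.31])  # nearly identical embedding
```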
## References
- [ONNX Model Zoo](https://github.com/onnx/models)
- [InsightFace](https://github.com/deepinsight/insightface)
- [RetinaFace Paper](https://arxiv.org/abs/1905.00641)
- [ArcFace Paper](https://arxiv.org/abs/1801.07698)
- [MobileFaceNet Paper](https://arxiv.org/abs/1804.07573)