# Face Recognition ONNX Models

This directory contains the ONNX model files for face detection and embedding extraction used by Linux Hello's facial authentication system.

## System Requirements

**ONNX support requires glibc 2.38+:**

- Ubuntu 24.04 or later
- Fedora 39 or later
- Arch Linux (rolling release)

Check your glibc version:

```bash
ldd --version
```

## Quick Start

```bash
# 1. Download models (see instructions below)

# 2. Place in this directory:
#    models/retinaface.onnx
#    models/mobilefacenet.onnx

# 3. Build with ONNX support
cargo build --release --features onnx

# 4. Test
./target/release/linux-hello detect --image photo.jpg --output detected.jpg
```

## Required Models

### 1. Face Detection Model

**Recommended: RetinaFace**

| Property | Value |
|----------|-------|
| File | `retinaface.onnx` |
| Purpose | Face detection with 5-point landmarks |
| Input Shape | `[1, 3, 640, 640]` (NCHW, RGB) |
| Input Range | `[-1, 1]` normalized: `(pixel - 127.5) / 128.0` |
| Outputs | `loc`, `conf`, `landm` tensors |

**Alternative: BlazeFace**

| Property | Value |
|----------|-------|
| File | `blazeface.onnx` |
| Purpose | Fast face detection |
| Input Shape | `[1, 3, 128, 128]` or `[1, 3, 256, 256]` |
| Use Case | Real-time detection on low-power devices |
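Both model families expect input pixels mapped into `[-1, 1]` via `(pixel - 127.5) / 128.0` and laid out in NCHW order. As a minimal pure-Python sketch of that preprocessing (the `normalize_pixel` and `to_model_input` helpers are illustrative, not part of the daemon):

```python
def normalize_pixel(p: int) -> float:
    """Map an 8-bit pixel value (0-255) into the [-1, 1] range
    the detection and embedding models expect."""
    return (p - 127.5) / 128.0

def to_model_input(rgb_rows):
    """Convert an H x W x 3 nested list (RGB, HWC layout) into the
    NCHW layout the ONNX models take: shape [1, 3, H, W]."""
    h, w = len(rgb_rows), len(rgb_rows[0])
    chw = [[[normalize_pixel(rgb_rows[y][x][c]) for x in range(w)]
            for y in range(h)]
           for c in range(3)]
    return [chw]  # add the batch dimension

# Example: a 1x1 mid-grey RGB image maps to values near 0
batch = to_model_input([[[127, 127, 127]]])
```

In a real pipeline this transposition and scaling is typically done with numpy or the image library's own tensor conversion; the sketch only pins down the layout and value range.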
### 2. Face Embedding Model

**Recommended: MobileFaceNet**

| Property | Value |
|----------|-------|
| File | `mobilefacenet.onnx` |
| Purpose | Face embedding extraction |
| Input Shape | `[1, 3, 112, 112]` (NCHW, RGB) |
| Input Range | `[-1, 1]` normalized: `(pixel - 127.5) / 128.0` |
| Output Shape | `[1, 128]` or `[1, 512]` |
| Output | L2-normalized embedding vector |

**Alternative: ArcFace**

| Property | Value |
|----------|-------|
| File | `arcface.onnx` |
| Purpose | High-accuracy face embedding |
| Input Shape | `[1, 3, 112, 112]` |
| Output Shape | `[1, 512]` |
| Use Case | Higher accuracy at the cost of a larger model |

## Download Instructions

### Option 1: From ONNX Model Zoo

```bash
# RetinaFace (face detection)
wget https://github.com/onnx/models/raw/main/vision/body_analysis/ultraface/models/version-RFB-640.onnx \
  -O retinaface.onnx

# Note: MobileFaceNet may need to be converted from other frameworks
```

### Option 2: From InsightFace

```bash
# Clone the InsightFace model repository
git clone https://github.com/deepinsight/insightface.git
cd insightface/model_zoo

# Download and extract models
# See: https://github.com/deepinsight/insightface/tree/master/model_zoo
```

### Option 3: Convert from PyTorch/TensorFlow

**From PyTorch:**

```python
import torch
import torch.onnx

# Load your trained model
model = YourFaceModel()
model.load_state_dict(torch.load('model.pth'))
model.eval()

# Export to ONNX
dummy_input = torch.randn(1, 3, 112, 112)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=['input'],
    output_names=['embedding'],
    dynamic_axes={'input': {0: 'batch'}, 'embedding': {0: 'batch'}},
)
```

**From TensorFlow:**

```bash
pip install tf2onnx
python -m tf2onnx.convert \
  --saved-model ./saved_model \
  --output model.onnx \
  --opset 13
```

## Model Specifications

### RetinaFace Output Format

The RetinaFace model outputs three tensors:
1. **loc** (bounding boxes): `[1, num_anchors, 4]`
   - Format: `[dx, dy, dw, dh]` offsets from anchor boxes
   - Decode: `cx = anchor_cx + dx * 0.1 * anchor_w`
2. **conf** (confidence): `[1, num_anchors, 2]`
   - Format: `[background_score, face_score]`
   - Apply softmax to get a probability
3. **landm** (landmarks): `[1, num_anchors, 10]`
   - Format: 5 points x 2 coordinates `[x0, y0, x1, y1, ..., x4, y4]`
   - Landmark order:
     - 0: Left eye center
     - 1: Right eye center
     - 2: Nose tip
     - 3: Left mouth corner
     - 4: Right mouth corner

### Anchor Configuration

RetinaFace uses multi-scale anchors:

| Stride | Feature Map Size (640x640) | Anchor Sizes |
|--------|----------------------------|--------------|
| 8 | 80x80 | 16, 32 |
| 16 | 40x40 | 64, 128 |
| 32 | 20x20 | 256, 512 |

### Embedding Normalization

Face embeddings should be L2-normalized before comparison:

```rust
let norm: f32 = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
let normalized: Vec<f32> = embedding.iter().map(|x| x / norm).collect();
```

## Expected File Layout

```
models/
├── README.md            # This file
├── retinaface.onnx      # Face detection model
├── mobilefacenet.onnx   # Face embedding model (128-dim)
├── arcface.onnx         # Alternative embedding model (512-dim, optional)
└── blazeface.onnx       # Alternative detection model (optional)
```

## Testing Models

To verify that the models work correctly:

```bash
# Run integration tests with models
cd linux-hello-daemon
cargo test --features onnx -- --ignored
```

## Performance Guidelines

### Detection Model Selection

| Model | Input Size | Speed | Accuracy | Memory |
|-------|------------|-------|----------|--------|
| RetinaFace-MNet0.25 | 640x640 | Fast | Good | ~5MB |
| RetinaFace-R50 | 640x640 | Medium | Excellent | ~100MB |
| BlazeFace | 128x128 | Very Fast | Moderate | ~1MB |

### Embedding Model Selection

| Model | Embedding Dim | Speed | Accuracy | Memory |
|-------|---------------|-------|----------|--------|
| MobileFaceNet | 128 | Fast | Good | ~4MB |
| ArcFace-R50 | 512 | Medium | Excellent | ~120MB |
| ArcFace-R100 | 512 | Slow | Best | ~250MB |

### Recommended Configurations

**Low-power devices (Raspberry Pi, etc.):**

- Detection: BlazeFace 128x128
- Embedding: MobileFaceNet 128-dim
- Expected: ~30ms per frame

**Standard desktop:**

- Detection: RetinaFace-MNet 640x640
- Embedding: MobileFaceNet 128-dim
- Expected: ~15ms per frame

**High-security scenarios:**

- Detection: RetinaFace-R50 640x640
- Embedding: ArcFace-R100 512-dim
- Expected: ~100ms per frame

## License Information

Ensure compliance with each model's license:

| Model | License | Commercial Use |
|-------|---------|----------------|
| RetinaFace | MIT | Yes |
| BlazeFace | Apache 2.0 | Yes |
| MobileFaceNet | MIT | Yes |
| ArcFace | MIT | Yes |
| InsightFace models | Non-commercial | Check specific model |

## Troubleshooting

### Model Loading Fails

1. Verify the ONNX opset version (opset 11-17 recommended)
2. Check that the input/output tensor names match what the daemon expects
3. Ensure the file is not corrupted: `python -c "import onnx; onnx.load('model.onnx')"`

### Poor Detection Results

1. Ensure input normalization matches the model's training
2. Check that the image is RGB (not BGR)
3. Verify input dimensions match the model's expectations
4. Adjust the confidence threshold (default: 0.5)

### Embedding Quality Issues

1. Face alignment is critical; ensure landmarks are correct
2. Check that the input is 112x112 after alignment
3. Verify the embedding is L2-normalized before comparison
4. Typical distance threshold: 0.4-0.6 for cosine distance

## References

- [ONNX Model Zoo](https://github.com/onnx/models)
- [InsightFace](https://github.com/deepinsight/insightface)
- [RetinaFace Paper](https://arxiv.org/abs/1905.00641)
- [ArcFace Paper](https://arxiv.org/abs/1801.07698)
- [MobileFaceNet Paper](https://arxiv.org/abs/1804.07573)
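As a worked companion to the troubleshooting notes above, here is a minimal pure-Python sketch of the match decision: L2-normalize two embedding vectors, compute the cosine distance, and compare against a threshold in the typical 0.4-0.6 range. The helper names are illustrative, not the daemon's actual API:

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length, as embeddings should be
    before comparison."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_distance(a, b):
    """1 - cosine similarity for unit vectors: 0 means identical
    direction, 2 means opposite."""
    return 1.0 - sum(x * y for x, y in zip(a, b))

def is_match(emb_a, emb_b, threshold=0.5):
    """Accept when the cosine distance falls below the threshold
    (0.4-0.6 is the typical range noted in Troubleshooting)."""
    return cosine_distance(l2_normalize(emb_a), l2_normalize(emb_b)) < threshold

# Identical embeddings have distance 0.0 and always match
assert is_match([0.1, 0.2, 0.3], [0.1, 0.2, 0.3])
```

Note that for already-L2-normalized vectors, cosine distance and squared Euclidean distance are monotonically related, so either can be thresholded; the threshold value must be calibrated for whichever metric and embedding model you deploy.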