# Face Recognition ONNX Models

This directory contains ONNX model files for face detection and embedding extraction used by Linux Hello's facial authentication system.

## System Requirements

**ONNX support requires glibc 2.38+:**
- Ubuntu 24.04 or later
- Fedora 39 or later
- Arch Linux (rolling release)

Check your glibc version:

```bash
ldd --version
```

## Quick Start

```bash
# 1. Download models (see instructions below)
# 2. Place in this directory:
#    models/retinaface.onnx
#    models/mobilefacenet.onnx

# 3. Build with ONNX support
cargo build --release --features onnx

# 4. Test
./target/release/linux-hello detect --image photo.jpg --output detected.jpg
```

## Required Models

### 1. Face Detection Model

**Recommended: RetinaFace**

| Property | Value |
|----------|-------|
| File | `retinaface.onnx` |
| Purpose | Face detection with 5-point landmarks |
| Input Shape | `[1, 3, 640, 640]` (NCHW, RGB) |
| Input Range | `[-1, 1]`, normalized as `(pixel - 127.5) / 128.0` |
| Outputs | `loc`, `conf`, `landm` tensors |
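
Both recommended models share the `(pixel - 127.5) / 128.0` input normalization. A minimal preprocessing sketch; the helper names `normalize_pixel` and `to_nchw` are illustrative, not part of Linux Hello's API:

```rust
/// Map an 8-bit channel value into the [-1, 1] range the models expect:
/// (pixel - 127.5) / 128.0.
fn normalize_pixel(p: u8) -> f32 {
    (p as f32 - 127.5) / 128.0
}

/// Rearrange an interleaved RGB buffer (HWC) into normalized NCHW planes.
fn to_nchw(rgb: &[u8], width: usize, height: usize) -> Vec<f32> {
    let mut planes = vec![0.0f32; 3 * width * height];
    for y in 0..height {
        for x in 0..width {
            for c in 0..3 {
                let src = (y * width + x) * 3 + c;
                let dst = c * width * height + y * width + x;
                planes[dst] = normalize_pixel(rgb[src]);
            }
        }
    }
    planes
}

fn main() {
    // Extremes map to +/- 127.5/128 = +/- 0.99609375.
    assert!((normalize_pixel(0) + 0.996_093_75).abs() < 1e-6);
    assert!((normalize_pixel(255) - 0.996_093_75).abs() < 1e-6);
    // A 2x1 RGB image yields 3 planes of 2 values each.
    assert_eq!(to_nchw(&[10, 20, 30, 40, 50, 60], 2, 1).len(), 6);
}
```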

**Alternative: BlazeFace**

| Property | Value |
|----------|-------|
| File | `blazeface.onnx` |
| Purpose | Fast face detection |
| Input Shape | `[1, 3, 128, 128]` or `[1, 3, 256, 256]` |
| Use Case | Real-time detection on low-power devices |

### 2. Face Embedding Model

**Recommended: MobileFaceNet**

| Property | Value |
|----------|-------|
| File | `mobilefacenet.onnx` |
| Purpose | Face embedding extraction |
| Input Shape | `[1, 3, 112, 112]` (NCHW, RGB) |
| Input Range | `[-1, 1]`, normalized as `(pixel - 127.5) / 128.0` |
| Output Shape | `[1, 128]` or `[1, 512]` |
| Output | L2-normalized embedding vector |

**Alternative: ArcFace**

| Property | Value |
|----------|-------|
| File | `arcface.onnx` |
| Purpose | High-accuracy face embedding |
| Input Shape | `[1, 3, 112, 112]` |
| Output Shape | `[1, 512]` |
| Use Case | Higher accuracy at the cost of a larger model |

## Download Instructions

### Option 1: From ONNX Model Zoo

```bash
# UltraFace RFB-640 (face detection). Note: despite the target file name,
# this Model Zoo file is UltraFace, not RetinaFace, so its outputs do not
# match the RetinaFace output format described below.
wget https://github.com/onnx/models/raw/main/vision/body_analysis/ultraface/models/version-RFB-640.onnx \
  -O retinaface.onnx

# Note: MobileFaceNet may need to be converted from other frameworks
```

### Option 2: From InsightFace

```bash
# Clone the InsightFace repository
git clone https://github.com/deepinsight/insightface.git
cd insightface/model_zoo

# Download and extract models
# See: https://github.com/deepinsight/insightface/tree/master/model_zoo
```

### Option 3: Convert from PyTorch/TensorFlow

**From PyTorch:**

```python
import torch
import torch.onnx

# Load your trained model
model = YourFaceModel()
model.load_state_dict(torch.load('model.pth'))
model.eval()

# Export to ONNX (opset 13 falls in the 11-17 range recommended below)
dummy_input = torch.randn(1, 3, 112, 112)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=['input'],
    output_names=['embedding'],
    dynamic_axes={'input': {0: 'batch'}, 'embedding': {0: 'batch'}},
    opset_version=13,
)
```

**From TensorFlow:**

```bash
pip install tf2onnx

python -m tf2onnx.convert \
  --saved-model ./saved_model \
  --output model.onnx \
  --opset 13
```

## Model Specifications

### RetinaFace Output Format

The RetinaFace model outputs three tensors:

1. **loc** (bounding boxes): `[1, num_anchors, 4]`
   - Format: `[dx, dy, dw, dh]` offsets from anchor boxes
   - Decode: `cx = anchor_cx + dx * 0.1 * anchor_w`
2. **conf** (confidence): `[1, num_anchors, 2]`
   - Format: `[background_score, face_score]`
   - Apply softmax to get a probability
3. **landm** (landmarks): `[1, num_anchors, 10]`
   - Format: 5 points x 2 coordinates, `[x0, y0, x1, y1, ..., x4, y4]`
   - Landmark order:
     - 0: Left eye center
     - 1: Right eye center
     - 2: Nose tip
     - 3: Left mouth corner
     - 4: Right mouth corner
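
Taken together, post-processing one anchor's outputs can be sketched as below. The 0.1 center variance comes from the decode formula above; the 0.2 size variance is the usual RetinaFace training setting, an assumption you should confirm against your exported model:

```rust
/// An anchor box in center form, coordinates normalized to [0, 1].
struct Anchor { cx: f32, cy: f32, w: f32, h: f32 }

/// Softmax over [background_score, face_score] -> face probability.
fn face_probability(background: f32, face: f32) -> f32 {
    let m = background.max(face); // subtract the max for numerical stability
    let eb = (background - m).exp();
    let ef = (face - m).exp();
    ef / (eb + ef)
}

/// Decode one `loc` row [dx, dy, dw, dh] against its anchor into
/// corner form (x0, y0, x1, y1).
fn decode_box(a: &Anchor, loc: [f32; 4]) -> (f32, f32, f32, f32) {
    let cx = a.cx + loc[0] * 0.1 * a.w; // center variance 0.1 (as above)
    let cy = a.cy + loc[1] * 0.1 * a.h;
    let w = a.w * (loc[2] * 0.2).exp(); // size variance 0.2 (assumed)
    let h = a.h * (loc[3] * 0.2).exp();
    (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)
}

fn main() {
    // Equal logits give a face probability of 0.5.
    assert!((face_probability(1.0, 1.0) - 0.5).abs() < 1e-6);
    // Zero offsets reproduce the anchor itself.
    let a = Anchor { cx: 0.5, cy: 0.5, w: 0.1, h: 0.1 };
    let (x0, _, x1, _) = decode_box(&a, [0.0, 0.0, 0.0, 0.0]);
    assert!((x0 - 0.45).abs() < 1e-6 && (x1 - 0.55).abs() < 1e-6);
}
```

Each `landm` point decodes like the box center: offset from the anchor center with the same 0.1 variance.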

### Anchor Configuration

RetinaFace uses multi-scale anchors:

| Stride | Feature Map Size (640x640 input) | Anchor Sizes |
|--------|----------------------------------|--------------|
| 8      | 80x80                            | 16, 32       |
| 16     | 40x40                            | 64, 128      |
| 32     | 20x20                            | 256, 512     |
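
The table translates directly into an anchor generator. A sketch, assuming two square anchors per feature-map cell, centered on the cell and normalized to `[0, 1]` (this mirrors the common RetinaFace reference implementation, not code from this repository):

```rust
/// Generate anchors for a square input, following the stride/size table
/// above. Returns (cx, cy, w, h) tuples with normalized coordinates.
fn generate_anchors(input_size: usize) -> Vec<(f32, f32, f32, f32)> {
    let configs: [(usize, [usize; 2]); 3] =
        [(8, [16, 32]), (16, [64, 128]), (32, [256, 512])];
    let mut anchors = Vec::new();
    for (stride, sizes) in configs {
        let fm = input_size / stride; // feature map side length
        for y in 0..fm {
            for x in 0..fm {
                for size in sizes {
                    let cx = (x as f32 + 0.5) * stride as f32 / input_size as f32;
                    let cy = (y as f32 + 0.5) * stride as f32 / input_size as f32;
                    let s = size as f32 / input_size as f32;
                    anchors.push((cx, cy, s, s));
                }
            }
        }
    }
    anchors
}

fn main() {
    // (80*80 + 40*40 + 20*20) * 2 anchors = 16800 for a 640x640 input.
    assert_eq!(generate_anchors(640).len(), 16800);
}
```

The anchor count (16800 for 640x640) must match the `num_anchors` dimension of the model's `loc`, `conf`, and `landm` outputs.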

### Embedding Normalization

Face embeddings should be L2-normalized for comparison:

```rust
let norm: f32 = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
let normalized: Vec<f32> = embedding.iter().map(|x| x / norm).collect();
```

## Expected File Layout

```
models/
├── README.md           # This file
├── retinaface.onnx     # Face detection model
├── mobilefacenet.onnx  # Face embedding model (128-dim)
├── arcface.onnx        # Alternative embedding model (512-dim, optional)
└── blazeface.onnx      # Alternative detection model (optional)
```

## Testing Models

To verify the models work correctly:

```bash
# Run the integration tests with the models in place
cd linux-hello-daemon
cargo test --features onnx -- --ignored
```

## Performance Guidelines

### Detection Model Selection

| Model | Input Size | Speed | Accuracy | Memory |
|-------|------------|-------|----------|--------|
| RetinaFace-MNet0.25 | 640x640 | Fast | Good | ~5 MB |
| RetinaFace-R50 | 640x640 | Medium | Excellent | ~100 MB |
| BlazeFace | 128x128 | Very fast | Moderate | ~1 MB |

### Embedding Model Selection

| Model | Embedding Dim | Speed | Accuracy | Memory |
|-------|---------------|-------|----------|--------|
| MobileFaceNet | 128 | Fast | Good | ~4 MB |
| ArcFace-R50 | 512 | Medium | Excellent | ~120 MB |
| ArcFace-R100 | 512 | Slow | Best | ~250 MB |

### Recommended Configurations

**Low-power devices (Raspberry Pi, etc.):**
- Detection: BlazeFace, 128x128
- Embedding: MobileFaceNet, 128-dim
- Expected: ~30 ms per frame

**Standard desktop:**
- Detection: RetinaFace-MNet, 640x640
- Embedding: MobileFaceNet, 128-dim
- Expected: ~15 ms per frame

**High-security scenarios:**
- Detection: RetinaFace-R50, 640x640
- Embedding: ArcFace-R100, 512-dim
- Expected: ~100 ms per frame

## License Information

Ensure compliance with the model licenses:

| Model | License | Commercial Use |
|-------|---------|----------------|
| RetinaFace | MIT | Yes |
| BlazeFace | Apache 2.0 | Yes |
| MobileFaceNet | MIT | Yes |
| ArcFace | MIT | Yes |
| InsightFace models | Non-commercial | Check the specific model |

## Troubleshooting

### Model Loading Fails

1. Verify the ONNX opset version (opset 11-17 recommended)
2. Check that the input/output tensor names match those expected above
3. Ensure the file is not corrupted: `python -c "import onnx; onnx.load('model.onnx')"`

### Poor Detection Results

1. Ensure input normalization matches what the model was trained with
2. Check that the image is RGB (not BGR)
3. Verify that the input dimensions match the model's expectations
4. Adjust the confidence threshold (default: 0.5)

### Embedding Quality Issues

1. Face alignment is critical: make sure the landmarks are correct
2. Check that the input is 112x112 after alignment
3. Verify the embedding is L2-normalized before comparison
4. Typical distance threshold: 0.4-0.6 for cosine distance
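
For embeddings that are already L2-normalized, cosine distance reduces to `1 - dot(a, b)`, which makes the threshold check a one-liner. A sketch (`cosine_distance` is an illustrative helper, not part of the daemon's API):

```rust
/// Cosine distance between two L2-normalized embeddings: 1 - dot(a, b).
/// 0.0 means identical direction; lower is more similar.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    1.0 - dot
}

fn main() {
    let same = [1.0, 0.0, 0.0];
    let orthogonal = [0.0, 1.0, 0.0];
    assert_eq!(cosine_distance(&same, &same), 0.0);
    assert_eq!(cosine_distance(&same, &orthogonal), 1.0);
    // With a threshold in the 0.4-0.6 range, distance 1.0 is a non-match.
    let threshold = 0.5;
    assert!(cosine_distance(&same, &orthogonal) > threshold);
}
```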
## References

- [ONNX Model Zoo](https://github.com/onnx/models)
- [InsightFace](https://github.com/deepinsight/insightface)
- [RetinaFace Paper](https://arxiv.org/abs/1905.00641)
- [ArcFace Paper](https://arxiv.org/abs/1801.07698)
- [MobileFaceNet Paper](https://arxiv.org/abs/1804.07573)