mirror of
https://github.com/kharonsec/br-acc
synced 2026-04-25 17:15:02 +02:00
Servidores have LGPD-masked CPFs (only the 6 middle digits are visible). This adds two-layer SAME_AS matching to link 739K servidores to TSE/CNPJ persons:

- Phase 0: pre-compute cpf_middle6 on existing full-CPF Person nodes
- Phase 4: partial CPF + exact name match (confidence 0.95)
- Phase 5: unique name-only match for classified servidores (confidence 0.85)

Integration tests against real Neo4j caught and fixed a Cypher bug: MERGE cannot use a list index (targets[0]) directly; it needs a WITH alias first.

Also: make link-persons target, cpf_middle6/cpf_partial indexes, testcontainers conftest fix, neutrality fix in value_sanitization.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
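
For illustration only, a minimal sketch of the Phase 0/Phase 4 idea and of the WITH-alias fix. It is not the code from this commit: the helper name `cpf_middle6`, the mask format `***.456.789-**`, the `Servidor` label, the `name` property, and the query text in `LINK_QUERY` are all assumptions made for the example. Only `cpf_middle6`, `cpf_partial`, `Person`, `SAME_AS`, and the 0.95 confidence come from the message above.

```python
import re
from typing import Optional


def cpf_middle6(cpf: str) -> Optional[str]:
    """Return digits 4-9 of a full or masked CPF, or None if unusable.

    Assumes an 11-character CPF once '.', '-' and spaces are removed,
    with masked positions written as '*' (hypothetical mask format).
    """
    cleaned = re.sub(r"[.\-\s]", "", cpf)
    if len(cleaned) != 11:
        return None
    middle = cleaned[3:9]
    # Only accept the slice if all six characters are ASCII digits.
    return middle if re.fullmatch(r"[0-9]{6}", middle) else None


# Hypothetical Cypher for the partial-CPF + exact-name match.
# A MERGE pattern needs a bound variable, so targets[0] is aliased
# with WITH before MERGE instead of being used inside the pattern.
LINK_QUERY = """
MATCH (s:Servidor), (p:Person)
WHERE s.cpf_partial = p.cpf_middle6 AND s.name = p.name
WITH s, collect(p) AS targets
WHERE size(targets) = 1
WITH s, targets[0] AS target
MERGE (s)-[r:SAME_AS]->(target)
SET r.confidence = 0.95
"""
```

Under these assumptions, a full CPF and its masked form yield the same key, which is what lets the partial match join the two sets of nodes; the `size(targets) = 1` filter keeps only unambiguous matches.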