================================================================================
PRD — KYC MICROSERVICE: AI MODEL INSTALLATION & CONFIGURATION
================================================================================
Project     : SaaS KYC — Python microservice
Stack       : FastAPI + PaddleOCR-VL 0.9B + AuraFace-v1 + HyperFace
Server      : Linux Ubuntu 22.04 — 15 GB RAM — 8 CPU cores — NO GPU
Written for : Cursor AI (autonomous step-by-step execution)
Version     : 1.0.0
================================================================================

MANDATORY RULES FOR CURSOR
────────────────────────────────────────────────────────────────────────────────
1. Execute each step in the EXACT order given.
2. After each shell command, check the return code (exit code = 0).
3. If a step fails, display the full error and stop. DO NOT continue.
4. Create the files at exactly the paths given.
5. DO NOT change package versions without explicit approval.
6. Always activate the venv before any pip or python command.
7. The phrase "CHECKPOINT ✅" marks a mandatory verification before continuing.
================================================================================
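
Rules 2 and 3 (check the exit code, stop on the first failure) can be sketched
in Python as follows. This is only an illustration: the helper name run_step is
hypothetical and is not part of the deliverable; it uses the standard
subprocess module only.

```python
import subprocess
import sys


def run_step(command: list) -> str:
    """Run one command and stop at the first non-zero exit code."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Rule 3: show the full error and do not continue.
        raise RuntimeError(
            f"Step failed (exit code {result.returncode}): {' '.join(command)}\n"
            f"{result.stderr}"
        )
    return result.stdout


# Usage: a command that succeeds returns its standard output.
print(run_step([sys.executable, "-c", "print('ok')"]).strip())
```

Raising an exception, rather than returning a status flag, makes it hard to
continue by accident after a failed step.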


━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PHASE 0 — PROJECT STRUCTURE
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Create the following directory tree:

    /opt/kyc-service/
    ├── venv/                    ← isolated Python environment (created in step 1.2)
    ├── models/
    │   ├── auraface/            ← AuraFace-v1 model (HuggingFace)
    │   └── hyperface/           ← HyperFace model (HuggingFace)
    ├── scripts/                 ← model download scripts (created in step 3.1)
    ├── app/
    │   ├── __init__.py
    │   ├── main.py              ← FastAPI entry point
    │   ├── ocr.py               ← PaddleOCR-VL module
    │   ├── face_match.py        ← AuraFace + HyperFace module
    │   └── utils.py             ← image helpers
    ├── tests/
    │   ├── test_ocr.py
    │   └── test_face_match.py
    ├── requirements.txt
    ├── .env
    ├── Makefile
    └── README.md

CREATION COMMANDS:

    sudo mkdir -p /opt/kyc-service/{models/auraface,models/hyperface,app,tests}
    sudo chown -R $USER:$USER /opt/kyc-service
    cd /opt/kyc-service
    touch app/__init__.py app/main.py app/ocr.py app/face_match.py app/utils.py
    touch tests/test_ocr.py tests/test_face_match.py
    touch requirements.txt .env Makefile README.md

CHECKPOINT ✅ : ls -la /opt/kyc-service/ must show the 3 directories (models, app, tests) and the 4 files; venv/ is only created in step 1.2.
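
The same checkpoint can be automated. The sketch below lists whatever is
missing under a given root; the function name missing_paths and the two
constants are hypothetical, and the expected entries are copied from the
creation commands above.

```python
from pathlib import Path

EXPECTED_DIRS = ["models/auraface", "models/hyperface", "app", "tests"]
EXPECTED_FILES = [
    "app/__init__.py", "app/main.py", "app/ocr.py", "app/face_match.py",
    "app/utils.py", "tests/test_ocr.py", "tests/test_face_match.py",
    "requirements.txt", ".env", "Makefile", "README.md",
]


def missing_paths(root: Path) -> list:
    """Return the expected directories and files that are absent under root."""
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling missing_paths(Path("/opt/kyc-service")) after the creation commands
should return an empty list.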


━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PHASE 1 — SYSTEM ENVIRONMENT
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

STEP 1.1 — System update and OS dependencies
────────────────────────────────────────────────────

    sudo apt update && sudo apt upgrade -y

    sudo apt install -y \
        python3.10 \
        python3.10-venv \
        python3.10-dev \
        python3-pip \
        git \
        curl \
        wget \
        libgl1-mesa-glx \
        libglib2.0-0 \
        libsm6 \
        libxext6 \
        libxrender-dev \
        libgomp1 \
        build-essential \
        pkg-config

CHECKPOINT ✅ : python3.10 --version must print Python 3.10.x


STEP 1.2 — Create the Python virtual environment
──────────────────────────────────────────────────

    cd /opt/kyc-service
    python3.10 -m venv venv
    source venv/bin/activate

    # Upgrade pip, setuptools and wheel first
    pip install --upgrade pip setuptools wheel

CHECKPOINT ✅ : which python must print /opt/kyc-service/venv/bin/python
CHECKPOINT ✅ : pip --version must print pip 24.x or later
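
Rule 6 (always work inside the venv) can also be checked from Python itself.
A minimal sketch, assuming only the standard library; the function names
in_virtualenv and pip_major_ok are hypothetical.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


def pip_major_ok(version: str, minimum: int = 24) -> bool:
    """True when a pip version string such as '24.2' meets the minimum major."""
    return int(version.split(".")[0]) >= minimum
```

Inside the activated venv, in_virtualenv() is expected to return True; a False
result means the activation step was skipped.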


STEP 1.3 — Create the .env file
───────────────────────────────────

Create the file /opt/kyc-service/.env with this EXACT content:

-------- START OF FILE .env --------
# KYC Microservice — Configuration
APP_ENV=production
APP_PORT=8000
APP_HOST=0.0.0.0
APP_WORKERS=1
APP_LOG_LEVEL=info

# Model paths
MODELS_DIR=/opt/kyc-service/models
AURAFACE_MODEL_DIR=/opt/kyc-service/models/auraface
HYPERFACE_MODEL_DIR=/opt/kyc-service/models/hyperface

# HuggingFace
HF_HOME=/opt/kyc-service/models
TRANSFORMERS_CACHE=/opt/kyc-service/models

# KYC decision thresholds
FACE_MATCH_THRESHOLD_APPROVE=0.55
FACE_MATCH_THRESHOLD_REVIEW=0.40

# Timeouts (seconds)
OCR_TIMEOUT=30
FACE_MATCH_TIMEOUT=20
-------- END OF FILE .env --------
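
The two thresholds only make sense when REVIEW is strictly below APPROVE. The
sketch below parses KEY=VALUE lines and checks that ordering. It is a
standard-library stand-in for illustration (the service itself loads the file
with python-dotenv); parse_env and check_thresholds are hypothetical names.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_thresholds(env: dict) -> tuple:
    """Return (approve, review) after checking review < approve within [-1, 1]."""
    approve = float(env["FACE_MATCH_THRESHOLD_APPROVE"])
    review = float(env["FACE_MATCH_THRESHOLD_REVIEW"])
    if not (-1.0 <= review < approve <= 1.0):
        raise ValueError(f"invalid thresholds: review={review}, approve={approve}")
    return approve, review


SAMPLE = """
# KYC decision thresholds
FACE_MATCH_THRESHOLD_APPROVE=0.55
FACE_MATCH_THRESHOLD_REVIEW=0.40
"""

print(check_thresholds(parse_env(SAMPLE)))
```

The [-1, 1] bound reflects the range of a cosine similarity, which is what the
thresholds are compared against.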


━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PHASE 2 — PYTHON DEPENDENCY INSTALLATION
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

STEP 2.1 — Create requirements.txt
────────────────────────────────────

Write this EXACT content to /opt/kyc-service/requirements.txt:

-------- START OF FILE requirements.txt --------
# ── Framework API ─────────────────────────────────
fastapi==0.115.0
uvicorn[standard]==0.30.6
python-multipart==0.0.9
python-dotenv==1.0.1
httpx==0.27.0

# ── PaddleOCR-VL ──────────────────────────────────
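# NOTE: the CPU build of paddlepaddle is installed separately (step 2.2)
# NOTE: step 3.3 imports PaddleOCRVL, which is not part of the paddleocr 2.x API;
#       validate this pin against the PaddleOCR-VL release notes before installing
paddleocr==2.9.1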

# ── AuraFace / InsightFace ────────────────────────
insightface==0.7.3
onnxruntime==1.18.1

# ── HyperFace ─────────────────────────────────────
torch==2.3.1
torchvision==0.18.1
face-alignment==1.4.1

# ── HuggingFace ───────────────────────────────────
huggingface_hub==0.24.6
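# NOTE: unpinned; confirm that the resolved transformers release accepts the
#       huggingface_hub and torch pins in this file
transformers>=5.0.0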
safetensors>=0.4.0

# ── Image / Vision ────────────────────────────────
Pillow==10.4.0
opencv-python-headless==4.10.0.84
numpy==1.26.4
scipy==1.13.1

# ── Utilities ─────────────────────────────────────
aiofiles==24.1.0
pydantic==2.8.2
pydantic-settings==2.4.0
-------- END OF FILE requirements.txt --------
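
Because rule 5 forbids silent version changes, it helps to know which entries
are not pinned with "==" and can therefore drift between installs. A minimal
sketch, with the hypothetical helper name unpinned:

```python
def unpinned(requirements_text: str) -> list:
    """Return requirement lines that are not pinned with '=='."""
    loose = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if line and "==" not in line:
            loose.append(line)
    return loose


SAMPLE = """
fastapi==0.115.0
transformers>=5.0.0   # lower bound only
safetensors>=0.4.0
"""

print(unpinned(SAMPLE))
```

On the file above, this reports the transformers and safetensors lines, which
are the two entries whose installed version should be reviewed explicitly.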


STEP 2.2 — Install PaddlePaddle CPU (IMPORTANT: before paddleocr)
──────────────────────────────────────────────────────────────────────

    source /opt/kyc-service/venv/bin/activate
    cd /opt/kyc-service

    # Install the CPU build of PaddlePaddle (NOT the GPU build)
    pip install paddlepaddle==3.0.0 \
        -i https://www.paddlepaddle.org.cn/packages/stable/cpu/

CHECKPOINT ✅ : python -c "import paddle; print(paddle.__version__)" must print 3.0.0
CHECKPOINT ✅ : python -c "import paddle; print(paddle.device.get_device())" must print "cpu"

    # If a "network timeout" error appears, use this alternative mirror:
    # pip install paddlepaddle==3.0.0 -i https://pypi.tuna.tsinghua.edu.cn/simple


STEP 2.3 — Install all remaining dependencies
─────────────────────────────────────────────────────

    pip install -r requirements.txt

    # NOTE: CPU-only torch is required (no +cu118 suffix in the URL).
    # On Linux the default PyPI wheel of torch bundles CUDA libraries; to force
    # the CPU build, run:
    # pip install torch==2.3.1+cpu torchvision==0.18.1+cpu \
    #     --index-url https://download.pytorch.org/whl/cpu

CHECKPOINT ✅ : Verify that all packages are installed:
    pip list | grep -E "fastapi|paddleocr|insightface|torch|transformers|huggingface"
    # Must print at least 6 lines of results
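
The grep above only confirms that the packages exist. To compare installed
versions against the pins, a sketch using the standard importlib.metadata
module could look like this; installed_version and version_report are
hypothetical helper names.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_report(pins: dict) -> dict:
    """Map each package name to True when the installed version equals the pin."""
    return {name: installed_version(name) == wanted for name, wanted in pins.items()}
```

For example, version_report({"fastapi": "0.115.0", "numpy": "1.26.4"}) is
expected to return True for both keys once step 2.3 has completed.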


STEP 2.4 — Install safetensors (special build for PaddleOCR-VL)
────────────────────────────────────────────────────────────────────────

    # Nightly build required for PaddleOCR-VL on Linux x86_64
    pip install https://paddle-whl.bj.bcebos.com/nightly/cpu/safetensors/safetensors-0.6.2.dev0-cp310-cp310-linux_x86_64.whl

    # If the download fails, fall back to the standard release
    # (quote the specifier so the shell does not treat ">" as a redirection):
    # pip install "safetensors>=0.4.5"

CHECKPOINT ✅ : python -c "import safetensors; print('safetensors OK')"


━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PHASE 3 — MODEL DOWNLOAD AND CONFIGURATION
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

STEP 3.1 — Download AuraFace-v1 from HuggingFace
────────────────────────────────────────────────────────

Create the script /opt/kyc-service/scripts/download_auraface.py:

-------- START OF SCRIPT download_auraface.py --------
"""
Downloads AuraFace-v1 from HuggingFace.
Model   : fal/AuraFace-v1
License : Apache 2.0 — commercial use permitted
Size    : ~250 MB
"""
import os
from pathlib import Path
from huggingface_hub import snapshot_download

TARGET_DIR = Path("/opt/kyc-service/models/auraface")
TARGET_DIR.mkdir(parents=True, exist_ok=True)

print("Downloading AuraFace-v1...")
print(f"Destination: {TARGET_DIR}")

path = snapshot_download(
    repo_id="fal/AuraFace-v1",
    local_dir=str(TARGET_DIR),
    ignore_patterns=["*.git*", "*.md"],
)

print(f"✅ AuraFace-v1 downloaded to: {path}")
print("Files present:")
for f in TARGET_DIR.iterdir():
    print(f"  - {f.name} ({f.stat().st_size // 1024} KB)")
-------- END OF SCRIPT --------

    # Create the scripts directory first, then write the file shown above into it
    mkdir -p /opt/kyc-service/scripts
    source /opt/kyc-service/venv/bin/activate
    python /opt/kyc-service/scripts/download_auraface.py

CHECKPOINT ✅ : ls /opt/kyc-service/models/auraface/ must list .onnx files
CHECKPOINT ✅ : The total size of the directory must be > 200 MB
    du -sh /opt/kyc-service/models/auraface/
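
Both checkpoints can be expressed in Python. This is a sketch with
hypothetical helper names (dir_size_mb, has_onnx); it sums file sizes, so the
figure can differ slightly from the disk usage reported by du.

```python
from pathlib import Path


def dir_size_mb(root: Path) -> float:
    """Total size of all regular files under root, in MiB."""
    total = sum(p.stat().st_size for p in root.rglob("*") if p.is_file())
    return total / (1024 * 1024)


def has_onnx(root: Path) -> bool:
    """True when at least one .onnx file exists under root."""
    return any(root.rglob("*.onnx"))
```

After the download, has_onnx(Path("/opt/kyc-service/models/auraface")) should
be True and dir_size_mb on the same path should exceed 200.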


STEP 3.2 — Download HyperFace from HuggingFace
──────────────────────────────────────────────────────

Create the script /opt/kyc-service/scripts/download_hyperface.py:

-------- START OF SCRIPT download_hyperface.py --------
"""
Downloads HyperFace-10k-LDM from HuggingFace.
Model     : Idiap/HyperFace-10k-LDM
License   : MIT — commercial use permitted
Institute : Idiap Research Institute
"""
import os
from pathlib import Path
from huggingface_hub import snapshot_download

TARGET_DIR = Path("/opt/kyc-service/models/hyperface")
TARGET_DIR.mkdir(parents=True, exist_ok=True)

print("Downloading HyperFace-10k-LDM...")
print(f"Destination: {TARGET_DIR}")

path = snapshot_download(
    repo_id="Idiap/HyperFace-10k-LDM",
    local_dir=str(TARGET_DIR),
    ignore_patterns=["*.git*"],
)

print(f"✅ HyperFace downloaded to: {path}")
print("Files present:")
for f in TARGET_DIR.iterdir():
    print(f"  - {f.name} ({f.stat().st_size // 1024} KB)")
-------- END OF SCRIPT --------

    python /opt/kyc-service/scripts/download_hyperface.py

CHECKPOINT ✅ : ls /opt/kyc-service/models/hyperface/ must list files


STEP 3.3 — Pre-download PaddleOCR-VL (first run)
────────────────────────────────────────────────────────

Create the script /opt/kyc-service/scripts/download_paddleocr_vl.py:

-------- START OF SCRIPT download_paddleocr_vl.py --------
"""
Initializes PaddleOCR-VL and downloads the weights.
The model is downloaded automatically on the first call.
Size: ~1.8 GB
This can take 5 to 15 minutes depending on the connection.
"""
import os
os.environ["HF_HOME"] = "/opt/kyc-service/models"

print("Initializing PaddleOCR-VL 0.9B...")
print("Downloading weights (this may take several minutes)...")

try:
    from paddleocr import PaddleOCRVL
    pipeline = PaddleOCRVL()
    print("✅ PaddleOCR-VL initialized and ready!")
    print("The model is cached in /opt/kyc-service/models/")
except Exception as e:
    print(f"❌ ERROR: {e}")
    print("Check the paddlepaddle and paddleocr installation")
    raise
-------- END OF SCRIPT --------

    python /opt/kyc-service/scripts/download_paddleocr_vl.py

    # NOTE: this step can take 5-15 minutes (~1.8 GB download).
    # This is normal. Wait for it to finish without interrupting.

CHECKPOINT ✅ : "✅ PaddleOCR-VL initialized and ready!" must appear in the output
CHECKPOINT ✅ : du -sh /opt/kyc-service/models/ must report > 2 GB


━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PHASE 4 — WRITING THE PYTHON MODULES
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

STEP 4.1 — Write app/utils.py
─────────────────────────────────

Write this content to /opt/kyc-service/app/utils.py:

-------- START OF FILE app/utils.py --------
"""
Shared utilities: image conversion, validation, helpers.
"""
import io
import numpy as np
from PIL import Image


def bytes_to_pil(image_bytes: bytes) -> Image.Image:
    """Convert raw bytes to a PIL image."""
    return Image.open(io.BytesIO(image_bytes)).convert("RGB")


def pil_to_cv2(pil_img: Image.Image) -> np.ndarray:
    """Convert a PIL image (RGB) to a BGR NumPy array for OpenCV/InsightFace."""
    rgb = np.array(pil_img)
    return rgb[:, :, ::-1].copy()   # RGB → BGR


def cosine_similarity(v1: np.ndarray, v2: np.ndarray) -> float:
    """
    Compute the cosine similarity between two normalized embeddings.
    Returns a score between -1 and 1 (1 = identical).
    """
    from scipy.spatial.distance import cosine
    return float(1 - cosine(v1, v2))


def score_to_decision(score: float, threshold_approve: float = 0.55,
                      threshold_review: float = 0.40) -> dict:
    """
    Convert a similarity score into a KYC decision.

    APPROVED  : score >= threshold_approve  → faces match
    REVIEW    : threshold_review <= score < threshold_approve → manual review
    REJECTED  : score < threshold_review   → faces differ
    """
    if score >= threshold_approve:
        return {"decision": "APPROVED", "confidence": "HIGH"}
    elif score >= threshold_review:
        return {"decision": "REVIEW",   "confidence": "MEDIUM"}
    else:
        return {"decision": "REJECTED", "confidence": "LOW"}


def validate_image_size(image_bytes: bytes, max_mb: int = 5) -> bool:
    """Check that the image does not exceed the maximum size."""
    return len(image_bytes) <= max_mb * 1024 * 1024
-------- END OF FILE app/utils.py --------
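
To make the scoring rule concrete, here is a worked example in pure Python. It
mirrors the logic of cosine_similarity and score_to_decision but is a
standalone sketch: the names cosine and decide are hypothetical, and the toy
2-D vectors stand in for real 512-D embeddings.

```python
import math


def cosine(v1, v2) -> float:
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norms = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    return dot / norms


def decide(score: float, approve: float = 0.55, review: float = 0.40) -> str:
    """Map a similarity score to APPROVED, REVIEW or REJECTED."""
    if score >= approve:
        return "APPROVED"
    if score >= review:
        return "REVIEW"
    return "REJECTED"


# (1, 0) against (0.6, 0.8): the dot product is 0.6 and both norms are 1,
# so the similarity is about 0.6, which clears the 0.55 approval threshold.
score = cosine([1.0, 0.0], [0.6, 0.8])
print(round(score, 4), decide(score))
```

Opposite vectors give a similarity of -1, which is why the score is not
restricted to the range 0 to 1.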


STEP 4.2 — Write app/ocr.py
───────────────────────────────

Write this content to /opt/kyc-service/app/ocr.py:

-------- START OF FILE app/ocr.py --------
"""
OCR module — PaddleOCR-VL 0.9B
License : Apache 2.0
Usage   : Text extraction from identity documents (ID card, passport, etc.)
"""
import os
import logging
import tempfile
from pathlib import Path
from PIL import Image

logger = logging.getLogger("kyc.ocr")

# Global instance (loaded only once at startup)
_ocr_pipeline = None


def load_ocr_model() -> None:
    """
    Load PaddleOCR-VL into memory.
    Call ONLY ONCE, when FastAPI starts.
    """
    global _ocr_pipeline
    if _ocr_pipeline is not None:
        logger.info("PaddleOCR-VL already loaded, skipping.")
        return

    os.environ["HF_HOME"] = "/opt/kyc-service/models"
    logger.info("Loading PaddleOCR-VL 0.9B...")

    from paddleocr import PaddleOCRVL
    _ocr_pipeline = PaddleOCRVL()

    logger.info("✅ PaddleOCR-VL loaded and ready")


def is_ocr_ready() -> bool:
    """Return True if the OCR model is loaded."""
    return _ocr_pipeline is not None


def extract_text(pil_image: Image.Image) -> dict:
    """
    Extract the text from a document image with PaddleOCR-VL.

    Args:
        pil_image : PIL image of the identity document

    Returns:
        dict with the keys:
            - raw_text    : raw extracted text (str)
            - lines       : list of text lines (list)
            - success     : boolean
            - error       : error message, if any (str|None)
    """
    if _ocr_pipeline is None:
        return {"raw_text": "", "lines": [], "success": False,
                "error": "OCR model not loaded"}

    tmp_path = None
    try:
        # PaddleOCR-VL works with file paths
        with tempfile.NamedTemporaryFile(
            suffix=".png", delete=False, dir="/tmp"
        ) as tmp:
            # Record the path first so cleanup still runs if save() fails
            tmp_path = tmp.name
            pil_image.save(tmp.name, format="PNG")

        results = _ocr_pipeline.predict(tmp_path)

        lines = []
        for res in results:
            # Assumes each result exposes rec_texts; adapt this to the result
            # schema of the installed PaddleOCR version if it differs
            if hasattr(res, "rec_texts") and res.rec_texts:
                lines.extend([t for t in res.rec_texts if t.strip()])

        raw_text = "\n".join(lines)
        logger.info(f"OCR extracted {len(lines)} lines of text")

        return {
            "raw_text": raw_text,
            "lines":    lines,
            "success":  True,
            "error":    None,
        }

    except Exception as e:
        logger.error(f"OCR error: {e}")
        return {"raw_text": "", "lines": [], "success": False, "error": str(e)}

    finally:
        if tmp_path and Path(tmp_path).exists():
            Path(tmp_path).unlink()
-------- END OF FILE app/ocr.py --------
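
The "load once, reuse everywhere" pattern of load_ocr_model can be isolated
from PaddleOCR. The sketch below is a generic variant with a lock, which only
matters if loading could ever be triggered from several threads; that is an
assumption, since the service above loads its models once at startup. The
class name LazyModel is hypothetical.

```python
import threading
from typing import Callable, Optional


class LazyModel:
    """Build an expensive object on first use and reuse it afterwards."""

    def __init__(self, factory: Callable[[], object]) -> None:
        self._factory = factory
        self._instance: Optional[object] = None
        self._lock = threading.Lock()

    def load(self) -> object:
        # The lock ensures the factory runs at most once, even under threads.
        with self._lock:
            if self._instance is None:
                self._instance = self._factory()
            return self._instance

    @property
    def ready(self) -> bool:
        return self._instance is not None
```

A module would create one LazyModel per model and call load() at startup; the
ready property plays the role of is_ocr_ready().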


STEP 4.3 — Write app/face_match.py
──────────────────────────────────────

Write this content to /opt/kyc-service/app/face_match.py:

-------- START OF FILE app/face_match.py --------
"""
Face matching module — AuraFace-v1 (primary) + HyperFace (fallback)
Licenses : Apache 2.0 (AuraFace) + MIT (HyperFace)
Usage    : Compare the face on an identity document with a selfie
"""
import os
import logging
import numpy as np
from pathlib import Path
from PIL import Image
from typing import Optional

from app.utils import pil_to_cv2, cosine_similarity, score_to_decision

logger = logging.getLogger("kyc.face_match")

# Global instances
_aura_face_app = None
_hyperface_model = None

MODELS_DIR = Path("/opt/kyc-service/models")


# ─── Loading AuraFace ─────────────────────────────────────────────────────────

def load_auraface() -> None:
    """
    Load AuraFace-v1 through InsightFace.
    Uses CPUExecutionProvider because this server has no GPU.
    """
    global _aura_face_app
    if _aura_face_app is not None:
        return

    logger.info("Loading AuraFace-v1...")
    from insightface.app import FaceAnalysis

    _aura_face_app = FaceAnalysis(
        name="auraface",
        providers=["CPUExecutionProvider"],
        # InsightFace resolves <root>/models/<name>, so root must be the project
        # directory for the files in /opt/kyc-service/models/auraface/ to be found
        root=str(MODELS_DIR.parent),
    )
    # ctx_id=-1 = CPU, det_size = detection size (640x640 recommended)
    _aura_face_app.prepare(ctx_id=-1, det_size=(640, 640))
    logger.info("✅ AuraFace-v1 loaded")


# ─── Loading HyperFace ────────────────────────────────────────────────────────

def load_hyperface() -> None:
    """
    Load HyperFace (Idiap Research Institute — MIT License).
    Used as a cross-check or fallback model.
    """
    global _hyperface_model
    if _hyperface_model is not None:
        return

    logger.info("Loading HyperFace...")
    try:
        import torch
        from transformers import AutoModel, AutoFeatureExtractor

        model_path = str(MODELS_DIR / "hyperface")
        _hyperface_model = {
            "model":     AutoModel.from_pretrained(model_path),
            "extractor": AutoFeatureExtractor.from_pretrained(model_path),
        }
        _hyperface_model["model"].eval()
        logger.info("✅ HyperFace loaded")
    except Exception as e:
        # HyperFace is optional: log a warning but do not crash
        logger.warning(f"HyperFace unavailable (optional): {e}")
        _hyperface_model = None


def load_all_face_models() -> None:
    """Load all face matching models. Call at startup."""
    load_auraface()
    load_hyperface()


def is_face_ready() -> bool:
    return _aura_face_app is not None


# ─── Embedding extraction ─────────────────────────────────────────────────────

def _get_embedding_aura(cv2_image: np.ndarray) -> Optional[np.ndarray]:
    """
    Extract the normalized face embedding with AuraFace.
    Returns None if no face is detected.
    """
    if _aura_face_app is None:
        raise RuntimeError("AuraFace not loaded")

    faces = _aura_face_app.get(cv2_image)
    if not faces:
        return None

    # Take the face with the largest bounding box (the main one)
    largest_face = max(
        faces,
        key=lambda f: (f.bbox[2] - f.bbox[0]) * (f.bbox[3] - f.bbox[1])
    )
    return largest_face.normed_embedding   # normalized 512-D vector


# ─── Main face match ──────────────────────────────────────────────────────────

def compare_faces(
    document_img: Image.Image,
    selfie_img:   Image.Image,
    threshold_approve: float = 0.55,
    threshold_review:  float = 0.40,
) -> dict:
    """
    Compare the face on the document with the selfie.

    Args:
        document_img      : PIL image of the identity document
        selfie_img        : PIL image of the selfie
        threshold_approve : threshold for APPROVED (default 0.55)
        threshold_review  : threshold for REVIEW (default 0.40)

    Returns:
        dict with the keys:
            - success          : bool
            - model_used       : name of the model used
            - similarity_score : float between -1 and 1 (cosine similarity)
            - decision         : APPROVED | REVIEW | REJECTED
            - confidence       : HIGH | MEDIUM | LOW
            - doc_face_found   : bool (face detected on the document)
            - selfie_face_found: bool (face detected on the selfie)
            - error            : str | None
    """
    result = {
        "success":           False,
        "model_used":        "AuraFace-v1",
        "similarity_score":  0.0,
        "decision":          "REJECTED",
        "confidence":        "LOW",
        "doc_face_found":    False,
        "selfie_face_found": False,
        "error":             None,
    }

    try:
        doc_cv2    = pil_to_cv2(document_img)
        selfie_cv2 = pil_to_cv2(selfie_img)

        # Extract the embeddings
        doc_emb    = _get_embedding_aura(doc_cv2)
        selfie_emb = _get_embedding_aura(selfie_cv2)

        result["doc_face_found"]    = doc_emb is not None
        result["selfie_face_found"] = selfie_emb is not None

        if doc_emb is None:
            result["error"] = "No face detected on the document"
            return result

        if selfie_emb is None:
            result["error"] = "No face detected on the selfie"
            return result

        # Compute the cosine similarity
        score    = cosine_similarity(doc_emb, selfie_emb)
        decision = score_to_decision(score, threshold_approve, threshold_review)

        result.update({
            "success":          True,
            "similarity_score": round(score, 4),
            **decision,
        })

    except Exception as e:
        logger.error(f"Face matching error: {e}")
        result["error"] = str(e)

    return result
-------- END OF FILE app/face_match.py --------
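
The "largest bounding box" rule used in _get_embedding_aura can be shown
without InsightFace. In this sketch the Box type and the largest function are
hypothetical stand-ins for the detector output, whose bbox field is read as
(x1, y1, x2, y2).

```python
from typing import NamedTuple, Optional, Sequence


class Box(NamedTuple):
    x1: float
    y1: float
    x2: float
    y2: float


def area(box: Box) -> float:
    """Area of an axis-aligned box given by two opposite corners."""
    return (box.x2 - box.x1) * (box.y2 - box.y1)


def largest(boxes: Sequence[Box]) -> Optional[Box]:
    """Return the box with the largest area, or None when nothing was detected."""
    return max(boxes, key=area) if boxes else None


# A small face in the background and a large one in the foreground.
faces = [Box(0, 0, 10, 10), Box(5, 5, 105, 55)]
print(largest(faces))
```

Choosing the largest box is a heuristic: on an identity document it favors the
main portrait over any smaller secondary or ghost image.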


STEP 4.4 — Write app/main.py
────────────────────────────────

Write this content to /opt/kyc-service/app/main.py:

-------- START OF FILE app/main.py --------
"""
KYC Microservice — FastAPI entry point
Models  : PaddleOCR-VL 0.9B + AuraFace-v1 + HyperFace
License : Apache 2.0 + MIT
"""
import os
import io
import time
import logging
from contextlib import asynccontextmanager
from dotenv import load_dotenv

from fastapi import FastAPI, File, UploadFile, HTTPException
from fastapi.responses import JSONResponse
from PIL import Image

# Load the environment variables
load_dotenv("/opt/kyc-service/.env")

# KYC modules
from app.ocr import load_ocr_model, extract_text, is_ocr_ready
from app.face_match import load_all_face_models, compare_faces, is_face_ready
from app.utils import bytes_to_pil, validate_image_size

# ── Logging ───────────────────────────────────────────────────────────────────
logging.basicConfig(
    level=getattr(logging, os.getenv("APP_LOG_LEVEL", "info").upper()),
    format="%(asctime)s [%(levelname)s] %(name)s : %(message)s",
)
logger = logging.getLogger("kyc.main")


# ── Lifespan: load the models at startup ──────────────────────────────────────
@asynccontextmanager
async def lifespan(app: FastAPI):
    logger.info("═══ Starting the KYC microservice ═══")

    logger.info("→ Loading PaddleOCR-VL...")
    load_ocr_model()

    logger.info("→ Loading AuraFace-v1 + HyperFace...")
    load_all_face_models()

    logger.info("═══ All models are ready ✅ ═══")
    yield
    logger.info("Stopping the KYC microservice.")


# ── Application FastAPI ───────────────────────────────────────────────────────
app = FastAPI(
    title="KYC Microservice",
    description="Document OCR + Face Matching pour SaaS KYC",
    version="1.0.0",
    lifespan=lifespan,
)


# ─────────────────────────────────────────────────────────────────────────────
# ENDPOINTS
# ─────────────────────────────────────────────────────────────────────────────

@app.get("/health")
def health():
    """
    Report the state of the service and its models.
    Called by Laravel to check that the microservice is available.
    """
    return {
        "status": "ok",
        "models": {
            "paddleocr_vl": is_ocr_ready(),
            "auraface_v1":  is_face_ready(),
        }
    }


@app.post("/ocr")
async def endpoint_ocr(file: UploadFile = File(...)):
    """
    Extract the text from an identity document.
    Used to read the name, date of birth, document number, etc.

    Input  : image (JPG/PNG, max 5 MB)
    Output : raw text + extracted lines
    """
    start    = time.time()
    contents = await file.read()

    if not validate_image_size(contents):
        raise HTTPException(413, "Image too large (max 5 MB)")

    pil_img = bytes_to_pil(contents)
    result  = extract_text(pil_img)
    elapsed = round(time.time() - start, 2)

    return JSONResponse({
        "success":     result["success"],
        "elapsed_sec": elapsed,
        "ocr":         result,
    })


@app.post("/face-match")
async def endpoint_face_match(
    document: UploadFile = File(...),
    selfie:   UploadFile = File(...),
):
    """
    Compare the face on the document with the selfie.

    Input  : document (image), selfie (image)
    Output : similarity score + KYC decision
    """
    start = time.time()

    doc_bytes    = await document.read()
    selfie_bytes = await selfie.read()

    if not validate_image_size(doc_bytes) or not validate_image_size(selfie_bytes):
        raise HTTPException(413, "Image too large (max 5 MB)")

    doc_pil    = bytes_to_pil(doc_bytes)
    selfie_pil = bytes_to_pil(selfie_bytes)

    threshold_approve = float(os.getenv("FACE_MATCH_THRESHOLD_APPROVE", 0.55))
    threshold_review  = float(os.getenv("FACE_MATCH_THRESHOLD_REVIEW",  0.40))

    result  = compare_faces(doc_pil, selfie_pil, threshold_approve, threshold_review)
    elapsed = round(time.time() - start, 2)

    if not result["doc_face_found"] or not result["selfie_face_found"]:
        # The "error" key always exists, so fall back only when it is empty
        raise HTTPException(422, result["error"] or "Face not detected")

    return JSONResponse({
        "success":     result["success"],
        "elapsed_sec": elapsed,
        **result,
    })


@app.post("/kyc/verify")
async def endpoint_kyc_verify(
    document: UploadFile = File(...),
    selfie:   UploadFile = File(...),
):
    """
    FULL KYC pipeline:
    Step 1 → OCR of the document (text extraction)
    Step 2 → Face matching (document ↔ selfie)
    Step 3 → Consolidated final decision

    This is the main endpoint called by Laravel.

    Input  : document (image), selfie (image)
    Output : full KYC result with OCR + face match + decision
    """
    start = time.time()

    doc_bytes    = await document.read()
    selfie_bytes = await selfie.read()

    if not validate_image_size(doc_bytes) or not validate_image_size(selfie_bytes):
        raise HTTPException(413, "Image too large (max 5 MB)")

    doc_pil    = bytes_to_pil(doc_bytes)
    selfie_pil = bytes_to_pil(selfie_bytes)

    # ── Step 1: OCR ───────────────────────────────────────────────────────────
    ocr_result = extract_text(doc_pil)

    # ── Step 2: Face matching ─────────────────────────────────────────────────
    threshold_approve = float(os.getenv("FACE_MATCH_THRESHOLD_APPROVE", 0.55))
    threshold_review  = float(os.getenv("FACE_MATCH_THRESHOLD_REVIEW",  0.40))
    face_result       = compare_faces(
        doc_pil, selfie_pil, threshold_approve, threshold_review
    )

    # ── Step 3: Final decision ────────────────────────────────────────────────
    elapsed        = round(time.time() - start, 2)
    final_decision = face_result.get("decision", "REJECTED")
    kyc_passed     = final_decision == "APPROVED"

    return JSONResponse({
        "success":        True,
        "elapsed_sec":    elapsed,
        "kyc_passed":     kyc_passed,
        "final_decision": final_decision,
        "steps": {
            "ocr":        ocr_result,
            "face_match": face_result,
        },
    })


# ── Direct startup ────────────────────────────────────────────────────────────
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(
        "app.main:app",
        host=os.getenv("APP_HOST", "0.0.0.0"),
        port=int(os.getenv("APP_PORT", 8000)),
        workers=int(os.getenv("APP_WORKERS", 1)),
        reload=False,
    )
-------- END OF FILE app/main.py --------
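
Step 3 of /kyc/verify reduces the face matching result to a final decision.
The sketch below isolates that rule as a pure function so it can be tested
without loading any model; the function name consolidate is hypothetical, and
the logic mirrors the endpoint (a missing decision counts as REJECTED).

```python
def consolidate(face_result: dict) -> dict:
    """Derive the final KYC outcome from a compare_faces-style result."""
    final_decision = face_result.get("decision", "REJECTED")
    return {
        "final_decision": final_decision,
        # Only APPROVED passes; REVIEW still requires a manual check.
        "kyc_passed": final_decision == "APPROVED",
    }


print(consolidate({"decision": "REVIEW", "similarity_score": 0.47}))
```

Note that a REVIEW outcome yields kyc_passed = False, so the caller has to
inspect final_decision to distinguish it from a rejection.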


━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PHASE 5 — VALIDATION TESTS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

STEP 5.1 — Write the tests
──────────────────────────────

Write this content to /opt/kyc-service/tests/test_ocr.py:

-------- START OF FILE tests/test_ocr.py --------
"""
Unit tests — OCR module (PaddleOCR-VL)
Run with: python -m pytest tests/test_ocr.py -v
(pytest is not listed in requirements.txt; install it in the venv first)
"""
import sys
sys.path.insert(0, "/opt/kyc-service")

from PIL import Image, ImageDraw, ImageFont
from app.ocr import load_ocr_model, extract_text, is_ocr_ready


def create_test_document_image() -> Image.Image:
    """Create a dummy document image containing text."""
    img  = Image.new("RGB", (800, 500), color=(240, 240, 240))
    draw = ImageDraw.Draw(img)
    draw.rectangle([20, 20, 780, 480], outline=(100, 100, 100), width=3)
    draw.text((40, 60),  "CARTE NATIONALE D'IDENTITÉ",        fill=(0, 0, 0))
    draw.text((40, 120), "NOM : DUPONT",                      fill=(0, 0, 0))
    draw.text((40, 160), "PRÉNOM : Jean",                     fill=(0, 0, 0))
    draw.text((40, 200), "DATE DE NAISSANCE : 15/03/1990",    fill=(0, 0, 0))
    draw.text((40, 240), "NUMÉRO : 123456789",                fill=(0, 0, 0))
    return img


def test_ocr_model_loads():
    """Check that the OCR model loads without error."""
    load_ocr_model()
    assert is_ocr_ready(), "The OCR model should be loaded"
    print("✅ test_ocr_model_loads : PASS")


def test_ocr_extracts_text():
    """Check that the OCR extracts text from an image."""
    load_ocr_model()
    test_img = create_test_document_image()
    result   = extract_text(test_img)

    assert result["success"] is True,        f"OCR failed: {result['error']}"
    assert isinstance(result["raw_text"], str), "raw_text must be a string"
    assert len(result["raw_text"]) > 0,      "OCR must extract some text"
    assert isinstance(result["lines"], list), "lines must be a list"
    print(f"✅ test_ocr_extracts_text : PASS — {len(result['lines'])} lines extracted")
    print(f"   Text: {result['raw_text'][:100]}...")


def test_ocr_returns_structure():
    """Check the structure of the value returned by extract_text."""
    load_ocr_model()
    test_img = create_test_document_image()
    result   = extract_text(test_img)

    for key in ["raw_text", "lines", "success", "error"]:
        assert key in result, f"Missing key: {key}"
    print("✅ test_ocr_returns_structure : PASS")


if __name__ == "__main__":
    test_ocr_model_loads()
    test_ocr_extracts_text()
    test_ocr_returns_structure()
    print("\n🎉 All OCR tests pass!")
-------- END OF FILE tests/test_ocr.py --------


Write this content to /opt/kyc-service/tests/test_face_match.py:

-------- START OF FILE tests/test_face_match.py --------
"""
Unit tests — Face matching module (AuraFace)
Run with: python -m pytest tests/test_face_match.py -v
"""
import sys
sys.path.insert(0, "/opt/kyc-service")

import numpy as np
from PIL import Image
from app.face_match import load_all_face_models, compare_faces, is_face_ready
from app.utils import cosine_similarity, score_to_decision


def create_blank_face_image(color=(220, 180, 150)) -> Image.Image:
    """Create a dummy "face" image (a plain colored square)."""
    img = Image.new("RGB", (640, 640), color=color)
    return img


def test_face_models_load():
    """Check that the face matching models load."""
    load_all_face_models()
    assert is_face_ready(), "AuraFace should be loaded"
    print("✅ test_face_models_load : PASS")


def test_cosine_similarity():
    """Check the cosine similarity function."""
    v1 = np.array([1.0, 0.0, 0.0])
    v2 = np.array([1.0, 0.0, 0.0])
    # Compare with a tolerance: exact float equality is fragile
    assert np.isclose(cosine_similarity(v1, v2), 1.0), "Identical vectors: similarity should be 1.0"

    v3 = np.array([0.0, 1.0, 0.0])
    assert np.isclose(cosine_similarity(v1, v3), 0.0), "Orthogonal vectors: similarity should be 0.0"
    print("✅ test_cosine_similarity : PASS")


def test_score_to_decision():
    """Check the score → decision conversion."""
    assert score_to_decision(0.80)["decision"] == "APPROVED"
    assert score_to_decision(0.55)["decision"] == "APPROVED"
    assert score_to_decision(0.50)["decision"] == "REVIEW"
    assert score_to_decision(0.40)["decision"] == "REVIEW"
    assert score_to_decision(0.20)["decision"] == "REJECTED"
    print("✅ test_score_to_decision : PASS")


def test_compare_faces_no_face_detected():
    """
    Check that compare_faces handles the case where no face can be detected.
    A uniform image should not contain a face.
    """
    load_all_face_models()
    blank_img = create_blank_face_image()
    result    = compare_faces(blank_img, blank_img)

    # A plain colored square contains no face, so doc_face_found should be False.
    # The module must report this explicitly instead of crashing.
    # The "error" key is always present in the result, so test its value,
    # not its presence.
    assert result["error"] or result["doc_face_found"] is False
    print("✅ test_compare_faces_no_face_detected : PASS")


def test_compare_faces_result_structure():
    """Check the structure of the value returned by compare_faces."""
    load_all_face_models()
    img1 = create_blank_face_image()
    img2 = create_blank_face_image()
    result = compare_faces(img1, img2)

    required_keys = [
        "success", "model_used", "similarity_score",
        "decision", "confidence", "doc_face_found",
        "selfie_face_found", "error",
    ]
    for key in required_keys:
        assert key in result, f"Missing key in result: {key}"
    print("✅ test_compare_faces_result_structure : PASS")


if __name__ == "__main__":
    test_face_models_load()
    test_cosine_similarity()
    test_score_to_decision()
    test_compare_faces_no_face_detected()
    test_compare_faces_result_structure()
    print("\n🎉 Tous les tests Face Match passent !")
-------- FIN DU FICHIER tests/test_face_match.py --------
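
The similarity checks in test_cosine_similarity compare floating-point
values. As a minimal sketch of why a tolerance is the safer comparison, the
snippet below uses a hypothetical pure-Python helper named cosine (it is NOT
the project's app.utils.cosine_similarity, whose implementation may differ)
and an assumed tolerance of 1e-6:

```python
import math


def cosine(a, b):
    """Hypothetical cosine similarity, for illustration only."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector has no similarity to anything
    return dot / (norm_a * norm_b)


def nearly_equal(value, expected, tol=1e-6):
    """Absolute-tolerance comparison; tol is an assumed value."""
    return math.isclose(value, expected, rel_tol=0.0, abs_tol=tol)


v = [0.1, 0.2, 0.3]
# Rounding in sqrt and division means cosine(v, v) is not guaranteed to be
# exactly 1.0, so the check uses a tolerance instead of ==.
print(nearly_equal(cosine(v, v), 1.0))
print(nearly_equal(cosine([1.0, 0.0], [0.0, 1.0]), 0.0))
```

The same idea applies to embeddings returned by a face model, which are
rarely exact unit vectors.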


STEP 5.2 — Install pytest and run the tests
─────────────────────────────────────────────

    source /opt/kyc-service/venv/bin/activate
    pip install pytest

    cd /opt/kyc-service

    # Run each file directly: the final "🎉" line is printed by the
    # __main__ block, which pytest does not execute.

    # OCR tests
    python tests/test_ocr.py

    # Face matching tests
    python tests/test_face_match.py

CHECKPOINT ✅ : "🎉 Tous les tests OCR passent !" must appear in the output
CHECKPOINT ✅ : "🎉 Tous les tests Face Match passent !" must appear in the output
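
The two checkpoints above can also be verified by a script, which suits the
"check the exit code after every command" rule. The following is a sketch
only: the helper names run_and_check and verify_test_checkpoints are
hypothetical, and the marker strings are copied from the test files.

```python
import subprocess
import sys


def run_and_check(cmd, marker):
    """Run cmd; succeed only if it exits with 0 AND prints the marker."""
    proc = subprocess.run(
        cmd, capture_output=True, encoding="utf-8", errors="replace"
    )
    ok = proc.returncode == 0 and marker in proc.stdout
    return ok, proc.stdout + proc.stderr


def verify_test_checkpoints(python=sys.executable):
    """Hypothetical wrapper: run both test files, stop at the first failure."""
    checks = [
        ("tests/test_ocr.py", "Tous les tests OCR passent"),
        ("tests/test_face_match.py", "Tous les tests Face Match passent"),
    ]
    for script, marker in checks:
        ok, output = run_and_check([python, script], marker)
        if not ok:
            print(output)  # show the full error, as the rules require
            return False
    return True
```

Run verify_test_checkpoints() from /opt/kyc-service with the venv activated;
a False return value means the checkpoint failed and execution should stop.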


━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PHASE 6 — SERVICE STARTUP AND FINAL VALIDATION
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

STEP 6.1 — Create the Makefile
────────────────────────────────

Write the following content to /opt/kyc-service/Makefile:

-------- DÉBUT DU FICHIER Makefile --------
.PHONY: install download start start-bg test health stop

VENV = /opt/kyc-service/venv/bin
PYTHON = $(VENV)/python
PIP = $(VENV)/pip

install:
	$(PIP) install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
	$(PIP) install -r requirements.txt

download:
	$(PYTHON) scripts/download_auraface.py
	$(PYTHON) scripts/download_hyperface.py
	$(PYTHON) scripts/download_paddleocr_vl.py

start:
	$(VENV)/uvicorn app.main:app \
		--host 127.0.0.1 \
		--port 8000 \
		--workers 1 \
		--log-level info

start-bg:
	nohup $(VENV)/uvicorn app.main:app \
		--host 127.0.0.1 \
		--port 8000 \
		--workers 1 \
		> /var/log/kyc-service.log 2>&1 &

test:
	$(PYTHON) tests/test_ocr.py
	$(PYTHON) tests/test_face_match.py

health:
	curl -s http://127.0.0.1:8000/health | python3 -m json.tool

stop:
	pkill -f "uvicorn app.main:app" || true
-------- FIN DU FICHIER Makefile --------


STEP 6.2 — Start the microservice
───────────────────────────────────

    # Terminal 1: start the service (runs in the foreground and keeps the terminal busy)
    source /opt/kyc-service/venv/bin/activate
    cd /opt/kyc-service
    uvicorn app.main:app --host 127.0.0.1 --port 8000 --workers 1

    # Terminal 2: query the health check (allow ~60 s for the models to load)
    curl http://127.0.0.1:8000/health

CHECKPOINT ✅ : The response must be:
    {
        "status": "ok",
        "models": {
            "paddleocr_vl": true,
            "auraface_v1":  true
        }
    }
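
Because the models take about a minute to load, a single curl call may run
too early. A possible polling sketch is shown below. The function names, the
120-second timeout and the 5-second interval are assumptions; only the URL
and the expected JSON shape come from this step.

```python
import json
import time
import urllib.request


def models_ready(payload):
    """True when status is "ok" and every entry under "models" is true."""
    models = payload.get("models", {})
    return (
        payload.get("status") == "ok"
        and bool(models)
        and all(value is True for value in models.values())
    )


def fetch_json(url):
    """GET url and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_health(url="http://127.0.0.1:8000/health",
                    timeout_s=120, interval_s=5, fetch=fetch_json):
    """Poll the health endpoint until the models are ready or time runs out."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            if models_ready(fetch(url)):
                return True
        except (OSError, ValueError):
            pass  # connection refused, HTTP error, timeout, or non-JSON body
        time.sleep(interval_s)
    return False
```

Calling wait_for_health() from the second terminal returns True once the
checkpoint condition is met, and False if the service never becomes ready.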


STEP 6.3 — Test the /ocr endpoint (curl)
──────────────────────────────────────────

    # Replace /path/to/test_document.jpg with the path to a real document image
    curl -X POST http://127.0.0.1:8000/ocr \
        -F "file=@/path/to/test_document.jpg" \
        | python3 -m json.tool

CHECKPOINT ✅ : The response must contain "success": true and non-empty text in "raw_text"


STEP 6.4 — Test the /kyc/verify endpoint (curl)
─────────────────────────────────────────────────

    # Replace both paths with real images (an ID document and a selfie)
    curl -X POST http://127.0.0.1:8000/kyc/verify \
        -F "document=@/path/to/document.jpg" \
        -F "selfie=@/path/to/selfie.jpg" \
        | python3 -m json.tool

CHECKPOINT ✅ : The response must contain "kyc_passed" and "final_decision"
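
The curl call above sends a multipart/form-data request with two file
fields, "document" and "selfie". The sketch below reproduces that request
with the Python standard library and checks the two keys named in the
checkpoint. The helper names, the image/jpeg content type and the
120-second timeout are assumptions, not part of the service contract.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(files):
    """files maps field name -> (filename, data bytes, content type).

    Returns (body, Content-Type header value).
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for field, (filename, data, ctype) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; '
            f'filename="{filename}"\r\n'
            f"Content-Type: {ctype}\r\n\r\n"
        )
        chunks += [header.encode("utf-8"), data, b"\r\n"]
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def verify_kyc(document_path, selfie_path,
               url="http://127.0.0.1:8000/kyc/verify"):
    """POST both images and check the keys required by the checkpoint."""
    files = {}
    for field, path in (("document", document_path), ("selfie", selfie_path)):
        with open(path, "rb") as fh:
            files[field] = (os.path.basename(path), fh.read(), "image/jpeg")
    body, content_type = build_multipart(files)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        result = json.loads(resp.read().decode("utf-8"))
    missing = [k for k in ("kyc_passed", "final_decision") if k not in result]
    if missing:
        raise ValueError(f"missing keys in response: {missing}")
    return result
```

Any client of the service has to build an equivalent multipart body with the
same two field names.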


━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PHASE 7 — SYSTEMD CONFIGURATION (START SERVICE AT BOOT)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

STEP 7.1 — Create the systemd service
───────────────────────────────────────

    sudo nano /etc/systemd/system/kyc-service.service

Write the following content to the systemd unit file (root privileges are required):

-------- DÉBUT DU FICHIER kyc-service.service --------
[Unit]
Description=KYC Microservice — PaddleOCR-VL + AuraFace
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=www-data
Group=www-data
WorkingDirectory=/opt/kyc-service
Environment="PATH=/opt/kyc-service/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
Environment="HF_HOME=/opt/kyc-service/models"
ExecStart=/opt/kyc-service/venv/bin/uvicorn app.main:app \
    --host 127.0.0.1 \
    --port 8000 \
    --workers 1 \
    --log-level info
Restart=on-failure
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier=kyc-service

# Memory limit (6 GB max for the AI models)
MemoryMax=6G

[Install]
WantedBy=multi-user.target
-------- FIN DU FICHIER kyc-service.service --------


STEP 7.2 — Enable and start the service
─────────────────────────────────────────

    # Stop any instance started manually in step 6.2 (port 8000 must be free)
    cd /opt/kyc-service && make stop

    sudo chown -R www-data:www-data /opt/kyc-service
    sudo systemctl daemon-reload
    sudo systemctl enable kyc-service
    sudo systemctl start kyc-service

    # Check the status
    sudo systemctl status kyc-service --no-pager

    # Follow the logs in real time (Ctrl+C to quit)
    sudo journalctl -u kyc-service -f

CHECKPOINT ✅ : sudo systemctl status kyc-service must show "active (running)"
CHECKPOINT ✅ : curl http://127.0.0.1:8000/health must return a JSON body containing "status": "ok"
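
For a non-interactive check of the first checkpoint, `systemctl is-active`
is easier to parse than the full status output: it prints the unit state
and exits with 0 only when the unit is active. The sketch below relies on
that behavior; the function name service_is_active and its run parameter
(which lets a test inject a fake runner) are hypothetical.

```python
import subprocess


def service_is_active(unit="kyc-service", run=subprocess.run):
    """Return True when `systemctl is-active <unit>` reports "active"."""
    proc = run(
        ["systemctl", "is-active", unit],
        capture_output=True,
        text=True,
    )
    return proc.returncode == 0 and proc.stdout.strip() == "active"
```

A False result right after `systemctl start` can simply mean the unit is
restarting; the journal (journalctl -u kyc-service) shows the reason.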


━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PHASE 8 — FINAL SUMMARY AND HANDOVER TO LARAVEL
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

By the end of this phase, the following must be operational:

INSTALLED MODELS:
──────────────────
  ✅ PaddleOCR-VL 0.9B  — /opt/kyc-service/models/  (via paddleocr auto-cache)
  ✅ AuraFace-v1         — /opt/kyc-service/models/auraface/
  ✅ HyperFace-10k-LDM  — /opt/kyc-service/models/hyperface/

AVAILABLE ENDPOINTS:
────────────────────
  GET  http://127.0.0.1:8000/health        → model status
  POST http://127.0.0.1:8000/ocr           → document text extraction
  POST http://127.0.0.1:8000/face-match    → face comparison only
  POST http://127.0.0.1:8000/kyc/verify    → full KYC pipeline ← MAIN

FINAL STRUCTURE:
────────────────
  /opt/kyc-service/
  ├── venv/
  ├── models/         (> 2 GB, AI models)
  ├── app/
  │   ├── main.py     (FastAPI)
  │   ├── ocr.py      (PaddleOCR-VL)
  │   ├── face_match.py (AuraFace + HyperFace)
  │   └── utils.py    (helpers)
  ├── scripts/        (model download scripts)
  ├── tests/
  ├── Makefile
  ├── requirements.txt
  └── .env

NEXT STEP → LARAVEL INTEGRATION:
────────────────────────────────
  The microservice listens on http://127.0.0.1:8000
  Laravel will call it through its HTTP client as multipart/form-data
  (Http::attach(...)->post(...))
  Files to create on the Laravel side:
    - app/Services/KycService.php
    - app/Jobs/ProcessKycVerification.php
    - app/Http/Controllers/KycController.php
    - database/migrations/xxxx_create_kyc_verifications_table.php
    - config/kyc.php
    - (add to the Laravel .env) KYC_MICROSERVICE_URL=http://127.0.0.1:8000

  Confirm "MICROSERVICE OK" to start the Laravel phase.

================================================================================
END OF PRD — KYC MODELS INSTALLATION AND CONFIGURATION
================================================================================