# Implementation Spec: Face + Neck Compositing for MORO Virtual Try-On (Local Mode)

**File to modify:** `/mnt/gxo_volume_01/kimono-pipeline/main.py`  
**Python interpreter:** `/mnt/gxo_volume_01/kimono-pipeline/venv/bin/python`  
**Goal:** Replace the current `provider=local` path (face-only swap) with a face+neck composite that blends naturally into the kimono collar.

---

## Background

The current local path calls InsightFace's `inswapper_128`, which replaces only the face region. This leaves a visible seam at the jaw/chin because:
- The user's neck skin tone may differ from the model's
- The hard cutoff at the jaw creates an unnatural boundary

The proposed approach extends the composite mask downward to cover the neck, applies color harmonization so the skin tones match, and fades out at the collar line.

The core compositing needs no new dependencies: it uses the `insightface`, `onnxruntime`, `opencv-python`, and `numpy` packages already installed in the venv. The only optional addition is Pillow, used for EXIF rotation handling (Case 10).

---

## Architecture Overview

```
POST /swap  (provider=local)
        │
        ├─ _decode() both images
        ├─ _biggest() to get the largest face in each image
        │
        ├─ inswapper.get()         ← existing face swap (no change)
        │
        ├─ _build_neck_mask()      ← NEW: face bbox + neck extension mask
        ├─ _color_harmonize()      ← NEW: LAB color matching at boundary
        ├─ _composite()            ← NEW: alpha blend with the mask
        │
        └─ _enhance()              ← existing GFPGAN (no change)
```

---

## Step-by-Step Implementation

### Step 1 — `_build_neck_mask()`

This function generates an alpha mask that covers the face bounding box plus a tapered extension downward to the collar.

```python
def _build_neck_mask(
    img_shape: tuple,
    bbox: np.ndarray,
    neck_ratio: float = 0.65,
    neck_width_ratio: float = 0.42,
    blur_ksize: int = 61,
) -> np.ndarray:
    """
    Returns a uint8 mask [0-255] the same H×W as img_shape.

    bbox:             face bounding box [x1, y1, x2, y2] from InsightFace
    neck_ratio:       how far below bbox bottom to extend, as a fraction of
                      face height. 0.65 = 65% of face height.
    neck_width_ratio: width of the neck at the collar relative to face width.
                      0.42 is a good default (neck is narrower than face).
    blur_ksize:       Gaussian blur kernel for soft edges (must be odd).
    """
    h, w = img_shape[:2]
    x1, y1, x2, y2 = (int(v) for v in bbox)

    face_h = y2 - y1
    face_w = x2 - x1
    cx = (x1 + x2) // 2

    mask = np.zeros((h, w), dtype=np.uint8)

    # --- Face ellipse, sized from the bbox ---
    pad = 10  # small outward padding to avoid hard edges at ears/hairline
    cv2.ellipse(
        mask,
        (cx, (y1 + y2) // 2),
        (face_w // 2 + pad, face_h // 2 + pad),
        0, 0, 360, 255, -1,
    )

    # --- Neck trapezoid: wider at chin, narrower at collar ---
    chin_half_w = int(face_w * 0.36)
    collar_half_w = max(int(face_w * neck_width_ratio / 2), 4)  # keep a few pixels even for tiny faces
    neck_bottom = min(y2 + int(face_h * neck_ratio), h - 1)

    pts = np.array([
        [cx - chin_half_w, y2],
        [cx + chin_half_w, y2],
        [cx + collar_half_w, neck_bottom],
        [cx - collar_half_w, neck_bottom],
    ], dtype=np.int32)
    cv2.fillPoly(mask, [pts], 255)

    # --- Small ellipse at collar line for a rounded fade-out bottom ---
    collar_oval_h = max(int(face_h * 0.08), 6)
    cv2.ellipse(
        mask,
        (cx, neck_bottom),
        (collar_half_w, collar_oval_h),
        0, 0, 360, 255, -1,
    )

    # --- Soft Gaussian edge so there is no hard cutoff ---
    if blur_ksize % 2 == 0:
        blur_ksize += 1  # cv2.GaussianBlur requires an odd kernel size
    mask = cv2.GaussianBlur(mask, (blur_ksize, blur_ksize), 0)
    return mask
```

**Key parameters (can be tuned without touching logic):**

| Parameter | Default | Effect when increased |
|-----------|---------|----------------------|
| `neck_ratio` | 0.65 | Extends neck coverage lower |
| `neck_width_ratio` | 0.42 | Widens the neck area |
| `blur_ksize` | 61 | Softer, wider blend boundary |
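
To see how the defaults translate into pixels, the sketch below reproduces only the trapezoid arithmetic from `_build_neck_mask()` in plain Python, with no OpenCV drawing. The helper name `neck_geometry` and the sample bounding box are illustrative assumptions and are not part of `main.py`.

```python
from typing import NamedTuple, Sequence


class NeckGeometry(NamedTuple):
    chin_half_w: int    # half width of the trapezoid at the chin
    collar_half_w: int  # half width of the trapezoid at the collar
    neck_bottom: int    # last row covered by the neck extension


def neck_geometry(
    bbox: Sequence[float],
    img_h: int,
    neck_ratio: float = 0.65,
    neck_width_ratio: float = 0.42,
) -> NeckGeometry:
    """Hypothetical helper: the same arithmetic as the trapezoid in Step 1."""
    x1, y1, x2, y2 = (int(v) for v in bbox)
    face_h = y2 - y1
    face_w = x2 - x1
    chin_half_w = int(face_w * 0.36)
    collar_half_w = int(face_w * neck_width_ratio / 2)
    # Clamp the extension so it never leaves the image.
    neck_bottom = min(y2 + int(face_h * neck_ratio), img_h - 1)
    return NeckGeometry(chin_half_w, collar_half_w, neck_bottom)


# Assumed example: a 200x260 px face in a 1000 px tall image ...
roomy = neck_geometry([200, 100, 400, 360], img_h=1000)
# ... and the same face in a 500 px tall image, where the clamp applies.
tight = neck_geometry([200, 100, 400, 360], img_h=500)
print(roomy)
print(tight)
```

For this sample face the neck region is 144 px wide at the chin and 84 px wide at the collar. In the shorter image the extension stops at the last image row instead of running past it.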

---

### Step 2 — `_color_harmonize()`

Adjusts the swapped image's color statistics inside the mask (face plus neck) toward the model's tones in the same region, reducing visible skin tone mismatch.

Uses the **Reinhard transfer method** in LAB color space (channel-by-channel mean/std matching). A blend factor of `0.35` applies only 35% of the full correction, which limits over-correction when the user's and the model's skin tones differ greatly.

```python
def _color_harmonize(
    swapped: np.ndarray,
    original_target: np.ndarray,
    mask: np.ndarray,
    blend: float = 0.35,
) -> np.ndarray:
    """
    Shift the color of `swapped` toward `original_target` in the masked
    region using LAB Reinhard transfer.

    swapped:         result of inswapper.get() (BGR uint8)
    original_target: the original kimono model image (BGR uint8)
    mask:            uint8 mask [0-255] from _build_neck_mask()
    blend:           0.0 = no correction, 1.0 = full target color
    """
    if blend <= 0.0:
        return swapped

    mask_bool = mask > 64  # pixels meaningfully inside the mask

    # Need at least some pixels to measure from
    if mask_bool.sum() < 100:
        return swapped

    src_lab = cv2.cvtColor(swapped, cv2.COLOR_BGR2LAB).astype(np.float32)
    tgt_lab = cv2.cvtColor(original_target, cv2.COLOR_BGR2LAB).astype(np.float32)
    result_lab = src_lab.copy()

    for ch in range(3):
        src_vals = src_lab[:, :, ch][mask_bool]
        tgt_vals = tgt_lab[:, :, ch][mask_bool]

        src_std = src_vals.std()
        if src_std < 1e-6:
            continue  # flat channel — nothing to transfer

        src_mean = src_vals.mean()
        tgt_mean = tgt_vals.mean()
        tgt_std  = tgt_vals.std()

        # Scale and shift the full channel, then lerp between source and transferred values
        scaled = (src_lab[:, :, ch] - src_mean) * (tgt_std / max(src_std, 1e-6)) + tgt_mean
        result_lab[:, :, ch] = np.clip(
            src_lab[:, :, ch] * (1 - blend) + scaled * blend,
            0, 255,
        )

    return cv2.cvtColor(result_lab.astype(np.uint8), cv2.COLOR_LAB2BGR)
```
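
To make the effect of `blend` concrete, here is a minimal pure-Python sketch of the same per-channel arithmetic applied to a handful of values. The function name `reinhard_channel` and the sample numbers are assumptions for illustration. It uses the population standard deviation, matching NumPy's default `.std()`.

```python
from statistics import fmean, pstdev
from typing import List


def reinhard_channel(src: List[float], tgt: List[float], blend: float = 0.35) -> List[float]:
    """Hypothetical single-channel version of the transfer in _color_harmonize()."""
    src_mean, tgt_mean = fmean(src), fmean(tgt)
    src_std, tgt_std = pstdev(src), pstdev(tgt)
    if src_std < 1e-6:
        return list(src)  # flat channel: nothing to transfer
    out = []
    for v in src:
        # Full Reinhard transfer: match the mean and the spread of the target.
        scaled = (v - src_mean) * (tgt_std / src_std) + tgt_mean
        # Apply only a fraction of the correction, then clamp to the 8-bit range.
        mixed = v * (1 - blend) + scaled * blend
        out.append(min(max(mixed, 0.0), 255.0))
    return out


# Assumed sample: source values around 110, target values around 150.
partial = reinhard_channel([100.0, 110.0, 120.0], [140.0, 150.0, 160.0], blend=0.35)
full = reinhard_channel([100.0, 110.0, 120.0], [140.0, 150.0, 160.0], blend=1.0)
print(partial, full)
```

With `blend=0.35` the source mean moves 35% of the way from 110 toward 150, landing near 124. With `blend=1.0` it takes on the target statistics entirely.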

---

### Step 3 — `_composite()`

Blends the harmonized swapped image onto the original target using the neck mask as the alpha channel.

```python
def _composite(
    swapped: np.ndarray,
    target: np.ndarray,
    mask: np.ndarray,
) -> np.ndarray:
    """
    Alpha-blend swapped over target using mask.
    mask: uint8 [0-255]
    """
    alpha = mask.astype(np.float32)[:, :, np.newaxis] / 255.0
    blended = (swapped.astype(np.float32) * alpha
               + target.astype(np.float32) * (1.0 - alpha))
    return np.clip(blended, 0, 255).astype(np.uint8)
```
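
The mask acts as a per-pixel weight. The sketch below applies the same formula to a single BGR pixel in plain Python so the endpoints are easy to check. `blend_pixel` and the sample colors are hypothetical, and truncation toward zero stands in for the `astype(np.uint8)` cast.

```python
from typing import Tuple

Pixel = Tuple[int, int, int]


def blend_pixel(swapped: Pixel, target: Pixel, mask_value: int) -> Pixel:
    """Hypothetical one-pixel version of _composite(); mask_value is 0..255."""
    alpha = mask_value / 255.0
    return tuple(
        int(min(max(s * alpha + t * (1.0 - alpha), 0.0), 255.0))
        for s, t in zip(swapped, target)
    )


swapped_px = (200, 160, 100)  # assumed sample colors
target_px = (100, 140, 200)

inside = blend_pixel(swapped_px, target_px, 255)   # fully inside the mask
outside = blend_pixel(swapped_px, target_px, 0)    # fully outside the mask
edge = blend_pixel(swapped_px, target_px, 128)     # on the blurred edge
print(inside, outside, edge)
```

A mask value of 255 returns the swapped pixel, 0 returns the target pixel, and values in the blurred band give a mix of the two.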

---

### Step 4 — Wire it into the `/swap` endpoint

Replace the `provider == "local"` block in the `swap()` endpoint.

**Before (current code):**
```python
fa, swapper, gfpgan = _models()
src = _decode(source_bytes)
tgt = _decode(target_bytes)
sf = _biggest(fa, src)
tf = _biggest(fa, tgt)
if sf is None:
    raise HTTPException(status_code=422, detail="no face detected in source (user) image")
if tf is None:
    raise HTTPException(status_code=422, detail="no face detected in target (model) image")
result = swapper.get(tgt.copy(), tf, sf, paste_back=True)
result = _enhance(gfpgan, result, _biggest(fa, result) or tf)
ok, buf = cv2.imencode(".jpg", result, [cv2.IMWRITE_JPEG_QUALITY, 95])
```

**After (new code):**
```python
fa, swapper, gfpgan = _models()
src = _decode(source_bytes)
tgt = _decode(target_bytes)
sf = _biggest(fa, src)
tf = _biggest(fa, tgt)
if sf is None:
    raise HTTPException(status_code=422, detail="no face detected in source (user) image")
if tf is None:
    raise HTTPException(status_code=422, detail="no face detected in target (model) image")

# Face swap (unchanged)
swapped = swapper.get(tgt.copy(), tf, sf, paste_back=True)

# Build neck mask using TARGET face position
neck_mask = _build_neck_mask(tgt.shape, tf.bbox)

# Color-harmonize the swapped result toward the model's body tones
swapped_harmonized = _color_harmonize(swapped, tgt, neck_mask)

# Composite: swapped face+neck over the original kimono image
result = _composite(swapped_harmonized, tgt, neck_mask)

# GFPGAN enhancement (unchanged)
result = _enhance(gfpgan, result, _biggest(fa, result) or tf)

ok, buf = cv2.imencode(".jpg", result, [cv2.IMWRITE_JPEG_QUALITY, 95])
```

---

## Irregular Cases

### Case 1 — Face not detected in user (source) image

**Cause:** Blurry photo, extreme angle, sunglasses, face too small.

**Handling:** Already handled — `sf is None` raises HTTP 422.

**Add a minimum face size check after the `None` check:**
```python
if sf is None:
    raise HTTPException(status_code=422, detail="no face detected in source (user) image")

# Reject faces that are too small for a reliable swap
FACE_MIN_PX = 80
src_face_w = sf.bbox[2] - sf.bbox[0]
src_face_h = sf.bbox[3] - sf.bbox[1]
if src_face_w < FACE_MIN_PX or src_face_h < FACE_MIN_PX:
    raise HTTPException(
        status_code=422,
        detail=f"source face is too small ({int(src_face_w)}×{int(src_face_h)}px). "
               "Please upload a closer photo."
    )
```
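
If the size rule needs unit testing without loading any models, it can be pulled into a small pure function. This is only a sketch: `face_too_small` is a hypothetical helper name, and it assumes the bbox is ordered `[x1, y1, x2, y2]` as elsewhere in this spec.

```python
from typing import Sequence

FACE_MIN_PX = 80  # same threshold as in the check above


def face_too_small(bbox: Sequence[float], min_px: int = FACE_MIN_PX) -> bool:
    """Return True when either side of the face bbox is below min_px."""
    width = bbox[2] - bbox[0]
    height = bbox[3] - bbox[1]
    return width < min_px or height < min_px


# Assumed sample boxes: one 60x90 px face and one 120x150 px face.
print(face_too_small([10, 10, 70, 100]))
print(face_too_small([10, 10, 130, 160]))
```

The endpoint could then raise the 422 when `face_too_small(sf.bbox)` is true, keeping the error message from the snippet above.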

---

### Case 2 — Neck mask extends outside the image boundary

**Cause:** Model image is a tight crop (face near bottom edge).

**Handling:** The vertical extent is already clamped in `_build_neck_mask()`: `neck_bottom` is computed as `min(y2 + int(face_h * neck_ratio), h - 1)`.

The `x` coordinates are not clamped, so clip them as well:
```python
# In _build_neck_mask(), replace the pts assignment:
img_w = w  # already available in the function
pts = np.array([
    [max(cx - chin_half_w, 0), y2],
    [min(cx + chin_half_w, img_w - 1), y2],
    [min(cx + collar_half_w, img_w - 1), neck_bottom],
    [max(cx - collar_half_w, 0), neck_bottom],
], dtype=np.int32)
```

---

### Case 3 — User has very long hair covering the neck area

**Cause:** The neck mask will composite the user's hair-covered area instead of clean neck skin.

**Behavior:** The hair from the **model** image will show in the neck region (because composite blends back to `target` where mask is low), creating a hair-over-hair double layer effect on the edges.

**Mitigation — reduce `neck_ratio` for a shallower extension:**
```python
# In the swap endpoint, measure hair risk by comparing face height to image height
tgt_face_h = tf.bbox[3] - tf.bbox[1]
img_h = tgt.shape[0]
face_fraction = tgt_face_h / img_h

# Tighter crop → face is large → likely more neck visible → extend more
neck_ratio = 0.65 if face_fraction > 0.25 else 0.45

neck_mask = _build_neck_mask(tgt.shape, tf.bbox, neck_ratio=neck_ratio)
```

For best results in production, instruct users to upload a front-facing photo with hair pulled back or clipped behind the shoulders.

---

### Case 4 — Extreme face angle (profile or heavy tilt)

**Cause:** User photo is not frontal (yaw > ~30°).

**Detection via InsightFace pose attribute:**
```python
# InsightFace Face objects expose .pose as (pitch, yaw, roll) in degrees.
# It is only populated when the landmark_3d_68 module is loaded.
MAX_YAW_DEG = 35

if hasattr(sf, 'pose') and sf.pose is not None:
    yaw = abs(float(sf.pose[1]))
    if yaw > MAX_YAW_DEG:
        raise HTTPException(
            status_code=422,
            detail=f"Face angle too extreme ({int(yaw)}° yaw). "
                   "Please use a front-facing photo."
        )
```

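If `sf.pose` is `None`, skip this check silently. InsightFace fills `pose` only when the `landmark_3d_68` module is loaded, so with an `allowed_modules` list that omits it the check never fires.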
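
The decision itself can be isolated from InsightFace and tested with plain tuples. The sketch below assumes the `(pitch, yaw, roll)` ordering described above. `yaw_too_extreme` is a hypothetical helper name.

```python
from typing import Optional, Sequence

MAX_YAW_DEG = 35


def yaw_too_extreme(
    pose: Optional[Sequence[float]],
    max_yaw_deg: float = MAX_YAW_DEG,
) -> bool:
    """Return True when |yaw| exceeds the limit; a missing pose never rejects."""
    if pose is None:
        return False  # pose unavailable: skip the check silently
    return abs(float(pose[1])) > max_yaw_deg


print(yaw_too_extreme(None))               # pose unavailable
print(yaw_too_extreme((2.0, -40.0, 1.0)))  # head turned 40 degrees to one side
print(yaw_too_extreme((2.0, 35.0, 1.0)))   # exactly at the limit
```

Using the absolute value means left and right turns are treated the same, and a yaw exactly at the limit is still accepted.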

---

### Case 5 — Very high kimono collar (little or no neck visible)

**Cause:** Some kimono styles have a collar that comes up to the chin. Extending the neck mask downward hits the collar fabric immediately.

**Behavior:** Without special handling, the neck mask would overlap the collar, and the color-harmonized pixels would be blended over the collar fabric, tinting it.

**Detection and handling:**
```python
# Estimate collar position from the target face bbox
# If the face bottom is very close to the image bottom, collar is probably high
tgt_face_bottom = tf.bbox[3]
img_h = tgt.shape[0]

# Face bottom in the upper 55% of the image → likely wide shot → normal
# Face bottom below 70% → tight shot → reduce neck extension
face_bottom_fraction = tgt_face_bottom / img_h

if face_bottom_fraction > 0.70:
    # Tight crop, collar likely close — be conservative
    neck_ratio = 0.30
else:
    neck_ratio = 0.65

neck_mask = _build_neck_mask(tgt.shape, tf.bbox, neck_ratio=neck_ratio)
```
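
Case 3 and Case 5 both choose a `neck_ratio`, so the endpoint needs a single rule that decides between them. One possible ordering, mirroring the complete listing later in this document, checks the tight-crop condition first. `choose_neck_ratio` is a hypothetical helper name, and the thresholds are the ones quoted in this spec.

```python
from typing import Sequence


def choose_neck_ratio(bbox: Sequence[float], img_h: int) -> float:
    """Pick neck_ratio from the target face geometry (sketch, not tuned)."""
    face_fraction = (bbox[3] - bbox[1]) / img_h
    bottom_fraction = bbox[3] / img_h
    if bottom_fraction > 0.70:
        return 0.30  # face sits low in the frame: collar is probably close
    if face_fraction > 0.25:
        return 0.65  # face fills the frame: more neck is likely visible
    return 0.50      # middle ground for everything else


# Assumed sample boxes in a 1000 px tall target image.
print(choose_neck_ratio([300, 600, 500, 800], 1000))
print(choose_neck_ratio([300, 100, 500, 400], 1000))
print(choose_neck_ratio([300, 100, 500, 300], 1000))
```

Checking the bottom fraction first makes the conservative value win whenever both conditions hold, which seems the safer failure mode for high collars.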

---

### Case 6 — Multiple faces in the model or user image

**Cause:** A product group photo or a user selfie with friends.

**Behavior:** `_biggest()` already picks the largest face. This is the correct behavior for both images.

**No code change needed.** Document this assumption in a comment:
```python
# _biggest() selects the largest face by bounding-box area.
# For model images with multiple models, the central/largest figure is used.
# For user images with friends in the background, the foreground selfie face is used.
sf = _biggest(fa, src)
tf = _biggest(fa, tgt)
```

---

### Case 7 — Extreme skin tone difference between user and model

**Cause:** User has significantly darker or lighter skin than the kimono model.

**Behavior without handling:** The color harmonization with `blend=0.35` shifts the face toward the model tone, which may look over-corrected (unnaturally lightened/darkened).

**Handling — detect extreme delta and cap correction:**
```python
def _color_harmonize(swapped, original_target, mask, blend=0.35):
    ...
    for ch in range(3):
        ...
        # Cap correction if the delta is extreme (more than 40 LAB units)
        mean_delta = abs(tgt_mean - src_mean)
        effective_blend = blend if mean_delta < 40 else blend * 0.5
        result_lab[:, :, ch] = np.clip(
            src_lab[:, :, ch] * (1 - effective_blend) + scaled * effective_blend,
            0, 255,
        )
```
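
The capping rule is a one-line decision, shown here as a standalone sketch so the threshold behavior is explicit. `effective_blend` and its `max_delta` parameter are hypothetical names. Note that the means come from OpenCV's 8-bit LAB representation, where L is scaled to 0..255, so a delta of 40 on the L channel corresponds to roughly 16 units of L*.

```python
import math


def effective_blend(
    src_mean: float,
    tgt_mean: float,
    blend: float = 0.35,
    max_delta: float = 40.0,
) -> float:
    """Halve the blend factor when the channel means are far apart."""
    mean_delta = abs(tgt_mean - src_mean)
    return blend if mean_delta < max_delta else blend * 0.5


near = effective_blend(120.0, 130.0)  # small delta: full blend factor
far = effective_blend(100.0, 150.0)   # large delta: blend factor halved
print(near, far, math.isclose(far, near / 2))
```

Because the cap is evaluated per channel, a large lightness gap can be damped while the a and b channels still receive the full `blend`.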

---

### Case 8 — Source or target image is grayscale or has alpha channel (RGBA/PNG)

**Cause:** User uploads a PNG with transparency, or a black-and-white photo.

**Handling in `_decode()`:**
```python
def _decode(data: bytes) -> np.ndarray:
    img = cv2.imdecode(np.frombuffer(data, np.uint8), cv2.IMREAD_COLOR)
    if img is None:
        raise HTTPException(status_code=400, detail="invalid image")
    # IMREAD_COLOR always converts to 3-channel BGR — handles RGBA and grayscale.
    # However, explicitly check shape just in case:
    if img.ndim == 2:
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    elif img.shape[2] == 4:
        img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)
    return img
```

---

### Case 9 — GFPGAN model file missing (no enhancement)

**Cause:** `gfpgan_1.4.onnx` not found on the server.

**Behavior:** Already handled — `gfpgan` is `None` and `_enhance()` returns the image unchanged. No crash.

**No functional change needed.** Optionally log a warning at startup when the model file is missing:
```python
@lru_cache(maxsize=1)
def _models():
    ...
    if not os.path.exists(GFPGAN_PATH):
        import logging
        logging.getLogger("kimono-pipeline").warning(
            "GFPGAN model not found at %s — enhancement disabled", GFPGAN_PATH
        )
    gfpgan = ort.InferenceSession(GFPGAN_PATH, ...) if os.path.exists(GFPGAN_PATH) else None
    ...
```

---

### Case 10 — Image with EXIF rotation (phone camera photo)

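**Cause:** Mobile camera photos store rotation in the EXIF orientation tag. Whether OpenCV's `imdecode` applies that tag depends on the OpenCV version and read flags; when it does not, the image loads sideways. Normalizing the orientation with Pillow first makes the behavior explicit.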

**Handling in `_decode()`:**
```python
def _decode(data: bytes) -> np.ndarray:
    # Apply EXIF rotation before OpenCV decodes
    from PIL import Image, ImageOps
    import io as _io
    try:
        pil_img = ImageOps.exif_transpose(Image.open(_io.BytesIO(data)))
        buf = _io.BytesIO()
        pil_img.convert("RGB").save(buf, format="JPEG", quality=95)
        data = buf.getvalue()
    except Exception:
        pass  # not a PIL-readable format; fall through to OpenCV

    img = cv2.imdecode(np.frombuffer(data, np.uint8), cv2.IMREAD_COLOR)
    if img is None:
        raise HTTPException(status_code=400, detail="invalid image")
    return img
```

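**Check whether Pillow is installed in the venv, and install it only if it is missing:**
```bash
/mnt/gxo_volume_01/kimono-pipeline/venv/bin/pip show Pillow \
  || /mnt/gxo_volume_01/kimono-pipeline/venv/bin/pip install Pillow
```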

---

## Complete Modified `main.py`

Below is the full file with all changes integrated. Replace the existing file entirely.

```python
"""
MORO Kimono Face-Swap Service — face + neck composite (local mode).

POST /swap  (multipart: source=<user face>, target=<kimono model image>,
             provider=local|chatgpt)

Local mode:
  InsightFace inswapper_128  — face swap
  _build_neck_mask()         — extend composite region to collar
  _color_harmonize()         — LAB color transfer for skin tone blend
  _composite()               — alpha blend face+neck over target
  GFPGAN 1.4                 — face / boundary sharpening

ChatGPT mode:
  OpenAI gpt-image-2 edit API  (unchanged from original)
"""
import base64
import io as _io
import logging
import os
from functools import lru_cache

import cv2
import numpy as np
from fastapi import FastAPI, File, Form, HTTPException, UploadFile
from fastapi.responses import Response

logger = logging.getLogger("kimono-pipeline")

# ---------------------------------------------------------------------------
# Model paths
# ---------------------------------------------------------------------------

INSWAPPER_CANDIDATES = (
    os.getenv("INSWAPPER_MODEL_PATH", ""),
    os.path.expanduser("~/.insightface/models/inswapper_128.onnx"),
    "/var/www/.insightface/models/inswapper_128.onnx",
)
GFPGAN_PATH = os.getenv(
    "GFPGAN_MODEL_PATH", "data/private/model_assets/restore/gfpgan_1.4.onnx"
)


def _inswapper_path() -> str:
    for p in INSWAPPER_CANDIDATES:
        if p and os.path.exists(p):
            return p
    raise RuntimeError("inswapper_128.onnx not found; set INSWAPPER_MODEL_PATH")


@lru_cache(maxsize=1)
def _models():
    import insightface
    import onnxruntime as ort
    from insightface.app import FaceAnalysis

    fa = FaceAnalysis(
        name="buffalo_l",
        # landmark_3d_68 populates face.pose, which the yaw check (Case 4) relies on
        allowed_modules=["detection", "landmark_2d_106", "landmark_3d_68", "recognition"],
        providers=["CPUExecutionProvider"],
    )
    fa.prepare(ctx_id=0, det_size=(640, 640))
    swapper = insightface.model_zoo.get_model(
        _inswapper_path(), providers=["CPUExecutionProvider"]
    )
    if not os.path.exists(GFPGAN_PATH):
        logger.warning("GFPGAN model not found at %s — enhancement disabled", GFPGAN_PATH)
    gfpgan = (
        ort.InferenceSession(GFPGAN_PATH, providers=["CPUExecutionProvider"])
        if os.path.exists(GFPGAN_PATH)
        else None
    )
    return fa, swapper, gfpgan


# ---------------------------------------------------------------------------
# Image helpers
# ---------------------------------------------------------------------------

def _decode(data: bytes) -> np.ndarray:
    """Decode image bytes to BGR ndarray, handling EXIF rotation and alpha."""
    try:
        from PIL import Image, ImageOps
        pil_img = ImageOps.exif_transpose(Image.open(_io.BytesIO(data)))
        buf = _io.BytesIO()
        pil_img.convert("RGB").save(buf, format="JPEG", quality=95)
        data = buf.getvalue()
    except Exception:
        pass

    img = cv2.imdecode(np.frombuffer(data, np.uint8), cv2.IMREAD_COLOR)
    if img is None:
        raise HTTPException(status_code=400, detail="invalid image")

    if img.ndim == 2:
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    elif img.shape[2] == 4:
        img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)
    return img


def _biggest(fa, img):
    """Return the largest face detected in img, or None."""
    faces = fa.get(img)
    if not faces:
        return None
    return max(faces, key=lambda f: (f.bbox[2] - f.bbox[0]) * (f.bbox[3] - f.bbox[1]))


# ---------------------------------------------------------------------------
# GFPGAN enhancement (unchanged)
# ---------------------------------------------------------------------------

def _enhance(gfpgan, img, face):
    if gfpgan is None:
        return img
    from insightface.utils import face_align

    M = face_align.estimate_norm(face.kps, 512)
    aligned = cv2.warpAffine(img, M, (512, 512), borderMode=cv2.BORDER_REPLICATE)
    x = cv2.cvtColor(aligned, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    x = ((x - 0.5) / 0.5).transpose(2, 0, 1)[None]
    out = gfpgan.run(None, {gfpgan.get_inputs()[0].name: x})[0][0]
    out = np.clip((out.transpose(1, 2, 0) + 1) / 2, 0, 1) * 255
    restored = cv2.cvtColor(out.astype(np.uint8), cv2.COLOR_RGB2BGR)
    IM = cv2.invertAffineTransform(M)
    h, w = img.shape[:2]
    back = cv2.warpAffine(restored, IM, (w, h), borderMode=cv2.BORDER_TRANSPARENT)
    mask = np.zeros((512, 512), np.uint8)
    cv2.ellipse(mask, (256, 256), (190, 235), 0, 0, 360, 255, -1)
    mask = cv2.GaussianBlur(mask, (31, 31), 0)
    mb = cv2.warpAffine(mask, IM, (w, h))[..., None].astype(np.float32) / 255.0
    return (img * (1 - mb) + back * mb).astype(np.uint8)


# ---------------------------------------------------------------------------
# Face + neck compositing  (NEW)
# ---------------------------------------------------------------------------

def _build_neck_mask(
    img_shape: tuple,
    bbox: np.ndarray,
    neck_ratio: float = 0.65,
    neck_width_ratio: float = 0.42,
    blur_ksize: int = 61,
) -> np.ndarray:
    """
    uint8 mask [0-255] covering face bbox + tapered neck extension.

    neck_ratio:       extension below bbox as fraction of face height
    neck_width_ratio: collar width relative to face width
    blur_ksize:       Gaussian kernel size for soft edges (must be odd)
    """
    h, w = img_shape[:2]
    x1, y1, x2, y2 = (int(v) for v in bbox)

    face_h = y2 - y1
    face_w = x2 - x1
    cx = (x1 + x2) // 2

    mask = np.zeros((h, w), dtype=np.uint8)

    # Face ellipse
    pad = 10
    cv2.ellipse(
        mask,
        (cx, (y1 + y2) // 2),
        (face_w // 2 + pad, face_h // 2 + pad),
        0, 0, 360, 255, -1,
    )

    # Neck trapezoid
    chin_half_w  = int(face_w * 0.36)
    collar_half_w = max(int(face_w * neck_width_ratio / 2), 4)
    neck_bottom  = min(y2 + int(face_h * neck_ratio), h - 1)

    pts = np.array([
        [max(cx - chin_half_w, 0),       y2],
        [min(cx + chin_half_w, w - 1),   y2],
        [min(cx + collar_half_w, w - 1), neck_bottom],
        [max(cx - collar_half_w, 0),     neck_bottom],
    ], dtype=np.int32)
    cv2.fillPoly(mask, [pts], 255)

    # Rounded collar fade
    collar_oval_h = max(int(face_h * 0.08), 6)
    cv2.ellipse(mask, (cx, neck_bottom), (collar_half_w, collar_oval_h), 0, 0, 360, 255, -1)

    mask = cv2.GaussianBlur(mask, (blur_ksize, blur_ksize), 0)
    return mask


def _color_harmonize(
    swapped: np.ndarray,
    target: np.ndarray,
    mask: np.ndarray,
    blend: float = 0.35,
) -> np.ndarray:
    """
    Shift swapped image color toward target in the mask region.
    Uses Reinhard LAB transfer with a blend factor so it never over-corrects.
    """
    if blend <= 0.0:
        return swapped

    mask_bool = mask > 64
    if mask_bool.sum() < 100:
        return swapped

    src_lab = cv2.cvtColor(swapped, cv2.COLOR_BGR2LAB).astype(np.float32)
    tgt_lab = cv2.cvtColor(target,  cv2.COLOR_BGR2LAB).astype(np.float32)
    result  = src_lab.copy()

    for ch in range(3):
        src_vals = src_lab[:, :, ch][mask_bool]
        tgt_vals = tgt_lab[:, :, ch][mask_bool]

        src_std = src_vals.std()
        if src_std < 1e-6:
            continue

        src_mean = src_vals.mean()
        tgt_mean = tgt_vals.mean()
        tgt_std  = tgt_vals.std()

        # Cap blend when skin tone delta is extreme (>40 LAB units)
        mean_delta    = abs(tgt_mean - src_mean)
        eff_blend     = blend if mean_delta < 40 else blend * 0.5

        scaled = (src_lab[:, :, ch] - src_mean) * (tgt_std / max(src_std, 1e-6)) + tgt_mean
        result[:, :, ch] = np.clip(
            src_lab[:, :, ch] * (1 - eff_blend) + scaled * eff_blend,
            0, 255,
        )

    return cv2.cvtColor(result.astype(np.uint8), cv2.COLOR_LAB2BGR)


def _composite(swapped: np.ndarray, target: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Alpha-blend swapped over target using mask [0-255]."""
    alpha   = mask.astype(np.float32)[:, :, np.newaxis] / 255.0
    blended = swapped.astype(np.float32) * alpha + target.astype(np.float32) * (1.0 - alpha)
    return np.clip(blended, 0, 255).astype(np.uint8)


# ---------------------------------------------------------------------------
# ChatGPT provider (unchanged)
# ---------------------------------------------------------------------------

GPT_IMAGE_MODEL  = os.getenv("GPT_IMAGE_MODEL", "gpt-image-2")
GPT_IMAGE_PROMPT = (
    "Replace the face and neck of the woman in the first image with the exact face and identity of "
    "the person in the second image. Keep her eyes, nose, mouth, eyebrows and face shape so the "
    "result is clearly recognizable as the second person. Keep everything else in the first image "
    "identical: the kimono, obi, sleeves, hands, body, pose, hairstyle and the full background. "
    "CRITICAL — make it look natural and seamless: regrade the new face and neck so their skin tone, "
    "color, brightness, contrast, warmth and white balance exactly match the hands, chest and body "
    "of the first image, lit by the same light from the same direction. No color mismatch between "
    "face and body, no brightness difference, no visible seam or hard edge at the jaw, ears or "
    "neckline; blend the transition smoothly. Apply one consistent color grade and lighting across "
    "the whole photo so the person looks like a single real photograph. Photorealistic."
)


def _gpt_image_swap(model_bytes: bytes, user_bytes: bytes) -> bytes:
    import requests

    key = os.getenv("OPENAI_API_KEY") or os.getenv("CHATGPT_API_KEY") or ""
    if not key:
        raise HTTPException(
            status_code=500,
            detail="OPENAI_API_KEY/CHATGPT_API_KEY not set for chatgpt provider",
        )
    files = [
        ("image[]", ("model.jpg", model_bytes, "image/jpeg")),
        ("image[]", ("user.jpg",  user_bytes,  "image/jpeg")),
    ]
    data = {
        "model":   GPT_IMAGE_MODEL,
        "prompt":  GPT_IMAGE_PROMPT,
        "size":    "1024x1536",
        "quality": "high",
    }
    resp = requests.post(
        "https://api.openai.com/v1/images/edits",
        headers={"Authorization": "Bearer " + key},
        files=files,
        data=data,
        timeout=300,
    )
    if resp.status_code != 200:
        raise HTTPException(status_code=502, detail="gpt-image error: " + resp.text[:300])
    return base64.b64decode(resp.json()["data"][0]["b64_json"])


# ---------------------------------------------------------------------------
# FastAPI app
# ---------------------------------------------------------------------------

FACE_MIN_PX = 80
MAX_YAW_DEG = 35

app = FastAPI(title="MORO Kimono Face-Swap Service", version="0.2.0")


@app.get("/health")
def health():
    return {"status": "ok"}


@app.post("/swap")
async def swap(
    source: UploadFile = File(...),
    target: UploadFile = File(...),
    provider: str = Form("local"),
):
    source_bytes = await source.read()
    target_bytes = await target.read()

    if provider == "chatgpt":
        return Response(
            content=_gpt_image_swap(target_bytes, source_bytes),
            media_type="image/jpeg",
        )

    # --- Local: face + neck composite ---
    fa, swapper, gfpgan = _models()
    src = _decode(source_bytes)
    tgt = _decode(target_bytes)

    # _biggest() picks the largest face; handles multi-face images correctly
    sf = _biggest(fa, src)
    tf = _biggest(fa, tgt)

    if sf is None:
        raise HTTPException(status_code=422, detail="no face detected in source (user) image")
    if tf is None:
        raise HTTPException(status_code=422, detail="no face detected in target (model) image")

    # Case 1: face too small for reliable swap
    src_face_w = sf.bbox[2] - sf.bbox[0]
    src_face_h = sf.bbox[3] - sf.bbox[1]
    if src_face_w < FACE_MIN_PX or src_face_h < FACE_MIN_PX:
        raise HTTPException(
            status_code=422,
            detail=f"source face is too small ({int(src_face_w)}×{int(src_face_h)}px). "
                   "Please upload a closer photo.",
        )

    # Case 4: extreme face angle
    if hasattr(sf, "pose") and sf.pose is not None:
        yaw = abs(float(sf.pose[1]))
        if yaw > MAX_YAW_DEG:
            raise HTTPException(
                status_code=422,
                detail=f"Face angle too extreme ({int(yaw)}° yaw). "
                       "Please use a front-facing photo.",
            )

    # Case 5 + Case 3: adapt neck_ratio based on image geometry
    tgt_face_h        = tf.bbox[3] - tf.bbox[1]
    tgt_face_bottom   = tf.bbox[3]
    img_h             = tgt.shape[0]
    face_fraction     = tgt_face_h / img_h
    bottom_fraction   = tgt_face_bottom / img_h

    if bottom_fraction > 0.70:
        neck_ratio = 0.30   # tight crop — collar is close
    elif face_fraction > 0.25:
        neck_ratio = 0.65   # face fills frame — likely more neck visible
    else:
        neck_ratio = 0.50   # default

    # Face swap
    swapped = swapper.get(tgt.copy(), tf, sf, paste_back=True)

    # Build face + neck mask (Case 2 boundary clipping handled inside)
    neck_mask = _build_neck_mask(tgt.shape, tf.bbox, neck_ratio=neck_ratio)

    # Color harmonization
    swapped_harmonized = _color_harmonize(swapped, tgt, neck_mask)

    # Composite
    result = _composite(swapped_harmonized, tgt, neck_mask)

    # GFPGAN enhancement
    result_face = _biggest(fa, result)
    result = _enhance(gfpgan, result, result_face if result_face is not None else tf)

    ok, buf = cv2.imencode(".jpg", result, [cv2.IMWRITE_JPEG_QUALITY, 95])
    if not ok:
        raise HTTPException(status_code=500, detail="encode failed")
    return Response(content=buf.tobytes(), media_type="image/jpeg")
```

---

## Deployment

```bash
# 1. Back up current file
cp /mnt/gxo_volume_01/kimono-pipeline/main.py \
   /mnt/gxo_volume_01/kimono-pipeline/main.py.bak.$(date +%Y%m%d)

# 2. Install Pillow for EXIF handling (Case 10)
/mnt/gxo_volume_01/kimono-pipeline/venv/bin/pip install Pillow

# 3. Write the new main.py (content above)

# 4. Restart the service with whichever manager runs it
#    (supervisord / systemd / pm2; find the process with: ps aux | grep kimono)
sudo supervisorctl restart kimono-pipeline    # if managed by supervisord
# sudo systemctl restart kimono-pipeline      # if managed by systemd
```

---

## Quick Smoke Test

```bash
# Health check
curl http://127.0.0.1:5005/health
# Expected: {"status":"ok"}

# Swap test (replace paths with real images)
curl -X POST http://127.0.0.1:5005/swap \
  -F "source=@/tmp/user_face.jpg" \
  -F "target=@/tmp/kimono_model.jpg" \
  -F "provider=local" \
  --output /tmp/result.jpg

# Verify output
file /tmp/result.jpg   # should be JPEG image data
```
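
Where the `file` utility is not available, the same check can be approximated in Python by looking at the JPEG start-of-image and end-of-image markers. This is a minimal sketch, not a full validator, and `looks_like_jpeg` is a hypothetical helper name.

```python
def looks_like_jpeg(data: bytes) -> bool:
    """Cheap sanity check: SOI marker at the start, EOI marker at the end."""
    return (
        len(data) >= 4
        and data[:3] == b"\xff\xd8\xff"
        and data[-2:] == b"\xff\xd9"
    )


# Synthetic byte strings standing in for real files.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00" * 16 + b"\xff\xd9"
png_header = b"\x89PNG\r\n\x1a\n"
print(looks_like_jpeg(fake_jpeg), looks_like_jpeg(png_header))
```

For the smoke test, read `/tmp/result.jpg` in binary mode and pass its contents to the function. A `False` result usually means the endpoint returned a JSON error body instead of an image.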

---

## Tuning After First Review

| If you see... | Adjust |
|---------------|--------|
| Neck seam still visible | Increase `neck_ratio` (e.g. 0.80) |
| Neck extends into collar fabric | Decrease `neck_ratio` (e.g. 0.40) |
| Skin color mismatch at boundary | Increase `blend` in `_color_harmonize` (e.g. 0.50) |
| Face color looks unnaturally shifted | Decrease `blend` (e.g. 0.20) |
| Soft halo around face boundary | Decrease `blur_ksize` (e.g. 41) |
| Hard edge visible at face boundary | Increase `blur_ksize` (e.g. 81) |