live-voice-chat/tests/README.md

# Voice-chat tests

The suite has two tiers: fast unit tests and slow component tests.

## Unit tests — fast, GPU-free

```
python -m pytest tests/unit -v
```

These exercise pure logic: config parsing, prompt derivation, LoRA spec
parsing, frame-length fitting, library round-robin selection, the
pipeline's video branch, and ffmpeg mux argument shaping. They do not
touch CUDA, Wan2.2, MuseTalk, or a real ffmpeg binary. Safe to run on
Windows, outside Docker, without any models installed.
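As an illustration of how ffmpeg argument shaping can be checked without a real binary, here is a minimal sketch. `build_mux_command` is a hypothetical helper, not the project's actual muxer API, and the specific flags chosen are an assumption. The point is that the function only returns an argv list, so a test can assert on its shape without ever spawning ffmpeg.

```python
from pathlib import Path


def build_mux_command(video: Path, audio: Path, out: Path) -> list[str]:
    """Hypothetical helper: return an ffmpeg argv that muxes video and audio."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it exists
        "-i", str(video),     # input 0: silent video
        "-i", str(audio),     # input 1: speech audio
        "-c:v", "copy",       # keep the video stream as-is
        "-c:a", "aac",        # encode audio to AAC for the MP4 container
        "-shortest",          # stop at the shorter of the two inputs
        str(out),
    ]


def test_mux_command_shape():
    # No subprocess call: only the argument list is inspected.
    cmd = build_mux_command(Path("clip.mp4"), Path("reply.wav"), Path("out.mp4"))
    assert cmd[0] == "ffmpeg"
    assert cmd.count("-i") == 2
    assert cmd[-1] == "out.mp4"
```

Keeping command construction separate from command execution is what lets a test like this run on a machine with no ffmpeg installed.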

Current unit files:

- `test_video_config.py` — `VideoConfig.from_dict` round-trip, LoRA target validation
- `test_video_engine_logic.py` — prompt derivation, library cursor, frame fitting
- `test_pipeline_video_branch.py` — pipeline takes the video path iff engine is ready
- `test_musetalk_fit_frames.py` — frame-length adjustment to match audio duration
- `test_muxer_ffmpeg.py` — ffmpeg command construction
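The frame-length adjustment covered by `test_musetalk_fit_frames.py` might look roughly like the sketch below. `fit_frames` and its padding policy (holding the last frame) are assumptions made for illustration; the real implementation may loop or resample instead.

```python
import math


def fit_frames(frames: list, audio_seconds: float, fps: float) -> list:
    """Hypothetical: trim or pad `frames` so they cover `audio_seconds` at `fps`."""
    if not frames:
        raise ValueError("need at least one frame")
    # Number of frames required to span the audio, never fewer than one.
    target = max(1, math.ceil(audio_seconds * fps))
    if len(frames) >= target:
        # Too many frames: drop the tail.
        return frames[:target]
    # Too few frames: hold the last frame until the audio ends (assumed policy).
    return frames + [frames[-1]] * (target - len(frames))


def test_fit_frames_trims_and_pads():
    # 2 s of audio at 25 fps needs exactly 50 frames.
    assert len(fit_frames(list(range(60)), 2.0, 25)) == 50
    padded = fit_frames(list(range(10)), 2.0, 25)
    assert len(padded) == 50
    assert padded[-1] == 9
```

Because the function works on plain lists, the test needs neither MuseTalk nor a GPU.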

## Component tests — slow, GPU-required, run inside Docker

Each script in `tests/component/` exercises one subsystem end-to-end
against the real models. The numbered prefix reflects the implementation
phase each script gates, and also serves as a reasonable run order when
debugging a fresh environment:

| Script | Phase | Tests |
|---|---|---|
| `test_01_video_skeleton.py` | 1 | VideoEngine loads, config gate respected |
| `test_02_wan22_loras.py` | 2 | Wan2.2 pipeline loads, LoRA stack applies |
| `test_03_idle_clip.py` | 3 | `set_avatar` → idle MP4, written to disk for eyeballing |
| `test_04_library_prebake.py` | 4 | library mode pre-bakes N base clips |
| `test_05_musetalk_lipsync.py` | 5 | MuseTalk lip-sync on library frames + ffmpeg mux |
| `test_06_reflective.py` | 6 | reflective mode: fresh Wan2.2 per reply |
| `test_07_endpoints.py` | 7 | HTTP endpoints return sane responses |
| `test_08_lora_reload.py` | 8 | `/api/reload-loras` swaps LoRAs live |
| `test_09_gguf_generate.py` | 9 | GGUF-quantised DIT end-to-end I2V generation |
| `test_10_t5_encode.py` | 10 | T5 encoder (optionally fp8-quantised) on CUDA |
| `test_11_image_encode.py` | 11 | Avatar image → VAE latent path |
| `test_12_dit_single_step.py` | 12 | Single DIT step on the loaded expert(s) |
| `test_13_vae_decode.py` | 13 | VAE decode back to RGB frames |

Tests 09-13 target the GGUF + Blackwell (SM120) path. Use them to
validate new quant schemes and attention backends before wiring them
into the full pipeline.

Run one:

```
# Run from the host; the test executes inside the voice-chat container:
docker compose exec voice-chat python -m tests.component.test_03_idle_clip
```

Run all (slow: 20+ minutes on an RTX 5090):

```
docker compose exec voice-chat python -m tests.component.run_all
```
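For readers wondering how a runner can honor the numbered prefixes, the following is a rough sketch of one possible approach. It is not the contents of `run_all`; the function names and the stop-at-first-failure policy are assumptions made for illustration.

```python
import re
import subprocess
import sys

# Matches names such as test_03_idle_clip.py and captures the phase number.
_NUMBERED = re.compile(r"^test_(\d+)_\w+\.py$")


def ordered_modules(filenames: list[str]) -> list[str]:
    """Hypothetical: map script filenames to module paths, sorted by phase."""
    numbered = []
    for name in filenames:
        match = _NUMBERED.match(name)
        if match:
            # Strip the ".py" suffix to get the module stem.
            numbered.append((int(match.group(1)), name[:-3]))
    return [f"tests.component.{stem}" for _, stem in sorted(numbered)]


def run_all(filenames: list[str]) -> int:
    """Run each module with `python -m`; stop at the first failure (assumed)."""
    for module in ordered_modules(filenames):
        result = subprocess.run([sys.executable, "-m", module])
        if result.returncode != 0:
            return result.returncode
    return 0
```

Sorting on the integer prefix rather than the raw filename keeps the order correct even if a prefix is ever written without zero padding.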

Each component script writes its artifacts (MP4s, PNG frame dumps, logs)
to `tests/component/_out/` so you can visually inspect results. That
directory is gitignored.