Voice-chat tests

The tests are split into two tiers: unit tests and component tests.

Unit tests — fast, GPU-free

python -m pytest tests/unit -v

These exercise pure logic: config parsing, prompt derivation, LoRA spec parsing, frame-length fitting, library round-robin selection, the pipeline's video branch, and ffmpeg mux argument shaping. They do not touch CUDA, Wan2.2, MuseTalk, or a real ffmpeg binary. Safe to run on Windows, outside Docker, without any models installed.

Current unit files:

test_video_config.py — VideoConfig.from_dict round-trip, LoRA target validation
test_video_engine_logic.py — prompt derivation, library cursor, frame fitting
test_pipeline_video_branch.py — pipeline takes the video path iff engine is ready
test_musetalk_fit_frames.py — frame-length adjustment to match audio duration
test_muxer_ffmpeg.py — ffmpeg command construction
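
As an illustration of the style these tests follow, here is a minimal sketch of a GPU-free check on ffmpeg argument shaping. The helper `build_mux_command` and its exact flags are hypothetical stand-ins, not the project's real muxer API. The point is only that the command is built as a list and asserted on, never executed.

```python
from pathlib import Path


def build_mux_command(video: Path, audio: Path, out: Path) -> list[str]:
    """Hypothetical helper: shape an ffmpeg argv list without running it."""
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", str(video),        # first input: video frames
        "-i", str(audio),        # second input: synthesized speech
        "-c:v", "copy",          # keep the video stream as-is
        "-c:a", "aac",           # encode audio to AAC
        "-shortest",             # stop at the shorter of the two inputs
        str(out),
    ]


def test_mux_command_shape():
    cmd = build_mux_command(Path("in.mp4"), Path("in.wav"), Path("out.mp4"))
    assert cmd[0] == "ffmpeg"
    assert cmd.count("-i") == 2      # exactly two inputs
    assert cmd[-1] == "out.mp4"      # output path comes last
```

Because nothing here spawns a process, a test like this passes on a machine with no ffmpeg binary on the PATH.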

Component tests — slow, GPU-required, run inside Docker

Each script in tests/component/ exercises one subsystem end-to-end against the real models. The numbered prefix reflects the implementation phase each script gates, and also serves as a reasonable run order when debugging a fresh environment:

| Script | Phase | Tests |
| --- | --- | --- |
| `test_01_video_skeleton.py` | 1 | VideoEngine loads, config gate respected |
| `test_02_wan22_loras.py` | 2 | Wan2.2 pipeline loads, LoRA stack applies |
| `test_03_idle_clip.py` | 3 | `set_avatar` → idle MP4, written to disk for eyeballing |
| `test_04_library_prebake.py` | 4 | library mode pre-bakes N base clips |
| `test_05_musetalk_lipsync.py` | 5 | MuseTalk lip-sync on library frames + ffmpeg mux |
| `test_06_reflective.py` | 6 | reflective mode: fresh Wan2.2 per reply |
| `test_07_endpoints.py` | 7 | HTTP endpoints return sane responses |
| `test_08_lora_reload.py` | 8 | `/api/reload-loras` swaps LoRAs live |
| `test_09_gguf_generate.py` | 9 | GGUF-quantised DIT end-to-end I2V generation |
| `test_10_t5_encode.py` | 10 | T5 encoder (optionally fp8-quantised) on CUDA |
| `test_11_image_encode.py` | 11 | Avatar image → VAE latent path |
| `test_12_dit_single_step.py` | 12 | Single DIT step on the loaded expert(s) |
| `test_13_vae_decode.py` | 13 | VAE decode back to RGB frames |

Tests 09-13 target the GGUF + Blackwell (SM120) path. Use them to validate new quantisation schemes and attention backends before wiring them into the full pipeline.

Run one:

# Run from the host; the script executes inside the voice-chat container:
docker compose exec voice-chat python -m tests.component.test_03_idle_clip

Run all (slow, 20+ minutes on an RTX 5090):

docker compose exec voice-chat python -m tests.component.run_all
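
For orientation, a runner of this kind could be sketched roughly as follows. This is an assumption about the general shape, not the contents of the real `tests.component.run_all`: the function names are hypothetical, and the only facts taken from this README are the `test_NN_*.py` naming scheme and the `python -m tests.component.<script>` invocation.

```python
import re
import subprocess
import sys
from pathlib import Path

# Matches names such as test_03_idle_clip.py and captures the numeric prefix.
_PATTERN = re.compile(r"^test_(\d+)_.+\.py$")


def discover_modules(component_dir: Path, package: str = "tests.component") -> list[str]:
    """Return module names for test_NN_*.py files, ordered by numeric prefix."""
    found = []
    for path in component_dir.iterdir():
        match = _PATTERN.match(path.name)
        if match:
            found.append((int(match.group(1)), f"{package}.{path.stem}"))
    return [module for _, module in sorted(found)]


def run_in_order(modules: list[str]) -> dict[str, int]:
    """Run each module with `python -m` and record its exit code."""
    results = {}
    for module in modules:
        completed = subprocess.run([sys.executable, "-m", module])
        results[module] = completed.returncode
    return results


def main() -> int:
    """Hypothetical entry point: run everything and return 1 if any script failed."""
    codes = run_in_order(discover_modules(Path("tests/component")))
    failed = [module for module, code in codes.items() if code != 0]
    print(f"{len(codes) - len(failed)} passed, {len(failed)} failed")
    return 1 if failed else 0
```

Sorting on the integer prefix rather than the file name keeps the phase order correct even if a script is ever named without zero padding.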

Each component script writes its artifacts (MP4s, PNG frame dumps, logs) to tests/component/_out/ so you can visually inspect results. That directory is gitignored.
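
If you add a new component script, a small helper along these lines keeps its artifacts in the same place. This is a sketch under assumptions: `artifact_path` is a hypothetical name, and the per-script subdirectory is an illustrative choice rather than something the existing scripts are known to do.

```python
from pathlib import Path

# Output root named in this README; the directory is gitignored.
OUT_DIR = Path("tests/component/_out")


def artifact_path(script_name: str, filename: str, root: Path = OUT_DIR) -> Path:
    """Return root/<script_name>/<filename>, creating parent directories as needed."""
    target = root / script_name / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    return target
```

For example, `artifact_path("test_03_idle_clip", "idle.mp4")` would resolve to `tests/component/_out/test_03_idle_clip/idle.mp4`.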
