Benchmark the dense 5B Turbo model (Q8_0 GGUF + fp8 T5) as a
lower-VRAM alternative to the 14B MoE pipeline. Include dtype
patches for the dense WanModel, the Wan 2.2 VAE config (48 channels,
16x spatial compression), and a Blackwell fp8 workaround.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Upgrade PyTorch to 2.7+ with cu128 wheels for Blackwell (sm_120) GPU
support. Replace silero-vad (which depends on torchaudio) with a direct
ONNX Runtime implementation of the same Silero VAD model, eliminating
the torchaudio dependency entirely.
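
A minimal sketch of what running Silero VAD directly on ONNX Runtime can look like, not the code from this commit. It assumes the v5 ONNX export of the model: input names "input", "state" and "sr", a (2, 1, 128) recurrent state, 512-sample frames at 16 kHz, and a 64-sample context carried between frames. The helper names frame_audio and speech_probabilities are hypothetical.

```python
FRAME_SIZE = 512      # assumed samples per frame at 16 kHz
CONTEXT_SIZE = 64     # assumed samples of context carried between frames
SAMPLE_RATE = 16000   # assumed input sample rate


def frame_audio(samples, frame_size=FRAME_SIZE):
    """Split mono samples into fixed-size frames, zero-padding the last one."""
    frames = []
    for start in range(0, len(samples), frame_size):
        frame = list(samples[start:start + frame_size])
        frame.extend([0.0] * (frame_size - len(frame)))
        frames.append(frame)
    return frames


def speech_probabilities(model_path, samples):
    """Return one speech probability per frame, with no torch or torchaudio."""
    # Imported lazily so frame_audio stays usable without these packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    state = np.zeros((2, 1, 128), dtype=np.float32)        # assumed state shape
    context = np.zeros((1, CONTEXT_SIZE), dtype=np.float32)
    sr = np.array(SAMPLE_RATE, dtype=np.int64)

    probs = []
    for frame in frame_audio(samples):
        chunk = np.asarray(frame, dtype=np.float32)[np.newaxis, :]
        x = np.concatenate([context, chunk], axis=1)
        # Input names are assumptions about the exported model.
        out, state = session.run(None, {"input": x, "state": state, "sr": sr})
        context = x[:, -CONTEXT_SIZE:]
        probs.append(float(np.asarray(out).reshape(-1)[0]))
    return probs
```

Audio decoding and resampling to 16 kHz mono float32 are left to the caller, which is the part torchaudio would otherwise have handled.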

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>