LPM 1.0 (Large Performance Model) is a state-of-the-art AI video generation system designed specifically for real-time character performance. Built as a 17B-parameter Diffusion Transformer, it transforms single images into expressive, interactive characters that can speak, listen, sing, and react in real time with unprecedented latency of just 0.35 seconds.
Key Features
- Real-Time Full-Duplex Conversation: Characters generate reactive listening behavior (nods, gaze shifts, micro-expressions) while simultaneously speaking with precise lip sync
- Identity-Consistent Generation: Multi-granularity identity conditioning maintains character appearance for 45+ minutes without drift
- Multimodal Control: Unified text, audio, and image conditioning enables fine-grained directorial control
- Zero-Shot Generalization: Works with any character style (photorealistic, anime, 3D, non-humanoid) without fine-tuning
- Infinite-Length Output: Continuous generation capability for extended sessions
- High-Quality Output: Generates 480p/720p video at 24 fps
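As a back-of-the-envelope check on the real-time figures above, the per-frame compute budget at 24 fps follows directly from the numbers quoted in this document. The breakdown below is illustrative arithmetic, not an official specification:

```python
# Rough real-time budget check using only figures quoted above (illustrative).

FPS = 24              # stated output frame rate
LATENCY_S = 0.35      # stated response latency

# To sustain 24 fps, each frame must be produced within its frame interval.
frame_budget_ms = 1000 / FPS
print(f"Per-frame budget at {FPS} fps: {frame_budget_ms:.1f} ms")  # ~41.7 ms

# The 0.35 s latency corresponds to roughly this many frame intervals of
# "head start" before playback must begin.
startup_frames = LATENCY_S * FPS
print(f"Startup latency ~= {startup_frames:.1f} frame intervals")  # ~8.4
```

In other words, sustaining 24 fps leaves under 42 ms of wall-clock time per frame, which is why the architecture below compresses diffusion sampling to so few steps.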
Technical Architecture
LPM 1.0 employs a co-designed two-stage pipeline: a Base LPM (a 17B DiT) is trained on multimodal human-centric data, then distilled via Distribution Matching Distillation (DMD) into an Online LPM, a causal streaming generator that compresses diffusion sampling to just 2 steps for real-time performance.
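The document does not describe the implementation, but the "few-step distilled, causal streaming" pattern it names generally takes the shape below. This is a minimal toy sketch; every name, shape, and constant here (`denoise_step`, `CHUNK_FRAMES`, the 16-frame context window) is an assumption for illustration, not LPM's actual API — only the 2-step count comes from the text:

```python
import numpy as np

CHUNK_FRAMES = 4      # frames generated per streaming chunk (assumed)
NUM_STEPS = 2         # distilled diffusion steps, as stated in the text
H, W, C = 64, 64, 3   # toy frame shape (assumed)

def denoise_step(noisy_chunk, context, step):
    """Stand-in for one pass of the distilled generator.

    A real model would condition on audio, text, and identity embeddings;
    here we simply blend the noise toward the running context mean to show
    the control flow of few-step sampling.
    """
    target = context.mean(axis=0, keepdims=True)
    blend = (step + 1) / NUM_STEPS  # reach the target by the final step
    return (1 - blend) * noisy_chunk + blend * target

def stream_chunks(num_chunks, rng):
    # Causal context: previously generated frames the model may attend to.
    context = np.zeros((1, H, W, C))
    for _ in range(num_chunks):
        chunk = rng.standard_normal((CHUNK_FRAMES, H, W, C))
        for step in range(NUM_STEPS):  # only 2 denoising steps per chunk
            chunk = denoise_step(chunk, context, step)
        # Slide a bounded context window so generation can run indefinitely.
        context = np.concatenate([context, chunk])[-16:]
        yield chunk

rng = np.random.default_rng(0)
frames = [c for c in stream_chunks(num_chunks=3, rng=rng)]
print(len(frames), frames[0].shape)  # 3 chunks of shape (4, 64, 64, 3)
```

The bounded context window is what makes "infinite-length output" tractable in this style of generator: memory stays constant regardless of session length.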
Target Applications
- Conversational AI Agents: Visual companions for chatbots like ChatGPT and Doubao
- Game NPCs: Real-time, expressive non-player characters with natural dialogue
- Live Streaming Characters: Virtual streamers with full-duplex interaction capabilities
- Accessibility & Education: Virtual tutors and companion characters for users with communication challenges
Performance Benchmarks
LPM 1.0 outperforms alternatives such as LiveAvatar, Kling-Avatar-2, and OmniHuman across key dimensions, including latency (0.35 s vs. 1 s+), full-duplex support, generation length, and character generalization.

