
LPM 1.0

LPM 1.0 is a 17B-parameter AI model for real-time character video generation with full-duplex conversation, identity consistency, and 0.35s latency.

Introduction

LPM 1.0 (Large Performance Model) is an AI video generation system designed for real-time character performance. Built as a 17B-parameter Diffusion Transformer, it turns a single image into an expressive, interactive character that can speak, listen, sing, and react in real time, with a latency of 0.35 seconds.

Key Features
  • Real-Time Full-Duplex Conversation: Characters generate reactive listening behavior (nods, gaze shifts, micro-expressions) while simultaneously speaking with precise lip sync
  • Identity-Consistent Generation: Multi-granularity identity conditioning maintains character appearance for 45+ minutes without drift
  • Multimodal Control: Unified text, audio, and image conditioning enables fine-grained directorial control
  • Zero-Shot Generalization: Works with any character style (photorealistic, anime, 3D, non-humanoid) without fine-tuning
  • Infinite-Length Output: Continuous generation capability for extended sessions
  • High-Quality Output: Generates 480P/720P video at 24fps
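To put the figures in the list above in concrete terms, the short calculation below converts the stated 24fps, 0.35s latency, and 45-minute session length into per-frame numbers. It is only illustrative arithmetic: the function names are hypothetical and nothing here comes from LPM 1.0's own code.

```python
# Illustrative arithmetic only; the constants are the figures quoted in the text.
FPS = 24            # stated output frame rate
LATENCY_S = 0.35    # stated response latency in seconds
SESSION_MIN = 45    # stated duration without identity drift, in minutes


def frame_interval_ms(fps: int) -> float:
    """Time available to produce each frame at a steady frame rate."""
    return 1000.0 / fps


def latency_in_frames(latency_s: float, fps: int) -> float:
    """How many output frames elapse during the stated latency."""
    return latency_s * fps


def frames_in_session(minutes: int, fps: int) -> int:
    """Total frames generated over a session of the given length."""
    return minutes * 60 * fps


if __name__ == "__main__":
    print(f"frame interval: {frame_interval_ms(FPS):.1f} ms")
    print(f"latency: about {latency_in_frames(LATENCY_S, FPS):.1f} frames")
    print(f"frames in {SESSION_MIN} min: {frames_in_session(SESSION_MIN, FPS)}")
```

At 24fps each frame has a budget of roughly 41.7 ms, the 0.35s latency corresponds to about 8.4 frames, and a 45-minute session spans 64,800 frames over which identity has to stay consistent.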
Technical Architecture

LPM 1.0 uses a co-designed pipeline. A Base LPM (a 17B DiT) is trained on multimodal human-centric data, then distilled via DMD (Distribution Matching Distillation) into the Online LPM, a causal streaming generator that reduces diffusion sampling to just 2 steps for real-time performance.
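The control flow of a causal streaming generator with a two-step sampler can be sketched as follows. This is a toy illustration, not LPM 1.0's implementation: `toy_denoiser`, the noise levels in `SIGMAS`, the chunk length, and the scalar "frames" are all assumptions made up for the example. It shows one common few-step pattern (predict a clean chunk, re-noise to a lower level, predict again) applied chunk by chunk, with each chunk conditioned only on past output.

```python
import random

# Hypothetical noise levels; the source only states that sampling uses 2 steps.
SIGMAS = [1.0, 0.5]


def toy_denoiser(noisy, sigma, context, audio):
    """Stand-in for a distilled network: returns a 'clean' chunk estimate.

    Toy rule (an assumption): move toward the current audio feature,
    blended with the last generated frame so output stays continuous.
    """
    target = audio if not context else 0.5 * (audio + context[-1])
    return [target for _ in noisy]


def generate_stream(audio_features, chunk_len=4, seed=0):
    """Yield (chunk, denoiser_calls) for each incoming audio feature.

    Causal: each chunk sees only previously generated frames, never future ones.
    """
    rng = random.Random(seed)
    context = []
    for audio in audio_features:
        # Start every chunk from pure noise at the highest noise level.
        x = [rng.gauss(0.0, SIGMAS[0]) for _ in range(chunk_len)]
        calls = 0
        for i, sigma in enumerate(SIGMAS):
            x0 = toy_denoiser(x, sigma, context, audio)
            calls += 1
            if i + 1 < len(SIGMAS):
                # Re-noise the estimate to the next, lower noise level.
                nxt = SIGMAS[i + 1]
                x = [v + nxt * rng.gauss(0.0, 1.0) for v in x0]
            else:
                x = x0
        # Keep a bounded window of past frames as conditioning context.
        context = (context + x)[-2 * chunk_len:]
        yield x, calls


if __name__ == "__main__":
    for chunk, calls in generate_stream([0.0, 1.0, 2.0]):
        print(calls, chunk)
```

Because the step count and the context window are both fixed, the cost per chunk stays constant however long the session runs, which is the property that makes continuous streaming output feasible in a design like this.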

Target Applications
  • Conversational AI Agents: Visual companions for chatbots like ChatGPT and Doubao
  • Game NPCs: Real-time, expressive non-player characters with natural dialogue
  • Live Streaming Characters: Virtual streamers with full-duplex interaction capabilities
  • Accessibility & Education: Virtual tutors and companion characters for users with communication challenges
Performance Benchmarks

LPM 1.0 outperforms alternatives such as LiveAvatar, Kling-Avatar-2, and OmniHuman across all compared dimensions, including latency (0.35s versus 1s or more), full-duplex support, generation length, and character generalization.
