Project Description
MiniCPM-o is an end-side (on-device) multimodal LLM that accepts image, video, text, and audio inputs and produces high-quality text and speech outputs. It supports real-time speech conversation, control over emotion, speed, and style, and multimodal live streaming on devices such as the iPad.
Project Information
Created on 1/29/2024
Updated on 7/2/2025
Categories
conversational-assistant
speech-technology
ai-content-generation
Tags
ready-to-use
model-deployment
multimodal
real-time-processing
open-source-community
Topics
minicpm
minicpm-v
multi-modal