Titan AI

MiniCPM-o

19,740
1,437
Python

Project Description

MiniCPM-o is an on-device (end-side) multimodal LLM that accepts image, video, text, and audio inputs and produces high-quality text and speech outputs. It supports real-time speech conversation, control over emotion, speed, and style, and multimodal live streaming on devices such as the iPad.

Project Information

Created on 1/29/2024
Updated on 7/2/2025

Categories

speech-technology
ai-content-generation
conversational-assistant

Tags

ready-to-use
model-deployment
multimodal
real-time-processing
open-source-community

Topics

minicpm
minicpm-v
multi-modal