

MiniMax M2.5 is an open-source frontier model trained extensively with reinforcement learning across hundreds of thousands of complex real-world environments. It achieves state-of-the-art performance in coding, agentic tool use, search, office work, and other economically valuable tasks, scoring 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp.
The model shows substantial improvements in programming evaluations, especially in multilingual coding tasks spanning more than 10 languages, including Go, C, C++, TypeScript, Rust, Kotlin, Python, Java, JavaScript, PHP, Lua, Dart, and Ruby. It covers the entire development lifecycle of complex systems, from system design through comprehensive code review and testing. M2.5 also excels at search and tool calling, with industry-leading performance on benchmarks such as BrowseComp and Wide Search, and it achieves better results in approximately 20% fewer rounds than previous versions.
M2.5 was trained to produce truly deliverable outputs in office scenarios through collaboration with senior professionals in finance, law, and social sciences. It shows significant capability improvements in high-value workspace scenarios such as Word, PowerPoint, and Excel financial modeling. The model achieves an average win rate of 59.0% in office work evaluations using the internal Cowork Agent framework.
The model's efficiency stems from effective task decomposition, token efficiency, and inference speed. M2.5 is served natively at 100 tokens per second, nearly twice the speed of other frontier models, and completes complex tasks 37% faster than previous versions. It consumes fewer tokens while maintaining high performance, with runtime comparable to Claude Opus 4.6 at only 10% of the cost.
M2.5's cost-effectiveness enables new agentic applications: running the model continuously costs $1 per hour at 100 tokens per second, or $0.30 per hour at 50 tokens per second. At these prices the model can power complex agents without cost being the limiting factor, making long-horizon agent scaling economically feasible. It has been deployed in MiniMax Agent, where it handles 30% of company tasks across R&D, product, sales, HR, and finance functions.
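To put the hourly prices quoted above in more familiar terms, the short sketch below converts them into tokens generated per hour and an implied cost per million tokens. This is back-of-the-envelope arithmetic only: it assumes the model generates output continuously for the full hour at the stated speed and ignores input tokens, so the result is not an official per-token rate. The function names are illustrative and not part of any MiniMax API.

```python
SECONDS_PER_HOUR = 3600


def tokens_per_hour(tokens_per_second: float) -> float:
    """Tokens produced in one hour of uninterrupted generation."""
    return tokens_per_second * SECONDS_PER_HOUR


def implied_cost_per_million_tokens(hourly_cost_usd: float,
                                    tokens_per_second: float) -> float:
    """Implied USD per one million generated tokens.

    Assumption: the hourly cost is spread evenly over continuously
    generated output tokens; input tokens are ignored.
    """
    return hourly_cost_usd / tokens_per_hour(tokens_per_second) * 1_000_000


# The two serving tiers described in the text: (label, USD per hour, tokens/s).
tiers = [
    ("100 tokens/s", 1.00, 100),
    ("50 tokens/s", 0.30, 50),
]

for label, hourly_cost, speed in tiers:
    total = tokens_per_hour(speed)
    rate = implied_cost_per_million_tokens(hourly_cost, speed)
    print(f"{label}: {total:,.0f} tokens/hour, about ${rate:.2f} per million tokens")
```

Under these assumptions, the 100 tokens-per-second tier produces 360,000 tokens per hour and the 50 tokens-per-second tier produces 180,000, so the slower tier is cheaper both per hour and per token generated.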
MiniMax M2.5 targets software developers, engineers, and programming professionals who need state-of-the-art coding assistance across multiple languages and platforms. It also serves businesses that want to automate office work in finance, law, and the social sciences, and organizations building agentic applications and complex systems where cost-effective AI performance is critical to productivity gains.