LFM2.5 is a family of AI models designed specifically for on-device deployment at the edge. Centered on the LFM2.5-1.2B models, it is Liquid AI's most capable release yet for edge AI. It builds on the LFM2 device-optimized architecture and is tuned for instruction following, making it a building block for on-device agentic AI.
The LFM2.5 family includes multiple specialized models: text models offering uncompromised quality for high-performance on-device workflows, an audio model that runs natively on constrained hardware and is 8x faster than its predecessor, and a vision-language model (VLM) with improved multi-image, multilingual vision understanding and instruction following. The family comprises Base, Instruct, Japanese, Vision-Language, and Audio-Language variants, with pretraining extended from 10T to 28T tokens and a significantly scaled-up post-training pipeline that includes reinforcement learning.
The models use a hybrid architecture that delivers extremely fast CPU inference with a low memory profile compared to similar-sized models. The Audio-Language model processes audio natively rather than through a pipelined approach, eliminating information barriers between components and dramatically reducing end-to-end latency. It includes a custom LFM-based audio detokenizer that is 8x faster than the previous generation's.
LFM2.5 enables access to private, fast, and always-on intelligence on any device, making it suitable for on-device use cases like local copilots, in-car assistants, and local productivity workflows. The models unlock new deployment scenarios across various devices including vehicles, mobile devices, laptops, IoT devices, and embedded systems.
The models are open-weight and available through multiple deployment frameworks including LEAP (Liquid's Edge AI Platform), llama.cpp for CPU inference, MLX for Apple Silicon, vLLM for GPU-accelerated serving, and ONNX for cross-platform inference. They support both CPU and GPU acceleration across Apple, AMD, Qualcomm, and Nvidia hardware, with optimized models available through partnerships with AMD and Nexa AI for NPU deployment.
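As a rough illustration of the multi-backend deployment story above, the sketch below maps a host platform to one of the frameworks named in this section. The mapping is an assumption made for illustration only, not an official routing rule; each framework supports a broader range of targets than shown here.

```python
import platform

def pick_backend(system: str, machine: str, has_nvidia_gpu: bool) -> str:
    """Illustrative backend selection for an open-weight edge model.

    The priority order here (GPU serving > Apple Silicon > CPU > fallback)
    is a hypothetical heuristic, not official Liquid AI guidance.
    """
    if has_nvidia_gpu:
        return "vLLM"        # GPU-accelerated serving
    if system == "Darwin" and machine == "arm64":
        return "MLX"         # Apple Silicon
    if system in ("Linux", "Windows"):
        return "llama.cpp"   # fast CPU inference
    return "ONNX"            # cross-platform fallback

if __name__ == "__main__":
    # Inspect the current host and report a plausible choice.
    print(pick_backend(platform.system(), platform.machine(), False))
```

In practice the choice also depends on quantization format, memory budget, and whether an NPU path (e.g. via the AMD or Nexa AI partnerships mentioned above) is available; this sketch only captures the coarse platform split.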
LFM2.5 targets developers and enterprises building AI applications for edge deployment. It is designed for teams that need to deploy reliable agents on constrained hardware such as vehicles, mobile devices, IoT devices, and embedded systems. The models are particularly suited to applications requiring private, fast, and always-on intelligence without cloud dependency. Enterprise customers can access custom solutions for specialized deployments through Liquid AI's sales team.