
Liquid AI, a startup founded by former MIT computer scientists, has unveiled its most compact language model to date, LFM2.5-230M. This new 230-million-parameter foundation model is specifically engineered for agentic workflows on edge devices, making it an attractive option for enterprises looking to implement efficient data extraction and on-device processing capabilities for smartphones, laptops, and robotics.
The model’s small size allows deployment on almost any device, a key differentiator that Liquid AI highlights in its release blog. The company claims that LFM2.5-230M surpasses models more than four times its size on specific benchmarks, particularly in data extraction tasks, where it outperformed Alibaba’s 800-million-parameter Qwen3.5-0.8B (Instruct) and Google’s 1-billion-parameter Gemma 3 1B.

The model is specifically targeted at developers and engineers involved in creating lean data extraction pipelines and autonomous edge systems.
Operating under a dual-use commercial license, LFM2.5-230M is available free of charge to individuals and companies with annual revenues below $10 million. Larger corporations will require a paid enterprise agreement for its commercial deployment.
This release sets itself apart by leveraging the LFM2 architecture to achieve high inference speeds without the substantial memory footprint typically associated with large transformer models. While major AI players continue to push the boundaries with models boasting hundreds of billions, or even trillions, of parameters to achieve peak performance, a parallel development trend is focusing on efficiency for edge and local deployments.
Liquid AI’s introduction of LFM2.5-230M signifies a strategic emphasis on architectural optimization over sheer scale. By consolidating 19 trillion tokens of pre-training into a 230-million-parameter model, the company demonstrates that complex, multi-step agentic workflows can be executed on edge devices without requiring immense computational power or constant cloud connectivity.
LFM2.5-230M Architecture and Operation
The LFM2.5-230M model departs from conventional transformer architectures, employing the LFM2 framework. This unique architecture integrates gated short-range convolutions with grouped-query attention mechanisms for highly efficient information processing.
For those monitoring advancements in efficient AI architectures, Liquid’s approach aligns with the goal of effectively managing long contexts and sequential data on edge hardware, circumventing the quadratic memory complexities inherent in pure attention mechanisms. The model supports an extensive 32K context window, enabling it to process large documents or continuous streams of data, such as robotic telemetry.
Performance metrics reveal the architectural efficiency of LFM2.5-230M. The model maintains a memory footprint under 400MB while delivering prefill and decode speeds that surpass comparable models like Gemma 3 1B IT and Granite 4.0-H-350M. On a Samsung Galaxy S25 Ultra with a Qualcomm Snapdragon Gen4 CPU, the model achieves a decode speed of 213 tokens per second. Even on a resource-constrained Raspberry Pi 5, it maintains a decode rate of 42 tokens per second. Furthermore, internal benchmarks indicate that its GPU inference stack offers lower end-to-end latency than competing small models across various concurrency levels.
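To put the reported decode rates in practical terms, the short Python sketch below converts them into wall-clock generation time. The two rates come from the figures above; the 200-token output size is a hypothetical stand-in for one extracted record, and prefill time is not modeled.

```python
# Decode rates reported in the article, in tokens per second.
DECODE_TOKENS_PER_SEC = {
    "Samsung Galaxy S25 Ultra": 213.0,
    "Raspberry Pi 5": 42.0,
}


def decode_seconds(output_tokens: int, tokens_per_sec: float) -> float:
    """Seconds needed to generate output_tokens at a steady decode rate.

    Prefill (prompt processing) time is deliberately left out.
    """
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return output_tokens / tokens_per_sec


if __name__ == "__main__":
    output_tokens = 200  # hypothetical size of one extracted JSON record
    for device, rate in DECODE_TOKENS_PER_SEC.items():
        print(f"{device}: {decode_seconds(output_tokens, rate):.2f} s")
```

Under these assumptions, a 200-token record takes a little under one second on the phone and just under five seconds on the Raspberry Pi 5.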
Enterprise Significance of LFM2.5-230M
The necessity of a compact model like LFM2.5-230M becomes clear when examining current enterprise data management practices. Traditionally, organizations have relied on static, rule-based Extract, Transform, Load (ETL) scripts, which are often fragile and prone to failure when document layouts or schemas change.
The industry is now shifting towards “AI ETL,” where machine learning models infer data mappings, detect schema drift, and adapt autonomously. In this modern approach, an AI model ingests unstructured data from sources like PDFs, emails, or web forms and structures it into formats such as JSON, eliminating the need for rigid, hardcoded rules.
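A minimal Python sketch of that pattern follows. It is illustrative only: the invoice schema, the prompt wording, and the stub_model function are all hypothetical, and the stub stands in for whatever on-device model call a real pipeline would make. The point is that the model produces the JSON while ordinary code validates it before it enters downstream systems.

```python
import json
from typing import Any, Callable

# Hypothetical target schema for an invoice record: field name -> expected type.
INVOICE_SCHEMA: dict[str, type] = {
    "invoice_number": str,
    "vendor": str,
    "total": float,
    "currency": str,
}


def validate_record(raw_json: str, schema: dict[str, type]) -> dict[str, Any]:
    """Parse model output and check it against the schema.

    Raises ValueError if the output is not a JSON object with the expected fields.
    """
    try:
        record = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(record, dict):
        raise ValueError("model output must be a JSON object")
    for field, expected in schema.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        value = record[field]
        # JSON does not distinguish 120 from 120.0, so accept ints for floats.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected):
            raise ValueError(f"field {field!r} should be {expected.__name__}")
        record[field] = value
    return record


def extract(text: str, model: Callable[[str], str], schema: dict[str, type]) -> dict[str, Any]:
    """Send unstructured text to an injected model callable and validate its reply."""
    prompt = f"Extract these fields as JSON: {', '.join(schema)}\n\n{text}"
    return validate_record(model(prompt), schema)


def stub_model(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned response."""
    return '{"invoice_number": "INV-001", "vendor": "Acme", "total": 120, "currency": "USD"}'


if __name__ == "__main__":
    print(extract("Invoice INV-001 from Acme, total $120", stub_model, INVOICE_SCHEMA))
```

Keeping validation outside the model means a malformed response is rejected rather than silently written into a database.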
Employing a massive, high-end model like Claude Opus 4.6 (priced at $5.00 per million input tokens) for routine tasks such as parsing invoices, formatting addresses, or processing telemetry data is economically unfeasible for most enterprises. This is precisely where models like LFM2.5-230M become indispensable. Engineered as an efficient extraction engine, it enables companies to automate repetitive data formatting and parsing tasks at a significantly lower compute cost and latency, operating directly on local hardware instead of relying on costly, continuous cloud API calls.
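A rough calculation shows how quickly per-token pricing adds up. The Python sketch below uses the $5.00 per million input tokens figure quoted above; the document volume and tokens per document are hypothetical, and output-token pricing as well as the hardware and power costs of local inference are not modeled.

```python
def cloud_input_cost_usd(
    documents: int,
    tokens_per_document: int,
    usd_per_million_tokens: float,
) -> float:
    """Input-token cost of sending every document to a cloud API."""
    total_tokens = documents * tokens_per_document
    return total_tokens / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    # Hypothetical workload: one million invoices of about 1,500 tokens each.
    cost = cloud_input_cost_usd(1_000_000, 1_500, 5.00)
    print(f"Input tokens alone: ${cost:,.2f}")
```

For that hypothetical workload, input tokens alone come to $7,500 before any output tokens are billed.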
Small Model Performance: LFM vs. 3B Class Competitors
The AI landscape in mid-2026 is witnessing a resurgence of “small” models, though the definition of “small” varies considerably. Recently, the open-weight community was impressed by Weibo’s VibeThinker-3B, a 3-billion-parameter model built on a Qwen2-style architecture, which achieved an exceptional score of 94.3 on the AIME 2026 math benchmark, rivaling much larger models through extensive data curation and reinforcement learning.
Similarly, Google’s Gemma 4 family, which has surpassed 200 million downloads, brings advanced AI capabilities to the edge, including the E2B model (2 billion parameters) designed for mobile and IoT applications.
In contrast, Liquid AI’s LFM2.5-230M operates in a distinctly different parameter class. At just 230 million parameters, it is roughly one-ninth the size of Google’s smallest Gemma 4 model and about one-thirteenth the size of VibeThinker-3B. Consequently, LFM2.5-230M is not intended for computationally intensive tasks like advanced reasoning, coding, or creative writing, a limitation that Liquid AI openly acknowledges.
However, within its specialized domains of data extraction and tool calling, the model demonstrates exceptional performance for its size. Liquid AI’s benchmarks show LFM2.5-230M scoring 43.26 on the BFCLv3 tool-use benchmark, surpassing IBM’s Granite 4.0-350M (39.58) and significantly outperforming larger models like Google’s Gemma 3 1B IT (16.61).

On CaseReportBench, a data extraction benchmark, LFM2.5-230M scores 22.51, which Liquid AI reports as significantly ahead of Qwen3.5-0.8B (Instruct). The results suggest that while 3-billion-parameter models are pushing the envelope in complex reasoning tasks, a 230-million-parameter model like LFM2.5-230M can be the better fit for efficient tool calling and agentic pipelines on constrained hardware.
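For readers unfamiliar with tool calling, the Python sketch below shows the general pattern: the model emits a structured call, and ordinary code looks up and runs the matching function. The JSON shape with "name" and "arguments" keys, the registry, and the get_temperature tool are assumptions made for illustration, not the output format documented for LFM2.5-230M.

```python
import json
from typing import Any, Callable

# Hypothetical tool registry: tool name -> Python callable.
TOOLS: dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Decorator that registers a function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def get_temperature(sensor_id: str) -> float:
    """Toy tool that returns a canned reading for a known sensor."""
    readings = {"motor-1": 41.5}
    return readings[sensor_id]


def dispatch(tool_call_json: str) -> Any:
    """Parse a model-emitted tool call and run the matching registered tool."""
    call = json.loads(tool_call_json)
    name = call["name"]
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**args)


if __name__ == "__main__":
    # Assumed model output for a request like "How hot is motor 1?"
    model_output = '{"name": "get_temperature", "arguments": {"sensor_id": "motor-1"}}'
    print(dispatch(model_output))
```

In this division of labor the model only has to choose the right tool and fill in its arguments, which is the kind of narrow task a very small model can plausibly handle.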
Applications in Advanced Research and Robotics
LFM2.5-230M’s proficiency in tool calling makes it an effective skill-selection layer for complex robotic operations. Liquid AI showcased this capability by deploying the model on a Unitree G1 humanoid robot, where it operated entirely on-device using the robot’s onboard NVIDIA Jetson Orin compute module.
The model successfully processed intricate environmental commands. As detailed in the company’s technical blog, LFM2.5-230M can take a natural language instruction, such as “Hold still for 2 seconds, then walk forward at 1 meter per second for 3 meters, hold a forward one-leg kneel for 5 seconds, and walk backward at 0.5 meters per second for 3 meters,” and automatically translate it into a structured, multi-step execution plan. This plan leverages pre-trained low-level skills provided by NVIDIA’s SONIC framework.
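One way to picture such a plan is as an ordered list of skill invocations. The Python sketch below encodes the example instruction that way. The Step type and the skill names are hypothetical and are not taken from Liquid AI’s output format or from the SONIC framework; only the durations, speeds, and distances come from the quoted instruction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    skill: str               # hypothetical skill name
    duration_s: float        # how long the step runs, in seconds
    speed_mps: float = 0.0   # signed forward speed; negative means backward


def walk(distance_m: float, speed_mps: float, backward: bool = False) -> Step:
    """Build a walking step; its duration follows from distance / speed."""
    if speed_mps <= 0:
        raise ValueError("speed_mps must be positive")
    duration = distance_m / speed_mps
    return Step("walk", duration, -speed_mps if backward else speed_mps)


# The instruction from the article, expressed as four ordered steps.
PLAN = [
    Step("hold_still", 2.0),
    walk(3.0, 1.0),
    Step("one_leg_kneel_forward", 5.0),
    walk(3.0, 0.5, backward=True),
]


def total_duration(plan: list[Step]) -> float:
    """Sum of all step durations in seconds."""
    return sum(step.duration_s for step in plan)


if __name__ == "__main__":
    for step in PLAN:
        print(step)
    print(f"Total: {total_duration(PLAN):.1f} s")
```

Note that the instruction gives distances and speeds for the walking segments, so a planner has to derive their durations: 3 seconds forward and 6 seconds backward, for 16 seconds in total.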
Both the base and fine-tuned versions of the model are readily available on Hugging Face. The model offers native, day-one support across a wide range of inference ecosystems, including llama.cpp (GGUF), MLX, vLLM, SGLang, and ONNX.
Dual-Use LFM Open License Framework
Liquid AI distributes LFM2.5-230M under the LFM Open License v1.0. While labeled “open,” this license is not compliant with Open Source Initiative (OSI) standards. Instead, it functions as a restricted, dual-use commercial framework.
For individual developers, academic researchers, and nascent startups, the license effectively mirrors open-source software. Users are granted a perpetual, worldwide, royalty-free license to reproduce, modify, and distribute the model, provided they include original copyright notices and clearly state any modifications made.
However, the license imposes a strict “Commercial Use Limitation.” Any legal entity generating $10 million or more in annual revenue forfeits the right to use the model commercially under this specific agreement. Enterprises exceeding this revenue threshold must negotiate a separate, paid commercial license with Liquid AI for production deployment. This licensing strategy aims to protect Liquid AI’s intellectual property from appropriation by major tech corporations while simultaneously fostering widespread adoption at the developer level.
Takeaway: The launch of Liquid AI’s LFM2.5-230M reflects a broader trend toward small, efficient AI models optimized for edge computing and narrow tasks such as data extraction. For routine operations, this approach offers enterprises a lower-cost, lower-latency alternative to massive cloud-based models and could open up new on-device automation.
Source: venturebeat.com
