
Velvet continues to grow with a new generation of Italian-made models


1 December 2025

The Velvet family is growing. Almawave’s fully homegrown, made-in-Italy AI is expanding with two new models and enhanced versions of the existing ones—strengthening the European Large Language Model landscape and delivering accessible, multilingual AI that’s powerful and adaptable across a wide range of use cases. 

Introducing Velvet 25B and Velvet Speech 2B, along with the new version of Velvet 2B, an even more compact and efficient model. 

These new models boost the text processing capabilities of the entire Velvet family, making it easier to handle long, complex documents thanks to significantly expanded analysis capacity. 

Velvet isn’t a single model, but a full family of models designed to deliver scalable solutions that can adapt to different needs in terms of performance, use case, infrastructure, and language coverage. 

In this article, we’ll dive into the features of each model, highlight how they differ from the already established ones, and explore how the company is driving a project firmly rooted in sustainability, specialization, and innovation. 

More powerful models, with smaller footprints

One of Velvet’s core design choices is to combine high performance with a deliberately lightweight architecture. 

This approach, already clear in the 14B and 2B models, is now even more pronounced: instead of focusing solely on ever-larger models, Almawave is also investing in more compact, high-quality versions that deliver efficiency, consistency, and speed. 

The benefit is twofold: lower energy consumption and reduced operating costs, without sacrificing answer quality or the ability to handle complex texts. 

Here, “compact” definitely doesn’t mean limited: models like Velvet 2B offer strong reasoning capabilities and multilingual text analysis, all with a smaller infrastructure footprint.

It’s no coincidence that the new Velvet models are small language models (SLMs): compact architectures designed to work efficiently with limited resources. This means they can run on a single GPU—a common graphics processor in many servers—cutting both energy usage and operating costs.

In practice, this makes AI accessible not only to large enterprises, but also to local public administrations, small and medium-sized businesses, and teams with lean infrastructures. 

Within the Velvet family, there are models that can run in the cloud and others that can be deployed on-premise, directly on an organization’s own servers. This gives companies with strict data control requirements the option to keep everything in-house, while those looking for more flexibility in terms of performance and scalability can opt for the cloud. 


Changing needs, changing models: why diversification is essential

Public administration, healthcare, security, finance, and transportation are just a few of the areas where Velvet models are already being applied. 

Rather than building a single, one-size-fits-all model, Almawave chose to develop multiple models for different needs—flexible enough to adapt to almost any scenario, from highly structured use cases to very targeted ones. 

All Velvet models are “bespoke ready,” meaning they’re designed for fast and effective customization. They can be adapted to different industries using domain-specific data and terminology. In practice, they start from a strong general foundation and can then be fine-tuned and trained for any use case, while maintaining consistency, accuracy, and compliance. 
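To make the “bespoke ready” idea concrete, here is a minimal, purely illustrative sketch of what domain-specific fine-tuning data might look like: instruction/response pairs written in the target industry’s own terminology, serialized as JSON Lines. The field names and example records are invented for this sketch, not Almawave’s actual training format.

```python
# Hypothetical sketch: building a tiny domain-adaptation dataset as JSONL.
# Field names ("instruction", "response") and examples are invented.

import json

domain_examples = [
    {"instruction": "Classify this ticket: 'My invoice shows a double charge.'",
     "response": "billing_dispute"},
    {"instruction": "Classify this ticket: 'The app crashes on login.'",
     "response": "technical_issue"},
]

# One JSON object per line, the common format for fine-tuning pipelines.
jsonl = "\n".join(json.dumps(ex, ensure_ascii=False) for ex in domain_examples)
print(jsonl.count("\n") + 1)   # number of training records
```

Starting from a strong general model, a handful of such domain-labeled examples per task is often enough to adapt tone and terminology, which is what makes fast customization feasible.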

This way, each organization can choose the model that best fits its needs and industry. 

For example, those who need to use AI inside small devices or in very specialized, niche contexts will benefit from lean, high-performing models that are both fast and lightweight—like Velvet 2B. 

These models have a major advantage: they can be easily retrained to handle new tasks and workflows over time. 

We’ve already explored the features of the 14B and 2B models, but how do the new models stand out in today’s market? 

Velvet 25B: The family’s largest language model 

25 billion parameters: built to manage even the most complex documents 

Velvet 25B stands out especially in long-context scenarios—when it needs to handle very long texts—because its architecture is designed to keep strong attention performance over extended content. 

The model uses 25 billion parameters, almost twice as many as Velvet 14B. 

Put simply, if you think of it as a brain, Velvet 25B can rely on 25 billion “neural connections” to link words and information, allowing it to generate more coherent, detailed, and nuanced responses. 

Through its large context window (128,000 tokens), Velvet 25B can analyze documents such as legal texts, scientific reports, or legislative acts, and still maintain coherence and accuracy even when it needs to relate sections of the text that are very far apart from one another. 
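To illustrate what a 128,000-token window means operationally, here is a minimal sketch of checking whether a document fits the window and splitting it into overlapping chunks when it doesn’t. A real deployment would use the model’s own tokenizer; a whitespace split stands in here as a rough approximation, and the headroom value is an assumption.

```python
# Hypothetical sketch: fit a long document into a 128,000-token context
# window, chunking with overlap when it is too large. The whitespace
# "tokenizer" and the answer headroom are stand-in assumptions.

CONTEXT_WINDOW = 128_000          # tokens, per the article
RESERVED_FOR_ANSWER = 4_000       # headroom left for the model's reply (assumed)


def rough_token_count(text: str) -> int:
    """Crude token estimate: one token per whitespace-separated word."""
    return len(text.split())


def chunk_document(text: str, max_tokens: int, overlap: int = 200) -> list[str]:
    """Split text into word-based chunks of at most max_tokens,
    overlapping by `overlap` words so cross-chunk references survive."""
    words = text.split()
    if len(words) <= max_tokens:
        return [text]
    chunks, start = [], 0
    while start < len(words):
        end = min(start + max_tokens, len(words))
        chunks.append(" ".join(words[start:end]))
        if end == len(words):
            break
        start = end - overlap
    return chunks


doc = "word " * 300_000                      # a document far beyond the window
budget = CONTEXT_WINDOW - RESERVED_FOR_ANSWER
chunks = chunk_document(doc, budget)
print(len(chunks))                           # the document needs several passes
```

With a window this large, most legal or scientific documents fit in a single pass, which is exactly what lets the model relate sections that sit far apart in the text.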

High-quality data to optimize training and specialization 

For Velvet 25B, the training dataset is significantly larger than the one used for 14B: we’re talking about 7 trillion tokens. 

A careful data filtering and cleaning process was applied to reduce duplicates, noise, and low-quality content, which in turn improves the model’s stability and the accuracy of its responses. 

This dataset expansion went hand in hand with strong domain specialization, using many targeted examples from sectors such as healthcare, European law and public administration, education and culture, manufacturing and industry, and customer care. This way, the model is exposed to more real-world scenarios and industry-specific terminology, enabling it to better understand context and generate more relevant, domain-aware outputs.

 

Dynamic multistep reasoning and agentic orchestration 

With Velvet 25B, existing capabilities are enhanced and an important new feature is introduced: dynamic multi-step reasoning (“thinking”). 

What does that mean in practice? It means the model can decide on its own how much it needs to “think through” a request and which steps to take before answering, instead of replying immediately with a single, linear response.  

In practical terms, the model can: 

  • break down a request into consistent intermediate steps 
  • adjust the number and type of steps based on how complex the task is 
  • perform a partial self-check before returning the final output 

This approach allows it to handle more complex activities more reliably—for example, analyzing long documents, connecting information across multiple sources, or checking data and legal references. The result is improved coherence, better traceability of the reasoning process, and more reliable outcomes. 
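The loop described above—decompose, solve step by step, self-check—can be sketched in a few lines. All function names here are invented stubs standing in for real model calls; this is an illustration of the control flow, not Almawave’s implementation.

```python
# Hypothetical illustration of the multi-step ("thinking") loop:
# decompose a request, solve each intermediate step, self-check before
# answering. All names and the toy logic are invented for this sketch.

def decompose(request: str) -> list[str]:
    """Break a request into intermediate steps (stub: split on ' then ')."""
    return [step.strip() for step in request.split(" then ")]

def solve_step(step: str, context: list[str]) -> str:
    """Answer one intermediate step (stub: echo the step)."""
    return f"done: {step}"

def self_check(results: list[str]) -> bool:
    """Partial self-check before returning the final output (stub)."""
    return all(r.startswith("done:") for r in results)

def answer(request: str) -> str:
    steps = decompose(request)            # 1. break the request down
    results = []
    for step in steps:                    # 2. step count follows task complexity
        results.append(solve_step(step, results))
    if not self_check(results):           # 3. partial self-check
        raise RuntimeError("self-check failed: retry or escalate")
    return "; ".join(results)

print(answer("extract the dates then compare them against the statute"))
```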

To steer these capabilities toward enterprise use cases, Velvet uses post-training techniques—including reinforcement learning—to guide the model’s decision-making, cut down on unnecessary iterations, and increase relevance for each specific application. 

Another key differentiator is agentic orchestration: the system’s ability to select and coordinate specialized agents or tools (for example, for data extraction, document search, summarization, or quality checks) and merge their contributions into a single, unified response. 

The result is greater operational efficiency and tighter control over the workflow, especially in scenarios that require multiple areas of expertise and a series of sequential steps. 
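The orchestration pattern can be sketched as a coordinator that selects the specialized agents a task needs and merges their outputs into one response. The agents below are trivial stubs (a real system would call fine-tuned models or external tools), and the selection-by-keyword logic is a deliberate simplification.

```python
# Hypothetical sketch of agentic orchestration: pick the specialized
# "agents" a task needs (extraction, search, summarization, quality
# check) and merge their contributions. The agents are invented stubs.

from typing import Callable

AGENTS: dict[str, Callable[[str], str]] = {
    "extract":   lambda doc: f"[entities from {len(doc)} chars]",
    "search":    lambda doc: "[matching passages]",
    "summarize": lambda doc: f"[summary of {len(doc)} chars]",
    "check":     lambda doc: "[quality: ok]",
}

def orchestrate(task: str, document: str) -> str:
    """Select the agents the task mentions, run them in order,
    and merge their contributions into a single answer."""
    selected = [name for name in AGENTS if name in task]
    if not selected:
        selected = ["summarize"]          # sensible default when nothing matches
    parts = [AGENTS[name](document) for name in selected]
    return " | ".join(parts)

print(orchestrate("extract and summarize, then check", "Some long contract text..."))
```

In production the selection step would itself be a model decision rather than keyword matching, which is where the tighter workflow control comes from.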

A model built for Europe, fluent across 24 languages 

Compared to the earlier models, Velvet 25B also delivers a clear step up in language performance. 

It supports all 24 official European languages and, unlike many competitors, it doesn’t treat English as the default reference language. Instead, it relies on training methods that are designed to ensure high quality and consistency even for less widely spoken languages. 

Ready-to-use integration with AIWave 

Velvet 25B is natively integrated into AIWave, Almawave’s multi-agent platform, which already uses several fine-tuned versions of Velvet 14B. 

The platform includes multiple ready-to-use vertical solutions, is available both in the cloud and on-premise, and allows users to create conversational agents in a no-code/low-code environment. 


Velvet Speech 2B: voice-powered interaction in multilingual environments

Velvet Speech 2B is the first multimodal model in the Velvet family, combining speech and text capabilities in a single system. Its development is no accident: it draws on over a decade of expertise from our speech-recognition labs, now applied with success to LLM technologies.

Speech 2B keeps the same strengths as the text-based 2B model—fast and lightweight—while adding new capabilities: 

  • Automatic Speech Recognition (ASR) 
  • Spoken Query & Question Answering 
  • Voice interaction with the same performance level as the text model, so answers stay consistent whether the question comes via text or speech 

Speech 2B can recognize and transcribe spoken language in real time, understand and respond in mixed Italian–English conversations (spoken translation with language switching), and includes speech emotion recognition features, meaning it can classify emotions in the speaker’s voice. 

In practice, input can be either text or voice, while the output is always text. 
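That text-or-voice-in, text-out pattern can be sketched as a small dispatcher. The transcribe and respond functions below are stubs standing in for the model’s ASR and question answering; the point is the shape of the pipeline, including the consistency claim that both input paths reach the same answering model.

```python
# Hypothetical sketch of the Speech 2B interaction pattern: input may be
# text or audio, output is always text. Both functions are invented stubs.

def transcribe(audio: bytes) -> str:
    """Stub ASR: pretend the audio decodes to a fixed utterance."""
    return "what are the opening hours"

def respond(text: str) -> str:
    """Stub answering: a real system would query the language model."""
    return f"answer to: {text}"

def handle(user_input: "str | bytes") -> str:
    """Text in or voice in, text out."""
    if isinstance(user_input, bytes):          # voice path: ASR first
        user_input = transcribe(user_input)
    return respond(user_input)                 # the same model answers both paths

print(handle("what are the opening hours"))
print(handle(b"\x00\x01"))                     # both paths yield the same text answer
```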

These new capabilities open the door to a wide range of applications across many sectors. For example, in public administration, it could be used to transcribe city council meetings or public hearings into written reports, complete with summaries and key points. 

In healthcare, the model could support hospital pre-triage: patients answer structured questions, and the system automatically compiles a pre-triage form based on their responses. Doctors could also benefit from having a written summary of the operator–patient conversation. 

Velvet 2B: compact, up to date, ready for edge deployment

Velvet 2B represents an improved generation of the model, without any increase in size: the number of parameters stays the same, but its knowledge base is updated through June 2025, and the way the model runs on hardware is optimized. In practice, this means greater efficiency and lower energy consumption—even on a single GPU or compact devices. 

This combination makes it ideal for bringing AI exactly where it’s needed, with minimal latency and maximum control over privacy and business continuity. 

Thanks to edge deployment, the model can run directly on local devices or gateways instead of remote data centers or cloud environments. That includes compact PCs, branch servers, industrial sensors, kiosks, smart home panels, and more—contexts where protecting sensitive user or patient data is critical. 

To give a concrete example, Velvet 2B 1.5 would be an excellent fit in healthcare, running on portable ward devices that provide step-by-step instructions and protocol summaries, while keeping all data processing local.

 

A vision focused on real-world applications 

With Velvet, Almawave’s goal is to bring AI into real-world operations, delivering end-to-end solutions that address the specific needs of public administration, healthcare, transportation, finance, industry, and customer care. 

With AIWave—the multi-model, multi-agent platform—Velvet models naturally fit even complex use cases: document analysis and summarization, citizen and customer support, operational and administrative simplification, conversational search, and knowledge navigation. 

 

The Velvet models continue to evolve to meet the increasingly complex needs of every sector—optimizing internal processes, strengthening communication between organizations and people, and transforming AI into a tool that genuinely serves the entire community. 

Wondering what Velvet can do for your organization?

Contact us