TL;DR
Portugal announced AMÁLIA, a €5.5 million investment into an open-source LLM for European Portuguese. The project is progressing, but model weights and datasets are not yet publicly available. Its success could shape future Portuguese NLP efforts.
Portugal’s government revealed in December 2024 the development of AMÁLIA, a €5.5 million initiative to create a fully open-source large language model specifically for European Portuguese. This marks a significant step in advancing NLP capabilities for the language and signals government support for local AI research.
AMÁLIA is a collaborative project involving top Portuguese universities and research labs, including NOVA, IST, IT, and FCT. It builds upon the EuroLLM model, with modifications to training parameters, and emphasizes Portuguese data, especially from Arquivo.pt, which constitutes about 5.5% of the training tokens. The model aims to outperform state-of-the-art models like Qwen 3-8B on Portuguese benchmarks, which it currently does, though some benchmarks like ALBA still favor larger models.
Despite the project’s progress, the model weights, training data, and benchmark datasets remain unreleased publicly. The project has developed new benchmarks for European Portuguese, focusing on grammar, syntax, knowledge, and bias towards Brazilian Portuguese, but lacks specific measures of how well the model understands Portugal-specific content.
Why It Matters
This development is important because it represents a concerted effort to create NLP tools tailored to European Portuguese, a language with limited large-scale models compared to global languages like English or Chinese. It also demonstrates Portugal’s investment in local AI research, which could influence language-specific NLP development in other smaller linguistic communities.
However, the limited openness—particularly the absence of model weights and datasets—raises questions about the accessibility and reproducibility of the project. The model’s current capabilities suggest promising performance, but the full potential depends on future releases and broader community engagement.

Portuguese Flash Cards – Learn Portuguese Language Vocabulary Words and Phrases – Basic Language for Beginners – Gift for Travelers, Kids, and Adults by Travelflips
- Language: Portuguese flash cards for beginners
- Learning Features: Phonetic pronunciation and English translation
- Quality: Durable, high-quality cards with portable box
As an affiliate, we earn on qualifying purchases.
As an affiliate, we earn on qualifying purchases.
Background
Prior to AMÁLIA, Portugal’s NLP efforts included smaller models and benchmarks, but none with the scale or focus on European Portuguese seen in this project. The initiative follows similar efforts in Italy with Minerva and aligns with a broader trend of national investments in language-specific LLMs. The project is part of a growing recognition that smaller languages require dedicated resources to achieve competitive NLP performance.
“AMÁLIA aims to treat European Portuguese as a first-class citizen in NLP, focusing heavily on Portuguese data and benchmarks.”
— Research team member
“While AMÁLIA is impressive, the lack of publicly available weights and datasets limits immediate community engagement and assessment.”
— Hacker News author

Fado Portugues – Songs from the Soul of Portugal
- Includes Melody, Lyrics, and Chords: Melody, lyrics, and chords included
- Number of Pages: 144 pages
- Instrumentation Details: Features melody, lyrics, and chords
As an affiliate, we earn on qualifying purchases.
As an affiliate, we earn on qualifying purchases.
What Remains Unclear
It remains unclear when model weights, datasets, and training logs will be publicly released. The extent of Portuguese data in the current model and how it compares to larger models’ knowledge about Portugal are also not fully known. Additionally, the impact of these limitations on the model’s real-world performance is still uncertain.

AI Language Translator Device, 2026 Upgraded VORMOR Translator No WiFi Needed, Support ChatGPT, Instant Two-Way 150 Languages Translation, Offline/Photo Translation for Business Travel
- Supports 150 Languages: Instant, accurate translation in 150 languages
- Fast Response Time: Translation response within 0.5 seconds
- Supports ChatGPT Integration: Includes ChatGPT and unit/currency conversion
As an affiliate, we earn on qualifying purchases.
As an affiliate, we earn on qualifying purchases.
What’s Next
Next steps include potential release of model weights and datasets, further benchmarking, and community engagement. Monitoring official updates from the project team will clarify the project’s trajectory and impact in the coming months.

Portuguese Flash Cards – Learn Portuguese Language Vocabulary Words and Phrases – Basic Language for Beginners – Gift for Travelers, Kids, and Adults by Travelflips
- Language: Portuguese flash cards for beginners
- Learning Features: Phonetic pronunciation and English translation
- Quality: Durable, high-quality cards with portable box
As an affiliate, we earn on qualifying purchases.
As an affiliate, we earn on qualifying purchases.
Key Questions
Will the AMÁLIA model weights be publicly available?
As of December 2024, the weights have not been released. Future updates may include public access, but no official timeline has been announced.
How does AMÁLIA compare to other Portuguese NLP models?
AMÁLIA currently outperforms models like Qwen 3-8B on most Portuguese benchmarks, but some benchmarks like ALBA still favor larger models. Its performance will improve with more Portuguese data and potential future releases.
What are the main challenges in developing an LLM for European Portuguese?
The primary challenge is limited training data, especially high-quality, Portuguese-specific datasets. Additionally, balancing open access with proprietary or privacy concerns complicates model release.
Why is open sourcing important for models like AMÁLIA?
Open sourcing allows broader community testing, validation, and improvement, which accelerates development and ensures the model serves the language community effectively.
What impact could AMÁLIA have on Portuguese NLP and AI?
If fully released, AMÁLIA could significantly enhance NLP applications for Portugal, support local research, and inspire similar efforts for other small languages.