TL;DR
Norway’s National Library is building a Norwegian-language large language model (LLM), using 2 PB of Huawei flash storage to prepare its training data. The project aims to create a sovereign AI that reflects Norwegian language and culture, and it faces both technical and governance challenges.
Norway’s National Library is using 2 petabytes of Huawei OceanStor Dorado flash storage to develop a sovereign LLM for Norwegian, a significant step in local AI development.
The project was discussed by Marius Husnes, Head of IT Platform at the Norwegian National Library, at Huawei’s ID Forum 2026 in Paris. The library aims to create a Norwegian-specific LLM because no commercial provider offers a local-language model, which Husnes said puts Norway at a disadvantage in AI applications related to its culture and history.
The library’s digital collection, accumulated since 2005, holds 20 petabytes of unique data, including books, newspapers, web content, and multimedia, stored in a 60-petabyte preservation system. The challenge is moving this data efficiently through the AI training pipeline. Data cleaning, deduplication, and normalization run on an Nvidia DGX H200 system paired with 2 PB of Huawei all-flash arrays.
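The cleaning, deduplication, and normalization stages named above can be illustrated with a minimal Python sketch. It is not the library’s actual pipeline, which the source does not describe in detail. The function names `normalize` and `clean_and_dedupe`, the `min_chars` threshold, and the choice of exact-match hashing are all assumptions made for illustration; production pipelines often add near-duplicate detection on top.

```python
import hashlib
import re
import unicodedata
from typing import Iterable, Iterator

_WHITESPACE = re.compile(r"\s+")


def normalize(text: str) -> str:
    """Normalize Unicode and collapse whitespace (hypothetical helper)."""
    # NFC composes sequences such as "A" + combining ring into a single "Å",
    # so visually identical Norwegian text compares equal.
    text = unicodedata.normalize("NFC", text)
    return _WHITESPACE.sub(" ", text).strip()


def clean_and_dedupe(docs: Iterable[str], min_chars: int = 20) -> Iterator[str]:
    """Yield normalized documents, dropping short fragments and exact duplicates.

    min_chars is an illustrative threshold, not a value from the source.
    """
    seen = set()
    for raw in docs:
        text = normalize(raw)
        if len(text) < min_chars:
            continue  # cleaning: skip fragments too short to be useful
        # Hash the case-folded text so the set stores 32 bytes per document
        # instead of the full text.
        digest = hashlib.sha256(text.casefold().encode("utf-8")).digest()
        if digest in seen:
            continue  # deduplication: exact match after normalization
        seen.add(digest)
        yield text


if __name__ == "__main__":
    sample = [
        "Nasjonalbiblioteket  har digitalisert\nbøker siden 2005.",
        "Nasjonalbiblioteket har digitalisert bøker siden 2005.",
        "kort",
    ]
    for doc in clean_and_dedupe(sample):
        print(doc)
```

Because the function is a generator, documents stream through one at a time, which matters when the input is far larger than memory.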
The training itself runs on Norway’s Sigma2 Olivia supercomputer, which has 448 GPUs and a 5.3 PB Cray ClusterStor storage system. Husnes said the main bottleneck is not compute power but data quality and pipeline throughput, especially when transferring large datasets from archival storage to the training systems. The project must also provide low-latency data access, settle data governance, and build evaluation tools suited to Norwegian, a language with multiple dialects and historical forms.
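A rough calculation shows why throughput matters at this scale. The sketch below estimates how long it takes to move a dataset at a given sustained rate. The 20 PB figure is the collection size reported above; the throughput values are hypothetical placeholders, not measurements from the project, and `transfer_days` is an illustrative helper name.

```python
PB = 10**15  # decimal petabyte, in bytes
GB = 10**9   # decimal gigabyte, in bytes
SECONDS_PER_DAY = 86_400


def transfer_days(size_pb: float, throughput_gb_per_s: float) -> float:
    """Days needed to move size_pb petabytes at a sustained rate in GB/s."""
    seconds = size_pb * PB / (throughput_gb_per_s * GB)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical sustained rates, chosen only to show the scaling.
    for rate in (1, 10, 100):
        days = transfer_days(20, rate)
        print(f"20 PB at {rate} GB/s: {days:.1f} days")
```

Under these assumptions, a tenfold change in sustained throughput is the difference between weeks and days for a full pass over the collection, independent of how many GPUs are waiting at the other end.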
Why It Matters
The project shows why nations see strategic value in building sovereign AI capabilities, especially for non-English languages. It demonstrates how local data and infrastructure, here backed by Huawei storage, underpin culturally relevant AI models. It also highlights the technical difficulty of managing PB-scale datasets for AI training, a problem shared by other countries developing their own AI ecosystems.
Furthermore, Norway’s initiative signals a broader trend of smaller nations aiming for AI independence, emphasizing data sovereignty, cultural preservation, and governance issues. The involvement of Huawei’s storage technology indicates its growing role in the European AI infrastructure landscape, raising questions about supply chain dependencies and international tech alliances.

Background
Norway’s National Library has been digitizing its collection since 2005, creating one of the largest digital archives of Norwegian cultural content. The project reflects a global push for sovereign AI, driven by concerns over data control, bias, and cultural representation. Similar efforts are underway in other countries, but Norway’s approach is notable for its scale and integration of high-performance computing and advanced storage solutions.
Previous developments include the library’s legal agreements to use copyrighted content for training, and the technical groundwork of digitization and metadata management. The ongoing challenge is translating this vast, complex dataset into an effective LLM that accurately reflects Norway’s language and culture.
“No private company has this.”
— Marius Husnes
“The bottleneck was not compute; it was data quality, cleaning and pipeline throughput.”
— Marius Husnes

What Remains Unclear
It is not yet clear how the Norwegian LLM will perform in practice, how evaluation metrics will be standardized, or how governance and access control will be managed long-term. The project is still in progress, and many technical and policy issues remain unresolved.

What’s Next
The next steps involve completing the data pipeline optimization, refining evaluation tools tailored for Norwegian, and addressing governance questions. The project aims to finalize the LLM training and assess its capabilities, with broader deployment and policy discussions likely to follow.

Key Questions
Why is Norway developing its own LLM?
Norway aims to create a sovereign AI that understands the Norwegian language, culture, and history, addressing the limitations of foreign, English-centric models and supporting cultural preservation.
What role does Huawei storage play in this project?
Huawei’s OceanStor Dorado flash storage provides the high-capacity, low-latency data infrastructure used to clean and prepare the large datasets on which the LLM is trained.
What challenges are involved in this project?
Major challenges include managing PB-scale datasets, ensuring data quality, sustaining pipeline throughput, and developing evaluation and governance frameworks suited to the Norwegian language and cultural context.
Will this model be available publicly?
It is not yet clear whether the Norwegian LLM will be publicly released or restricted to governmental and research use, as governance and policy questions are still under discussion.
Source: Hacker News