What is Federated Learning and How Does It Influence the European NOUS Project?
Federated Learning is a Machine Learning and Deep Learning approach that enables collaborative model training on data that remains distributed across many devices or servers. Unlike traditional training, there is no need to centralise the data, which significantly improves privacy and security. Our colleague Guillermo Gonzalez explained this during his presentation at the event “Driving Business Models and Solutions with EdgeAI”, where he also presented the progress of the European project NOUS. At a high level, the training cycle consists of the following steps (a minimal code sketch follows the list):
- Initialising the global model
- Distributing the parameters (weights and biases) of the global model (initially random) to local models
- Training each local model with its own data, sourced from Edge devices such as smart meters, vehicles, mobile phones, hospital machinery, among others
- Sending the updated weights and biases from each local model back to the global model
- Aggregating the weights into the global model and repeating the process
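A minimal sketch of this cycle, assuming a simple linear model trained with gradient descent in NumPy; the function names, the toy data, and the unweighted average are illustrative choices, not the NOUS implementation (the size-weighted variant, FedAvg, is discussed further below):

```python
import numpy as np

def local_train(weights, X, y, lr=0.01, epochs=5):
    """Train a simple linear model on a client's own data (illustrative)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def federated_round(global_weights, client_datasets):
    """One round: distribute parameters, train locally, aggregate the updates."""
    updates = [local_train(global_weights, X, y) for X, y in client_datasets]
    return np.mean(updates, axis=0)  # simple unweighted average for brevity

# Step 1: initialise the global model with random parameters
rng = np.random.default_rng(0)
global_weights = rng.normal(size=3)

# Toy local datasets standing in for Edge devices (smart meters, phones, ...)
clients = [(rng.normal(size=(50, 3)), rng.normal(size=50)) for _ in range(4)]

# Steps 2-5: repeat the distribute / train / send back / aggregate cycle
for _ in range(10):
    global_weights = federated_round(global_weights, clients)
```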
The goal is to create a global model that learns from the contributions of all local models. This approach achieves better generalisation without compromising data, as the data never leaves the local devices — only the model parameters are transmitted. It’s an ideal technique for sectors such as healthcare, finance, or telecommunications, where data privacy is paramount.
Benefits of Federated Learning
- Data privacy
- User/client sovereignty
- Scalability (new clients, i.e. local models, can be added easily)
- Security through decentralisation
- Robustness, as training doesn’t depend on a single model
- Edge-based analysis, resulting in lower latency and faster response times
Federated Learning in the NOUS Project
Led by the AIR Institute in collaboration with 21 partners from 11 different countries, the NOUS project is working to build a large-scale European cloud infrastructure with the ultimate aim of enabling information exchange among multiple clients and training computationally intensive algorithms.
It is based on three core components:
- Compute → Providing infrastructure for high-cost algorithms
- Edge → Storing and training models locally on user devices
- Data → A secure central hub using DLTs (Distributed Ledger Technologies) and blockchain
Local devices store data in local databases, train models, and send the model weights to the global model hosted in the central cloud hub.
Simultaneously, a hash of the data is computed (a one-way cryptographic fingerprint, not encryption), and this hash is sent to the DAG system and the blockchain to enhance security. This allows references to the data to be stored on the ledger, rather than the data itself.
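As a concrete illustration, such a fingerprint can be computed locally with Python's standard library; the record format below is hypothetical:

```python
import hashlib
import json

# Hypothetical local data record; in NOUS this stays in the local database
record = {"meter_id": "A-17", "kwh": 3.42, "timestamp": "2024-05-01T12:00:00Z"}

# Serialise deterministically, then compute a SHA-256 fingerprint of the data
payload = json.dumps(record, sort_keys=True).encode("utf-8")
data_hash = hashlib.sha256(payload).hexdigest()

# Only this fixed-length reference would travel to the DAG/blockchain layer
print(data_hash)
```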
If one client requires access to another's data, traceability mechanisms allow request-based access, but direct access to the raw data is never permitted, at least not within the Federated Learning framework.
The global model receives parameters from many local models and must determine how to aggregate them. The most widely used method is FedAvg (Federated Averaging), which weights each model's parameters according to the size of the dataset it was trained on, giving more influence to models trained on larger data volumes.
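In formula terms, the aggregated parameters are w = Σ_k (n_k / n) · w_k, where n_k is the size of client k's dataset and n is the total across clients. A minimal NumPy sketch of this weighting (the function name and example values are illustrative):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Aggregate client parameters weighted by dataset size (FedAvg)."""
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()        # n_k / n for each client k
    stacked = np.stack(client_weights)  # shape: (num_clients, num_params)
    return coeffs @ stacked             # size-weighted average of parameters

# Three clients holding 100, 400 and 500 samples respectively
params = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
print(fedavg(params, [100, 400, 500]))  # -> [3.8 4.8]
```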
Use Case: Energy Consumption Prediction
An interesting use case involves smart meters in different homes collecting electricity usage data. A model is trained locally in each household using its own consumption history, and each household then sends only its parameters to the global model, as sketched below.
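A toy version of this setup, assuming each household's “model” is simply its mean hourly consumption and the aggregator weights by the number of readings; the data and names are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated hourly consumption readings (kWh) for three households
households = [rng.gamma(shape=2.0, scale=0.5, size=n)
              for n in (24 * 7, 24 * 30, 24 * 14)]

# Each household trains locally: here the "parameter" is the mean consumption
local_params = [readings.mean() for readings in households]
sizes = [len(readings) for readings in households]

# The global model aggregates the parameters, weighted by dataset size;
# only the parameters travel to the cloud hub, never the raw readings
global_param = np.average(local_params, weights=sizes)
print(f"Global estimate of hourly consumption: {global_param:.2f} kWh")
```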
Overall, the benefits of the Federated Learning approach include better generalisation from combining many locally trained models rather than relying on a single one, enhanced data privacy as the data never leaves the local environment, easier scalability, and, most importantly, indirect data collaboration without compromising security.