How Can Federated Learning Improve ML?

Jun 14, 2022

As with most life changes, there will be positive and negative impacts on society as Machine Learning (ML) continues to transform our world. ML has driven significant advances in vision, speech recognition and generation, Natural Language Processing, image and video generation, and more.

People today use ML to dictate on their phones, get recommendations, enhance images and video backgrounds, and much more. ML is the core technology behind most of these visible advances, and it has produced remarkable and lasting progress in what machines can do.

One recent and significant development in ML is a concept known as Federated Learning (FL). Put simply, it is a decentralized form of ML born at the intersection of on-device ML and edge computing/IoT.

To fully understand how FL works, we will go over current advancements and drawbacks of ML and how FL addresses these challenges with use cases and examples.

Current advancements in ML and its impact

ML is continuously transforming the way we live and get things done. It operates through the simulation of human intelligence in machines programmed to think like humans and mimic their actions, exhibiting learning, analyzing, comprehending, and problem-solving traits. The ideal characteristic of ML is its ability to rationalize and take steps with the optimum chance of achieving a specific goal or defined operations.

Advances in ML have gone beyond a few machines performing basic calculations. The expanse of the human mind has brought about developments using a cross-disciplinary approach in Computer Science, Mathematics, Linguistics, Psychology, and more. It is applied in many sectors and industries to generate the maximum output.

Some examples include ML in the healthcare industry for dosing drugs and treatments in patients, assisting surgical procedures, and more. In addition, machines that use ML include self-driving cars, automated systems, and much more.

Moreover, ML in the financial industry detects and flags malicious activities, such as unusual credit card usage and large account deposits. ML applications are also leveraged in the trading industry to simplify processes in supply, demand, and pricing of securities and to make estimates by building comprehensive analysis algorithms. To further understand the technology and its applications, let’s delve into more details on the impact of ML in various sectors.

Machine Learning in Business

Most businesses rely heavily on real-time reporting, accuracy, and processing of large volumes of quantitative data to make informed decisions. Therefore, with the efficiency and effectiveness desired, it becomes necessary to implement ML.

The adaptive intelligence in chatbots and automation helps smoothen the way for online help centers. For example, getting immediate answers to queries or solutions to problems happens with robotic process automation. It reduces the repetitive tasks performed by humans. In addition, the algorithms integrated into analytics and CRM platforms uncover information on how to serve customers better.

Machine Learning in Marketing

One of the most notable applications of ML is in the marketing industry. In the early 2000s, ML was barely used in online marketing. E-commerce has existed alongside the internet from early on, but search quality was poor. With improvements in ML, intelligent suggestions and search queries are far more effective now.

Moreover, ML has also made its way into many software and hardware components used by marketing personnel to help calibrate and analyze vast amounts of data. Big Data and ML have been major players here. They have streamlined the various processes involved in handling data, eliminating much of the load of monotonous and mundane tasks. It is safe to say that ML’s implementation in the marketing sector has raised the productivity of the domain considerably.

Machine Learning in Social Media

Today’s world is changing with social media platforms, including Instagram, Snapchat, Facebook, Twitter, and the like. People use these apps to stay connected with the world. Social media has established itself as an essential element for the current generation, particularly Gen Z and a significant number of millennials. As a result, we generate an immeasurable amount of data through chats, tweets, posts, and more.

ML influences almost everything in the social media space, from notifications to upgrades. Its algorithms store and analyze data from previous web searches, behaviors, interactions, and much more to deliver a personalized experience by designing feeds based on specific interests. ML in social media is also closely tied to Big Data and AI. For example, deep learning uses deep neural networks to extract fine-grained details from an image.

Machine Learning in Security & Surveillance

Traditional security monitoring is usually conducted by a human operative. However, humans are susceptible to making mistakes for various reasons, and errors in this domain can be dangerous. As in the applications above, ML models can be trained with supervised learning on input from security cameras to support security algorithms, identification protocols, and much more. Eventually, these models can identify potential threats and warn human security officers to investigate further.

ML has significantly evolved in the surveillance domain and can identify threats such as intruders, invalid access, unidentified individuals on defined premises, etc. Although ML is still limited in its capabilities, the security and surveillance domain expects ML and even AI to be a significant asset globally in the coming years.

Drawbacks of Machine Learning

As ML is embraced in more industries, there is a growing effort to strike a balance between using it efficiently and safeguarding users’ privacy. A common best practice is to be transparent about how ML is used and how it arrives at specific outcomes. Even so, negative effects can arise. Here are some of the drawbacks of ML in maintaining privacy, security, and protecting users’ data.

Privacy

One of the most relevant arguments against ML and its transparency is the potential lack of privacy. ML plays a vital role in our daily lives through Google searches, Maps, Alexa, and personalized recommendations on platforms like Facebook, Netflix, Amazon, and YouTube. However, it also poses serious privacy and social risks, primarily in how some organizations collect and process enormous amounts of user data without users’ adequate knowledge or consent.

Additionally, when ML is used to track individuals’ online behavior, certain critical information can be acquired, including data about:

Race or ethnicity
Political beliefs
Religious affiliations
Gender
Sexual orientation
Health conditions, etc.

Even when people choose not to share such sensitive information online, it may still be exposed because ML can infer it from other data.

Security

ML techniques can greatly enhance security, but they can also help cybercriminals penetrate systems with little to no human intervention.

ML and AI in cybersecurity create emerging threats to digital security, especially by enabling more sophisticated attacks that use complex and adaptive software. In addition, adversarial ML/AI, meaning the development and use of ML/AI for malicious purposes, also poses significant dangers. With this technique, attackers craft inputs that the model misinterprets, so that it behaves in a way that favors the attacker. For example, intentionally modified inputs can cause a model to misidentify or misclassify objects.

Another cybersecurity risk is human complacency. When an organization makes ML and AI part of its cybersecurity strategy, there’s a higher risk that employees will lower their guard and lose focus, even in the face of potential threats.

User data protection

Numerous consumer products, from smart home appliances to computer applications, have features that make them susceptible to data exploitation by ML. To make matters worse, people are often unaware of how much data their software and devices generate, process, or share. And as we become more dependent on digital technology in our daily activities, the potential for exploitation will keep increasing rapidly.

Privacy and data protection concerns emerge as ML-fuelled algorithms consume vast consumer and vendor data to create new bits of critical information unknown to the consumers.

ML/AI requires a vast amount of data, including personal data, to train models. However, as data privacy and security become an increasingly critical concern, new ML methodologies like FL are emerging to address them.

Google introduced FL in a paper first published in 2016. The key idea is that data scientists can train a shared statistical model across decentralized devices or servers, each holding its own local data set. Although every participant trains the same model, there is no need to upload private data to the cloud or swap data with other scientists or research teams. Whereas traditional ML uses centralized techniques that require data sets to reside on a single server, FL addresses the above challenges by keeping data in local stores.

Let us dive into more detail on how FL tackles the drawbacks of ML.

Federated Learning for Privacy

FL provides a variety of privacy advantages out of the box. It is a decentralized ML procedure that trains models using multiple data providers. Rather than being gathered on a single server, the data remains locked on each provider’s servers, and only the algorithms and predictive models travel between them. This approach aims for each participant to benefit from a larger data pool than their own, resulting in increased ML performance while respecting data ownership and privacy.

The traditional ML model adopts a centralized approach that requires the training data to be aggregated on a single machine or in a data center. It is pretty much what tech giants like Google, Facebook, and Amazon have done. These companies have been collecting an enormous amount of data and storing it in data centers to train ML models. However, this centralized training approach is privacy-intrusive, especially for mobile phone users. These phones may contain the owners’ privacy-sensitive data. As a result, mobile phone users have to trade their privacy by sharing their data with the cloud.

However, FL enables geographically distributed devices to train an ML model while preserving privacy. As a result, users can now benefit from obtaining a well-trained ML model without transferring privacy-sensitive data to the cloud.

This concept of FL addresses data ownership and privacy issues by ensuring the data never leaves the distributed node devices. At the same time, as the central model updates, it is shared with all nodes in the network. The sites or devices with data receive copies of the ML model. The model’s training happens locally. Finally, the updated neural network weights are sent back to the central repository, which applies the updates. In this way, the nodes iteratively build a shared, robust ML model through repeated rounds of central model sharing, local optimization, local update sharing, and global model updates.
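The round structure described above can be illustrated with a minimal sketch. It assumes a one-parameter linear model and averages the locally trained weights in proportion to each node's data size, which is one common aggregation choice (often called federated averaging) rather than something the article prescribes. The function names and the toy datasets are hypothetical.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one node


def local_train(weight: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps on one node's private data."""
    for _ in range(epochs):
        # Gradient of the mean squared error for the model y_hat = weight * x.
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def federated_round(global_weight: float, nodes: List[Dataset]) -> float:
    """One round: share the model, train locally, average the returned weights."""
    local_weights = [local_train(global_weight, data) for data in nodes]
    total = sum(len(data) for data in nodes)
    # Each node's weight counts in proportion to how many examples it holds.
    return sum(w * len(data) for w, data in zip(local_weights, nodes)) / total


def train(nodes: List[Dataset], rounds: int = 50) -> float:
    """Repeat federated rounds starting from a zero-initialized model."""
    weight = 0.0
    for _ in range(rounds):
        weight = federated_round(weight, nodes)
    return weight


# Hypothetical private datasets, all generated from y = 3x.
NODES: List[Dataset] = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
    [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
]
```

Calling `train(NODES)` returns a weight close to 3, even though the coordinating code only ever handles model weights and never reads another node's `(x, y)` pairs.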

Federated Learning for Data Security and Protection

FL is fast becoming a sought-after solution for protecting individuals’ privacy rights. By being transparent about data usage and guaranteeing secure data storage, FL can build more capable ML models.

However, these local updates are susceptible to privacy attacks without adequate protection. The FL process’s primary security and data protection methods are Global Differential Privacy, Model Encryption, and Secure Multi-Party Computation (SMPC).

Global Differential Privacy (GDP)

Differential privacy is a framework in which an algorithm counts as differentially private only if adding or removing a single instance in the training set does not cause a statistically significant change to the algorithm’s output. GDP is the most prevalent privacy protection method due to its algorithmic simplicity, formal privacy guarantee, and low overhead. Differential privacy is accomplished by randomly perturbing the parameters of the local model before aggregation and incorporation into the global model.
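As a rough illustration of the perturbation step, the sketch below clips a local update to a maximum L2 norm and then adds Gaussian noise scaled to that norm. This clip-then-noise recipe is one common way to perturb updates; it is an assumption here, not a method the article specifies. The function names and the `noise_multiplier` parameter are hypothetical, and the sketch does not show how to calibrate the noise to a particular privacy budget.

```python
import math
import random
from typing import List


def clip_update(update: List[float], max_norm: float) -> List[float]:
    """Scale the update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    if norm <= max_norm:
        return list(update)
    scale = max_norm / norm
    return [v * scale for v in update]


def privatize_update(
    update: List[float],
    max_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """Clip the update, then add independent Gaussian noise to each coordinate."""
    clipped = clip_update(update, max_norm)
    # Noise scale is tied to the clipping bound, so one node's influence is bounded.
    sigma = noise_multiplier * max_norm
    return [v + rng.gauss(0.0, sigma) for v in clipped]


# Hypothetical usage: a two-parameter local update perturbed before it is shared.
noisy = privatize_update([3.0, 4.0], max_norm=1.0, noise_multiplier=0.5, rng=random.Random(0))
```

Clipping matters because it limits how much any single participant can move the aggregate, which is what lets the added noise mask an individual contribution.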

Model Encryption

Another helpful method for security and data protection in FL is model encryption. The global model parameters are encrypted before they are distributed to the participating sites for local training. Once the local models receive the parameters, they return local gradients. The global model then combines all local gradients and decrypts them to update the central system.

Secure Multi-Party Computation (SMPC)

This method is a specialized privacy-preservation technique in which only trusted parties are permitted to obtain the output of a function computed over their private inputs. These devices learn nothing about the model beyond the output they receive. Homomorphic encryption can mask local model updates and help achieve such privacy. The trusted devices perform computations while the model is encrypted, revealing no information about the global model.
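To make the idea of computing on hidden inputs concrete, here is a toy sketch that uses pairwise additive masking instead of homomorphic encryption. Masking is a different and simpler technique, chosen only because it fits in a few lines of plain Python. Each pair of parties shares a random mask that one adds and the other subtracts, so the masks cancel in the sum and the aggregator learns only the total. All names are hypothetical. Generating every mask in one function is a simplification for illustration; in a real protocol each pair would agree on its mask privately.

```python
import random
from typing import List

MODULUS = 2**32  # all arithmetic is done modulo this value


def make_pairwise_masks(num_parties: int, length: int, rng: random.Random) -> List[List[int]]:
    """For each pair (i, j) with i < j, party i adds a shared vector and j subtracts it."""
    masks = [[0] * length for _ in range(num_parties)]
    for i in range(num_parties):
        for j in range(i + 1, num_parties):
            shared = [rng.randrange(MODULUS) for _ in range(length)]
            for k in range(length):
                masks[i][k] = (masks[i][k] + shared[k]) % MODULUS
                masks[j][k] = (masks[j][k] - shared[k]) % MODULUS
    return masks


def mask_update(update: List[int], mask: List[int]) -> List[int]:
    """Hide one party's integer-encoded update behind its mask."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]


def aggregate(masked_updates: List[List[int]]) -> List[int]:
    """Sum the masked updates; the pairwise masks cancel, leaving the true total."""
    length = len(masked_updates[0])
    return [sum(vec[k] for vec in masked_updates) % MODULUS for k in range(length)]


# Hypothetical integer-encoded updates from three parties.
updates = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
masks = make_pairwise_masks(len(updates), 3, random.Random(42))
masked = [mask_update(u, m) for u, m in zip(updates, masks)]
total = aggregate(masked)
```

Here `total` equals the element-wise sum of the three updates, while each entry of `masked` on its own looks like random numbers to the aggregator.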

FL has received extensive attention for handling the challenge of data protection and security by cooperatively learning a global model without compromising data privacy. However, despite its potential benefits, the additional computational burden and the lack of mature supporting frameworks and libraries can make the technology, and methods like homomorphic encryption in particular, more challenging to implement.

Challenges in Federated Learning

Two significant challenges in FL are:

Communication bandwidth: For example, FL on mobile phones relies on wireless communication to train an ML model collaboratively. Although the compute resources of mobile phones are becoming increasingly powerful, wireless communication bandwidth has not increased much. Limited bandwidth results in long communication latency and can significantly slow down the convergence of the FL process.
Device reliability: Another challenge is the reliability of end devices participating in the FL process. As an iterative process, FL relies heavily on the participating end devices to keep communicating across iterations until the learning process converges. However, in real-world deployments, not all end devices may participate in the complete iterative process from start to finish, for various practical reasons. As a result, the training iterations cannot fully utilize the data of end devices that drop out midway through the FL process, jeopardizing the learning quality.
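Both challenges can be made concrete with a little arithmetic. The sketch below estimates how long one round of model exchange takes over a given link, then averages only the updates from devices that actually reported back. The model size, link speeds, and function names are hypothetical values picked for illustration.

```python
from typing import Dict, List, Optional


def round_transfer_seconds(
    model_params: int,
    bytes_per_param: int,
    uplink_mbps: float,
    downlink_mbps: float,
) -> float:
    """Time to download the global model and upload the local update once."""
    bits = model_params * bytes_per_param * 8
    return bits / (downlink_mbps * 1e6) + bits / (uplink_mbps * 1e6)


def average_surviving(updates: Dict[str, Optional[List[float]]]) -> List[float]:
    """Average the updates of devices that reported; None marks a dropout."""
    reported = [u for u in updates.values() if u is not None]
    if not reported:
        raise ValueError("no device completed the round")
    length = len(reported[0])
    return [sum(u[k] for u in reported) / len(reported) for k in range(length)]


# Hypothetical: 1M float32 parameters over a 10 Mbps uplink and 40 Mbps downlink.
seconds = round_transfer_seconds(1_000_000, 4, uplink_mbps=10.0, downlink_mbps=40.0)

# Hypothetical round in which device "b" dropped out before reporting.
avg = average_surviving({"a": [1.0, 2.0], "b": None, "c": [3.0, 4.0]})
```

With these numbers a single round costs about 4 seconds of pure transfer time per device, so hundreds of rounds add up quickly. The average also silently ignores whatever device "b" could have contributed, which is the learning-quality risk described above.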

Everyday use cases of Federated Learning

Federated Learning in Healthcare

Regulations in the healthcare industry have multiplied, including the implementation of HIPAA (Health Insurance Portability and Accountability Act). As a result, it is becoming increasingly difficult for organizations to implement new technologies due to the scale and complexity of healthcare regulations. The industry’s declining resources make this harder still.

FL allows vast volumes of heterogeneous data from several organizations to be integrated into model building while adhering to local clinical data regulations. As a result, clinicians would get access to a more robust ML algorithm, trained on data from a larger patient demographic than any they could have assembled locally. They’d also be able to contribute to the algorithm’s ongoing training at any time.

With FL offering a secure way to learn from more diverse data, healthcare companies can rapidly bring ground-breaking inventions to the market. Research institutes, for their part, no longer need to depend on the limited supply of free datasets and can channel their efforts into actual clinical needs based on a wide range of real-world data.

Federated Learning in Fintech

Whether it’s mobile banking or payment apps, data security has become a critical concern in the Internet world. Businesses that rely on Fintech confront several challenges. These concerns include obtaining clearance and lawful consent, data preservation, and the time and expense of gathering and transporting data across networks.

FL enables cooperative ML training on decentralized data without data transmission between participants. It provides solutions for Fintech, for example, by detecting data breaches and ATO (Account Takeover) fraud. Moreover, it can analyze credit scores and interpret a user’s digital footprint to prevent fraudulent actions without sharing data with the cloud.

Therefore, FL makes it possible for Fintech to mitigate risks. In addition, it develops new and inventive techniques for customers and enterprises and establishes better trust between the two parties.

Federated Learning for Autonomous Vehicles

Furthermore, FL can provide a better and safer self-driving car experience with real-time data and predictions. Autonomous vehicles require real-time traffic data and continuous training for better real-time decision-making. FL allows the models to improve over time with input from other cars, each of which trains locally on the streets it drives on.

Conclusion

FL can provide various solutions to the existing ML challenges while not compromising efficiency and goals.

In this article, we went over the current advancements in ML with real-life examples. Then, we further delved into the drawbacks of ML and how FL is tackling these issues. Finally, we identified the challenges in FL and everyday use cases of FL with examples.

It is evident that FL has considerable potential. For example, it secures user-sensitive information, aggregates results, and identifies common patterns from multiple users, making the model robust.

The model trains on user data where that data resides, keeps the data secure, and comes back as a more intelligent model. FL is reshaping the industry, whether in training, testing, or information privacy. It has opened a new era of ML as businesses, consumers, researchers, and scientists benefit from FL models.

Author: Joy Nwaiwu