SGD phase diagrams in batch size (B) and learning rate (η) for a fully connected network (5 hidden layers) trained on parity MNIST. Credit: Proceedings of the National Academy of Sciences (2024). DOI: 10.1073/pnas.2316301121

In an era where artificial intelligence (AI) is transforming industries from health care to finance, understanding how these digital brains learn is more crucial than ever. Now, two researchers from EPFL, Antonio Sclocchi and Matthieu Wyart, have shed light on this process, focusing on a popular method known as Stochastic Gradient Descent (SGD).

At the heart of an AI’s learning process are algorithms: sets of rules that guide AIs to improve based on the data they’re fed. SGD is one of these algorithms, like a guiding star that helps AIs navigate a complex landscape of information to find the best possible solutions a bit at a time.

However, not all learning paths are equal. The EPFL study, published in Proceedings of the National Academy of Sciences, reveals how different approaches to SGD can significantly affect the efficiency and quality of AI learning. Specifically, the researchers examined how changing two key variables can lead to vastly different learning outcomes.

The two variables were the size of the data samples the AI learns from at a single time (this is called the “batch size”) and the magnitude of its learning steps (this is the “learning rate”). They identified three distinct scenarios (“regimes”), each with unique characteristics that affect the AI’s learning process differently.
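To make the two variables concrete, here is a minimal sketch of mini-batch SGD in plain Python. It is not the setup used in the study: the toy task (fitting a single slope `w` to noiseless data), the function name `sgd_linear_fit`, and all numeric values are assumptions chosen purely for illustration. The point is only to show where `batch_size` and `learning_rate` enter the update.

```python
import random

def sgd_linear_fit(xs, ys, batch_size, learning_rate, steps, seed=0):
    """Fit y = w * x with mini-batch SGD on the mean squared error.

    Hypothetical toy example: one parameter, no bias term.
    """
    rng = random.Random(seed)
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Batch size: how many examples the model looks at in one step.
        batch = rng.sample(range(n), batch_size)
        # Gradient of the batch-averaged squared error with respect to w.
        grad = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / batch_size
        # Learning rate: how far the parameter moves against the gradient.
        w -= learning_rate * grad
    return w

# Toy data generated from y = 3x (an assumption for this sketch).
xs = [i / 10.0 for i in range(1, 21)]
ys = [3.0 * x for x in xs]

w_hat = sgd_linear_fit(xs, ys, batch_size=4, learning_rate=0.05, steps=500)
print(w_hat)  # should be very close to the true slope of 3
```

On a problem this simple every reasonable setting reaches the same answer. The regimes described in the study concern deep networks, where the choice of batch size and learning rate changes which solution is found.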

In the first scenario, like exploring a new city without a map, the AI takes small, random steps, using small batches and high learning rates, which allows it to stumble upon solutions it might not have found otherwise. This approach is beneficial for exploring a wide range of possibilities but can be chaotic and unpredictable.

The second scenario involves the AI taking a significant initial step based on its first impression, using larger batches and larger learning rates, followed by smaller, exploratory steps. This regime can speed up the learning process but risks missing out on better solutions that a more cautious approach might discover.

The third scenario is like using a detailed map to navigate directly to known destinations. Here, the AI uses large batches and smaller learning rates, making its learning process more predictable and less prone to random exploration. This approach is efficient but may not always lead to the most creative or optimal solutions.
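One ingredient behind these scenarios can be checked directly: the gradient estimated from a small batch fluctuates more from one batch to the next than the gradient estimated from a large batch. The sketch below measures that spread on a toy problem. It does not reproduce the study's regimes or their boundaries; the data, the helper name `minibatch_gradient_spread`, and the number of trials are illustrative assumptions.

```python
import random
import statistics

def minibatch_gradient_spread(xs, ys, w, batch_size, trials=2000, seed=0):
    """Standard deviation of the mini-batch gradient at a fixed parameter w.

    Hypothetical helper: samples many random batches and measures how much
    the gradient of the squared error (for the model y = w * x) varies.
    """
    rng = random.Random(seed)
    n = len(xs)
    grads = []
    for _ in range(trials):
        batch = rng.sample(range(n), batch_size)
        g = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / batch_size
        grads.append(g)
    return statistics.pstdev(grads)

# Toy data generated from y = 3x (an assumption for this sketch).
xs = [i / 10.0 for i in range(1, 21)]
ys = [3.0 * x for x in xs]

# Spread of the gradient at w = 0 for a tiny, a medium, and a full batch.
spread_small = minibatch_gradient_spread(xs, ys, w=0.0, batch_size=1)
spread_medium = minibatch_gradient_spread(xs, ys, w=0.0, batch_size=5)
spread_full = minibatch_gradient_spread(xs, ys, w=0.0, batch_size=len(xs))

print(spread_small, spread_medium, spread_full)
```

The spread shrinks as the batch grows and vanishes, up to rounding error, when the batch is the whole dataset. This is why large batches make each step more predictable and small batches make the path more random.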

The study offers a deeper understanding of the tradeoffs involved in training AI models and highlights the importance of tailoring the learning process to the particular needs of each application. For example, medical diagnostics, where accuracy is paramount, might benefit from a more exploratory approach, while voice recognition might favor more direct learning paths for speed and efficiency.

More information:
Antonio Sclocchi et al, On the different regimes of stochastic gradient descent, Proceedings of the National Academy of Sciences (2024). DOI: 10.1073/pnas.2316301121

Journal information:
Proceedings of the National Academy of Sciences

Provided by
Ecole Polytechnique Federale de Lausanne

