Stochastic Gradient Descent (SGD) is one of the most widely used methods for optimizing machine learning models during training.
Gradient Descent itself is an optimization method: it repeatedly adjusts a model's parameters in the direction that most reduces the loss function, the quantity that measures how far the model's predictions are from the training targets, as sketched below.
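A minimal sketch of plain (full-batch) gradient descent on a least-squares problem. The synthetic data, the learning rate `lr`, and the step count are illustrative assumptions, not details from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))               # 100 samples, 3 features (synthetic data)
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100) # targets with a little noise

w = np.zeros(3)                             # parameters to learn
lr = 0.1                                    # learning rate (step size)
for step in range(200):
    residual = X @ w - y                    # prediction error on the full dataset
    grad = 2.0 / len(X) * (X.T @ residual)  # gradient of the mean squared error w.r.t. w
    w -= lr * grad                          # move the parameters downhill

print(w)                                    # approaches true_w as the loss shrinks
```

Each iteration uses the entire dataset to compute the gradient, which is exact but expensive for large datasets; that cost is what motivates the stochastic variant.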
SGD has become the most popular optimization algorithm for fitting neural networks: instead of computing the gradient over the whole dataset, it estimates it from a small random mini-batch of examples at each step, making updates far cheaper. One variant that is becoming dominant in new AI/ML research papers is the Adaptive Moment Estimation (Adam, introduced in 2015) optimizer, which adapts each parameter's step size using running estimates of the gradient's first and second moments.
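A hedged sketch of mini-batch training with PyTorch's built-in SGD and Adam optimizers. The tiny linear model, synthetic data, batch size, and learning rates are assumptions made for illustration only.

```python
import torch

X = torch.randn(512, 10)                    # synthetic inputs
y = torch.randn(512, 1)                     # synthetic targets
model = torch.nn.Linear(10, 1)
loss_fn = torch.nn.MSELoss()

# Swap between the two optimizers: plain SGD uses one global learning rate,
# while Adam adapts per-parameter step sizes from moment estimates of the gradient.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

for epoch in range(20):
    for i in range(0, len(X), 64):           # mini-batches of 64 samples
        xb, yb = X[i:i + 64], y[i:i + 64]
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()                      # gradients computed from this batch only
        optimizer.step()                     # one stochastic parameter update
```

Because each update sees only a mini-batch, the gradient is a noisy estimate of the full-dataset gradient, which is exactly what the "stochastic" in SGD refers to.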