Gradient descent visualization

weinzierl | 263 points

These are nice animations. However, I've always hesitated to get too enamored with these simple 2D visualizations of gradient descent, because one of the strange takeaways from deep learning is that behavior in high dimensions is very different from behavior in low dimensions.

In a 2D problem with many local minima, like the Eggholder function [1], gradient descent is hopeless. Yet neural net optimization in high dimensions is also a landscape with many local minima, and there gradient descent does great.

Gradient descent in high dimensions also seems to have the ability to "step over" areas of high loss, which you can see by looking at the loss of a linear interpolation between weights at successive steps of gradient descent. This, again, seems like extremely strange behavior with no low-dimensional analogue.

[1] https://www.sfu.ca/~ssurjano/egg.html
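The interpolation check described above is easy to sketch. This is a toy illustration (hypothetical `loss_fn`, 1D weights) of evaluating the loss along the straight line between two iterates; a bump in the middle is the "stepped over" barrier:

```python
import numpy as np

def interpolation_losses(loss_fn, w_before, w_after, n_points=21):
    """Loss along the line segment between two weight vectors
    (e.g. successive gradient-descent iterates)."""
    alphas = np.linspace(0.0, 1.0, n_points)
    return [loss_fn((1 - a) * w_before + a * w_after) for a in alphas]

# Toy example: a 1D loss with two basins and a barrier between them.
loss = lambda w: float((w**2 - 1.0)**2)   # minima at w = -1 and w = +1
losses = interpolation_losses(loss, np.array(-1.0), np.array(1.0))
# Both endpoints sit in minima, but the midpoint (w = 0) has high loss:
# a step from one to the other would "step over" the barrier.
```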

levocardia | 12 days ago

Note that such a 2D parameter space often gives the wrong intuition when thinking about applying gradient descent to a high-dimensional parameter space.

Also, mini-batch stochastic gradient descent behaves more stochastically than plain (full-batch) gradient descent.
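One way to see this stochasticity: each minibatch gradient is a noisy estimate of the full-batch gradient. A quick sketch on synthetic least-squares data (the model and data here are assumptions, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
w_true = np.arange(1.0, 6.0)
y = X @ w_true + 0.1 * rng.normal(size=1000)

def batch_grad(w, idx):
    """Gradient of the mean squared error over the rows in idx."""
    Xb, yb = X[idx], y[idx]
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(idx)

w = np.zeros(5)
full = batch_grad(w, np.arange(len(X)))                    # full-batch gradient
minis = np.array([batch_grad(w, rng.choice(len(X), 32, replace=False))
                  for _ in range(200)])                    # 200 minibatch estimates

noise = minis.std(axis=0)   # nonzero spread: every minibatch points somewhere
                            # slightly different, averaging out near `full`
```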

albertzeyer | 12 days ago

Looks great!

Killer feature: allow arbitrary surfaces to be used (with a choice of 2D projection if the space has more dimensions). Integrate it with Python to allow use in production! And allow arbitrary zoom and level of detail for the surface.

DrNosferatu | 12 days ago

Here is a project I created for myself (some time ago) to help visualize the gradient as a vector field.

https://github.com/GistNoesis/VisualizeGradient

Probably best used as support material, with someone teaching alongside it to build the right mental picture as you go.

A great exercise is to have the student (or yourself) draw this visualization with pen and paper, both in 2D and 3D, for various functions. You can then make the connection to the usual "tangent line" picture of a derivative.
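For a pen-and-paper check of that vector-field picture, the gradient can be tabulated analytically on a grid. A minimal sketch for f(x, y) = x² + y² (this choice of function is just an example):

```python
import numpy as np

xs = np.linspace(-2.0, 2.0, 9)
X, Y = np.meshgrid(xs, xs)
U, V = 2 * X, 2 * Y   # analytic gradient of f(x, y) = x**2 + y**2

# matplotlib's plt.quiver(X, Y, -U, -V) would draw the *descent* field:
# every arrow points toward the minimum at the origin.
# Sanity check: the gradient always points away from the origin,
# so its dot product with the position vector is never negative.
outward = X * U + Y * V
```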

GistNoesis | 12 days ago

Ignoring the dimensionality issue for a moment, wouldn't this also represent neural network training on a single training example/batch rather than on the whole trainset?

Still a beginner in this area, but my understanding is that the standard training procedure for neural networks used today involves computing the loss ("height" of the ball) and gradient ("slope" beneath the ball) for one instance, then moving the ball one step in the direction opposite the gradient, then repeating the process with a different instance.

(You will also typically group a small number of instances together in a batch and process them in the same iteration - but that's for efficiency reasons, not because it's essential to the training algorithm.)

So if you wanted to visualise the entire training procedure, I'd imagine an animation of a smoothly rolling ball but with a rapidly changing "landscape" underneath.
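The loop described above can be sketched in a few lines. This is a toy version on synthetic linear-regression data (the model, data, and hyperparameters are assumptions): each iteration draws a new minibatch, which swaps out the "landscape" under the ball, then takes one step down that batch's loss surface:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(256, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                       # noiseless targets, so SGD can converge

w, lr, batch_size = np.zeros(3), 0.05, 16
for step in range(500):
    idx = rng.choice(len(X), batch_size, replace=False)  # new batch = new landscape
    Xb, yb = X[idx], y[idx]
    g = 2.0 * Xb.T @ (Xb @ w - yb) / batch_size          # slope of *this* batch's loss
    w -= lr * g                                          # one downhill step
```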

Would that be correct?

(Writing this out of genuine curiosity, not as a criticism of the project. I think a lot more work should be done on useful visualisations in this area - and this is an incredibly helpful start.)

xg15 | 10 days ago

I love creating things to solidify my intuition about topics. When I learned about gradient descent, I saw this repo and was inspired to create my own toy Python package for visualizing gradient descent.

My package, which uses PyVista for visualization:

https://github.com/JacobBumgarner/grad-descent-visualizer

j_bum | 12 days ago

I thought gradient descent always followed the steepest slope, but this looks like a physics simulation where marbles are rolling down hills. In mathematical gradient descent, do you really oscillate around the minimum like a marble rocking back and forth in a pit?

Edit: Oh, I see. The animations are "momentum" gradient descent
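The difference is easy to see side by side. A minimal sketch on the 1D quadratic f(x) = x² (the learning rate and momentum coefficient below are arbitrary illustrative choices): plain gradient descent approaches the minimum monotonically, while heavy-ball momentum keeps a velocity, overshoots, and rocks back like the marbles in the animation:

```python
f_grad = lambda x: 2 * x            # gradient of f(x) = x**2, minimum at 0

def plain_gd(x, lr=0.1, steps=50):
    xs = [x]
    for _ in range(steps):
        x -= lr * f_grad(x)          # step straight down the slope
        xs.append(x)
    return xs

def momentum_gd(x, lr=0.1, beta=0.9, steps=50):
    xs, v = [x], 0.0
    for _ in range(steps):
        v = beta * v - lr * f_grad(x)  # accumulate velocity across steps
        x += v
        xs.append(x)
    return xs

plain = plain_gd(1.0)    # stays on one side of the minimum
heavy = momentum_gd(1.0) # crosses zero and oscillates before settling
```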

umvi | 12 days ago

This is great!

Which open source license is this under? (Absence of license by default implies 'copyrighted', which in this case could be in conflict with the Qt open source license. Note: I am not a lawyer.)

alok-g | 12 days ago

As an extremely visual thinker/learner, thank you for creating this!

Many things that were too abstract on paper and in formulas are MUCH easier to understand this way.

can16358p | 12 days ago