Speaker
Description
In deep learning, one often operates in a (highly) overparametrized regime, meaning there are significantly more trainable parameters than available training data. Nevertheless, experiments show that the generalization error after training with (stochastic) gradient descent is still small, although one would expect overfitting, i.e. a small training error together with a relatively large test error.
This suggests the existence of an implicit bias towards learning networks that generalize well, in settings where infinitely many networks can achieve zero training loss.
To investigate this phenomenon, we analyze the training dynamics of deep diagonal linear networks. Equivalently, this setting can be interpreted as the recovery of sparse signals from linear measurements.
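To make the setup concrete, here is a minimal sketch (illustrative only, not the speaker's code) of a depth-L diagonal linear network, parametrizing the predictor as the elementwise product w = u_1 * ... * u_L and trained by plain gradient descent on noiseless linear measurements of a sparse vector; the dimensions, initialization scale, and step size are assumptions chosen for the example.

# Sketch: depth-L diagonal linear network trained by gradient descent
# on the squared loss over measurements y = X w_star of a sparse w_star.
import numpy as np

rng = np.random.default_rng(0)
n, d, depth = 20, 50, 3                 # fewer measurements than parameters
w_star = np.zeros(d)
w_star[:3] = [2.0, 1.0, 0.5]            # sparse, nonnegative ground truth
X = rng.standard_normal((n, d))
y = X @ w_star

alpha = 0.1                             # small initialization scale
U = alpha * np.ones((depth, d))         # layer parameters u_1, ..., u_L

lr, steps = 1e-2, 50_000
for _ in range(steps):
    w = U.prod(axis=0)                  # effective linear predictor
    grad_w = X.T @ (X @ w - y) / n      # gradient of the least-squares loss w.r.t. w
    # chain rule: dL/du_l = dL/dw * prod_{k != l} u_k, applied to all layers at once
    layer_grads = np.stack([grad_w * np.delete(U, l, axis=0).prod(axis=0)
                            for l in range(depth)])
    U -= lr * layer_grads

w = U.prod(axis=0)
print("training loss:", 0.5 * np.mean((X @ w - y) ** 2))
print("largest recovered coordinates:", np.argsort(-np.abs(w))[:3])  # expected to match the support {0, 1, 2}

With this small initialization, gradient descent on the overparametrized factors tends to drive the training loss to zero while keeping the off-support coordinates of w small, which is one concrete manifestation of the implicit bias discussed above.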
We propose a method to prove convergence of gradient descent and to fully characterize its limit, using techniques inspired by mirror descent and a Łojasiewicz-type inequality.
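For orientation, the two ingredients mentioned above take, in their generic textbook form (not necessarily the exact statements used in the talk), the following shape:

\[
\nabla\phi(w_{k+1}) \;=\; \nabla\phi(w_k) \;-\; \eta\,\nabla L(w_k)
\qquad\text{(mirror-descent step with mirror map } \phi\text{),}
\]
\[
|L(w) - L^\ast|^{\theta} \;\le\; C\,\|\nabla L(w)\|
\qquad\text{(Łojasiewicz gradient inequality, } \theta \in (0,1)\text{).}
\]

In this generic picture, a Łojasiewicz-type inequality yields convergence of the loss values along the trajectory, while the mirror-descent viewpoint typically identifies the limit, when the iterates interpolate the data \(Xw = y\), as the interpolating solution minimizing the Bregman divergence \(D_\phi(\cdot, w_0)\) to the initialization.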