(2023) Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction.
(2023) Revisiting Gradient Clipping: Stochastic bias and tight convergence guarantees.
(2023) Special Properties of Gradient Descent with Large Learning Rates.
(2023) On the effectiveness of partial variance reduction in federated learning with heterogeneous data.
(2023) Stochastic distributed learning with gradient quantization and double-variance reduction.
(2022) Preserving privacy with PATE for heterogeneous data.
(2022) Decentralized Local Stochastic Extra-Gradient for Variational Inequalities.
(2022) Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning.
(2022) ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training.
(2022) ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
(2022) Masked Training of Neural Networks with Partial Gradients.
(2021) The Peril of Popular Deep Learning Uncertainty Estimation Methods.