Critical Point-Finding Methods Reveal Gradient-Flat Regions of Deep Network Losses.

Neural Comput

Redwood Center for Theoretical Neuroscience and Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94720, USA; and Biological Systems and Engineering Division and Computational Research Division, Lawrence Berkeley National Lab, Berkeley, CA 94720, U.S.A.

Published: May 2021


Article Abstract

Although the loss functions of deep neural networks are highly nonconvex, gradient-based optimization algorithms converge to approximately the same performance from many random initial points. One thread of work has focused on explaining this phenomenon by numerically characterizing the local curvature near critical points of the loss function, where the gradients are near zero. Such studies have reported that neural network losses enjoy a no-bad-local-minima property, in disagreement with more recent theoretical results. We report here that the methods used to find these putative critical points suffer from a bad local minima problem of their own: they often converge to or pass through regions where the gradient norm has a stationary point. We call these gradient-flat regions, since they arise when the gradient is approximately in the kernel of the Hessian, such that the loss is locally approximately linear, or flat, in the direction of the gradient. We describe how the presence of these regions necessitates care both in interpreting past results that claimed to find critical points of neural network losses and in designing second-order methods for optimizing neural networks.
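
The idea of a gradient-flat point can be illustrated with a minimal sketch on a toy function. This is not the networks or the point-finding methods studied in the article; the function f(x, y) = x + y^2, the step size, and all names below are assumptions chosen for illustration. The squared gradient norm g = 0.5 * ||grad f||^2 has gradient H @ grad f, where H is the Hessian, so descent on g can stall wherever the gradient lies in the kernel of H, even though the gradient itself is not zero.

```python
import numpy as np

# Toy loss (an assumption for illustration): f(x, y) = x + y**2.
def grad_f(theta):
    """Analytic gradient of f at theta = (x, y)."""
    _, y = theta
    return np.array([1.0, 2.0 * y])

def hess_f(theta):
    """Analytic Hessian of f (constant for this toy function)."""
    return np.array([[0.0, 0.0], [0.0, 2.0]])

def descend_on_grad_norm(theta0, lr=0.1, steps=100):
    """Gradient descent on g(theta) = 0.5 * ||grad f(theta)||^2.

    The gradient of g is H(theta) @ grad f(theta), so the iteration
    stops moving wherever grad f lies in the kernel of the Hessian.
    """
    theta = np.array(theta0, dtype=float)
    for _ in range(steps):
        theta = theta - lr * hess_f(theta) @ grad_f(theta)
    return theta

theta_star = descend_on_grad_norm([0.5, 1.0])
g = grad_f(theta_star)
H = hess_f(theta_star)

# A true critical point would have gradient norm near 0.
print("gradient norm:", np.linalg.norm(g))
# A gradient-flat point has H @ gradient near 0 instead.
print("norm of H @ gradient:", np.linalg.norm(H @ g))
```

In this sketch the gradient norm stays close to 1 while the norm of H @ gradient shrinks toward 0: the search settles on a stationary point of the gradient norm that is not a critical point of f, which is the kind of point the abstract calls gradient-flat.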

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8919680
DOI: http://dx.doi.org/10.1162/neco_a_01388
