IBS Publications Repository: α-stable convergence of heavy-/light-tailed infinitely wide neural networks

IBS Publications Repository > Pioneer Research Center for Mathematical and Computational Sciences > 1. Journal Papers

α-stable convergence of heavy-/light-tailed infinitely wide neural networks

DC Field	Value	Language
dc.contributor.author	Jung, Paul	-
dc.contributor.author	Lee, Hoil	-
dc.contributor.author	Lee, Jiho	-
dc.contributor.author	Yang, Hongseok	-
dc.date.accessioned	2024-01-16T22:00:22Z	-
dc.date.available	2024-01-16T22:00:22Z	-
dc.date.created	2023-08-02	-
dc.date.issued	2023-12	-
dc.identifier.issn	0001-8678	-
dc.identifier.uri	https://pr.ibs.re.kr/handle/8788114/14629	-
dc.description.abstract	We consider infinitely wide multi-layer perceptrons (MLPs) which are limits of standard deep feed-forward neural networks. We assume that, for each layer, the weights of an MLP are initialized with independent and identically distributed (i.i.d.) samples from either a light-tailed (finite-variance) or a heavy-tailed distribution in the domain of attraction of a symmetric $\alpha$-stable distribution, where $\alpha\in(0,2]$ may depend on the layer. For the bias terms of the layer, we assume i.i.d. initializations with a symmetric $\alpha$-stable distribution having the same $\alpha$ parameter as that layer. Non-stable heavy-tailed weight distributions are important since they have been empirically seen to emerge in trained deep neural nets such as the ResNet and VGG series, and proven to naturally arise via stochastic gradient descent. The introduction of heavy-tailed weights broadens the class of priors in Bayesian neural networks. In this work we extend a recent result of Favaro, Fortini, and Peluchetti (2020) to show that the vector of pre-activation values at all nodes of a given hidden layer converges in the limit, under a suitable scaling, to a vector of i.i.d. random variables with symmetric $\alpha$-stable distributions, $\alpha\in(0,2]$. © The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust.	-
dc.language	English	-
dc.publisher	Cambridge University Press	-
dc.title	α-stable convergence of heavy-/light-tailed infinitely wide neural networks	-
dc.type	Article	-
dc.type.rims	ART	-
dc.identifier.wosid	001168005000006	-
dc.identifier.scopusid	2-s2.0-85165334144	-
dc.identifier.rimsid	81390	-
dc.contributor.affiliatedAuthor	Yang, Hongseok	-
dc.identifier.doi	10.1017/apr.2023.3	-
dc.identifier.bibliographicCitation	Advances in Applied Probability, v.55, no.4, pp.1415 - 1441	-
dc.relation.isPartOf	Advances in Applied Probability	-
dc.citation.title	Advances in Applied Probability	-
dc.citation.volume	55	-
dc.citation.number	4	-
dc.citation.startPage	1415	-
dc.citation.endPage	1441	-
dc.description.journalClass	1	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordAuthor	Heavy-tailed distribution	-
dc.subject.keywordAuthor	infinite-width limit	-
dc.subject.keywordAuthor	multi-layer perceptrons	-
dc.subject.keywordAuthor	stable process	-
dc.subject.keywordAuthor	weak convergence	-
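
The convergence statement in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' construction, and it rests on several simplifying assumptions that are not in the record: weights and biases are drawn exactly from a symmetric $\alpha$-stable law with unit scale (a special case of the domain of attraction), the activation is tanh, the input is the scalar x = 1, and the scaling is taken to be n^(-1/α). The function names are hypothetical. The sketch compares the empirical characteristic function of one second-layer pre-activation, evaluated at t = 1, with the value exp(-(m + 1)) implied by a symmetric $\alpha$-stable limit, where m is a Monte Carlo estimate of E|tanh(Y1)|^α.

```python
import numpy as np


def sample_symmetric_stable(alpha, size, rng):
    """Chambers-Mallows-Stuck sampler for a symmetric alpha-stable law
    with unit scale, i.e. characteristic function exp(-|t|**alpha)."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    if alpha == 1.0:
        return np.tan(u)  # Cauchy case
    return (np.sin(alpha * u) / np.cos(u) ** (1.0 / alpha)) * (
        np.cos((1.0 - alpha) * u) / w
    ) ** ((1.0 - alpha) / alpha)


def second_layer_preactivation(alpha, width, n_draws, rng, x=1.0):
    """Draw n_draws independent networks of the given width and return one
    second-layer pre-activation per network, plus the hidden activations.
    Assumptions of this sketch: tanh activation, scalar input x,
    stable weights and biases, scaling width**(-1/alpha)."""
    shape = (n_draws, width)
    # First hidden layer: weight * input + bias for each of the `width` nodes.
    y1 = sample_symmetric_stable(alpha, shape, rng) * x
    y1 = y1 + sample_symmetric_stable(alpha, shape, rng)
    phi = np.tanh(y1)
    # Second layer: scaled weighted sum of activations plus a stable bias.
    w2 = sample_symmetric_stable(alpha, shape, rng)
    b2 = sample_symmetric_stable(alpha, n_draws, rng)
    y2 = width ** (-1.0 / alpha) * np.sum(w2 * phi, axis=1) + b2
    return y2, phi


rng = np.random.default_rng(0)
alpha = 1.5
y2, phi = second_layer_preactivation(alpha, width=200, n_draws=20000, rng=rng)

# For a symmetric law the characteristic function at t = 1 equals E[cos(Y)].
empirical_cf = np.mean(np.cos(y2))
# A symmetric alpha-stable limit with scale**alpha = E|tanh(Y1)|**alpha + 1
# has characteristic function exp(-scale**alpha) at t = 1.
limit_scale_alpha = np.mean(np.abs(phi) ** alpha) + 1.0
predicted_cf = np.exp(-limit_scale_alpha)

print(f"empirical  E[cos(Y2)] = {empirical_cf:.4f}")
print(f"predicted  exp(-s^a)  = {predicted_cf:.4f}")
```

The comparison uses the characteristic function rather than moments because a heavy-tailed stable law with $\alpha < 2$ has infinite variance, whereas cos(Y) is bounded and its sample mean is well behaved. Changing `alpha` or `width` in the call shows how the agreement depends on the tail index and on the layer width.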