ELM's Learning Theories

New Learning Theory - Learning can be achieved without iteratively tuning (artificial) hidden nodes (or hundreds of types of biological neurons), even when the exact model of the biological neurons is unknown, as long as their output functions are nonlinear piecewise continuous. Such a network can approximate any continuous target function with arbitrarily small error and can separate any disjoint regions, all without tuning the hidden neurons.

G.-B. Huang, et al., “Universal approximation using incremental constructive feedforward networks with random hidden nodes,” IEEE Transactions on Neural Networks, vol. 17, no. 4, pp. 879-892, 2006.
G.-B. Huang and L. Chen, “Convex Incremental Extreme Learning Machine,” Neurocomputing, vol. 70, pp. 3056-3062, 2007.
G.-B. Huang, et al., “Extreme Learning Machine for Regression and Multiclass Classification,” IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, vol. 42, no. 2, pp. 513-529, 2012.

G.-B. Huang, “An Insight into Extreme Learning Machines: Random Neurons, Random Features and Kernels,” Cognitive Computation, vol. 6, 2014.

G.-B. Huang, “What are Extreme Learning Machines? Filling the Gap between Frank Rosenblatt's Dream and John von Neumann's Puzzle,” Cognitive Computation, vol. 7, 2015.
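
The learning theory stated above can be illustrated with a minimal sketch, assuming NumPy, a tanh activation, Gaussian random hidden parameters and a toy sine target. None of these choices come from the cited papers, and the function names `elm_fit` and `elm_predict` are hypothetical. The hidden nodes are drawn once at random and never tuned; only the output weights are computed, here by ordinary least squares.

```python
import numpy as np

def elm_fit(X, T, n_hidden=50, seed=0):
    """Fit a single-hidden-layer network without tuning the hidden nodes."""
    rng = np.random.default_rng(seed)
    # Hidden-node parameters are random and stay fixed (assumed Gaussian here).
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    # Hidden-layer output matrix; tanh is one nonlinear piecewise continuous choice.
    H = np.tanh(X @ W + b)
    # Only the output weights are learned, as a least-squares solution.
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Apply the fixed random hidden layer, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta

# Toy regression target (an assumption for illustration): sin(x) on [-3, 3].
X = np.linspace(-3, 3, 200).reshape(-1, 1)
T = np.sin(X)

W, b, beta = elm_fit(X, T)
mse = float(np.mean((elm_predict(X, W, b, beta) - T) ** 2))
print("training MSE:", mse)
```

Swapping tanh for another nonlinear piecewise continuous function, or changing the random distribution, leaves the procedure unchanged: the hidden layer is still generated once and only `beta` is solved for.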

ELM and von Neumann's Puzzle:

The puzzle posed by J. von Neumann, the father of computers [Neumann 1951, 1956]:
Why “an imperfect (biological) neural network, containing many random connections, can be made to perform reliably those functions which might be represented by idealized wiring diagrams”?

Answered by ELM Learning Theory [Huang, et al. 2006, 2007, 2008]:
“As long as the output functions of hidden neurons are nonlinear piecewise continuous and even if their shapes and modeling are unknown, (biological) neural networks with random hidden neurons attain both universal approximation and classification capabilities, and the changes in finite number of hidden neurons and their related connections do not affect the overall performance of the networks.” [Huang 2014]

ELM Learning Theory [Huang, et al. 2006, 2007, 2008, 2014, 2015]

  • ELM can be used to train a wide range of multi-hidden-layer feedforward networks:
  • Each hidden layer can be trained by a single ELM according to its role: feature learning, clustering, regression or classification.
  • The entire network as a whole can be considered a single ELM in which hidden neurons need not be tuned.
  • An ELM slice can be “inserted” into many local parts of a multi-hidden-layer feedforward network, or work together with other learning architectures / models.
  • A hidden node in an ELM slice (a “generalized” SLFN) can itself be a network of several nodes, so local receptive fields can be formed.
  • In each hidden layer, the connections from the input layer to the hidden nodes can be full or partial, and can be generated randomly according to different continuous probability distribution functions.
  • From the point of view of ELM theories, the entire multi-layer network is structured and ordered, although a particular layer or neuron slice may seem “messy” and “unstructured”. “Hard wiring” can be built randomly at the local level with full or partial connections.
  • The co-existence of globally structured architectures and locally random hidden neurons happens to provide the fundamental learning capabilities of compression, feature learning, clustering, regression and classification.
  • Biological learning mechanisms are sophisticated; we believe that “learning without tuning hidden neurons” is one of the fundamental biological learning mechanisms in many modules of learning systems. Furthermore, random hidden neurons and “random wiring” are only two specific implementations of such “learning without tuning hidden neurons” mechanisms.
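
The points above about full or partial random connections can be sketched as follows. This is only an illustration under stated assumptions: NumPy, a uniform distribution for the random weights, a tanh activation, and a toy target. The helper names `random_hidden_layer` and `hidden_output` and the `connect_prob` parameter are hypothetical and do not come from the ELM papers.

```python
import numpy as np

def random_hidden_layer(n_inputs, n_hidden, connect_prob=1.0, seed=0):
    """Draw random, untuned hidden-node parameters.

    connect_prob is a hypothetical knob: each input-to-hidden connection is
    kept with this probability (1.0 gives a fully connected layer).
    """
    rng = np.random.default_rng(seed)
    # Weights drawn from a continuous distribution (uniform is an assumption).
    W = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))
    # Random "wiring": a dropped connection is represented by a zero weight.
    keep = rng.random((n_inputs, n_hidden)) < connect_prob
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    return W * keep, b

def hidden_output(X, W, b):
    """Output of the fixed random hidden layer (tanh chosen for illustration)."""
    return np.tanh(X @ W + b)

# Toy data (an assumption for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 8))
T = np.sin(X.sum(axis=1, keepdims=True))

W_full, b_full = random_hidden_layer(8, 40, connect_prob=1.0)
W_part, b_part = random_hidden_layer(8, 40, connect_prob=0.3)

# In both cases only the output weights are solved for; hidden nodes stay fixed.
beta_full, *_ = np.linalg.lstsq(hidden_output(X, W_full, b_full), T, rcond=None)
beta_part, *_ = np.linalg.lstsq(hidden_output(X, W_part, b_part), T, rcond=None)

frac_connected = float(np.mean(W_part != 0))
print("fraction of connections kept:", frac_connected)
```

The same output-weight step applies whether the layer is fully or partially connected, which is the sense in which the local wiring can be random while the overall procedure stays structured.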

ELM's Biological Evidence

More and more biological evidence is emerging.

Direct biological evidence:

  1. Neurons in rats’ olfactory systems were later (in 2013) found to be random and untuned [Barak, et al. 2013, Rigotti, et al. 2013, Fusi, et al. 2015]


    M. Rigotti, et al., “The importance of mixed selectivity in complex cognitive tasks,” Nature, vol. 497, 2013
    O. Barak, et al., “The sparseness of mixed selectivity neurons controls the generalization-discrimination trade-off,” Journal of Neuroscience, vol. 33, no. 9, 2013
    S. Fusi, E. K. Miller, and M. Rigotti, “Why neurons mix: high dimensionality for higher cognition,” Current Opinion in Neurobiology, vol. 37, 2015

  2. Neurons in the relevant brain regions of monkeys remain untuned across different experimental trials [J. Xie and C. Padoa-Schioppa 2016, E. L. Rich and J. D. Wallis 2016]


    J. Xie and C. Padoa-Schioppa, “Neuronal remapping and circuit persistence in economic decisions,” Nature Neuroscience, vol. 19, 2016
    E. L. Rich and J. D. Wallis, “What stays the same in orbitofrontal cortex,” Nature Neuroscience, vol. 19, no. 6, 2016

Indirect biological evidence:

  1. The one-shot learning capability of ELM is consistent with tested human behaviour [R. I. Arriaga, et al. 2015]


    R. I. Arriaga, et al., “Visual Categorization with Random Projection,” Neural Computation, vol. 27, 2015

Biological-like neurons:

  1. IBM's smart-material-based neurons: stochastic neurons


    T. Tuma, et al., “Stochastic phase-change neurons,” Nature Nanotechnology, vol. 11, 2016