Neural Network and The Universal Approximation Theorem
By Xah Lee. Date: .
- We want a machine to learn something.
- In math speak, this means, find a function f.
- This f is typically a map from one vector space to another.
- A Neural Network basically generates a function g that approximates f.
- The means by which it generates this g is n nested applications of functions of m parameters.
- The n is roughly the number of layers, m is the number of neurons in a layer.
- The total number of parameters, roughly m × n, is 175 billion for ChatGPT (GPT-3).
- Finding the actual values of the parameters is the training process.
- The Universal Approximation Theorem says that, with big enough n and m, g can approximate f to any desired accuracy.
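The "n nested applications of functions of m parameters" can be sketched in a few lines of code. This is a minimal illustration, not a trained network: the weights are random, the sizes (n = 2 hidden layers, m = 8 neurons) are made up for the example, and real training would adjust the weights to fit f.

```python
import numpy as np

rng = np.random.default_rng(0)

# One "layer": an affine map followed by a nonlinearity.
def layer(W, b, x):
    return np.tanh(W @ x + b)

# g is n nested applications of such layers. Here: 2 hidden layers
# of m = 8 neurons each, mapping R^1 to R^1. Sizes are illustrative.
m = 8
W1, b1 = rng.normal(size=(m, 1)), rng.normal(size=m)
W2, b2 = rng.normal(size=(m, m)), rng.normal(size=m)
W3, b3 = rng.normal(size=(1, m)), rng.normal(size=1)

def g(x):
    h = layer(W1, b1, np.array([x]))
    h = layer(W2, b2, h)
    return (W3 @ h + b3)[0]  # final affine map, no nonlinearity

# Total parameter count: this is the "m × n" figure, the numbers
# that training would find values for.
n_params = sum(a.size for a in (W1, b1, W2, b2, W3, b3))
print(n_params)  # → 97
```

Training means searching for the W and b values that make g(x) close to f(x) on sample data; the Universal Approximation Theorem guarantees such values exist when the network is big enough.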