Monday 4:30 p.m.–5 p.m.
A Beginner's Guide to Deep Learning
- Audience level:
What is deep learning? It has recently exploded in popularity as a complex and incredibly powerful tool. This talk breaks the basic concepts underlying deep learning into understandable pieces for complete beginners to machine learning. We will review the math, code up a simple neural network, and provide context on how deep learning is used in production today.
# Intro: why do we care?

Neural networks and deep learning have powered astounding advances in machine learning. [insert a few examples] As models become more complex, they can become intimidating to parse and explain, especially one like deep learning, which has no simple analogy. With advances in computing power, complex systems can be spun up with little effort, allowing for more powerful machine learning models than ever before.

# Back to basics: Perceptrons

Deep learning rests on earlier advances, including perceptrons and neural networks. In this section, we'll outline the linear combination of coefficients and values that makes up a perceptron. We'll also explain the step function (if a > 0.5, then 1, else 0) versus a more probabilistic function like the sigmoid. This portion will involve coding up a basic perceptron so that the audience gets a hands-on feel for these rather abstract concepts.

# Neural Networks

One perceptron does not a neural network make. The next step up the ladder is neural networks, which are made of layers of perceptrons. A simple neural network has two layers: input and output. The traditional example is a digit recognizer, where the inputs are the grayscale intensities of the pixels and the outputs are the digits 0–9. We can also add more layers to give our model more flexibility. Training the model involves mathematically propagating values based on a pre-labeled dataset. (I do not plan to prove the mathematical validity of the forward- and back-propagation formulas.) Now that we've gotten through the math, it's time to actually see some more Python. We'll throw together a neural network from scratch and run it on some training data. We will also cover how to use scikit-learn, one of the most widely used Python libraries for machine learning, for future models.

# Deep Learning

With the fundamentals covered, we will finally reach the deep learning stage.
We can imagine trying to decide whether a picture shows a human face. Questions like "does the picture have two ears near the sides?" and "does the picture have a mouth near the bottom?" may depend on each other in an interconnected way. Deep neural networks are neural networks with many hidden layers, which extends nicely from this example. Deep belief networks, by contrast, have undirected connections between some of their layers; these undirected layers are called Restricted Boltzmann Machines, and they can be trained with a fast unsupervised learning algorithm (contrastive divergence). It is important that we don't get too deep into the latest deep learning research, since the landscape is changing quickly. Instead, I plan to provide a rough overview through illustrations and high-level descriptions, demonstrate some examples of applications, and point in the direction of how to build your own model. Additionally, the limitations of deep learning should be clearly stated. We do not have a good way of interpreting why a model works the way it does, but the results are so powerful that we often accept that lack of understanding. A classic example is deciding why a cat is labelled a cat. Is it the ears? Is it the nose? The model does not provide a clear way to decipher this.
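The perceptron portion of the talk could be sketched roughly as follows. This is a minimal illustration in pure Python, assuming the step function and sigmoid described above; the weights, bias, and threshold values are made up for demonstration only.

```python
import math

def step(a, threshold=0.5):
    # Hard step activation: fire (1) only if the activation clears the threshold.
    return 1 if a > threshold else 0

def sigmoid(a):
    # Smooth alternative to the step function; output lies in (0, 1)
    # and can be read as a probability.
    return 1 / (1 + math.exp(-a))

def perceptron(inputs, weights, bias, activation=step):
    # A perceptron is a linear combination of inputs and weights,
    # plus a bias, passed through an activation function.
    a = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(a)

# Illustrative values: activation is 1.0*0.6 + 0.0*0.4 + 0.1 = 0.7.
print(perceptron([1.0, 0.0], [0.6, 0.4], 0.1))           # step: 0.7 > 0.5, so 1
print(perceptron([1.0, 0.0], [0.6, 0.4], 0.1, sigmoid))  # a value in (0, 1)
```

Swapping `step` for `sigmoid` is the hook for the later discussion of training, since the smooth function has a usable gradient.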
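The "neural network from scratch" segment might look something like this sketch: a single hidden layer trained with plain gradient descent on the XOR problem, the classic task a lone perceptron cannot solve. The layer sizes, learning rate, and iteration count are illustrative choices, not prescriptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Toy dataset: XOR inputs and labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 4 units between the 2-unit input and 1-unit output.
W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)

lr = 1.0
for _ in range(5000):
    # Forward propagation: push inputs through both layers.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Back propagation: squared-error gradient through the sigmoids.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Gradient-descent updates.
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

print(np.round(out.ravel(), 2))  # predictions for the four XOR inputs
```

With enough iterations the predictions typically approach the labels 0, 1, 1, 0, which makes a nice live demonstration that the hidden layer buys real expressive power.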