Hello and welcome to my new course “Computer Vision & Deep Learning in Python: From Novice to Expert”
Making a computer classify an image using Deep Learning and Neural Networks is comparatively easier than it was before. Using all these ready made packages and libraries will few lines of code will make the process feel like a piece of cake.
Its just like driving a big fancy car with an automatic transmission. You just only have to know how to use the basic controls to drive it. But, if you are a true engineer, you will also be fascinated about the internal working of the engine. In an expert level, you should be able to build your own version of that car from the scratch using the available basic components. Even-though the performance may not match the commercial production line version, the experience knowledge you gain from it cannot be explained in words.
And only because of this we have our course divided into exactly two halves. In the first half we will learn the working concepts of image recognition using computer vision and deep learning and will try to implement the simple versions of popular algorithms and techniques using plain python code. In the next half we will use the popular packages and libraries to implement more complex deep learning image classification models.
Here is a quick list of sessions that are included in this course.
The first three sessions will be theory sessions in which we will have overview about the concepts of deep learning and neural networks. We will also discuss the basics about a digital image and its composition.
Then we will prepare your computer by installing and configuring Anaconda, the free and open-source Python data science platform and the other dependencies to proceed with our exercises.
If you are new to python programming, don’t worry. The next four sessions will be covering the basics of python program with simple examples.
And here comes the aforementioned first half with our own custom code and libraries.
In the coming two theory sessions we will be covering the basics of image classification and the list of datasets that we are planning to cover in this course.
Then we will do a step by step custom implementation of The k-nearest neighbours (KNN) algorithm. It is a simple, easy-to-implement supervised machine learning algorithm that can be used to solve both non-linear classification and regression problems. We will use our own created classes and methods without using any external library. The theory sessions involve learning the KNN basics. Then we will go ahead with downloading the dataset, loading, preprocessing and splitting the data. We will try to train the program and will do an image classification among the three set of animals. Dogs, cats and pandas prediction using our custom KNN implementation.
Now we will proceed with Linear Classification. Starting with the Concept and Theory, we will proceed further with building our own scoring function and also implementing it using plain python code. Later we will discuss about the loss function concepts and also the performance optimization concepts and the terminology associated with it.
Then will start with the most important optimization algorithm for deep learning which is the Gradient Decent. We will have separate elaborate sessions where we will learn the concept and also implementation using the custom code for Gradient Decent. Later we will proceed with the more advanced Stochastic Gradient Decent with its concepts in the first sessions, later with implementing it using the custom class and methods we created.
We will then look at regularization techniques that can also be used for enhancing the performance and also will implement it with our custom code.
In the coming sessions, we will have Perceptron, which is a fundamental unit of the neural network which takes weighted inputs, process it and is capable of performing binary classifications. We will discuss the working of the Perceptron Model. Will implement it using Python and also we will try to do some basic prediction exercises using the preceptron we created.
In deep learning, back-propagation is a widely used algorithm in training feed-forward neural networks for supervised learning. We will then have a discussion about the mechanism of backward propagation of errors. Then to implement this concept, we will create our own classes and later implementation projects for a simple binary calculation dataset and also the MNIST optical character recognition dataset.
And with all the knowledge from the pain of making custom implementations. We can now proceed with the second half of deep learning implementation using the libraries and packages that are used for developing commercial Computer Vision Deep Learning programs
We will be using Keras which is an open-source neural-network library written in Python. It is capable of running on top of TensorFlow, Theano and also other languages for creating deep learning applications.
At first we will build a simple Neural Network implementation with Keras using the MNIST Optical Character Recognition Dataset. We will train and evaluate this neural network to obtain the accuracy and loss it got during the process.
In deep learning and Computer Vision, a convolutional neural network is a class of deep neural networks, most commonly applied to analysing visual imagery. At first we will have a discussion about the steps and layers in a convolutional neural network. Then we will proceed with creating classes and methods for a custom implementation of Convolutional neural network using the Keras Library which features different filters that we can use for images.
Then we will have a quick discussion about the CNN Design Best Practices and then will go ahead with ShallowNet. The basic and simple CNN architecture. We will create the common class for implementing ShallowNet and later will train and evaluate the ShallowNet model using the popular Animals as well as CIFAR 10 image datasets. Then we will see how we can serialize or save the trained model and then later load it and use it. Even-though a very shallow network, we will try to do prediction for an image we give using shallowNet for both the Animals and CIFAR 10 dataset.
After that we will try famous CNN architecture called ‘LeNet’ for handwritten and machine-printed character recognition. For LeNet also, will create the common class and later will train, evaluate and save the LeNet model using the MNIST dataset. Later we will try to do prediction for a hand written digit image.
Then comes the mighty VGGNet architecture. We will create the common class and later will train, evaluate and save the VGGNet model using the CIFAR-10 dataset. After hours of training, later we will try to do prediction for photos of few common real-life objects falling in the CIFAR-10 categories.
While training deep networks, it is helpful to reduce the learning rate as the number of training epochs increases. We will learn a technique called as Learning Rate Scheduling in our next session and implement it in our python code.
Since we are spending hours to train a model, if we don’t checkpoint our training models at the end of a job, there is a great chance that we’ll have lost all of our hard earned results! We will see how we can efficiently do that in the coming sessions.
Enough with training using our little computer. Lets go ahead with popular Deep learning models already pre-trained for us which are included in Keras library. They are trained on Imagenet data which is a collection of image data containing 1000 categories of images.
The first pre-trained model that we are dealing with is the VGGNet-16, we will download the already trained model and then do the prediction. Later will go a bit deeper with VGGNet-19 pre-trained model and will do the image classification prediction.
The next pre-trained model that we are using is the ResNet, which can utilize a technique called skip connections, or shortcuts to jump over some layers. We will do the image classification prediction with this network too.
Finally, we will get the Inception and Xception models. Which are convolutional neural networks trained on more than a million images from the ImageNet database. They learn by using Depthwise Separable Convolutions. We will download the weights and do the image classification prediction with this network too.
Overall, this course will be the perfect recipe of custom and ready-made components that you can use for your career in Computer Vision using Deep Learning.
All the example code and sample images with dataset can be downloaded from the link included in the last session or resource section of this course.
We will also provide you with a course completion certificate once you are done with all the sessions and it will add great value to your career.
So best wishes and happy learning. See you soon in the class room.
Bibliography & Reference Credits:
CS231M ・ Stanford University, CS231N ・ Stanford University
Pyimagesearch blog by Dr. Adrian Rosebrock, PhD.
Andrej Karpathy. CS231n: Convolutional Neural Networks for Visual Recognition.
Andrej Karpathy. LinearClassification
Machine Learning is Fun! Adam Geitgey
Andrew Ng. Machine Learning
Andrej Karpathy. Optimization
Karen Simonyan and Andrew Zisserman. “Very Deep Convolutional Networks for Large-Scale Image Recognition”
Intro Background Video Credits:
Machine Learning: Living in the Age of AI