Posts

Transforming the landscape of ML using Transformers

By now you must have heard about "Transformers": not the movie franchise, but the machine learning model behind the ChatGPT acronym. The GPT in ChatGPT stands for Generative Pre-trained Transformer. This article is about transformers and how they revolutionized not only the field of Natural Language Processing (NLP) but the whole machine learning landscape. One of the goals is to give you an intuition of what the "attention" blocks in transformers actually achieve. Through this, I hope you also get an intuition for how technologies like ChatGPT are stretching the boundaries of what AI can currently achieve. The Role of Auto-Encoders A key idea that has enabled this sudden rise in capability is a class of ML models called auto-encoders. The advantage they bring to the table is that they are a kind of unsupervised ML technique, meaning that they do not require each training sample to be associated with a "label" that then b...
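A minimal numpy sketch of the scaled dot-product attention computed inside such blocks may help build that intuition. The token count, embedding dimension, and the use of the raw input as queries, keys, and values are illustrative simplifications, not the full transformer layer:

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    # scaled dot-product attention: each output row is a weighted
    # mix of the rows of V, weighted by query-key similarity
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores)
    return weights @ V, weights

# toy example: 3 tokens, each a 4-dimensional embedding
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out, w = attention(X, X, X)  # self-attention with identity projections
```

Each row of `w` sums to 1: every token's output is a convex combination of all token values, which is what lets the model relate distant positions in a sequence.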

The Integrated Gradients Technique for Model Interpretability

In a previous article we saw why interpretability is important in machine learning and surveyed existing techniques. In this article we shall look into one specific technique called Integrated Gradients. Let's try to recall what problems such techniques address. Suppose we have a model that has been trained to classify images correctly into one among several classes. Now, given an input image, the model might correctly predict its class or it might fail to. If it correctly predicts the class, we could ask which pixel or group of pixels contributed the most to the model's prediction. This is where techniques such as Integrated Gradients come into play. You can think of it as a technique to create a saliency map from a given input image. Applications of the IG technique Let's get a taste of the technique by looking at some cases where it has been successfully applied. The integrated gradients technique can be applied to various types of neural networks. We sho...
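The core of the technique fits in a few lines: accumulate gradients along a straight path from a baseline to the input, and scale by the input-baseline difference. The toy model `F(x) = x0^2 + 2*x1` and its hand-coded gradient below are made up purely for illustration; a real application would use a neural network and automatic differentiation:

```python
import numpy as np

def model(x):
    # hypothetical toy model: F(x) = x0^2 + 2*x1
    return x[0] ** 2 + 2 * x[1]

def grad(x):
    # analytic gradient of the toy model
    return np.array([2 * x[0], 2.0])

def integrated_gradients(x, baseline, steps=100):
    # midpoint Riemann approximation of the path integral of the gradient
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        total += grad(baseline + a * (x - baseline))
    avg_grad = total / steps
    return (x - baseline) * avg_grad

x = np.array([1.0, 1.0])
baseline = np.zeros(2)
attr = integrated_gradients(x, baseline)
print(attr)  # per-feature attributions
```

A useful sanity check is the completeness property: the attributions sum to `F(x) - F(baseline)`, so each feature's share of the prediction is accounted for.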

Does Machine Learning Struggle with Explainability?

It is quite common to hear the phrase that "AI/ML models are black boxes". In this article, let's try to analyze how true this is and whether the state of affairs can be improved. Why seek explainability in ML models? You might be tempted to ask why it matters that ML/AI models are difficult to explain, as long as they work. Let's start by answering this particular question. Why do we seek explainability in Machine Learning? Satisfying natural human curiosity - These can be answers to questions such as: Why do ML models work where traditional methods fail? Why do classical ML algorithms like random forests or SVMs show superior performance over deep neural networks in certain areas? All such questions stem from our natural curiosity to understand things. Adding to scientific knowledge - If something works but one is not able to explain why it has worked, then one might be adding nothing to the existing scientific knowledge. However, if the model works ...

A guide to using Conda for managing virtual environments

You should always use a virtual environment for coding in Python, period. This instruction might not be necessary for people who are on Windows or Mac, because Python is not installed on those systems by default, and one of the ways to install it is via something like the Anaconda distribution, which comes with virtual environments by default. But for people who use Linux, there is always a system Python, an executable file you can find at `/usr/bin/python`. However, even on Linux it is still better to use something like Anaconda for your Python needs. Difference between Pip and Conda Pip is a recursive acronym for "Pip Installs Packages". It is the official package manager for Python packages. All packages that you install using the pip command can be found in the Python Package Index (PyPI). Conda is also a package manager that can be used to install Python packages. But then, why do you need conda when you have pip? The answer is that there are...
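A typical conda workflow looks roughly like the sketch below. The environment name `myenv` and the Python version are arbitrary choices for illustration:

```shell
# create an isolated environment with its own Python interpreter
conda create --name myenv python=3.11

# activate it; subsequent installs go into this environment only
conda activate myenv

# install packages from conda channels...
conda install numpy

# ...or fall back to pip inside the same environment for PyPI-only packages
pip install requests

# leave the environment when done
conda deactivate
```

Because each environment keeps its own interpreter and package directory, experiments in one project cannot break the dependencies of another.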

Neural Network from Scratch using Python

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # derivative of the sigmoid, expressed in terms of its output
    return x * (1.0 - x)

class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(self.y.shape)

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))

    def backprop(self):
        # application of the chain rule to find the derivative of the
        # loss function with respect to weights2 and weights1
        d_weights2 = np.dot(self.layer1.T,
                            (2 * (self.y - self.output) * sigmoid_derivative(self.output)))
        d_weights1 = np.dot(self.input.T,
                            (np.dot(2 * (self.y - self.output) * sigmoid_derivative(self.output),
                                    self.weights2.T) * sigmoid_derivative(self.layer1)))

        # update the weights with the derivative (slope) of the loss function
        self.weights1 += d_weights1
        self.weights2 += d_weights2
```

Multi-layer Perceptrons or MLPs in Python

In this article, I provide code in the form of a notebook that can be used to understand how Multi-Layer Perceptrons can be implemented in Python. The code also visualizes the decision boundary learnt by the model from the data. You can also play with the actual code on colab. You can definitely implement a neural network from scratch in Python, but in this article we make use of the MLP classifier implemented in the scikit-learn Python package. You can get an idea of the default parameters used, like the number of layers, activation function, etc., by going to the documentation on their website. In the code that follows, we use a single hidden layer with 100 neurons. The activation function used is `relu`, which is the default. The model optimizes the log-loss function using stochastic gradient descent. For the purpose of this tutorial we also make use of a synthetic dataset provided by the scikit-learn team and generated by the function `make_moons`. ...
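The setup described above can be sketched as follows. The sample size, noise level, and random seeds are arbitrary choices for illustration and may differ from the actual notebook:

```python
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# synthetic two-class "moons" dataset from scikit-learn
X, y = make_moons(n_samples=200, noise=0.2, random_state=42)

# single hidden layer of 100 neurons with relu activation (the defaults);
# the model minimizes log-loss with a stochastic gradient-based solver
clf = MLPClassifier(hidden_layer_sizes=(100,), activation="relu",
                    max_iter=2000, random_state=1)
clf.fit(X, y)

print(clf.score(X, y))  # training accuracy
```

From here, evaluating `clf.predict` over a grid of points is the usual way to render the decision boundary the article visualizes.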