Transforming the landscape of ML using Transformers
By now you must have heard about "Transformers" — not the movie franchise, but the machine learning model that forms part of the ChatGPT acronym. The GPT in ChatGPT stands for Generative Pre-trained Transformer. This article is about transformers and how they revolutionized not only the field of Natural Language Processing (NLP) but the entire machine learning landscape. One goal is to give you an intuition for what "attention" blocks in Transformers actually achieve. Along the way, I hope you also get a sense of how technologies like ChatGPT are stretching the boundaries of what AI can currently achieve.

The Role of Auto-Encoders

A key idea that has enabled this sudden rise in capability is a class of ML models called auto-encoders. The advantage they bring to the table is that they are a kind of unsupervised ML technique, meaning they do not require each training sample to be associated with a "label" that then b...
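To make the "no labels required" point concrete, here is a minimal sketch of an auto-encoder, not taken from this article: a linear model with a narrow bottleneck, trained by plain gradient descent to reconstruct its own unlabeled inputs. The dimensions, learning rate, and synthetic data are illustrative assumptions.

```python
import numpy as np

# Minimal linear auto-encoder: compress 8-dim inputs down to a
# 2-dim "code" and reconstruct them. The training signal is the
# input itself -- no labels anywhere.
rng = np.random.default_rng(0)

n, d, k = 256, 8, 2                     # samples, input dim, code dim

# Synthetic unlabeled data that lies on a 2-dim subspace of R^8,
# so a 2-dim bottleneck can in principle reconstruct it well.
basis = rng.normal(size=(k, d))
X = rng.normal(size=(n, k)) @ basis

W_enc = rng.normal(size=(d, k)) * 0.1   # encoder weights
W_dec = rng.normal(size=(k, d)) * 0.1   # decoder weights

lr = 0.01
for step in range(2000):
    Z = X @ W_enc                       # encode: (n, k) codes
    X_hat = Z @ W_dec                   # decode: (n, d) reconstructions
    err = X_hat - X                     # reconstruction error
    # Gradients of the mean squared reconstruction error
    grad_dec = Z.T @ err / n
    grad_enc = X.T @ (err @ W_dec.T) / n
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

mse = float(np.mean((X @ W_enc @ W_dec - X) ** 2))
print(f"final reconstruction MSE: {mse:.4f}")
```

The reconstruction error drops toward zero even though no sample was ever labeled; the low-dimensional code the encoder learns is the kind of learned representation the rest of the article builds on.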