AI Revolution – Transformers and Large Language Models (LLMs)

This post is by Elad Gil from Elad Blog

Part of the challenge of “AI” is that we keep raising the bar on what it means for something to be a machine intelligence. Early machine learning models have been quite successful in terms of real-world impact. Large-scale applications of machine learning today include Google Search and ads targeting, Siri/Alexa, smart routing in mapping applications, self-piloting drones, defense tech like Anduril, and many other areas. Some areas, like self-driving cars, have shown progress but seem to remain “a few years” away every few years. Just as all the ideas for smartphones existed in the 1990s but didn’t take off until the iPhone launched in 2007, self-driving cars are an inevitable part of the future.

In parallel, the machine learning (ML) / artificial intelligence (AI) world has been rocked over the last decade by a series of advances in voice recognition (hence Alexa) and image recognition (iPhone unlock and the, erm, non-creepy passport controls at airports). Successive inventions and discoveries include CNNs, RNNs, various forms of deep learning, GANs, and other innovations. One of the bigger breakthroughs of recent times was the emergence of Transformer models in 2017 for natural language processing (NLP). Transformers were invented at Google but were quickly adopted and implemented at OpenAI to create GPT-1 and, more recently, GPT-3. This has been followed by other companies and open source groups building transformer models, such as Cohere…