Eckher
Your guide to what's next.
Jan 20, 2021

A technical introduction to OpenAI's GPT-3 language model

An overview of the groundbreaking GPT-3 language model created by OpenAI.

Introduced in May 2020, Generative Pre-trained Transformer 3 (GPT-3) is OpenAI's groundbreaking third-generation predictive language model. Widely considered one of the most powerful NLP technologies to date, it has sparked many discussions in the tech and business communities about its potential use cases and its impact on existing business processes and applications.

While some aspects of the GPT-3 model described in the paper "Language Models are Few-Shot Learners" may seem too technical for non-AI researchers, it is worth zooming in on some of the key features of the model in order to better understand what it does and how it can be used in practice.

What is GPT-3?

Strictly speaking, GPT-3 is a family of autoregressive language models that includes GPT-3 Small, GPT-3 Medium, GPT-3 Large, GPT-3 XL, GPT-3 2.7B, GPT-3 6.7B, GPT-3 13B, and GPT-3 175B. Introduced in the paper "Language Models are Few-Shot Learners", these models share a transformer-based architecture similar to that of their predecessor, GPT-2. All GPT-3 models are trained on a mixture of datasets consisting of the Common Crawl, WebText2, Books1, Books2, and English-language Wikipedia datasets.
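The eight family members differ mainly in scale. A minimal sketch of the family, with approximate trainable-parameter counts as reported in the paper:

```python
# Approximate trainable-parameter counts for the GPT-3 model family,
# as reported in "Language Models are Few-Shot Learners" (rounded).
GPT3_FAMILY = {
    "GPT-3 Small":  125_000_000,
    "GPT-3 Medium": 350_000_000,
    "GPT-3 Large":  760_000_000,
    "GPT-3 XL":     1_300_000_000,
    "GPT-3 2.7B":   2_700_000_000,
    "GPT-3 6.7B":   6_700_000_000,
    "GPT-3 13B":    13_000_000_000,
    "GPT-3 175B":   175_000_000_000,
}

# The family spans roughly three orders of magnitude in size.
largest = max(GPT3_FAMILY, key=GPT3_FAMILY.get)
```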

GPT-3 175B is the largest of all GPT-3 models and is commonly referred to as "the GPT-3". With 175 billion trainable parameters, it is about two orders of magnitude larger than the 1.5 billion parameter GPT-2.
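The "two orders of magnitude" claim follows directly from the parameter counts; a quick back-of-the-envelope check:

```python
import math

# GPT-3 175B vs. GPT-2 (1.5B parameters).
gpt3_params = 175e9
gpt2_params = 1.5e9

ratio = gpt3_params / gpt2_params   # roughly 117x larger
orders = math.log10(ratio)          # roughly 2.1 orders of magnitude
```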

According to OpenAI's paper, GPT-3 175B outperforms other large-scale models on a number of NLP tasks. As a meta-learning model, it develops a broad set of skills and pattern recognition abilities during unsupervised pre-training, and can then both recognize and rapidly adapt to the desired task at inference time.
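In practice, "adapting at inference time" means conditioning the model on a natural-language task description plus a handful of demonstrations placed directly in the prompt, with no gradient updates. A minimal sketch of how such a few-shot prompt might be assembled (the format and the translation examples follow the style used in the paper; the helper function itself is illustrative):

```python
def build_few_shot_prompt(task_description, examples, query):
    """Assemble a few-shot prompt: task description, then
    demonstration pairs, then the query to complete."""
    lines = [task_description, ""]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    # The model is expected to continue the text after "=>".
    lines.append(f"{query} =>")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("sea otter", "loutre de mer"), ("cheese", "fromage")],
    "peppermint",
)
```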

OpenAI API, or GPT-3-as-a-Service

In June 2020, OpenAI launched its API product, which can be used to access the AI models developed by the company, including those based on GPT-3. Available in a private beta, the OpenAI API offers a general-purpose "text in, text out" interface and enables users to experiment with GPT-3-based models, explore their strengths and weaknesses, and integrate them into their own products.
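"Text in, text out" means a request carries a prompt plus a few sampling parameters, and the response carries generated text. A sketch of what a completion request body might look like; the parameter names mirror the early beta API but should be treated as illustrative rather than a definitive reference:

```python
import json

# Illustrative request body for the OpenAI API's text-in/text-out
# completions interface (parameter names as in the early beta API).
request_body = {
    "prompt": "Translate English to French.\n\ncheese =>",
    "max_tokens": 16,     # upper bound on generated tokens
    "temperature": 0.0,   # low temperature for more deterministic output
    "stop": ["\n"],       # stop generating at the end of the line
}

payload = json.dumps(request_body)  # sent as the JSON body of the request
```

The response would then contain the model's completion as plain text, which is what makes the interface general purpose: translation, summarization, and question answering all reduce to choosing the right prompt.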

Bottom line

The GPT-3 autoregressive language model made its debut in May 2020 and marked an important milestone in NLP research. Trained on a large internet-based text corpus, it boasts 175 billion parameters and is about two orders of magnitude larger than its predecessor, GPT-2.

A number of models based on GPT-3 are available via the OpenAI API, OpenAI's commercial product released in private beta in June 2020.

Copyright © 2021 Eckher. Various trademarks held by their respective owners.