Interesting innovations from OpenAI in 2021


OpenAI is not even a decade old, yet it has made a name for itself worldwide as a leading AI research laboratory. In 2020 it gave the world GPT-3 – a breakthrough that uses deep learning to produce human-like text. GPT-3 became a stepping stone that inspired other technology giants to bring out their own innovations in the large language model space.

This year, OpenAI again pushed its limits, continuing to release algorithms and models with far-reaching impact. As the year draws to a close, let’s take a look at some of them.


Codex

OpenAI released Codex via an API in private beta. Codex translates natural language into code and is the backbone of GitHub Copilot. It can interpret simple commands in natural language and execute them on the user’s behalf, making it possible to build natural language interfaces to existing applications. OpenAI said, “OpenAI Codex has much of the natural language understanding of GPT-3, but it produces code that works.” You can issue commands in English to any software with an API. OpenAI describes Codex as a general-purpose programming model that can be applied to essentially any programming task.
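To make the API-driven workflow concrete, here is a minimal sketch of how a request to a Codex-style completion endpoint might be assembled. The engine name and field layout reflect OpenAI's 2021-era completions API as commonly documented at the time; treat them as illustrative assumptions rather than a current API reference, and note that no network call is made here.

```python
import json

def build_codex_request(instruction, engine="code-davinci-002", max_tokens=256):
    """Assemble a completion request payload for a Codex-style API.

    The engine name ("code-davinci-002") and the field names below are
    assumptions based on the 2021-era completions endpoint.
    """
    return {
        "model": engine,
        # A common Codex pattern: phrase the natural-language command
        # as a code comment and let the model continue with code.
        "prompt": f"# {instruction}\n",
        "max_tokens": max_tokens,
        "temperature": 0,   # deterministic sampling is usually preferred for code
        "stop": ["\n\n"],   # stop generating at the first blank line
    }

payload = build_codex_request("Write a function that reverses a string")
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the completions endpoint with an API key; the response's completion text is the generated code.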



DALL·E

Earlier this year, OpenAI released DALL·E, a 12-billion-parameter version of GPT-3 trained to generate images from text descriptions using a dataset of text–image pairs. DALL·E is a Transformer language model that receives both the text and the image as a single stream of up to 1280 tokens, and it is trained with maximum likelihood to generate all of the tokens one after another. DALL·E can render an image from scratch and can also modify aspects of an existing image using text prompts. OpenAI said that DALL·E can create plausible images for a wide range of sentences that explore the compositional structure of language.
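The "single data stream" idea can be sketched in a few lines: caption tokens and image tokens are simply concatenated into one sequence, capped at 1280 tokens (the paper uses 256 text tokens followed by 1024 image tokens). The token values below are made up for illustration; only the concatenation-and-cap step is shown.

```python
def build_dalle_stream(text_tokens, image_tokens, max_len=1280):
    """Concatenate text and image tokens into one sequence, DALL·E-style.

    A Transformer trained autoregressively on such streams learns to
    continue a text prefix with plausible image tokens.
    """
    stream = list(text_tokens) + list(image_tokens)
    if len(stream) > max_len:
        raise ValueError(f"stream of {len(stream)} tokens exceeds {max_len}")
    return stream

# Toy example: 3 "text" tokens followed by 5 "image" tokens.
stream = build_dalle_stream([101, 102, 103], [7, 8, 9, 10, 11])
print(len(stream))  # 8
```

At generation time the model is given only the text prefix and samples the image tokens one at a time, which is why the same architecture can both render images from scratch and edit them from prompts.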



GLIDE

GLIDE (Guided Language to Image Diffusion for Generation and Editing) is a 3.5-billion-parameter text-to-image generation model that outperforms DALL·E. In the accompanying paper, OpenAI’s researchers report that samples generated with classifier-free guidance are both photorealistic and reflect broad world knowledge. In human evaluations, GLIDE’s samples were preferred over DALL·E’s 87% of the time for photorealism and 69% of the time for caption similarity.
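Classifier-free guidance itself is a one-line combination rule: the diffusion model predicts the noise twice, once with the caption and once without, and the two predictions are extrapolated. A minimal sketch, using plain lists in place of real noise tensors:

```python
def classifier_free_guidance(eps_uncond, eps_cond, scale):
    """Combine unconditional and conditional noise predictions.

    Implements eps = eps_uncond + scale * (eps_cond - eps_uncond).
    A scale > 1 pushes each denoising step further toward the text
    condition, trading diversity for fidelity to the caption.
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

# Toy 2-element "noise predictions": guidance amplifies the difference
# between the conditional and unconditional estimates.
guided = classifier_free_guidance([0.0, 1.0], [1.0, 1.0], scale=2.0)
print(guided)  # [2.0, 1.0]
```

In a real sampler this combined prediction replaces the model's raw output at every denoising step; everything else in the diffusion loop stays unchanged, which is why the technique needs no separate classifier.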


Triton 1.0

OpenAI has released Triton 1.0, an open-source, Python-like programming language that lets researchers with no CUDA (Compute Unified Device Architecture) experience write highly efficient GPU code. OpenAI claimed, “Triton makes it possible to reach peak hardware performance with relatively little effort,” and reported that Triton has been used to produce kernels up to 2x more efficient than comparable Torch implementations. Modern GPU architectures have three critical components – DRAM, SRAM, and ALUs – and code must carefully coordinate work across all three to run fast. OpenAI said Triton aims to fully automate these optimizations, letting developers focus on the high-level logic of their parallel code.
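Triton's core idea is a blocked programming model: you write the logic for one block of data, and a grid of program instances covers the whole input in parallel, with out-of-range elements masked off. Since a real Triton kernel needs a GPU, here is a plain-Python sketch of that model (a sequential stand-in, not actual Triton code) for an element-wise vector add:

```python
def add_kernel(x, y, out, n_elements, BLOCK_SIZE, pid):
    """One 'program instance': handles a single BLOCK_SIZE chunk.

    In Triton the analogous kernel computes offsets from the program id,
    masks out-of-bounds lanes, and loads/stores whole blocks at once.
    """
    for i in range(BLOCK_SIZE):
        off = pid * BLOCK_SIZE + i
        if off < n_elements:  # mask: skip lanes past the end of the vector
            out[off] = x[off] + y[off]

def launch(x, y, BLOCK_SIZE=4):
    """Launch one program instance per block; on a GPU these run in parallel."""
    n = len(x)
    out = [0.0] * n
    grid = (n + BLOCK_SIZE - 1) // BLOCK_SIZE  # ceil-divide to cover all elements
    for pid in range(grid):
        add_kernel(x, y, out, n, BLOCK_SIZE, pid)
    return out

print(launch([1.0, 2.0, 3.0, 4.0, 5.0], [10.0, 20.0, 30.0, 40.0, 50.0]))
# [11.0, 22.0, 33.0, 44.0, 55.0]
```

What Triton automates is everything this sketch glosses over: coalescing the per-block loads and stores through DRAM, staging data in SRAM, and scheduling the ALU work, so the kernel author only writes the block-level logic.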



CLIP

Alongside the DALL·E release, OpenAI also introduced CLIP (Contrastive Language-Image Pre-Training), which builds on extensive prior work on zero-shot transfer, natural language supervision, and multimodal learning. OpenAI demonstrated that scaling a simple pre-training task is enough to achieve competitive zero-shot performance on a wide variety of image classification datasets. The method uses an abundantly available source of supervision: text paired with images found on the internet.
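The zero-shot classification step this enables is simple: encode the image and a set of candidate captions into a shared embedding space, then pick the caption most similar to the image. A minimal sketch, with toy vectors standing in for the outputs of CLIP's real image and text encoders:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_classify(image_emb, caption_embs):
    """Return the index of the caption embedding closest to the image.

    In CLIP, contrastive pre-training aligns matching image/text pairs
    in one space, so no task-specific training is needed for this step.
    """
    sims = [cosine(image_emb, c) for c in caption_embs]
    return max(range(len(sims)), key=sims.__getitem__)

image = [1.0, 0.0]                       # toy image embedding
captions = [[0.0, 1.0], [0.9, 0.1]]      # toy embeddings for two captions
print(zero_shot_classify(image, captions))  # 1
```

Because only the caption list changes between tasks, the same frozen model can classify against any label set phrased as text (e.g. "a photo of a dog"), which is what makes the zero-shot transfer possible.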

