SciTech

DALL-E, the OpenAI project that can generate images from words

Named after the animated robot WALL-E and the painter Salvador Dalí, DALL-E is the newest AI technology to both inspire and concern researchers. DALL-E was built by OpenAI, an artificial intelligence research lab co-founded by Elon Musk, among others. It can generate images from a written request: give it a phrase such as "a painting of a fox sitting in a field at sunrise in the style of Claude Monet," and it will return exactly that, a never-before-seen, completely computer-generated painting of a fox at sunrise in the style of Claude Monet. It can also edit photos, adding objects and adjusting light and saturation so the addition looks seamless, a capability OpenAI calls "inpainting." And when given an image, whether a photo or a piece of art, it can generate countless variations of the original in different styles, from different angles, or with slightly different objects.

DALL-E generates these pictures through a process called diffusion. During training, the model takes an image and gradually adds random noise until it looks like a photo buried under heavier and heavier static; it then learns to reverse that process, removing the noise step by step until a realistic image emerges. To create something new, it starts from pure noise and denoises it into a picture. DALL-E uses a diffusion model called GLIDE, which steers each denoising step toward whatever the text prompt describes.
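To make the idea concrete, here is a heavily simplified Python sketch of that noising-and-denoising loop. It is not OpenAI's code: the tiny array standing in for a photo, the noise schedule, and the fake_denoiser function are all toy assumptions, with the denoiser playing the role that a trained network like GLIDE plays in the real system.

    import numpy as np

    rng = np.random.default_rng(0)

    # Stand-in for a tiny 8x8 grayscale "training photo" (toy assumption).
    image = rng.uniform(0.0, 1.0, size=(8, 8))

    num_steps = 50
    betas = np.linspace(1e-4, 0.2, num_steps)   # how much noise each step adds
    signal_kept = np.cumprod(1.0 - betas)       # how much of the photo survives

    def add_noise(photo, step):
        """Forward process: blend the clean photo with random static."""
        noise = rng.standard_normal(photo.shape)
        return np.sqrt(signal_kept[step]) * photo + np.sqrt(1.0 - signal_kept[step]) * noise

    def fake_denoiser(noisy, step):
        """Toy stand-in for the trained network: it 'cheats' by nudging the
        pixels back toward the original photo. A real model like GLIDE would
        instead predict the noise from the pixels and the text prompt."""
        return noisy + 0.1 * (image - noisy)

    # What a training example looks like after heavy noising.
    heavily_noised = add_noise(image, num_steps - 1)

    # Generation: start from pure static and remove a little noise at a time.
    x = rng.standard_normal(image.shape)
    for step in reversed(range(num_steps)):
        x = fake_denoiser(x, step)

    print("difference from the original photo:", round(float(np.abs(x - image).mean()), 3))

Running the sketch, the printed difference shrinks toward zero, which is the essence of denoising; the difference in the real system is that the trained denoiser has learned to produce an image matching the text prompt rather than nudging toward a photo it already knows.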

The original DALL-E was released in January 2021; OpenAI's newest version, DALL-E 2, released in early April, is far more accurate, to an alarming extent. It is amplifying existing concerns about AI, such as its potential to be used, like deep-fake technology, to create disinformation, or to reinforce gender and racial biases. For example, when asked to generate images of lawyers, it often returns exclusively male results, while the flight attendants it generates are often female. OpenAI has published a Risks and Limitations document for DALL-E 2, which explains how the system could be used to create harmful content and how limitations and biases in the training data can lead the model to reproduce those biases.

However, OpenAI still sees its creation as valuable: another tool that lets people express their creativity in ways they could not before, conjuring images of potential futures or of pasts that never existed. The creators also hope the project can help us better understand how AI perceives the world and more clearly identify the biases we already know exist in AI-enabled systems.