Midjourney
Midjourney, generative artificial intelligence (AI) tool that creates images from user-generated text prompts. It can produce images in a photorealistic, painterly, cinematic, or surrealistic style.
History
Midjourney, Inc., was founded in San Francisco in August 2021 by entrepreneur David Holz, who had previously founded the sensor company Leap Motion (which was bought and rebranded by the British company Ultrahaptics in 2019). Holz, with a team of 10 engineers, set up an independent lab to develop the Midjourney app. In February 2022 the lab launched its Discord server to the public. Discord, a social media platform, allows users to chat in online communities. The Midjourney Discord server, which had about 20 million registered users by the end of 2024, was part of Holz’s vision for a highly collaborative AI tool. Although users initially had to have a Discord account to use the app, Midjourney, Inc., launched an independent website for the app in 2024.
Deep learning process
The Midjourney tool runs on closed-source software created with custom algorithms. The technology generates images by employing a large language model (LLM) and a diffusion model, which were trained on massive libraries of images paired with text descriptions. Midjourney’s model was trained with data from the Internet and from an open-source dataset provided by the German nonprofit LAION.
Although the Midjourney model is a sophisticated image generator, Midjourney, Inc., originally decided that the model would not stray from the lab’s vision of generating artistic images. Founder David Holz, in a 2022 interview with the tech news website The Verge, spoke of the Midjourney model’s limitations: “We have a default style and look, and it’s artistic and beautiful, and it’s hard to push [the model] away from that, meaning you can’t really force it to make a deepfake right now.” However, the software went through an algorithm update in mid-2023 that made it easier for users to create more-convincing deepfakes.
Midjourney uses an LLM to interpret text prompts by breaking them down into their key concepts. It then converts the concepts into a latent vector, which is a numeric code that captures image details, such as color palette, shape, and style. Midjourney then uses a diffusion model for the final stages of the process. Such models are named for their resemblance to the concept of diffusion in physics, a process in which random molecular movement causes a net flow of matter from a region of high concentration to a region of low concentration. Diffusion models, however, are trained to apply diffusion in reverse. During training, “noise,” or random values (which appear as static in an image), is added to the original data until it is unrecognizable; the model must then learn to “reverse” the noise in order to re-obtain the data in the form of a high-quality image.
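The noising and denoising idea described above can be illustrated with a minimal Python sketch. It is not Midjourney's code, which is closed-source. It assumes a common textbook formulation of diffusion with a linear noise schedule, and the function names (alpha_bar, add_noise, remove_noise) and schedule values are hypothetical choices made for illustration. In a real diffusion model a trained neural network would estimate the noise; here the true noise is passed back in as a stand-in for that estimate.

```python
import math
import random


def alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal remaining after t noising steps.

    Uses a linear noise schedule (an assumption; other schedules exist).
    """
    product = 1.0
    for step in range(1, t + 1):
        beta = beta_start + (beta_end - beta_start) * (step - 1) / (num_steps - 1)
        product *= 1.0 - beta
    return product


def add_noise(pixels, t, num_steps, rng):
    """Forward process: blend clean pixel values with Gaussian noise."""
    a_bar = alpha_bar(t, num_steps)
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [
        math.sqrt(a_bar) * p + math.sqrt(1.0 - a_bar) * n
        for p, n in zip(pixels, noise)
    ]
    return noisy, noise


def remove_noise(noisy, predicted_noise, t, num_steps):
    """Reverse process: subtract the estimated noise and rescale.

    A trained network would supply predicted_noise; this toy example
    is handed the true noise instead.
    """
    a_bar = alpha_bar(t, num_steps)
    return [
        (x - math.sqrt(1.0 - a_bar) * n) / math.sqrt(a_bar)
        for x, n in zip(noisy, predicted_noise)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    image = [0.1, 0.5, 0.9]  # three "pixel" values standing in for an image
    noisy, noise = add_noise(image, t=1000, num_steps=1000, rng=rng)
    restored = remove_noise(noisy, noise, t=1000, num_steps=1000)
    print("noisy:   ", noisy)
    print("restored:", restored)
```

After the final step almost none of the original signal remains in the noisy values, yet the image is recovered once the noise is known. The quality of a real model therefore depends on how well its network estimates that noise from the noisy image and the text prompt.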
Versions and improvements
Since February 2022 Midjourney, Inc., has released several versions of its AI tool. Early versions generated abstract and painterly images with low cohesion (indicating a lack of consistency across images). Later versions introduced upscaling and variation buttons to give users more control over the images generated. Other updates included further image customization, enhanced by the model’s knowledge of life-forms, places, and objects. In later releases, image quality and realism also improved. The increased realism led to controversy, especially after an image of Pope Francis wearing a puffer coat went viral on the Internet. Although Midjourney, Inc., banned the word pope following the incident, users may still create deepfakes of other public figures.
A significant advancement came in December 2023 with the release of version 6. This was the first version to allow for the integration of text directly within images. Released in April 2025, version 7 introduced a draft mode that produces prototype images at 10 times the speed and half the cost of standard-mode images.