At a glance, the above three images look just like me. Look closer and you might notice that my skin is too smooth, my clothes distorted in places — details that might be dismissed as an overdone Photoshop edit.
Thing is, I never posed for these pictures. Nor have I ever sported shoulder-length hair or a cowboy hat, for that matter. These images are entirely the product of artificial intelligence, generated with a cutting-edge technology called DreamBooth, developed by Google scientists.
Since its release in late August, DreamBooth has already advanced the field of AI art by leaps and bounds. In a nutshell, it gives AI the ability to study what an individual or object looks like, then synthesize a “photorealistic” image of the subject in a completely new context.
The development has some AI researchers and ethicists concerned about its use in so-called “deepfakes” — media manufactured via AI deep-learning to depict fake events. People are already mapping the faces of unconsenting women onto the bodies of porn stars, making political figures deliver statements that never occurred in reality and creating impossible ads featuring dead celebrities.
Currently, most deepfakes can be distinguished on close inspection. However, as multiple experts told me, DreamBooth may soon facilitate the creation of deepfakes indistinguishable from real photos or videos, at scales never before seen.
What is DreamBooth and how can I use it?
DreamBooth is an AI technique that fine-tunes existing text-to-image generators, such as Stable Diffusion, to produce “personalized” images. It’s also open source, meaning the code is freely available to anyone to modify and redistribute. According to the original research paper, the model only needs to train on three to five photos of a subject before it can reproduce that subject’s likeness in a variety of styles and contexts.
Using DreamBooth by itself can be both daunting for coding novices and VRAM-intensive — it requires a great deal of graphics-card memory. In response, a number of apps harnessing DreamBooth technology have emerged, offering an accessible entry into image creation.
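For a sense of what “using DreamBooth by itself” involves: most do-it-yourself runs go through community implementations, such as the open-source diffusers library from Hugging Face, which ships a DreamBooth training script. The command below is a rough sketch, not a tested recipe; the script name, flags, model identifier and folder paths are drawn from the diffusers examples rather than from this article, and may change between versions:

```shell
# Sketch of a DreamBooth fine-tuning run using the Hugging Face diffusers
# example script (script and flag names are from that repo and may change).
# "./my_photos" is a hypothetical folder holding the handful of subject photos;
# "sks" is a rare placeholder token the model learns to associate with the subject.
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --instance_data_dir="./my_photos" \
  --instance_prompt="a photo of sks person" \
  --output_dir="./dreambooth-model" \
  --resolution=512 \
  --train_batch_size=1 \
  --learning_rate=5e-6 \
  --max_train_steps=800
```

After training, the saved weights can be loaded like any Stable Diffusion checkpoint and prompted with the placeholder token (“a photo of sks person wearing a cowboy hat,” for example).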
To produce the fake photos of myself, I enlisted the help of Alfred Wahlforss, a 26-year-old graduate student studying data science at Harvard University. Wahlforss is one of the developers behind BeFake, a new app utilizing DreamBooth to generate AI art of its users in a variety of styles and settings.
BeFake uses DreamBooth in conjunction with another popular AI image generator called Stable Diffusion, Wahlforss said: “To get it perfectly right, you have to add algorithms on top or fine tune it in a certain way.”
To create my images, I sent Wahlforss 20 close-up photos of myself in varying backgrounds and clothing. The higher quality, the better, he said: “Garbage in, garbage out.”
After the AI was trained on these pictures, Wahlforss asked it to create an image of me using different prompts. He could have it reproduce me in the style of famous painters or photographers, dress me up in different outfits and change my hair or even my gender.
The same goes for his app — users are asked to submit 14 to 20 pictures of themselves, the more the better. People can then select different prompts to recreate themselves. The first training set is free; users will have to pay for more.
“The user-submitted images are used to train the model and then deleted after we have trained it,” Wahlforss said. “We don’t do anything other than train the model with the user images.”
At the moment, BeFake is broken — a viral Reddit post last week sent thousands of people to the app, crashing the server of its cloud provider, Wahlforss said. He believes the app will be back in service later this week.
“There are clear ways to misuse this technology,” Wahlforss said. “On our end, we have a lot of checks and balances, and that’s one of the algorithms that I was talking about … obviously we’re very cautious around generating anything that is not safe for work, either violence or nudity.”
“But I don’t want to go into too much detail because then, obviously, you’re opening yourself up to attacks.”
The team behind DreamBooth did not respond to the Star’s requests for comment before publication.
According to Tom Mason, the chief technology officer of Stability AI, which created AI image generator Stable Diffusion, DreamBooth is “just the beginning of a new wave of generative models in image, language, video, audio and 3D.”
The technology is not limited to 2D images, he said — image generation models are already being used in animation and “techniques for fine-tuning such as DreamBooth will absolutely be possible with other modalities, further extending the options.”
In response to the many people making use of the open source technology, Mason said: “We encourage ethical use of the models and a sensible open discussion with transparency on the techniques and applications that the community is developing.”
The future of deepfakes?
Abhishek Gupta, founder, principal researcher and director of the Montreal AI Ethics Institute, said DreamBooth represents a “step forward in a bad direction.”
“It’s maybe a leap forward in terms of the potential negative impacts that are going to arise, because it’s now allowed outputs that are a lot more convincing. They’re a lot easier to produce. They can be produced at scale,” he said.
For example, DreamBooth could be used to copy signatures or official signage onto forged documents, create misleading photos or videos of politicians, manufacture revenge porn targeting individuals and more, Gupta continued.
At the moment, DreamBooth can be relatively resource intensive to run, he said, limiting the scale at which deepfakes can proliferate. At the rapid rate AI is advancing, however, Gupta believes the technology will soon become faster and easier to operate and access.
“Because it’s an automated process, you can have multiple targets and attack them in not just a targeted manner, but in a scaled up manner where you can generate many, many outputs that are convincing,” Gupta said.
The technology’s ease of use for malicious actors also creates an “asymmetry of power,” Gupta said. At the moment, it’s easy to target people with DreamBooth and victims have little ability to do anything about it.
“There aren’t, at the moment, any strong legal protections because this is such an evolving space,” Gupta said. “We’re seeing efforts in this space in terms of being able to combat the negative impacts, but those are nascent. Everybody’s still trying to figure out what does it even mean to have generative outputs and how to combat them.”
A specific issue with DreamBooth and Stable Diffusion is that they’re open source, Gupta continued. Unlike centralized AI-generation models, which can impose rules and barriers on image creation, decentralized models like DreamBooth mean anyone can access and build on the technology.
“When you release a model open source first, a lot of those protections are bypassed,” Gupta said. Even if it had protections originally, “they can be bypassed by people who are savvy — and then you can have a re-release downstream that people can do whatever they want with.”
On the flip side, the decentralized nature of DreamBooth has sparked an “explosion” in innovation and progress in AI, according to Lewis Hackett, a UK-based artist who incorporates AI image generation into his work. Hackett helps run the main Discord server for AI researchers and hobbyists exploring and expanding on DreamBooth.
“In the (Discord server), we have constantly, all day, people developing new things, testing new things with DreamBooth and Stable Diffusion,” Hackett said. His server currently has over 8,000 members. “That kind of opened the doors to this explosion in access to image tools.”
What’s the next step for AI art?
“Rather than it being a big centralized group that’s controlling the research of these different tools, it’s now the entire community and they all work together,” Hackett said. “Back then, this (level of) democratization was pretty unheard of in the software community … I think it’s why the field is growing at a phenomenal rate.”
Originally, DreamBooth took about an hour and a half and 40 gigabytes of VRAM to train a model, Hackett said. Once the technique was released to the public, people in his Discord began developing their own versions, building on the paper.
“It’s at the point now where it takes about 12 gigabytes, so any consumer graphics card that has 12 gigabytes would be able to do our training,” he said, adding that they’re rapidly improving on the image generation time too. “ … It really shows what’s possible with open source and how beneficial it is for people to be able to distribute and work on these things.”
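The savings Hackett describes come largely from standard training tricks: half-precision arithmetic, gradient checkpointing and an 8-bit optimizer. As an illustration (not the Discord community’s exact code), the Hugging Face diffusers DreamBooth example exposes these as command-line switches; the names below are from that repo and may differ in other implementations:

```shell
# Memory-saving switches in the diffusers DreamBooth example script.
# --use_8bit_adam relies on the bitsandbytes library; paths and the model
# identifier here are illustrative placeholders.
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --instance_data_dir="./my_photos" \
  --instance_prompt="a photo of sks person" \
  --output_dir="./dreambooth-model" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_checkpointing \
  --use_8bit_adam \
  --mixed_precision="fp16" \
  --max_train_steps=800
```

Each switch trades a little speed or numerical precision for memory, which is roughly how the footprint fell from data-centre hardware to a consumer graphics card.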
Hackett admits the technology’s open source nature leaves it vulnerable to malicious actors: “Obviously, anyone can get the software and anyone can alter the software in any way they want to make it do whatever they want.
“But I think it’s a net positive to have open source, because there’s always going to be people that are misusing tools,” he continued. “If more researchers have access to these tools, they can actually research this stuff outside and actually try to develop harm reduction techniques.”
There’s great potential for good in DreamBooth and AI art — for example, it can help disadvantaged people create art and serve as a valuable tool for artists everywhere, Hackett said.
“There’s always going to be bad actors out there who are trying to use these technologies,” Hackett said. “That doesn’t necessarily mean that the entire community is bad and that the tool is bad itself.”
“I just think there are so many positives, it outweighs the negative aspects of it,” Hackett said. “But that doesn’t mean that it should go unchecked.”
How can we stop deepfakes?
While some experts recommended regulations to rein in bad actors in the AI community, Hackett is skeptical this approach would work: “The cat’s already out of the bag,” he said.
“I think the larger companies can (issue regulations). But, because these tools are already in the hands of the community, they’re going to do what they want regardless,” he continued.
“I genuinely think education is the only solution.”
At the moment, AI-generated deepfakes can still be told apart from real photographs, Hackett said. Therefore, it’s critical to spread awareness of AI art and its differences from real photographs.
In the future, however, AI-generated art may become indistinguishable from reality to the human eye, Hackett and other experts say. But the difference could be clear to other AI.
Just as AI has the capacity to create deepfakes, it also has the ability to detect them, according to Wael Abd-Almageed, an associate research professor and founding director of the Visual Intelligence and Multimedia Analytics Laboratory at the University of Southern California.
For now, “the technology is not bulletproof. Like any technology, it can make mistakes, false alarms and false negatives … but even assuming that the technology works between 60 per cent to 70 per cent accuracy, it’s certainly better than absolutely nothing,” he said.
That said, AI-detection software has been rapidly improving, aided by incentives from the industry. For example, Stability AI is about to launch a deepfake detection contest with a $200,000 prize to “encourage the development of tools in the community that can be used to identify deep fakes,” its CTO said.
Because deepfakes have the potential to spread viral misinformation on social platforms, Abd-Almageed has been advocating for social media companies to adopt AI countermeasures. At the moment, little is being done, he said.
If we keep doing nothing, the proliferation of realistic deepfakes may soon “erode the line between what’s real and what’s not,” he continued.
“At some point, everything will suddenly become suspicious. We will not be able to believe any information,” Abd-Almageed said. “ … And the biggest problem with these deepfakes is that once people believe them, it becomes extremely difficult to reverse that effect.”
“Can you imagine, for example, in the next election, a few hours before polls close, a viral video of Biden comes out saying ‘I am sick and dying?’” he continued.
“ … With all the advances (in AI), it can harm not just individual people now. It can harm society and the stock market, and it can harm democracies as well.”