In an age where big tech dominates the artificial intelligence arena, it is rare to find upstarts making a name for themselves. We have seen companies such as Hugging Face succeed in making state-of-the-art machine learning models easy to implement. Their higher-level abstractions let someone with only a basic understanding of machine learning enter the space and get results.
Now we have Stability AI, which is making a name for itself in text-to-image generation and has grown significantly so far. In text-to-image generation, a program reads and interprets natural language text and creates an image from it.
This field has come a long way and is now seeping into everyday life. The use cases are growing too. In its early stages the technology had a playful feel: any child could use it to generate silly images that weren’t rendered properly but got the job done.
Today such a tool can be used to create industrial design mockups for products, architecture, and more. It can be used for photo search, photo editing, and other image manipulation. It can also create realistic photos from text, as can be seen from the examples below.
Stability AI was created by Emad Mostaque, an Oxford graduate with a master’s in mathematics and computer science. His career had mainly centered on finance, where he worked as an analyst for various hedge funds.
In 2020, when he had a lot of free time on his hands, he developed this technology.
He was frustrated with the state of the open-source community for artificial intelligence. He now works mainly on open-source projects so that these technologies can be used by the public.
“Nobody has any voting rights except our employees — no billionaires, big funds, governments, or anyone else with control of the company or the communities we support. We’re completely independent,” Mostaque told TechCrunch in a previous interview. “We plan to use our compute to accelerate open source, foundational AI.”
Last October the company raised US$101 million from Coatue and Lightspeed Venture Partners, with participation from O’Shaughnessy Ventures LLC. Mostaque is going to need all of that money to pull this off, because these models are very expensive to run. Business Insider reports that Stability AI’s operations and cloud expenditures exceed US$50 million.
The company is betting on making money by building private versions of its models for paying customers and by acting as an infrastructure layer for users. It also offers a platform and API called DreamStudio, through which users can access its models.
This is very similar to the approach of Hugging Face and other AI companies. APIs (application programming interfaces) let users call the models from their own applications and platforms.
Stable Diffusion, the company’s text-to-image model, has more than 10 million daily users across all its touchpoints. The open-source version has been downloaded hundreds of thousands of times, and that number continues to grow.
Despite the use cases, there are drawbacks. The company has had its share of controversy, as governments don’t yet know how to manage open-source models that are released into the wild for anyone to use. The concerns are certainly warranted, but a viable solution has not been developed yet.
We have seen various open-source AI companies release stripped-down models to the public for fear of unintended consequences from malicious actors.
In the case of Stability AI, users have employed its models to generate pornographic images, graphically violent images, and deepfakes, which have taken on a life of their own over the years.
The company has also been experimenting with AI models for generating audio, language, 3D, and even video. The most successful of these so far is Dance Diffusion, which generates clips of music after being trained on a large music dataset.
Overall, Stability AI seems to be heading in the right direction, and if it continues to innovate along the same track, its impact will only grow.