What is Generative AI?
Generative AI is a class of AI technology that can produce high-quality text, images, and video. It differs from other forms of AI in that generative models are trained to create new data rather than to predict an outcome from given data.
Where can I play with it?
https://chatgpt.com/ is the most widely known example of Gen AI: a user can chat with an AI model and receive high-quality text responses. The model draws on knowledge from a vast range of sources around the world. It can do basic and some advanced math, write and run programs, and even compose poetry and stories. Gen AI models like DALL-E and Sora can also generate high-quality images and videos, respectively.
https://www.midjourney.com/imagine is another example of Generative AI, where you can enter a text prompt and generate high-quality images. Giving the same prompt to two different models produces quite different results, so it's important to understand the nuances and take the time to experiment with a few models before committing to one for your project.
How does it work?
Generative AI text models work by splitting the user prompt into tokens and mapping each token to a numeric vector (an embedding). These vectors are fed into a Large Language Model (LLM), which repeatedly predicts the next token; the predicted tokens are then converted back to text and returned to the user. LLMs are trained on a very large corpus of text, and the models themselves are very large, often containing billions of parameters. During training, the model learns statistical patterns across large chunks of text and learns to predict which token is most likely to come next.
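The next-token idea above can be illustrated with a drastically simplified sketch: a bigram counter that, given a word, returns the word most often seen after it. The corpus here is a made-up example, and real LLMs use subword tokens, embeddings, and transformer networks rather than a lookup table - but the prediction objective is the same.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; real models train on billions of tokens.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which token follows which (a bigram model -- a stand-in for
# the statistical patterns an LLM learns at scale).
follows = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    follows[cur][nxt] += 1

def predict_next(token):
    """Return the token most often observed after `token` in training."""
    return follows[token].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often in this corpus
```

An LLM replaces the count table with a neural network that assigns a probability to every possible next token, then samples from that distribution.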
Can I build one?
Yes - and no. Building an enterprise-level model requires a detailed understanding of various ML technologies, access to a large amount of text and images/videos, and substantial compute. These requirements make it virtually impossible for an individual to create a high-quality Generative AI model from scratch. However, one can build a toy Generative AI model to get familiar with some of the technologies involved.
Here are some essential components one will need to build a text-to-image model: a dataset of image-caption pairs for training, a text encoder that turns captions into vectors, a generative image model (such as a GAN or a diffusion model), and GPU compute for training.
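Before tackling images, the "toy model" idea can be tried on text in a few lines: a word-level Markov chain that samples each next word from the words observed to follow it. The corpus is a made-up example; real generative models replace this transition table with a trained neural network.

```python
import random

random.seed(0)  # make the toy example repeatable

# Hypothetical toy corpus.
corpus = "the cat sat on the mat and the dog sat on the rug".split()

# Transition table: token -> list of observed next tokens.
table = {}
for cur, nxt in zip(corpus, corpus[1:]):
    table.setdefault(cur, []).append(nxt)

def generate(start, length=8):
    """Generate a sequence by repeatedly sampling a next token."""
    out = [start]
    for _ in range(length - 1):
        choices = table.get(out[-1])
        if not choices:          # dead end: no observed successor
            break
        out.append(random.choice(choices))
    return " ".join(out)

print(generate("the"))
```

Tiny as it is, this captures the generative loop - sample, append, repeat - that large models perform at vastly greater scale and quality.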
What is the timeline of Generative AI evolution?
Late 1980s: Introduction of Recurrent Neural Networks (RNNs) for processing sequential data.
1997: Long Short-Term Memory (LSTM) networks improve handling of long-range dependencies in sequences.
2014: Introduction of Generative Adversarial Networks (GANs) for high-quality image generation.
2014+: Development of variational autoencoders (VAEs), diffusion models, and flow-based models for improved generative processes.
2017: Transformer architecture introduced, allowing models to process all parts of a text sequence in parallel.
2018: OpenAI creates Generative Pre-trained Transformer (GPT) model.
2022: OpenAI releases ChatGPT.
2023: Meta releases Llama (Large language model Meta AI) - a family of models.