Unique challenges of generative models
Given the powerful applications of generative models, what are the major challenges in implementing them? As described, most of these models are built on complex data, requiring us to fit large models to sufficiently diverse inputs in order to capture all the nuances of their features and distributions. That complexity arises from sources including:
- Range of variation: The number of potential images that can be formed from pixels with three color channels is astronomically large, as is the vocabulary of many natural languages (see the back-of-the-envelope calculation after this list)
- Heterogeneity of sources: Language models, in particular, are often trained on a mixture of data drawn from many different websites
- Size: As datasets grow, it becomes more difficult to catch duplicates, factual errors (such as mistranslations), noise (such as scrambled images), and systematic biases
- Rate of change: Many developers of LLMs struggle to keep a model's information current with the state of the world so that it can provide relevant answers to user prompts
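To make the first point about range of variation concrete, here is a quick back-of-the-envelope calculation (a minimal sketch in Python; the 64×64 image size and 8-bit color depth are assumptions chosen purely for illustration):

```python
# Back-of-the-envelope: how many distinct images exist at a given size?
# Assumes 8-bit color, i.e., 256 intensity levels per channel.
import math

height, width, channels = 64, 64, 3  # a small 64x64 RGB image
levels = 256                         # 8-bit intensities per channel

# Total images = levels ** (height * width * channels);
# that overflows intuition, so report it as a power of 10 instead.
log10_images = height * width * channels * math.log10(levels)
print(f"Distinct {height}x{width} RGB images: about 10^{log10_images:,.0f}")
# Prints about 10^29,592 -- vastly more than the roughly 10^80 atoms
# estimated to exist in the observable universe.
```

Even at this tiny resolution, a model cannot hope to memorize the space of possible images; it must learn the structure of the underlying distribution.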
This complexity has implications both for the number of examples we must collect to adequately represent the kind of data we are trying to generate, and for the computational resources needed to build the model. Throughout this book, we will use cloud-based tools to accelerate our experiments with these models.

A more subtle problem, arising from the complexity of the data and from the fact that we are trying to generate data rather than a numerical label or value, is that our notion of model “accuracy” becomes much more complicated: we cannot simply calculate the distance to a single label or score. We will discuss in Chapter 3 and Chapter 4 how deep generative models such as VAEs and GANs take different approaches to determining whether a generated image is comparable to a real-world image. Finally, our models need to allow us to generate both large and diverse samples, and the various methods we will discuss take different approaches to controlling the diversity of the generated data.
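To see why a simple distance calculation fails as an accuracy measure for generated data, consider the following toy comparison (a minimal sketch using NumPy; the checkerboard image, pixel shift, and noise are invented for illustration and are not an evaluation method from this chapter):

```python
# Toy illustration: pixel-wise distance is a poor proxy for sample quality.
# All arrays here are synthetic; this is a sketch, not a real metric.
import numpy as np

rng = np.random.default_rng(0)

# "Real" image: a black-and-white checkerboard texture, values in {0, 1}.
idx = np.arange(64)
real = ((idx[:, None] + idx[None, :]) % 2).astype(float)

# Candidate A: the same texture shifted one pixel to the right --
# to a human eye, an identical-looking checkerboard.
shifted = np.roll(real, shift=1, axis=1)

# Candidate B: uniform noise -- perceptually nothing like a checkerboard.
noise = rng.uniform(0.0, 1.0, size=real.shape)

def mse(a, b):
    """Mean squared per-pixel error between two images."""
    return float(np.mean((a - b) ** 2))

print(f"MSE(real, shifted) = {mse(real, shifted):.3f}")  # 1.000
print(f"MSE(real, noise)   = {mse(real, noise):.3f}")    # ~0.333
# Per-pixel distance rates the noise as "closer" to the real image than an
# identical-looking texture, because the one-pixel shift inverts every cell
# of the checkerboard.
```

This is why evaluating generative models requires richer notions of sample quality than per-example error, a theme we return to with VAEs and GANs in Chapter 3 and Chapter 4.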