You're reading from Generative AI with Python and PyTorch Navigating the AI frontier with LLMs, Stable Diffusion, and next-gen AI applications

Product type Paperback

Published in Mar 2025

Publisher Packt

ISBN-13 9781835884447

Length 450 pages

Edition 2nd Edition

Languages

Python

Tools

PyTorch

Concepts

Artificial Intelligence

Authors (2):

Joseph Babcock

Raghav Bali

View More author details

Table of Contents (18) Chapters

Preface

1. Introduction to Generative AI: Drawing Data from Models

2. Building Blocks of Deep Neural Networks FREE CHAPTER

3. The Rise of Methods for Text Generation

4. NLP 2.0: Using Transformers to Generate Text

5. LLM Foundations

6. Open-Source LLMs

7. Prompt Engineering

8. LLM Toolbox

9. LLM Optimization Techniques

10. Emerging Applications in Generative AI

11. Neural Networks Using VAEs

12. Image Generation with GANs

13. Style Transfer with GANs

14. Deepfakes with GANs

15. Diffusion Models and AI Art

16. Other Books You May Enjoy

17. Index

Implementing generative models

While generative models could theoretically be implemented using a wide variety of machine learning algorithms, in practice, they are usually built with deep neural networks, which are well suited to capture the complex variation in data such as images or language. In this book, we will focus on implementing these deep-learning-based generative models for many different applications using PyTorch. PyTorch is a Python programming library used to develop and produce deep learning models. It was open-sourced by Meta (formerly Facebook) in 2016 and has become one of the most popular libraries for the research and deployment of neural network models. We’ll execute PyTorch code on the cloud using Google’s Colab notebook environment, which allows you to access world-class computing infrastructure including graphic processing units (GPUs) and tensor processing units (TPUs) on demand and without the need for onerous environment setups. We’ll also leverage the Pipelines library from Hugging Face, which provides an easy interface to run experiments using a catalog of some of the most sophisticated models available.

In the following chapters, you will learn not only the underlying theory behind these models but also the practical skills to implement them in popular programming frameworks. In Chapter 2, we’ll review how, since 2006, an explosion of research in “deep learning” using large neural network models has produced a wide variety of generative modeling applications. Innovations arising from this research included variational autoencoders (VAEs), which can efficiently generate complex data samples from random numbers that are “decoded” into realistic images, using techniques we will describe in Chapter 11. We will also describe a related image generation algorithm, the generative adversarial network (GAN), in more detail in Chapters 12-14 of this book through applications for image generation, style transfer, and deepfakes. Conceptually, the GAN model creates a competition between two neural networks.

One (termed the generator) produces realistic (or, in the case of the experiments by Obvious, artistic) images starting from a set of random numbers that are “decoded” into realistic images by applying a mathematical transformation. In a sense, the generator is like an art student, producing new paintings from brushstrokes and creative inspiration. The second network, known as the discriminator, attempts to classify whether a picture comes from a set of real-world images, or whether it was created by the generator. Thus, the discriminator acts like a teacher, grading whether the student has produced work comparable to the paintings they are attempting to mimic. As the generator becomes better at fooling the discriminator, its output becomes closer and closer to the historical examples it is designed to copy. In Chapter 11, we’ll also describe the algorithm used in Théâtre D’opéra Spatial, the latent diffusion model, which builds on VAEs to provide scalable image synthesis based on natural language prompts from a human user.

Another key innovation in generative models is in the domain of natural language data—by representing the complex interrelationship between words in a sentence in a computationally scalable way, the Transformer network and the Bidirectional Encoder from Transformers (BERT) model built on top of it present powerful building blocks to generate textual data in applications such as chatbots and large language models (LLMs), which we’ll cover in Chapters 4 and 5. In Chapter 6, we will dive deeper into the most famous open-source models in the current LLM landscape, including Llama. In Chapters 7 and 8.

Before diving into further details on the various applications of generative models and how to implement them in PyTorch, we will take a step back and examine how exactly generative models are different from other kinds of machine learning. This difference lies with the basic units of any machine learning algorithm: probability and the various ways we use mathematics to quantify the shape and distribution of data we encounter in the world. In the rest of this chapter, we will cover the following:

How we can use the statistical rules of probability to describe how machine learning models represent the shapes of the datasets we study
The difference between discriminative and generative models, based on the kinds of probability rules they embody
Examples of areas where generative modeling has been applied: image generation, style transfer, chatbots and text synthesis, and reinforcement learning

The rest of the chapter is locked

You're reading from Generative AI with Python and PyTorch Navigating the AI frontier with LLMs, Stable Diffusion, and next-gen AI applications

Table of Contents (18) Chapters

Implementing generative models

Unlock this book and the full library FREE for 7 days

Authors (2)

Personalised recommendations for you