Use Cloud Run to host AI agents. You can implement AI agents as Cloud Run services that perform tasks and provide information to users in a conversational manner. Cloud Run scales automatically without requiring you to provision resources, and bills only for actual usage. AI agents serve a variety of purposes, such as customer service, virtual assistants, and content generation.
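Deploying an agent works like deploying any other Cloud Run service. A hedged example, where the service name and region are placeholders for your own values:

```shell
# Deploy the agent's container from source in the current directory.
# "my-agent" and the region are placeholders; adjust for your project.
gcloud run deploy my-agent \
  --source . \
  --region us-central1 \
  --allow-unauthenticated
```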
You can use a Cloud Run service as a scalable API endpoint to process prompts from end users. Your service runs an AI orchestration framework, such as LangChain, LangGraph, or Genkit, which orchestrates calls to:
- AI models such as Gemini API, Vertex AI endpoints, or another GPU-enabled Cloud Run service.
- Vector databases like Cloud SQL for PostgreSQL or AlloyDB for PostgreSQL with the pgvector extension.
- Other services or APIs.
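The request flow the list above describes can be sketched end to end. This is a minimal illustration, not a real implementation: `embed`, `search_vectors`, and `call_model` are hypothetical stand-ins for an embedding model, a pgvector nearest-neighbor query, and a Gemini or Vertex AI call.

```python
def embed(text: str) -> list[float]:
    # Stand-in: a real service would call an embedding model here.
    return [float(len(word)) for word in text.split()]

def search_vectors(query_vector: list[float], documents: list[str]) -> list[str]:
    # Stand-in for a pgvector nearest-neighbor query: rank stored
    # documents by distance between their embedding and the query's.
    def distance(doc: str) -> float:
        return abs(sum(embed(doc)) - sum(query_vector))
    return sorted(documents, key=distance)[:2]

def call_model(prompt: str) -> str:
    # Stand-in for a call to the Gemini API or a Vertex AI endpoint.
    return f"Answer based on: {prompt}"

def handle_prompt(user_prompt: str, documents: list[str]) -> str:
    # The orchestration loop: embed the prompt, retrieve context from
    # the vector store, then call the model with the augmented prompt.
    query_vector = embed(user_prompt)
    context = search_vectors(query_vector, documents)
    augmented = f"{user_prompt}\nContext: {'; '.join(context)}"
    return call_model(augmented)
```

In a real service, a framework such as LangChain or Genkit takes the place of this hand-rolled loop.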
You can stream the agent response back to the client using WebSockets.
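Whatever the transport, the server side of streaming usually forwards each chunk to the client as soon as the model produces it, rather than buffering the full response. A minimal sketch, where `send` is a hypothetical stand-in for a WebSocket send callback:

```python
from typing import Callable, Iterator

def generate_tokens(response: str) -> Iterator[str]:
    # Stand-in for a model's streaming output: yield one token at a time.
    for token in response.split():
        yield token + " "

def stream_to_client(response: str, send: Callable[[str], None]) -> None:
    # Forward each chunk as soon as it is available instead of
    # waiting for the complete response.
    for chunk in generate_tokens(response):
        send(chunk)
```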
For a more detailed architecture, see Infrastructure for a RAG-capable generative AI application using Vertex AI and AlloyDB for PostgreSQL.
Learn how to deploy Genkit to Cloud Run in the Genkit documentation.
Learn how to build and deploy a LangChain app to Cloud Run by working through a codelab or watching "Building generative AI apps on Google Cloud with LangChain".