As we continue to push the boundaries of AI capabilities, we are thrilled to announce several new fine-tuning features in Azure OpenAI Service. Fine-tuning is crucial for meeting customer-specific needs, as it allows organizations to adapt pre-trained models to their unique datasets and requirements. This customization enhances performance, reduces token costs, and ensures that AI solutions are aligned with business goals.
In recent months, we have seen tremendous growth in the consumption of Azure OpenAI Service fine-tuning. More and more organizations are recognizing the value of fine-tuning to create AI models tailored to their specific use cases. This trend highlights the increasing demand for flexible and efficient AI solutions that can be easily customized to meet diverse business needs.
Introducing o1-mini Reinforcement Fine-Tuning
We’re excited to announce the private preview of reinforcement fine-tuning for the o1-mini model. Reinforcement fine-tuning is particularly beneficial for optimizing model behavior in highly complex or dynamic environments, enabling the model to learn and adapt through iterative feedback and decision-making.
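To make the idea tangible, here is a deliberately tiny, self-contained sketch of the reinforcement loop: a policy over candidate answers is sampled, a grader scores each answer, and rewarded answers become more likely. This toy is purely conceptual; in the private preview, sampling, grading, and policy updates all happen inside the o1-mini training service rather than in your own code.

```python
import random

# Toy illustration of the reinforcement fine-tuning idea (not the Azure
# OpenAI API): a "policy" over candidate answers is nudged toward the
# answers a grader rewards, mirroring how RFT reinforces high-scoring
# reasoning at much larger scale.

CANDIDATES = ["4", "5", "22"]           # possible answers to "2 + 2 = ?"
weights = {c: 1.0 for c in CANDIDATES}  # uniform initial policy

def grade(answer: str) -> float:
    """Hypothetical grader: rewards the correct answer."""
    return 1.0 if answer == "4" else 0.0

def sample() -> str:
    """Draw an answer proportionally to its current policy weight."""
    total = sum(weights.values())
    r = random.uniform(0, total)
    for candidate, weight in weights.items():
        r -= weight
        if r <= 0:
            return candidate
    return CANDIDATES[-1]

for step in range(200):
    answer = sample()          # model proposes an answer
    reward = grade(answer)     # grader scores it
    # Reinforce: rewarded answers become more likely on future steps.
    weights[answer] *= (1.0 + 0.5 * reward)

print("Policy now prefers:", max(weights, key=weights.get))  # converges to "4"
```

Reinforcement fine-tuning applies this same feedback principle to whole reasoning traces, reinforcing the behaviors that a task-specific grader scores highly.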
Designed to deliver exceptional reasoning capabilities, particularly in STEM fields, at a fraction of the cost of larger models, o1-mini has become a trusted solution for businesses across industries. With fine-tuning now available, you can customize o1-mini to address your specific needs, unlocking new efficiencies and opportunities.
For example, financial services providers can optimize the model for faster, more accurate risk assessments or personalized investment advice. In healthcare and pharmaceuticals, o1-mini can be tailored to accelerate drug discovery, enabling more efficient data analysis, hypothesis generation, and identification of promising compounds.
Fine-tuning empowers you to align o1-mini with your goals while preserving its hallmark cost-efficiency. Start customizing your AI solutions and see how o1-mini fine-tuning can transform your business.
Announcing Direct Preference Optimization
Direct Preference Optimization (DPO) is another new alignment technique for large language models, designed to adjust model weights based on human preferences. Unlike Reinforcement Learning from Human Feedback (RLHF), DPO does not require fitting a separate reward model; it trains directly on binary preference data. This makes it computationally lighter and faster than RLHF while remaining equally effective at alignment. DPO is especially useful in scenarios where subjective elements like tone, style, or specific content preferences are important. We’re excited to announce the public preview of DPO in Azure OpenAI Service, starting with GPT-4o-2024-08-06; support for GPT-4o-mini-2024-07-18 will follow soon.
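To make the workflow concrete: DPO training data is a JSONL file in which each record pairs a prompt with a preferred and a non-preferred response. The sketch below uses the `openai` Python SDK; the record schema follows the published preference fine-tuning format, but the endpoint, key, API version, and `beta` hyperparameter are illustrative placeholders, so check the Azure OpenAI documentation for the authoritative details.

```python
import json
from openai import AzureOpenAI

# One DPO training record: a prompt plus a preferred and a non-preferred
# assistant response. Preference here is about tone, not correctness.
record = {
    "input": {
        "messages": [
            {"role": "user", "content": "Summarize our Q3 results for a customer email."}
        ]
    },
    "preferred_output": [
        {"role": "assistant", "content": "Hi Dana, here are the Q3 highlights..."}  # on-brand tone
    ],
    "non_preferred_output": [
        {"role": "assistant", "content": "Q3 numbers attached. Regards."}  # too terse
    ],
}

with open("dpo_train.jsonl", "w") as f:
    f.write(json.dumps(record) + "\n")

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",  # placeholder
    api_key="YOUR-API-KEY",                                   # placeholder
    api_version="2024-10-21",                                 # example version
)

training_file = client.files.create(
    file=open("dpo_train.jsonl", "rb"), purpose="fine-tune"
)

# Submit a DPO fine-tuning job; beta weights the preference loss.
job = client.fine_tuning.jobs.create(
    model="gpt-4o-2024-08-06",
    training_file=training_file.id,
    method={"type": "dpo", "dpo": {"hyperparameters": {"beta": 0.1}}},
)
print(job.id)
```

In practice you would collect hundreds or thousands of such preference pairs; a single record is shown only to illustrate the shape of the data.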
Distillation: Enhancing Efficiency and Performance for Fine-Tuning
We are announcing the public preview of Stored completions, which lets developers capture and store input-output pairs from models like GPT-4o, building datasets from production data for evaluating and fine-tuning models through a technique called distillation. Stored completions currently supports GPT-4o-0806 in the Sweden Central region; we plan to expand this feature to additional models and regions in the future.
The end-to-end distillation process includes collecting live traffic from Azure OpenAI endpoints, filtering and subsetting that traffic in the Stored Completions UI, exporting it to the Evaluation UI for quality scoring, and finally fine-tuning on the collected data, or a subset selected based on evaluation scores.
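The collection step is opt-in per request. A minimal sketch of capturing traffic with the `openai` Python SDK, using the `store` flag and `metadata` tags for later filtering; the endpoint, key, deployment name, and tag values are placeholders:

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",  # placeholder
    api_key="YOUR-API-KEY",                                   # placeholder
    api_version="2024-10-21",                                 # example version
)

# store=True persists this input-output pair so it can later be filtered
# in the Stored Completions UI, scored in the Evaluation UI, and used as
# distillation training data for a smaller or fine-tuned model.
response = client.chat.completions.create(
    model="gpt-4o",  # your GPT-4o-0806 deployment name
    store=True,
    metadata={"app": "support-bot", "env": "prod"},  # tags for filtering later
    messages=[
        {"role": "system", "content": "You are a concise support assistant."},
        {"role": "user", "content": "How do I reset my password?"},
    ],
)
print(response.choices[0].message.content)
```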
Prompt Caching for Fine-tuned Models
We are excited to announce support for prompt caching with fine-tuned models, available for GPT-4o-0806 and GPT-4o-mini. Prompt caching significantly reduces request latency and cost by reusing recently seen input tokens, which is especially beneficial for longer prompts that share identical initial content. This feature ensures faster processing times and offers a 50% discount on cached input tokens for Standard deployment types. This is the first time prompt caching for fine-tuned models ships with differentiated pricing, providing substantial savings to our customers. See the Azure OpenAI Service pricing page for the current rates for these models.
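Because caching matches on a shared prefix of input tokens, you benefit most by placing the static portion of each prompt (system instructions, few-shot examples) first and the variable portion last. A minimal sketch with the `openai` Python SDK; the deployment name, endpoint, and API version are placeholders, and the `cached_tokens` usage field may not be populated on every API version:

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",  # placeholder
    api_key="YOUR-API-KEY",                                   # placeholder
    api_version="2024-10-21",                                 # example version
)

# Keep the long, unchanging instructions at the front of every request so
# repeated calls share the same token prefix and can hit the prompt cache;
# only the trailing user message varies between calls.
STATIC_SYSTEM_PROMPT = (
    "You are a claims-triage assistant. Follow these policies...\n"
    + "...\n" * 50  # stands in for a long, identical preamble across requests
)

for question in ["Is water damage covered?", "What is my deductible?"]:
    response = client.chat.completions.create(
        model="my-gpt4o-finetune",  # placeholder fine-tuned deployment name
        messages=[
            {"role": "system", "content": STATIC_SYSTEM_PROMPT},
            {"role": "user", "content": question},  # variable suffix
        ],
    )
    details = response.usage.prompt_tokens_details
    print(details.cached_tokens if details else "n/a")  # cache hits, if reported
```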
Global Standard Deployment for Fine-Tuned Models
We are excited to announce the public preview of Global Standard deployment for Azure OpenAI fine-tuned models! This new deployment option offers developers a cost-effective way to deploy custom models with the same rate limits as Global Standard deployments of base models. Note that with Global Standard, custom model weights may be stored outside your selected geography during inferencing; this trade-off provides more choices for custom model deployments and makes the option well suited to experimentation. Starting with GPT-4o-0806 and GPT-4o-mini, this deployment option offers the flexibility needed for your applications and makes it easier to manage and deploy fine-tuned models. Please note that Global Standard fine-tuning deployments currently do not support vision and structured outputs. Azure OpenAI Service is committed to providing a wider range of deployment options to better serve customer needs. Stay tuned for further updates!
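For illustration, a fine-tuned model can be deployed with the GlobalStandard SKU through the Azure management SDK. A minimal sketch assuming the `azure-mgmt-cognitiveservices` and `azure-identity` packages; the subscription, resource group, account, deployment names, and fine-tuned model ID are placeholders, and capacity units depend on your quota:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient
from azure.mgmt.cognitiveservices.models import (
    Deployment, DeploymentModel, DeploymentProperties, Sku,
)

client = CognitiveServicesManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id="YOUR-SUBSCRIPTION-ID",  # placeholder
)

# Create (or update) a Global Standard deployment of a fine-tuned model.
poller = client.deployments.begin_create_or_update(
    resource_group_name="my-resource-group",  # placeholder
    account_name="my-aoai-resource",          # placeholder
    deployment_name="my-gpt4o-finetune",      # placeholder
    deployment=Deployment(
        sku=Sku(name="GlobalStandard", capacity=50),  # capacity per your quota
        properties=DeploymentProperties(
            model=DeploymentModel(
                format="OpenAI",
                name="gpt-4o-2024-08-06.ft-abc123",  # hypothetical fine-tuned model ID
                version="1",
            )
        ),
    ),
)
print(poller.result().name)
```

The same deployment can also be created in Azure AI Foundry or with the Azure CLI; the SDK route is shown here only to make the GlobalStandard SKU choice explicit.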
These new features in Azure OpenAI Service fine-tuning demonstrate our commitment to providing robust, flexible, and efficient AI solutions. With advancements like o1-mini reinforcement fine-tuning, stored completions, provisioned and Global Standard deployments, and Direct Preference Optimization, developers have the tools they need to create high-quality, customized AI models. We invite you to explore these new features and take your AI projects to the next level.
Stay tuned for more updates and join us in this exciting journey of innovation!