Seamlessly switch between and combine different AI models, enabling flexible, robust, and guardrail-driven AI workflows.
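One way to combine models is a failover chain: try a preferred model, then fall back to alternatives. The sketch below is illustrative only — `call_model` is a stand-in that simulates a provider call (a real implementation would go through each provider's SDK), and `complete_with_fallback` is a hypothetical helper, not part of any SDK:

```python
# Minimal failover sketch. `call_model` only SIMULATES a provider call
# so the routing logic is visible and runnable without API keys.

def call_model(model_id: str, prompt: str) -> str:
    """Pretend provider call: fails for models marked unavailable."""
    if model_id.startswith("unavailable"):
        raise RuntimeError(f"{model_id} is down")
    return f"{model_id} answered: {prompt[:20]}"

def complete_with_fallback(prompt: str, chain: list[str]) -> str:
    """Try each model in order, returning the first successful answer."""
    errors = []
    for model_id in chain:
        try:
            return call_model(model_id, prompt)
        except RuntimeError as exc:
            errors.append(str(exc))
    raise RuntimeError("all models failed: " + "; ".join(errors))

print(complete_with_fallback(
    "Summarise this report",
    ["unavailable:gpt-4o", "claude-3-5-sonnet-20241022"],
))
```

Swapping the simulated `call_model` for real SDK calls keeps the fallback logic unchanged, which is the point of provider-agnostic routing.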
| Model Name | Version | Context | Strengths | Use Cases |
|---|---|---|---|---|
| GPT-4o | gpt-4o | 128K | Multimodal understanding, complex reasoning, code generation | Complex analysis, creative tasks, multimodal applications |
| | gpt-4o-2024-11-20 | 128K | Latest November 2024 version | Production workloads |
| | gpt-4o-2024-08-06 | 128K | August 2024 version | Stable deployments |
| | gpt-4o-2024-05-13 | 128K | May 2024 version | Legacy compatibility |
| GPT-4o-mini | gpt-4o-mini | 128K | Fast responses, cost-effective, good for simple tasks | Chatbots, simple queries, high-volume processing |
| | gpt-4o-mini-2024-07-18 | 128K | July 2024 version | Cost-optimised workloads |
| GPT-4 Turbo | gpt-4-turbo | 128K | Enhanced GPT-4 with vision | Document analysis, complex reasoning with vision |
| | gpt-4-turbo-2024-04-09 | 128K | April 2024 version | Stable vision applications |
| | gpt-4-turbo-preview | 128K | Preview version | Early access features |
| GPT-4 Classic | gpt-4 | 8K | Original GPT-4 model | Proven reliability for production workloads |
| | gpt-4-0613 | 8K | June 2023 stable version | Stable production deployments |
| | gpt-4-0314 | 8K | March 2023 version | Legacy compatibility |
| GPT-3.5 Turbo | gpt-3.5-turbo | 16K | Fast, cost-effective model | High-speed responses, cost-sensitive applications |
| | gpt-3.5-turbo-0125 | 16K | January 2024 version | Latest GPT-3.5 features |
| | gpt-3.5-turbo-1106 | 16K | November 2023 version | Stable GPT-3.5 deployment |
| O1 Series | o1-preview | Standard | Complex problem-solving, mathematical reasoning | Scientific research, complex analysis |
| | o1-mini | Standard | Faster reasoning at lower cost | Code debugging, logical problems |
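Aliases such as `gpt-4o` float to the newest snapshot over time, while dated IDs such as `gpt-4o-2024-11-20` stay pinned. A small resolver makes the distinction explicit — a hypothetical helper built from the IDs in the table above, not an OpenAI API:

```python
# Hypothetical alias map built from the model list above: each floating
# alias maps to its dated snapshots, newest first.
SNAPSHOTS = {
    "gpt-4o": ["gpt-4o-2024-11-20", "gpt-4o-2024-08-06", "gpt-4o-2024-05-13"],
    "gpt-4o-mini": ["gpt-4o-mini-2024-07-18"],
    "gpt-4-turbo": ["gpt-4-turbo-2024-04-09"],
    "gpt-3.5-turbo": ["gpt-3.5-turbo-0125", "gpt-3.5-turbo-1106"],
}

def resolve(model_id: str, pin: bool = True) -> str:
    """Return a dated snapshot for a floating alias when pinning is
    requested; pass already-pinned IDs (or pin=False) through unchanged."""
    if pin and model_id in SNAPSHOTS:
        return SNAPSHOTS[model_id][0]  # newest dated snapshot
    return model_id

print(resolve("gpt-4o"))             # pinned to a dated snapshot
print(resolve("gpt-4o", pin=False))  # floating alias left as-is
```

Pinning to dated snapshots is the usual choice for production workloads, since a floating alias can change behaviour when the provider promotes a new snapshot.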
| Model Name | Version | Context | Strengths | Use Cases |
|---|---|---|---|---|
| Claude 3.5 Sonnet | claude-3-5-sonnet-20241022 | 200K | Best balance of intelligence and speed, excellent coding | Code generation, complex analysis, creative writing |
| | claude-3-5-sonnet-20240620 | 200K | June 2024 version | Stable Claude 3.5 deployment |
| Claude 3.5 Haiku | claude-3-5-haiku-20241022 | 200K | Lightning-fast responses, cost-effective | Real-time applications, high-volume processing |
| Claude 3 Opus | claude-3-opus-20240229 | 200K | Complex reasoning, nuanced understanding | Research, complex document analysis |
| Claude 3 Sonnet | claude-3-sonnet-20240229 | 200K | Balanced Claude 3 model | General purpose, good price-performance |
| Claude 3 Haiku | claude-3-haiku-20240307 | 200K | Fastest Claude 3 model | High-speed, cost-sensitive applications |
| Model Name | Version | Context | Strengths | Use Cases |
|---|---|---|---|---|
| Anthropic Claude on Bedrock | anthropic.claude-3-5-sonnet-20241022-v2:0 | 200K | AWS integration, enterprise security, compliance | Enterprise Claude deployments |
| | anthropic.claude-3-5-sonnet-20240620-v1:0 | 200K | AWS-native Claude 3.5 | AWS-integrated applications |
| | anthropic.claude-3-5-haiku-20241022-v1:0 | 200K | Fast Claude with AWS benefits | High-speed AWS applications |
| | anthropic.claude-3-opus-20240229-v1:0 | 200K | Most capable Claude with AWS | Complex AWS workloads |
| | anthropic.claude-3-sonnet-20240229-v1:0 | 200K | Balanced Claude with AWS | General AWS applications |
| | anthropic.claude-3-haiku-20240307-v1:0 | 200K | Fast Claude with AWS | Cost-effective AWS applications |
| Meta Llama 3.1 | meta.llama3-1-70b-instruct-v1:0 | 8K | Large Llama 3.1 model, open-source heritage | Custom deployments, fine-tuning base |
| | meta.llama3-1-8b-instruct-v1:0 | 8K | Efficient Llama 3.1 model | Lightweight applications |
| Meta Llama 3 | meta.llama3-70b-instruct-v1:0 | 8K | Large Llama 3 model | High-performance open-source needs |
| | meta.llama3-8b-instruct-v1:0 | 8K | Small Llama 3 model | Cost-effective open-source |
| Amazon Titan | amazon.titan-text-premier-v1:0 | 8K | Premium Titan model, AWS-native | AWS-integrated applications, Amazon-specific tasks |
| | amazon.titan-text-express-v1 | 8K | Fast Titan model | High-speed AWS applications |
| Cohere Command | cohere.command-r-plus-v1:0 | Standard | Advanced Command model, retrieval-augmented generation | RAG applications, document search |
| | cohere.command-r-v1:0 | Standard | Standard Command model | Enterprise search applications |
| Mistral | mistral.mistral-large-2402-v1:0 | 32K | Large European-built model | Multilingual applications, European compliance |
| | mistral.mixtral-8x7b-instruct-v0:1 | 32K | MoE architecture, efficient inference | Efficient multilingual processing |
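The Bedrock IDs above share a common shape: a vendor prefix before the first dot, a model name, and an optional `:` revision. The parser below is an illustrative sketch of that naming pattern, not an AWS API:

```python
# Illustrative parser for the Bedrock model-ID pattern shown above:
# "<vendor>.<model-name>[:<revision>]". Not part of boto3 or any AWS SDK.
def parse_bedrock_id(model_id: str) -> dict:
    vendor, rest = model_id.split(".", 1)        # vendor prefix before first dot
    name, _, revision = rest.partition(":")      # optional ":" revision suffix
    return {"vendor": vendor, "name": name, "revision": revision or None}

print(parse_bedrock_id("anthropic.claude-3-5-sonnet-20241022-v2:0"))
print(parse_bedrock_id("amazon.titan-text-express-v1"))  # no revision suffix
```

Splitting IDs this way is handy when routing logic needs to branch on vendor (for example, Anthropic versus Meta payload formats) without hard-coding full model strings.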
| Feature | Description | Use Cases |
|---|---|---|
| Model Availability | Same models as OpenAI with Azure enterprise capabilities | Enterprise applications, Microsoft ecosystem integration |
| Regional Deployments | Deploy models in specific Azure regions | Data residency compliance, low-latency applications |
| Private Endpoints | Network-isolated access over Azure Private Link | Secure enterprise deployments |
| Enterprise SLAs | Guaranteed uptime and support | Mission-critical applications |
| Content Filtering | Built-in content moderation | Compliance and safety requirements |
| Model Type | Description | Use Cases |
|---|---|---|
| Custom Model Deployments | Deploy any model from the Azure AI catalogue | Custom ML pipelines, specialised applications |
| Fine-tuned Models | Deploy your custom fine-tuned models | Domain-specific applications |
| Open-Source Models | Llama, Mistral, Falcon, and more | Open-source AI development |
| Specialised Models | Domain-specific models for healthcare, finance, etc. | Industry-specific applications |
| Model Name | Version | Context | Strengths | Use Cases |
|---|---|---|---|---|
| Gemini 1.5 Pro | gemini-1.5-pro | 2M | Most capable Gemini model, massive context, multimodal, video understanding | Long-document analysis, video processing |
| Gemini 1.5 Flash | gemini-1.5-flash | 1M | Fast, cost-effective multimodal model | Real-time applications, high-volume processing |
| Gemini 1.0 Pro | gemini-1.0-pro | Standard | Previous-generation Pro model | Stable, proven performance |
| Gemini 1.0 Pro Vision | gemini-1.0-pro-vision | Standard | Vision-enabled version | Image analysis applications |
| Gemini Experimental | gemini-exp-1121 | Standard | November 2024 experimental | Testing cutting-edge capabilities |
| | gemini-exp-1114 | Standard | November 2024 experimental | Early access to new features |
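With context windows this large, the practical question is whether a document fits at all or needs chunking. The sketch below is a rough fit check using the common ~4-characters-per-token heuristic — an approximation, not a real tokenizer, and `fits_in_context` is a hypothetical helper:

```python
# Rough context-fit check for the Gemini windows listed above, using the
# common ~4-characters-per-token heuristic (an approximation only; a real
# tokenizer would give exact counts).
CONTEXT_TOKENS = {
    "gemini-1.5-pro": 2_000_000,
    "gemini-1.5-flash": 1_000_000,
}

def fits_in_context(text: str, model_id: str, reply_budget: int = 8_192) -> bool:
    """True if the estimated prompt tokens plus a reply budget fit."""
    estimated_tokens = len(text) // 4
    return estimated_tokens + reply_budget <= CONTEXT_TOKENS[model_id]

print(fits_in_context("x" * 3_000_000, "gemini-1.5-pro"))    # ~750K tokens: fits in 2M
print(fits_in_context("x" * 5_000_000, "gemini-1.5-flash"))  # ~1.25M tokens: exceeds 1M
```

Reserving a reply budget matters because the window is shared between prompt and response; skipping that step is a common cause of truncated outputs on near-limit documents.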
| Feature | Description | Use Cases |
|---|---|---|
| Model Availability | Same Gemini models as Google AI with enterprise features | GCP-native applications, enterprise deployments |
| Private Endpoints | Network isolation via VPC Service Controls | Secure enterprise deployments |
| Regional Deployments | Deploy models in specific GCP regions | Data residency and compliance requirements |
| Model Garden | Access to 100+ open-source models | Open-source AI development |
| AutoML Integration | Custom model training | Custom ML model development |
| Provider | Model | Context | Speed | Cost | Best For |
|---|---|---|---|---|---|
| OpenAI | gpt-4o | 128K | Medium | High | Complex reasoning, multimodal |
| | gpt-4o-mini | 128K | Fast | Low | Simple tasks, high volume |
| | gpt-4-turbo | 128K | Medium | High | Vision tasks, analysis |
| | gpt-3.5-turbo | 16K | Fast | Low | Quick responses, chatbots |
| | o1-preview | Standard | Slow | High | Complex reasoning |
| Anthropic | claude-3-5-sonnet | 200K | Fast | Medium | Coding, analysis |
| | claude-3-5-haiku | 200K | Fastest | Very Low | Real-time apps |
| | claude-3-opus | 200K | Slow | Very High | Complex research |
| Google | gemini-1.5-pro | 2M | Medium | Medium | Massive documents |
| | gemini-1.5-flash | 1M | Fast | Low | High-speed processing |
| Bedrock | llama3-1-70b | 8K | Medium | Low | Open-source needs |
| | titan-premier | 8K | Fast | Low | AWS integration |
| | mistral-large | 32K | Medium | Medium | European compliance |
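A comparison table like this can drive model selection directly. The toy router below encodes a few rows as ordinal scores and picks the cheapest model meeting a minimum speed tier — a sketch of the selection idea, with `cheapest_fast_model` and the score values as illustrative assumptions, not a real pricing API:

```python
# Toy router over a few rows of the comparison table above. The numeric
# scores are ordinal stand-ins for the Speed/Cost labels; the selection
# logic, not the exact numbers, is the point.
SPEED = {"Slow": 0, "Medium": 1, "Fast": 2, "Fastest": 3}
COST = {"Very High": 0, "High": 1, "Medium": 2, "Low": 3, "Very Low": 4}

MODELS = [
    ("gpt-4o", "Medium", "High"),
    ("gpt-4o-mini", "Fast", "Low"),
    ("claude-3-5-sonnet", "Fast", "Medium"),
    ("claude-3-5-haiku", "Fastest", "Very Low"),
    ("gemini-1.5-flash", "Fast", "Low"),
]

def cheapest_fast_model(min_speed: str = "Fast") -> str:
    """Pick the cheapest model that meets a minimum speed tier."""
    candidates = [(COST[cost], name) for name, speed, cost in MODELS
                  if SPEED[speed] >= SPEED[min_speed]]
    return max(candidates)[1]  # highest COST score = cheapest label

print(cheapest_fast_model())           # cheapest among Fast-or-better models
print(cheapest_fast_model("Fastest"))  # cheapest among Fastest-tier models
```

Extending the candidate list with context-length and capability filters turns the same pattern into a practical cost-aware router across providers.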