Managed API, Serverless Inference, Dedicated Endpoints, Fine-tuning endpoints
Meta, Mistral, Google, Microsoft, NVIDIA, IBM, Alibaba, OpenAI, Cohere
Hugging Face is the leading community and data science platform for open-source machine learning, hosting over 800,000 models. Its Enterprise Hub provides managed infrastructure and advanced security for businesses to host and share models internally or publicly.
Serverless API, Managed Compute, Pay-as-you-go
Microsoft, OpenAI, Anthropic, Cohere, Meta, Mistral AI, DeepSeek, xAI, Stability AI, NVIDIA, Hugging Face
Azure AI Foundry (formerly Azure AI Studio) is Microsoft's primary enterprise hub for model discovery, offering over 11,000 models from both Microsoft and third-party partners. It supports diverse access modes including serverless APIs and managed compute with unified security and compliance.
Managed API, Serverless Inference, Fine-tuning endpoints, Batch Inference, Marketplace Subscription
OpenAI, Anthropic, Meta, Mistral, Cohere, Google, Amazon
Databricks Mosaic AI is an integrated platform for building, deploying, and governing generative AI applications. It offers Mosaic AI Model Serving, which allows businesses to serve open-source and proprietary foundation models with enterprise-grade quality and reliability.
Managed API, Serverless Inference, Marketplace Subscription, Dedicated Instances, Fine-tuning endpoints
Anthropic, Meta, Mistral, AI21 Labs, CAMB.AI
Google Vertex AI Model Garden provides a centralized repository for discovering and deploying both Google's own models (Gemini, PaLM) and third-party foundation models. It is designed for enterprise-grade ML development with deep integration into the Google Cloud ecosystem.
Managed API, Serverless Inference, Provisioned Throughput, Marketplace Subscription, Fine-tuning
Anthropic, Meta, Mistral, Cohere, AI21 Labs, Stability AI
Amazon Bedrock is AWS's fully managed service that offers a choice of high-performing foundation models from leading AI companies via a single API. It integrates tightly with other AWS services like SageMaker and provides features for model evaluation, guardrails, and knowledge bases.
Managed API, Serverless Inference, Fine-tuning endpoints
OpenAI, Anthropic, Meta, Mistral AI, DeepSeek, Google, Reka
Snowflake Cortex is a managed service that provides instant access to foundation models and LLM-based functions within the Snowflake Data Cloud. It enables users to perform complex AI tasks on their governed data without moving it out of the Snowflake security perimeter.
Managed API, Serverless Inference, Dedicated Instances, Fine-tuning endpoints
Meta, Mistral, DeepSeek, Anthropic, OpenAI
IBM watsonx.ai is an enterprise studio that enables developers to train, validate, tune, and deploy both IBM's Granite models and third-party foundation models. It emphasizes AI governance and ethical AI across the model lifecycle.
Managed API, On-demand Inference, Dedicated Instances, Fine-tuning endpoints
Cohere, Google, Meta, OpenAI, xAI
Oracle Cloud Infrastructure (OCI) Generative AI is a fully managed service that provides a set of state-of-the-art foundation models for various use cases. It offers on-demand and dedicated AI cluster options for serving third-party models from vendors like Cohere and Meta.
Managed API, Serverless Inference, Tokens-as-a-Service
Meta, Mistral AI, Google, DeepSeek, xAI, Alibaba, SambaNova Systems (unconfirmed)
GroqCloud is a high-speed AI inference platform powered by Groq's Language Processing Units (LPUs). It provides serverless API access to leading open-source foundation models from vendors like Meta and Mistral, emphasizing low latency and cost-effectiveness for real-time enterprise AI applications.
Marketplace Subscription, Managed API, Embedded AI Services
OpenAI, Anthropic, Google, Meta
Salesforce Einstein Trust Layer (powering Agentforce) is an enterprise AI gateway that provides secure access to third-party foundation models within the Salesforce ecosystem. It features data masking, toxicity detection, and audit trails to ensure compliance and privacy for generative AI applications.
Managed API, Marketplace Subscription, Orchestration Framework
OpenAI, Anthropic, Google, Meta, Mistral AI
SAP AI Foundation (including SAP AI Core) is a central hub for managing and orchestrating foundation models within the SAP Business Technology Platform (BTP). It enables businesses to access and integrate third-party LLMs into their business processes while ensuring governance and enterprise readiness.
Managed API, NIM Microservices (Containers), GPU Instances
Meta, Mistral AI, Google, DeepSeek, MiniMax, Zhipu AI (GLM), Moonshot AI (Kimi)
NVIDIA AI Foundation Models (accessed via NVIDIA NIM) is a collection of over 80 community and NVIDIA-built models optimized for performance on NVIDIA infrastructure. It provides a standardized API for enterprises to discover and deploy models in the cloud or on-premises using NIM microservices.
Managed API, Serverless Inference, Fine-tuning endpoints, Marketplace Subscription
Meta, Mistral AI, Google (Gemma), Stability AI, DeepSeek, Alibaba (Qwen)
Fireworks AI is a high-performance inference platform that provides low-latency access to the latest open-source foundation models. It offers an enterprise-grade API with support for fine-tuning and serverless deployment, and is increasingly available as a third-party marketplace offering on major clouds like Azure.
Managed API, Serverless Inference
Meta, Mistral AI, DeepSeek, Google (Gemma), Zhipu AI (GLM), Moonshot AI (Kimi), MiniMax, Stability AI (SDXL)
DeepInfra is an AI inference provider focusing on cost-effective and scalable access to over 100 open-source foundation models. It is positioned as a 'budget champion' with a broad catalog of the latest models, though it primarily offers inference without advanced enterprise governance or fine-tuning.
Managed API, Serverless Inference, Ray Serve Endpoints, Private Endpoints (VPC)
Meta (Llama), Mistral AI (Mistral/Mixtral), Hugging Face (Zephyr)
Anyscale Endpoints is an AI model serving platform from the creators of Ray, offering cost-effective and scalable API access to popular open-source foundation models. It is designed for production-scale AI workloads, providing both public and private endpoints with deep integration into the Ray distributed computing ecosystem.
Managed API, Cerebras Cloud REST API, Marketplace Subscription (AWS Bedrock), Cerebras AI Model Studio
Meta (Llama), Mistral AI (Mistral), Zhipu AI (GLM), Amazon (Nova)
Cerebras Inference (available via Cerebras Cloud and AWS Bedrock) is a high-speed AI inference platform powered by Cerebras Wafer-Scale Engine (WSE) chips. It provides extremely low-latency access to leading open-source foundation models like Llama and GLM, aimed at enterprises requiring real-time performance at production scale.
Managed API, Serverless Inference, Dedicated Instances, Fine-tuning endpoints
Meta (Llama), Mistral AI, Alibaba (Qwen), DeepSeek, Google (Gemma), Databricks (DBRX)
Together AI is a cloud platform optimized for open-source foundation models, offering over 200 models for text, image, and video. It provides serverless inference and dedicated GPU clusters for both research and enterprise production workloads.
Managed API, Serverless Inference, Fine-tuning endpoints
Black Forest Labs, Meta, Stability AI, Mistral AI, Google
Replicate (acquired by Cloudflare in 2026) provides a cloud API for running and fine-tuning over 50,000 open-source and community models. It is known for its simplicity and 'one line of code' deployment, now deeply integrated into the Cloudflare Workers AI ecosystem.
Managed API, Serverless Inference, Compute Orchestration Engine
Microsoft (Phi), Meta (Llama), Mistral AI, Hugging Face community models
BentoCloud is a unified AI inference management platform that allows teams to deploy and scale any machine learning model as a production-ready API. It features an open model catalog and emphasizes efficiency with optimized model loading and sub-second cold starts.
Managed API, Serverless Inference, Exclusive Partner Endpoints
ByteDance (Kling, Seedance), Alibaba (WAN), Black Forest Labs (Flux), Google (Imagen)
WaveSpeed AI is a specialized cloud platform for visual AI, offering exclusive international API access to ByteDance's flagship Kling and Seedance video/image models. It hosts over 600 visual foundation models with an emphasis on high-performance inference, zero cold starts, and early access to models from Asian AI leaders.
Managed API (NemoClaw), Serverless Inference, Dedicated GPU Cloud
NVIDIA (Nemotron, Dynamo), Meta (Llama), Mistral AI
Vultr AI Model Stack is a specialized AI-native cloud infrastructure optimized for production-scale inference. It features the 'NemoClaw' agentic framework and provides integrated access to NVIDIA's Nemotron model family and other leading open-source LLMs on a globally distributed GPU stack.
Serverless Inference, Agent Development Kit, Managed API
OpenAI (Open Weights), Anthropic, NVIDIA, Meta (Llama), Mistral AI, Google (Gemma)
DigitalOcean AI Platform is an AI-native cloud service providing access to over 70 open-source and frontier models via a centralized Model Catalog. It emphasizes day-zero access to new releases, intelligent model routing, and serverless inference for developers and growing businesses.
Serverless Inference (Model Library), Managed API, Web Endpoints (Webhooks)
Z.ai (GLM), Meta (Llama), Mistral AI, OpenAI (Whisper)
Modal is a serverless high-performance infrastructure platform that enables developers to serve AI models with minimal configuration. It features a curated Model Library and optimized runtimes for low-latency inference, supporting sub-second cold starts and instant autoscaling for diverse foundation models.
Managed API (Sonar), Agent API (Third-party Orchestration), Enterprise Max Subscription
OpenAI, Anthropic, Meta (Llama)
Perplexity Enterprise is an AI-powered research and orchestration platform that provides secure access to leading foundation models through its Agent API. It uniquely combines LLM reasoning with real-time web search, allowing businesses to build research-intensive applications and agents using a variety of first-party Sonar models and third-party presets.
Managed API, Serverless Inference (Turbo LoRA), Fine-tuning endpoints
Meta (Llama), Mistral AI, Google (Gemma), Microsoft (Phi), Alibaba (Qwen), DeepSeek
Predibase is a developer-focused platform specialized in fine-tuning and serving small to medium-sized language models. It provides officially supported base models and a high-performance 'Turbo LoRA' inference engine for serving custom adapters at scale, now part of Rubrik's enterprise data pipeline.