
LLM Selection Guide: Providers vs Self-Hosting

Choosing a Large Language Model (LLM) for an AI project is not a one-size-fits-all decision. With the number of options available today, the decision-making process can be daunting. This implementation-neutral guide provides insight into the advantages and potential pitfalls of each approach.

Whether you're looking to quickly implement a solution for a specific application or seeking to build an LLM offering that aligns with your organization's needs, the questions below will steer you in the right direction. By the end, you'll be better equipped to evaluate the needs of your project and decide which LLM to use.

About LLM Providers and Self-Hosting

By far the biggest distinction between LLM offerings is where and how they are hosted.

LLMs can be obtained through a provider, an online service that may bundle additional capabilities and services around the LLM itself. Providers operate on a software-as-a-service (SaaS) business model, not unlike more conventional cloud and data services. Provider offerings can be a fast path to gaining access to LLMs, enabling developers to start experimenting and integrating AI capabilities into their projects without the effort of setting up and managing infrastructure.
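
As a minimal illustration, the sketch below calls a provider-hosted model through the OpenAI Python SDK; the model name is a placeholder, and the same pattern applies to other providers' SDKs.

```python
# Minimal sketch of calling a provider-hosted LLM.
# Assumes: `pip install openai` and OPENAI_API_KEY set in the environment.
# The model name is a placeholder; substitute your provider's model.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize our Q3 report in one sentence."}],
)
print(response.choices[0].message.content)
```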

Self-hosting a Large Language Model involves running the model on infrastructure within your own environment. Because data stays within that environment, security and privacy can be less of a concern than with a provider. The models themselves can be obtained online, but must be hosted and managed on suitable infrastructure, either on-premises or in the cloud.
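
For comparison, here is a minimal self-hosting sketch using the Hugging Face transformers library; the small gpt2 model is just a stand-in for whatever model fits your hardware.

```python
# Minimal sketch of running a model on your own infrastructure.
# Assumes: `pip install transformers torch` and enough RAM/VRAM for the model.
# The model name is a placeholder; pick one sized for your hardware.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # small stand-in model
output = generator("The key benefit of self-hosting is", max_new_tokens=40)
print(output[0]["generated_text"])
```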

Provider-only Questions

The following only applies to provider-based LLM solutions.

Data Security and Governance

  • Review the terms of service; check whether customer data is excluded from training the provider's products.
  • Which region is the service hosted in, and can it be selected based on company requirements?
  • Is the data sent through the provider considered low risk? If not, consider redacting sensitive fields before they leave your network (see the sketch after this list).
  • Healthcare or customer data may require extra controls: does the provider have suitable security controls in place? A newer provider may not yet have achieved SOC 2 or other compliance certifications.
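
As one illustrative control for the low-risk question above, data can be passed through a redaction step before it leaves your network. The patterns below are placeholders, not a complete PII solution:

```python
# Minimal sketch: redact obvious identifiers before sending text to a provider.
# The patterns are illustrative only; real deployments need a vetted
# PII-detection service and a documented data-handling policy.
import re

REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "[CARD]"),
]

def redact(text: str) -> str:
    for pattern, token in REDACTIONS:
        text = pattern.sub(token, text)
    return text

print(redact("Contact jane.doe@example.com, SSN 123-45-6789."))
# -> "Contact [EMAIL], SSN [SSN]."
```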

Customization

  • Customizing models, also known as fine-tuning, can be nearly effortless, as many providers offer code-free solutions.
  • An important part of fine-tuning is the data. The provider will require access to this data as part of the process, so having company data approved for training early avoids delays (a data-preparation sketch follows this list).
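
As a hypothetical illustration of that data preparation, the sketch below writes approved Q&A pairs into chat-formatted JSONL, a format several providers accept for fine-tuning; check your provider's documentation for the exact schema.

```python
# Sketch: convert approved company Q&A pairs into chat-formatted JSONL.
# The example pairs are placeholders for your approved training data.
import json

examples = [
    ("What is our refund window?", "Refunds are accepted within 30 days."),
    ("Do you ship internationally?", "Yes, to most countries via courier."),
]

with open("train.jsonl", "w") as f:
    for question, answer in examples:
        record = {"messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]}
        f.write(json.dumps(record) + "\n")
```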

Availability and Scalability

  • Providers may set API rate limits, which can cause failures if exceeded. Forecast your usage to avoid surprises, and build in retries (see the backoff sketch after this list).
  • Dedicated instances may be more cost-effective if forecast usage is high, and request response times may also be faster.
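
A common mitigation for rate limits is retrying with exponential backoff. A minimal sketch, where call_llm and RateLimitError stand in for your SDK's call and rate-limit exception:

```python
# Sketch: retry a provider call with exponential backoff on rate limits.
# Swap RateLimitError for your SDK's actual rate-limit exception (HTTP 429).
import random
import time

class RateLimitError(Exception):
    """Stand-in for the SDK's rate-limit exception."""

def call_with_backoff(call_llm, prompt, max_retries=5, base_delay=1.0):
    for attempt in range(max_retries):
        try:
            return call_llm(prompt)
        except RateLimitError:
            # Exponential backoff with jitter: ~1s, 2s, 4s, ... plus noise.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError("Rate limit persisted after retries")
```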

Trust and Safety

  • Does the provider offer content filtering for inappropriate language and topics?
  • Has the provider reviewed prompt safety over time, and are reports available?

Self-Hosted-only Questions

The following only applies to self-hosted solutions.

Initial Investment

  • Setting up infrastructure for LLMs can be costly, especially without affordable access to high-end hardware. This is a major consideration for projects with budget constraints.
  • Production LLMs can be complex, necessitating an operational team to manage both the software and the infrastructure.

Choosing the Right Hardware

  • CPU, GPU, TPU, or inference cards: each option has different cost and performance trade-offs, and research is key to selecting the right one for your use case.
  • Decide between cloud-managed products and self-managed instances.
  • Consider the ability to scale the hardware behind the LLM.

Availability and Scalability

  • Self-hosted solutions may not offer the same level of scalability as cloud-based services, impacting the availability of applications.
  • Setting up and maintaining servers can be challenging and may limit scalability.
  • Consider the size of the LLM used, as it affects whether performance and response-time goals can be met.

Customization, Privacy and Security

  • Fine-tuning on data can be completed without involving third parties.
  • Existing data agreements may not require changes.
  • Hosting models in your environment keeps data from leaving your network, enhancing data privacy.
  • Implement a token-based solution for access control and authentication (see the sketch after this list).
  • The self-hosting approach is especially beneficial for regulated industries.
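
As a sketch of the token-based access control mentioned above, a small FastAPI gate in front of a self-hosted model might look like the following; the token set and canned completion are placeholders for a real secret store and inference call.

```python
# Sketch: a bearer-token gate in front of a self-hosted model endpoint.
# Assumes `pip install fastapi uvicorn`.
from fastapi import FastAPI, Header, HTTPException
from pydantic import BaseModel

app = FastAPI()
VALID_TOKENS = {"example-token"}  # placeholder: load from a secret store

class Request(BaseModel):
    prompt: str

@app.post("/generate")
def generate(req: Request, authorization: str = Header(default="")):
    token = authorization.removeprefix("Bearer ").strip()
    if token not in VALID_TOKENS:
        raise HTTPException(status_code=401, detail="Invalid token")
    return {"completion": f"(model output for: {req.prompt})"}  # placeholder
```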

Trust and Safety

  • Create a strong system prompt to set boundaries for the LLM.
  • Ensure you have the resources to validate factual accuracy and test safety.
  • Set up filters for offensive language and topics (a minimal filtering sketch follows this list).
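
A minimal filtering sketch, combining a boundary-setting system prompt with a blunt keyword blocklist; the blocklist entries are placeholders, and production systems typically add a trained moderation model:

```python
# Sketch: system-prompt boundaries plus a simple keyword filter.
# Blocklist entries are placeholders; keyword matching alone is not enough.
SYSTEM_PROMPT = (
    "You are a support assistant. Decline requests that are off-topic, "
    "unsafe, or ask for personal data."
)
BLOCKLIST = {"slur_example", "exploit_example"}  # placeholders

def is_allowed(text: str) -> bool:
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)

user_input = "How do I reset my password?"
if is_allowed(user_input):
    messages = [{"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": user_input}]
    # ... send `messages` to the model ...
```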

Additional considerations

The following apply to both provider-based and self-hosted LLM solutions.

Cost Analysis

The cost of LLMs can be difficult to quantify, but it is a worthwhile concern for both short- and long-term use of the product. A useful tool for gauging the cost of some provider implementations is LLM Price: https://llm-price.com/
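
As a back-of-envelope illustration, per-token provider pricing can be projected with simple arithmetic; every number below is a placeholder to be replaced with your provider's price sheet and your own traffic forecast.

```python
# Sketch: rough monthly cost for a provider API priced per token.
# All numbers are illustrative placeholders.
requests_per_day = 10_000
avg_input_tokens, avg_output_tokens = 500, 200
price_in_per_1k, price_out_per_1k = 0.0005, 0.0015  # USD per 1K tokens

daily = requests_per_day * (
    avg_input_tokens / 1000 * price_in_per_1k
    + avg_output_tokens / 1000 * price_out_per_1k
)
print(f"~${daily * 30:,.2f} per month")  # ~$165.00 with these placeholders
```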

Case Studies

LLM case studies are a great way to gather more information to help drive decision-making. They typically feature real-world examples of companies that have successfully implemented both provider and self-hosting solutions.

For example, an analysis of four such studies was featured on Carnegie Mellon's Software Engineering Institute blog back in 2023. It contains some insights about ChatGPT 3.5 and about the studies themselves.

Performance Benchmarks

Look into detailed performance benchmarks for your models. Consider directly comparing provider-hosted LLMs with self-hosted ones. Important metrics include latency, throughput, and scalability.

One such approach was conducted by LMSys Org, which used an Elo rating system. We wrote about this back in May of this year: Has Anthropic Surpassed OpenAI?
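
For your own measurements, a small harness like the sketch below can time any backend behind a common interface, so provider and self-hosted options are compared on equal terms; `backend` is a hypothetical prompt-in, completion-out callable.

```python
# Sketch: measure end-to-end latency for any callable LLM backend.
import statistics
import time

def measure_latency(backend, prompt: str, runs: int = 10) -> dict:
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        backend(prompt)
        samples.append(time.perf_counter() - start)
    return {"mean_s": statistics.mean(samples),
            "p95_s": sorted(samples)[int(0.95 * (runs - 1))]}

# Example with a dummy backend standing in for a real call:
print(measure_latency(lambda p: "ok", "Hello"))
```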

Integration and Compatibility

Consider how LLMs integrate with existing systems and workflows, for both provider-based and self-hosted scenarios. Look for documented compatibility issues and solutions for integrating LLMs with popular platforms and tools.

LlamaIndex maintains one such compatibility matrix covering a few popular paid LLM APIs and some popular tools: LLM compatibility tracking
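
One way to keep integration flexible is a thin abstraction so application code doesn't depend on where the model is hosted. A minimal sketch, where both backend classes are illustrative placeholders:

```python
# Sketch: a thin interface so application code doesn't care whether the
# model is provider-hosted or self-hosted.
from typing import Protocol

class LLMBackend(Protocol):
    def complete(self, prompt: str) -> str: ...

class ProviderBackend:
    def complete(self, prompt: str) -> str:
        return "(call provider SDK here)"  # placeholder

class SelfHostedBackend:
    def complete(self, prompt: str) -> str:
        return "(call local inference server here)"  # placeholder

def answer(backend: LLMBackend, question: str) -> str:
    return backend.complete(question)

print(answer(ProviderBackend(), "Hello"))
```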

Security Considerations

LLMs have access to, or are themselves, data that needs to be secured. Compare the known risks for the models in scope, and the different security postures of provider and self-hosted offerings.

OWASP has distilled a lot of general guidance around security and LLMs into a useful document: OWASP Top 10 for LLM applications

Regulatory Compliance

Depending on your business, you may be bound by special legal and industry regulations. Understanding how an LLM-based solution supports or complicates that compliance is crucial to success. This is most visible with self-hosted solutions, but can apply to provider-based LLMs as well.

LLMs are a big part of the fast-moving AI space, with new ideas and technologies emerging all the time. Projecting where your solution will be in the years to come is challenging, but worth attempting; that projection may in turn affect the decision to self-host or to use a provider-based solution.

Innovations around LLMs can also alter how a given product performs. Concepts like federated learning and edge computing can dramatically change an LLM's performance and cost, and they set the stage for what comes next.

Conclusion

Between the phases of design, build, compliance, cybersecurity, and scaling to production operation, there are many considerations that most teams will be facing for the first time.

As a team of professionals, we help companies deliver successful solutions. If you're setting up a project or would like more information like this, feel free to contact us today.
