
LLM Selection Guide: Providers vs Self-Hosting

Choosing a Large Language Model (LLM) for an AI project is not a one-size-fits-all decision. With the number of options available today, the decision-making process can be daunting. This implementation-neutral guide provides insight into the advantages and potential pitfalls of each approach.

Whether you're looking to quickly implement a solution for a specific application or seeking to build an LLM offering that aligns with your organization's needs, the questions below will steer you in the right direction. By the end, you'll be better equipped to evaluate the needs of your project and decide which LLM to use.

About LLM Providers and Self-Hosting

By far the biggest distinction between LLM offerings is where and how they are hosted.

LLMs can be obtained through a provider, an online service that may bundle additional capabilities and services around the LLM itself. Providers operate on a software-as-a-service (SaaS) business model, not unlike more conventional cloud and data services. Provider offerings can be a fast path to gaining access to LLMs, enabling developers to start experimenting and integrating AI capabilities into their projects without the effort of setting up and managing infrastructure.
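
As a minimal illustration, the sketch below calls a provider-hosted model through the OpenAI Python SDK; the model name is a placeholder, and the same pattern applies to other providers' SDKs.

```python
# Minimal sketch of calling a provider-hosted LLM.
# Assumes: `pip install openai` and OPENAI_API_KEY set in the environment.
# The model name is a placeholder; substitute your provider's model.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize our Q3 report in one sentence."}],
)
print(response.choices[0].message.content)
```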

Self-hosting a Large Language Model involves running the model on infrastructure within your own environment. Because data stays within that environment, security and privacy can be less of a concern than with a provider. The models themselves can be obtained online, but must be hosted and managed on suitable infrastructure, either on-premises or in the cloud.
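
For comparison, here is a minimal self-hosting sketch using the Hugging Face transformers library; the small gpt2 model is just a stand-in for whatever model fits your hardware.

```python
# Minimal sketch of running a model on your own infrastructure.
# Assumes: `pip install transformers torch` and enough RAM/VRAM for the model.
# The model name is a placeholder; pick one sized for your hardware.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # small stand-in model
output = generator("The key benefit of self-hosting is", max_new_tokens=40)
print(output[0]["generated_text"])
```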

Provider-only Questions

The following only applies to provider-based LLM solutions.

Data Security and Governance

  • Review the terms of service; check whether customer data is excluded from training the provider's products.
  • Which region is the service hosted in, and can it be selected based on company requirements?
  • Is the data sent through the provider considered low risk? If not, consider redacting sensitive fields before they leave your network (see the sketch after this list).
  • Healthcare or customer data may require extra controls: does the provider have suitable security controls in place? A newer provider may not yet have achieved SOC 2 or other compliance certifications.
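
As one illustrative control for the low-risk question above, data can be passed through a redaction step before it leaves your network. The patterns below are placeholders, not a complete PII solution:

```python
# Minimal sketch: redact obvious identifiers before sending text to a provider.
# The patterns are illustrative only; real deployments need a vetted
# PII-detection service and a documented data-handling policy.
import re

REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "[CARD]"),
]

def redact(text: str) -> str:
    for pattern, token in REDACTIONS:
        text = pattern.sub(token, text)
    return text

print(redact("Contact jane.doe@example.com, SSN 123-45-6789."))
# -> "Contact [EMAIL], SSN [SSN]."
```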

Customization

  • Customizing models, also known as fine-tuning, can be nearly effortless, as many providers offer code-free solutions.
  • An important part of fine-tuning is the data. The provider will require access to this data as part of the process, so having company data approved for training early avoids delays (a data-preparation sketch follows this list).
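
As a hypothetical illustration of that data preparation, the sketch below writes approved Q&A pairs into chat-formatted JSONL, a format several providers accept for fine-tuning; check your provider's documentation for the exact schema.

```python
# Sketch: convert approved company Q&A pairs into chat-formatted JSONL.
# The example pairs are placeholders for your approved training data.
import json

examples = [
    ("What is our refund window?", "Refunds are accepted within 30 days."),
    ("Do you ship internationally?", "Yes, to most countries via courier."),
]

with open("train.jsonl", "w") as f:
    for question, answer in examples:
        record = {"messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]}
        f.write(json.dumps(record) + "\n")
```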

Availability and Scalability

  • Providers may set API rate limits, which can cause failures if exceeded. Forecast your usage to avoid surprises, and build in retries (see the backoff sketch after this list).
  • Dedicated instances may be more cost-effective if forecast usage is high, and request response times may also be faster.
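
A common mitigation for rate limits is retrying with exponential backoff. A minimal sketch, where call_llm and RateLimitError stand in for your SDK's call and rate-limit exception:

```python
# Sketch: retry a provider call with exponential backoff on rate limits.
# Swap RateLimitError for your SDK's actual rate-limit exception (HTTP 429).
import random
import time

class RateLimitError(Exception):
    """Stand-in for the SDK's rate-limit exception."""

def call_with_backoff(call_llm, prompt, max_retries=5, base_delay=1.0):
    for attempt in range(max_retries):
        try:
            return call_llm(prompt)
        except RateLimitError:
            # Exponential backoff with jitter: ~1s, 2s, 4s, ... plus noise.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError("Rate limit persisted after retries")
```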

Trust and Safety

  • Does the provider offer content filtering for inappropriate language and topics?
  • Has the provider reviewed prompt safety over time, and are reports available?

Self-Hosted-only Questions

The following only applies to self-hosted solutions.

Initial Investment

  • Setting up infrastructure for LLMs can be costly, especially without affordable access to high-end hardware. This is a major consideration for projects with budget constraints.
  • Production LLMs can be complex, necessitating an operational team to manage both the software and the infrastructure.

Choosing the Right Hardware

  • CPU, GPU, TPU, or inference cards: each option has different cost and performance trade-offs, and research is key to selecting the right one for your use case.
  • Decide between cloud-managed products and self-managed instances.
  • Consider the ability to scale the hardware behind the LLM.

Availability and Scalability

  • Self-hosted solutions may not offer the same level of scalability as cloud-based services, impacting the availability of applications.
  • Setting up and maintaining servers can be challenging and may limit scalability.
  • Consider the size of the LLM used, as it affects whether performance and response-time goals can be met.

Customization, Privacy and Security

  • Fine-tuning on data can be completed without involving third parties.
  • Existing data agreements may not require changes.
  • Hosting models in your environment keeps data from leaving your network, enhancing data privacy.
  • Implement a token-based solution for access control and authentication (see the sketch after this list).
  • The self-hosting approach is especially beneficial for regulated industries.
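
As a sketch of the token-based access control mentioned above, a small FastAPI gate in front of a self-hosted model might look like the following; the token set and canned completion are placeholders for a real secret store and inference call.

```python
# Sketch: a bearer-token gate in front of a self-hosted model endpoint.
# Assumes `pip install fastapi uvicorn`.
from fastapi import FastAPI, Header, HTTPException
from pydantic import BaseModel

app = FastAPI()
VALID_TOKENS = {"example-token"}  # placeholder: load from a secret store

class Request(BaseModel):
    prompt: str

@app.post("/generate")
def generate(req: Request, authorization: str = Header(default="")):
    token = authorization.removeprefix("Bearer ").strip()
    if token not in VALID_TOKENS:
        raise HTTPException(status_code=401, detail="Invalid token")
    return {"completion": f"(model output for: {req.prompt})"}  # placeholder
```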

Trust and Safety

  • Create a strong system prompt to set boundaries for the LLM.
  • Ensure you have the resources to validate factual accuracy and test safety.
  • Set up filters for offensive language and topics (a minimal filtering sketch follows this list).
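
A minimal filtering sketch, combining a boundary-setting system prompt with a blunt keyword blocklist; the blocklist entries are placeholders, and production systems typically add a trained moderation model:

```python
# Sketch: system-prompt boundaries plus a simple keyword filter.
# Blocklist entries are placeholders; keyword matching alone is not enough.
SYSTEM_PROMPT = (
    "You are a support assistant. Decline requests that are off-topic, "
    "unsafe, or ask for personal data."
)
BLOCKLIST = {"slur_example", "exploit_example"}  # placeholders

def is_allowed(text: str) -> bool:
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)

user_input = "How do I reset my password?"
if is_allowed(user_input):
    messages = [{"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": user_input}]
    # ... send `messages` to the model ...
```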

Additional considerations

The following apply to both provider-based and self-hosted LLM solutions.

Cost Analysis

The cost of LLMs can be difficult to quantify, but it is a worthwhile concern for both short- and long-term use of the product. A useful tool for gauging the cost of some provider implementations is LLM Price: https://llm-price.com/
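
As a back-of-envelope illustration, per-token provider pricing can be projected with simple arithmetic; every number below is a placeholder to be replaced with your provider's price sheet and your own traffic forecast.

```python
# Sketch: rough monthly cost for a provider API priced per token.
# All numbers are illustrative placeholders.
requests_per_day = 10_000
avg_input_tokens, avg_output_tokens = 500, 200
price_in_per_1k, price_out_per_1k = 0.0005, 0.0015  # USD per 1K tokens

daily = requests_per_day * (
    avg_input_tokens / 1000 * price_in_per_1k
    + avg_output_tokens / 1000 * price_out_per_1k
)
print(f"~${daily * 30:,.2f} per month")  # ~$165.00 with these placeholders
```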

Case Studies

LLM case studies are a great way to gather more information to help drive decision-making. They typically feature real-world examples of companies that have successfully implemented both provider and self-hosting solutions.

For example, an analysis of four such studies was featured on Carnegie Mellon's Software Engineering Institute blog back in 2023. It contains some insights about ChatGPT 3.5 and about the studies themselves.

Performance Benchmarks

Look into detailed performance benchmarks for your models. Consider directly comparing provider-hosted LLMs with self-hosted ones. Important metrics include latency, throughput, and scalability.

One such approach was conducted by LMSys Org, which used an Elo rating system. We wrote about this back in May of this year: Has Anthropic Surpassed OpenAI?
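
For your own measurements, a small harness like the sketch below can time any backend behind a common interface, so provider and self-hosted options are compared on equal terms; `backend` is a hypothetical prompt-in, completion-out callable.

```python
# Sketch: measure end-to-end latency for any callable LLM backend.
import statistics
import time

def measure_latency(backend, prompt: str, runs: int = 10) -> dict:
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        backend(prompt)
        samples.append(time.perf_counter() - start)
    return {"mean_s": statistics.mean(samples),
            "p95_s": sorted(samples)[int(0.95 * (runs - 1))]}

# Example with a dummy backend standing in for a real call:
print(measure_latency(lambda p: "ok", "Hello"))
```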

Integration and Compatibility

Consider how LLMs integrate with existing systems and workflows, for both provider-based and self-hosted scenarios. Look for documented compatibility issues and solutions for integrating LLMs with popular platforms and tools.

LlamaIndex maintains one such compatibility matrix covering a few popular paid LLM APIs and some popular tools: LLM compatibility tracking
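
One way to keep integration flexible is a thin abstraction so application code doesn't depend on where the model is hosted. A minimal sketch, where both backend classes are illustrative placeholders:

```python
# Sketch: a thin interface so application code doesn't care whether the
# model is provider-hosted or self-hosted.
from typing import Protocol

class LLMBackend(Protocol):
    def complete(self, prompt: str) -> str: ...

class ProviderBackend:
    def complete(self, prompt: str) -> str:
        return "(call provider SDK here)"  # placeholder

class SelfHostedBackend:
    def complete(self, prompt: str) -> str:
        return "(call local inference server here)"  # placeholder

def answer(backend: LLMBackend, question: str) -> str:
    return backend.complete(question)

print(answer(ProviderBackend(), "Hello"))
```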

Security Considerations

LLMs have access to, or are themselves, data that needs to be secured. Compare the known risks for the models in scope, and the different security postures of provider and self-hosted offerings.

OWASP has distilled a lot of general guidance around security and LLMs into a useful document: OWASP Top 10 for LLM applications

Regulatory Compliance

Depending on your business, you may be bound by special legal and industry regulations. Understanding how an LLM-based solution supports or complicates that compliance is crucial to success. This is most visible with self-hosted solutions, but can apply to provider-based LLMs as well.

LLMs are a big part of the fast-moving AI space, with new ideas and technologies emerging all the time. Projecting where your solution will be in the years to come is challenging, but worth attempting; that projection may in turn affect the decision to self-host or to use a provider-based solution.

Innovations around LLMs can also alter how a given product performs. Concepts like federated learning and edge computing can dramatically change an LLM's performance and cost, and they set the stage for what comes next.

Conclusion

Between the phases of design, build, compliance, cybersecurity, and scaling to production operation, there are many considerations that most teams will be facing for the first time.

As a team of professionals, we help companies deliver successful solutions. If you're setting up a project or would like more information like this, feel free to contact us today.
