The Diminishing Returns of Deploying Local Large Language Models

Over the past several weeks, our teams at Blankfactor have continued testing and deploying various open-source local Large Language Models (LLMs); you can find quite a few of them on Hugging Face. Despite the additional privacy and control provided by local models, I've become less enamored with their run-time price-performance compared to LLMs from major providers like Anthropic, Google, OpenAI, Microsoft, and others.
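
As a point of comparison, here is roughly what "deploying" one of those open-source models looks like, sketched with the Hugging Face transformers library. The checkpoint named below is just an illustrative example; any open-source model from the Hub could stand in.

```python
# Minimal sketch of local deployment via Hugging Face transformers.
# The model name is illustrative; substitute any open-source checkpoint.
from transformers import pipeline

# Downloading the weights and loading them onto local hardware is where
# the GPU and memory cost of local deployment shows up.
generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",  # example checkpoint
    device_map="auto",  # place layers on whatever hardware is available
)

result = generator(
    "Summarize the trade-offs of running LLMs locally:",
    max_new_tokens=100,
)
print(result[0]["generated_text"])
```

Getting from this snippet to production quality (quantization, batching, serving, monitoring) is exactly the overhead discussed below.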

Recent innovations from the major players demonstrate remarkable integrations and capabilities out of the box. With a bit of programming, frameworks already developed by LangChain and LlamaIndex, and simple API access, these models can fulfill many natural language processing needs on your content without the overhead of deploying and maintaining a localized model.
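
For contrast, here is a hedged sketch of that "simple API access" route, using the OpenAI Python client; the model name and prompt are illustrative, and the same pattern applies to the other providers' SDKs.

```python
# The hosted route: one HTTPS call, no model weights or GPUs to manage.
# Model name and prompt are illustrative examples.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example hosted model
    messages=[
        {
            "role": "user",
            "content": "Extract the key obligations from this contract clause: ...",
        },
    ],
)
print(response.choices[0].message.content)
```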

Unless you have abundant on-premises compute and memory resources (the GPU crunch is real), the expense and effort required to get local LLMs production-ready likely outweigh the privacy benefits. The time spent collecting sufficient training data, iterating on model configurations, monitoring for bias, and maintaining infrastructure is non-trivial, and so is the cost.

Relying on a provider's LLM does introduce some dependence on their technology. However, the same could be said of any cloud infrastructure dependency, and concerns about vendor lock-in can be mitigated by building adaptable data pipelines and APIs on top of the LLM.
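
To make that mitigation concrete, one possible shape for such an abstraction is a thin internal interface that application code depends on, so the provider behind it can be swapped. The names here (LLMClient, complete, summarize) are hypothetical, not from any particular framework.

```python
# Hypothetical sketch of a provider-agnostic LLM interface to soften
# vendor lock-in. All names here are illustrative, not a real library.
from typing import Protocol


class LLMClient(Protocol):
    def complete(self, prompt: str) -> str: ...


class OpenAIBackedClient:
    """One concrete backend; another provider could implement the same shape."""

    def __init__(self) -> None:
        from openai import OpenAI
        self._client = OpenAI()

    def complete(self, prompt: str) -> str:
        response = self._client.chat.completions.create(
            model="gpt-4o-mini",  # example model
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content or ""


def summarize(doc: str, llm: LLMClient) -> str:
    # Application code depends only on the interface, not on the vendor.
    return llm.complete(f"Summarize the following document:\n\n{doc}")
```

Swapping providers then becomes a matter of writing one new adapter class rather than rewriting the pipeline.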

My strong opinion (loosely held): For many use cases, leveraging a major provider's LLM as a service will deliver better performance per dollar spent compared to a locally deployed model. The innovations and economies of scale from companies dedicated to developing LLMs full-time are hard to match on your own. Unless regulatory compliance or privacy concerns mandate it, a local LLM may no longer provide the best return on investment.

Let me know if you have any other thoughts on the trade-offs of local vs. large provider LLMs! I'm interested to hear your perspectives.

Roman Naumenko

Capital Markets and Software

10mo

It’s going to cost when you showcase the product built on a local model they spent millions “training” and executives ask to run the same prompt in ChatGPT.

Sai Sekhar P

Chief Technologist | Strategy | Innovation | Transformation

10mo

In my opinion, it would make sense to deploy localized LLMs for specialized capabilities: for example, scientific research, deriving decisions and artifacts for the design and launch of a satellite, or fabricating highly specialized engineering equipment. The efficiency of localized LLMs for these needs would offset the burden imposed by infrastructure compute, and these areas would likely already have everything needed to be transformed into localized LLMs.

George A. Pace, MBA

Founder and CEO of Keep Pace Technology

10mo

Hey Tom - A lot of this is "point in time" (this is all still early stuff), especially since the providers are still ramping up their LLM capabilities (Microsoft, for example). Even with that, from a size/compute/create perspective, I have yet to see a use case where running an LLM locally makes any sense. Now with that said, I have questions "outside" of "standalone" LLMs and the employee experience. My guess is that an organization is going to interact with several LLMs (CoPilot Office, Salesforce Einstein, Bard, OpenAI, etc.). There is a Hugging Face whitepaper that talks about the need for an LLM of LLMs: basically one primary LLM that others plug into. But even in that case, I don't see that running locally. What I DO think we will see is more AI capabilities being driven to the edge (to address latency): https://siliconangle.com/2023/08/27/cloud-giants-eye-potential-windfall-ai-network-edge/
