The Diminishing Returns of Deploying Local Large Language Models

Over the past several weeks, our teams at Blankfactor have continued testing and deploying various open-source local Large Language Models (LLMs); you can find quite a few of them on Hugging Face. Despite the additional privacy and control provided by local models, I've become less enamored with their run-time price-performance compared to LLMs from major providers like Anthropic, Google, OpenAI, Microsoft, and others.
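
As a point of comparison, here is roughly what "deploying" one of those open-source models looks like, sketched with the Hugging Face transformers library. The checkpoint named below is just an illustrative example; any open-source model from the Hub could stand in.

```python
# Minimal sketch of local deployment via Hugging Face transformers.
# The model name is illustrative; substitute any open-source checkpoint.
from transformers import pipeline

# Downloading the weights and loading them onto local hardware is where
# the GPU and memory cost of local deployment shows up.
generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",  # example checkpoint
    device_map="auto",  # place layers on whatever hardware is available
)

result = generator(
    "Summarize the trade-offs of running LLMs locally:",
    max_new_tokens=100,
)
print(result[0]["generated_text"])
```

Getting from this snippet to production quality (quantization, batching, serving, monitoring) is exactly the overhead discussed below.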

Recent innovations from the major players demonstrate remarkable integrations and capabilities out of the box. With a bit of programming, frameworks already developed by LangChain and LlamaIndex, and simple API access, these models can fulfill many natural language processing needs on your content without the overhead of deploying and maintaining a localized model.
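
For contrast, here is a hedged sketch of that "simple API access" route, using the OpenAI Python client; the model name and prompt are illustrative, and the same pattern applies to the other providers' SDKs.

```python
# The hosted route: one HTTPS call, no model weights or GPUs to manage.
# Model name and prompt are illustrative examples.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example hosted model
    messages=[
        {
            "role": "user",
            "content": "Extract the key obligations from this contract clause: ...",
        },
    ],
)
print(response.choices[0].message.content)
```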

Unless you have abundant on-premises compute and memory resources (the GPU crunch is real), the expense and effort required to get local LLMs production-ready likely outweigh the privacy benefits. The time spent collecting sufficient training data, iterating on model configurations, monitoring for bias, and maintaining infrastructure is non-trivial, and so is the cost.

Relying on a provider's LLM does introduce some dependence on their technology. However, the same could be said of any cloud infrastructure dependency, and concerns about vendor lock-in can be mitigated by building adaptable data pipelines and APIs on top of the LLM.
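
To make that mitigation concrete, one possible shape for such an abstraction is a thin internal interface that application code depends on, so the provider behind it can be swapped. The names here (LLMClient, complete, summarize) are hypothetical, not from any particular framework.

```python
# Hypothetical sketch of a provider-agnostic LLM interface to soften
# vendor lock-in. All names here are illustrative, not a real library.
from typing import Protocol


class LLMClient(Protocol):
    def complete(self, prompt: str) -> str: ...


class OpenAIBackedClient:
    """One concrete backend; another provider could implement the same shape."""

    def __init__(self) -> None:
        from openai import OpenAI
        self._client = OpenAI()

    def complete(self, prompt: str) -> str:
        response = self._client.chat.completions.create(
            model="gpt-4o-mini",  # example model
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content or ""


def summarize(doc: str, llm: LLMClient) -> str:
    # Application code depends only on the interface, not on the vendor.
    return llm.complete(f"Summarize the following document:\n\n{doc}")
```

Swapping providers then becomes a matter of writing one new adapter class rather than rewriting the pipeline.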

My strong opinion (loosely held): For many use cases, leveraging a major provider's LLM as a service will deliver better performance per dollar spent compared to a locally deployed model. The innovations and economies of scale from companies dedicated to developing LLMs full-time are hard to match on your own. Unless regulatory compliance or privacy concerns mandate it, a local LLM may no longer provide the best return on investment.

Let me know if you have any other thoughts on the trade-offs of local vs. large provider LLMs! I'm interested to hear your perspectives.

Roman Naumenko

Capital Markets and Software

10mo

It’s going to cost when you showcase the product built on a local model they spent millions “training” and executives ask to run the same prompt in ChatGPT.

Sai Sekhar P

Chief Technologist | Strategy | Innovation | Transformation

10mo

In my opinion, it would make sense to deploy localized LLMs for specialized capabilities: for example, scientific research, deriving decisions and artifacts for the design and launch of a satellite, or fabricating highly specialized engineering equipment. The efficiency of localized LLMs for these needs would offset the burden imposed by infrastructure compute, and these areas would likely already have everything needed to be transformed into localized LLMs.

George A. Pace, MBA

Founder and CEO of Keep Pace Technology

10mo

Hey Tom - A lot of this is "point in time" (this is all still early stuff), especially since the providers are still ramping up their LLM capabilities (Microsoft, for example). Even with that, from a size/compute/create perspective, I have yet to see a use case where running an LLM locally makes any sense. Now with that said, I have questions "outside" of "standalone" LLMs and the employee experience. My guess is that an organization is going to interact with several LLMs (CoPilot Office, Salesforce Einstein, Bard, OpenAI, etc.). There is a Hugging Face whitepaper that talks about the need for an LLM of LLMs: basically one primary LLM that others plug into. But even in that case, I don't see that running locally. What I DO think we will see is more AI capabilities being driven to the edge (to address latency): https://siliconangle.com/2023/08/27/cloud-giants-eye-potential-windfall-ai-network-edge/
