m@ksim.pro

Llama 2 and open weights: what it means for enterprise

A look at why larger organisations should pay attention to open-weight language models, and where the real boundary between opportunity and illusion lies.

On July 18, 2023, Meta released Llama 2, a language model whose weights are openly available and licensed for commercial use. By the standards of publicly available LLMs this is not a routine event: arguably for the first time, a model of this strength ships not only behind an API but as a set of weights you can deploy on your own infrastructure.

I hear two extreme reactions. The first: "now we will have our own ChatGPT". The second: "this is for researchers, we do not need it". Both are imprecise. Let me work through what actually matters here for organisations thinking seriously about AI.

Why open weights are a different class of offering

When you use GPT-4 through an API, you send data to OpenAI's servers. For many tasks that is fine. But there is a category of data - financial documents, HR records, internal communications, production specifications - that a company cannot or legally must not send outside its perimeter.

Open weights let you run the model inside your own infrastructure. Data goes nowhere. The model runs under your control. This is not an advantage for every task, but for some tasks it is the only acceptable option.
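To make the perimeter idea concrete, here is a minimal sketch of a routing rule that decides which backend may see a piece of data. The sensitivity levels, the backend names and the `choose_backend` function are all hypothetical, invented for illustration; a real organisation would take these from its own data classification policy.

```python
from enum import Enum


class Sensitivity(Enum):
    """Hypothetical data classification levels, lowest to highest."""
    PUBLIC = 1
    INTERNAL = 2
    RESTRICTED = 3


class Backend(Enum):
    """Hypothetical targets: a hosted API, or a model inside the perimeter."""
    EXTERNAL_API = "external_api"
    LOCAL_MODEL = "local_model"


def choose_backend(
    sensitivity: Sensitivity,
    max_external: Sensitivity = Sensitivity.PUBLIC,
) -> Backend:
    """Route data to the external API only if policy allows it to leave.

    Anything classified above `max_external` stays on the local model.
    """
    if sensitivity.value <= max_external.value:
        return Backend.EXTERNAL_API
    return Backend.LOCAL_MODEL


if __name__ == "__main__":
    for level in Sensitivity:
        print(f"{level.name:<10} -> {choose_backend(level).value}")
```

The default is deliberately conservative: unless the policy explicitly raises `max_external`, only public data is allowed to reach an external service.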

Open weights can also be fine-tuned. If a company has a specific domain - legal documents, technical regulations, industry terminology - fine-tuning on its own data gives relevance that a general model will not provide.

Where the illusion is

Running your own LLM is not a download-and-run operation. It is an infrastructure project.

The 70-billion-parameter variant of Llama 2 requires several GPUs with enough video memory: at 16-bit precision the weights alone take roughly 140 GB, more than a single 80 GB data-centre GPU can hold. Smaller variants (7B, 13B) run on more modest hardware, but with noticeably weaker quality on complex tasks. Even at the "let's try it" stage you need to know what you are deploying, where, and who will manage it.
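The hardware point can be checked with back-of-the-envelope arithmetic. The sketch below estimates only the memory needed to hold the weights; it ignores activations, the KV cache and framework overhead, so real requirements are higher. The bytes-per-parameter figures follow directly from the numeric formats, while the 80 GB per-GPU figure is an assumption chosen for illustration.

```python
import math

# Bytes needed to store one parameter at a given precision.
BYTES_PER_PARAM = {
    "fp16": 2.0,  # 16-bit floating point
    "int8": 1.0,  # 8-bit quantisation
    "int4": 0.5,  # 4-bit quantisation
}


def weight_memory_gb(params_billion: float, precision: str = "fp16") -> float:
    """Approximate size of the weights alone, in GB (1 GB = 10**9 bytes)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[precision] / 1e9


def min_gpus(params_billion: float, gpu_memory_gb: float,
             precision: str = "fp16") -> int:
    """Smallest GPU count whose combined memory fits the weights alone.

    This is a lower bound: inference also needs room for activations
    and the KV cache, which are not modelled here.
    """
    return math.ceil(weight_memory_gb(params_billion, precision) / gpu_memory_gb)


if __name__ == "__main__":
    # Assumption for illustration: 80 GB of memory per GPU.
    for size in (7, 13, 70):
        print(
            f"Llama 2 {size}B: ~{weight_memory_gb(size):.0f} GB in fp16, "
            f"at least {min_gpus(size, gpu_memory_gb=80)} x 80 GB GPU(s)"
        )
```

Quantisation changes the picture, for example `weight_memory_gb(70, "int4")` gives about 35 GB, but it trades memory for some loss in quality, which has to be measured on your own task.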

Fine-tuning is a separate story. It requires labelled, good-quality data, GPU compute time, an understanding of the process, and the ability to evaluate results. Without a team that can do this, fine-tuning becomes an expensive experiment with unpredictable results.
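"The ability to evaluate results" can start very small: a held-out set of labelled examples and one metric, applied to the model before and after fine-tuning. The sketch below uses exact-match accuracy on a classification task. The two model functions are hypothetical stand-ins so the example runs on its own; in practice each would wrap a call to a deployed model, and the four examples are invented for illustration.

```python
from typing import Callable, Iterable, Tuple


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def exact_match_accuracy(
    model: Callable[[str], str],
    examples: Iterable[Tuple[str, str]],
) -> float:
    """Share of examples where the model output equals the expected label."""
    examples = list(examples)
    if not examples:
        raise ValueError("need at least one labelled example")
    hits = sum(
        normalise(model(prompt)) == normalise(expected)
        for prompt, expected in examples
    )
    return hits / len(examples)


# Hypothetical stand-ins for a base model and a fine-tuned model.
def baseline_model(prompt: str) -> str:
    return "invoice"


def tuned_model(prompt: str) -> str:
    return "contract" if "agreement" in prompt.lower() else "invoice"


# Invented held-out examples: (prompt, expected label).
HELD_OUT = [
    ("Classify: Payment due within 30 days of receipt.", "invoice"),
    ("Classify: This Agreement is made between the parties.", "contract"),
    ("Classify: Total amount payable: 1,200 EUR.", "invoice"),
    ("Classify: The agreement terminates on 31 December.", "contract"),
]

if __name__ == "__main__":
    print("baseline:", exact_match_accuracy(baseline_model, HELD_OUT))
    print("tuned:   ", exact_match_accuracy(tuned_model, HELD_OUT))
```

Exact match only suits tasks with a single correct answer. Summarisation or free-form answers need a different metric or human review, but the structure of fixed examples, one number and a before/after comparison stays the same.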

Running the model in production - monitoring, updates, dealing with quality degradation - is yet another layer of operational cost.

Which tasks make sense today

I would identify three scenarios where the open-weights conversation is justified now:

Processing confidential documents. If the task is extracting structured data from internal documents, classification, summarisation - and the data cannot leave the perimeter - open weights provide a path.

A specialised domain with a large proprietary corpus. If the company has thousands of technical documents accumulated over years and the goal is to make that knowledge accessible, fine-tuning on the company's own base can outperform a general-purpose API.

Version control and reproducibility. With a commercial API, the model behind the endpoint inevitably changes on the provider's schedule, not yours. When it matters that the model does not change without your knowledge, fixed weights give predictability.
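For the reproducibility scenario, "fixed weights" is something you can verify rather than assume. One simple approach, sketched below under the assumption that the weights live as ordinary files in a directory, is to record a SHA-256 checksum per file at deployment time and compare later. The function names and the demo file name are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte weight files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(weights_dir: Path) -> Dict[str, str]:
    """Map each file's relative path to its SHA-256 checksum."""
    return {
        p.relative_to(weights_dir).as_posix(): file_sha256(p)
        for p in sorted(weights_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(weights_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """List files that were added, removed or modified since the manifest."""
    current = build_manifest(weights_dir)
    return sorted(
        name
        for name in set(current) | set(manifest)
        if current.get(name) != manifest.get(name)
    )


if __name__ == "__main__":
    # Demo on a throwaway directory with a fake weight file.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "model.bin").write_bytes(b"weights-v1")
        manifest = build_manifest(root)
        print("unchanged:", changed_files(root, manifest))
        (root / "model.bin").write_bytes(b"weights-v2")
        print("after edit:", changed_files(root, manifest))
```

Storing the manifest next to evaluation results makes it possible to say exactly which weights produced which numbers.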

Questions worth asking before starting

If the open-model conversation has moved from "interesting" to "we are doing this", I would check a few things:

  1. Do we have a task that cannot be addressed through a commercial API for security or quality reasons?
  2. Do we have the infrastructure to run a model of the required size?
  3. Does the team include people who know how to work with weights, not just APIs?
  4. Do we understand how we will evaluate output quality?
  5. Are we prepared for the operational cost of running this?

If most answers are "no" or "we do not know" - that is not a reason to stop, but a reason to start with a pilot on a specific task rather than an infrastructure project built for future scale.

Open weights genuinely expand the possibility space. The space just requires competencies that do not appear automatically along with the licence.
