AI is already used in many organisations. Sometimes officially, sometimes through an employee who uses ChatGPT to rewrite a text, Copilot for code or an online AI tool to summarise a document. There is nothing wrong with that in itself. Public AI is fast, powerful and easy to access. For many general tasks, that is exactly what you want.
But there is a limit to what you can responsibly do through public AI services. That limit is not about AI itself. It is about the data you put into it.
Once prompts contain customer data, internal documents, logs, configurations, source code or contracts, AI is no longer a standalone tool. It becomes an infrastructure choice. Where does the model run? Who processes the input? What gets logged? How long is data retained? Which jurisdiction applies to the service? And can you explain afterwards what happened?
You cannot answer those questions with an API key alone.
What do we mean by public AI?
By public AI, we mean AI services that run with an external provider. Think of ChatGPT, Microsoft Copilot, Gemini, Claude or business AI APIs. You use a web interface or API, and the processing happens outside your own environment.
That has clear benefits. You do not have to build a GPU platform, manage models or maintain an inference environment. You get quick access to strong models, good tooling and scalability. For brainstorming, general writing, prototyping and non-sensitive tasks, public AI is often a good choice.
Business AI services are not the same as consumer tools. Large providers offer business versions with data controls, audit options and contractual agreements. In many cases, business data is not used by default to train models.
But that is not the end of the story.
Not training is not the same as not processing
Many discussions about AI and privacy get stuck on one question: will my data be used to train the model? That is relevant, but too narrow.
Even if a provider does not train on your data, that data is still processed. A prompt has to go to the model. The output comes back. Sometimes there are logs for abuse monitoring, debugging, support, analytics or legal obligations. Sometimes metadata, requests or responses are stored temporarily.
So you need to look beyond the sentence "we do not train on your data". You want to know:
- what data is processed?
- where does that processing happen?
- how long is data retained?
- who can access it?
- which subprocessors are involved?
- is anything logged?
- can logging be disabled or limited?
- how does deletion work?
- what happens during an incident?
For a simple marketing text, that may be overkill. For customer data, tickets, security logs or legal documents, it is basic due diligence.
The prompt is often the problem
When people think about AI, they often focus on the answer. In business use, the risk is usually in the input.
A support employee asks an AI tool to summarise a customer ticket. An engineer pastes an error message with configuration details into a chatbot. A developer asks for source code to be reviewed. An HR team summarises documents. A SaaS platform sends customer data to a model to generate automated advice.
In all those cases, the prompt is not just a simple question. The prompt is business data.
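To make that concrete: if a prompt has to go to an external service at all, the least you can do is strip obvious identifiers first. A minimal sketch in Python; the patterns and placeholders are illustrative, not a complete PII filter:

```python
import re

# Illustrative patterns only. Real deployments need broader coverage
# (names, addresses, customer IDs) and ideally a dedicated PII scanner.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<IP_ADDRESS>"),
]

def scrub(prompt: str) -> str:
    """Replace obvious identifiers before a prompt leaves the organisation."""
    for pattern, placeholder in REDACTIONS:
        prompt = pattern.sub(placeholder, prompt)
    return prompt

print(scrub("Customer jan@example.com reports timeouts from 10.0.0.12."))
# Customer <EMAIL> reports timeouts from <IP_ADDRESS>.
```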
The same pattern becomes even clearer with RAG, retrieval-augmented generation. In RAG, an application retrieves information from internal documents, a knowledge base or a vector database and sends that context to the model. That makes AI more useful, but it also means internal data deliberately becomes part of the model input.
If that inference runs publicly, the context leaves your own environment. That may be acceptable. It may not be. But it should be a conscious decision.
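What that looks like in code: a minimal RAG sketch, with hypothetical `search` and `generate` helpers standing in for a real vector store and model client:

```python
def answer_with_rag(question: str, vector_store, llm) -> str:
    """Retrieve internal context and send it to the model with the question."""
    # 1. Fetch the most relevant internal documents for this question.
    docs = vector_store.search(question, top_k=4)  # hypothetical API

    # 2. The retrieved content becomes part of the model input.
    context = "\n\n".join(doc.text for doc in docs)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

    # 3. If llm points at a public API, this internal context
    #    leaves your environment with every single call.
    return llm.generate(prompt)  # hypothetical API
```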
Where does your prompt go?
With AI, the risk is often not in the answer but in the context you send along. With a public service, prompt and context travel to the provider; the question is whether that data may leave the organisation, and under which agreements. The alternative is keeping the data within an environment where you control logging, access and retention.
Compliance starts before the pilot
The Dutch Data Protection Authority is clear on this: if you use algorithmic systems with personal data, you have to establish in advance whether you can comply with the GDPR. Privacy risks must be mapped before you start, including for pilots, tests and proof of concepts.
That matters. Many AI projects start as experiments. Try it quickly. Put in a bit of real data. See if it works.
But a pilot with real customer data is still processing real customer data. Especially with personal data, medical data, financial information, security logs or legal documents, you need to think about legal basis, purpose limitation, data minimisation, retention periods and processors before you start.
AI also touches other regulation. The AI Act is risk-based. NIS2 puts more emphasis on cybersecurity, risk management and supply chain security. An external AI service becomes part of your digital supply chain as soon as you use it in your processes.
In other words, AI is not only a productivity question. It is also a governance question.
What is private inference?
Private inference means using an AI model within a controlled environment. It is not about training a model from scratch. Inference means using an existing model to generate text, analyse documents, classify information or answer questions.
That inference can run on your own hardware, in a private cloud, on your own servers or in a sovereign cloud environment. The main difference is control. You decide where the data goes, how logging works, which models run and who has access.
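In practice, the application code barely has to change. Many self-hosted serving stacks, vLLM for example, expose an OpenAI-compatible endpoint, so pointing an application at private inference can be as small as changing the base URL. A sketch, assuming such an endpoint runs locally:

```python
from openai import OpenAI

# Same client library, but the endpoint is your own serving stack:
# prompts and outputs stay within your environment.
client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local, vLLM-style endpoint
    api_key="unused",                     # required by the client, ignored locally
)

response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.3",  # example open model
    messages=[{"role": "user", "content": "Summarise this internal ticket: ..."}],
)
print(response.choices[0].message.content)
```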
Private inference is especially useful when AI becomes part of a process where sensitive data, predictability or auditability matters.
Think of:
- analysing customer data
- making internal documents searchable
- summarising support tickets
- reviewing source code
- explaining security logs
- processing legal or financial documents
- offering AI functionality inside a SaaS platform
In those situations, you do not only want to know that the model gives an answer. You also want to know what happens to the data.
The benefits of private inference
The biggest benefit is control. You can keep prompts, outputs, embeddings and logs within your own environment. You can pin model versions, so behaviour does not suddenly change because an external provider replaces a model. You can build audit trails that show which user asked which question, which documents were retrieved and which output was given.
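An audit trail does not have to be exotic. A sketch of what one structured record per inference call could capture; the field names are illustrative:

```python
import hashlib
import json
import time

def audit_record(user_id: str, prompt: str, doc_ids: list,
                 model_version: str, output: str) -> str:
    """One log line per call: hash content you do not want stored
    verbatim, keep the identifiers you need to answer audit questions."""
    return json.dumps({
        "timestamp": time.time(),
        "user": user_id,
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "retrieved_docs": doc_ids,        # which documents fed the answer
        "model_version": model_version,   # pinned, so behaviour is reproducible
        "output_sha256": hashlib.sha256(output.encode()).hexdigest(),
    })
```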
Private inference can also help with response times. If the model runs close to your application and data, you avoid unnecessary network paths. At stable, high volumes, the cost model can also become more predictable than paying per token through an external API.
For organisations working on digital sovereignty, this is especially relevant. Not because "data in the Netherlands" solves everything, but because location, management, access, logging, jurisdiction and exit options together determine how much control you really have.
Private inference is not a magic wand
Private inference does not automatically make an AI application secure or compliant. You move part of the responsibility to yourself or to your hosting partner.
You need to think about GPU capacity, patching, monitoring, access control, incident response and model updates. You need to set up logging without storing sensitive data unnecessarily. You need to test whether the model is good enough for your task. And if you use RAG, you need to enforce document permissions properly. A vector database without proper tenant or user separation can still leak data.
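The permission check belongs in the retrieval query itself, not only in the user interface. A sketch with a hypothetical metadata filter:

```python
def search_for_user(vector_store, query: str, user):
    """Only retrieve documents this user may read. Filtering after
    retrieval is not enough: an unfiltered search has already pulled
    restricted content into the candidate set."""
    return vector_store.search(  # hypothetical API
        query,
        top_k=4,
        filter={
            "tenant_id": user.tenant_id,               # hard tenant isolation
            "allowed_groups": {"any_of": user.groups}, # document-level permissions
        },
    )
```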
Model choice is not free either. Open models have licences, restrictions and quality differences. A smaller model is cheaper and faster, but may be less capable. A larger model needs more hardware. Model compression can help make a model smaller, but it can also reduce quality.
Private inference gives you more control. It does not automatically make things simple.
Private inference is also operations
More control means you need to design and manage the whole chain deliberately: hardware capacity, model serving, logging, access control and lifecycle management. Private inference is most valuable when that layer is managed as carefully as the rest of your infrastructure.
When is public AI fine?
Public AI remains useful. For general productivity, it is often the fastest route. Think of brainstorming, general writing, translations, summaries without sensitive content, first prototypes or tasks where a person always checks the output.
For businesses too, public AI can be a logical choice, as long as the data fits the service and the agreements are clear. Sometimes a business public AI service with good contracts, logging and governance is better than a poorly managed private setup.
So the choice is not ideological. It depends on the workload.
Hybrid AI is often the realistic approach
Most organisations do not need one AI environment. They need a policy that determines which data may be processed where.
A practical classification can start simply:
- public information may go to public AI
- internal information needs extra control
- confidential information stays within a controlled environment
- regulated or customer-related data requires private inference or a tightly contracted environment
That creates a hybrid approach. Public AI where it can be used. Private inference where it is needed. Clear rules for everything in between.
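That policy can be enforced in code instead of left to memory. A sketch of a router that picks an endpoint per data class; the labels and endpoints are illustrative:

```python
# Illustrative mapping from data class to an allowed AI endpoint.
ROUTES = {
    "public":       "https://api.public-ai.example/v1",      # public AI service
    "internal":     "https://api.contracted-ai.example/v1",  # contracted business service
    "confidential": "http://inference.internal:8000/v1",     # private inference
    "regulated":    "http://inference.internal:8000/v1",     # private inference only
}

def endpoint_for(data_class: str) -> str:
    """Refuse by default: unclassified data goes nowhere."""
    try:
        return ROUTES[data_class]
    except KeyError:
        raise ValueError(f"no AI route defined for data class {data_class!r}")
```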
AI decision matrix by data class
Start with the data you want to process, not with the model.

| Data class | Typical content | Where it may run |
| --- | --- | --- |
| Public | marketing copy, brainstorming | public AI |
| Internal | drafts, internal notes | public AI with extra control and clear agreements |
| Confidential | customer tickets, source code, logs | a controlled environment |
| Regulated or customer-related | personal, financial or medical data | private inference or a tightly contracted environment |

Public where it works. Private where it matters. Hybrid where it makes sense.
That avoids two mistakes. The first is banning everything, which pushes employees towards workarounds. The second is allowing everything, which lets sensitive data end up in places where nobody has oversight anymore.
AI has become infrastructure
The main question is not whether public AI is good or bad. The question is which data, processes and risks you allow to run where.
For simple tasks, public AI is often fine. For sensitive data, customer environments, SaaS functionality and regulated processes, you want more control. That is where private inference becomes interesting.
Not because you must run everything yourself. But because AI is increasingly part of your infrastructure. And you do not manage infrastructure on hope. You design it deliberately, with attention to data, security, performance, compliance and continuity.
In the next article, we will look at the practical side: what you need to run private inference, from model and hardware to serving, logging and operations.
Using AI while keeping control over data and operations? See how cloud.nl approaches AI infrastructure and where private inference becomes practical.