Before signing up to an AI deployment, ask the supplier the only question that really matters: where does my data go, and who can read it? The honest answer is never “it stays with us”. At the very least, a deployment involves encrypted transit to a third-party server, temporary retention for operational reasons, and a data processing contract that must cover your specific legal obligations. If the sales rep tells you “don’t worry, it’s secure”, find another sales rep.
What happens with professional APIs
OpenAI API / Enterprise: data submitted via the API is not used by default to train models. OpenAI states this in its enterprise terms. Data may be stored temporarily for security and abuse detection (30 days by default on the API, configurable). Primary data storage is in the United States, with limited regional options.
Anthropic API / Claude for Business: a similar policy. Data submitted via the API is not used for training and is stored in the United States.
Free ChatGPT and Plus (web interfaces): until 2023, conversations could be used for training. An opt-out setting has since been added. On consumer interfaces the default behavior has changed over time, so check it each time the terms are updated.
The problem of industrial secrecy
Terms of use generally distinguish between training (often excluded for professional APIs) and temporary storage (often present for various operational reasons).
But the question of industrial secrecy goes beyond training. Consider what you might submit to an API:
- Contracts with pricing clauses
- Product development strategies
- Patents or technical know-how
- Customer or supplier information with confidentiality clauses
This data passes through a third-party server. Even without training, even with encryption, you create a flow of sensitive data outside your perimeter. This flow may create legal obligations towards your customers (if your contracts include confidentiality clauses), towards your partners, or under future regulation.
What the GDPR says
If your data contains information about natural persons (employees, customers, prospects), you’re in GDPR territory.
Using a third-party AI API to process this data makes you the data controller and the supplier your processor, so you must have a Data Processing Agreement (DPA) with the supplier. OpenAI, Anthropic and Google offer this as part of their enterprise offerings.
Data localization is critical for certain sectors: healthcare data (HDS certification in France), financial data, defense data. For these sectors, transit outside the EU is restricted or prohibited. An American API without a European localization option can put you out of compliance.
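The localization constraint lends itself to a pre-deployment check. The sketch below is only an illustration: the `Endpoint` record, the sector labels and the blanket "EU only" rule are assumptions that simplify the paragraph above, and a real policy would be set by your legal team.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical sector labels drawn from the paragraph above.
RESTRICTED_SECTORS = {"healthcare", "financial", "defense"}


@dataclass(frozen=True)
class Endpoint:
    """Where a supplier processes and stores data (illustrative fields)."""
    supplier: str
    processing_region: str  # e.g. "EU" or "US"
    storage_region: str


def localization_issues(sector: str, endpoint: Endpoint) -> list[str]:
    """Return human-readable issues; an empty list means none were found."""
    issues: list[str] = []
    if sector not in RESTRICTED_SECTORS:
        return issues
    # Simplifying assumption: restricted sectors must stay inside the EU.
    if endpoint.processing_region != "EU":
        issues.append(f"{sector} data processed in {endpoint.processing_region}")
    if endpoint.storage_region != "EU":
        issues.append(f"{sector} data stored in {endpoint.storage_region}")
    return issues
```

Calling `localization_issues("healthcare", Endpoint("example-supplier", "US", "US"))` reports two issues, one for processing and one for storage, while the same endpoint raises none for an unrestricted sector.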
What the AI Act changes
The European AI Act (in force since August 2024, with obligations phased in until 2027) imposes specific obligations according to the risk level of the AI system.
For high-risk systems (hiring decisions, credit scoring, medical decisions, justice systems), transparency, auditability and documentation requirements are imposed. An AI system deployed in these contexts must be auditable, and its decisions explainable.
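As a rough illustration, the high-risk rule can be written as a lookup. The context labels and the `requirements_for` helper are hypothetical names chosen for this sketch; the actual classification comes from the regulation itself, not from a hard-coded set.

```python
from __future__ import annotations

# Hypothetical labels for the high-risk contexts named in the text.
HIGH_RISK_CONTEXTS = {"hiring", "credit_scoring", "medical_decision", "justice"}

# Requirements the text associates with high-risk systems.
HIGH_RISK_REQUIREMENTS = ["transparency", "auditability", "documentation"]


def requirements_for(context: str) -> list[str]:
    """Return the requirements to plan for in a given deployment context."""
    if context in HIGH_RISK_CONTEXTS:
        # Return a copy so callers cannot mutate the shared list.
        return list(HIGH_RISK_REQUIREMENTS)
    return []
```

A deployment tagged `"hiring"` gets all three requirements back, while an untagged internal tool gets an empty list. That empty list only means the high-risk rule did not fire, not that no obligation applies.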
For general-purpose AI models with systemic risk (above a training compute threshold), transparency obligations on training data and capabilities are imposed on suppliers. What this changes for you: you can ask your supplier for AI Act compliance documentation.
Four things to contractually require
There are four contractual requirements to meet before any deployment. First, a guarantee that the data will not be used for training, written into the signed contract and not just a FAQ. Second, the actual location of processing and storage, with a commitment to give thirty days’ notice before any change. Third, the GDPR data processing agreement, which covers your own obligations towards your customers and employees. Fourth, a documented procedure for when the underlying model changes version. The rest (retention policy, AI Act scope, etc.) is dealt with in the appendices. But these four points are the entry requirement.
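The four entry requirements can be tracked as a simple go/no-go checklist. This is a minimal sketch: the `ContractChecklist` class and its field names are invented for illustration and do not correspond to any standard or supplier form.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class ContractChecklist:
    """The four entry requirements from the text; field names are illustrative."""
    no_training_clause_in_signed_contract: bool = False
    location_commitment_with_30_day_notice: bool = False
    gdpr_dpa_signed: bool = False
    model_version_change_procedure: bool = False

    def missing(self) -> list[str]:
        """Names of the requirements not yet satisfied, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def ready_to_deploy(self) -> bool:
        """True only when all four requirements are met."""
        return not self.missing()
```

A checklist with only the training clause and the DPA ticked still reports the location commitment and the version-change procedure as missing, so `ready_to_deploy()` stays `False` until all four are in the signed contract.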