EU’s GPAI Code of Practice: the world’s first guidance for General Purpose AI model compliance
The European Commission on 10 July 2025 published a General Purpose AI (GPAI) Code of Practice – a non-binding soft law instrument aimed at helping AI developers comply with new EU AI Act rules on transparency, safety, and intellectual property (the Code). This voluntary Code is meant to guide providers of general-purpose AI models (such as large language models and other foundation models) in meeting their obligations under Articles 53 and 55 of the EU AI Act.
Recently, models like ChatGPT or Gemini have demonstrated how powerful general-purpose AI can be, increasing the need to ensure these systems are safe, transparent, and lawful. The Code serves as an early compliance blueprint ahead of the AI Act’s GPAI provisions taking effect on 2 August 2025. Below, we outline the three published chapters of the Code – Transparency, Safety & Security, and Copyright – and discuss the next steps.
1. Transparency requirements for GPAI models
Providers of GPAI models must prepare comprehensive technical documentation for each model. The Code introduces a standardized Model Documentation Form covering details such as the model’s training data sources, intended use cases, and licensing information. This documentation satisfies the AI Act’s transparency requirements and ensures that essential facts about the model are recorded in one place.
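By way of illustration only, a provider’s internal tooling might capture the kinds of categories the Model Documentation Form covers in a simple record. The field names below are hypothetical and merely echo the categories mentioned above; they are not the official template.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ModelDocumentation:
    """Hypothetical record mirroring the kinds of fields the
    Model Documentation Form covers; not the official template."""
    model_name: str
    model_version: str
    training_data_sources: List[str]   # e.g. "public web crawl", "licensed corpus"
    intended_use_cases: List[str]      # e.g. "text summarisation", "code assistance"
    licensing_information: str         # licence under which the model is distributed
    known_limitations: List[str] = field(default_factory=list)

doc = ModelDocumentation(
    model_name="example-llm",
    model_version="1.0",
    training_data_sources=["public web crawl (filtered)", "licensed news corpus"],
    intended_use_cases=["general-purpose text generation"],
    licensing_information="proprietary, API access only",
)
```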
The documented information should be made available to those who deploy or integrate the model (“downstream providers”) and to regulators upon request. In practice, providers will share relevant documentation with business customers who build AI systems on top of the model, and, upon formal request, with the EU AI Office or national authorities. Such regulatory requests for model information must state a legal basis and purpose, and sensitive business details will remain confidential. This framework ensures that downstream users get enough transparency to use the model responsibly, while regulators can obtain oversight data when needed, without forcing public disclosure of trade secrets.
The transparency chapter also calls for clarity about the model’s data provenance (e.g. evidence of the authenticity or origin of the data with which the model was developed or trained) and the lineage of its training data. Providers of GPAI models should disclose how training data was collected (for example, whether it involved web crawling or private datasets) and any preprocessing steps, to give a clear picture of the model’s foundations. Documenting model authenticity (such as providing a secure hash or identifier for the released model) helps downstream users and regulators verify that a model is genuine and hasn’t been tampered with.
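As a rough illustration of the model-authenticity point, a provider could publish a cryptographic hash of the released weights so that downstream users can verify they received the genuine artefact. This is a minimal sketch under that assumption; the file name and function are hypothetical.

```python
import hashlib

def model_fingerprint(weights_path: str, chunk_size: int = 1 << 20) -> str:
    """Compute a SHA-256 digest of a released weights file so that
    downstream users can check it against the published value."""
    digest = hashlib.sha256()
    with open(weights_path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical usage: the published digest would sit alongside the model documentation.
# print(model_fingerprint("example-llm-v1.0.safetensors"))
```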
Notably, the Code recognizes an exemption from certain of these transparency obligations for open-source AI models, unless the model is later deemed to pose a “systemic risk”. If an open-source AI model does fall into that advanced, high-impact category, it must still comply with the stricter transparency and safety measures for systemic-risk models.
2. Safety & security measures for high-risk AI models
The Safety & Security chapter applies to GPAI models that could pose “systemic risks” – essentially, far-reaching or high-impact risks on a societal scale. Providers of such frontier models must implement a comprehensive risk management framework to identify and control these risks. This involves continuously assessing the model’s potential for significant harm (e.g. risks of misuse, unpredictable behavior, or other large-scale negative outcomes) and ensuring those risks remain at an acceptable level. The Code expects developers to perform structured systemic risk assessments and to update those analyses as the model evolves or new risks emerge.
A notable concept in the Code is the definition of risk “tiers” or thresholds. Providers of GPAI models should establish clear criteria for different levels of systemic risk and determine in advance what safety measures would be required as the model approaches more advanced capability tiers. For each identified risk scenario, the provider is asked to set risk acceptance criteria and to document what additional safeguards will kick in once the model reaches a certain risk tier. By planning ahead in this way, developers can proactively implement mitigations (or even refrain from deployment) if a model’s capabilities grow beyond safe bounds.
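The tiering idea can be pictured as a simple mapping from capability thresholds to pre-agreed safeguards. The tier names, scores, and measures below are invented for illustration and are not taken from the Code.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class RiskTier:
    name: str
    capability_threshold: float   # score from an internal capability evaluation (illustrative)
    required_safeguards: List[str]

# Hypothetical tier ladder: safeguards are decided in advance, before the
# model ever reaches the corresponding capability level.
TIERS = [
    RiskTier("baseline", 0.0, ["standard red-teaming", "usage monitoring"]),
    RiskTier("elevated", 0.6, ["external audit", "staged rollout", "enhanced access controls"]),
    RiskTier("critical", 0.85, ["halt deployment pending review", "notify AI Office"]),
]

def applicable_tier(capability_score: float) -> RiskTier:
    """Return the highest tier whose threshold the model has crossed."""
    return max((t for t in TIERS if capability_score >= t.capability_threshold),
               key=lambda t: t.capability_threshold)

print(applicable_tier(0.7).required_safeguards)
```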
Providers of systemic-risk models must deploy state-of-the-art safety measures and security controls to reduce risks. This includes implementing technical safeguards throughout the model’s lifecycle. The Code also calls for external or independent audits of these high-impact models’ safety mechanisms. Requiring third-party or outside expert audits adds an extra layer of oversight, verifying that the model’s risk controls and testing are effective and up to industry best practices. All of these measures contribute to a safety framework aimed at preventing catastrophic failures or malicious abuses of advanced AI.
After a model is released, the Code mandates ongoing post-market monitoring – continuously observing the model’s real-world use and performance for any new hazards or unintended consequences. If a serious incident or major malfunction occurs, providers are obliged to document it and report it to the EU AI Office and relevant national authorities without undue delay, along with any corrective measures taken. In short, the lifecycle of a high-risk AI model should be accompanied by continuous oversight, transparent communication with regulators, and a readiness to respond quickly if something goes wrong.
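Purely as an illustration of the record-keeping this implies, an internal incident log entry might capture the elements the Code asks providers to report. The field names are assumptions, not a prescribed reporting format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

@dataclass
class SeriousIncidentReport:
    """Illustrative internal record of a serious incident; field names are
    assumptions, not an official reporting template."""
    incident_id: str
    detected_at: datetime
    description: str
    affected_model_version: str
    corrective_measures: List[str]
    reported_to: List[str]   # e.g. ["EU AI Office", "national authority"]

report = SeriousIncidentReport(
    incident_id="INC-2025-001",
    detected_at=datetime(2025, 9, 1, 14, 30),
    description="Model bypassed a safety filter and produced unsafe instructions.",
    affected_model_version="example-llm 1.0",
    corrective_measures=["filter hotfix deployed", "affected endpoint rate-limited"],
    reported_to=["EU AI Office"],
)
```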
To facilitate oversight, the Code emphasizes cooperation with the new EU AI Office. By aligning with the Code, providers signal a willingness to work hand-in-hand with regulators to keep advanced AI in check. This collaborative approach aims to build trust that even the most high-impact AI systems are being developed responsibly and monitored after deployment.
3. Copyright compliance and data usage policies
The Code’s Copyright chapter addresses a critical concern for AI developers and content owners alike – making sure AI models respect intellectual property rights. Providers of GPAI models are expected to adopt a clear internal copyright compliance policy. This written policy should outline how the provider ensures that any training or fine-tuning data is obtained and used lawfully under EU copyright rules. The Code suggests that providers designate responsible personnel or teams to oversee copyright compliance within their organization. By formalizing such a policy, AI developers can demonstrate due diligence in avoiding copyright infringement, as required by Article 53(1)(c) of the AI Act.
A cornerstone of the Code’s copyright commitments is respecting rights-holders’ wishes when collecting training data. Signatories pledge not to circumvent or override any technical measures that protect copyrighted content. This means an AI developer should not scrape data behind paywalls or other access restrictions without authorization. The Code requires the use of web crawlers that read and follow standard instructions (like robots.txt or metadata indicating “no text and data mining”).
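A minimal sketch of what “reading and following standard instructions” could look like in a data-collection pipeline, limited to the robots.txt signal: opt-out metadata embedded in pages or HTTP headers would need a separate check, and the user-agent name and downstream `fetch_and_store` function are hypothetical.

```python
from urllib.robotparser import RobotFileParser
from urllib.parse import urlparse

def may_crawl(url: str, user_agent: str = "example-gpai-crawler") -> bool:
    """Check robots.txt before fetching a page for training-data collection.
    Only covers the robots.txt signal; page-level or header-level
    text-and-data-mining reservations would need an additional check."""
    parts = urlparse(url)
    robots_url = f"{parts.scheme}://{parts.netloc}/robots.txt"
    parser = RobotFileParser()
    parser.set_url(robots_url)
    try:
        parser.read()
    except OSError:
        return False  # conservative default when robots.txt cannot be fetched
    return parser.can_fetch(user_agent, url)

# Hypothetical usage:
# if may_crawl("https://example.com/article.html"):
#     fetch_and_store("https://example.com/article.html")
```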
In addition to lawful training data collection, providers of GPAI models must implement safeguards to prevent their models from reproducing copyrighted works inappropriately. The Code obliges providers to take technical and organizational measures so that a model does not generate outputs that are essentially verbatim reproductions of protected works from its training data. Possible safeguards include filtering mechanisms, prompt constraints, or post-processing checks to detect and block likely infringing content. By committing to such measures, signatories align with EU copyright principles that protect original expressions from unauthorized reproduction.
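One simple, and deliberately naive, way to approximate such a post-processing check is an n-gram overlap test against an index built from protected text snippets. Real systems would use far more robust matching; the index, threshold, and sample text below are stand-ins.

```python
def ngrams(text: str, n: int = 8) -> set:
    """Split text into overlapping word n-grams."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def looks_like_verbatim_copy(output: str, protected_index: set,
                             threshold: float = 0.3, n: int = 8) -> bool:
    """Flag an output if a large share of its n-grams appear verbatim
    in an index built from protected works (illustrative heuristic only)."""
    grams = ngrams(output, n)
    if not grams:
        return False
    overlap = len(grams & protected_index) / len(grams)
    return overlap >= threshold

# Hypothetical usage: the index would be built offline from rights-reserved texts.
protected_index = ngrams("the quick brown fox jumps over the lazy dog again and again")
print(looks_like_verbatim_copy(
    "The quick brown fox jumps over the lazy dog again and again", protected_index))
```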
The Code also tackles the issue of sources known to be problematic. When scraping the web for training data, providers should exclude websites notorious for copyright infringement. The European Commission has indicated it will maintain a list of sites with a reputation for hosting unlicensed content, and AI model developers are expected to avoid using data from those sites. This reflects a preventative approach: rather than dealing with copyright violations after the fact, the Code pushes providers to be selective about their data sources up front, excluding data pools that are rife with pirated works.
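A provider’s data pipeline might enforce this source-exclusion commitment with a simple domain blocklist check. The domains below are placeholders; the Commission’s actual list is not referenced here.

```python
from urllib.parse import urlparse

# Placeholder entries; in practice this set would be populated from the
# Commission-maintained list of infringing sites once it is available.
BLOCKLISTED_DOMAINS = {"pirated-books.example", "unlicensed-movies.example"}

def allowed_source(url: str) -> bool:
    """Drop URLs whose domain (or a parent domain) is on the blocklist."""
    host = urlparse(url).netloc.lower()
    return not any(host == d or host.endswith("." + d) for d in BLOCKLISTED_DOMAINS)

urls = ["https://news.example.org/story", "https://pirated-books.example/scan.pdf"]
print([u for u in urls if allowed_source(u)])
```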
Recognizing that issues may still arise, the Code establishes channels for ongoing cooperation with rightsholders. Providers must set up a mechanism for copyright owners to lodge complaints or inquiries if they believe their rights are being impacted. They also agree to engage in good-faith discussions or remediation if a valid complaint is raised. This could include investigating and, if necessary, rectifying instances where the model’s output may have replicated a protected work or where reserved data was inadvertently used. By instituting a complaint-and-redress system, the Code gives rightsholders a voice in the AI development process and a direct line to model providers, which is expected to help resolve IP disputes more amicably. It also incentivizes providers to be responsive and accountable, under the watchful eye of the AI Office, rather than taking a lax approach to copyright issues.
4. Next steps
The GPAI Code of Practice, while voluntary, is poised to become an influential standard in the AI industry. What happens next? In the immediate term, the Code’s content will be reviewed by EU authorities: the European Commission and Member States are assessing the Code’s adequacy and are expected to formally endorse it if satisfied, with a decision planned by 2 August 2025. Endorsement would signal official support, cementing the Code as a recognized compliance tool. The new European AI Office (established by the AI Act) will also play a key role. The Commission has indicated that if providers adhere to an approved Code of Practice, the AI Office and national regulators will treat that as a simplified compliance path – focusing enforcement on checking that the Code’s commitments are met, rather than conducting audits of every AI system. This means early signatories could enjoy greater predictability and a reduced administrative burden in meeting AI Act requirements. By contrast, companies that choose not to sign up to the Code will need to demonstrate compliance through other means, potentially facing more intensive scrutiny. Furthermore, the Commission has indicated that adherence to the Code can be taken into account favourably when assessing compliance with the AI Act, which may influence the level of any regulatory fines.
Another development to watch is guidance on key definitions. The Commission announced it will soon publish guidelines to clarify who and what falls under the “general-purpose AI provider” rules. This should delineate, for example, when an AI model is considered a GPAI model (versus a narrower AI system), which models qualify as having “systemic risk,” and who exactly is deemed the “provider” (especially in multi-party development scenarios). Clear guidelines will help companies determine if they are in scope of Article 53 or 55 obligations and thus whether they should adhere to this Code. These clarifications are expected ahead of the August 2025 enforcement start date, giving the industry a bit more certainty on the regulatory perimeter.
Crucially, the success of the Code will depend on industry uptake. The Commission is actively encouraging all major AI model developers to sign on. Key players in generative AI (from big tech firms to open-source model labs) will have to decide if they’ll commit to the Code’s requirements. Their decisions may be influenced by how favourably the Code is received by regulators and whether any competitive advantage accrues to signatories. If many providers join, the Code could become a de facto industry benchmark. If uptake is low, regulators might take a tougher stance through direct enforcement of the AI Act’s binding rules.