Skip to main content

How AI Regulation Changed in 2025

Michael D'Angelo
CTO & Co-founder

If you build AI applications, the compliance questions multiplied in 2025. Enterprise security questionnaires added AI sections. Customers started asking for model cards and evaluation reports. RFPs began requiring documentation that didn't exist six months ago.

You don't need to train models to feel this. Federal procurement buys LLM capabilities through resellers, integrators, and platforms, and enterprise buyers are starting to ask the same questions.

Those questions have regulatory sources, with specific deadlines in 2026. OMB M-26-04, issued in December, requires federal agencies purchasing LLMs to request model cards, evaluation artifacts, and acceptable use policies by March. California's training data transparency law AB 2013 takes effect January 1. Colorado's algorithmic discrimination requirements in SB 24-205 (delayed by SB25B-004) arrive June 30. The EU's high-risk AI system rules begin phasing in August.

This post covers what changed in 2025 and what's coming in 2026, written for practitioners who need to understand why these questions are appearing and what to do about them.

How Regulation Reaches Your Product

In practice, AI regulation flows through a stack:

The compliance stack: Executive policy flows down through agency guidance, procurement requirements, and contract clauses, ultimately requiring evidence from vendors

This is why 2025 mattered: the stack filled in. Executive orders issued years ago became OMB memos, which became procurement language, which became contract requirements, which became requests for evidence that vendors need to produce.

If you've received a security questionnaire asking about your AI systems, or seen new sections in an RFP about model documentation, you've felt this stack.


US Federal Policy

The January Transition

The Biden administration issued Executive Order 14110 in October 2023, creating categories for "rights-impacting" and "safety-impacting" AI, requiring federal agencies to implement risk-management practices, and using the Defense Production Act to compel reporting from developers of large models. That order was rescinded on January 20, 2025. Executive Order 14179 replaced it the same day.

The implementation mechanism stayed the same: executive order sets direction, OMB memo operationalizes it, procurement office embeds requirements in contracts. What changed:

Terminology"Rights-impacting AI" "High-impact AI"
Compliance timelineImmediate 365 days
Frontier model reportingRequired under DPA Removed
Overall postureRisk management Innovation promotion

What didn't change: pre-deployment testing requirements for high-risk AI, impact assessments, human oversight expectations, agency AI inventories, and the expectation that vendors provide documentation.

July: LLM Procurement Requirements

Executive Order 14319 added requirements specific to large language models, establishing two "Unbiased AI Principles":

Truth-seeking: LLMs should provide accurate responses to factual queries and acknowledge uncertainty when appropriate.

Ideological neutrality: LLMs should not encode partisan viewpoints into outputs unless specifically prompted.

The December OMB memo implementing these principles specifies what agencies must request:

ArtifactDescription
Model/system/data cardsDocumentation of training, capabilities, limitations
Evaluation artifactsResults from testing
Acceptable use policyWhat the system should and shouldn't do
Feedback mechanismHow users report problematic outputs

Agencies must update their procurement policies by March 11, 2026. The engineering implication: model behavior is now a contractual attribute, and agencies want evidence you can measure and report on it.

For application builders, this means preparing:

  • System card: which model(s) you use, your prompts/policies, tools, retrieval sources, and human review points
  • Evaluation artifacts: red-team results for tool misuse, prompt injection, and data leakage
  • Acceptable use policy: what your UI allows, what it blocks, and what your system won't do
  • Feedback mechanism: a "report output" button plus an internal triage workflow

December: The Preemption Strategy

On December 11, the administration issued an executive order aimed at challenging state AI laws. From Section 4:

The Secretary shall publish an evaluation that identifies State laws, regulations, or other actions that... require AI models to alter their truthful outputs based on protected characteristics or other group-based classifications.

Colorado's SB24-205 is named specifically. The order directs:

  • DOJ AI Litigation Task Force to challenge state laws (~January 10, 2026)
  • Commerce evaluation identifying conflicting state laws (~March 11, 2026)
  • FTC policy statement on when state laws are preempted (~March 2026)
  • FCC proceeding on federal disclosure standards that could preempt state requirements (~June 2026)
  • Authority to condition federal grants on states not enforcing identified laws

This isn't instant preemption. It's an attempt to build legal and administrative pressure toward a single national standard. Whether it succeeds depends on litigation and congressional action, neither of which has happened yet.


Enforcement Without New Laws

Regulators don't need bespoke AI statutes to take action. The FTC's case against Air AI in August is an example: deceptive performance claims, earnings claims, and refund promises already have enforcement playbooks under Section 5.

The practical implication: marketing language about "autonomous agents," "guaranteed savings," or "replaces staff" needs the same rigor as security claims. If you can't substantiate it, don't say it.


State Laws

While federal policy shifted, states continued legislating:

StateLawRequirementsCompliance Date
ColoradoSB24-205 (delayed by SB25B-004)Deployer obligations: impact assessments, algorithmic discrimination prevention, consumer notices, appealsJune 30, 2026 (originally Feb 1)
CaliforniaSB 53Safety frameworks, catastrophic risk assessments (frontier model developers)Signed Sept 2025
CaliforniaAB 2013Training data transparency (includes fine-tuning)January 1, 2026
CaliforniaSB 942 (date extended by AB 853)AI detection tools, content provenanceAugust 2, 2026
NYCLocal Law 144Bias audits, candidate notice for hiring AIIn effect
TexasHB 149Prohibited practices, government AI disclosureJanuary 1, 2026

Most state laws focus on deployment harms rather than model training: discrimination, consumer deception, safety for vulnerable users, transparency in consequential decisions. This means requirements like impact assessments, audit trails, human review pathways, and incident response procedures.

The federal preemption order and state laws reflect a disagreement about what AI systems should optimize for. The federal position treats accuracy and non-discrimination as potentially conflicting. The state position treats non-discrimination requirements as consumer protection. Colorado's law doesn't require inaccurate outputs; it requires deployers to use "reasonable care" to avoid algorithmic discrimination.

On December 10, 42 state Attorneys General sent letters to major AI companies requesting pre-release safety testing, independent audits, and incident logging. The litigation that resolves the federal-state tension hasn't started yet.


International

EU

The EU AI Act (Regulation (EU) 2024/1689) passed in 2024 and began implementation in 2025 (official timeline):

  • February 2025: Prohibited practices (social scoring, certain biometric systems) took effect
  • August 2025: General-purpose AI model obligations took effect
  • August 2026: High-risk AI system requirements were scheduled to apply

However, under pressure from industry and member states citing competitiveness concerns, the Commission proposed a Digital Omnibus package in November 2025 that would delay high-risk obligations by 16 months, to December 2027. The proposal still requires Parliament and Council approval, but it signals that the original timeline is softening.

If you sell into the EU, you'll need to determine whether your systems qualify as "high-risk" under the Act's classification scheme. If they do, conformity assessment and documentation requirements apply, though the exact timing is now less certain.

China

China's AI governance uses administrative filing and content labeling rather than litigation and procurement. Under the Interim Measures for Generative AI Services, public-facing services with "public opinion attributes or social mobilization capacity" must complete security assessments and algorithm filing before launch. As of November 2025, 611 generative AI services and 306 apps had completed this process, and apps must now publicly disclose which filed model they use, including the filing number.

In September, labeling requirements (English translation) took effect, backed by a mandatory national standard (GB 45438-2025): AI-generated content must include visible labels, metadata identifying the source and provider, and platforms must verify labels before distribution. Tampering is prohibited. The rules include a six-month log retention requirement in specific cases (for example, when explicit labeling is suppressed at a user's request). In late November, CAC took action against apps failing to implement these requirements; enforcement looks like compliance campaigns and removals rather than litigation.

In October, CAC also published guidance for government deployments, pushing agencies toward filed models with stronger risk disclosures and hallucination risk management.

US vs China governance approaches: The US requires documentation alongside the product, while China requires provenance embedded within the product

Meanwhile, China's open-source AI reached the frontier. DeepSeek's V3 model matched or exceeded leading proprietary systems on major benchmarks (technical report) and is available as open weights with published licensing terms (GitHub, model license). Qwen, Yi, and other Chinese labs released competitive open-weight models. The Chinese AI research community is producing frontier-class work under a regulatory regime that requires registration and provenance, a different set of constraints than disclosure and procurement.

Elsewhere

Other jurisdictions moved in 2025, generally converging on familiar control families: South Korea's AI Basic Act takes effect January 2026 with risk assessment and local representative requirements. Japan passed an AI Promotion Act in May. Australia published 10 guardrails that read like a procurement checklist. India proposed specific labeling thresholds for AI-generated content (10% of a visual, first 10% of audio). The UK rebranded its AI Safety Institute as the AI Security Institute. Separately, the UK continues fighting over copyright and training data. The pattern: documentation, evaluation, oversight, and provenance are becoming baseline expectations everywhere.


Technical Context

The center of gravity shifted in 2025: from single-prompt completion to agentic systems that plan over many steps, call tools, maintain state across long interactions, and take actions in external environments. This happened across US labs and Chinese labs simultaneously.

Three patterns stand out:

  • Hybrid "fast vs think" modes became standard. Frontier vendors now ship paired variants trading latency for deeper reasoning: GPT-5.2's Instant/Thinking/Pro tiers, Claude 4 and 4.5's extended thinking, Gemini 3's Deep Think mode, and similar options in Chinese open-weight families.
  • Tool use became the product. Claude 4 explicitly interleaves reasoning and tool calls. GPT-5.2 emphasizes long-horizon reasoning with tool calling. Google's Gemini 3 launched alongside Antigravity, an agent-first environment operating across editor, terminal, and browser.
  • Open weights reached the frontier. In 2025, "open" stopped meaning "two generations behind." OpenAI released gpt-oss under Apache 2.0. Meta shipped Llama 4. Mistral 3 arrived with Apache 2.0 multimodal models. DeepSeek and Qwen continued releasing competitive open-weight models.
Lab / Model2025 ReleasesCompliance Implication
OpenAI (GPT-5.2, gpt-oss)Tool calling, long-horizon reasoning, open weightsOpen weights mean your org becomes the "provider"; tool use expands test surface from output quality to action quality
Anthropic (Claude 4, 4.5)Extended thinking interleaved with toolsAgent workflows and "computer use" interactions require testing tool selection, error handling, and action sequences
Google (Gemini 3 + Antigravity)Agent-first IDE, multi-surface operationSystems spanning editor/terminal/browser are exactly what procurement questionnaires struggle to describe
Meta (Llama 4)Open-weight multimodal, long contextAggressive context claims (10M marketed) vs practical limits create evaluation complexity
DeepSeek (R1, V3.x)Rapid iteration, explicit agent positioningStrong tool use makes system-level evaluation unavoidable
Qwen (Qwen3)Open MoE, thinking modes, 1M contextMore "thinking vs non-thinking" variants multiply the configurations to test

The compliance implication: regulations written for text-in-text-out systems don't map cleanly to systems that choose tools, interpret tool output, recover from errors, and mutate external state. Evaluating whether a model hallucinates is different from evaluating whether an agent selects the right tool, handles its errors appropriately, and takes actions aligned with user intent. Impact assessments and audits need to cover the deployed stack: prompts, tool inventory, tool permissions, retrieval, memory, and logging, not just base models.


2026 Timeline

Key AI regulation deadlines in 2026: Q1 brings federal task forces and state laws, Q2 brings FCC/FTC statements and Colorado compliance, Q3 brings EU AI Act enforcement

DateEvent
Jan 1California AB 2013 (training data transparency) effective
Jan 1Texas HB 149 effective
~Jan 10DOJ AI Litigation Task Force established
~Mar 11Commerce evaluation of state laws due
~Mar 11FTC policy statement on preemption due
Mar 11Agencies update LLM procurement policies (M-26-04)
May 19TAKE IT DOWN Act (S. 146): platforms must remove nonconsensual intimate images within 48 hours of valid request (CRS summary)
~Jun 11FCC proceeding on federal disclosure standard begins
Jun 30Colorado SB24-205 compliance
Aug 2California SB 942 effective
Aug 2026EU AI Act high-risk requirements scheduled (may slip to Dec 2027 per Digital Omnibus proposal)

What This Means for Builders

Documentation is now structural. Whether you're responding to a federal RFP, complying with a state law, or filling out an enterprise security questionnaire, you'll be asked for documentation about how your system works and how you tested it. Model cards, evaluation results, acceptable use policies, incident response processes. If this exists but is scattered across internal wikis and Slack threads, you'll need to consolidate it.

Testing needs to cover deployed systems. Regulatory requirements focus on use cases and deployments, the combination of model, prompts, tools, retrieval, and guardrails that users interact with. If your application uses retrieval, test retrieval quality. If it uses tools, test tool selection and error handling. If it maintains context across turns, test behavior at different context lengths. If it reads untrusted input, test adversarial conditions, not just cooperative ones. We built Promptfoo for exactly this: system-level red teaming and evaluation that produces the artifacts regulators and procurement officers now ask for: exportable results, regression tests, and audit trails that document what you tested and what you found.

If your AI can take actions, regulators will evaluate the actions. If your system can issue refunds, send emails, modify records, or execute code, compliance requirements apply to the action path, not just the text output. This is why agentic systems need testing that covers tool selection, error handling, and rollback behavior.

The regulatory landscape is unsettled. The federal-state conflict isn't resolved. Preemption litigation hasn't started. International requirements continue to diverge. Building compliance infrastructure that adapts to different requirements is more practical than optimizing for any single regime.

If you only do one thing before 2026: make your AI system's behavior measurable, repeatable, and explainable to someone outside your team.


Further Reading

Federal policy:

State laws:

International:

Technical:

News: