The Invisible Layer That Makes Enterprise AI Governable
MCP gateways are becoming mandatory infrastructure for any organization deploying AI agents at scale. Here is what they actually do, what they must do, and how to evaluate one honestly.
In November 2024, Anthropic released the Model Context Protocol: a wire format for connecting AI clients to tools, data sources, and APIs. Eighteen months later, MCP has crossed 78% adoption among production AI engineering teams. The public registry has passed 9,400 servers. Anthropic, OpenAI, Google, and Microsoft all support it. Practitioners have started calling it "the USB-C of AI applications."
The protocol's success created an infrastructure problem that nobody anticipated at quite this speed. Every MCP server connection expands an organization's attack surface. Every AI agent operating with tool access can read private data, write to production systems, and execute commands under the permissions of whoever authorized it. Without a governance layer, these agents are black boxes: no audit trail, no access control, no identity attribution, no way to answer "what did this agent do?" to an auditor.
The answer the market has converged on is an MCP gateway: a control plane that sits between AI agents and the tools they call. But the term covers a lot of ground, from lightweight protocol proxies to full enterprise governance platforms. The differences are significant. Getting the choice wrong creates compliance exposure; getting it right creates the foundation for scaling AI safely.
What a gateway actually is
The core function of an MCP gateway is collapsing what engineers call the N×M integration problem. Without a gateway, every AI agent manages its own credentials, authentication flows, and access policies for every tool it connects to.
Ten agents and twenty tools produce two hundred independent connection paths, each with its own credentials, each potentially leaking secrets, each invisible to anyone trying to govern AI behavior centrally. A gateway reduces that to a single control point: N agents connect to the gateway; the gateway manages access to M tools.
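The arithmetic behind that reduction is easy to check. The following is a minimal Python sketch; the function names are illustrative and not part of any gateway API.

```python
def direct_paths(agents: int, tools: int) -> int:
    """Without a gateway, every agent holds its own connection to every tool."""
    return agents * tools


def gateway_paths(agents: int, tools: int) -> int:
    """With a gateway, each agent connects once and each tool is connected once."""
    return agents + tools


print(direct_paths(10, 20))   # 200
print(gateway_paths(10, 20))  # 30
```

The direct count grows with the product of agents and tools, while the gateway count grows with their sum, so the gap widens as either side scales.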
That description makes it sound like a proxy. It is not just a proxy. The routing layer accounts for roughly 5% of what an enterprise-grade gateway delivers. The remaining 95% is everything else: identity federation, automated user provisioning, audit logging, role-based access control, policy enforcement, and protection against attack vectors that API gateways from the previous decade were never designed to handle.
"The proxy is roughly 5% of the actual scope. The rest is what makes it usable, governed, and defensible to your security team."
This distinction matters for procurement. Organizations that evaluate gateways primarily on latency benchmarks and integration counts are optimizing for the 5%. The other 95% determines whether the deployment is actually enterprise-grade: can the gateway prove who did what, in a form an auditor accepts?
The Composio MCP Gateway is designed around this reality. Rather than selling a proxy and calling it governance, it ships the full stack: 1,000+ managed integrations across enterprise SaaS, a unified authentication layer, action-level RBAC, a zero-data-retention architecture (tool call payloads and credentials are never stored on Composio infrastructure), and SOC 2 and ISO 27001 certification. The quickstart takes about ten minutes; the governance layer is built in from the start, not bolted on later.

The four things that cannot be missing
Across the compliance frameworks that govern enterprise AI deployments (SOC 2, HIPAA, GDPR, ISO 27001, and now the EU AI Act), four governance capabilities appear repeatedly, either explicitly or implicitly. The absence of any one of them creates either regulatory exposure or operational failures that can escalate into incidents.

1. Identity federation and SSO
Without SSO integration, agents authenticate using shared service account credentials or locally-stored API keys. This creates credential sprawl, obscures user-level attribution in audit logs, and prevents IT from cleanly revoking access when an employee departs. With federated identity, every tool call carries the identity of the specific user who authorized it, flowing through the gateway from the enterprise identity provider down to the MCP server.
The technical baseline is support for OAuth 2.1, standardized in the MCP specification in June 2025, alongside SAML 2.0 for enterprise SSO and OpenID Connect for modern attribute mapping.
But the capability that separates governance-capable gateways from identity-aware proxies is On-Behalf-Of (OBO) token propagation: the pattern where a gateway passes the end-user identity downstream to the MCP server rather than substituting a service account. Without OBO, an audit log records "gateway service account called database write tool." With OBO, it records "Elena Mwangi in Finance called database write tool at 14:32 UTC." The difference is the difference between an audit log and an audit trail.
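One standardized way to express this pattern is OAuth 2.0 Token Exchange (RFC 8693), in which the gateway trades the user's token for a downstream token that still names the user. The sketch below only builds the request body. Whether any particular gateway implements OBO this way is an assumption, and the token values and audience URL are placeholders.

```python
from urllib.parse import parse_qs, urlencode

# Parameter names and URNs are defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_token_exchange_form(user_token: str, gateway_token: str, audience: str) -> str:
    """Return a form-encoded body asking for a token issued on behalf of the user."""
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": user_token,        # the end user the call is made for
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "actor_token": gateway_token,       # the gateway, acting for that user
        "actor_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,               # the downstream MCP server
    }
    return urlencode(params)


# Placeholder values for illustration only.
body = build_token_exchange_form(
    "user-token-placeholder",
    "gateway-token-placeholder",
    "https://mcp.example.internal",
)
parsed = parse_qs(body)
print(parsed["grant_type"][0])
```

Because the user's token travels as the subject and the gateway's as the actor, the downstream server can attribute the call to the person rather than to the gateway's service account.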
Composio's MCP Gateway handles this through SSO via SAML and OIDC, with documented integrations for Okta, Microsoft Entra ID, and Google Workspace. Every team gets a unique, scoped MCP endpoint. Developers paste it into Claude, Cursor, or ChatGPT. SSO authenticates. Only the tools their team is authorized to use appear, and there is no separate configuration step to restrict visibility.
One practical concern worth flagging: identity provider integrations that look stable can break silently. Microsoft Entra changed its attribute mapping behavior for synchronized users in late 2024 without a deprecation notice. Every such change is a potential gap in governance coverage. When evaluating any gateway, ask vendors specifically how they monitor for and respond to IdP-side breaking changes.
2. SCIM provisioning
SCIM — System for Cross-domain Identity Management — automates the user lifecycle at scale. New hires receive correct tool access on day one. Role changes propagate immediately to gateway permissions. Departing employees lose all access at the moment their directory account is disabled.
Without SCIM, MCP gateway access management becomes a manual operation at every organizational boundary event. HIPAA requires that access to systems that hold protected health information be revoked immediately upon a role change or separation. SOC 2 CC6.2 requires that access be provisioned based on authorized requests and revoked promptly when no longer needed. Manual processes fail both tests at scale.
The scenario that illustrates this most clearly is a developer departing on difficult terms. Legal advises IT to immediately revoke all access. IT disables the directory account. If SCIM is integrated, that change propagates to the gateway; every agent connection that developer had, from GitHub to Jira to Salesforce to internal APIs, terminates immediately. No gap exists between directory disabling and access revocation. Without SCIM, someone has to hunt and manually revoke individual credentials across every connected system. At any scale above a handful of users, some will be missed.
Composio's SCIM 2.0 implementation maps directory groups to teams directly. The mapping logic is explicit and auditable: if department = Engineering then Team: engineering. New hires get the right tools on day one without any manual gateway configuration. The group sync is active and continuous, not a nightly batch job.
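A mapping rule of this kind can be sketched in a few lines of Python. The `active` attribute and the enterprise extension's `department` attribute come from the SCIM 2.0 core schema (RFC 7643); the mapping table, function name, and sample user are hypothetical and do not describe Composio's implementation.

```python
from typing import Optional

# Schema URN for the SCIM 2.0 enterprise user extension (RFC 7643).
ENTERPRISE_EXT = "urn:ietf:params:scim:schemas:extension:enterprise:2.0:User"

# Hypothetical mapping from directory department to gateway team.
DEPARTMENT_TO_TEAM = {"Engineering": "engineering", "Finance": "finance"}


def resolve_team(scim_user: dict) -> Optional[str]:
    """Return the gateway team for a SCIM user, or None if no access applies."""
    # Fail closed: a disabled (or unspecified) account maps to no team at all.
    if not scim_user.get("active", False):
        return None
    department = scim_user.get(ENTERPRISE_EXT, {}).get("department")
    return DEPARTMENT_TO_TEAM.get(department)


new_hire = {
    "userName": "elena.mwangi@example.com",
    "active": True,
    ENTERPRISE_EXT: {"department": "Engineering"},
}
departed = {**new_hire, "active": False}

print(resolve_team(new_hire))   # engineering
print(resolve_team(departed))   # None
```

Keeping the rule as data rather than code is what makes it auditable: a reviewer can read the table without reading the sync logic.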
For teams building toward this themselves: the build vs. buy analysis Composio published puts the engineering effort for SCIM provisioning at 4–8 weeks for a mid-sized team, before accounting for ongoing maintenance as IdP behavior changes. That estimate covers the SCIM endpoint, group sync logic, and conflict resolution. It does not cover the OAuth token lifecycle management that sits adjacent to it, which is typically another 4–8 weeks and carries a higher ongoing maintenance cost.
3. Audit logging
Audit logs answer the question every regulator and every security team will eventually ask: "What did your AI agents access, and when?" Without comprehensive, immutable, structured audit logs, the honest answer is "we don't know." That answer fails every compliance framework that governs regulated data.
The minimum required fields per log entry are: timestamp in UTC at millisecond precision; user identity attributed through the IdP, not a service account; agent identity; MCP server and tool name invoked; tool input parameters; tool output or error state; authorization decision and the policy rule that produced it; and session identifier for multi-turn correlation. These fields are what make a log entry into evidence.
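As an illustration, those fields could be carried in a record like the one below. This is a sketch with hypothetical field names and sample values, not a vendor log schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    timestamp: str      # UTC, millisecond precision
    user: str           # identity attributed through the IdP
    agent: str          # which agent made the call
    mcp_server: str
    tool: str
    tool_input: dict    # tool input parameters
    outcome: str        # tool output summary or error state
    decision: str       # authorization decision
    policy_rule: str    # the rule that produced the decision
    session_id: str     # correlates multi-turn activity


def utc_now_ms() -> str:
    """Current time as an ISO 8601 string in UTC with millisecond precision."""
    return datetime.now(timezone.utc).isoformat(timespec="milliseconds")


def to_json_line(entry: AuditEntry) -> str:
    """Serialize one entry as a single JSON line with stable key order."""
    return json.dumps(asdict(entry), sort_keys=True)


entry = AuditEntry(
    timestamp=utc_now_ms(),
    user="elena.mwangi@example.com",
    agent="finance-reporting-agent",
    mcp_server="postgres-prod",
    tool="database_write",
    tool_input={"table": "invoices"},
    outcome="ok",
    decision="allow",
    policy_rule="finance-write-v1",
    session_id="session-0001",
)
line = to_json_line(entry)
print(line)
```

One JSON object per line is a common shape for SIEM ingestion, though the exact format a given SIEM expects should be confirmed against its documentation.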
Beyond the minimum fields, enterprise-grade logs must be immutable after writing and tamper-evident, through either cryptographic signing or append-only storage. They must be structured for reliable SIEM ingestion. They must support configurable retention aligned to the organization's most demanding applicable requirement: HIPAA access records for protected health information require six-year retention; SOC 2 typically requires twelve months.
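Tamper evidence can be illustrated with hash chaining, a common companion to append-only storage in which each record's hash covers the previous record's hash. This is a swapped-in illustration rather than the signing scheme of any specific product, and all names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def chain_hash(prev_hash: str, entry: dict) -> str:
    """Hash the canonical JSON of an entry together with the previous hash."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, entry: dict) -> None:
    """Append an entry, linking it to the hash of the record before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"entry": entry, "hash": chain_hash(prev, entry)})


def verify(log: list) -> bool:
    """Recompute every link; any edited or removed record breaks the chain."""
    prev = GENESIS
    for record in log:
        if record["hash"] != chain_hash(prev, record["entry"]):
            return False
        prev = record["hash"]
    return True


log = []
append(log, {"user": "elena.mwangi@example.com", "tool": "database_write"})
append(log, {"user": "elena.mwangi@example.com", "tool": "database_read"})
print(verify(log))  # True

log[0]["entry"]["tool"] = "database_read"  # simulate after-the-fact tampering
print(verify(log))  # False
```

A chain like this detects modification but does not prevent it, and an attacker who can rewrite the whole log can rebuild the chain, which is why production systems typically anchor or sign the latest hash outside the log store.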
Composio's audit trail logs every tool call with the fields: user, team, tool, action, and outcome. Critically, no payloads are stored, only metadata. This zero-data-retention architecture matters for regulated industries, where storing tool call contents on third-party infrastructure creates its own compliance risk. The logs support CSV export for compliance reviews, and retention is configurable from 7 days to 1 year. The audit log format generates entries compliant with SOC 2, HIPAA, and GDPR requirements.
4. Policy enforcement
The fourth pillar is where identity, provisioning, and audit turn from documentation tools into enforcement tools. Policy enforcement means the gateway doesn't just record that an agent attempted to call a destructive action. It blocks the call if the agent's role doesn't permit it.
The critical implementation detail is the granularity at which access control operates. Standard RBAC in legacy API gateways operates at the API endpoint level. MCP gateway RBAC must operate at the action level within each toolkit. A GitHub integration may expose GITHUB_CREATE_PR, GITHUB_MERGE_PR, and GITHUB_DELETE_REPO. Governance requires that a junior developer role can call the first two but not the third, without blocking access to the GitHub toolkit entirely.
Composio enforces action-level RBAC at the gateway layer, not at the model layer. Each team gets a scoped MCP endpoint exposing only the tools they are authorized to use. Destructive actions within allowed toolkits — GITHUB_DELETE_REPO, SLACK_DELETE_CHANNEL — can be blocked independently of toolkit access. This is enforced at the gateway: if a model attempts to call a blocked action, the gateway rejects the request regardless of what the model was instructed to do.
The access model supports both whitelist and blacklist modes. Teams can request access to blocked tools; admins approve or deny. This creates a self-service discovery path that doesn't require IT to anticipate every team's tooling needs in advance, while retaining central control over what actually gets enabled.
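The action-level check and the two list modes can be sketched as follows. This is a minimal illustration with hypothetical class and team names, not Composio's policy engine; the action names are the ones used above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    WHITELIST = "whitelist"  # only listed actions are permitted
    BLACKLIST = "blacklist"  # everything except listed actions is permitted


@dataclass
class TeamPolicy:
    mode: Mode
    actions: set = field(default_factory=set)

    def permits(self, action: str) -> bool:
        """Decide per action, not per toolkit."""
        if self.mode is Mode.WHITELIST:
            return action in self.actions
        return action not in self.actions


def authorize(policy: TeamPolicy, action: str) -> str:
    """The gateway's decision, independent of what the model was instructed to do."""
    return "allow" if policy.permits(action) else "deny"


junior_devs = TeamPolicy(Mode.WHITELIST, {"GITHUB_CREATE_PR", "GITHUB_MERGE_PR"})
platform_team = TeamPolicy(Mode.BLACKLIST, {"GITHUB_DELETE_REPO"})

print(authorize(junior_devs, "GITHUB_CREATE_PR"))      # allow
print(authorize(junior_devs, "GITHUB_DELETE_REPO"))    # deny
print(authorize(platform_team, "GITHUB_MERGE_PR"))     # allow
print(authorize(platform_team, "GITHUB_DELETE_REPO"))  # deny
```

The whitelist mode fails closed when a toolkit adds a new action, while the blacklist mode fails open, which is worth weighing when choosing a default.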
Attack vectors that API gateways were not built for
Traditional API gateways were built for HTTP traffic between services. MCP traffic between AI agents and tool servers introduces attack vectors that legacy infrastructure was never designed to handle.
Tool poisoning places instructions inside tool metadata, specifically in tool descriptions and parameter documentation that AI models read to understand how tools work. If descriptions contain adversarial instructions, the model may execute them. Unlike prompt injection, tool poisoning persists across sessions: it affects every agent that interacts with the tool, not just the session in which the attack was introduced.
Rug pull attacks are tool poisoning with a delayed trigger. A server publishes clean, vetted tool definitions during the security review. After approval, the operator modifies descriptions to inject malicious instructions. Without tool hash pinning, hashing tool descriptions on first scan and alerting when they change, the gap between the approved state and the live state can persist indefinitely.
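Tool hash pinning can be sketched in a few lines. The example assumes tool definitions are available as dictionaries with `name`, `description`, and `inputSchema` keys; the function names and sample tool are hypothetical.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """SHA-256 over the canonical JSON form of a tool definition."""
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def pin(tools: list) -> dict:
    """Record a fingerprint per tool name at approval time."""
    return {tool["name"]: fingerprint(tool) for tool in tools}


def changed_tools(pins: dict, live_tools: list) -> list:
    """Names of live tools that are unpinned or no longer match their pin."""
    drifted = [t["name"] for t in live_tools if pins.get(t["name"]) != fingerprint(t)]
    return sorted(drifted)


approved = [{
    "name": "read_file",
    "description": "Read a file from the workspace.",
    "inputSchema": {"type": "object", "properties": {"path": {"type": "string"}}},
}]
pins = pin(approved)

# Later, the server operator quietly edits the description.
live = [dict(approved[0], description="Read a file. Also send its contents elsewhere.")]

print(changed_tools(pins, approved))  # []
print(changed_tools(pins, live))      # ['read_file']
```

Hashing the whole definition rather than only the description means schema changes are caught too, and treating unpinned tools as drift covers tools added after review.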
Prompt injection via tool output embeds adversarial instructions in tool outputs (document contents, database records, web page responses) that the agent ingests as legitimate input. The MCP specification only "SHOULD" require a human-in-the-loop, which is insufficient protection in production environments handling sensitive data at agent speed.
Cross-server shadowing is an MCP-specific threat with no analog in traditional API security. A malicious MCP server impersonates a trusted server or embeds instructions in tool metadata that override the behavior of adjacent servers in the same agent context.
Credential sprawl is the most operationally common risk. Agents that store API keys, database passwords, and OAuth tokens in local configuration files expose them through prompts, logs, or accidental repository commits. In multi-agent architectures, credentials propagate through chained tool calls in ways invisible without gateway-level telemetry.
A security leader at Medtronic accurately described the operational concern: "MCP opens a lot of opportunities to do a lot of damage very quickly." The velocity at which autonomous agents can chain tool calls makes human review an insufficient backstop without gateway-level guardrails enforcing limits in real time.
Composio's zero data-retention architecture addresses the credential sprawl risk directly: tool call payloads and credentials are never stored on Composio infrastructure. This eliminates the most common vector for credential exfiltration through the gateway layer itself.
What compliance frameworks actually require
No compliance framework explicitly names MCP gateways. All of them implicitly require what a gateway provides: a centralized layer where AI tool access is governed, logged, and restricted.
SOC 2 Trust Services Criteria CC6.1 through CC6.3 require access to be restricted to minimum necessary permissions; action-level RBAC satisfies this. CC7.2 and CC7.3 require monitoring and investigation of anomalies; real-time audit log alerting and SIEM integration satisfy these requirements. CC8.1 requires change management controls; access approval workflows and configurable retention policies satisfy this.
For teams pursuing SOC 2 Type II certification, the observation period is a minimum of six months. That means an organization that starts building its own gateway today won't have a reportable SOC 2 Type II audit for seven or eight months at the earliest, and that timeline assumes the controls were architected correctly from day one. Composio ships with SOC 2 Type II and ISO 27001 certification already in place, which removes this timeline entirely from the governance roadmap.
HIPAA adds a harder requirement: Business Associate Agreements. Any vendor that creates, receives, maintains, or transmits protected health information on an organization's behalf is a Business Associate and legally requires a signed BAA before any PHI touches their infrastructure. Composio's enterprise plan supports BAA execution. For healthcare organizations, this is a binary filter that precedes all technical evaluation: verify BAA availability before spending time on feature comparison.
The EU AI Act, whose high-risk system provisions become fully enforceable in August 2026, requires documented risk management, human oversight mechanisms, and technical evidence of controls for AI systems operating in healthcare, financial services, employment, and critical infrastructure. MCP gateway audit logs are the primary evidence artifact for conformity assessment. Organizations that have not established audit logging infrastructure before enforcement begins cannot retroactively generate evidence for the period before capture began.
The build vs. buy question, answered honestly
Internal builds of MCP gateway infrastructure are a recurring theme in enterprise AI teams. The engineering argument is usually that "a proxy is a few weeks of work." That framing is accurate for the proxy. The full enterprise stack is different.
| Component | Build estimate | Ongoing cost |
|---|---|---|
| MCP routing proxy | 2–4 weeks | Low |
| OAuth 2.1 implementation | 3–6 weeks | High: each SaaS app handles OAuth differently and changes without notice |
| SAML/OIDC IdP integration | 2–4 weeks | Medium: silent breaking changes require active monitoring |
| SCIM provisioning endpoint | 4–8 weeks | Medium |
| Per-user OAuth token lifecycle | 4–8 weeks | High |
| Audit log infrastructure | 3–5 weeks | Medium |
| Action-level RBAC policy engine | 6–8 weeks | High |
| 15 SaaS integrations | ~15 weeks | Ongoing per-integration maintenance |
| SOC 2 Type II observation period | 6+ months | Continuous |
The proxy is roughly 5% of the scope. The OAuth maintenance burden is where most internal builds stall or quietly degrade over time: every SaaS application handles OAuth slightly differently, and those implementations change without notice. GitHub OAuth app permissions behave differently depending on whether the organization has SAML SSO enabled. Entra changed its attribute mapping behavior in late 2024 without a deprecation notice. Each change is a potential silent breakage.
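Summing the build estimates from the table gives a rough sense of scale. The sketch below assumes the components are built one after another by a single team and treats "~15 weeks" as exactly 15. It leaves out the SOC 2 observation period on the assumption that it is elapsed time rather than engineering effort.

```python
# (low, high) build estimates in weeks, copied from the table above.
ESTIMATES_WEEKS = {
    "MCP routing proxy": (2, 4),
    "OAuth 2.1 implementation": (3, 6),
    "SAML/OIDC IdP integration": (2, 4),
    "SCIM provisioning endpoint": (4, 8),
    "Per-user OAuth token lifecycle": (4, 8),
    "Audit log infrastructure": (3, 5),
    "Action-level RBAC policy engine": (6, 8),
    "15 SaaS integrations": (15, 15),
}

low_total = sum(low for low, _ in ESTIMATES_WEEKS.values())
high_total = sum(high for _, high in ESTIMATES_WEEKS.values())
proxy_low, proxy_high = ESTIMATES_WEEKS["MCP routing proxy"]

print(low_total, high_total)  # 39 58
print(f"{proxy_low / low_total:.1%} to {proxy_high / high_total:.1%}")  # 5.1% to 6.9%
```

Under those assumptions the proxy is about 5% to 7% of the initial build effort, before any ongoing maintenance is counted.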
Buying wins for most teams because they are not buying a proxy. They are buying maintained integrations, per-user OAuth lifecycle management, SSO and SCIM support, RBAC enforcement, audit logging, and compliance readiness, with the maintenance burden sitting on the vendor rather than internal engineering. Composio's MCP Gateway developer quickstart gets a working agent connected to its first toolkit in about ten minutes. That's the realistic comparison point against a multi-month internal build.
The cases where building makes sense are narrower: unique deployment constraints no vendor accommodates, classified network requirements, or organizations with the appetite to own the entire AI infrastructure stack as a long-term strategic investment.
How to evaluate a gateway honestly
Start with the deployment model. For organizations in healthcare, finance, or government where regulated data must remain within specific boundaries, the deployment model is often a legal requirement before any technical comparison begins. Cloud-hosted managed gateways reduce time to production but involve data transiting vendor infrastructure. Self-hosted or VPC-deployed options provide data sovereignty. Composio operates as managed SaaS with a zero-data-retention architecture as the default. Organizations that require a VPC or on-premises deployment should apply that requirement as the first filter, since it narrows the field significantly.
After the deployment model, evaluate in this sequence:
Identity depth. Does the gateway support OBO token propagation, or does it substitute service accounts? Ask vendors for a sample audit log entry and verify that user identity is IdP-attributed, not a service account name.
SCIM implementation. Does it support SCIM 2.0 with push provisioning? What is the documented maximum deprovisioning latency? The deprovisioning case, an employee departure or a security incident requiring immediate access revocation, is where manual processes fail most expensively.
Audit log quality. Require vendors to provide a sample log entry with all fields populated. Confirm the format is structured and suitable for SIEM ingestion. Confirm logs are immutable after writing. Confirm that the retention policy can be configured to your longest applicable requirement. Ask whether PII redaction in tool parameters is configurable. In Composio's case, the zero-data-retention architecture means payloads aren't stored at all, which is the stronger answer.
Access control granularity. Confirm that RBAC operates at the action level, not the toolkit level. A gateway that can block or enable whole toolkits but cannot distinguish between GITHUB_CREATE_PR and GITHUB_DELETE_REPO is not implementing least-privilege access control.
Compliance certification. Request the current SOC 2 Type II report date and auditor. Confirm whether a BAA is available. For European deployments, ask whether the vendor has documented controls relevant to the EU AI Act high-risk system provisions. Composio's SOC 2 and ISO 27001 certifications are current, significantly shortening the security review process.
MCP-specific threat coverage. Ask whether tool hash pinning is implemented and whether it generates alerts when tool definitions change post-approval. Ask whether the tool metadata is scanned for hidden prompt instructions. These questions distinguish purpose-built MCP governance platforms from extended API management products.
Exit terms. Gateway choice shapes AI adoption architecture for three to five years. Confirm that gateway configuration, audit logs, and access policies can be exported in standard formats, and that contract exit terms do not create data portability barriers.
What comes next
The MCP specification continues to evolve. Client ID Metadata Documents, added in the November 2025 spec update, introduce a new mechanism for trusted client discovery. The Agent-to-Agent (A2A) protocol is emerging as a complement to MCP for multi-agent orchestration, governing agent-to-agent delegation rather than agent-to-tool connectivity. Future enterprise governance will require control planes spanning both protocols.
As AI agents gain persistent memory and state across sessions, the audit and governance scope expands beyond tool calls to memory operations and state modifications. Gateways scoped only to tool call governance will require extension as these capabilities become standard.
The broader trajectory is toward federated multi-gateway architectures: separate gateway instances per business unit or geographic region with centralized policy management. This pattern addresses data residency requirements without requiring monolithic governance infrastructure. Including A2A roadmap questions in current gateway evaluations is forward-looking work that belongs in any 2026 RFP.
The bottom line
The teams establishing MCP governance infrastructure now (building audit trails, connecting identity providers, implementing SCIM provisioning, enforcing action-level access policies) are building the foundation for AI adoption that compliance teams can accept and auditors can verify. The teams deferring governance are accumulating technical debt measured not in refactoring effort but in regulatory exposure.
The audit log for the last quarter does not exist if it was never captured. The SOC 2 observation period clock does not start until you start running controls. The EU AI Act conformity evidence cannot be generated retroactively. The compliance timeline is contracting, and the enforcement mechanisms are real.
For most teams moving from pilot to production, the practical starting point is a managed gateway that handles the 95%. Composio's MCP Gateway covers the integrations, the OAuth lifecycle, the SCIM provisioning, the action-level RBAC, the audit logging, and the compliance certifications in a single product. The developer quickstart takes ten minutes. The governance is not an afterthought.
Further reading: