What actually shipped
OpenAI and AWS expanded their partnership this week and pushed three things into limited preview on Bedrock: GPT-5.5 as a first-class Bedrock model; Codex (the CLI, the desktop app, and the VS Code extension) configurable to run against Bedrock-hosted GPT-5.5; and Bedrock Managed Agents powered by OpenAI, which is Amazon's agentic runtime with OpenAI's planning loop underneath. GPT-5.5 itself lands at 82.7% on Terminal-Bench 2.0, the current state of the art for command-line agentic coding.
If you only read the model-release headline you'd file this under "another month, another frontier number." The procurement implication is bigger.
Why this changes the buying conversation
For two years the operating assumption inside large enterprises has been: if we want OpenAI's best models, we ship our data into Azure or out to OpenAI's own API. For an AWS-native shop with a regulated workload — bank, insurer, hospital chain, government contractor — that was a non-starter. Data residency, IAM boundaries, KMS keys, VPC peering, the security review backlog: all of it was built for AWS, and redoing it for Azure to get GPT was a six-figure organizational tax before a single token was ever generated.
GPT-5.5 on Bedrock removes that tax. The model runs inside the AWS account, behind the same IAM policies, the same KMS keys, the same CloudTrail logs the enterprise security team already audits. Codex talks to it through the Bedrock API. Managed Agents inherit Bedrock's existing guardrails, knowledge bases, and agent-action plumbing. From a compliance perspective, calling GPT-5.5 is now indistinguishable from calling Claude or Titan on the same platform.
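To make that concrete, here is a minimal sketch of the call shape, assuming boto3's Converse API. The GPT-5.5 model ID below is hypothetical (the preview's real identifier may differ); everything else is the standard Bedrock runtime call an AWS security team has already reviewed.

```python
import boto3

# Same client, same IAM role, same CloudTrail trail as every other Bedrock model.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def ask(model_id: str, prompt: str) -> str:
    """One call shape for every Bedrock-hosted model; only model_id changes."""
    response = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 1024, "temperature": 0.2},
    )
    return response["output"]["message"]["content"][0]["text"]

# Hypothetical preview identifier; the real GPT-5.5 model ID may differ.
print(ask("openai.gpt-5.5-v1:0", "Summarize the IAM changes in this diff."))
# Swapping vendors is a one-string change, not a new security review.
print(ask("anthropic.claude-3-5-sonnet-20240620-v1:0", "Same question."))
```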
The practical effect: every "we'd love to use GPT-5 but our security team won't let us out of AWS" conversation just ended.
The Codex angle is the under-reported one
The model news got the headline, but Codex on Bedrock is the more interesting product move. Codex is OpenAI's coding harness — it runs the agent loop, manages the workspace, handles tool use, executes shell commands, and orchestrates multi-file edits. Until now it was a managed OpenAI product talking to OpenAI infrastructure.
Codex on Bedrock means a developer can keep their workspace, source code, and execution environment entirely inside an AWS account, point Codex at a Bedrock-hosted GPT-5.5 endpoint, and get the same agentic coding experience without anything leaving the perimeter. For a regulated team that's been watching Cursor and Claude Code from the sidelines because of data egress rules, this is the first credible path to letting agentic coding into the dev environment without rewriting the threat model.
What this means for product teams shipping AI features
Three things to act on:
1. The cloud lock-in argument is dead — design for substitution, not allegiance. A year ago, picking Bedrock implicitly meant Anthropic-or-Titan-or-Cohere as the model layer. Picking Azure meant OpenAI. That mapping just collapsed. The teams that win the next eighteen months will treat the model layer as a substitutable runtime, with prompts, evals, and tool definitions versioned independently of the inference endpoint. If your codebase still has OPENAI_API_KEY hardcoded against api.openai.com, you have a small refactor in your future; a sketch of that routing layer follows this list.
2. Bedrock Managed Agents is now a credible deployment target — but only for the second agent you build, not the first. The first agent any team builds teaches them what their actual workflow needs: which tools, which guardrails, which escalation paths, what "done" means. Trying to learn that on a managed runtime adds a layer of opacity at exactly the moment you most need to see what's happening. Build agent #1 against a thin custom loop (see the second sketch after this list), learn the requirements, then port to Managed Agents when you want the operational story (audit logs, IAM, multi-tenant isolation) handed to you.
3. The premium on "AI-native" engineers who can navigate this multi-cloud, multi-model world is going up, not down. Choosing between GPT-5.5, Claude Opus 4.7, Gemini 3.1, and Cursor Composer 2 for a given task is now a real engineering decision with a real cost-quality-latency frontier. Most teams don't have anyone in-house who has shipped against more than one of those models in production. That gap is the gap we get hired to close.
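First, point 1 in code. A minimal sketch of a substitution layer, with illustrative task names and model IDs (the GPT-5.5 identifier is hypothetical); it reuses the ask helper from the Bedrock example above. Prompts, evals, and tool schemas stay versioned in the repo; the endpoint is one config entry.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelTarget:
    provider: str   # "bedrock", "openai", "azure", ...
    model_id: str   # endpoint-specific identifier

# Versioned in the repo and swapped per environment; IDs are illustrative,
# and the GPT-5.5 identifier is hypothetical.
TARGETS = {
    "codegen":   ModelTarget("bedrock", "openai.gpt-5.5-v1:0"),
    "summarize": ModelTarget("bedrock", "anthropic.claude-3-5-sonnet-20240620-v1:0"),
}

def complete(task: str, prompt: str) -> str:
    """Route a task to whatever endpoint currently backs it."""
    target = TARGETS[task]
    if target.provider == "bedrock":
        return ask(target.model_id, prompt)  # the Bedrock helper sketched earlier
    raise NotImplementedError(f"no client wired up for {target.provider}")
```

The particular shape doesn't matter; what matters is that swapping the model behind "codegen" touches one config entry, while the prompts and evals that define "codegen" don't move.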
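Second, point 2. The thin custom loop really is thin. This sketch uses Bedrock's Converse tool-use API with a single stand-in tool; the sandboxing, guardrails, and escalation paths it deliberately omits are exactly the requirements agent #1 exists to surface.

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# One stand-in tool; a real agent's tool list is exactly what you're trying to learn.
TOOLS = {"tools": [{"toolSpec": {
    "name": "read_file",
    "description": "Read a file from the workspace.",
    "inputSchema": {"json": {"type": "object",
                             "properties": {"path": {"type": "string"}},
                             "required": ["path"]}},
}}]}

def run_tool(name: str, args: dict) -> str:
    # Unsandboxed on purpose for the sketch; sandboxing is requirement #1 in real life.
    if name == "read_file":
        with open(args["path"]) as f:
            return f.read()
    raise ValueError(f"unknown tool {name}")

def agent(model_id: str, goal: str, max_turns: int = 10) -> str:
    messages = [{"role": "user", "content": [{"text": goal}]}]
    for _ in range(max_turns):
        resp = bedrock.converse(modelId=model_id, messages=messages, toolConfig=TOOLS)
        msg = resp["output"]["message"]
        messages.append(msg)
        if resp["stopReason"] != "tool_use":
            return msg["content"][0]["text"]  # model finished; return its answer
        # Execute every requested tool call and feed the results back in.
        results = [{"toolResult": {
                        "toolUseId": block["toolUse"]["toolUseId"],
                        "content": [{"text": run_tool(block["toolUse"]["name"],
                                                      block["toolUse"]["input"])}]}}
                   for block in msg["content"] if "toolUse" in block]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("agent hit turn limit without finishing")
```

Thirty lines of loop is enough to learn which tools, guardrails, and stop conditions your workflow actually needs before you hand the runtime to Managed Agents.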
Sonnet Code's take
We build for clients who want their AI features inside their existing cloud account, under their existing security controls, with someone in-house who actually understands which model is the right tool for which job. The OpenAI-on-Bedrock launch makes that posture available to AWS-first organizations that were previously stuck choosing between the model they wanted and the security perimeter they had. If your team is staring at a regulated workload and trying to figure out how to get GPT-5.5 inside the account without three months of compliance work, that's the conversation we're best at.
We also run AI training engagements — RLHF data, evals, red-teaming, expert prompting — for teams whose internal benchmarks are now the bottleneck, not the model. When the same prompt routes to four frontier models depending on the request, you need a much sharper internal eval suite than you needed when there was one model. We staff that work.
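Concretely, a sharper eval suite means the same golden set runs against every candidate endpoint before the router is allowed to send it traffic. A toy sketch, reusing the ask helper from the first example; the model IDs are illustrative and the grader is a placeholder for the real rubric work:

```python
# Toy golden set: a prompt plus a grader. Real suites have hundreds of both,
# and writing honest graders is the hard part.
GOLDEN_SET = [
    ("Which AWS service hosts the model?", lambda out: "bedrock" in out.lower()),
]

CANDIDATES = [
    "openai.gpt-5.5-v1:0",                        # hypothetical preview ID
    "anthropic.claude-3-5-sonnet-20240620-v1:0",
]

for model_id in CANDIDATES:
    passed = sum(grader(ask(model_id, prompt)) for prompt, grader in GOLDEN_SET)
    print(f"{model_id}: {passed}/{len(GOLDEN_SET)} passed")
```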