
Multi-Cloud Architecture: The $110B AWS & OpenAI Deal

Lucas Hayes
Tags: AWS Bedrock, OpenAI Frontier, stateful runtime, DevOps testing, Keycloak IAM

The $110B multi-cloud architecture deal between OpenAI and AWS is the biggest infrastructure earthquake since Kubernetes 1.0. If you are building enterprise applications, your deployment strategy just fractured into two distinct paths. You need to pay attention to this shift immediately.

OpenAI just secured $110B in funding, with Amazon dropping $50B to become the exclusive third-party distributor for Frontier. This creates a massive architectural split in how we deploy models. Azure keeps the stateless APIs, while AWS gets the stateful runtime environments via Amazon Bedrock.

In my view, this changes how we design cloud systems. You are no longer just picking a cloud provider based on pricing. You are picking a fundamental application architecture.

The Great Stateless vs. Stateful Divide

For the past few years, building with OpenAI meant managing your own state. You sent a prompt to an Azure endpoint, got a response, and the API immediately forgot who you were. If you wanted memory, you had to build it yourself.

I've spent weeks configuring massive Redis clusters and vector databases just to maintain user context across sessions. It is brutal, expensive engineering overhead. You are constantly fighting latency, data serialization, and context window limits.
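To make that overhead concrete, here is a minimal sketch of the bookkeeping a stateless API pushes onto the client. Everything in it is an assumption for illustration: `SessionStore` is a `Map`-backed stand-in for something like Redis, the character budget is a crude proxy for a token limit, and none of the names come from an OpenAI or Azure SDK.

```typescript
// Illustrative only: all names are hypothetical, not an OpenAI or Azure SDK API.
type Role = 'user' | 'assistant';

interface Message {
  role: Role;
  content: string;
}

// Stand-in for an external session store such as Redis.
class SessionStore {
  private sessions = new Map<string, Message[]>();

  load(sessionId: string): Message[] {
    return this.sessions.get(sessionId) ?? [];
  }

  save(sessionId: string, history: Message[]): void {
    this.sessions.set(sessionId, history);
  }
}

// Crude context budget: keep the newest messages whose combined length
// fits within maxChars. The newest message is always kept.
function trimToBudget(history: Message[], maxChars: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  for (const msg of [...history].reverse()) {
    used += msg.content.length;
    if (kept.length > 0 && used > maxChars) break;
    kept.unshift(msg);
  }
  return kept;
}

// Builds the payload a stateless endpoint needs on every call:
// the stored prior turns plus the new prompt, trimmed to the budget.
function buildRequest(store: SessionStore, sessionId: string, prompt: string, maxChars: number): Message[] {
  const history: Message[] = [...store.load(sessionId), { role: 'user', content: prompt }];
  return trimToBudget(history, maxChars);
}

// Persists both sides of an exchange so the next call can replay them.
function recordTurn(store: SessionStore, sessionId: string, prompt: string, reply: string): void {
  store.save(sessionId, [
    ...store.load(sessionId),
    { role: 'user', content: prompt },
    { role: 'assistant', content: reply },
  ]);
}
```

Every request has to go through `buildRequest`, and every response through `recordTurn`. In a real system the `Map` becomes a networked store, which is where the latency and serialization costs come from.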

AWS just eliminated that entire headache. By bringing Frontier to Amazon Bedrock, AWS is offering native stateful runtimes. The model maintains memory, context, and identity across your workflows without you writing a single line of state-management code.

[Figure: An enterprise app branches to two paths. Azure (stateless): direct API calls, client manages context. AWS (stateful): Amazon Bedrock, Frontier manages memory.]

The Engineering Reality

This territorial split forces you to make a hard choice on day one of your project. If you want raw, unopinionated compute, you stick with Azure. If you are building complex agents that need to remember things for months, AWS is your new home.

Here is exactly how the two approaches stack up for enterprise workloads:

| Feature | Azure OpenAI | AWS Bedrock (Frontier) |
|---|---|---|
| Architecture | Stateless REST APIs | Stateful Runtime |
| Memory Management | Client-side (Redis, PgVector) | Native (Managed by Frontier) |
| Best For | Single-shot queries, data processing | Long-running agents, complex workflows |
| Infrastructure Cost | High (Requires external DBs) | Low (Included in runtime) |
| Vendor Lock-in | Low (Standard API calls) | High (Tied to AWS Bedrock ecosystem) |

Make no mistake, AWS is playing a brilliant game here. By owning the state, they own the stickiness of your application. Once your enterprise agents have six months of memory stored in Bedrock, migrating away will be nearly impossible.

Developers Are Finally Writing Tests (For Real)

While the cloud giants fight over infrastructure, the day-to-day reality of DevOps testing is shifting fast. A new global survey from Perforce just revealed that 53% of developers are now authoring tests directly.

I've been in this industry for 12 years. Getting developers to write comprehensive tests used to require threats, bribes, and strict CI/CD gates. Now, they are doing it voluntarily.

Why the sudden change? Modern tooling has finally caught up to our microservice sprawl. When you have 50 different services talking to each other, you simply cannot rely on a separate QA team to catch integration bugs.

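// Modern shift-left testing looks like this.
// Developers mock the stateful runtime directly in their suites.
// NOTE: `createSession`, `contextWindow`, and `memoryState` are hypothetical
// names used for illustration; they are not taken from the AWS SDK.
import { describe, it, expect, vi } from 'vitest';

// Hypothetical stand-in for a stateful agent client, mocked for local tests.
const agent = {
  createSession: vi.fn(async ({ userId }: { userId: string }) => ({
    userId,
    contextWindow: 8192,
    memoryState: 'PERSISTENT',
  })),
};

describe('Frontier Agent State', () => {
  it('should maintain user context across requests', async () => {
    const session = await agent.createSession({ userId: 'user-123' });

    expect(agent.createSession).toHaveBeenCalledWith({ userId: 'user-123' });
    expect(session.contextWindow).toBeGreaterThan(0);
    expect(session.memoryState).toBe('PERSISTENT');
  });
});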

If your organization still separates "developers" from "testers," you are falling behind. You need to push testing responsibilities completely to the left. Give your engineers the tools to test stateful architectures locally before they ever hit a staging environment.
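One way to test stateful behavior locally is an in-memory fake that actually persists data between sessions, so a test can prove that something written in one session is readable in the next. The sketch below is my own assumption of what such a fake could look like. `FakeStatefulRuntime`, `openSession`, and `send` are hypothetical names, not a Bedrock or Frontier API.

```typescript
// Hypothetical interface: a guess at the minimum a stateful runtime exposes.
interface StatefulSession {
  // Records a message and returns everything remembered for this user so far.
  send(message: string): string[];
}

// In-memory fake: remembers every message per user, across sessions.
class FakeStatefulRuntime {
  private memory = new Map<string, string[]>();

  openSession(userId: string): StatefulSession {
    // Reuse the user's existing memory, or start a fresh one.
    const remembered = this.memory.get(userId) ?? [];
    this.memory.set(userId, remembered);
    return {
      send: (message: string): string[] => {
        remembered.push(message);
        return [...remembered]; // copy, so callers cannot mutate stored state
      },
    };
  }
}
```

A test can open one session, send a message, open a second session for the same user, and assert that the first message is still there. It should also assert that a different user sees none of it.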

Identity is Now Foundational Infrastructure

With workloads splitting across Azure and AWS, managing access is becoming a nightmare. This brings me to the recent announcements surrounding KeycloakCon Europe 2026.

The CNCF is finally acknowledging what platform engineers have known for years. Identity and Access Management (IAM) is no longer an application-level concern. It is foundational infrastructure, just like your network overlay or your storage classes.

When I deployed Keycloak at scale last year, we stopped treating it as a simple login box. We integrated it directly into our Kubernetes clusters as a global trust domain.

[Figure: Keycloak IAM core acting as a global trust domain, connected to a K8s cluster in the EU (OIDC auth), a K8s cluster in the US (OIDC auth), and edge agents (verifiable identity).]

The Multi-Cluster Challenge

As workloads span AWS Bedrock and Azure APIs, your IAM strategy must evolve. You cannot have fragmented identity silos. You need a unified trust domain that handles machine-to-machine authentication seamlessly.

Keycloak 24.0 makes this incredibly straightforward. By leveraging verifiable identity models and agent-based systems, you can secure cross-cloud communication without exposing static credentials. If you are still passing long-lived API keys between your Azure and AWS environments, you are asking for a security breach.
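As a rough illustration of replacing long-lived keys with short-lived tokens, the sketch below builds an OAuth 2.0 client-credentials request against a Keycloak realm's OpenID Connect token endpoint and decides when a cached token is due for refresh. The host, realm, client ID, and the 30-second refresh margin are placeholder assumptions, and the HTTP call itself is left out so the logic stays testable without a network.

```typescript
// Assumption: base URL, realm, and client values are placeholders for your own setup.
interface TokenRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Builds a client-credentials request for a Keycloak realm's token endpoint
// (/realms/{realm}/protocol/openid-connect/token).
function buildTokenRequest(baseUrl: string, realm: string, clientId: string, clientSecret: string): TokenRequest {
  const pairs: Array<[string, string]> = [
    ['grant_type', 'client_credentials'],
    ['client_id', clientId],
    ['client_secret', clientSecret],
  ];
  // Form-encode each pair; secrets may contain characters that need escaping.
  const body = pairs
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&');
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/realms/${encodeURIComponent(realm)}/protocol/openid-connect/token`,
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body,
  };
}

// Returns true once a cached token is within skewSec seconds of expiring.
// expiresInSec corresponds to the expires_in field of the token response.
function needsRefresh(issuedAtMs: number, expiresInSec: number, nowMs: number, skewSec = 30): boolean {
  return nowMs >= issuedAtMs + (expiresInSec - skewSec) * 1000;
}
```

POST the request with whatever HTTP client you already use, cache the returned `access_token`, and call `needsRefresh` before each cross-cloud request. The only secret left to manage is the client secret, which should live in a secrets manager rather than in code.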

What You Should Do Next

Stop treating cloud providers as interchangeable compute platforms. The AWS and Azure split forces you to make deliberate architectural decisions today.

  • Audit your current workloads: Identify which applications require persistent memory and which only need stateless processing.
  • Evaluate Amazon Bedrock: If you are building complex agents, start testing Frontier's stateful runtime on AWS. It could save you hundreds of hours of database management.
  • Shift testing left: Mandate that your developers write integration tests for these new stateful APIs. Do not rely on manual QA.
  • Centralize IAM: Deploy a unified identity provider like Keycloak across your multi-cloud environments. Treat identity as core infrastructure.
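The audit step in the list above can start as something very small. The sketch below sorts a workload inventory into the two buckets this article describes. The `Workload` fields and the classification rule are my own assumptions, so adjust them to whatever your inventory actually records.

```typescript
// Hypothetical inventory shape; field names are assumptions for illustration.
interface Workload {
  name: string;
  needsCrossSessionMemory: boolean; // must remember users between sessions
  longRunning: boolean;             // agent or workflow that spans many calls
}

type Placement = 'stateless-api' | 'stateful-runtime';

// Simple rule of thumb: anything that needs memory or runs long is stateful.
function classify(workload: Workload): Placement {
  return workload.needsCrossSessionMemory || workload.longRunning
    ? 'stateful-runtime'
    : 'stateless-api';
}

// Groups workload names by where they would most naturally run.
function audit(workloads: Workload[]): Record<Placement, string[]> {
  const result: Record<Placement, string[]> = {
    'stateless-api': [],
    'stateful-runtime': [],
  };
  for (const workload of workloads) {
    result[classify(workload)].push(workload.name);
  }
  return result;
}
```

Run it over a list exported from your service catalog and review the `stateful-runtime` bucket first, since that is where the lock-in trade-off applies.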

Frequently Asked Questions

Why did OpenAI split its distribution between AWS and Azure?
It is a strategic move to maximize enterprise adoption. Azure remains the home for traditional, stateless API calls, while AWS provides the specialized infrastructure needed for stateful, long-running agent workflows via Frontier.

What is a stateful runtime environment?
Unlike a standard REST API that forgets you after every call, a stateful runtime maintains memory and context. It remembers previous interactions, managing the data persistence layer automatically without requiring external databases like Redis.

How does this impact vendor lock-in?
Using AWS Bedrock for stateful workloads significantly increases vendor lock-in. Because AWS manages the memory and context of your agents natively, migrating that historical state to another provider later becomes highly complex.

Why is Keycloak recommended for multi-cloud setups?
Keycloak provides a centralized, open-source Identity and Access Management (IAM) solution. It allows you to establish a single global trust domain across different cloud providers, securing machine-to-machine communication without relying on fragmented, cloud-specific IAM tools.
