☁️ Cloud & DevOps

Backstage vs kro: Choosing Platform Engineering Tools

Marcus Cole
Cloud & DevOps Lead

Platform engineer who's been through every infrastructure era — bare metal, VMs, containers, serverless. Has strong opinions about YAML files and even stronger opinions about over-engineering.

internal developer platform · Kubernetes resource orchestrator · Backstage developer portal · cloud native infrastructure

The Reality Check: The Platform Monolith

It's 3:00 AM. Your pager is screaming. You drag yourself to your monitor, eyes burning, only to find that your core application is fine. The Kubernetes clusters are healthy. The database is humming.

So why are you awake?

Because the Internal Developer Platform (IDP)—the shiny, complex abstraction layer your team spent six months building to "make things easier for developers"—has crashed. A misconfigured UI plugin caused an out-of-memory error, and now no one can deploy code.

We've spent the last decade running away from monolithic applications, breaking them down into microservices, only to accidentally build a massive, fragile monolith out of our infrastructure tooling. We wanted to improve developer experience, but in our rush to adopt trendy platform engineering tools, we traded infrastructure complexity for software maintenance complexity.

The latest Q1 2026 CNCF Technology Radar just dropped, and it highlights a clear trend: organizations are desperately trying to standardize their platforms. Tools like Helm, Backstage, and kro have moved firmly into the "Adopt" category. This push is largely driven by the arrival of heavy, stateful workloads—like the newly accepted CNCF Sandbox project llm-d, which treats distributed inference as a first-class Kubernetes workload.

But as we prepare our clusters for these massive new infrastructure demands, we have to ask ourselves: how should we actually deliver these capabilities to our developers? Do we build a heavy web portal, or do we simplify the underlying API?

The Core Problem: Misunderstanding the Bottleneck

The real bottleneck in modern infrastructure isn't a lack of tools. It's the cognitive load required to maintain the abstraction layer itself.

When developers complain that "Kubernetes is too hard," our instinct is to hide it behind a graphical interface. But technology is just a tool for solving problems, and hiding a problem doesn't solve it. When you build a heavy UI to mask infrastructure, you now have two problems: the underlying infrastructure, and the UI that inevitably falls out of sync with it.

The Restaurant Kitchen Analogy

Think of your infrastructure like a high-end restaurant kitchen. Your developers are the waiters taking orders, and Kubernetes is the kitchen staff cooking the meals.

Building a heavy developer portal (like Backstage) is like installing a massive, custom-built touchscreen ordering system in the dining room. Yes, the waiters can tap pictures of food to send orders to the kitchen. But when the menu changes, or the touchscreen breaks, you need a dedicated IT technician just to fix the menu board before anyone can eat.

Using a native resource orchestrator (like kro) is like giving the waiters a standardized, simple shorthand notepad. It's not a flashy touchscreen, but it's a universal language the kitchen instantly understands. It requires zero maintenance, it never crashes, and it gets the order to the chef exactly the same way.

Under the Hood: No Fluff, Just Architecture

Before we rely on a tool's magic, we need to understand what's happening underneath. Let's look at how these two distinct approaches to platform engineering actually function in production.

Backstage: The UI Monolith

Backstage is a developer portal originally created by Spotify. Under the hood, it is not an infrastructure tool; it is a full-stack web application.

It runs on Node.js and React. It relies heavily on a Software Catalog, usually backed by a PostgreSQL database, which tracks all your services, APIs, and ownership metadata. To make Backstage do anything useful—like scaffolding a new service or viewing deployment status—you have to write or install TypeScript plugins.
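For context, a catalog entry is just a small YAML file checked into a service's repository. A minimal sketch (the component name, owner, and annotation values are illustrative):

```yaml
# catalog-info.yaml — registers a service in the Backstage Software Catalog
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payments-api
  description: Handles payment processing
  annotations:
    github.com/project-slug: acme/payments-api  # links the entry to its repo
spec:
  type: service
  lifecycle: production
  owner: team-payments
```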

When a developer clicks "Create Service" in Backstage, the Node.js backend executes a script, talks to your Git provider to create a repository, and maybe triggers a CI/CD pipeline. Backstage itself doesn't manage the state of your infrastructure; it just triggers external systems.
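That "Create Service" button is backed by a Software Template, which the scaffolder backend executes step by step. A rough sketch — `fetch:template` and `publish:github` are standard scaffolder actions, while the owner, skeleton path, and parameter names are hypothetical:

```yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: new-service
  title: Create a New Service
spec:
  type: service
  parameters:
    - title: Service details
      required: [name]
      properties:
        name:
          type: string
  steps:
    # Copy a skeleton repo layout, substituting template variables
    - id: fetch
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
    # Create the Git repository via the configured GitHub integration
    - id: publish
      action: publish:github
      input:
        repoUrl: github.com?owner=acme&repo=${{ parameters.name }}
```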

kro: The API Native

kro (Kube Resource Orchestrator) takes the opposite approach. It has no graphical interface.

Under the hood, kro is a Kubernetes controller that allows you to create your own Custom Resource Definitions (CRDs) without writing a single line of Go code. It uses CEL (Common Expression Language) to map a simple, high-level custom resource into multiple complex, low-level Kubernetes resources.

When a developer applies a kro resource, they are talking directly to the Kubernetes API. The Kubernetes control plane handles the state, the retries, and the drift reconciliation.

Backstage vs kro: Side-by-Side Analysis

Let's break down how these two approaches compare across the criteria that actually matter at 3 AM.

1. Architectural Complexity & Maintenance

Backstage requires a dedicated team of software engineers. You are maintaining a React frontend, a Node backend, a database, and a fleet of custom plugins. If your infrastructure team doesn't know TypeScript, you are going to have a bad time. kro is a single binary running inside your cluster. It leverages the existing Kubernetes API. The maintenance burden is practically zero compared to a full-stack web app.

2. Developer Experience (DX)

Backstage shines if your developers absolutely refuse to touch the command line. It provides a beautiful, unified "single pane of glass" where they can see their documentation, pager alerts, and service catalogs in one place. kro provides a GitOps-native developer experience. Developers still write YAML, but instead of writing 500 lines of complex Deployment, Service, and Ingress manifests, they write 15 lines of a custom Microservice resource that kro expands automatically.
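A sketch of what such a high-level resource might look like — the Microservice kind and its fields are hypothetical, standing in for whatever abstraction your platform team defines:

```yaml
apiVersion: platform.example.com/v1
kind: Microservice
metadata:
  name: checkout
spec:
  image: ghcr.io/acme/checkout:1.4.2
  port: 8080
  replicas: 3
  ingress:
    host: checkout.internal.example.com
```

Everything else — the Deployment, Service, Ingress, and security defaults — is expanded server-side from this one resource.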

3. Handling Complex Workloads (The llm-d Factor)

As we bring stateful, latency-sensitive workloads into our clusters—like the distributed inference models managed by the new llm-d project—the underlying API changes rapidly. With Backstage, every time a new workload type is introduced, someone has to write a new UI plugin to support it. With kro, you simply define a new ResourceGraphDefinition mapping, and developers can immediately start provisioning the new workload via the native Kubernetes API.

4. Time to Value

Backstage projects are notorious for taking 6 to 12 months before developers see any real value. It requires massive organizational buy-in. kro can be installed in five minutes, and you can give developers a simplified API endpoint by the end of the day.

Comparison Overview

Feature/Criteria   | Backstage (Developer Portal)          | kro (Resource Orchestrator)
Primary Interface  | Web UI (React)                        | Kubernetes API / CLI
Core Language      | TypeScript / Node.js                  | Go (under the hood) / YAML
State Management   | External (Git, CI/CD, K8s)            | Native Kubernetes control plane
Setup Time         | Months                                | Minutes
Best For           | Large orgs needing a service catalog  | Teams wanting to simplify K8s manifests
Maintenance Cost   | High (requires dedicated developers)  | Low (standard K8s operator)

The Pragmatic Solution

Platform Engineering Decision Tree

What is your primary goal?
  Unified UI & catalog → Do you have a TypeScript team?
    Yes → Backstage
    No  → Reconsider
  Simplify K8s YAML → GitOps & API driven?
    Yes → kro / Helm

The best code is code you don't write. If you are a small to medium-sized team, do not start by building a Backstage portal. You are taking on massive operational overhead for a cosmetic improvement.

Start with the API. Use tools like kro or Helm to create sensible, secure defaults for your developers.

Why we use kro (Before the YAML)

Let's say you want developers to deploy a standard web service. Normally, they need to understand Deployments, Services, ServiceAccounts, and Ingress routing. If they misconfigure the Ingress, they can take down routing for other apps.

Instead of building a web UI to hide this, we use kro to define a ResourceGraphDefinition. This tells the Kubernetes API: "When a developer asks for a SimpleApp, automatically generate the Deployment and Service with our corporate security standards applied."

Here is what that looks like in practice:

apiVersion: kro.run/v1alpha1
kind: ResourceGraphDefinition
metadata:
  name: simpleapps.briefstack.com
spec:
  schema:
    apiVersion: briefstack.com/v1
    kind: SimpleApp
    spec:
      image: string
      port: integer
  resources:
    - id: deployment
      template:
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: ${schema.metadata.name}
        spec:
          replicas: 2
          selector:
            matchLabels:
              app: ${schema.metadata.name}
          template:
            metadata:
              labels:
                app: ${schema.metadata.name}
            spec:
              containers:
                - name: app
                  image: ${schema.spec.image}
                  ports:
                    - containerPort: ${schema.spec.port}
    - id: service
      template:
        apiVersion: v1
        kind: Service
        metadata:
          name: ${schema.metadata.name}
        spec:
          selector:
            app: ${schema.metadata.name}
          ports:
            - port: 80
              targetPort: ${schema.spec.port}

Once this is applied to your cluster, your developers never have to look at a complex Deployment manifest again. They just submit a tiny SimpleApp manifest. You've achieved the goal of an Internal Developer Platform—reducing cognitive load—without deploying a single web server or writing a line of TypeScript.
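For illustration, the entire manifest a developer submits might look like this — the app name and image are placeholders, while the group and kind match the schema defined above:

```yaml
apiVersion: briefstack.com/v1
kind: SimpleApp
metadata:
  name: hello-web
spec:
  image: nginx:1.27
  port: 80
```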

Only when your organization grows to the point where service discovery (finding out who owns what API) becomes a bigger problem than deployment complexity, should you consider adopting Backstage.

Frequently Asked Questions

Can I use Backstage and kro together?
Absolutely. In fact, this is the ideal end-state for enterprise organizations. You use kro to simplify the underlying Kubernetes API, and then you use Backstage as the catalog and trigger mechanism. Backstage simply commits the lightweight kro YAML to your GitOps repository.

Does kro replace Helm?
Not entirely. Helm is a client-side templating engine (it renders YAML before sending it to the cluster). kro is a server-side orchestrator (the cluster itself understands the custom resource). Helm is great for packaging third-party apps; kro is better for defining internal platform abstractions.

How do these tools handle AI workloads like llm-d?
AI serving requires highly stateful, hardware-aware orchestration (like managing GPU nodes and KV caches). kro allows platform teams to wrap these complex new llm-d resources into simple APIs for data scientists, while Backstage would require custom plugin development to visualize these specific AI infrastructure metrics.
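To make the Helm distinction concrete, here is a sketch of a chart template fragment — `helm install` expands the Go templating on the client into plain YAML before anything reaches the API server (the value names are illustrative):

```yaml
# templates/deployment.yaml — rendered client-side by Helm
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-web
spec:
  replicas: {{ .Values.replicas | default 2 }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}-web
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}-web
    spec:
      containers:
        - name: web
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
```

The cluster only ever sees the rendered Deployment; with kro, by contrast, the cluster itself owns the high-level abstraction.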

The Takeaway

Stop trying to hide your infrastructure behind fragile web interfaces, and start simplifying the abstractions at the API level. There is no perfect system. There are only recoverable systems.
