☁️ Cloud & DevOps

Mastering Kubernetes Virtual Clusters to Cut Costs

Marcus Cole
Cloud & DevOps Lead

Platform engineer who's been through every infrastructure era — bare metal, VMs, containers, serverless. Has strong opinions about YAML files and even stronger opinions about over-engineering.

Kubernetes multi-tenancy · cluster sprawl · vcluster tutorial · DevOps infrastructure

Let's start with a reality check. If you manage infrastructure for a growing engineering organization, you've likely received a 3 AM PagerDuty alert because someone broke a shared cluster. Maybe Team A deployed a Custom Resource Definition (CRD) that conflicted with Team B's workload. Maybe someone accidentally deleted a shared ingress controller because they had a bit too much RBAC privilege.

To prevent this, the industry's knee-jerk reaction has been to give everyone their own cluster. Need a dev environment? Spin up an EKS cluster. Need to test a new operator? Spin up an AKS cluster.

Before you know it, you are staring at a cloud bill that has ballooned by tens of thousands of dollars. You are paying a massive "hidden tax" just for control planes—the API servers, etcd databases, and controller managers that sit idle 90% of the time. You haven't solved the problem; you've just traded complexity for cluster sprawl.

We need a pragmatic middle ground. We need Kubernetes virtual clusters.

The Core Problem: The Multi-Tenancy Mirage

The fundamental bottleneck in Kubernetes infrastructure isn't the compute power; it's isolation.

Kubernetes was not originally designed for hard multi-tenancy. We try to use Namespaces to separate teams, but Namespaces are just a polite fiction. They are like roommates sharing an apartment. You can put tape on the floor to divide the living room, but you still share the same front door, the same plumbing, and the same thermostat. In Kubernetes terms, you share the same API server, the same cluster-scoped resources (like CRDs and ClusterRoles), and the same etcd backend.

When you need true isolation—where developers can have cluster-admin rights without taking down the company—Namespaces fail. But provisioning dedicated physical clusters for every team is financially irresponsible and an operational nightmare to patch and upgrade.
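You can see the shared-surface problem directly from any cluster you have kubectl access to: CRDs and ClusterRoles are cluster-scoped, so nothing a Namespace does can hide them from other tenants. A quick check (read-only; works against any cluster):

```shell
# List every resource type that is cluster-scoped.
# None of these can be partitioned by Namespace -- all tenants share them.
kubectl api-resources --namespaced=false

# CRDs in particular are global: a CRD installed by one team
# changes the API surface for every other team on the cluster.
kubectl get crds
```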

Under the Hood: The Syncer Pattern

Before we deploy anything, you need to understand what a virtual cluster actually is. We aren't relying on magic here; we are relying on a very clever architectural pattern.

Think of a major shipping harbor. If every logistics company built their own private harbor, it would be a massive waste of concrete and coastline. Instead, they share the physical docks and cranes (the worker nodes), but each company gets their own private ledger and dispatch office (the control plane).

A virtual cluster (often implemented via tools like vcluster) runs a lightweight Kubernetes API server and a datastore (like SQLite or a minimal etcd) inside a single pod in a Namespace of your underlying "Host" cluster.

When a developer talks to the virtual cluster, they are talking to this isolated API server. They can create CRDs, Namespaces, and ClusterRoles all day long. The Host cluster knows nothing about this.

But pods need actual Linux nodes to run on. The virtual cluster doesn't have nodes. This is where the Syncer comes in. The Syncer is a small controller that watches the virtual cluster. When you ask the virtual cluster to run a Pod, the Syncer translates that request and quietly schedules the Pod onto the Host cluster's worker nodes.

Here is how the traffic and scheduling flow:

Host Kubernetes Cluster
├── Host Control Plane: API Server, etcd, Scheduler
└── Host Namespace "team-a-vcluster"
    ├── Virtual Control Plane: vAPI Server, SQLite / k3s, the Syncer
    └── Synced Pods: the Syncer requests scheduling, and the Pods run on the Host's worker nodes

The beauty of this is simplicity. The Host cluster only sees standard Pods. The developer only sees their private Kubernetes API.

Let's build it.


Prerequisites

Before we start, you will need:
1. A running Kubernetes cluster (this will be our "Host"). A local kind or minikube cluster is perfect for this tutorial.
2. kubectl installed and configured to talk to your Host cluster.
3. helm installed (we use Helm because managing raw YAML manifests for complex controllers is a fast track to operational debt).
4. The vcluster CLI installed (optional but highly recommended for context switching).
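If you don't have the vcluster CLI yet, one way to install it on Linux/amd64 is to download the release binary directly (the exact asset name depends on your OS and architecture; check the vcluster docs for your platform):

```shell
# Download the latest vcluster CLI release binary (Linux/amd64 shown).
curl -L -o vcluster \
  "https://github.com/loft-sh/vcluster/releases/latest/download/vcluster-linux-amd64"

# Make it executable and move it onto your PATH.
chmod +x vcluster
sudo mv vcluster /usr/local/bin/vcluster

# Sanity check.
vcluster --version
```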

Step 1: Prepare the Host Environment

We don't just throw things into the default namespace. Isolation starts with proper boundaries on the Host cluster. We need a dedicated namespace where our virtual cluster's control plane will live.

Why? Because if we ever need to tear down the virtual cluster, we want to simply delete the namespace and let Kubernetes garbage collection handle the rest. No lingering resources, no orphaned secrets.

# Create the namespace for our virtual cluster
kubectl create namespace team-a-vcluster

Step 2: Deploy the Virtual Control Plane

We are going to use Helm to deploy the virtual cluster. Before running the command, let's understand what the Helm chart is doing. It will pull down a StatefulSet containing the k3s binary (a certified, lightweight Kubernetes distribution by SUSE). This StatefulSet acts as the API server and datastore for our virtual cluster.

We will name our virtual cluster my-vcluster.

# Add the loft repository which hosts the vcluster chart
helm repo add loft https://charts.loft.sh
helm repo update

# Install the virtual cluster
helm upgrade --install my-vcluster loft/vcluster \
  --namespace team-a-vcluster

Note: In a production environment, you would pass a values.yaml file here to configure resource limits, storage classes, and node selectors to ensure the virtual control plane doesn't consume all your Host cluster's memory.
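As a sketch of what that production values.yaml might look like, here is a minimal example capping the virtual control plane's resource usage. The exact key names vary between vcluster chart versions (the layout changed significantly in v0.20), so treat this as illustrative and check the values reference for the chart version you installed:

```yaml
# values.yaml -- illustrative only; key layout matches older (pre-v0.20) charts.
# Caps the Syncer and the virtual API server so the control plane
# cannot starve the Host cluster of memory.
syncer:
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
vcluster:
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
```

You would then pass it with `helm upgrade --install my-vcluster loft/vcluster -n team-a-vcluster -f values.yaml`.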

Wait a few moments for the pod to start. You can verify it on the Host cluster:

kubectl get pods -n team-a-vcluster

You should see a pod named something like my-vcluster-0 in a Running state.

Step 3: Connect and Context Switch

Right now, your kubectl is still talking to the Host cluster. We need to tell it to talk to the virtual cluster's API server instead.

Why do we need a new context? Because authentication and authorization inside the virtual cluster are completely separate from the Host. You are about to log into a completely different system that just happens to be running inside the first one.

Use the vcluster CLI to securely connect:

vcluster connect my-vcluster -n team-a-vcluster

This command does two things:
1. It sets up a port-forwarding tunnel from your local machine to the virtual cluster's API server pod.
2. It automatically updates your ~/.kube/config file and switches your current context to the virtual cluster.
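If you skipped the CLI, you can achieve roughly the same thing by hand. The chart stores a kubeconfig for the virtual cluster in a Secret in the Host namespace (named vc-<release> in recent versions; verify the exact name with kubectl get secrets). A manual sketch:

```shell
# Extract the virtual cluster's kubeconfig from the Host-side Secret.
# Secret name is an assumption based on the release name used above.
kubectl get secret vc-my-vcluster -n team-a-vcluster \
  -o jsonpath='{.data.config}' | base64 -d > vcluster-kubeconfig.yaml

# Tunnel to the virtual API server pod (port may differ in your setup).
kubectl port-forward my-vcluster-0 8443:8443 -n team-a-vcluster &

# Talk to the virtual cluster using the extracted kubeconfig.
kubectl --kubeconfig vcluster-kubeconfig.yaml get namespaces
```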

Step 4: Deploy a Workload

Now that you are "inside" the virtual cluster, let's prove that it works. We will create a new namespace and deploy a simple Nginx pod.

Remember, because you are in the virtual cluster, you have full cluster-admin rights here. You can create namespaces freely without asking the infrastructure team for permission.

# Create a namespace INSIDE the virtual cluster
kubectl create namespace web-app

# Deploy a simple pod
kubectl run nginx --image=nginx:alpine -n web-app

Verification

This is where the "hard way" understanding pays off. Let's verify the isolation.

While still connected to the virtual cluster, check your pods and namespaces:

kubectl get namespaces
kubectl get pods -n web-app

You will see your web-app namespace and your nginx pod. It looks like a completely standard Kubernetes environment.

Now, let's disconnect from the virtual cluster and look at the Host cluster.

# Disconnect (or open a new terminal with your host kubeconfig)
vcluster disconnect

# Try to find the web-app namespace on the Host cluster
kubectl get namespaces

You will not see the web-app namespace. The Host cluster has no idea it exists.

So where is the pod? The Syncer translated it and placed it in the Host namespace where the virtual control plane lives. Let's look there:

kubectl get pods -n team-a-vcluster

You will see your nginx pod, but the Syncer has renamed it to avoid conflicts. It will look something like nginx-x-web-app-x-my-vcluster.

This is the pragmatic beauty of the system. The Host cluster only sees compute workloads in a single namespace. The tenant sees a full, isolated Kubernetes cluster.

Troubleshooting

Things go wrong. When they do, here is where you look:

1. Pods are stuck in Pending inside the vCluster
The virtual cluster doesn't schedule pods; the Host cluster does. If a pod is pending, switch back to the Host cluster context and check the events of the synced pod in the team-a-vcluster namespace. Usually, this means the Host cluster is out of CPU/Memory or lacks a required PersistentVolume.
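A concrete debugging sequence for the Pending case, run from the Host context (the synced pod name follows the renaming pattern shown earlier, so yours will differ):

```shell
# Find the synced pod on the Host side.
kubectl get pods -n team-a-vcluster

# Inspect its scheduling events -- look for "Insufficient cpu/memory"
# or unbound PersistentVolumeClaim messages.
kubectl describe pod nginx-x-web-app-x-my-vcluster -n team-a-vcluster

# Recent events in the namespace, newest last.
kubectl get events -n team-a-vcluster --sort-by=.lastTimestamp

# Capacity check (requires metrics-server on the Host).
kubectl top nodes
```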

2. Networking and Ingress issues
By default, the virtual cluster has its own CoreDNS instance. Services inside the virtual cluster can talk to each other perfectly. However, if you want external traffic to reach a pod inside the virtual cluster, you must configure the Syncer to sync Ingress resources to the Host cluster. This requires updating the Helm values.yaml during installation to enable Ingress syncing.
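As a sketch, enabling Ingress syncing looks something like the following in the chart's values.yaml. As with the resource limits above, the key layout depends on the chart version (this matches the pre-v0.20 schema), so confirm against the values reference before relying on it:

```yaml
# values.yaml fragment -- pre-v0.20 key layout, illustrative only.
# Tells the Syncer to copy Ingress objects from the virtual cluster
# to the Host, where the Host's ingress controller can serve them.
sync:
  ingresses:
    enabled: true
```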

3. CRD Conflicts
Wait, didn't we say vClusters solve CRD conflicts? Yes, for the tenant. But remember that the Syncer only syncs standard resources (Pods, Services, ConfigMaps). If you install a complex Operator inside the vCluster, the Operator's pods will run on the Host, but the Operator must be able to talk to the virtual API server to read its CRDs. Ensure your Operator configurations point to the in-cluster DNS of the vCluster API, not the Host API.

What You Built

You just deployed a fully functional, isolated Kubernetes control plane that shares the compute resources of a host cluster. You eliminated the need to provision a new physical cluster, saving both cloud costs and operational overhead. You gave developers the freedom to be cluster admins without risking the stability of the underlying infrastructure.

Next steps? Start looking into automating vCluster provisioning via GitOps tools like ArgoCD, and define strict ResourceQuotas on the Host namespaces to ensure no single virtual cluster starves the others of compute power.
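A ResourceQuota on the Host namespace is standard Kubernetes and a good first guardrail. The limits below are placeholder numbers; size them for your own teams:

```yaml
# Applied to the HOST cluster: caps everything the virtual cluster's
# tenants can consume, since all their synced pods land in this namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a-vcluster
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    pods: "50"
```

Apply it with `kubectl apply -f quota.yaml` while your context points at the Host cluster.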

FAQ

Does running a virtual cluster add latency to my applications?
No. The virtual cluster only handles the control plane (API requests, scheduling). The actual application pods run directly on the Host cluster's worker nodes and use the Host's container network interface (CNI). Network traffic between pods has zero additional overhead.

Can I run a virtual cluster inside a virtual cluster?
Technically, yes. Practically, please don't. While "Inception" style architecture is a fun science experiment, it creates an operational nightmare for debugging the Syncer logs. Keep your architecture as flat and simple as your business requirements allow.

How much overhead does the virtual control plane use?
Very little. A standard vCluster running k3s requires about 200-300MB of RAM and a fraction of a CPU core. Compare this to a dedicated AWS EKS control plane, which costs roughly $73 a month just to exist, regardless of compute usage.

There is no perfect system. There are only recoverable systems.

