Running Minecraft on a Kubernetes Cluster with Coding Agents

This one is kind of wild to explain, so I'll just walk through it.

I wanted to run Minecraft inside containers on a Kubernetes cluster, stream it to a browser, and have a coding agent next to each game that could write Minecraft plugins, compile them, and test them live. All of this runs on NVIDIA L4 GPUs.

The streaming setup

Each pod in the cluster gets an L4 GPU. The container runs Minecraft with a full display server, and I use Selkies to stream the screen to a browser over WebRTC. Selkies is a project that does GPU-accelerated remote desktop streaming from containers. It hooks into the container's virtual display and encodes the video feed using the GPU's hardware encoder (NVENC on the L4s), then ships it over WebRTC.

So from the user's perspective, you open a URL and you're watching Minecraft running live on a GPU in some pod somewhere. Low latency, hardware-encoded. It looks and feels like the game is running locally.

Getting Selkies to work inside the container took some fiddling. You need the display server (Xvfb or Xorg with a dummy driver) set up correctly, the NVIDIA drivers exposed into the container, and Selkies configured to find the right display. But once it's working, it's solid.
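A quick startup check can catch the two most common misconfigurations before anything tries to stream. This is only a sketch of that idea, not what my containers actually run: the function name preflight is made up, and it assumes the streamer finds the display through the standard DISPLAY variable and that nvidia-smi shows up on the PATH when the GPU is exposed to the container.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def preflight(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of problems; an empty list means the basics look fine."""
    problems = []
    # X clients locate the display server through the DISPLAY variable.
    if not env.get("DISPLAY"):
        problems.append("DISPLAY is not set, e.g. ':0'")
    # Assumption: nvidia-smi is present whenever the driver is exposed
    # to the container, so its absence hints at a missing GPU.
    if which("nvidia-smi") is None:
        problems.append("nvidia-smi not found on PATH")
    return problems
```

Calling preflight() with no arguments inspects the real environment. The env and which parameters exist only so the check can be exercised without a GPU.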

The agent

Here's where it gets interesting. Inside each pod, alongside Minecraft and Selkies, there's a coding agent. This agent accepts tasks over an API. You tell it something like "write a plugin that spawns a diamond block in front of the player every 5 seconds" and it goes to work.

The agent writes Java code. Minecraft plugins are Java, compiled into JAR files. So the agent writes a .java file, compiles it against the server API, and produces a JAR. If the compilation fails, it reads the errors, fixes the code, and tries again. This loop runs until the code compiles or it gives up after a few attempts.
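The compile-and-retry loop is easier to see in code. Here's a rough sketch in Python, under some assumptions: generate stands in for whatever produces the Java source (the agent's model call, not shown), build_jar shells out to the standard javac and jar tools, and every name, path, and the default of three attempts is hypothetical. Most server plugin APIs also expect a descriptor file inside the JAR, which is left out here.

```python
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class CompileResult:
    ok: bool
    errors: str = ""


def build_jar(source_text: str, class_name: str, api_jar: Path, work_dir: Path) -> CompileResult:
    """Write the source to disk, compile it against the API jar, package a JAR."""
    classes = work_dir / "classes"
    classes.mkdir(parents=True, exist_ok=True)
    src = work_dir / f"{class_name}.java"
    src.write_text(source_text)
    javac = subprocess.run(
        ["javac", "-cp", str(api_jar), "-d", str(classes), str(src)],
        capture_output=True, text=True,
    )
    if javac.returncode != 0:
        return CompileResult(False, javac.stderr)
    jar = subprocess.run(
        ["jar", "cf", str(work_dir / f"{class_name}.jar"), "-C", str(classes), "."],
        capture_output=True, text=True,
    )
    return CompileResult(jar.returncode == 0, jar.stderr)


def compile_with_retries(
    generate: Callable[[str, str], str],
    compile_source: Callable[[str], CompileResult],
    task: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Return source that compiles, or None after max_attempts failures."""
    errors = ""
    for _ in range(max_attempts):
        # On the first attempt errors is empty; afterwards it carries the
        # compiler output so the generator can fix its own mistakes.
        source = generate(task, errors)
        result = compile_source(source)
        if result.ok:
            return source
        errors = result.errors
    return None
```

In practice compile_source would be a small wrapper such as lambda text: build_jar(text, "DiamondPlugin", Path("server-api.jar"), Path("build")), where the class name and both paths are placeholders.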

But compiling is just step one. A plugin that compiles doesn't necessarily work in the game.

Testing in-game

Step two is the agent loading the plugin into the running Minecraft server and checking that it works. It does this with Mineflayer, which is a Node.js library that lets you connect a bot to a Minecraft server. The bot joins the game as a player and can move around, execute commands, read chat messages, and interact with the world.

So after the JAR is compiled and loaded, the agent tells Mineflayer to join the server. It runs commands, checks that the game doesn't crash, and takes screenshots through the Selkies stream to visually verify what's happening. If the plugin is supposed to spawn blocks, the bot can look at the world state and confirm the blocks are there.

The full loop looks like this:

1. Agent receives a task
2. Writes Java code for the plugin
3. Compiles to a JAR
4. If compilation fails, reads errors, rewrites, tries again
5. Loads the JAR into the Minecraft server
6. Mineflayer bot joins the game
7. Bot executes commands and checks game state
8. Agent takes screenshots via the stream
9. Reports back whether it worked
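The steps above can be sketched as a small pipeline that runs each stage in order and stops at the first failure. This is an illustration rather than the agent's real code: Report, run_task, and the stage names are all made up, and each stage is just a callable returning (ok, detail) so the actual compile, load, bot, and screenshot logic can be plugged in.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# A stage takes the task description and returns (ok, detail).
Stage = Callable[[str], Tuple[bool, str]]


@dataclass
class Report:
    task: str
    passed: bool
    steps: List[Tuple[str, bool, str]] = field(default_factory=list)


def run_task(task: str, stages: List[Tuple[str, Stage]]) -> Report:
    """Run stages in order, recording each result; stop at the first failure."""
    report = Report(task=task, passed=True)
    for name, stage in stages:
        ok, detail = stage(task)
        report.steps.append((name, ok, detail))
        if not ok:
            # No point loading a JAR that did not build, or taking
            # screenshots of a plugin that failed its in-game check.
            report.passed = False
            break
    return report
```

A call might look like run_task(task, [("compile", compile_stage), ("load", load_stage), ("bot_check", bot_stage), ("screenshot", screenshot_stage)]), with those four stage functions supplied by the caller. The returned report is what gets sent back over the API.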

The Kubernetes part

Each of these setups is one pod. One container in the pod runs Minecraft, Selkies, and the display server. A second container, a sidecar, runs the agent. Containers in a pod don't share a filesystem by default, so the two mount a shared volume, which lets the agent drop JARs into the server's plugin directory.

Scaling this just means adding more pods. The L4 GPUs handle both the game rendering and the video encoding, which is nice because you don't need separate hardware for streaming. Pods are spun up and down based on demand, with the cluster autoscaler adding or removing GPU nodes to fit them.

The tricky part was getting the GPU allocation right in Kubernetes. Each pod needs exactly one GPU, and you need the NVIDIA device plugin set up so that nvidia.com/gpu: 1 in the container's resource limits actually works (for extended resources like GPUs, Kubernetes wants the value under limits, and any request must match it). I also had to set up the NVIDIA Container Toolkit properly so the GPU was accessible both to Minecraft (for rendering) and to Selkies (for encoding).
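To make the pod layout concrete, here's a minimal sketch that builds a manifest as a Python dict. It is not my actual manifest: the function name, container names, image arguments, and the /plugins mount path are placeholders. The parts that matter are the nvidia.com/gpu limit on the game container and the emptyDir volume mounted into both containers.

```python
import json


def minecraft_pod(name: str, game_image: str, agent_image: str) -> dict:
    """Build a two-container Pod manifest with one GPU and a shared plugin volume."""
    # Hypothetical mount path; it should match the server's plugin directory.
    plugins_mount = {"name": "plugins", "mountPath": "/plugins"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    # Minecraft, the display server, and Selkies live here,
                    # so this is the container that gets the GPU.
                    "name": "game",
                    "image": game_image,
                    "resources": {"limits": {"nvidia.com/gpu": 1}},
                    "volumeMounts": [plugins_mount],
                },
                {
                    # The agent only needs to write JARs into the volume.
                    "name": "agent",
                    "image": agent_image,
                    "volumeMounts": [plugins_mount],
                },
            ],
            "volumes": [{"name": "plugins", "emptyDir": {}}],
        },
    }


def to_json(manifest: dict) -> str:
    """Serialize the manifest; kubectl accepts JSON as well as YAML."""
    return json.dumps(manifest, indent=2)
```

The output of to_json(minecraft_pod("mc-0", "example/game:latest", "example/agent:latest")) can be piped into kubectl apply -f -, with the image names swapped for real ones.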

Why bother

The point of all this is to have a system where an AI agent can write Minecraft plugins, test them in a real running game, and iterate. Not just "does it compile" but "does it actually work when a player is in the game." The streaming piece means you can watch it happen in real time from a browser, which is useful for debugging and also kind of fun to watch.