Every Platform Promises "Any System." Here's Why They Don't Deliver.

Every IaC tool promised "any system." Few delivered once you tried to use it in your own business. Provider ecosystems were designed for humans who can fill in the gaps. Agent ecosystems need a different contract. That's what extensions are built for.

Paul Stack

08 Jun 2026 • 7 min read

The pieces are designed to fit. How you combine them is up to you.

I've made this promise myself. More than once. I travelled to conferences and talked about it because I was sure I was right. I wanted to be right.

Every IaC tool promised it. Every conference keynote, every vendor deck, every README: one platform, every system, your infrastructure. I helped build some of the tools that made those promises: I built providers, wrote modules, shipped integrations. Each generation of tool was genuinely better than the last, but few (if any) of them truly delivered "any system" in a way that held up once you tried to use it in your own business.

The tools got closest when open source communities filled the gaps. Terraform's provider ecosystem, Chef's community cookbooks, Puppet Forge. The "any system" promise was largely delivered by volunteers, not the vendor, and it worked because humans were writing providers for other humans to read and configure. It was almost impossible for the vendor to keep up with the ever-evolving ecosystem. The agentic shift changes both sides of that equation. What contribution looks like is changing, and what consumers need from an extension is changing with it. If the ecosystem needs rethinking anyway, the extension model should be part of that conversation.

The underlying problem is always the same: the tool works well for the systems it was built around, and everything else gets a provider that somebody wrote, somebody maintains, and somebody has to learn from scratch. That's the 200% problem I wrote about in the determinism post: you learn the tool's abstraction, then you learn the underlying API it models. Every new system resets the clock.

But there's a deeper constraint too. Traditional IaC tools are built around a CRUD lifecycle: every resource has create, read, update, and delete, and that's the contract. If you need to do something that doesn't fit, you're working around the tool rather than with it. I've written null_resource blocks with local-exec provisioners that shell out to scripts that call APIs: three layers of workaround for something that should have been a single method called rotate or drain or debug. That's not a criticism of the tools themselves. Data sources and provisioners were smart adaptations; the teams behind them recognised the gap and built escape hatches to keep up with what users actually needed. But they're still escape hatches, bolted onto a lifecycle that wasn't designed for those operations.

Why we built it differently

I've maintained tools where the issue tracker was dominated by two kinds of request: "add support for provider X" and "add feature Y to the existing provider." Hundreds of them, thousands over the life of a tool, each one reasonable and each one blocked on a maintainer having time to write it, review it, ship it. The backlog becomes a wall between the user and the thing they're trying to automate.

We didn't want swamp to work that way. The goal was never to be the team that writes every integration. It was to build extension points deep enough that we never become the bottleneck. Your infrastructure, your secrets provider, your internal APIs, your reporting needs: you should be able to model them without waiting for us to add support. And if an existing extension covers your domain but doesn't have the method you need, you should be able to add it yourself without forking anything.

It also turns out this makes the tool work incredibly well with agents. An agent isn't constrained to whatever operations the maintainer decided to ship. If the method exists, the agent can discover it and use it. If it doesn't exist, you or the agent can add it. The agent carries out the work you instructed it to do rather than hitting a wall because the provider doesn't support that operation yet.

What an extension actually is

A swamp extension is a package that contains everything needed to interact with a system: model types, workflows, vaults for secret management, execution drivers, datastores, reports, and skills. Any combination, versioned together, published to a registry, pulled with a single command.

The manifest looks like this:

manifestVersion: 1
name: "@acme/deploy"
version: "2026.06.01.1"
description: "Deployment automation for Acme's infrastructure"
models:
  - service.ts
workflows:
  - staged-deploy.yaml
dependencies:
  - "@swamp/aws/ec2"
tags:
  - deployment
  - aws

Scoped names (@collective/name) so you know who published it, CalVer versioning so you know when. Dependencies are bundled into the package, and every download is SHA-256 verified. TypeScript files get safety-analysed before push and after pull, with dependency trust audits running against vulnerability databases. No eval() and no code injection.
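To make those two conventions concrete, here's a minimal sketch of pulling the collective out of a scoped name and the date out of a version string. The YYYY.MM.DD.N layout is my reading of the example manifest above, and parseScopedName and parseCalVer are hypothetical helpers for illustration, not swamp functions.

```typescript
// Hypothetical helpers: illustrative only, not swamp's implementation.
interface ScopedName {
  collective: string; // who published it
  name: string; // what it is (may contain further path segments)
}

interface CalVer {
  date: string; // ISO date, e.g. "2026-06-01"
  build: number; // assumed: a counter for releases cut on the same day
}

function parseScopedName(input: string): ScopedName {
  const match = /^@([a-z0-9-]+)\/([a-z0-9-]+(?:\/[a-z0-9-]+)*)$/.exec(input);
  if (!match) throw new Error(`Not a scoped name: ${input}`);
  return { collective: match[1], name: match[2] };
}

function parseCalVer(input: string): CalVer {
  // Assumed layout: YYYY.MM.DD.N, inferred from "2026.06.01.1".
  const match = /^(\d{4})\.(\d{2})\.(\d{2})\.(\d+)$/.exec(input);
  if (!match) throw new Error(`Not a CalVer version: ${input}`);
  const [, year, month, day, build] = match;
  return { date: `${year}-${month}-${day}`, build: Number(build) };
}
```

Under those assumptions, "@swamp/aws/ec2" splits into the collective swamp and the name aws/ec2, and the version answers "when" without a changelog lookup.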

A model type

Most extensions contain one or more model types. Here's a deployment model:

import { z } from "zod";

export const model = {
  type: "@acme/deploy/service",
  version: "2026.06.01.1",
  description: "Deploy a service to a target environment",
  globalArguments: z.object({
    service: z.string().describe("Service name"),
    region: z.enum(["us-east-1", "us-west-2", "eu-west-1"]),
    replicas: z.number().int().min(1).max(50),
  }),
  resources: {
    result: {
      schema: z.object({
        deployed: z.boolean(),
        endpoint: z.string().url(),
      }),
    },
  },
  checks: {
    "region-safety": {
      description: "Block deployment to unstable regions",
      labels: ["policy"],
      execute: async (context) => {
        const blocked = ["us-east-1a"];
        if (blocked.includes(context.globalArgs.region)) {
          return {
            pass: false,
            errors: [`${context.globalArgs.region} is currently unstable`],
          };
        }
        return { pass: true };
      },
    },
    "replica-ceiling": {
      description: "Enforce account-level replica limit",
      labels: ["policy"],
      execute: async (context) => {
        if (context.globalArgs.replicas > 20) {
          return {
            pass: false,
            errors: ["Account limit is 20 replicas — raise a quota request"],
          };
        }
        return { pass: true };
      },
    },
  },
  methods: {
    deploy: {
      description: "Deploy the service",
      arguments: z.object({
        image: z.string().describe("Container image tag"),
      }),
      execute: async (args, context) => {
        // actual deployment logic
        const handle = await context.writeResource("result", "result", {
          deployed: true,
          endpoint: `https://${context.globalArgs.service}.example.com`,
        });
        return { dataHandles: [handle] };
      },
    },
  },
};

If you've read the encoding knowledge post, those checks should look familiar. The region that's been unstable since March is a pre-flight check now. The account-level replica limit someone discovered during an outage is encoded as a check too, stricter than the schema's own maximum of 50. Both run automatically before any mutating method, whether the caller is a human running a CLI command or an agent executing a workflow.
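Here's a rough sketch of what "run before any mutating method" could look like, assuming a runner that executes every check and refuses to call the method if any of them fail. runWithChecks and its types are hypothetical; swamp's actual runner isn't shown in this post.

```typescript
// Hypothetical sketch of a pre-flight gate; not swamp's actual runner.
type CheckResult = { pass: boolean; errors?: string[] };

interface Check<C> {
  description: string;
  execute: (context: C) => Promise<CheckResult>;
}

async function runWithChecks<C, R>(
  checks: Record<string, Check<C>>,
  context: C,
  method: (context: C) => Promise<R>,
): Promise<R> {
  const failures: string[] = [];
  // Run every check so the caller sees all failures at once, not just the first.
  for (const [name, check] of Object.entries(checks)) {
    const result = await check.execute(context);
    if (!result.pass) {
      const errors = result.errors ?? ["failed"];
      failures.push(...errors.map((e) => `${name}: ${e}`));
    }
  }
  if (failures.length > 0) {
    // The method is never invoked when a check fails.
    throw new Error(`Pre-flight checks failed:\n${failures.join("\n")}`);
  }
  return method(context);
}
```

The caller doesn't opt in. Whoever invokes the method, human or agent, goes through the same gate.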

The Zod schemas validate inputs at creation time so type mismatches surface immediately, and they're also discoverable. An agent can query the CLI and see what arguments a model type accepts, what methods it exposes, what the outputs look like.

Those methods aren't locked to a CRUD lifecycle either. This model has deploy, but it could just as easily have debug, rollback, drain, or migrate, whatever the system actually needs. The Kubernetes debugging extensions have methods like checkPodHealth and validateSelectors, while the issue lifecycle extension has triage, plan, iterate, approve, which is about as far from create-read-update-delete as you can get.

Extensions are also open to extension themselves. Say your security team needs every VPC creation to pass a CIDR overlap check. In the provider model, that means either running a fork of the provider or opening a pull request against the upstream repo, then waiting weeks or months through a review cycle and a release before it ships, if it ships at all, because the maintainer might not agree it belongs in the core provider. With swamp, you extend the type locally:

import { z } from "zod";

export const extension = {
  type: "@swamp/aws/ec2/vpc",
  methods: [{
    "validate-cidr-policy": {
      description: "Check CIDR against company allocation policy",
      arguments: z.object({}),
      execute: async (_args, context) => {
        // your org's CIDR validation logic
      },
    },
  }],
  checks: [{
    "no-cidr-overlap": {
      description: "Ensure CIDR doesn't overlap existing VPCs",
      labels: ["policy"],
      execute: async (context) => {
        return { pass: true };
      },
    },
  }],
};

Your check attaches to the base type and runs alongside the originals. It's enforced on the next deployment, not the next upstream release. An agent discovering the type sees everything: the methods from the registry extension and the ones your security team added, without knowing or caring which came from where.
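One way to picture that attachment is as a merge of named methods and checks onto the base type. The sketch below is an illustration under assumptions: applyExtension and mergeNamed are hypothetical names, and rejecting name collisions is my choice for the example, not documented swamp behaviour.

```typescript
// Hypothetical merge of a local extension into a registry type.
interface Entry {
  description: string;
}
type Named = Record<string, Entry>;

interface BaseType {
  type: string;
  methods: Named;
  checks: Named;
}

interface Extension {
  type: string;
  methods?: Named[]; // arrays of named groups, mirroring the example above
  checks?: Named[];
}

function mergeNamed(kind: string, base: Named, additions: Named[] = []): Named {
  const merged: Named = { ...base };
  for (const group of additions) {
    for (const [name, entry] of Object.entries(group)) {
      // Assumption: extensions add to a type and never silently replace.
      if (name in merged) throw new Error(`${kind} "${name}" already exists`);
      merged[name] = entry;
    }
  }
  return merged;
}

function applyExtension(base: BaseType, ext: Extension): BaseType {
  if (ext.type !== base.type) {
    throw new Error(`Extension targets ${ext.type}, not ${base.type}`);
  }
  return {
    type: base.type,
    methods: mergeNamed("method", base.methods, ext.methods),
    checks: mergeNamed("check", base.checks, ext.checks),
  };
}
```

The merged type is what a caller would see: one flat set of methods and checks, with no marker for which came from the registry and which were added locally.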

Wiring it together

Workflows wire multiple model interactions into a sequence:

name: staged-deploy
inputs:
  service:
    type: string
  image:
    type: string
jobs:
  - name: deploy-staging
    steps:
      - name: deploy
        task:
          type: model_method
          modelType: "@acme/deploy/service"
          modelName: staging-${{ inputs.service }}
          methodName: deploy
          globalArgs:
            service: ${{ inputs.service }}
            region: eu-west-1
            replicas: 2
          inputs:
            image: ${{ inputs.image }}
  - name: deploy-production
    dependsOn:
      - job: deploy-staging
        condition:
          type: succeeded
    steps:
      - name: deploy
        task:
          type: model_method
          modelType: "@acme/deploy/service"
          modelName: prod-${{ inputs.service }}
          methodName: deploy
          globalArgs:
            service: ${{ inputs.service }}
            region: us-east-1
            replicas: 10
          inputs:
            image: ${{ inputs.image }}

Production only deploys if staging succeeded. The pre-flight checks from the model type run before each deployment. Data flows between steps through typed CEL expressions, and if a step references data that doesn't exist the expression fails loudly rather than passing blank values through.
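To illustrate "fails loudly", here's a small sketch of a strict resolver for the ${{ ... }} placeholders used in the workflow. It's a plain dotted-path lookup rather than a CEL evaluator, and resolveTemplate is a hypothetical name; the point is only the behaviour on missing data.

```typescript
// Hypothetical strict resolver for ${{ path.to.value }} placeholders.
// This is a plain dotted-path lookup, not a CEL evaluator.
function resolveTemplate(
  template: string,
  scope: Record<string, unknown>,
): string {
  return template.replace(
    /\$\{\{\s*([\w.]+)\s*\}\}/g,
    (_match: string, path: string) => {
      let current: unknown = scope;
      for (const key of path.split(".")) {
        if (typeof current !== "object" || current === null || !(key in current)) {
          // Refuse to substitute an empty string for data that isn't there.
          throw new Error(`Expression "${path}" references missing data at "${key}"`);
        }
        current = (current as Record<string, unknown>)[key];
      }
      if (current === undefined || current === null) {
        throw new Error(`Expression "${path}" resolved to no value`);
      }
      return String(current);
    },
  );
}
```

With inputs of { service: "api" }, the template staging-${{ inputs.service }} resolves to staging-api, while a reference to inputs.image throws instead of producing a blank image tag.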

An agent creates this, but you can run it without one. That's the point I keep coming back to from the determinism post: the agent does the work of creating the automation, then the automation runs deterministically from that point on.

How agents find what they need

swamp model type search returns every model type in the registry with its description, arguments, and methods. swamp model type describe returns the full schema for a specific type. The agent sees what parameters exist, what constraints apply, what methods are available, all without reading docs.
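The shape of that discovery can be sketched in a few lines. This is not swamp's output format, which isn't shown here; ModelDef, describeType, and searchTypes are hypothetical, and the model shape is trimmed down to names and descriptions.

```typescript
// Hypothetical: a trimmed-down model shape plus search and describe functions.
interface MethodDef {
  description: string;
  argumentNames: string[];
}

interface ModelDef {
  type: string;
  description: string;
  methods: Record<string, MethodDef>;
}

interface TypeSummary {
  type: string;
  description: string;
  methods: { name: string; description: string; arguments: string[] }[];
}

// Turn a model definition into structured data a caller can read without docs.
function describeType(model: ModelDef): TypeSummary {
  return {
    type: model.type,
    description: model.description,
    methods: Object.entries(model.methods).map(([name, def]) => ({
      name,
      description: def.description,
      arguments: def.argumentNames,
    })),
  };
}

// Case-insensitive match on type name or description.
function searchTypes(registry: ModelDef[], query: string): TypeSummary[] {
  const q = query.toLowerCase();
  return registry
    .filter((m) =>
      m.type.toLowerCase().includes(q) || m.description.toLowerCase().includes(q)
    )
    .map(describeType);
}
```

Everything the caller learns comes from the definition itself, so a method added to a type is discoverable the moment it exists.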

Extensions from trusted collectives auto-resolve on first use. When swamp encounters a type it hasn't seen before, say @swamp/aws/ec2/instance, it searches the registry, pulls the extension, loads it, and continues. The version gets pinned in a lockfile so the same resolution produces the same result next time. If the type isn't from a trusted collective, swamp asks you to pull the extension yourself before you can use it.
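The resolution rules in that paragraph can be sketched as a single function. Everything here is an assumption made for illustration: the Registry interface, the in-memory Map standing in for the lockfile, and the resolveExtension name are all hypothetical, and the sketch takes an extension name as input rather than guessing how swamp maps a type to its extension.

```typescript
// Hypothetical resolver: trusted collectives auto-resolve, versions get pinned.
interface Registry {
  latestVersion(extension: string): string | undefined;
}

type Lockfile = Map<string, string>; // extension name -> pinned version

function resolveExtension(
  extension: string,
  trusted: Set<string>,
  registry: Registry,
  lockfile: Lockfile,
): string {
  // A pinned version always wins, so repeat resolutions are stable.
  const pinned = lockfile.get(extension);
  if (pinned !== undefined) return pinned;

  const [collective = ""] = extension.split("/"); // e.g. "@swamp"
  if (!trusted.has(collective)) {
    throw new Error(
      `${extension} is not from a trusted collective; pull it explicitly first`,
    );
  }

  const version = registry.latestVersion(extension);
  if (version === undefined) {
    throw new Error(`${extension} not found in registry`);
  }
  lockfile.set(extension, version); // pin on first use
  return version;
}
```

The design choice worth noticing is the order: the lockfile is consulted before the registry, so a newer release upstream doesn't change what an existing workflow runs against.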

This is the core of why extensions look the way they do, and it's worth saying directly. A human can work from incomplete information. They'll read docs, search GitHub issues, copy examples from Stack Overflow, and eventually figure out how a provider behaves even when the documentation is wrong or missing. I've done that hundreds of times and so have you. Agents don't have that path. If a capability isn't represented in a schema, an agent has to infer it from prose, and that's a much weaker contract. Provider ecosystems were designed for humans who can fill in the gaps. Agent ecosystems need everything to be explicit, typed, and discoverable, because there's nobody in the loop to compensate when it isn't.

What we use it for

We ship extensions for our own infrastructure. The Kubernetes debugging extensions from the determinism post are real: model types that know how to query pod state, inspect services, validate configmap references, and check image availability. The issue lifecycle extension drives our entire development workflow from triage through implementation. Same mechanism, completely different domains.

The extensibility is one of the parts of swamp I'm most proud of. Users model their own systems, extend what exists, and publish what they build. Nobody waits for us to get there first.

The open source communities that built the provider ecosystems for the last generation of tools did extraordinary work. The next generation of contribution isn't another provider. It's operational knowledge encoded in a form an agent can discover and use.