Upgrades
Bootstrap pins every service to a specific Helm chart version in the platform stack's locals. Upgrading means: pull a newer bootstrap revision (or bump the versions yourself), re-run apply.sh, and let Argo CD reconcile.
There is no single helm upgrade agyn-platform step today — upgrades are stack-by-stack, service-by-service. A centralized umbrella chart at agynio/platform-charts is in preparation and will eventually replace per-service deployment in bootstrap; it is not used today.
Read the release notes
Every meaningful change to the bootstrap or to a platform service ships with notes on the relevant repository:
- Bootstrap-level:
agynio/bootstrapcommits and releases. - Per-service: each
agynio/<service>repository has its own releases. - Architecture changes are tracked in
agynio/architectureunderchanges/.
Check them before upgrading.
Pre-upgrade checks
- Take a fresh backup. See Operate → Backup & DR. At minimum: Postgres dumps + OpenFGA store export.
- Drain agent workloads. New workloads shouldn't start during a service rollout. Either pause the Agents Orchestrator deployment or wait for a quiet period.
- Note the current state.
git rev-parse HEADin the bootstrap clone, andhelm list -Afor service versions in the cluster.
Upgrade
cd bootstrap
git pull
./apply.sh
apply.sh re-applies every stack. Each stack's Terraform compares its declared chart version against what's in the cluster:
- Platform service charts: the
platformstack updates Argo CDApplicationrevisions. Argo CD pulls the new chart, applies it, and rolling-restarts the service. Each service runs its own DB migrations on startup. - Apps (Reminders, Telegram Connector, k8s-runner): the
appsstack updates their Argo CD applications the same way. - Stacks that build CRDs or core infrastructure (Istio, OpenZiti, cert-manager): only re-applied if their versions changed in the corresponding stack.
Watch progress in Argo CD: https://argocd.agyn.dev:2496/. Each Application flips from Progressing to Synced + Healthy.
You can also watch from the cluster:
watch kubectl get applications -n argocd
watch kubectl get deploy -A
A healthy upgrade ends with every Application Synced + Healthy and every Deployment at full replicas.
Verify
After the rollout:
- Sign in to the Console.
- Open an organization and confirm Agents, Runners, and Apps pages load.
- Send a test message in an existing conversation. The agent should start, respond, and idle out.
- Check Tracing for the new run.
If any of these fail, see Rollback.
Resuming agent traffic
If you scaled the Agents Orchestrator down for the upgrade:
kubectl scale deploy agents-orchestrator -n agyn --replicas=1
Agent workloads for any threads with unread messages will start within one reconciliation tick.
Rollback
For most patches and minor service upgrades, you can roll back by checking out the previous bootstrap revision and re-applying:
git checkout <previous-revision>
./apply.sh
Argo CD picks up the older chart versions and rolls services back. Each service's DB migrations are designed to be backwards-compatible within one minor version — but destructive migrations exist and are called out in service release notes. If a release ran a non-reversible migration, Helm/Argo CD rollback alone is not enough:
- Stop the affected services.
- Restore Postgres / OpenFGA from your pre-upgrade backup.
- Re-deploy the previous revision.
This is why pre-upgrade backups are non-negotiable.
Per-service upgrades
If you want to upgrade a single service (e.g. patch a hotfix without touching the rest), edit the version in stacks/platform/main.tf or the per-service variables, then run only the platform stack:
terraform -chdir=stacks/platform apply
Argo CD reconciles only the changed Application.
Authorization model migrations
The OpenFGA authorization model ships with the platform stack. When a release changes the model, re-applying the platform stack writes the new model version. Old tuples that reference removed relations are migrated by the service that owns them — never delete tuples by hand.
You can verify the active model:
fga model list --store-id <store-id>
Get the store ID from https://openfga.agyn.dev:2496/ or the Authorization deployment's environment.
Production considerations
For production installs:
- Maintain separate Terraform state per environment (
backend "s3" {}or similar). - Stage upgrades in a non-prod environment first.
- Use
-targetto roll a single stack if you need surgical changes. - Document your rollback drill — it's much faster than reading these docs at 3 AM.