Automation & Infrastructure as Code
Source: Marc Mercer (SRE Lead), sre-iac repository, Rev 1.0, 2026-02-24
The infrastructure follows a hybrid automation approach: Terraform provisions resources, Ansible manages host configuration, and Helm deploys Kubernetes applications (ArgoCD is planned for GitOps).
Tooling Overview
| Layer | Tool | Purpose |
|---|---|---|
| Resource provisioning | Terraform + Atmos | LXC containers, VMs, networks, volumes |
| Configuration management | Ansible | Host bootstrap, domain enrollment, certificates, reverse proxy |
| OpenStack deployment | Kolla-Ansible | Containerized OpenStack control plane lifecycle |
| Kubernetes applications | Helm | Application deployment and lifecycle |
| GitOps | ArgoCD (planned) | Kubernetes application lifecycle in OpenStack environment |
| Secrets management | Infisical | Centralized secrets with Kubernetes operator integration |
| Certificate issuance | acme.sh + ZeroSSL | Wildcard ECDSA certificates via DNS-01 challenge |
Terraform: Infrastructure Provisioning
Provider (current Proxmox): telmate/proxmox
Repository structure:
terraform/
  containers.tf        # All LXC container definitions
  variables.tf         # Resource allocation variables
  terraform.tfvars     # Environment-specific values
  modules/
    lxc_container/     # Reusable container module
Container Module Pattern
module "gitlab-01" {
  source         = "./modules/lxc_container"
  hostname       = "gitlab-01"
  vmid           = 701
  node           = "pmx-02"
  ostemplate     = "hpserver-storage:vztmpl/almalinux-9-default_20240911_amd64.tar.xz"
  rootfs_storage = "local-lvm"
  rootfs_size    = "24G"
  memory         = 16384
  cores          = 2
  ip             = "10.10.96.41/20"
  gateway        = "10.10.96.1"
  tags           = "almalinux;gitlab"
  mountpoints = [
    { mountpoint = "/data", storage = "local-lvm", size = "2T" }
  ]
}
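A quick sanity check on the module inputs can catch addressing mistakes before `terraform plan`. The sketch below is not part of the repository; the helper name `gateway_on_same_network` is hypothetical, and it only verifies that the `gateway` value falls inside the network implied by the `ip` value.

```python
import ipaddress


def gateway_on_same_network(ip_cidr: str, gateway: str) -> bool:
    """Return True if the gateway address lies inside the network of ip_cidr."""
    interface = ipaddress.ip_interface(ip_cidr)
    return ipaddress.ip_address(gateway) in interface.network


# Values taken from the gitlab-01 module definition.
print(gateway_on_same_network("10.10.96.41/20", "10.10.96.1"))  # True
```

A check like this could run in CI against `terraform.tfvars` values, although how those values are loaded is left open here.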
VMID Naming Convention
| Range | Assigned To |
|---|---|
| 100s | Core infrastructure (domain controllers, proxies, VPN) |
| 200s | Kubernetes nodes (control plane and workers) |
| 700s | Application services (GitLab, MongoDB, etc.) |
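The convention above can be expressed as a small lookup, which is handy when reviewing new container definitions. This is an illustrative sketch only: the function name `vmid_category` is hypothetical, and it assumes "100s" means 100 to 199 (and likewise for the other ranges). Ranges not listed in the table are reported as unassigned.

```python
# Hundreds digit of the VMID mapped to the category from the table.
VMID_RANGES = {
    1: "Core infrastructure",
    2: "Kubernetes nodes",
    7: "Application services",
}


def vmid_category(vmid: int) -> str:
    """Return the category for a VMID, or 'unassigned' if the range is not in the table."""
    return VMID_RANGES.get(vmid // 100, "unassigned")


print(vmid_category(701))  # Application services
print(vmid_category(305))  # unassigned
```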
Workflow
terraform plan # Review planned changes
terraform apply # Apply infrastructure changes
terraform destroy -target=module.container-name # Destroy specific resource (use with caution)
Ansible: Configuration Management
Inventory: Dynamic discovery via community.proxmox.proxmox plugin — automatically groups hosts by Proxmox tags (e.g., groupdc, groupk8s_worker, groupproxy).
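To illustrate the tag-to-group idea, here is a minimal sketch. It is not the plugin's implementation. It assumes tags arrive as a semicolon-separated string (as in the `tags` value of the container module) and that a group name is the tag with a `group` prefix, which is inferred from the examples `groupdc` and `groupproxy`. The host `proxy-01` and the helper name `groups_from_tags` are hypothetical.

```python
from collections import defaultdict


def groups_from_tags(host_tags: dict[str, str], prefix: str = "group") -> dict[str, list[str]]:
    """Build {group_name: [hosts]} from {host: 'tag1;tag2'} (assumed tag format)."""
    groups: dict[str, list[str]] = defaultdict(list)
    for host, tags in sorted(host_tags.items()):
        for tag in tags.split(";"):
            tag = tag.strip()
            if tag:  # ignore empty fragments such as a trailing ';'
                groups[f"{prefix}{tag}"].append(host)
    return dict(groups)


hosts = {
    "dc-01": "almalinux;dc",
    "dc-02": "almalinux;dc",
    "proxy-01": "almalinux;proxy",  # hypothetical host
}
print(groups_from_tags(hosts)["groupdc"])  # ['dc-01', 'dc-02']
```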
Vault management: All sensitive credentials are stored in group_vars/all/vault.yml, encrypted with Ansible Vault (AES-256). The vault password file, .vault_pass, is excluded from git.
| Credential | Purpose |
|---|---|
| acme_email | ZeroSSL account identifier |
| aws_access_key_id / aws_secret_access_key | Route53 DNS management for cert validation |
| kerberos_principal / kerberos_password | FreeIPA enrollment automation |
| proxmox_api_user / proxmox_api_token | Proxmox API authentication |
| vaulted_dockerhub_username / vaulted_dockerhub_token | Docker Hub registry authentication |
Key Ansible Roles
roles/common/ — Base Configuration
Applied to all containers. Domain enrollment, SSH hardening, and baseline configuration.
| Task File | Purpose |
|---|---|
| ipa_enroll.yml | Automated FreeIPA domain enrollment |
| firewalld.yml | Firewall rule management |
| chrony.yml | NTP time synchronization |
Enrollment pattern:
- name: Enroll host with FreeIPA
  command: >
    ipa-client-install --unattended
    --principal {{ kerberos_principal }}
    --password {{ kerberos_password }}
    --domain anshinhealth.net
    --realm ANSHINHEALTH.NET
    --server dc-01.anshinhealth.net
    --server dc-02.anshinhealth.net
  no_log: true  # keep the Kerberos password out of task output and logs
roles/certificates/ — Certificate Lifecycle
Full SSL/TLS certificate lifecycle via acme.sh and ZeroSSL:
- Clone/update acme.sh repository with DNS plugins
- Decrypt existing certificate state from Ansible Vault
- Issue or renew wildcard certificates via ZeroSSL DNS-01 challenge
- Re-encrypt private keys and renewal config with Ansible Vault
- Clean up temporary files while preserving renewal state
Managed certificate domains:
- anshinhealth.net + *.anshinhealth.net
- apps.anshinhealth.net + *.apps.anshinhealth.net
- svcs.anshinhealth.net + *.svcs.anshinhealth.net
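Each managed entry pairs a base domain with its wildcard, which maps onto two `-d` arguments in a single acme.sh issue call. The sketch below only builds the argument list and does not run anything. It is not taken from the role: the helper name `issue_command` is hypothetical, `dns_aws` is assumed because Route53 is the primary DNS provider, and the `ec-256` key length is an assumption (the source specifies ECDSA but not the curve).

```python
import shlex

BASE_DOMAINS = [
    "anshinhealth.net",
    "apps.anshinhealth.net",
    "svcs.anshinhealth.net",
]


def issue_command(base_domain: str, dns_plugin: str = "dns_aws", keylength: str = "ec-256") -> list[str]:
    """Build an acme.sh issue command covering the base domain and its wildcard."""
    return [
        "acme.sh", "--issue",
        "--server", "zerossl",
        "--dns", dns_plugin,        # assumed: Route53 DNS-01 plugin
        "--keylength", keylength,   # assumed curve; the source only says ECDSA
        "-d", base_domain,
        "-d", f"*.{base_domain}",
    ]


for domain in BASE_DOMAINS:
    # shlex.join quotes the wildcard so the shell does not expand it
    print(shlex.join(issue_command(domain)))
```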
roles/reverse-proxy/ — Caddy Configuration
- Installs Caddy from EPEL repository
- Decrypts and deploys vault-encrypted certificates
- Generates modular configuration files per upstream service
- Validates configuration before applying (atomic updates)
- Certificate permissions: 600 for keys, 644 for certs (owned by caddy service user)
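The validate-before-apply step can be sketched as "write a candidate file, validate it, then swap it in atomically". This is an illustration of the pattern rather than the role's actual tasks: `deploy_config` and `caddy_validate` are hypothetical names, and the validator is passed in so the swap logic can be exercised without Caddy installed.

```python
import os
import subprocess
import tempfile
from typing import Callable


def caddy_validate(path: str) -> bool:
    """Validate a Caddyfile-format config with the caddy CLI (requires caddy on PATH)."""
    result = subprocess.run(
        ["caddy", "validate", "--config", path, "--adapter", "caddyfile"],
        capture_output=True,
    )
    return result.returncode == 0


def deploy_config(target: str, content: str,
                  validate: Callable[[str], bool], mode: int = 0o644) -> bool:
    """Write content to a candidate file, validate it, then atomically replace target."""
    directory = os.path.dirname(os.path.abspath(target))
    # Create the candidate in the same directory so os.replace stays on one filesystem.
    fd, candidate = tempfile.mkstemp(dir=directory, suffix=".candidate")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(content)
        os.chmod(candidate, mode)
        if not validate(candidate):
            return False  # the live file is left untouched
        os.replace(candidate, target)  # atomic rename on POSIX
        return True
    finally:
        # Remove the candidate if it was rejected or an error occurred.
        if os.path.exists(candidate):
            os.remove(candidate)
```

The same helper could deploy key material with `mode=0o600`, matching the permissions listed above; file ownership is not handled in this sketch.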
roles/k8s/ — Kubernetes Node Configuration
- Container runtime configuration
- Registry mirror configuration (registries.yaml for Docker Hub pull-through cache)
- Storage provisioner setup
- Node labels and taints
Playbook Execution
# Full site configuration
ansible-playbook playbooks/site.yml
# Targeted by host group
ansible-playbook playbooks/site.yml -l groupdc # Domain controllers
ansible-playbook playbooks/site.yml -l groupk8s_worker # K8s workers
ansible-playbook playbooks/site.yml -l groupproxy # Reverse proxy
# Targeted by individual host
ansible-playbook playbooks/site.yml -l gitlab-01
# Certificate management
ansible-playbook playbooks/certificates.yml
# Reverse proxy deployment
ansible-playbook playbooks/reverse-proxy.yml
acme.sh: Certificate Issuance
| Attribute | Value |
|---|---|
| CA | ZeroSSL |
| Challenge Type | DNS-01 |
| DNS Providers | Route53 (primary), Cloudflare (secondary) |
| Key Type | ECDSA (Elliptic Curve) |
| Validity | 90 days |
| Renewal Threshold | 30 days before expiration |
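The validity and renewal threshold in the table imply that a certificate becomes due for renewal 60 days after issuance. The arithmetic can be sketched as follows. This is not how acme.sh tracks renewal internally; the function name `needs_renewal` and the sample issue date are illustrative.

```python
from datetime import datetime, timedelta, timezone

VALIDITY = timedelta(days=90)           # certificate lifetime from the table
RENEWAL_THRESHOLD = timedelta(days=30)  # renew this long before expiration


def needs_renewal(issued_at: datetime, now: datetime) -> bool:
    """Return True once 'now' is within the renewal threshold of expiration."""
    expires_at = issued_at + VALIDITY
    return now >= expires_at - RENEWAL_THRESHOLD


issued = datetime(2026, 1, 1, tzinfo=timezone.utc)  # illustrative issue date
print(needs_renewal(issued, issued + timedelta(days=59)))  # False
print(needs_renewal(issued, issued + timedelta(days=60)))  # True
```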
acme.sh/
  account.conf
  anshinhealth.net_ecc/
    anshinhealth.net.key     # Private key (Vault-encrypted in git)
    anshinhealth.net.cer     # Certificate
    fullchain.cer            # Full certificate chain
    anshinhealth.net.conf    # Renewal configuration
  apps.anshinhealth.net_ecc/
  svcs.anshinhealth.net_ecc/
Helm: Kubernetes Application Deployment
| Chart | Namespace | Values File | Purpose |
|---|---|---|---|
| kube-prometheus-stack | monitoring | kubernetes/monitoring/helm-values/kube-prometheus-stack-values.yaml | Prometheus, Grafana, AlertManager, node-exporter, kube-state-metrics |
| Blackbox Exporter | monitoring | kubernetes/monitoring/helm-values/ | Endpoint monitoring |
| Karma | monitoring | kubernetes/monitoring/helm-values/ | AlertManager dashboard |
| external-dns | kube-system | Custom values | RFC2136 DNS updates to FreeIPA |
| MetalLB | metallb-system | Custom values | Bare-metal LoadBalancer |
| Infisical | infisical | kubernetes/infisical/helm-values/infisical-standalone-values.yaml | Secrets management |
# Monitoring stack
cd kubernetes/monitoring && ./deploy.sh all
# Infisical (database, server, operator)
cd kubernetes/infisical && ./deploy.sh
Utility Scripts
| Script | Purpose | Usage |
|---|---|---|
| scripts/ipa-dns-grant.py | Manages FreeIPA DNS zone permissions for service accounts; grants external-dns TSIG key access to specific DNS zones | Run after adding new DNS zones |
| scripts/r53-to-bind9.py | Exports Route53 DNS records to BIND9 format | Used for DNS migration and auditing |
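The core of a Route53-to-BIND9 export is a per-record conversion. The sketch below is not the contents of scripts/r53-to-bind9.py. It assumes input shaped like the record sets returned by the Route53 `ListResourceRecordSets` API (`Name`, `Type`, `TTL`, `ResourceRecords`), and the function name `to_bind_lines` and the sample record are illustrative. Alias records, which carry no TTL or literal values, are skipped.

```python
def to_bind_lines(record_set: dict) -> list[str]:
    """Convert one Route53-style record set into BIND zone-file lines."""
    if "ResourceRecords" not in record_set:
        return []  # alias records have no direct BIND equivalent in this sketch
    name = record_set["Name"]
    ttl = record_set["TTL"]
    rtype = record_set["Type"]
    return [
        f"{name}\t{ttl}\tIN\t{rtype}\t{rr['Value']}"
        for rr in record_set["ResourceRecords"]
    ]


sample = {  # illustrative record, not exported data
    "Name": "gitlab-01.anshinhealth.net.",
    "Type": "A",
    "TTL": 300,
    "ResourceRecords": [{"Value": "10.10.96.41"}],
}
for line in to_bind_lines(sample):
    print(line)
```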
Kolla-Ansible: OpenStack Deployment (Planned)
Kolla-Ansible will manage the deployment and lifecycle of the entire OpenStack platform. Every OpenStack service (Nova, Neutron, Cinder, Glance, Keystone, Heat, Octavia, Designate, Trove, Manila) will run as a Docker container.
Key configuration files:
- globals.yml: Service selection, networking backends, storage backends, TLS config
- inventory/: Host inventory defining control, compute, network, and storage roles
Specific globals.yml parameters, Kolla-Ansible version, and target OpenStack release are pending.
Terraform: OpenStack Provisioning (Planned)
The same Terraform patterns used for Proxmox will be adapted for the OpenStack provider, using the same AWS-parity subnet layout (10.100.0.0/16 with /20 blocks per type). Atmos will be used for multi-environment configuration management.
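The subnet layout can be enumerated with the standard library to confirm how many /20 blocks are available. This is a sketch of the arithmetic only; which block is assigned to which subnet type is not specified in the source and is not assumed here.

```python
import ipaddress

vpc = ipaddress.ip_network("10.100.0.0/16")
blocks = list(vpc.subnets(new_prefix=20))  # split the /16 into /20 blocks

print(len(blocks))              # 16
print(blocks[0], blocks[1])     # 10.100.0.0/20 10.100.16.0/20
print(blocks[0].num_addresses)  # 4096
```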
Document Control
| Rev | Date | Author | Description |
|---|---|---|---|
| 1.0 | 2026-02-24 | Marc Mercer | Initial release |