This guide provides detailed procedures for backup and restore of PostgreSQL in on-premises Kubernetes Crosswork Assurance deployments.
Note: This procedure applies to both
postgresandyang-postgresinstances.
Prerequisites
kubectlaccess to the Crosswork Assurance clusterSSH access to cluster nodes (for restore)
Familiarity with OpenEBS local storage paths
Backup Overview
PostgreSQL backups run automatically via Kubernetes CronJobs at 01:00 UTC daily. Backups are stored in MinIO at:
<bucket-name>/postgres/v2/
Verify the backup jobs exist:
kubectl get cronjobs -n pca | grep postgres
Trigger Backup Manually
From any pod with curl (such as airflow):
kubectl exec -it -n pca airflow-0 -- sh
curl http://postgres:10004/backup
Expected output: Successfully ran Backup on postgres
Access Backups in MinIO
Connect to the MinIO pod:
kubectl exec -it -n pca pca-minio-pool-0-0 -- sh . /tmp/minio/config.envConfigure the MinIO client:
mc alias set --insecure pca https://localhost:9000 "$MINIO_ROOT_USER" "$MINIO_ROOT_PASSWORD"List available backups:
mc --insecure ls pca/<bucket-name>/postgres/v2/Copy a backup to the pod filesystem:
mc --insecure cp -r \ pca/<bucket-name>/postgres/v2/postgresBackup-<timestamp>/ \ /tmp/
Export Backup to Admin VM
The MinIO image does not include tar, so use cat to stream files:
Set the backup directory name:
export PG_BKUP="postgresBackup-<timestamp>" mkdir $PG_BKUPCopy the backup files:
kubectl exec -n pca pca-minio-pool-0-0 -- cat /tmp/$PG_BKUP/pg_wal.tar > $PG_BKUP/pg_wal.tar kubectl exec -n pca pca-minio-pool-0-0 -- cat /tmp/$PG_BKUP/base.tar > $PG_BKUP/base.tarVerify checksums:
kubectl exec -n pca pca-minio-pool-0-0 -- cksum /tmp/$PG_BKUP/base.tar kubectl exec -n pca pca-minio-pool-0-0 -- cksum /tmp/$PG_BKUP/pg_wal.tar cksum $PG_BKUP/*.tarChecksums must match exactly.
Identify Storage Location
Embedded and air-gapped deployments use OpenEBS local storage. PVC data resides on the node filesystem.
Identify the PostgreSQL pod's node and PVC:
kubectl get pods -n pca -o wide | grep postgres-0 kubectl get pvc -n pca | grep postgres-data-postgres-0Set environment variables:
export PG_HOST=<node-name> export PG_PVC=<pvc-id> export PG_BKUP="postgresBackup-<timestamp>"Copy the backup to the target node:
scp -r ./$PG_BKUP \ ${PG_HOST}:/var/lib/embedded-cluster/openebs-local/$PG_PVC/
Extract Backup on Target Node
On the node where the PVC resides:
cd /var/lib/embedded-cluster/openebs-local/$PG_PVC/$PG_BKUP
tar xf base.tar -C pgdata
tar xf pg_wal.tar -C pgdata/pg_wal/
chmod 600 pgdata
chown 70:root pgdata
chown 70:70 pgdata/*
Restore Procedure
Warning: This procedure replaces existing data. Ensure you have a valid backup before proceeding.
Step 1: Stop Dependent Services
for deploy in cerbos coordinator gather overlord sky-zitadel-gw skylight-aaa zitadel; do
kubectl scale deployment -n pca $deploy --replicas=0
done
for sts in airflow airflow-scheduler; do
kubectl scale statefulset -n pca $sts --replicas=0
done
Step 2: Verify No Active Connections
kubectl exec -it -n pca postgres-0 -- psql -U postgres
SELECT DISTINCT client_addr, usename, datname
FROM pg_stat_activity
WHERE client_addr IS NOT NULL;
This query should return 0 rows.
Step 3: Scale Down PostgreSQL
kubectl scale statefulset -n pca postgres --replicas=0
Step 4: Replace Data Directory
On the target node:
cd /var/lib/embedded-cluster/openebs-local/$PG_PVC
mv pgdata pgdata_$(date +%Y-%m-%d_%H-%M-%S)
cp -r $PG_BKUP/pgdata ./
Step 5: Restart Services
Start PostgreSQL:
kubectl scale statefulset -n pca postgres --replicas=1
kubectl logs -f -n pca postgres-0
Start dependent services:
for deploy in cerbos coordinator gather overlord sky-zitadel-gw skylight-aaa zitadel; do
kubectl scale deployment -n pca $deploy --replicas=1
done
for sts in airflow airflow-scheduler; do
kubectl scale statefulset -n pca $sts --replicas=1
done
Post-Restore Validation
Expected Data Gaps
Backups run at 01:00 UTC. Any data generated after that time (typically via Airflow jobs) will be missing.
Backfill Airflow Jobs
Determine the missing hours and re-run ingestion jobs:
kubectl exec -it -n pca airflow-0 -- sh
for hour in 05 06 07 08 09 10 11 12 13 14 15 16 17; do
payload=$(printf '{"logical_date":"<date>T%s:00:00Z"}' "$hour")
curl -X POST http://airflow:8080/airflowui/api/v1/dags/Batch_Ingestion_v1.1_contemporary/dagRuns \
-H 'Content-Type: application/json' \
-d "$payload"
sleep 180
done
Note: Only two Airflow job slots are available. Adjust the
sleepinterval based on job duration and system load.
Related Topics
© 2026 Cisco and/or its affiliates. All rights reserved.
For more information about trademarks, please visit: Cisco trademarks
For more information about legal terms, please visit: Cisco legal terms