# Postgres

## Prerequisites
- template-operator is installed
- canary-checker is installed
- An S3-compatible object store is available to store logical backups of Postgres clusters
## Deploy
Postgres databases can be deployed using the Zalando Postgres Operator.
`karina.yml`

```yaml
postgresOperator:
  version: v1.3.4.flanksource.1
  backupPassword: passwordForResticBackup # Optional, but can't be changed once the operator is deployed
  defaultBackupRetention: # Optionally specify the backup retention; this can be overridden per cluster in the db.flanksource.com/v1 PostgresqlDB CR, see example below
    keepHourly: 24
    keepDaily: 7
    keepWeekly: 4
    keepMonthly: 6
    keepYearly: 1
templateOperator:
  version: v0.1.11
canaryChecker:
  version: v0.15.1
## Below are optional configurations:
# S3 connection information (to store logical backups of all Postgres clusters)
s3:
  access_key: minio
  secret_key: minio123
  endpoint: http://minio.minio.svc:9000
# Only applicable if you want to use MinIO as the S3 object store
minio:
  version: RELEASE.2020-09-02T18-19-50Z
  access_key: minio
  secret_key: minio123
  replicas: 1
```
Deploy using karina:

```shell
karina deploy postgres-operator -c karina.yml
```
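Once the deploy completes, a quick way to confirm the operator is running is to check its pods (the `postgres-operator` namespace is an assumption based on the examples below):

```shell
# all pods should reach the Running state
kubectl get pods -n postgres-operator
```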
A CRD called PostgresqlDB is used as a wrapper around the core Zalando objects. Once the operator is deployed, you can create a new database:
`db.yml`

```yaml
apiVersion: db.flanksource.com/v1
kind: PostgresqlDB
metadata:
  name: db
  namespace: postgres-operator
spec:
  backup:
    bucket: postgres-backups
    schedule: "0 */4 * * *"
    retention: # Optionally specify the backup retention for this cluster; this overrides the backup retention defined in the operator configuration above
      keepHourly: 10
      keepDaily: 3
      keepWeekly: 2
      keepMonthly: 1
      keepYearly: 1
  cpu: 4000m
  memory: 8Gi
  replicas: 3
  storage:
    size: 200Gi
  parameters:
    archive_mode: "on"
    archive_timeout: 60s
    log_destination: stderr
    max_connections: "600"
    shared_buffers: 2048MB
```
```shell
kubectl apply -f db.yml
```
The template operator will pick up the new db.flanksource.com/v1 object and create the underlying Zalando Postgres objects, CronJobs for backups, and two Canaries: one for backup freshness and another for connectivity to the Postgres instance.
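To see what was generated for the new database, one option is to list the resources by kind (the `postgresql` and `canary` resource names assume the Zalando and canary-checker CRDs are installed as above):

```shell
# Zalando cluster, backup CronJobs and canaries created for the database
kubectl get postgresql,cronjob,canary -n postgres-operator
```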
## Day 2 Tasks
### Failover
### Clone
This command will create a new database cluster restored from WAL backup of another cluster.
```shell
karina db clone --name test1-clone --clone-cluster-name postgres-test1 --clone-timestamp "2020-04-05 14:01:00 UTC"
```
See the `karina db clone` documentation for all command line arguments.
### Backup
This command will perform a logical backup of the given cluster:

```shell
# Run backup once
karina db backup --name test1

# Deploy a cron job to run a backup every day at 04:00 AM
karina db backup --name test1 --schedule "0 4 * * *"
```
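When a schedule is supplied, the backup runs as a CronJob; a quick way to confirm it was created (the namespace and name pattern are assumptions):

```shell
# the CronJob name is derived from the cluster name
kubectl get cronjobs -n postgres-operator | grep test1
```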
See the `karina db backup` documentation for all command line arguments.
### Restore
This command will list all backups of a given cluster:

```shell
karina db backup list --name test1
```
This command will restore a given cluster from a previous logical backup:

```shell
karina db restore "/backup-file-path.sql" --name test1
```
You can also restore a given cluster from a logical backup located in a different bucket:

```shell
karina db restore bucket-name "/backup-file-name.sql" --name test1
```
See the `karina db restore` documentation for all command line arguments.
### Port Forwarding
- Retrieve the password:

```shell
kubectl get secret postgres.postgres-{DB-NAME}.credentials -o json -n postgres-operator | jq -r '.data.password' | base64 -D
```

- Port forward the DB port:

```shell
kubectl port-forward pod/postgres-{DB-NAME}-0 5432 -n postgres-operator
```

- Connect to the database via `localhost:5432`
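For example, connecting with `psql` using the password retrieved in the first step (the `postgres` user and database are defaults and may differ in your cluster):

```shell
# prompts for the password decoded from the secret above
psql -h localhost -p 5432 -U postgres postgres
```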
### Disaster recovery
In the event that the patroni cluster fails completely, the following steps can be followed to reset the cluster state and recover from backup:
Scale down the cluster:
```shell
kubectl scale statefulset broken-cluster --replicas=0
```
Delete the existing persistent volume claims:
```shell
# get list of PVCs
kubectl get pvc | grep broken-cluster

# run the delete command for each
kubectl delete pvc pgdata-broken-cluster-0
```
Create new persistent volume claims:
```shell
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pgdata-broken-cluster-0
  namespace: postgres-operator
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: vsan
  volumeMode: Filesystem
EOF
```
Scale the replicas back up:

```shell
kubectl scale statefulset broken-cluster --replicas=n
```
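To watch for the leader election mentioned below, one option is to follow each pod's Patroni logs (pod names are examples):

```shell
# repeat for each replica, e.g. broken-cluster-1
kubectl logs -f broken-cluster-0 | grep -i leader
```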
At this point, if both pods report `leader: none` in their logs, delete all the endpoints related to the cluster:

```shell
kubectl get endpoints | grep broken-cluster
```
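Then delete each endpoint returned above. The exact names come from your `kubectl get` output; the ones below are illustrative:

```shell
kubectl delete endpoints broken-cluster broken-cluster-repl broken-cluster-config
```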
Once the cluster reports a leader and replicas, a database restore can be run (see `karina db restore` above).
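One way to confirm a leader has been elected, assuming the Spilo image's bundled `patronictl`, is:

```shell
# shows each member's role (Leader / Replica) and state
kubectl exec -it broken-cluster-0 -- patronictl list
```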
### Useful utilities
- `pg_ctl stop` should be used to stop the Postgres server if required. This can only be done after `sv stop patroni` has been run.
- `pg_resetwal` can be used to recover from corrupted or missing WAL files.