Redis: Single Replica to Redis in High Availability (HA) Mode Migration
Starting from Cognigy.AI release v4.65, the single-replica Redis setup is deprecated and replaced with Redis in High Availability (HA) Mode.
Prepare Redis and Redis persistent HA Configuration
Before upgrading to Cognigy.AI v4.65, you must perform the following steps and modify the `values.yaml` of your Cognigy.AI Helm Release accordingly.
Self-Managed Redis Installations
If you do not use the Redis and Redis persistent services delivered with the Cognigy.AI Helm Chart and use self-managed external Redis services instead, ensure that you have the following variables in your `values.yaml`:

    statefulRedis:
      enabled: false
    statefulRedisPersistent:
      enabled: false

Add the following variables to `values.yaml`:

    redisHa:
      enabled: false
    redisPersistentHa:
      enabled: false

Remove the `statefulRedis` and `statefulRedisPersistent` sections from `values.yaml` and skip the remaining steps in this migration guide.
Cloud Infrastructure Configuration
- Redis and Redis persistent in HA mode are provisioned with 3 replicas to increase service availability. Ensure that your Kubernetes cluster has enough free capacity for additional Redis and Redis persistent pods in HA setups. In total, both configurations require an additional provision of 3 CPU cores and 3GB RAM in the cluster.
- Check the `reclaimPolicy` of the existing `redis-persistent` StorageClass with the following command:

      kubectl get storageclass redis-persistent -o yaml

  If `reclaimPolicy: Delete` is set, you can skip the Persistent Volume Clean-up section of this guide. If `reclaimPolicy: Retain` is set, you will need to clean up the Persistent Volume and the underlying disk manually as described in the Persistent Volume Clean-up section.
- The Redis HA default settings assume that you run your cluster with 3 Availability Zones (AZs), which is the setup recommended for Cognigy.AI, and the Helm Release spreads the HA replicas across the 3 AZs. If your cluster is provisioned without Availability Zones, override the zonal `podAntiAffinity` in `values.yaml`:

      redisHa:
        replica:
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution: []
      redisPersistentHa:
        replica:
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution: []
- If your cloud provider is neither AWS nor Azure, create the `redis-persistent-ha` StorageClass manually. Set all the parameters of the new `redis-persistent-ha` StorageClass equal to those of the existing `redis-persistent` StorageClass:
    - Get the existing `redis-persistent` StorageClass and store it in the `redis-persistent-ha.yaml` file: `kubectl get storageclass redis-persistent -o yaml > redis-persistent-ha.yaml`.
    - Open the `redis-persistent-ha.yaml` file in a text editor. Change the `name:` field to `redis-persistent-ha`. Remove the `uid:`, `resourceVersion:`, and `creationTimestamp:` fields.
    - Save the file and create the new `redis-persistent-ha` StorageClass by applying it to the cluster: `kubectl apply -f redis-persistent-ha.yaml`.
    - Check that the new StorageClass is created in the cluster: `kubectl get storageclass redis-persistent-ha -o yaml`.
- If your cloud provider is either AWS or Azure, the `redis-persistent-ha` StorageClass will be created automatically. Before upgrading the Helm Release, ensure that:
    - On AWS: the `gp3` storage and the `ebs.csi.aws.com` provisioner are enabled in your cluster.
    - On Azure: the `Premium_LRS` storage account type and the `disk.csi.azure.com` provisioner are enabled in your cluster.
    - Alternatively, you can override the `redisPersistentHa` settings under the `storageClass:` section to match the parameters of the existing `redis-persistent` StorageClass. See `values.yaml` for the AWS and Azure reference.
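The manual StorageClass steps above could be scripted roughly as follows. This is a minimal sketch, not part of the official procedure: the function names `storageclass_reclaim_policy` and `clone_storageclass_manifest` are hypothetical, and the `sed` filter assumes the two-space indentation that `kubectl get -o yaml` normally emits under `metadata:`. Review the generated file before applying it.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative and not part of the Helm Chart.

# Print the reclaimPolicy of a StorageClass (default: redis-persistent).
storageclass_reclaim_policy() {
  kubectl get storageclass "${1:-redis-persistent}" -o jsonpath='{.reclaimPolicy}'
}

# Read a StorageClass manifest on stdin and write a copy to stdout that is
# renamed to redis-persistent-ha and stripped of uid, resourceVersion, and
# creationTimestamp. Only keys inside the metadata block are touched.
clone_storageclass_manifest() {
  sed -E '/^metadata:/,/^[^[:space:]]/{
    /^  (uid|resourceVersion|creationTimestamp):/d
    s/^  name: .*/  name: redis-persistent-ha/
  }'
}

# Example usage (requires access to the cluster):
#   storageclass_reclaim_policy
#   kubectl get storageclass redis-persistent -o yaml \
#     | clone_storageclass_manifest > redis-persistent-ha.yaml
#   kubectl apply -f redis-persistent-ha.yaml
```

Restricting the edit to the `metadata:` block avoids rewriting a `name` key that might appear under `parameters:`.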
Migrate Custom Redis and Redis Persistent Configuration
- If you do not have any custom configuration under the `statefulRedis` and `statefulRedisPersistent` sections in your Cognigy.AI Helm Release `values.yaml`, skip this section.
- If you have a custom configuration under the `statefulRedis` and/or `statefulRedisPersistent` sections, copy it under `redisHa` and `redisPersistentHa` respectively as follows:
    - If `statefulRedis.auth.password` is defined in cleartext, copy its value under `redisHa.auth.password`.
    - If `statefulRedisPersistent.auth.password` is defined in cleartext, copy its value under `redisPersistentHa.auth.password`.
    - If a custom `statefulRedis.auth.existingSecret` is defined, copy its value under `redisHa.auth.existingSecret`. Ensure that the corresponding custom secret exists in the cluster.
    - If a custom `statefulRedisPersistent.auth.existingSecret` is defined, copy its value under `redisPersistentHa.auth.existingSecret`. Ensure that the corresponding custom secret exists in the cluster.
    - If custom `resources` for `statefulRedis` are defined in your `values.yaml`, copy the `resources` section (including both `requests` and `limits`) to `redisHa.replica.resources`. Set the `maxmemory` setting under the `redisHa.replica.configuration` parameter to 85% of `resources.limits.memory`. Refer to `values.yaml` for details.
    - If custom `resources` for `statefulRedisPersistent` are defined in your `values.yaml`, copy the `resources` section (including both `requests` and `limits`) to `redisPersistentHa.replica.resources`. Refer to `values.yaml` for details.
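To illustrate the 85% rule for `maxmemory`, the following sketch derives the value from a memory limit. The helper name `redis_maxmemory_from_limit` is hypothetical, it handles only whole-number limits with the `Mi` and `Gi` suffixes, and it prints just the size. How that size is written into `redisHa.replica.configuration` should be taken from the chart's `values.yaml`.

```shell
#!/usr/bin/env bash
# Sketch only: the helper name is illustrative, not part of the Helm Chart.

# Print 85% of a Kubernetes memory limit as a Redis size in mb.
# Redis treats 1mb as 1024*1024 bytes, which matches the Kubernetes Mi unit.
redis_maxmemory_from_limit() {
  local limit="$1" number factor
  case "$limit" in
    *Gi) number="${limit%Gi}"; factor=1024 ;;
    *Mi) number="${limit%Mi}"; factor=1 ;;
    *) echo "unsupported memory quantity: $limit" >&2; return 1 ;;
  esac
  case "$number" in
    ''|*[!0-9]*) echo "expected a whole number before the suffix: $limit" >&2; return 1 ;;
  esac
  # 10# forces base 10 so that a leading zero is not read as octal.
  echo "$(( 10#$number * factor * 85 / 100 ))mb"
}

# Example usage:
#   redis_maxmemory_from_limit 2Gi
```

The result is rounded down to a whole megabyte, so it never exceeds 85% of the limit.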
Upgrade Cognigy.AI Helm Release to v4.65
- Double-check that all parameters in your `values.yaml` are adjusted as described above.
- Perform the upgrade of the Cognigy.AI Helm Release to v4.65 as usual. During the upgrade:
    - New `redis-ha-node` and `redis-persistent-ha-node` StatefulSets, along with their corresponding pods, will be created in the cluster.
    - Old `redis` and `redis-persistent` Deployments and their corresponding pods will be removed from the cluster.
    - Cognigy.AI services will reconnect to the Redis and Redis persistent HA setups.
    - Verify that all the pods are running as expected by executing: `kubectl get pods -n=cognigy-ai`.
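In addition to listing the pods, you could wait for the two new StatefulSets to finish rolling out. This is a minimal sketch: the function name `wait_for_redis_ha` and the 10-minute default timeout are assumptions, while the StatefulSet names and the `cognigy-ai` namespace are taken from the steps above.

```shell
#!/usr/bin/env bash
# Sketch only: function name and default timeout are illustrative.

# Wait until both HA StatefulSets report a completed rollout.
# Arguments: namespace (default cognigy-ai), timeout (default 10m).
wait_for_redis_ha() {
  local namespace="${1:-cognigy-ai}" timeout="${2:-10m}" sts
  for sts in redis-ha-node redis-persistent-ha-node; do
    kubectl rollout status "statefulset/${sts}" -n "$namespace" --timeout="$timeout" || return 1
  done
}

# Example usage (requires access to the cluster):
#   wait_for_redis_ha && kubectl get pods -n=cognigy-ai
```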
Persistent Volume Clean-up
After upgrading Cognigy.AI to v4.65 and verifying that the release works properly, you can clean up the remaining persistent volume (PV) for the old `redis-persistent` Deployment:
- If `reclaimPolicy: Delete` was set for the old `redis-persistent` StorageClass, skip this section. The underlying PV and PVC will be deleted automatically.
- If `reclaimPolicy: Retain` was set for the old `redis-persistent` StorageClass, manually remove the PV associated with the old `redis-persistent` Deployment and the underlying disk in your cloud infrastructure:
    - Get the PVs in your cluster: `kubectl get pv`.
    - Write down the name of the PV in the `Released` state with CLAIM `cognigy-ai/redis-persistent` (referenced below as `PV_NAME`).
    - Delete the PV with `kubectl delete pv PV_NAME`.
    - Delete the storage disk corresponding to `PV_NAME` in your cloud environment.
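Finding `PV_NAME` can also be done with a filter instead of reading the `kubectl get pv` table by eye. The sketch below is one possible approach, not the documented procedure: the function name `find_released_redis_pv` is hypothetical, and it assumes the old claim is named `redis-persistent` in the `cognigy-ai` namespace, as shown in the CLAIM column above.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is illustrative.

# Print the names of PVs that are Released and were bound to the
# cognigy-ai/redis-persistent claim.
find_released_redis_pv() {
  kubectl get pv --no-headers \
    -o custom-columns='NAME:.metadata.name,PHASE:.status.phase,NS:.spec.claimRef.namespace,CLAIM:.spec.claimRef.name' |
    awk '$2 == "Released" && $3 == "cognigy-ai" && $4 == "redis-persistent" { print $1 }'
}

# Example usage (requires access to the cluster):
#   find_released_redis_pv
```

Check the printed name against the `kubectl get pv` output before running `kubectl delete pv`, because deleting the wrong PV is not reversible.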