DB Failover and DR

Hi all,

How are you all dealing with DB failover and disaster recovery? Since the Aptible DB runs on EC2, and those instances get scheduled for retirement or suffer hardware failures from time to time, we are looking for the best way to recover from a failure as quickly as possible.

Ideally, we would like to have a private DNS record for our master and replica DBs, point our apps at this record, and change the record in case of a failover. In our opinion, this is the fastest and most secure way of handling failovers. Private DNS is supported by AWS even for peered VPCs in different AWS accounts; however, it is not supported by Aptible, so it’s not an option for us.

The workaround is to have a public record pointing to the private DB endpoint. However, we are not comfortable with this: we are a healthcare company, and having a public DNS record for our DB (even though it points to a private endpoint) is a security concern.

The last option I know of is to write and maintain a script that changes each app’s settings to point to the new master DB endpoint. This is the option we will probably go with, but it’s not ideal: recovery will take longer, and we will also need to maintain the list of apps whose DB URI has to change.
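To make the script option concrete, here is a rough sketch of what we have in mind. It shells out to the Aptible CLI's config:set command once per app. The app handles, the DATABASE_URL variable name, and the helper names are placeholders for illustration, not our real setup.

```python
import subprocess

# Hypothetical app handles and variable name; replace with your own.
APPS = ["api", "worker", "admin"]
ENV_VAR = "DATABASE_URL"


def build_commands(apps, new_db_url, env_var=ENV_VAR):
    """Return one `aptible config:set` argument list per app."""
    return [
        ["aptible", "config:set", "--app", app, f"{env_var}={new_db_url}"]
        for app in apps
    ]


def fail_over(apps, new_db_url, dry_run=True):
    """Point every app at the new primary.

    Returns the handles of the apps whose update failed.
    In dry-run mode the commands are only printed (note that this
    prints the full URL, including any credentials it contains).
    """
    failed = []
    for cmd in build_commands(apps, new_db_url):
        if dry_run:
            print(" ".join(cmd))
            continue
        result = subprocess.run(cmd)
        if result.returncode != 0:
            failed.append(cmd[3])  # cmd[3] is the app handle
    return failed
```

Calling fail_over(APPS, new_url, dry_run=False) would run the updates one after another. The downside described above still applies: the APPS list has to be kept in sync with whatever is actually deployed.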

I was wondering how others have set this up. Is there a better, faster, or more secure way to deal with DB failure?

Thanks!

Hi Petr,

Can you talk a bit more about your security concerns related to public DNS records for private IP addresses? Aptible’s database DNS records are already public (even though the underlying IP addresses are all private RFC 1918 addresses).
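If you want to verify this for your own database hostnames, a small check along these lines should do it. This is only a sketch: the function names are made up for illustration, and it compares resolved IPv4 addresses against the three RFC 1918 ranges explicitly rather than relying on a broader notion of "private".

```python
import ipaddress
import socket

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(address):
    """Return True if the IP address string falls in an RFC 1918 range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


def resolves_to_private(hostname):
    """Return True if every IPv4 address the hostname resolves to is RFC 1918.

    Raises socket.gaierror if the hostname does not resolve.
    """
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    addresses = {info[4][0] for info in infos}
    return all(is_rfc1918(address) for address in addresses)
```

Running resolves_to_private against a database hostname from outside the VPC shows what a public resolver hands out for that record.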

Also, have you evaluated the speed of aptible config:set compared to the time it takes to update and propagate a DNS record? A config:set operation takes right around 1 minute to complete on average (counting blue-green switchover time), so it may ultimately be faster than the DNS update approach you’re describing.

— Frank