Architecting for Active/Active Operations
Organizations understand the importance of having a disaster recovery (DR) plan in place so the business can quickly restore operations in the event of a failure. The emerging gold standard for DR architecture is an active/active data center configuration, with data synchronization and load balancing for applications across two or more data centers.
In an active/active architecture, both data centers – or as many data centers as are included in the design – serve application traffic at the same time. This design is superior to the more common active/standby design because it provides an insurance policy against failures and also puts the secondary data centers to work, increasing capacity and therefore application performance.
In yesterday’s webinar on architecting for active/active operations, Aaron Lee of database consultancy Pythian made some great observations about why we’re stuck where we are (active/standby vs. active/active) in our data center architectures and how we can more easily progress toward a more robust design.
Active/active designs have secondary systems actively participating in database operations, supporting read operations and ready to take over in handling writes as well. Active/standby designs, on the other hand, have idle systems that do not serve any database traffic until they take over as the operational system.
“DR [with standby] has been easy to cost justify, but active/active has been harder,” said Lee. The reason it’s been tough to show the business return, he noted, is that the most common means has been the very laborious DIY approach of building mechanisms for active failover directly into the app.
Lee called out the pros and cons of three approaches to architecting for active/active ops:
- build mechanisms such as read/write split into the app
- rely on capabilities of the database
- build an abstraction layer that does the work for you
He went on to contend that the abstraction layer approach – with database load balancing software that automatically supports read/write split and failover, with no app changes – fits the bill for 80% of organizations’ environments, making it the most pragmatic option most of the time.
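To make the abstraction-layer idea concrete, here is a minimal sketch of the read/write split such a layer performs: plain reads fan out across replicas, and everything else goes to the primary. The server names and the statement classifier are illustrative assumptions, not ScaleArc's actual implementation.

```python
import itertools

# Hypothetical server names for illustration only.
PRIMARY = "db-primary"
REPLICAS = ["db-replica-1", "db-replica-2"]

_replica_cycle = itertools.cycle(REPLICAS)


def is_read(statement: str) -> bool:
    """Classify a SQL statement: only plain SELECTs are safe to send to a replica."""
    first_word = statement.lstrip().split(None, 1)[0].upper()
    return first_word == "SELECT"


def route(statement: str) -> str:
    """Return the server a statement should be sent to."""
    if is_read(statement):
        return next(_replica_cycle)  # round-robin across read replicas
    return PRIMARY  # writes, DDL, and transactions go to the primary
```

The point of the sketch is that the routing decision lives entirely in the middle tier: the application issues ordinary SQL and never needs to know which server answers.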
The audience had some great questions along the way:
Where does the abstraction layer get deployed? On the database, on the app server?
It sits between the two, on its own bare metal or VM instance. Or it can be deployed in the cloud. But close to the app is valuable.
Where is replication handled – in the database, or in the abstraction layer?
ScaleArc relies on the database for replication, because with that approach, we can support any replication scheme the customer wants to use, and we’re totally agnostic to the replication method.
Is ScaleArc app agnostic or is there a list of apps certified for ScaleArc?
ScaleArc is 100% app agnostic. We process traffic at the SQL protocol level, ensuring broad interoperability across any application talking to MySQL, SQL Server, or Oracle databases. Transparent deployment – with no changes needed to the app or database – is a fundamental tenet of our architecture.
How is ScaleArc protected in terms of server or data center failure?
Customers deploy ScaleArc in HA mode, pairing two ScaleArc instances to front-end the database. Cache content and configuration are synched between the two, ensuring rapid failover.
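One common way an HA pair like this detects failure is heartbeating: the standby promotes itself if the active instance misses too many heartbeats in a row. The sketch below illustrates that pattern under assumed names and thresholds; it is not ScaleArc's actual failover mechanism.

```python
# Illustrative threshold; real deployments tune this to network conditions.
MISSED_HEARTBEAT_LIMIT = 3


class ProxyInstance:
    """One member of an HA pair of database proxies."""

    def __init__(self, name: str, role: str):
        self.name = name
        self.role = role  # "active" or "standby"
        self.missed_heartbeats = 0

    def heartbeat_received(self):
        self.missed_heartbeats = 0  # peer is alive; reset the counter

    def heartbeat_missed(self):
        self.missed_heartbeats += 1
        if self.role == "standby" and self.missed_heartbeats >= MISSED_HEARTBEAT_LIMIT:
            self.role = "active"  # promote: take over serving database traffic


standby = ProxyInstance("proxy-b", "standby")
for _ in range(MISSED_HEARTBEAT_LIMIT):
    standby.heartbeat_missed()
```

Because cache content and configuration are already synchronized between the pair, the promoted instance can begin serving traffic immediately.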
What is your perspective on native database replication vs. storage array-based replication?
ScaleArc contends that data accuracy and timeliness are much higher with database-based replication than with storage replication. In addition, ScaleArc has built-in mechanisms to track database replication timing, ensuring we don’t send reads to a secondary that has fallen too far behind in replication – something you can’t do with storage replication.
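The lag-tracking idea can be sketched in a few lines: a read is only routed to a secondary whose measured replication lag is under a threshold, falling back to the primary when every replica is too far behind. The names, threshold, and lag source here are illustrative assumptions, not ScaleArc's implementation.

```python
# Hypothetical threshold: how stale a read we are willing to serve, in seconds.
MAX_LAG_SECONDS = 5.0
PRIMARY = "db-primary"


def pick_read_server(replica_lag: dict) -> str:
    """Choose a server for a read.

    replica_lag maps replica name -> measured seconds behind the primary
    (in practice this would come from the database's replication status).
    """
    candidates = [name for name, lag in replica_lag.items() if lag <= MAX_LAG_SECONDS]
    if candidates:
        # Prefer the least-lagged eligible replica.
        return min(candidates, key=lambda name: replica_lag[name])
    return PRIMARY  # all replicas too stale; serve the read from the primary
```

Storage-level replication offers no per-statement hook like this, which is the crux of the argument for database-based replication here.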
Thanks again to a great audience yesterday – getting so many good questions along the way makes for a much more interesting conversation for everyone.