Key Essentials for Disaster Recovery

Given the financial sector's stringent requirements for digitalization and security, Caida Securities adopted a metro-area active-active architecture for cross-site cluster protection and disaster recovery (DR) after a comprehensive evaluation of recovery metrics, technical maturity, and cost-efficiency. Traditional active-passive models suffer from low resource utilization and prolonged downtime during failover; an active-active data center addresses both problems while significantly improving availability. Caida Securities defined the following project objectives:

  • RPO = 0 via Storage Synchronization: Implementing real-time I/O synchronous replication at the storage layer to ensure data integrity and zero loss, even in the event of a critical site failure.
  • Business Continuity Assurance: Supporting cross-site High Availability (HA) for virtual machines, enabling automated failover and rapid restoration of operations during site-level outages.
  • Application-Level Active-Active Support: Integrating the active-active cluster with database and application server groups to achieve transparent failover at the application layer.
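
The RPO = 0 objective rests on one rule of synchronous replication: a write is acknowledged to the application only after both sites have persisted it. The following is a minimal Python sketch of that rule under simplifying assumptions, not SmartX's implementation. The names Site, SyncReplicatedVolume, persist, and write are hypothetical and chosen for illustration only.

```python
class Site:
    """Hypothetical in-memory stand-in for the storage at one site."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.online = True

    def persist(self, lba, data):
        # Simulate a site or link failure by refusing the write.
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        self.blocks[lba] = data


class SyncReplicatedVolume:
    """Hypothetical volume mirrored synchronously across two sites."""

    def __init__(self, primary, secondary):
        self.sites = [primary, secondary]

    def write(self, lba, data):
        # The acknowledgement is returned only after every site has
        # persisted the block. If any site fails, the exception propagates
        # and the application never sees the write as committed.
        for site in self.sites:
            site.persist(lba, data)
        return "ack"


site_a = Site("AZ-A")
site_b = Site("AZ-B")
volume = SyncReplicatedVolume(site_a, site_b)
result = volume.write(1, b"trade-record")
```

Because every acknowledged write exists at both sites, losing either site cannot lose acknowledged data. A production system would additionally need to decide how to continue in a degraded state when one site is down, which this sketch deliberately leaves out.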

Validating Active-Active HCI Cluster Functionality

The Active-Active Hyperconverged Infrastructure (HCI) cluster is deployed in a stretched configuration, consisting of two Availability Zones (AZs) and a witness node, communicating via network interconnects. In the event of an AZ failure, the remaining AZ continues to provide services, ensuring AZ-level DR.
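
The role of the witness node can be illustrated with a generic majority-vote rule: each AZ and the witness hold one vote, and an AZ keeps serving only while it holds at least two of the three. This is a sketch of a common arbitration pattern, assumed here for illustration; the source does not describe the cluster's actual arbitration logic, and the function name may_serve and its parameters are hypothetical.

```python
def may_serve(peer_reachable: bool, witness_vote: bool) -> bool:
    """Return True if this AZ holds at least 2 of the 3 votes.

    peer_reachable: this AZ can still reach the other AZ.
    witness_vote:   the witness is reachable and has granted its single
                    vote to this AZ (assumption: the witness never grants
                    its vote to both AZs at once, which prevents both
                    sides from serving during a partition).
    """
    votes = 1 + int(peer_reachable) + int(witness_vote)
    return votes >= 2


# The other AZ has failed, the witness sides with the survivor.
survivor_serves = may_serve(peer_reachable=False, witness_vote=True)

# This AZ is cut off from both the peer and the witness.
isolated_serves = may_serve(peer_reachable=False, witness_vote=False)

# Only the witness is down; the two AZs still see each other.
witness_down_serves = may_serve(peer_reachable=True, witness_vote=False)
```

Under this rule a single failure (one AZ, the witness, or one link) always leaves at least one AZ with a majority, while an isolated AZ stops serving rather than risk diverging from its peer.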

Caida Securities simulated various failure scenarios, including host outages, AZ (data center) failures, and network disruptions. The tests showed that the Active-Active HCI cluster responds to such failures promptly: standalone business systems recovered within 2–3 minutes, and systems integrated with an Oracle RAC multi-active database architecture met the DR objective of zero RPO and zero RTO.

Active-Active Cluster Solution for Next-Gen HCI

Caida Securities currently utilizes the China Business Center as its primary data center for core business systems, with the Development Zone Telecommunications Center serving as the DR site. The Zhuangjia Financial Mansion center hosts a limited number of management applications. The physical distance between the primary and DR sites is approximately 20km, with a stable network round-trip latency (based on ping tests) of 0.6ms.
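
The latency figure matters because, with synchronous replication, a write generally has to wait for at least one round trip between the sites. As a rough back-of-the-envelope check, the sketch below compares the measured 0.6ms with the propagation floor for 20km of fibre. It assumes light travels through fibre at about 200,000 km/s, and it treats the 20km physical distance as the fibre length, although the actual cable route is not given in the source and may be longer.

```python
# Assumption: light travels through optical fibre at roughly 200,000 km/s,
# i.e. about 200 km per millisecond (around two thirds of its vacuum speed).
FIBRE_KM_PER_MS = 200.0


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over a fibre path of the given length."""
    return 2 * distance_km / FIBRE_KM_PER_MS


SITE_DISTANCE_KM = 20.0  # from the text; the real fibre route may be longer
MEASURED_RTT_MS = 0.6    # from the text (ping test)

floor_ms = min_round_trip_ms(SITE_DISTANCE_KM)
overhead_ms = MEASURED_RTT_MS - floor_ms

print(f"propagation floor:  {floor_ms:.2f} ms")
print(f"remaining overhead: {overhead_ms:.2f} ms")
```

Under these assumptions, propagation accounts for roughly 0.2ms of the measured round trip; the remainder would come from the longer real-world cable path, network equipment, and host processing.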

Based on these functional roles, the China Business and Development Zone centers are designated as the primary and secondary AZs, respectively, with the Zhuangjia site hosting the witness (arbitration) node. By leveraging SmartX stretched cluster technology and its existing Layer 2 active-active networking capabilities, Caida Securities has built an integrated active-active data center architecture.

Enhancing Business Continuity and O&M Efficiency

The HCI active-active cluster provides a resilient foundation for business continuity. By migrating business systems that previously lacked DR capabilities due to resource constraints into the cluster, Caida Securities has improved the emergency readiness of those systems. Systems running on the cluster no longer need separately built DR environments, which reduces IT investment costs.

Currently, Caida Securities uses a combination of automated workflows and manual intervention for emergency switching between production and DR systems. Compared with the legacy, fully manual process, this approach reduces operational workload and complexity, and it strengthens both business continuity and regulatory compliance.
