SmartX, a leading provider of enterprise cloud infrastructure, recently released SMTX Kubernetes Service (SKS) version 1.5. This production-grade container management and service product focuses on enterprise-grade reliability and refined governance.

This release introduces cross-site active-active disaster recovery capabilities, enhances multi-tenancy management, and improves platform observability, strengthening stability and security for enterprise core business applications.

Release Background

As the business applications running on Kubernetes evolve from edge and general-purpose workloads to the enterprise's core production systems, the requirements on service continuity become far more stringent. Traditional single-data-center deployments are inherently fragile when faced with site-level failures such as power outages or network interruptions.

As a result, many CIOs and IT leaders now face the question of how to build a reliable disaster recovery system for containerized core business applications. Simple backup and recovery is no longer enough; users need an active-active application solution with automatic failover.

To address these challenges, SmartX draws on its experience in active-active storage and integrates those mature capabilities with the container platform. This makes SKS a production-grade platform that can support enterprises' mission-critical workloads.

Supporting Active-Active Deployment to Enhance Kubernetes Cluster Reliability

SKS 1.5 introduces an active-active cluster deployment capability based on SmartX hyper-converged software. A single Kubernetes cluster is distributed across a primary and a secondary availability zone, which enables real-time data synchronization, dynamic resource scheduling, and rapid fault recovery. If the primary zone fails, the system can quickly restore service in the secondary zone through automatic failover, improving the disaster recovery capability and business continuity of the Kubernetes platform.

Compared to native Kubernetes multi-cluster or active/standby DR solutions, SKS 1.5 active-active deployment offers the following advantages:

  • Dual-layer High Availability for Control and Data Planes: In addition to cross-zone control plane redundancy, workload-level data consistency is achieved through the synchronization mechanism of the distributed storage module in the hyper-converged software, mitigating storage-layer risks that native Kubernetes struggles to cover.
  • Minute-level Business Recovery: By combining VM HA with Kubernetes node self-healing, the Recovery Time Objective (RTO) can be kept within minutes, significantly shorter than with solutions that rely on cross-cluster backup and restore.
  • Unified Cluster and O&M View: No need to build complex systems like multi-cluster Federation or Velero. Cross-availability zone management and failover can be achieved through CloudTower.
  • Balancing Performance and Resource Utilization: Both availability zones simultaneously bear the business load, ensuring high availability while avoiding long-term idle resources typical of traditional cold standby solutions.

Multi-Tenancy Enhancements to Improve Team Collaboration Efficiency

In version 1.4, SKS already achieved multi-tenant isolation and collaboration through project-level permission management. Version 1.5 further introduces support for multi-tenant resource quotas and expands visualization management capabilities, helping enterprises allocate, control, and use resources more precisely in complex multi-team environments.

The new version introduces project-level and Namespace-level resource quota management (CPU, memory, storage, GPU). Quotas can be set in the UI, which also displays resource usage statistics in real time. When usage approaches the quota threshold, the system automatically sends alerts to prevent resource contention and system risks.
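The quota-threshold alerting described above can be illustrated with a minimal Python sketch. It is not SKS code: the `QuotaEntry` type, the `quotas_near_limit` function, the sample figures, and the 80% threshold are all hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaEntry:
    """One resource dimension of a hypothetical namespace quota."""
    resource: str  # e.g. "cpu", "memory", "storage", "gpu"
    used: float    # current usage, in the same unit as limit
    limit: float   # quota ceiling


def quotas_near_limit(entries, threshold=0.8):
    """Return (resource, usage ratio) pairs at or above the threshold.

    The 0.8 default is an illustrative assumption, not an SKS default.
    """
    alerts = []
    for entry in entries:
        if entry.limit <= 0:
            continue  # skip quotas that are unset
        ratio = entry.used / entry.limit
        if ratio >= threshold:
            alerts.append((entry.resource, round(ratio, 2)))
    return alerts


# Hypothetical usage figures for one namespace.
team_a = [
    QuotaEntry("cpu", used=13, limit=16),        # cores
    QuotaEntry("memory", used=20, limit=64),     # GiB
    QuotaEntry("storage", used=450, limit=500),  # GiB
    QuotaEntry("gpu", used=1, limit=4),          # devices
]

print(quotas_near_limit(team_a))  # [('cpu', 0.81), ('storage', 0.9)]
```

In this example only CPU and storage cross the assumed threshold, so only those two dimensions would trigger an alert.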

Enhanced Observability and Auditing Capabilities to Build a Fully Controllable Platform

SKS 1.5 continues to improve cluster observability and adds an auditing feature that supports full traceability of user, application, Kubernetes API, and control plane activities, enhancing operational security and compliance. Users can configure audit policies based on actual needs: for resource-constrained or PoC scenarios, the concise policy records only core metadata to minimize resource consumption. SKS 1.5 also supports basic, detailed, and custom policies to suit different production environments.

With these enhancements, SKS supports key O&M functions such as monitoring, alerting, logging, events, auditing, and traffic visualization, helping users promptly discover and handle cluster anomalies and operate their clusters efficiently and reliably.
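For readers unfamiliar with Kubernetes auditing, the sketch below builds a minimal audit policy in the upstream `audit.k8s.io/v1` format, where each rule carries one of the levels `None`, `Metadata`, `Request`, or `RequestResponse`. The source does not show the policy files SKS ships, so mapping the concise policy to the `Metadata` level is an assumption, and the `audit_policy` helper is hypothetical.

```python
import json

# Audit levels defined by upstream Kubernetes (audit.k8s.io/v1).
VALID_LEVELS = ("None", "Metadata", "Request", "RequestResponse")


def audit_policy(level="Metadata", omit_stages=("RequestReceived",)):
    """Build a minimal upstream Kubernetes audit Policy as a dict.

    A single catch-all rule applies the same level to every request.
    """
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown audit level: {level!r}")
    return {
        "apiVersion": "audit.k8s.io/v1",
        "kind": "Policy",
        "omitStages": list(omit_stages),
        "rules": [{"level": level}],
    }


# Assumption: a "concise" policy logs request metadata only, while a
# "detailed" one also logs request and response bodies.
concise = audit_policy("Metadata")
detailed = audit_policy("RequestResponse")

print(json.dumps(concise, indent=2))
```

Logging at `Metadata` omits request and response bodies, which is why a policy of that kind keeps audit volume and resource consumption low.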

User Practices

Since its release, SKS has been deployed in production by users in finance, healthcare, and other industries, helping them build a converged virtualization and container infrastructure. Below are examples of user practices:

  • A Trust Company: Replacing VMware and building an active-active architecture that spans both virtualization and the container platform.
  • Zigong First People’s Hospital: Hyper-convergence and SKS support core business applications such as HIS, accelerating localization and cloud-native transformation.
  • A State-Owned Water Utility: Full-stack hyper-convergence builds a lightweight cloud foundation for unified management of virtualized and container environments.
  • An Autonomous Driving Company: Virtualization and container hybrid infrastructure promotes the architectural upgrade of intelligent port systems.
