Dual-Focus on Application Performance: Driving China Southern Airlines' IT Operations Transformation and Upgrade
Bonree helped China Southern Airlines build a unified monitoring platform, integrating with existing monitoring tools to establish and enhance a comprehensive IT operations monitoring system. Covering over 1,000 application instances, the platform significantly improved overall business performance quality, enabled minute-level fault localization, and ensured a smooth user experience for China Southern Airlines customers.
1. Difficult to Perceive User-Side Access Experience
The entire process of users accessing applications over the network is hard to trace, and issues such as access anomalies are often only discovered through user complaints.
2. Complex Business Call Relationships Make It Difficult to Identify Performance Bottlenecks Across the Entire Chain
China Southern Airlines manages different microservice applications through project teams. When a failure occurs, it's essential to quickly analyze the logical architecture and call relationships of the entire application, identify the responsible project team, and reduce MTTR (Mean Time to Repair).
3. E-commerce Acceleration Makes Ensuring Application Performance a Top Priority
With the rising number of online users, systems face increasing concurrent pressure. Quickly resolving issues and ensuring high-performance application operation has become critically important.
1. Understanding User-Side Access Experience, Focusing on Flight Demand for Overseas Chinese Returning Home
Bonree ONE uses globally distributed monitoring nodes to actively simulate access requests. By measuring latency from multiple overseas regions to the China Southern Airlines official website, it assesses the site’s availability. Through transaction monitoring, it simulates real user operations across the ticket booking process, measuring the response time and success rate of each step. It also monitors access quality from the key countries served by major overseas routes, using monitoring nodes in those same regions, and issues timely alerts so that flight services keep operating normally, significantly improving business availability.
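The per-region evaluation described above can be illustrated with a minimal sketch. It is not Bonree ONE's implementation or API: the `ProbeResult` record, the `summarize` function, the region labels, and the 95% alert threshold are all hypothetical, chosen only to show how simulated transaction results could be rolled up into a success rate, an average response time, and an alert flag.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ProbeResult:
    """One simulated step executed by a monitoring node (hypothetical schema)."""
    region: str        # where the monitoring node is located
    step: str          # e.g. "search", "select_flight", "pay"
    ok: bool           # whether the step completed successfully
    latency_ms: float  # measured response time of the step


def summarize(results, min_success_rate=0.95):
    """Aggregate probe results per region; the threshold is an assumed value."""
    by_region = {}
    for r in results:
        by_region.setdefault(r.region, []).append(r)

    summary = {}
    for region, rs in by_region.items():
        success_rate = sum(r.ok for r in rs) / len(rs)
        # Average latency over successful steps only, so failures do not skew it.
        ok_latencies = [r.latency_ms for r in rs if r.ok]
        summary[region] = {
            "success_rate": success_rate,
            "avg_latency_ms": mean(ok_latencies) if ok_latencies else None,
            "alert": success_rate < min_success_rate,
        }
    return summary
```

In this sketch a region is flagged as soon as its success rate falls below the threshold, which mirrors the idea of alerting on access quality from a specific country rather than on a global average.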
2. Application Topology Visualization, Quantified Performance Metrics, and Rapid Fault Localization
Bonree ONE helps China Southern Airlines understand the load status of each machine and correlate it with application data. It visually presents hardware resource usage percentages, enabling operations personnel to identify issues at a glance. Additionally, through Bonree Server's snapshot analysis function, complete fault scene data can be captured (including code execution stacks, CPU/memory/JVM info, request parameters, SQL statements, JVM arguments, and server resource loads), enabling rapid correlation analysis, significantly reducing troubleshooting time, and improving operational efficiency.
When performance bottlenecks or faults are detected, Bonree ONE can directly pinpoint the problematic system call and assign the issue to the responsible project team, reducing communication overhead, shortening MTTR, and bringing troubleshooting time down to a matter of minutes.
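As a rough illustration of how a failing call can be traced back to an owning team, the sketch below walks a simplified call chain. It is an assumption-laden example, not Bonree ONE's algorithm: the `Span` record, the service names, and the `owners` mapping are hypothetical. The heuristic treats the failing call that has no failing child calls as the likely origin of the fault.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """One call in a traced request (hypothetical, simplified schema)."""
    span_id: str
    parent_id: Optional[str]  # None for the entry call
    service: str
    error: bool


def root_cause_span(spans):
    """Return the first failing span that has no failing child, or None."""
    # Any span listed here has at least one failing child, so the error
    # most likely propagated up from further down the chain.
    failing_parents = {s.parent_id for s in spans if s.error and s.parent_id}
    for s in spans:
        if s.error and s.span_id not in failing_parents:
            return s
    return None


def responsible_team(spans, owners):
    """Map the likely root-cause service to its project team."""
    span = root_cause_span(spans)
    if span is None:
        return None
    return owners.get(span.service, "unassigned")
```

With a chain such as web → booking → payment in which all three calls fail, the sketch attributes the fault to the payment service and returns that service's team, which is the kind of direct assignment that avoids back-and-forth between project teams.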
3. Comprehensive Infrastructure Monitoring for Unified Resource Management and Alerting
Using Bonree ITIM, China Southern Airlines achieved global monitoring of data center hardware, enabling unified management and alerting for foundational resources. The overall project covers approximately 12,000 monitored objects, including servers, firewalls, switches, routers, and virtual machines. This solution delivers a streamlined and clear display of devices, real-time visualization of resource usage, accurate alerting, and intelligent alert correlation.
1. Industry-Leading Technological Innovation
The first in the industry to achieve both CMMI Level 5 certification (the highest level of software capability maturity) and ISO 9001 certification.
2. Core Value of Customer First
Equipped with a professional service team that provides timely 24/7 support, delivering comprehensive and expert services to customers.
1. Significant Improvement in Core Business Availability
The average response time for core business dropped from 1,450 ms to 130 ms, a reduction of roughly 91%, and business availability rose to 98.5%.
2. Application Crash Rate and Request Error Rate Reduced
The app request error rate decreased by 3.7%, and the application crash rate decreased by 2%.
3. Performance Metrics Quantified
Hardware resource usage is visualized, individual business transaction data can be queried precisely, and the time needed to collect abnormal data and troubleshoot has been reduced to the minute level.
4. Continuous Business Monitoring
Real-time monitoring gives a proactive view of user experience, quickly locates anomalies, and improves fault-handling efficiency.