Background Analysis

As Wanglaoji Health Company (hereinafter referred to as “Wanglaoji”) has grown, its IT system architecture has become increasingly complex. System traffic and the number of functional modules keep growing, and application complexity rises with them.

As IT failures and risk points increase, traditional infrastructure monitoring methods are no longer sufficient to meet current operational requirements. At the same time, the company faces multiple challenges, including rapidly changing business demands, continuously increasing user expectations, and pressure for cost reduction and efficiency improvement. As a result, the likelihood of performance degradation or service anomalies in IT applications has significantly increased, impacting overall business continuity.

Establishing an effective application management mechanism and ensuring stable IT system operations have therefore become urgent requirements for business development.

Before the project was implemented, Wanglaoji had no RUM (Real User Monitoring) or APM (Application Performance Monitoring) alerting mechanisms. Awareness of system status depended entirely on user complaints, so the first people to detect an issue were often customer service or business staff rather than operations engineers.

This reactive model significantly delayed incident detection and left operations teams without direct visibility into system health.

Application Scenario

In daily operations, the Bonree ONE platform collects RUM and APM data from Wanglaoji’s SSO and TPM systems in one place, building an end-to-end observability baseline.

When a system anomaly occurs, intelligent alerting strategies notify operations engineers within seconds. A closed-loop troubleshooting process is then executed as follows:

APM Call Chain Analysis: Quickly Define Incident Boundaries

Operations engineers first use the APM module’s full call-chain tracing to accurately locate faulty services and abnormal nodes, quickly determining whether the issue is caused by downstream dependency latency or code-level performance bottlenecks.
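
As a rough illustration of how a call chain can separate downstream latency from code-level bottlenecks, the sketch below computes each span's "self time", that is, its duration minus the time spent in its direct children. It is not Bonree ONE's implementation or API. The `Span` structure, the service names, and the assumption that child calls run sequentially are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    # Hypothetical, simplified span record; real APM traces carry more fields.
    span_id: str
    parent_id: Optional[str]
    service: str
    duration_ms: float


def self_times(spans: List[Span]) -> Dict[str, float]:
    """Return {span_id: duration minus total duration of direct children}.

    Assumes children run sequentially; with parallel children the result
    is clamped at zero rather than going negative.
    """
    child_total: Dict[str, float] = {}
    for s in spans:
        if s.parent_id is not None:
            child_total[s.parent_id] = child_total.get(s.parent_id, 0.0) + s.duration_ms
    return {
        s.span_id: max(s.duration_ms - child_total.get(s.span_id, 0.0), 0.0)
        for s in spans
    }


def slowest_span(spans: List[Span]) -> Span:
    """Pick the span that spent the most time in its own work."""
    times = self_times(spans)
    return max(spans, key=lambda s: times[s.span_id])


# Illustrative trace: gateway -> auth service -> database.
trace = [
    Span("a", None, "sso-gateway", 1200.0),
    Span("b", "a", "sso-auth", 1100.0),
    Span("c", "b", "user-db", 950.0),
]
culprit = slowest_span(trace)
print(culprit.service)
```

In this made-up trace, most of the request time sits in the deepest span, which points to a downstream dependency rather than to the calling service's own code.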

RUM Session Replay: Reconstruct User Behavior

RUM session replay is used to reconstruct real user interaction paths during the incident period. Combined with client IP, device type, and geographic distribution, engineers can determine whether the issue is caused by specific environments or regional network conditions.

This helps eliminate irrelevant client-side factors and ensures optimization efforts focus on true root causes.
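
The environment check described above can be approximated by grouping sessions along one dimension at a time and comparing error rates. This is a minimal sketch with invented session records and field names (`region`, `device`, `had_error`). It does not reflect the platform's actual data model.

```python
from collections import defaultdict
from typing import Dict, List


def error_rate_by(sessions: List[dict], key: str) -> Dict[str, float]:
    """Return the fraction of sessions with an error, grouped by `key`."""
    totals: Dict[str, int] = defaultdict(int)
    errors: Dict[str, int] = defaultdict(int)
    for s in sessions:
        totals[s[key]] += 1
        if s["had_error"]:
            errors[s[key]] += 1
    return {k: errors[k] / totals[k] for k in totals}


# Hypothetical sessions captured during the incident window.
sessions = [
    {"region": "Guangdong", "device": "android", "had_error": True},
    {"region": "Guangdong", "device": "ios", "had_error": True},
    {"region": "Beijing", "device": "android", "had_error": False},
    {"region": "Beijing", "device": "ios", "had_error": False},
]

by_region = error_rate_by(sessions, "region")
by_device = error_rate_by(sessions, "device")
print(by_region, by_device)
```

Here the errors concentrate in one region while the device split is even, which would suggest a regional network condition rather than a device-specific problem.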

Deep Correlation of Middleware Metrics: Identify Hidden Bottlenecks

For complex incidents, detailed monitoring data from databases, caches, and message queues is analyzed. Key metrics such as connection count, response latency, and queue backlog are compared across dimensions.

Through multi-dimensional correlation analysis, hidden issues such as slow SQL queries, connection pool exhaustion, cache breakdowns, or message queue congestion can be quickly identified, significantly reducing trial-and-error troubleshooting costs.
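
A simplified way to picture this correlation is a set of rules applied to one snapshot of middleware metrics. The metric names and thresholds below are assumptions chosen for illustration. A real deployment would tune them per system and would correlate trends over time rather than a single sample.

```python
from typing import Dict, List


def suspect_bottlenecks(m: Dict[str, float]) -> List[str]:
    """Map a metrics snapshot to a list of suspected hidden issues.

    All thresholds are illustrative assumptions, not recommended values.
    """
    findings: List[str] = []
    if m["db_active_connections"] >= 0.95 * m["db_max_connections"]:
        findings.append("connection pool near exhaustion")
    if m["db_p95_latency_ms"] > m["db_latency_threshold_ms"]:
        findings.append("slow SQL queries")
    if m["cache_hit_ratio"] < 0.5:
        findings.append("cache breakdown")
    if m["mq_backlog"] > m["mq_backlog_threshold"]:
        findings.append("message queue congestion")
    return findings


# Hypothetical snapshot taken while an incident is in progress.
snapshot = {
    "db_active_connections": 98,
    "db_max_connections": 100,
    "db_p95_latency_ms": 40,
    "db_latency_threshold_ms": 200,
    "cache_hit_ratio": 0.93,
    "mq_backlog": 12,
    "mq_backlog_threshold": 1000,
}
findings = suspect_bottlenecks(snapshot)
print(findings)
```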

Playbook Accumulation and Knowledge Loop: Continuous Stability Improvement

Based on the above analysis, targeted remediation measures are implemented. After incident resolution, full-chain data, root cause conclusions, and handling processes are standardized and stored in a knowledge base.

When similar incidents occur in the future, historical cases can be automatically referenced, shortening response time and forming a closed-loop process of:

Detection → Diagnosis → Recovery → Knowledge Retention

This continuously strengthens the resilience of Wanglaoji’s business systems.
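
One plausible way to reference historical cases automatically is to match the symptoms of a new incident against stored cases. The sketch below uses Jaccard similarity over symptom tags. The case records, tag names, and the choice of similarity measure are assumptions for illustration, since the source does not describe how the knowledge base is searched.

```python
from typing import Iterable, List


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Size of the intersection divided by size of the union of two tag sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def closest_case(symptoms: Iterable[str], knowledge_base: List[dict]) -> dict:
    """Return the stored case whose symptom tags best overlap the new incident."""
    symptoms = set(symptoms)
    return max(knowledge_base, key=lambda case: jaccard(symptoms, case["symptoms"]))


# Hypothetical knowledge base entries written after earlier incidents.
knowledge_base = [
    {
        "title": "Connection pool exhaustion",
        "symptoms": {"db_timeout", "high_latency", "pool_full"},
        "remediation": "Raise pool size and fix the leaking query path",
    },
    {
        "title": "Message queue congestion",
        "symptoms": {"mq_backlog", "consumer_lag"},
        "remediation": "Scale out consumers",
    },
]

match = closest_case({"high_latency", "pool_full"}, knowledge_base)
print(match["title"])
```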

Application Result

Average incident detection time reduced from hours to under 10 minutes
Alert accuracy improved to ≥95%, significantly reducing noise and operational interference
Incident resolution shortened from hours of manual investigation to minutes of intelligent fault localization
Successfully implemented RUM, APM, and core middleware monitoring capabilities, enabling full end-to-end observability from user side to service side
Built a unified monitoring and alerting system that makes system status visible, measurable, and traceable
Transitioned operations from a reactive complaint-driven model to proactive governance

Looking forward, Bonree will continue to collaborate with Wanglaoji, focusing on advancing AI capabilities, including:

Intelligent root cause analysis
AI-assisted diagnostics

The aim is to further improve operational efficiency and move the system from “observable and measurable” to “intelligent and autonomous.”

Why Bonree

Global Leader in Intelligent Observability

Bonree is an AI-driven global leader in intelligent observability.

Full-Stack End-to-End Observability Capabilities

Bonree ONE provides full-stack observability spanning user experience, application services, middleware, and databases, down to the underlying infrastructure.

No.1 in China's APMO Market Share

1000+ Top Customers' Choice

Observability Metrics 1000+
