Building an Operations and Monitoring System in the Wave of Digital Transformation: The Journey of China Datang Corporation Ltd.
Bonree Data supports China Datang Corporation Ltd. (hereinafter referred to as "Datang Group") in building a unified operations and monitoring platform that keeps IT operations effective and supports business continuity. The platform also helps reduce operational costs, meet the enterprise's digital asset management needs, and improve the visibility, controllability, and reliability of IT systems.
With the rapid development of Datang Group's business and the acceleration of its digital transformation, the scale and complexity of its information systems continue to increase, making system management and maintenance increasingly challenging. The configuration management database (CMDB) and the monitoring system are therefore becoming more important to operations. Overall, traditional IT operations and monitoring face the following issues:
1. Operations personnel are stuck in an inefficient, reactive "firefighting" mode, with frequent alerts and recurring chain-reaction failures.
2. Operations processes are disorganized, management efficiency is low, and there is no effective IT operations mechanism.
3. The complex web of IT components is difficult to monitor effectively.
4. Traditional tools are limited in scope and lack centralized control capabilities.
An integrated operations platform supports Datang Group's IT asset operations and management. It provides resource management, panoramic monitoring, operations automation, data analysis, and IT service management.
1. Integrated IT Resource Management
The integrated IT resource management system provides full-stack device management, with automatic discovery and topology visualization of IDC cabinets, servers, and other equipment. It includes a flexible relationship chain configuration tool for defining custom modules, configuration items (CIs), and architectural views, which enables hierarchical visualization of business systems. It also keeps data accurate through automatic calibration against custom data sources, and provides open APIs to support system integration.
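The relationship chain idea can be illustrated with a minimal sketch. It assumes a toy in-memory model in which CIs are linked by typed parent-to-child relations; the class name `Cmdb`, its methods, and the sample CI names are hypothetical and do not represent the platform's actual API or data model.

```python
from collections import defaultdict


class Cmdb:
    """Toy in-memory CMDB: configuration items plus typed parent -> child relations."""

    def __init__(self):
        self.items = {}                    # ci_id -> {"type": ..., "name": ...}
        self.children = defaultdict(list)  # ci_id -> [(relation, child_id), ...]

    def add_ci(self, ci_id, ci_type, name):
        self.items[ci_id] = {"type": ci_type, "name": name}

    def relate(self, parent_id, relation, child_id):
        # Reject links to unknown CIs so the relationship chain stays consistent.
        if parent_id not in self.items or child_id not in self.items:
            raise KeyError("both CIs must exist before they are related")
        self.children[parent_id].append((relation, child_id))

    def hierarchy(self, root_id, depth=0, seen=None):
        """Return indented lines for every CI reachable from root_id.

        Each CI is listed once, which also guards against cycles.
        """
        seen = set() if seen is None else seen
        if root_id in seen:
            return []
        seen.add(root_id)
        ci = self.items[root_id]
        lines = ["  " * depth + f"{ci['type']}: {ci['name']}"]
        for _relation, child_id in self.children[root_id]:
            lines.extend(self.hierarchy(child_id, depth + 1, seen))
        return lines


# Hypothetical sample data: a business system, its server, and the cabinet.
cmdb = Cmdb()
cmdb.add_ci("biz-1", "business_system", "billing")
cmdb.add_ci("srv-1", "server", "app-server-01")
cmdb.add_ci("cab-1", "cabinet", "idc-a-07")
cmdb.relate("biz-1", "runs_on", "srv-1")
cmdb.relate("srv-1", "housed_in", "cab-1")
print("\n".join(cmdb.hierarchy("biz-1")))
```

Walking the relations from a business system down to the physical cabinet is what makes a layered architectural view possible: the same graph can be rendered from any root CI.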
2. Full-Stack Monitoring System
The monitoring system covers hosts, networks, security, applications, and storage devices, as well as middleware, databases, and message queues. It is compatible with major vendors such as Huawei, F5, and Sangfor, and supports IPMI as well as Windows and Linux systems. The platform visualizes metrics such as CPU usage, memory, traffic, and packet loss, meeting monitoring needs across a range of scenarios.
3. Automated Operations Management
The platform supports visual drag-and-drop workflow orchestration with nested sub-processes, tailored for operations scenarios. It includes management features for tasks, scripts, software, and inspections, and supports grouping resources so that target objects for automated tasks can be categorized.
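Nested sub-processes can be pictured with a small sketch. It assumes a workflow is an ordered list of named steps, where a step is either a callable or another list of steps; `run_workflow` and the step names are hypothetical illustrations, not the platform's orchestration engine.

```python
def run_workflow(steps, log=None, depth=0):
    """Run steps in order; a step whose action is a list is a nested sub-process."""
    log = [] if log is None else log
    for name, action in steps:
        if isinstance(action, list):
            # Record the sub-process, then run its steps one level deeper.
            log.append("  " * depth + f"{name} (sub-process)")
            run_workflow(action, log, depth + 1)
        else:
            result = action()
            log.append("  " * depth + f"{name}: {result}")
    return log


# Hypothetical inspection sub-process reused inside a patching workflow.
inspection = [
    ("check_disk", lambda: "ok"),
    ("check_service", lambda: "ok"),
]
patch_flow = [
    ("pre_inspection", inspection),
    ("install_patch", lambda: "done"),
]

for line in run_workflow(patch_flow):
    print(line)
```

Defining the inspection once and embedding it in larger flows is the main benefit of nesting: the sub-process can be reused and changed in one place.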
4. Unified IT Service Management
The platform supports custom generation of service catalogs and service-oriented publishing of process models, with a unified workspace that brings the various types of ticket management together. Tickets are linked to the CMDB and cover the full lifecycle of incident, problem, change, and release management. The system also supports ticket progress monitoring, status operations, and process time alerts, which improves operational efficiency and supports decision-making.
1. The intelligent observability platform provides full-stack, end-to-end, all-scenario observability to improve operational efficiency.
2. Adhering to its "Customer First" philosophy, Bonree Data offers 24/7 professional services.
1. AI-Driven Anomaly Detection and Proactive Incident Management
The unified intelligent observability platform applies machine learning and artificial intelligence to analyze and mine massive volumes of operations data. It automatically detects anomalies, pinpoints the root causes of failures, and predicts potential risks. This lets operations teams shift from reactive to proactive management, reducing incident resolution time and improving overall system stability.
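The source does not describe which models the platform uses. As a deliberately simple stand-in for the general idea of automatic anomaly detection, the sketch below flags metric samples that deviate sharply from a trailing window using a z-score; the function name, window size, threshold, and sample values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value is more than `threshold` standard
    deviations away from the mean of the preceding `window` samples."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU usage samples (percent) with one obvious spike.
cpu_usage = [41, 43, 42, 40, 44, 42, 95, 43, 41]
print(zscore_anomalies(cpu_usage))
```

A baseline like this reacts only after a deviation occurs; the predictive and root-cause capabilities described above would require richer models and correlated data from several sources.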
2. Full-Stack, End-to-End, All-Scenario Intelligent Observability
The platform enables end-to-end monitoring from the user side to the server side. The Bonree Agent intelligent probe automatically collects data for over 500 technical frameworks, and the SuperTrace technology provides end-to-end tracing to identify performance bottlenecks precisely.
3. Unified Intelligent Observability Platform with Strong Scalability and Flexibility
The platform supports multiple deployment methods to accommodate IT environments of varying scale and complexity. It also offers a rich set of plugins and APIs for integration with other systems, enabling functional expansion and centralized management.