2022 Report on Big Data Analysis Platforms in China

Source: iResearch, January 05, 2023, 6:06 PM

Overview

Market situation: The industry boundary is increasingly blurred, and the market has many players. By deployment model, architecture type and the capabilities they supplement, they fall into five types: 1) public cloud vendors focusing on cloud data lake solutions; 2) traditional software service providers focusing on locally deployed big data analysis platforms; 3) database/data warehouse vendors that provide lightweight data warehouse architectures; 4) software vendors that provide service capabilities for the data application layer; 5) AI vendors that improve data application capabilities. Vendors in the market both compete and cooperate.

Architecture selection: Before building a platform, users need a clear understanding of their data volume and business scenario demands. Once the required basic functions are identified, they can determine the big data processing framework and tools to use in building the platform. Component selection and the overall construction of the data analysis layer are crucial to the architecture. In particular, the choice of storage engine directly determines how well offline, online and real-time scenarios are supported, as well as computing efficiency.

Trends: In the traditional architecture, separate data lakes and data warehouses lead to data silos, which cause problems in implementation, operation and maintenance, and cost. Data lake and data warehouse integration forms a unified architecture at the data and query levels. It makes breakthroughs in real-time performance and concurrency, cluster scale, and unstructured data integration, and addresses problems such as long modeling paths and weak data consistency. Meanwhile, platforms can integrate AI self-learning and adaptive capabilities to enhance the analysis and decision-making capabilities of data users.

Industry Definition

Big data analysis platforms are used by enterprises to analyze data and make decisions in a big data environment. From the perspective of technical architecture, a big data analysis platform mainly consists of three levels: data acquisition and storage, computation, and analysis and decision-making. From the perspective of service boundary, the concept of a big data analysis platform is narrower than that of a data center. It emphasizes the platform's data analysis and decision-making capabilities and attaches less importance to data planning, governance and services. Building on OLAP, it integrates technologies such as deep learning. While increasing the depth and breadth of data analysis, it also makes data services considerably more user-friendly and lowers the threshold for business users. It can meet enterprises' demand for real-time enterprise-level wide table analysis, real-time BI report analysis, user behavior analysis, self-service analysis, AI analysis, and so on.

Industry Map

Products of upstream and downstream vendors in the industry chain overlap with those of midstream big data analysis vendors.

Types of Players

Big data analysis platforms are gradually shifting from a product form to an integration form. The market has many players and many service types, and the boundary of the industry is increasingly blurred. The players can be divided into the following types. 1) Public cloud vendors use cloud-native capabilities, which evolve naturally toward an architecture that separates storage and compute, and provide data lake solutions that facilitate access to various types of data and reduce storage, operation and maintenance costs. 2) Unlike cloud vendors that provide services in the form of PaaS, traditional software vendors provide integrated big data analysis platform solutions based on local deployment. 3) Chinese domestic database and data warehouse vendors integrate innovative technologies to independently research and develop products and architectures with excellent storage and analysis performance. 4) Software vendors, which provide the application layer of data analysis platforms with capabilities such as BI analysis, user profiling, intelligent operation and visual publishing, work with the market participants mentioned above to establish a cooperative ecosystem. In addition, AI vendors provide AI capabilities that extend the application of data, making the process of data access, cleaning, storage, analysis, training, and visual output more automated and enhancing the adaptability of data analysis in different scenarios.

Trend 1: Architecture Evolution

With an open data architecture and management model, data lake and data warehouse integration builds data warehouses on top of data lakes, combining the advantages of both to improve enterprises’ basic technology stack. The integration connects underlying heterogeneous data sources and platforms, supports the coexistence of different data types, and enables data sharing. Data in the lake can be processed directly, which reduces the computing, network and other costs caused by data redundancy and data movement. Compared with traditional data warehouse and data lake solutions, the integrated lake and warehouse architecture can enhance real-time business processing and unstructured data governance. The advantages mainly include: 1) more complete data management capability; 2) strong computing engine support; 3) more real-time data; 4) increased openness. In addition, functions necessary for enterprise-level systems, such as data security, access control, and data exploration, are all deployed, tested and managed within the integrated data warehouse and lake architecture.
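The idea of processing data in the lake directly, across heterogeneous file types, can be illustrated with a minimal Python sketch. It is not taken from the report: the file names, the `region`/`amount` fields and the helper functions `read_lake` and `total_by_region` are all hypothetical, and a temporary directory stands in for the lake. Real integrated platforms rely on table formats and distributed query engines, not on a loop like this.

```python
import csv
import json
import tempfile
from pathlib import Path


def read_lake(lake_dir: Path):
    """Yield uniform rows from heterogeneous files, reading them in place.

    Hypothetical schema: every record has a 'region' and an 'amount'.
    """
    for path in sorted(lake_dir.iterdir()):
        if path.suffix == ".csv":
            with path.open(newline="") as f:
                for row in csv.DictReader(f):
                    yield {"region": row["region"], "amount": float(row["amount"])}
        elif path.suffix == ".jsonl":
            with path.open() as f:
                for line in f:
                    rec = json.loads(line)
                    yield {"region": rec["region"], "amount": float(rec["amount"])}


def total_by_region(lake_dir: Path) -> dict:
    """Warehouse-style aggregate computed directly over the lake files."""
    totals = {}
    for row in read_lake(lake_dir):
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


# A temporary directory stands in for the data lake (assumption).
with tempfile.TemporaryDirectory() as tmp:
    lake = Path(tmp)
    (lake / "store.csv").write_text("region,amount\nnorth,10\nsouth,5\n")
    (lake / "web.jsonl").write_text('{"region": "north", "amount": 2.5}\n')
    totals = total_by_region(lake)
```

The aggregate is computed without first copying the files into a separate warehouse store, which is the redundancy the integrated architecture aims to avoid.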

Trend 2: AI Integration

Big data analysis has been evolving with the development of artificial intelligence, which improves data users’ analysis and decision-making abilities at multiple levels and in multiple dimensions. Although the business environment for enterprises has changed dramatically since the outbreak of Covid-19, AI and machine learning have remained important throughout. As business decision-making becomes more connected, more continuous and more scenario-oriented, enterprises adapt to, resist or absorb various factors through AI engineering orchestration and system optimization, improving their adaptive AI capabilities. In this way, they can quickly adapt to scenario changes and make faster and more flexible decisions. NLP enhances the accurate recognition, analysis and processing of natural language by computer systems, turning search-based analysis into a new visual interaction method. The system converts questions posed in natural language into SQL statements for querying, which improves ease of use and the level of self-service and makes the platform easier for business personnel to use.
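The conversion of a natural language question into SQL can be shown with a deliberately simple Python sketch. It uses keyword rules as a stand-in for the NLP models that real platforms would use, and it is not the method described in the report. The `orders` table, its columns, the `METRICS` and `DIMENSIONS` mappings and the function `question_to_sql` are all hypothetical.

```python
import re
import sqlite3

# Hypothetical mapping from business vocabulary to SQL fragments.
METRICS = {"sales": "SUM(amount)", "orders": "COUNT(*)"}
DIMENSIONS = {"region": "region", "product": "product"}


def question_to_sql(question: str) -> str:
    """Translate a simple question into SQL using keyword rules only."""
    q = question.lower()
    metric = next((sql for word, sql in METRICS.items() if word in q), None)
    if metric is None:
        raise ValueError("no known metric in question")
    match = re.search(r"\bby (\w+)", q)
    dim = DIMENSIONS.get(match.group(1)) if match else None
    # Only whitelisted fragments are interpolated, never raw user text.
    if dim:
        return f"SELECT {dim}, {metric} FROM orders GROUP BY {dim} ORDER BY {dim}"
    return f"SELECT {metric} FROM orders"


# In-memory sample data (assumption) to run the generated query against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("north", "a", 10.0), ("north", "b", 5.0), ("south", "a", 7.5)],
)

sql = question_to_sql("What are total sales by region?")
rows = conn.execute(sql).fetchall()
```

A business user asks the question in plain language and never sees the generated `GROUP BY` query, which is the self-service benefit described above.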

Architecture Selection

In a broad sense, big data analysis platforms are no longer limited to a product form. They increasingly resemble an integration of the data application layer, storage layer, scheduling layer, computing layer, interactive analysis layer, data service layer and so on. From the perspective of technical architecture, big data analysis platforms follow either the Lambda or the Kappa architecture. From the perspective of scenario, architectures can be divided into offline, online and real-time analysis architectures. With the layers integrated from the bottom up, the difference between the three analysis architectures comes mainly from the choice of storage and computing engines in the data analysis layer. From the perspective of technology, deploying the data analysis layer is the most complex and innovative part. It combines the separation of storage and computing and the elastic scaling of cloud-native data lakes with platform decoupling based on Docker technology under local deployment, which solves the problem of insufficiently elastic physical server resources and supports the horizontal expansion of storage and computing capabilities. In terms of implementation, user analysis scenarios are converging. There are both HTAP data warehouse solutions built on a fused framework and big data analysis platforms that integrate AP and TP scenarios. Users can choose based on their needs.
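The difference between the Lambda and Kappa architectures can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not a description of any vendor's platform: the page-view events and all function names are hypothetical, and in-memory lists stand in for the master dataset and the event log.

```python
from collections import Counter


def batch_view(master_events):
    """Lambda batch layer: recompute counts from the full master dataset."""
    return Counter(e["page"] for e in master_events)


def speed_view(recent_events):
    """Lambda speed layer: counts for events the batch layer has not yet absorbed."""
    return Counter(e["page"] for e in recent_events)


def lambda_query(page, batch, speed):
    """Lambda serving layer: merge the batch and real-time views at query time."""
    return batch[page] + speed[page]


def kappa_query(page, event_log):
    """Kappa: one streaming path; the view is rebuilt by replaying the whole log."""
    view = Counter()
    for e in event_log:
        view[e["page"]] += 1
    return view[page]


# Hypothetical page-view events.
master = [{"page": "home"}, {"page": "home"}, {"page": "cart"}]
recent = [{"page": "home"}]

lambda_result = lambda_query("home", batch_view(master), speed_view(recent))
kappa_result = kappa_query("home", master + recent)
```

Both paths return the same answer here. The trade-off is that Lambda maintains two code paths (batch and speed), while Kappa keeps a single path and depends on being able to replay the log.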

Table of Contents of the Full Report

Abstract


1 Overview of the Big Data Analysis Platform Industry
1.1 Industry Definition
1.2 Technology Evolution (1)
1.3 Technology Evolution (2)
1.4 Core Application
1.5 Core Product (1)
1.6 Core Product (2)
1.7 Core Value
1.8 Evaluation System


2 Analysis of the Big Data Analysis Platform Market
2.1 Development History
2.2 Driving Factors
2.2.1 Policy Factors
2.2.2 Macro Factors
2.2.3 Micro Factors
2.3 Industry Map
2.4 Business Models
2.5 Market Structure
2.6 Comparison of Overseas and Domestic Markets
2.7 Pain Points of Application
2.8 Trend
2.8.1 AI Integration
2.8.2 Architecture Evolution
2.8.3 Diversified Scenarios

3 Suggestions on Building Big Data Analysis Platforms
3.1 General Idea
3.2 Capability Building
3.3 Deployment Method
3.4 Architecture Selection
3.5 Component Selection
3.6 Technology Trends


4 Industry Application and Typical Cases  
4.1 Industry-Overview
4.2 Industry-Government Affairs
4.3 Industry-Finance
4.4 Industry-Retail
4.5 Industry-Healthcare
4.6 Industry-Transportation
4.7 Industry-Education
4.8 Case Study-Arctic Data


5 Analysis of Investment in the Big Data Analysis Industry
5.1 Analysis of the Overall Market
5.2 Analysis of Investment Rounds
5.3 Analysis of Investment Cycle
5.4 Analysis of Investment Risks

Contents

Industry definition: Big data analysis platforms are gradually changing from a product form to an integration form, and the industry's boundaries are blurred. From the perspective of technical architecture, a big data analysis platform mainly consists of three levels: data acquisition and storage, computation, and analysis and decision-making. Building on OLAP, it integrates technologies such as deep learning. While increasing the depth and breadth of data analysis, it makes data services considerably more user-friendly and lowers the threshold for business users, enabling them to drive business development through data analysis.
