GB/T 38673-2020 English PDF (GBT38673-2020)

GB/T 38673-2020: Information technology -- Big data -- Basic requirements for big data systems
GB/T 38673-2020
GB
NATIONAL STANDARD OF THE
PEOPLE’S REPUBLIC OF CHINA
ICS 35.240
L 67
Information technology - Big data - Basic
requirements for big data systems
ISSUED ON: APRIL 28, 2020
IMPLEMENTED ON: NOVEMBER 01, 2020
Issued by: State Administration for Market Regulation;
Standardization Administration of the People’s Republic of
China.
Table of Contents
Foreword
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviations
5 Big data system framework
6 Functional requirements
7 Non-functional requirements
1 Scope
This Standard specifies the functional requirements and non-functional
requirements of big data systems.
This Standard applies to the design, product selection, acceptance and
testing of big data systems of all kinds.
2 Normative references
The following documents are indispensable for the application of this document.
For dated references, only the dated version applies to this document. For
undated references, the latest edition (including all amendments) applies to this
document.
GB/T 35295-2017, Information technology - Big data - Terminology
GB/T 35589-2017, Information technology - Big data - Technical reference
model
3 Terms and definitions
Terms and definitions determined by GB/T 35295-2017 and the following ones
are applicable to this document. For ease of use, some of the terms and
definitions in GB/T 35295-2017 are repeated below.
3.1 Big data system
The system that implements all or part of the big data reference architecture.
[GB/T 35295-2017, Definition 2.1.14]
3.2 Distributed computing
A computing mode that covers the storage layer and the processing layer and
is used to implement multi-type programming algorithm models.
c) It shall provide column conversion, row conversion and table conversion
functions of structured data;
d) It shall provide data loading function, to support the loading of cleaned and
converted data to the data analysis module;
e) It should provide data comparison function before and after cleaning;
f) It should support data conversion function of unstructured data.
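The comparison of data before and after cleaning described in item e) can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout, the cleaning rule (strip whitespace, drop records with missing fields) and the function names `clean` and `compare` are hypothetical and are not defined by this Standard.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def clean(records: List[Record]) -> List[Record]:
    """Hypothetical cleaning rule: strip whitespace and drop records
    that contain a missing or empty field."""
    cleaned = []
    for rec in records:
        stripped = {k: (v.strip() if v is not None else None) for k, v in rec.items()}
        if all(stripped.values()):
            cleaned.append(stripped)
    return cleaned


def compare(before: List[Record], after: List[Record]) -> Dict[str, int]:
    """Summarise the effect of cleaning as simple counts."""
    # A surviving record counts as unchanged if it appears verbatim in the input.
    unchanged = sum(1 for rec in after if rec in before)
    return {
        "rows_before": len(before),
        "rows_after": len(after),
        "rows_dropped": len(before) - len(after),
        "rows_modified": len(after) - unchanged,
    }


raw = [
    {"id": "1", "city": " Beijing "},
    {"id": "2", "city": None},
    {"id": "3", "city": "Shanghai"},
]
report = compare(raw, clean(raw))
```

A real system would likely report per-column statistics as well; the counts here only show the shape of such a comparison.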
6.3 Data storage module
The data storage module requirements are as follows:
a) It shall provide data storage function, to support the storage of structured
data, unstructured data and semi-structured data.
b) It shall provide the function of exchanging data or files with relational
databases and other file systems.
c) Support distributed file storage, to realize the following functions:
1) It shall support basic operations of the file system, including upload,
download, read and write, copy, move, delete, rename, permission
modification, etc.;
2) It shall support multi-copy storage and recovery functions of data blocks;
3) It should support the function of fast retrieval of files, and support the
unified retrieval, cataloging, adding and deleting operations of data
resources;
4) It should support data compression storage function.
d) Support distributed column data storage, to achieve the following functions:
1) It shall support the function of storing data in the form of key-value;
2) It should support user authority management functions that are based
on tables, column families, and columns. Authority management
operations include read, write, and create.
e) Support distributed structured data storage, to achieve the following
functions:
1) It should support distributed storage of structured data, to ensure the
scalability and consistency of data storage;
1) Built-in graph data query API, support synchronous or asynchronous
computing model to write iterative algorithms;
2) Online graph analysis and query function;
3) Graph data expression that is based on the attribute graph model,
including the label and attribute type definition on the node/edge;
4) Built-in common graph index calculation function, to describe the
topological structure characteristics of graphs.
d) It should support memory computing, to realize the following functions:
1) Provide data processing capabilities through distributed memory
computing and DAG execution engine;
2) Support multiple data types, including data processing of structured data,
unstructured data, and semi-structured data.
e) It should support an integrated batch and stream computing framework, to
achieve the following functions:
1) A unified SQL query language for integrated batch and stream processing;
2) Streaming SQL in multiple scenarios, such as location information
analysis;
3) Common time windows, including jumping windows, sliding windows, etc.
f) It should support automatic scheduling of tasks according to the
dependencies between tasks.
g) It should support the description of multi-task dependencies within the job
in the form of a directed acyclic graph.
h) It should provide the ability to dispatch complex tasks.
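Items f) and g) above, scheduling tasks automatically from dependencies expressed as a directed acyclic graph, can be sketched with the Python standard library's `graphlib` module (Python 3.9 or later). The task names and the `schedule` helper are hypothetical; the Standard does not prescribe any particular scheduler.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def schedule(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves. Every task in a wave has all of its
    predecessors finished, so the tasks of one wave could run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # mark the wave as finished to unlock successors
    return waves


# Hypothetical job: each task maps to the set of tasks it depends on.
job = {
    "extract": set(),
    "clean": {"extract"},
    "convert": {"extract"},
    "load": {"clean", "convert"},
}
waves = schedule(job)
```

Here `clean` and `convert` fall into the same wave because both depend only on `extract`, while `load` waits for both of them.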
6.5 Data analysis module
The data analysis module requirements are as follows:
a) Support data query, to realize the following functions:
1) It shall provide the function of querying through a standard database
connection interface;
2) It shall provide the function of querying through the REST API query
interface;
3) It should support data statistics on real-time streams;
4) It should support the sorting of streaming data;
5) It should support the association with static tables;
6) It should support the associated processing of multiple data streams.
f) It should support interactive on-line analysis, to achieve the following
functions:
1) Perform distributed on-line analysis of data through structured query
language, such as OLAP;
2) Perform ad hoc query of data through structured query language;
3) Use visualization middleware to display data analysis results;
4) Define the calculation formula and parameter configuration during the
interactive analysis process;
5) Automatically save and roll back during interactive analysis;
6) Save and publish analysis results during interactive analysis;
7) Perform interactive data analysis based on on-line analysis.
g) It should support visual process editing operations, to achieve the
following functions:
1) Perform process editing and revision by drag and drop;
2) Support workflow dispatch trigger mechanism, configurable trigger time
or trigger event;
3) Support the persistent storage of process editing results.
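Two of the data query items listed in this section, statistics on real-time streams and association with static tables, can be illustrated together. The sketch below is an assumption-laden example: the event fields (`device_id`), the lookup table and the `associate` function are hypothetical, and a production system would use a stream processing engine rather than a plain Python generator.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def associate(
    stream: Iterable[Dict[str, str]],
    static_table: Dict[str, str],
) -> Iterator[Tuple[Dict[str, str], Dict[str, int]]]:
    """Enrich each streaming event with a value from a static table and
    keep a running count of events per looked-up value."""
    counts: Counter = Counter()
    for event in stream:
        # Association with the static table; unmatched keys map to "unknown".
        region = static_table.get(event["device_id"], "unknown")
        enriched = {**event, "region": region}
        counts[region] += 1
        # Yield a snapshot of the statistics so far.
        yield enriched, dict(counts)


# Hypothetical static table and event stream.
device_regions = {"d1": "north", "d2": "north"}
events = [{"device_id": "d1"}, {"device_id": "d2"}, {"device_id": "d9"}]
results = list(associate(events, device_regions))
```

Because the function is a generator, each event is processed as it arrives and the statistics are updated incrementally rather than recomputed over the whole stream.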
6.6 Data visualization module
The requirements of the visualization module are as follows:
a) It should support the use of conventional charts to display data, such as
tables, bar charts, pie charts, line charts, heat maps;
b) It should support the API of third-party data visualization tools.
6.7 Data access module
d) It shall provide service management functions, including the management
of big data system component services;
e) It should provide the health check management function, to support the
realization of cluster health check through a graphical interface.
7 Non-functional requirements
7.1 Reliability requirements
7.1.1 High availability
High availability requirements are as follows:
a) It shall provide the system automatic fault detection and management
functions;
b) It shall ensure that there is no single point failure risk for system
components;
c) When any node of the cluste...