Exclusive: Big data strategy

Big data technologies in open source software: A survey

  • JIANG Tian,
  • QIAO Jialin,
  • HUANG Xiangdong,
  • WANG Jianmin
  • School of Software, Research Center for Big Data, Tsinghua University; National Engineering Laboratory for Big Data Software, Beijing 100084, China

Received date: 2019-11-08

Revised date: 2020-01-31

Online published: 2020-04-01


In the past decade, Google's GFS and Bigtable broke the assumption that big data must be managed with relational databases. A number of open source big data management systems, such as Apache Hadoop, have carried this work further by developing more mature technologies and applications. This paper reviews big data management systems in the Apache ecosystem that address OLTP and OLAP usage scenarios, and surveys the state of the art in data storage engines, data partitioning, data replication, and distributed system protocols. It also compares the pros and cons of current distributed file systems, key-value stores, and time series databases.

Cite this article

JIANG Tian, QIAO Jialin, HUANG Xiangdong, WANG Jianmin. Big data technologies in open source software: A survey[J]. Science & Technology Review, 2020, 38(3): 103-114. DOI: 10.3981/j.issn.1000-7857.2020.03.007


