Special Topic Paper

Analysis of scientific literature database from a perspective of complex network

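  • ZHOU Jianlin,
  • NIU Qikai,
  • ZENG An,
  • FAN Ying,
  • DI Zengru
  • School of Systems Science, Beijing Normal University, Beijing 100875, China
ZHOU Jianlin, Ph.D.; research interest: complex networks; e-mail: jianlinzhou@mail.bnu.edu.cn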

Received: 2017-05-11

  Revised: 2017-12-05

  Published online: 2018-04-27

Funding

National Natural Science Foundation of China (61374175, 61573065); Beijing Natural Science Foundation (L160008)

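Summary

This paper reviews the complex-network representations of scientific literature data, introduces the topological properties, evolution patterns, and evolution mechanisms of scientific collaboration networks and scientific citation networks, and outlines the methods used to evaluate papers and scientists. The analysis shows that examining scientific literature data from a complex-network perspective can explain many meaningful research questions and interesting phenomena.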

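Cite this article

ZHOU Jianlin, NIU Qikai, ZENG An, FAN Ying, DI Zengru. Analysis of scientific literature database from a perspective of complex network[J]. Science & Technology Review (科技导报), 2018, 36(8): 55-64. DOI: 10.3981/j.issn.1000-7857.2018.08.006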

Abstract

Scientific literature data record comprehensive information about papers and their authors. Given the sheer volume of such data, traditional statistical analysis alone cannot fully uncover the information hidden in them. The interactions recorded in scientific literature data, such as citations between papers and co-authorship between scientists, make it possible to construct different kinds of complex networks (citation networks, collaboration networks, etc.), and network analysis can then be used to extract useful information from the data. This paper summarizes the complex-network representations of scientific literature data and highlights the topological properties, evolution patterns, and evolution mechanisms of scientific collaboration networks and scientific citation networks. Because evaluating the impact of papers and scientists has long attracted researchers' attention, we also briefly summarize the related evaluation methods. The complex-network perspective can also help explain many meaningful questions and interesting phenomena in scientific literature data, such as shifts in scientists' research interests and "sleeping beauties". Network analysis is expected to yield richer results in mining scientific literature data in the future.
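As a minimal illustration of how citation and co-authorship records can be turned into networks, the sketch below builds a directed citation network and a weighted collaboration network from a few toy records. The records, the function names, and the choice of weight (number of co-authored papers) are assumptions made for this example; they are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records: (paper id, authors, ids of the papers it cites).
papers = [
    ("p1", ["A", "B"], []),
    ("p2", ["B", "C"], ["p1"]),
    ("p3", ["A", "B", "C"], ["p1", "p2"]),
]

def build_citation_network(records):
    """Directed network: each citing paper maps to the set of papers it cites."""
    return {pid: set(cited) for pid, _, cited in records}

def build_collaboration_network(records):
    """Undirected weighted network: author pair -> number of co-authored papers."""
    weights = defaultdict(int)
    for _, authors, _ in records:
        # Sort so each pair has one canonical key, e.g. ("A", "B").
        for a, b in combinations(sorted(set(authors)), 2):
            weights[(a, b)] += 1
    return dict(weights)

def citation_counts(citations):
    """In-degree of every paper, i.e. how many times it is cited."""
    counts = {pid: 0 for pid in citations}
    for cited in citations.values():
        for pid in cited:
            counts[pid] = counts.get(pid, 0) + 1
    return counts

citations = build_citation_network(papers)
collaborations = build_collaboration_network(papers)
print(citation_counts(citations))  # {'p1': 2, 'p2': 1, 'p3': 0}
print(collaborations)  # {('A', 'B'): 2, ('B', 'C'): 2, ('A', 'C'): 1}
```

In this toy data set, p1 is cited twice, and authors A and B have written two papers together, so the edge between them carries weight 2.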
