Materials data play an increasingly vital role in national security, the in-service safety of engineering structures, scientific and technological innovation, and smart manufacturing in the data era. In the transformative Materials Genome Initiative (MGI), launched in the US in 2011, materials data, together with materials computation and simulation and materials experimentation and characterization, constitute the three basic tools for research across the whole materials development continuum, with the aim of accelerating development and reducing its cost. As a result, both researchers and production managers have come to recognize more fully the role that materials data play in accelerating materials research and development. Materials data are highly varied, are acquired through complex processes, are linked by complex interrelationships, and carry strong intellectual property constraints, all of which make their collection, storage, sharing and application more complicated. This paper systematically analyzes the characteristics and classification of materials data in the big data era, compares the status of materials databases in China and abroad, and discusses the significance for China of developing materials databases and materials data science, the main future directions, and the remaining obstacles. A national public platform for materials data research and service, serving as a materials data hub in China, is essential and urgently required for MGI implementation. Four aspects are emphasized: materials data repositories, infrastructure and cloud services, data mining, and international collaboration. The paper proposes intensifying the collection and integration of materials data and constructing national materials databases for civil and for military use. Standards and specifications for materials data, materials databases and big data applications are crucial to the platform and need to be established first. Customized thematic databases and data push services will bring great benefit to materials database users. The paper also proposes research in materials informatics and materials dataology, with the goal of establishing materials data science as a new discipline within materials science.