Language interaction underpins human cognitive development and lifelong learning, and it opens the door to human wisdom. In the era of artificial intelligence, language interaction will also be an important tool through which humans and machines express ideas, exchange knowledge, and communicate with one another. This requires machines that not only understand human language in complex scenarios but also adapt to the far-field voice interaction habits that humans have formed over thousands of years of evolution, so that machines can truly come to understand the human world. This article aims to provide a reference for the development of machines with human-like intelligence.