Log mining, behavioral analysis and improvement of government website search system

  • YE Xiaorong,
  • SHAO Qing
  • 1. Institute of Scientific and Technical Information of China, Beijing 100038, China;
    2. KNET Co., Ltd., Beijing 100190, China

Received date: 2014-10-22

Revised date: 2015-03-30

Online published: 2015-06-11

Abstract

This paper describes the secondary development of the search system of an e-government website. A log mining module, a behavioral analysis module and a system improvement module were added to improve search quality and optimize website content. The improved search system mines and processes its query logs and analyzes user behavior. The log mining module records, filters and identifies query log entries. The behavioral analysis module examines the characteristics and patterns of user behavior from three aspects (the query process, clustering analysis and hotspot query words) and produces weights for webpages and hotspot query words. The system improvement module makes query results more precise, adds search hotspot and personalized webpage functions, improves the content of the e-government website, and exchanges data with a public opinion system. In this way, the search system and the e-government website can provide users with better service.
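The abstract does not give implementation details, so the following is only a minimal sketch of how the filtering, hotspot query and webpage weight steps might fit together. The record fields (`user_id`, `query`, `clicked_url`), the blank-query filtering rule, frequency-based hotspot ranking and click-share weighting are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryRecord:
    """One hypothetical query-log entry; the field names are assumptions."""
    user_id: str
    query: str
    clicked_url: str  # empty string when the user clicked no result


def filter_records(records):
    """Drop entries with blank queries and normalize the rest (assumed rule)."""
    cleaned = []
    for r in records:
        q = r.query.strip().lower()
        if q:
            cleaned.append(QueryRecord(r.user_id, q, r.clicked_url))
    return cleaned


def hotspot_queries(records, top_n=3):
    """Rank queries by frequency, a simple stand-in for hotspot detection."""
    counts = Counter(r.query for r in records)
    return counts.most_common(top_n)


def page_weights(records):
    """Weight each page by its share of all clicks (assumed weighting scheme)."""
    clicks = Counter(r.clicked_url for r in records if r.clicked_url)
    total = sum(clicks.values())
    return {url: n / total for url, n in clicks.items()} if total else {}


if __name__ == "__main__":
    # Made-up sample log, for illustration only.
    log = [
        QueryRecord("u1", " Passport ", "/services/passport"),
        QueryRecord("u2", "passport", "/services/passport"),
        QueryRecord("u3", "   ", ""),
        QueryRecord("u1", "tax", "/services/tax"),
    ]
    cleaned = filter_records(log)
    print(hotspot_queries(cleaned))
    print(page_weights(cleaned))
```

In a real deployment the weights would presumably feed back into result ranking, and the hotspot list into the search hotspot function the abstract mentions; how the paper does this is not stated here.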

Cite this article

YE Xiaorong, SHAO Qing. Log mining, behavioral analysis and improvement of government website search system[J]. Science & Technology Review, 2015, 33(11): 94-102. DOI: 10.3981/j.issn.1000-7857.2015.11.017
