An automatic text-mining-based approach for identifying and analyzing research trends in scientific fields

Article type: Research article

Authors

1 PhD student in Information Technology Engineering, Iranian Research Institute for Information Science and Technology (IranDoc), Tehran, Iran

2 Assistant Professor, Iranian Research Institute for Information Science and Technology (IranDoc), Tehran, Iran

3 Visiting Professor, Iranian Research Institute for Information Science and Technology (IranDoc)

4 Full Professor, Buali Sina University, Hamedan, Iran

Abstract

Examining the research trends of a scientific field over different time periods can give researchers and policy makers in that field a better understanding, so that they can plan appropriately for future research and for the allocation of research resources. One of the most important approaches to analyzing research trends in a field is to examine the scientific documents published in that field, using scientometric methods and by surveying the information and text of those documents. Trend analysis is therefore a suitable tool for researchers and policy makers in carrying out their activities, and choosing an appropriate method for analyzing and forecasting the current state of a field is important. Because the accuracy and comprehensiveness of trend analysis are of particular importance, this study presents an approach that uses text mining methods and the bibliographic information of articles published in a field to study the research trend in that field. Keywords extracted from the texts are clustered using a new method for computing co-occurrence. Other features of the proposed method are a new index for measuring the maturity and centrality of a scientific field in trend analysis, and the use of the impact and importance of keywords in identifying scientific fields. To examine and test the proposed method, a set of articles in mechanical engineering published from 2012 to 2016 was extracted from the Web of Science database. Applying the proposed indices shows that some fields have reached maturity and no longer show a growing trend. In contrast, clusters that are growing at a high rate and are still midway in terms of maturity represent evolving topics.

Keywords


Article title [English]

A new automatic approach for research trend analysis based on scientific text mining

Authors [English]

  • Ashkan Khatir 1
  • Azadeh Mohebi 2
  • Soheil Ganjefar 3 4
1 PhD Candidate in Information Technology Engineering, Iranian Research Institute for Information Science and Technology (IranDoc), Tehran, Iran
2 Assistant Professor, Iranian Research Institute for Information Science and Technology (IranDoc), Tehran, Iran
3 Visiting Professor, Iranian Research Institute for Information Science and Technology (IranDoc)
4 Professor, Buali Sina University, Hamedan, Iran
Abstract [English]

Analyzing research trends in a specific research area over different time frames can give researchers in that area a better understanding of it, and can help research policy makers allocate research funds and set policies. An important and practical approach to analyzing research trends is to study research data and publications using scientometrics and research document processing. In this research, we propose a text mining approach for analyzing research publications in a specific area in order to identify important research topics.
The proposed approach clusters keywords using a new co-occurrence matrix. A new metric is also adopted to identify the centrality and density (maturity) of a specific research area and to identify keywords and contributory topics. To this end, we propose a trend analysis method based on a co-word matrix; for deeper analysis, we use clustering and a strategic diagram with the proposed indices.
To test and evaluate the proposed method, we use a comparative evaluation method. For further analysis, we selected research publications in Mechanical Engineering from 2012 to 2016, extracted from the Web of Science (WoS) database. We applied the proposed metrics to evaluate research trends and identify contributory areas and topics in the selected documents. The comparative evaluation shows that the proposed trend analysis method yields an improvement.
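
For readers unfamiliar with co-word analysis, the following is a minimal Python sketch of the classic pipeline that the abstract builds on: counting keyword co-occurrences, normalizing them with the equivalence index, and scoring a keyword cluster by centrality and density as used in a strategic diagram. It is not the new co-occurrence measure or the new indices proposed in the paper, which the abstract does not specify. The function names, the toy keyword lists, and the simplified definitions (centrality as the sum of external link strengths, density as the mean internal link strength) are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs):
    """Count keyword occurrences and pairwise co-occurrences across documents."""
    occ = Counter()
    pair = Counter()
    for keywords in docs:
        unique = sorted(set(keywords))  # count each keyword once per document
        occ.update(unique)
        pair.update(combinations(unique, 2))  # pairs are stored in sorted order
    return occ, pair


def equivalence(a, b, occ, pair):
    """Equivalence index e_ab = c_ab^2 / (c_a * c_b), a value in [0, 1]."""
    key = (a, b) if a < b else (b, a)
    c_ab = pair[key]  # Counter returns 0 for pairs never seen together
    return c_ab * c_ab / (occ[a] * occ[b])


def centrality_density(cluster, all_keywords, occ, pair):
    """Simplified (assumed) centrality and density of one keyword cluster."""
    cluster = set(cluster)
    internal = [equivalence(a, b, occ, pair)
                for a, b in combinations(sorted(cluster), 2)]
    external = [equivalence(a, b, occ, pair)
                for a in cluster for b in all_keywords if b not in cluster]
    density = sum(internal) / len(internal) if internal else 0.0
    centrality = sum(external)
    return centrality, density


# Hypothetical toy corpus: one keyword list per publication.
docs = [
    ["fatigue", "crack", "steel"],
    ["fatigue", "crack"],
    ["turbulence", "cfd"],
    ["turbulence", "cfd", "steel"],
]
occ, pair = cooccurrence(docs)
centrality, density = centrality_density(["fatigue", "crack"], list(occ), occ, pair)
print(centrality, density)
```

In a strategic diagram, each cluster is plotted with centrality on one axis and density on the other; clusters that score high on both are usually read as mature, central themes, while the other quadrants hold peripheral, emerging, or declining themes.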

Keywords [English]

  • Trend Analysis
  • Density and maturity of research area
  • Co-occurrence Matrix
  • Centrality of a research area
