Using Machine Learning in Extracting the Scientific Similarity of Countries

Document Type: Original Article

Authors
1 Assistant Prof., Department of Computer Science, Faculty of Engineering and Technology, Payam Noor University, Tehran, Iran
2 MSc., Department of Computer Science, Faculty of Engineering and Technology, Payam Noor University, Tehran, Iran
DOI: 10.22034/aimj.2025.459108.1592
Abstract
Scientific production is now recognized as a priority in all countries, because scientific development underpins technological development, which in turn underpins economic growth and social welfare. Measuring the quantity and quality of a society's scientific output is therefore important. Scientometrics and bibliometrics provide the tools for measuring and evaluating that output, and such studies are widely used in education and research and in decision-making, policy-making and foresight in institutions and organizations. One useful source for this kind of research is the SCImago database, which reports the scientific performance of the world's countries across scientific fields. The purpose of this article is to measure the scientific similarity of countries and of scientific fields over a given period, based on two bibliometric indicators: the number of documents and the H-index. The resulting similarities are then clustered with the Louvain and Leiden community detection algorithms, and the clusters are analyzed. In this study, the Leiden algorithm did not improve the Silhouette value compared with Louvain, but the Modularity changed slightly, which follows from the nature of the algorithm, since it works on the basis of Modularity. The execution time of the Leiden algorithm was significantly better than that of the Louvain algorithm.
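As a rough illustration of the pipeline described in the abstract, the following Python sketch builds a weighted similarity graph from per-country indicator vectors and scores a candidate clustering by Modularity. It is a minimal sketch under stated assumptions, not the authors' implementation: the country labels, the indicator values, the choice of cosine similarity, and the function names `cosine`, `similarity_graph` and `modularity` are all hypothetical, and the partition is supplied by hand rather than produced by Louvain or Leiden.

```python
import math
from itertools import combinations

# Hypothetical profiles: (documents, H-index) for two made-up fields,
# concatenated into one vector per country. Values are illustrative only.
profiles = {
    "A": [1000.0, 50.0, 800.0, 40.0],
    "B": [950.0, 48.0, 820.0, 42.0],
    "C": [20.0, 90.0, 10.0, 85.0],
    "D": [25.0, 88.0, 12.0, 80.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_graph(profiles, threshold=0.0):
    """Return weighted undirected edges {(a, b): similarity} above threshold."""
    edges = {}
    for a, b in combinations(sorted(profiles), 2):
        w = cosine(profiles[a], profiles[b])
        if w > threshold:
            edges[(a, b)] = w
    return edges


def modularity(edges, communities):
    """Weighted modularity Q = sum_c [ w_in(c)/m - (deg(c) / (2m))^2 ]."""
    m = sum(edges.values())  # total edge weight
    label = {node: i for i, c in enumerate(communities) for node in c}
    inside = [0.0] * len(communities)  # weight of edges inside each community
    degree = [0.0] * len(communities)  # summed weighted degree per community
    for (a, b), w in edges.items():
        degree[label[a]] += w
        degree[label[b]] += w
        if label[a] == label[b]:
            inside[label[a]] += w
    return sum(i / m - (d / (2 * m)) ** 2 for i, d in zip(inside, degree))


edges = similarity_graph(profiles)
q_grouped = modularity(edges, [{"A", "B"}, {"C", "D"}])  # similar profiles together
q_mixed = modularity(edges, [{"A", "C"}, {"B", "D"}])    # dissimilar profiles together
print(q_grouped > q_mixed)
```

In a full analysis the partitions would come from a community detection routine such as Louvain or Leiden rather than being written by hand; the Modularity score here only shows how two candidate clusterings of the same similarity graph can be compared.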


  • Received: 23 May 2024
  • Revised: 12 January 2025
  • Accepted: 28 January 2025