Semantic Search for Scientific Articles by Language Using Cosine Similarity Algorithm and Weighted Tree Similarity

  • Muhamad Aldi Rifai Universitas Muhammadiyah Gresik
  • Indra Gita Anugrah Universitas Muhammadiyah Gresik
Keywords: Semantic Search, Scientific Articles, Cosine Similarity, Weighted Tree Similarity


Academics at universities frequently write scientific articles, but in doing so they often struggle to find ideas, literature studies, and reference sources to cite. Searching with a search engine does not always surface the right document, because the keywords being searched for often appear not in the title but in another part of the article's structure. Since most search engines match only titles, the other parts are usually excluded from matching, so the search results sometimes do not match what the user wants. In addition, a scientific article often contains more than one language within its structure, for example in the abstract section. To detect similarity across the structure of scientific articles, the weighted tree similarity algorithm is used; to detect language, the N-gram algorithm is used; and the cosine similarity algorithm is used to measure the level of similarity between the keyword text and the text in scientific articles.
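The idea of scoring a keyword query against every part of an article, rather than the title alone, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the bag-of-words tokenization, the section names, and the weights are all assumptions made for illustration, and `weighted_section_score` is a flattened, one-level simplification of weighted tree similarity (a weighted average over sections), not the full recursive algorithm. N-gram language detection is left out.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercase the text and count word occurrences (bag-of-words vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-count vectors of two texts."""
    vec_a, vec_b = term_vector(text_a), term_vector(text_b)
    # Counter returns 0 for missing terms, so unshared terms add nothing.
    dot = sum(count * vec_b[term] for term, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def weighted_section_score(query, sections, weights):
    """Weighted average of per-section cosine similarities.

    Assumption: the article is a one-level tree whose branches are its
    sections, and each branch carries a hypothetical importance weight.
    """
    total = sum(weights[name] for name in sections)
    if total == 0:
        return 0.0
    weighted = sum(
        weights[name] * cosine_similarity(query, text)
        for name, text in sections.items()
    )
    return weighted / total


# Hypothetical article whose title does not contain the query keywords.
article = {
    "title": "Image classification with neural networks",
    "abstract": "We apply semantic search to scientific articles.",
}
weights = {"title": 0.4, "abstract": 0.6}  # illustrative weights only
score = weighted_section_score("semantic search", article, weights)
```

Here the title contributes nothing to `score`, yet the article still receives a non-zero score because the abstract matches the query, which is the behavior a title-only search would miss.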

Engineering and Technology