MathIRs: Retrieval System for Scientific Documents
DOI:
https://doi.org/10.13053/cys-21-2-2743Keywords:
Natural language processing, information retrieval, MathIRs, indexingAbstract
Effective retrieval of mathematical contents from vast corpus of scientific documents demands enhancement in the conventional indexing and searching mechanisms. Indexing mechanism and the choice of semantic similarity measures guide the results of Math Information Retrieval system (MathIRs) to perfection. Tokenization and formula unification are among the distinguishing ¡ features of indexing mechanism, used in MathIRs, which facilitate sub-formula and similarity search. Besides, the scientific documents and the user queries in MathIRs will contain math as well as text contents and to match these contents we require three important modules: Text-Text Similarity (TS), Math-Math Similarity (MS) and Text-Math Similarity (TMS). In this paper we have proposed MathIRs comprising these important modules and a substitution tree based mechanism for indexing mathematical expressions. We have also presented experimental results for similarity search and argued that proposal of MathIRs will ease the task of scientific document retrieval.Downloads
Published
2017-06-30
Issue
Section
Articles
License
Hereby I transfer exclusively to the Journal "Computación y Sistemas", published by the Computing Research Center (CIC-IPN),the Copyright of the aforementioned paper. I also accept that these
rights will not be transferred to any other publication, in any other format, language or other existing means of developing.I certify that the paper has not been previously disclosed or simultaneously submitted to any other publication, and that it does not contain material whose publication would violate the Copyright or other proprietary rights of any person, company or institution. I certify that I have the permission from the institution or company where I work or study to publish this work.The representative author accepts the responsibility for the publicationof this paper on behalf of each and every one of the authors.
This transfer is subject to the following conditions:- The authors retain all ownership rights (such as patent rights) of this work, except for the publishing rights transferred to the CIC, through this document.
- Authors retain the right to publish the work in whole or in part in any book they are the authors or publishers. They can also make use of this work in conferences, courses, personal web pages, and so on.
- Authors may include working as part of his thesis, for non-profit distribution only.