Permutation Based Algorithm Improved by Classes for Similarity Searching

Karina Figueroa; Antonio Camarena-Ibarrola; Luis Valero

doi:10.13053/cys-26-1-4153

Authors

Karina Figueroa, Universidad Michoacana de San Nicolas de Hidalgo
Antonio Camarena-Ibarrola, Universidad Michoacana de San Nicolas de Hidalgo
Luis Valero, Universidad Michoacana de San Nicolas de Hidalgo

DOI:

https://doi.org/10.13053/cys-26-1-4153

Keywords:

Similarity searching, metric spaces, pattern recognition, nearest neighbor

Abstract

Similarity searching is the most important task in multimedia databases. It consists of retrieving the elements most similar to a given query from a database, knowing that an element identical to the query would not be found. Dissimilarity between objects is measured with a distance function (usually expensive to compute), which allows the problem to be modeled as a metric space. Many algorithms have been designed to address this problem; in particular, the permutation-based index has shown unbeatable performance. This technique uses a set of reference objects to assign a string to each element in the database, and every such string is a permutation of the same set of symbols. However, huge databases and the memory required for these indexes make this problem a real challenge. In this paper, we present an improvement to the first approach, in which classes of reference objects were used instead of single references. We propose a new way to choose these classes and a new way to evaluate similarity between permutations. Our experiments show that we can avoid up to 90% of the distance evaluations with respect to the original technique, and up to 80% with respect to the first approach.
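
To make the idea of a permutation-based index concrete, the following is a minimal sketch of the baseline technique only, not of the class-based improvement or the new similarity measure proposed in the paper. It assumes 2D points under Euclidean distance, and it compares permutations with the Spearman footrule, a measure commonly used for this purpose; the function names `permutation`, `spearman_footrule`, and `candidate_order` and the sample data are hypothetical and chosen for illustration.

```python
import math


def permutation(obj, refs, dist):
    """Indices of the reference objects, sorted from nearest to farthest from obj."""
    return sorted(range(len(refs)), key=lambda i: (dist(obj, refs[i]), i))


def spearman_footrule(p, q):
    """Sum over all references of how far their position differs between p and q."""
    pos_q = {ref: k for k, ref in enumerate(q)}
    return sum(abs(k - pos_q[ref]) for k, ref in enumerate(p))


def candidate_order(query, database, refs, dist):
    """Database indices ordered by how similar their permutation is to the query's."""
    q_perm = permutation(query, refs, dist)
    # In practice these permutations are computed once, when the index is built.
    perms = [permutation(x, refs, dist) for x in database]
    return sorted(range(len(database)),
                  key=lambda j: spearman_footrule(q_perm, perms[j]))


# Illustrative data (assumed): three reference objects and four database elements.
refs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
database = [(1.0, 0.5), (9.0, 1.0), (1.0, 9.0), (6.0, 3.0)]
query = (2.0, 1.0)

order = candidate_order(query, database, refs, math.dist)
print(order)
```

Elements whose permutations are close to the query's are reviewed first, so the expensive distance function needs to be evaluated only on a prefix of this ordering rather than on the whole database.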

Published

2022-03-26

Issue

Vol. 26 No. 1 (2022): Recent Advances in Language & Knowledge Engineering (Guest editors: D. Pinto, B. Beltrán, A. Vázquez, D. Vilariño)

Section

Articles of the Thematic Issue

License

Hereby I transfer exclusively to the Journal "Computación y Sistemas", published by the Computing Research Center (CIC-IPN), the Copyright of the aforementioned paper. I also accept that these rights will not be transferred to any other publication, in any other format, language or other existing means of developing.

I certify that the paper has not been previously disclosed or simultaneously submitted to any other publication, and that it does not contain material whose publication would violate the Copyright or other proprietary rights of any person, company or institution. I certify that I have the permission from the institution or company where I work or study to publish this work.

The representative author accepts the responsibility for the publication of this paper on behalf of each and every one of the authors.

This transfer is subject to the following conditions:

The authors retain all ownership rights (such as patent rights) of this work, except for the publishing rights transferred to the CIC through this document.
Authors retain the right to publish the work in whole or in part in any book of which they are the authors or publishers. They can also make use of this work in conferences, courses, personal web pages, and so on.
Authors may include the work as part of their thesis, for non-profit distribution only.
