Permutation Based Algorithm Improved by Classes for Similarity Searching

Authors

  • Karina Figueroa, Universidad Michoacana de San Nicolas de Hidalgo
  • Antonio Camarena-Ibarrola, Universidad Michoacana de San Nicolas de Hidalgo
  • Luis Valero, Universidad Michoacana de San Nicolas de Hidalgo

DOI:

https://doi.org/10.13053/cys-26-1-4153

Keywords:

Similarity searching, metric spaces, pattern recognition, nearest neighbor

Abstract

Similarity searching is the most important task in multimedia databases. It consists of retrieving the elements of a database that are most similar to a given query, under the assumption that an element identical to the query will not be found. Dissimilarity between objects is measured with a distance function (usually expensive to compute), which allows the problem to be modeled as a metric space. Many algorithms have been designed to address this problem; in particular, the Permutation Based index has shown unbeatable performance. This technique uses a set of reference objects to assign a string to each element in the database, where every such string is a permutation of the same string. However, huge databases and the memory required by these indexes make this problem a real challenge. In this paper, we improve on a first approach in which classes of reference objects were used instead of single references. We propose a new way to choose these classes and a new way to evaluate the similarity between permutations. Our experiments show that we can avoid up to 90% of the distance evaluations with respect to the original technique, and up to 80% with respect to the first approach.
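
As an illustration of the basic permutation-based idea summarized in the abstract, the following minimal sketch orders the reference objects by their distance to each element and then ranks database elements by how much their permutation differs from the query's. It is not the method of the paper: the classes of reference objects and the new permutation similarity are not reproduced here. The function names, the toy 2-D points, the Euclidean distance, and the use of the Spearman footrule as the permutation measure are all assumptions made for this example.

```python
import math


def permutation(obj, refs, dist):
    """Indices of the reference objects, sorted by increasing distance to obj.

    Ties are broken by reference index so the result is deterministic.
    """
    return sorted(range(len(refs)), key=lambda i: (dist(obj, refs[i]), i))


def footrule(perm_a, perm_b):
    """Spearman footrule: sum of the position differences of each reference."""
    pos_b = {ref: rank for rank, ref in enumerate(perm_b)}
    return sum(abs(rank - pos_b[ref]) for rank, ref in enumerate(perm_a))


def build_index(database, refs, dist):
    """Store one permutation per database element (the index itself)."""
    return [permutation(x, refs, dist) for x in database]


def candidate_order(query, index, refs, dist):
    """Database positions sorted by permutation dissimilarity to the query.

    Only len(refs) real distance evaluations are spent on the query here;
    the expensive distance is then evaluated on the most promising candidates.
    """
    q_perm = permutation(query, refs, dist)
    return sorted(range(len(index)), key=lambda i: (footrule(q_perm, index[i]), i))


# Hypothetical toy data: three reference objects and four database elements.
refs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
database = [(2.0, 1.5), (9.0, 1.0), (1.0, 9.0), (2.0, 1.0)]

index = build_index(database, refs, math.dist)
order = candidate_order((1.5, 1.0), index, refs, math.dist)
print(index)
print(order)
```

In this toy case the two elements closest to the query share its permutation and are therefore reviewed first, which is the effect the index relies on to avoid distance evaluations.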

Published

2022-03-26