Data-driven Learned Metric Index: an Unsupervised Approach

Authors

SLANINÁKOVÁ Terézia ANTOL Matej OĽHA Jaroslav DOHNAL Vlastislav

Year of publication 2021
Type Article in Proceedings
Conference 14th International Conference on Similarity Search and Applications (SISAP 2021)
MU Faculty or unit

Faculty of Informatics

Citation
Doi http://dx.doi.org/10.1007/978-3-030-89657-7_7
Keywords Index structures; Learned index; Unstructured data; Content-based search; Metric space; Machine learning
Description Metric indexes are traditionally used to organize unstructured or complex data and speed up similarity queries. The most widely used indexes cluster data or divide the space using hyperplanes. During search, the mutual distances between objects and the metric properties allow branches with irrelevant data to be pruned; this is usually implemented using selected anchor objects called pivots. Recently, we introduced an alternative to this approach called the Learned Metric Index. In this method, a series of machine learning models replaces the decisions made on pivots, so query evaluation is determined by the models' predictions. This technique relies on a traditional metric index as a template for its own structure; this dependence on a pre-existing index, and the related overhead, is the main drawback of the approach. In this paper, we propose a data-driven variant of the Learned Metric Index, which organizes the data using their descriptors directly, thus eliminating the need for a template. The proposed learned index shows significant gains in performance over its earlier version, as well as over the established indexing structure M-index.
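
To make the pivot-based pruning mentioned in the description concrete, the following is a minimal sketch of the traditional technique, not the authors' Learned Metric Index and not the M-index. It assumes Euclidean distance and a flat table of precomputed object-to-pivot distances; the function names (`build_pivot_table`, `range_query`) and the toy data are hypothetical and chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance; any metric could be substituted (assumption)."""
    return math.dist(a, b)


def build_pivot_table(data, pivots, dist=euclidean):
    """Precompute d(object, pivot) for every object and every pivot."""
    return [[dist(obj, p) for p in pivots] for obj in data]


def range_query(query, radius, data, pivots, table, dist=euclidean):
    """Return objects within `radius` of `query`, pruning with pivots.

    By the triangle inequality, |d(q, p) - d(o, p)| <= d(q, o) for any
    pivot p, so an object whose lower bound exceeds the radius can be
    skipped without computing d(q, o).
    """
    q_to_pivots = [dist(query, p) for p in pivots]
    results, computed = [], 0
    for obj, obj_to_pivots in zip(data, table):
        lower_bound = max(
            abs(qp - op) for qp, op in zip(q_to_pivots, obj_to_pivots)
        )
        if lower_bound > radius:
            continue  # pruned: the object cannot be within the radius
        computed += 1
        if dist(query, obj) <= radius:
            results.append(obj)
    return results, computed


# Hypothetical toy data: four 2-D descriptors and two pivots.
data = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (9.0, 9.0)]
pivots = [(0.0, 0.0), (10.0, 10.0)]
table = build_pivot_table(data, pivots)

results, computed = range_query((0.5, 0.0), 1.0, data, pivots, table)
print(results, computed)
```

In this sketch the two distant objects are discarded using only the precomputed pivot distances. The Learned Metric Index described above replaces such pivot-based decisions with predictions from machine learning models.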