Data-dependent Metric Filtering

Authors

MÍČ Vladimír, ZEZULA Pavel

Year of publication 2022
Type Article in Periodical
Magazine / Source Information systems
MU Faculty or unit

Faculty of Informatics

Citation
Web https://www.sciencedirect.com/science/article/pii/S0306437921001666
Doi http://dx.doi.org/10.1016/j.is.2021.101980
Keywords Metric Space Searching;Similarity Search;Metric Filtering;Data Dependent Filtering
Description Filtering is a fundamental strategy of metric similarity indexes to minimise the number of computed distances. Given a triplet of objects for which the distances of two pairs are known, lower and upper bounds on the third distance can be determined using the triangle inequality. The tightness of these bounds is crucial for efficiency: the more precise the estimate, the more distance computations can be avoided and the more efficient the search is. We show that it is not necessary to consider arbitrary angles in triangles formed by the pairwise distances of three objects, as the range of possible angles is data dependent. When realistic ranges of angles are considered, the bounds on distances can be much tighter and filtering much more effective. We formalise the problem of data-dependent estimation of bounds on distances and analyse in depth the limited angles in triangles of distances. We justify the potential of data-dependent metric filtering both analytically and experimentally, executing many distance estimations on several real-life datasets.
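
The idea in the description can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes that the angle opposite the unknown distance (here called gamma, at the object shared by the two known distances) is known to lie in some range [gamma_min, gamma_max] for the dataset, and it uses the law of cosines, which applies because any three objects of a metric space can be drawn as a triangle in the Euclidean plane. The function names, the parameter names and the choice of which angle is limited are all hypothetical.

```python
import math


def triangle_bounds(a, b):
    """Classic bounds on the third distance from the triangle inequality.

    a, b: the two known distances that share one object.
    """
    return abs(a - b), a + b


def angle_limited_bounds(a, b, gamma_min, gamma_max):
    """Bounds on the third distance when the angle between the two known
    sides is assumed to lie in [gamma_min, gamma_max] (radians).

    Assumption for illustration only: the limits are given, e.g. estimated
    from a sample of the data. With the full range [0, pi] the result
    equals the classic triangle-inequality bounds.
    """
    if not 0.0 <= gamma_min <= gamma_max <= math.pi:
        raise ValueError("expected 0 <= gamma_min <= gamma_max <= pi")

    def third_side(gamma):
        # Law of cosines; clamp tiny negative values caused by rounding.
        squared = a * a + b * b - 2.0 * a * b * math.cos(gamma)
        return math.sqrt(max(squared, 0.0))

    # The third side grows monotonically with gamma on [0, pi].
    return third_side(gamma_min), third_side(gamma_max)


def can_skip(lower_bound, radius):
    """Range-query filtering: an object can be skipped without computing
    its distance to the query if even the lower bound exceeds the radius."""
    return lower_bound > radius


if __name__ == "__main__":
    a, b = 3.0, 4.0
    print(triangle_bounds(a, b))
    print(angle_limited_bounds(a, b, math.pi / 3, 2 * math.pi / 3))
```

With a = 3 and b = 4, the triangle inequality alone gives the interval [1, 7], while limiting the angle to between 60 and 120 degrees narrows it to roughly [3.61, 6.08]. A range query with radius 2 could then skip the object under the angle-limited lower bound but not under the classic one.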