Vector Search pt. 3 - Retrieval Systems ๐ŸŒŒ

Josรฉ Marrugo - Jul 6 - - Dev Community

This is the third part of the Vector Search Series. We've talked about what is a Feature here, and how to automatically obtain those Features using Vision Algorithms here.

If we try to remember, or retrieve something from our memory, we have to navigate the neural circuits in our brains in order to come up with the thing that we were trying to remember.

Our neurons activate in multiple established patterns that brings us a concept, if that concept is close to the thing that we want to remember we keep it, if not, we discard it and came up with another thing.

Here a particular thing that kind of align with this article is the difference on remembering things when we are focused or relaxed. When we are focused, we tend to narrow a lot the things in our remembering space, if we are relaxed, we have a broader remembering space, maybe take more time to remember something, but it is more probable to retrieve it at the end.

Image of a multiple neural circuits activated to remember something, small variations in neurons will give us variations in the remembered thing, depending the focus or disperse modes

The thing is that there are some algorithms that use similar approaches to retrieve things that are close to the one that we are looking for.

Those algorithms are the Vector Retrieval Algorithms, they are at the heart of multiple search engines, ranking results based on similarity.

Here we'll try to explain them a bit.

Search Indexes

Search Indexes could be treated as a sorting algorithm for Vectors, they sort based on the similarity with respect to the input Vector.

So this algorithm will receive an input Vector, then it will compute the similarity between it and others, and finally it will return the identifiers of most similar vectors, as simple as that.

An algorithm that will compute the similarity between the input vector and all the other ones in the Database is called a Brute Force algorithm, as it compute similarity to all vectors, then sort them.

There are other algorithms that make this process a bit (or a lot) more complex, those are very sophisticated algorithms which could be better approaches than brute-force (most of the time).

Great examples of those algorithms are IVF (Inverted File), and HNSW (Hierarchical Navigable Small World)

To understand this better, I'll be using a similar metaphor to the one introduced in the HuggingFace Computer Vision Course.

IVF based algorithms

This kind of algorithm tries to breakdown the Vector Space in Cells, each cell will have a centroid, and that centroid is calculated using only the Vectors inside that cell.

So imagine that you have a lot of puzzle pieces in the floor, they could be of the same puzzle or they could be from different ones. The goal for you is to find a piece that you need (maybe for solving another puzzle), and you'll look up for it using this IVF based algorithm.

In contrast to brute Force where the puzzle pieces are sparse in the floor, and you go piece by piece checking wether it is useful to you or not, the IVF based algorithm organizes the pieces in groups based on the similarity between them, and compute a centroid piece that you can use to check wether that group of pieces could be useful or not.

Image of the puzzle pieces space divided into voronoi cells

At search time, we check if a centroid piece or multiple centroid pieces are somewhat similar to what we are looking for, then after finding most similar centroid pieces, we start looking for the piece that we need inside its group, so instead of searching piece by piece, we'll be searching just in the groups where the centroid piece is most similar to the piece that we have in mind.

This method is very fast, and depending on the number of centroid pieces and comparisons performed at search time (we could check inside one or multiple groups), we could have high precisions when searching for pieces.

Remember that the similarity between pieces is calculated using the features previously extracted, those features could be based on the form, image of the piece, color, size, etc... And they are all saved in a vector form (embedding).

HNSW based algorithms

These algorithms are very precise when retrieving the similar pieces. They organize the pieces in a graph-like form making them faster than brute force, and almost as precise when searching.

The below explanation will combine the puzzle metaphor with this detailed article on HNSW algorihms

The HNSW algorithm would be like having multiple levels of the puzzle pieces. Originally you'll have all the pieces organized in a way that similar pieces are connected by strings little strings (I could not come out with a better methaphor ๐Ÿ˜…) so the pieces and the strings form something like a graph, but the magic of this algorithm comes from having layers of strings.

Imagine that the string's layer could be defined by the diameter of the string (or its color), so there are pieces that have strings of multiple diameters.

Image of the space divided into nodes on different layers

When you are looking for a puzzle piece you could start anywhere in that pieces and strings mesh, but you'll start looking just at the pieces connected with the thicker strings, after finding the most similar piece connected to the thicker strings, you'll start to look for the next thiner group of string connected pieces, that were connected to that most similar piece with the tickest string, and you continue doing this process until you have looked up in all the string's thick levels, or you've find a good enough piece.

Image of the search process

This process is more slow and requiers more memory than the IVF process, and it also consume more memory than the brute force algorithm, because this algorithm have to save all those strings (edges) that connect the pieces (vectors of features), but the results are more precise than the IVF algorithm, and faster than the brute force algorithm.

Conclusion

Having the fundamentals of what is a feature, how to extract Image based features, and how can we lookup for things in a very efficient way, we can now understand how the retreival systems of Google, Netflix, Youtube, Pinterest, and our brains (not so much) work in general.

The full process would be something like:

  1. Adding the Vectors to the database:
    Image description

  2. Extracting the Features of the thing to search:
    Image description

  3. Performing the search using a Vector Retrieval algorithm:
    Image description

And that's it!โ˜€๏ธ

All the articles have links and materials to keep deepeing our knowledge and as the HNSW algorithms explained here that way we'll have more puzzle pieces to retrieve knowledge from, and being better developers.

I hope you've learned something from this short and sparse series. And if you have suggestions or corrections, let me know.

Happy coding! ๐Ÿ˜Ž

. . . .