Implementing Vector Search on Gally

Introduction

In the fast-paced world of e-commerce, the ability to provide relevant search results is paramount to success. As online platforms continue to expand their product offerings, ensuring that customers can easily find what they’re looking for becomes increasingly challenging. 

Recognizing this need for enhanced search capabilities, Gally embarked on a journey to revolutionize its search engine.

Vector search example

Overview of the importance of search relevance in e-commerce

Effective search functionality is not merely a convenience for users; it’s a critical factor that directly impacts conversion rates, customer satisfaction, and ultimately, business success. In an era where consumers demand instant access to the products they desire, the relevance of search results can make or break an online shopping experience. Gally understands this fundamental truth and has committed to delivering an unparalleled search experience for its users.

Introduction to Gally’s implementation of vector search technology

To improve search relevance, Gally made a strategic decision to implement vector search technology. This approach uses machine learning models to understand the semantic context of search queries and match them with the most relevant products in real time. By adopting vector search technology, Gally aims to not only meet but exceed the expectations of its user base.

Stay tuned as we delve deeper into the world of vector search technology and explore how Gally’s implementation is poised to redefine the e-commerce search experience.

Understanding Vector Search

Explanation of Vector Representations (aka Embeddings)

Vector search is a search approach in which data is converted into multi-dimensional vectors (embeddings) that capture semantic relationships.

In this method, each document or item is transformed into a vector within a multi-dimensional space, thereby retaining contextual information and subtle nuances.

Vector search vs Fulltext search

This technique offers users more than keyword-matching results: it takes into account the context and meaning behind their queries. Unlike traditional full-text search, which relies on matching the terms of the query against the text, vector search compares the vector representation of the query with those of the documents and ranks results by the similarity between vectors, so it can return relevant results even when the wording differs. Widely used in fields such as natural language processing (NLP) and recommendation systems, vector search is changing how we access and engage with information.
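To make the difference concrete, here is a minimal sketch in Python. The three-dimensional vectors are hand-made for illustration only (real embedding models produce hundreds of dimensions), and the function names are hypothetical rather than part of Gally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def keyword_match(query, text):
    """Naive full-text check: does any query word appear in the text?"""
    text_words = text.lower().split()
    return any(word in text_words for word in query.lower().split())


# Illustrative, hand-made vectors (an assumption, not output of a real model).
toy_embeddings = {
    "sneakers": [0.9, 0.1, 0.0],
    "running shoes": [0.8, 0.2, 0.1],
    "coffee maker": [0.0, 0.1, 0.9],
}

query = "sneakers"
for label, vector in toy_embeddings.items():
    score = cosine_similarity(toy_embeddings[query], vector)
    print(f"{label!r}: keyword match={keyword_match(query, label)}, "
          f"cosine similarity={score:.2f}")
```

In this toy setup, "running shoes" shares no word with the query, so the keyword check misses it, while its vector stays close to the query vector.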

Role of Large Language Models (LLMs)

At the heart of vector search lies the utilization of Large Language Models (LLMs). These sophisticated models, trained on vast amounts of text data, possess a deep understanding of language semantics and context. 

By leveraging the capabilities of LLMs, vector search algorithms can analyze search queries and product descriptions to identify subtle semantic relationships and infer user intent accurately.

Capabilities in Semantic Search

Vector search technology enables Product Discovery solutions like Gally to move beyond traditional keyword matching and embrace semantic search. Semantic search goes beyond surface-level textual similarities and considers the underlying meaning and context of search queries. This allows for more accurate and relevant search results, even when the exact terms may not match.

By understanding the intricate nuances of language semantics, vector search algorithms can deliver search results that align closely with the user’s intent, leading to a more satisfying and efficient search experience. In the next chapter, we will delve into the implementation process of vector search at Gally and explore how it enhances the search functionality for users.

The Implementation Process

Defining valuable data

The first step towards calculating embeddings on catalog data is to choose carefully which fields to use for the computation. The best input for an LLM to calculate a vector is a text that contains the most important data about the product. So at this step, we assemble the data from various product fields to build the text that will be sent to the model.

Merchants know better than anyone which fields of their catalog contain the most important semantic information to feed the embeddings.

In the Gally Back-Office, they can simply tick which fields should be used for computing a vector representation. Beyond that, Gally allows them to build a pre-prompt that further enriches the data used for the calculation.

For example, if there is a field called “Hiking Level” that can contain “beginner”, “intermediate”, or “advanced”, it can be worth adding a prompt to this field that tells the engine something like “This product is recommended for %s hikers”. The engine then knows that the product is made for hiking and is aimed at a particular level of hiker. Of course, this is less necessary for fields that already contain a large amount of text, like the product description.
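The assembly step could look roughly like the following Python sketch. It is not Gally’s actual code: the function name, the field codes, and the sample product are hypothetical, and only the “%s” prompt convention comes from the example above.

```python
def build_embedding_text(product, field_prompts):
    """Join the selected product fields into one text for the embedding model.

    field_prompts maps a field code to an optional pre-prompt containing a
    single "%s" placeholder; None means the raw value is used as-is.
    """
    parts = []
    for field, template in field_prompts.items():
        value = product.get(field)
        if value is None or value == "":
            continue  # skip fields the product does not have
        if template:
            parts.append(template % (value,))
        else:
            parts.append(str(value))
    return " ".join(parts)


# Hypothetical field selection, as a merchant might configure it.
field_prompts = {
    "name": None,
    "hiking_level": "This product is recommended for %s hikers.",
    "description": None,
}

product = {
    "name": "Trail Boot",
    "hiking_level": "beginner",
    "description": "Waterproof leather boot.",
}

print(build_embedding_text(product, field_prompts))
```

Short, coded values such as the hiking level gain the most from a pre-prompt, since on their own they carry little context for the model.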

Vector Search in Gally

Data Indexing with Vector Representations

As seen previously, the foundation of vector search lies in the creation of vector representations for each data point in the e-commerce catalog. 

Once we have defined the fields that will be used for generating the embeddings, it’s time to get them computed.

Word Embeddings

The first step is to choose which model will be used for the computing phase.

Gally allows the user to use:

  • many of the pre-trained models already available with OpenSearch: MiniLM, DistilBERT, MPNet…
  • any other model, provided it matches the OpenSearch format requirements (PyTorch or ONNX).

In the near future, Gally will also allow the use of external models such as OpenAI, SageMaker, Claude, Mistral, and others.

The chosen model then builds a vector representation of each product’s data and stores it in the OpenSearch index for later use.
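As an illustration of what storing vectors in OpenSearch involves, the sketch below builds an index definition that uses the k-NN plugin’s knn_vector field type. Gally’s real mapping is not described here, so the field names and the helper function are assumptions; the dimension of 384 matches the all-MiniLM-L6-v2 model and would need to change with the chosen model.

```python
import json

# Assumption: 384 is the output size of all-MiniLM-L6-v2.
# The dimension must always match the model used to compute the embeddings.
EMBEDDING_DIMENSION = 384


def build_index_body(vector_field="embedding", dimension=EMBEDDING_DIMENSION):
    """Return an OpenSearch index definition with one k-NN vector field.

    The field names ("name", "embedding") are hypothetical.
    """
    return {
        "settings": {"index": {"knn": True}},  # enable k-NN search on the index
        "mappings": {
            "properties": {
                "name": {"type": "text"},
                vector_field: {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


print(json.dumps(build_index_body(), indent=2))
```

Each indexed product document then carries its embedding as an array of floats in the vector field, next to its regular attributes.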

Query Processing and Retrieval

The same logic will be used for the query asked by the end user. This query will be computed into a vector representation by the same model that was used during the indexing phase.

Consequently, when conducting a search, the background operation involves comparing the similarity of embeddings rather than raw text data.

The closest embeddings are retrieved with a dedicated algorithm such as k-NN, which ranks vectors by a distance or similarity measure (for example, cosine similarity or Euclidean distance).

k-NN, or k-nearest neighbors, is a machine learning algorithm best known for classification and regression tasks. Here it is used for retrieval: it finds the k data points (the vector representations of the products) that are closest in the feature space to a given query point (the vector representation of the query).
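The idea can be sketched as an exact, brute-force search in Python. This is only an illustration under simple assumptions: the two-dimensional vectors and product identifiers are made up, Euclidean distance is used as the measure, and a search engine would typically rely on an approximate index (such as HNSW) rather than scanning every vector.

```python
import heapq
import math


def k_nearest(query_vector, product_vectors, k):
    """Return the ids of the k products closest to the query vector.

    Exact search: computes the Euclidean distance to every product.
    """
    scored = (
        (math.dist(query_vector, vector), product_id)
        for product_id, vector in product_vectors.items()
    )
    return [product_id for _, product_id in heapq.nsmallest(k, scored)]


# Hand-made vectors for illustration only.
product_vectors = {
    "hiking-boot": [1.0, 0.0],
    "trail-sandal": [0.9, 0.1],
    "kettle": [0.0, 1.0],
}

query_vector = [1.0, 0.05]  # would come from embedding the user's query
print(k_nearest(query_vector, product_vectors, k=2))
```

The cost of this exact scan grows linearly with the catalog size, which is why approximate nearest-neighbor structures are generally preferred for large catalogs.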

Vector Search query
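On the OpenSearch side, such a search can be expressed with the k-NN plugin’s knn query clause. The sketch below only builds the request body; the field name "embedding" and the helper function are hypothetical, and the query actually generated by Gally may differ.

```python
import json


def build_knn_query(query_vector, k=10, vector_field="embedding"):
    """Build an OpenSearch k-NN query body for an already embedded query.

    vector_field is a hypothetical field name; it must be the knn_vector
    field that holds the product embeddings.
    """
    return {
        "size": k,
        "query": {
            "knn": {
                vector_field: {
                    "vector": list(query_vector),
                    "k": k,
                }
            }
        },
    }


# The query vector must come from the same model used at indexing time.
print(json.dumps(build_knn_query([0.12, -0.03, 0.88], k=5), indent=2))
```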

Benefits for Gally and its Users

Enhanced Relevance and Precision

By implementing vector search technology, Gally significantly elevates the relevance and precision of search results. Unlike traditional keyword-based searches, which may yield numerous irrelevant matches, vector search considers the semantic context of queries and retrieves results based on similarity metrics. This ensures that users are presented with products that closely align with their preferences and intent, ultimately enhancing their shopping experience.

Efficient and Intuitive Search Experience

With vector search, Gally streamlines the search process, making it more efficient and intuitive for users. By leveraging semantic understanding and similarity calculations, users can find relevant products with greater ease and speed. This reduces the time and effort required to locate desired items, resulting in a more satisfying and frictionless shopping journey.

Reduced Dependence on Manual Configuration

Vector search minimizes the need for manual configuration of synonyms and other search parameters. Unlike traditional search mechanisms that rely heavily on manual tuning to optimize relevance, vector search automates much of this process. By encoding semantic relationships directly into vector representations, Gally can deliver accurate and contextually relevant results without the need for extensive manual intervention.

Future-Proofing and Scalability

As Gally continues to grow, vector search provides a solution that can scale with the platform’s expanding catalog and user base. By relying on machine learning models rather than hand-tuned rules, Gally aims to keep its search capabilities robust and effective as catalog complexity and volume increase.

Future Prospects and Conclusion

Continued Innovation in Vector Search

The implementation of vector search marks just the beginning of Gally’s journey toward advancing search technology. As the field of machine learning and natural language processing continues to evolve, Gally remains committed to staying at the forefront of innovation. This includes exploring new techniques, refining existing algorithms, and harnessing emerging technologies to further enhance search relevance, personalization, and efficiency.

Expanding Applications Beyond E-commerce

While vector search has already proven its value in the realm of e-commerce, its potential extends far beyond product search. Gally recognizes the versatility of this technology and its applicability to various domains, including content recommendation, information retrieval, and data analysis. By leveraging the capabilities of vector search across different facets of its platform, Gally aims to deliver a unified and seamless user experience across all touchpoints.

Empowering Users with Insights and Analytics

In addition to enhancing search functionality, Gally sees an opportunity to empower users with insights and analytics derived from vector search data. By analyzing user search behavior, preferences, and interactions, Gally can gain valuable insights into market trends, customer preferences, and product demand. This data-driven approach not only informs business decisions but also enables Gally to anticipate and respond to changing user needs effectively.

Continuous Focus on User Experience

Throughout its journey, Gally remains steadfast in its commitment to prioritizing user experience above all else. Whether through intuitive search interfaces, personalized recommendations, or seamless navigation, every aspect of Gally’s platform is designed with the user in mind. By continually soliciting feedback, iterating on features, and embracing user-centric design principles, Gally ensures that its search experience remains unparalleled in the industry.