Anyscale, a leading AI application platform, announced a collaboration with MongoDB to enhance multimodal search capabilities. The partnership aims to address the limitations of existing search systems and provide a more sophisticated search experience for enterprises that process large amounts of multimodal data.
Problems with Legacy Search Systems
Businesses often struggle with legacy search systems that can’t handle the complexity of multimodal data, including text, images, and structured data. Legacy systems typically rely on lexical search methods that match text tokens, resulting in poor retrieval quality and irrelevant search results.
For example, an e-commerce platform searching for “green dress” may return items like “Bio Green Apple Shampoo” due to the limitations of lexical search. This is because the search system only matches text tokens and does not understand the semantic meaning behind the query.
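To make the failure mode concrete, here is a minimal Python sketch of a naive lexical scorer; the two-item catalog (and the gown’s name) are hypothetical:

```python
# Minimal sketch of naive lexical token matching over a hypothetical catalog.
catalog = ["Bio Green Apple Shampoo", "Emerald Sleeveless Evening Gown"]

def lexical_score(query: str, title: str) -> int:
    # Score = number of query tokens that appear verbatim in the title.
    query_tokens = set(query.lower().split())
    title_tokens = set(title.lower().split())
    return len(query_tokens & title_tokens)

query = "green dress"
ranked = sorted(catalog, key=lambda t: lexical_score(query, t), reverse=True)
print(ranked[0])  # "Bio Green Apple Shampoo" wins on the shared token "green";
                  # the actual green dress shares no tokens and scores zero.
```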
Innovative Solutions Leveraging Anyscale and MongoDB
Anyscale and MongoDB’s collaboration aims to overcome these limitations by leveraging advanced AI models and scalable data indexing pipelines. The solution includes:
Generating product descriptions from product images and names with a multimodal large language model (LLM) running on Anyscale.
Generating embeddings for product names and descriptions and indexing them in MongoDB Atlas Vector Search.
Building a hybrid search backend that combines traditional text matching with advanced semantic search capabilities.
This approach improves search relevance and user experience by understanding the semantic context of the query and returning more accurate results.
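The post does not say how the lexical and semantic result lists are fused, but a common choice for this kind of hybrid backend is reciprocal rank fusion (RRF); the sketch below uses hypothetical SKU lists to show the idea:

```python
# Sketch of reciprocal rank fusion (RRF); assumes each search returns a
# ranked list of document IDs. k=60 is the conventional default constant.
def rrf_fuse(lexical_ids: list[str], semantic_ids: list[str], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in (lexical_ids, semantic_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists for the query "green dress":
lexical = ["sku-812", "sku-104"]             # token matches
semantic = ["sku-337", "sku-104", "sku-9"]   # embedding nearest neighbors
print(rrf_fuse(lexical, semantic))  # sku-104 ranks first: it appears in both lists
```

Documents that appear in both rankings accumulate score from each list, so results that are both lexically and semantically relevant rise to the top.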
Use Case: E-commerce Platform
The example use case is an e-commerce platform with a large product catalog. The platform aims to improve search capabilities by implementing a scalable multimodal search system that can handle both text and image data. The dataset used in this implementation is the Myntra dataset, which contains product images and metadata from Myntra, an Indian fashion e-commerce company.
Legacy search systems match only text tokens, producing irrelevant search results. With Anyscale and MongoDB, the platform can now return more relevant results by understanding the semantic meaning of queries and enriching the search context with images.
System Structure
The system is divided into two main phases: an offline data indexing phase and an online search phase. The indexing phase processes text and images, embeds them, and writes the results to MongoDB, while the search phase serves search requests in real time.
Data Indexing Phase
This phase includes:
Enhancing metadata by generating product descriptions and additional metadata fields with a multimodal LLM.
Generating embeddings for product names and descriptions.
Ingesting the data into MongoDB Atlas Vector Search.
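A condensed sketch of this indexing flow might look like the following. The embedding model, connection string, and collection layout are assumptions rather than details from the post, and `describe_product` is a stand-in for the multimodal LLM call:

```python
# Sketch of the offline indexing phase (model and schema are assumptions).
from pymongo import MongoClient
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")   # assumed embedding model
client = MongoClient("mongodb+srv://<user>:<pass>@<cluster>/")  # your Atlas URI
products = client["catalog"]["products"]

def describe_product(name: str, image_path: str) -> str:
    # Stand-in: the real pipeline calls a multimodal LLM served on Anyscale
    # with the product image and name to produce a rich description.
    return f"A product named {name}."

def index_product(name: str, image_path: str) -> None:
    description = describe_product(name, image_path)
    # Embed the name and description together; Atlas stores the vector
    # as a plain numeric array on the document.
    vector = embedder.encode(f"{name}. {description}").tolist()
    products.insert_one({
        "name": name,
        "description": description,
        "embedding": vector,  # field the Atlas Vector Search index is built on
    })
```

Note that the vector index itself is defined on the Atlas side (via the UI or Atlas API), not by the driver; the application only needs to write the embedding field the index points at.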
Search Phase
The search phase combines legacy text matching with advanced semantic search. It includes:
Sending a search request from the frontend.
Processing the request in an ingress deployment.
Generating an embedding for the query text.
Performing a vector search in MongoDB.
Returning the search results to the frontend.
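On the query side, the vector search step maps onto Atlas’s `$vectorSearch` aggregation stage. The sketch below reuses `embedder` and `products` from the indexing example; the index name `vector_index` is an assumption:

```python
def semantic_search(query: str, limit: int = 10) -> list[dict]:
    # Embed the query text with the same model used at indexing time.
    query_vector = embedder.encode(query).tolist()
    pipeline = [
        {"$vectorSearch": {
            "index": "vector_index",   # assumed name of the Atlas vector index
            "path": "embedding",       # field holding the stored vectors
            "queryVector": query_vector,
            "numCandidates": 100,      # breadth of the approximate search
            "limit": limit,
        }},
        {"$project": {"name": 1, "description": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]
    return list(products.aggregate(pipeline))
```

A hybrid backend would run this alongside a lexical text query and fuse the two rankings, for example with the RRF sketch shown earlier.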
Conclusion
Anyscale and MongoDB’s collaboration represents a significant advancement in multimodal search technology. By combining advanced AI models with scalable data indexing pipelines, enterprises can now deliver more relevant and efficient search experiences. This solution is particularly useful for e-commerce platforms looking to improve search capabilities and user experience.
For more information, visit the Anyscale blog.