Pratik Patre | Naveen Pai

May 9, 2018

Evolution of Autosuggest at Bloomreach

Written by Pratik Patre and Naveen Pai, Members of Technical Staff, Bloomreach.

 

 

eCommerce platforms often struggle to understand customer intent. At Bloomreach, we use multiple tools, algorithms and systems to figure out just what customers are searching for, and our platform is continuously learning how to improve accuracy and speed of intent fulfilment. Over the years, Autosuggest systems have proven their value as a tool to not just understand user intent, but to also promote discovery of products, as well as act as a personalized guide for the user browsing the website.

The Autosuggest system at Bloomreach has gone through multiple iterations in terms of code, algorithms, and product scope. Each iteration has allowed us to provide more value to our merchants, and to scale the system to power Autosuggest on some of the most popular eCommerce websites on the internet.

 

Inception

The very first Autosuggest system at Bloomreach was developed at an internal hackathon. Over the years, the annual Bloomreach Hackathon has become a testbed for Bloomreachers to demo cool projects developed over 48 hours, from scratch to proof of concept (PoC). The initial Autosuggest system was built using the Solr Suggester component, with suggestion data consumed from user-generated queries collected by the Data Analytics systems at Bloomreach. The system was rough, but the hackathon demo was awesome, and Bloomreachers rightfully recognized the system as a game-changer, especially since it tied in perfectly with our state-of-the-art ranking engine. The very first Autosuggest system looked like this:

 

Success at the hackathon led to the Engineering team using the PoC as a springboard and it wasn’t long before we had Suggest v1.0, which was more polished but used a very similar approach as the hackathon project.

 

Scaling

The Autosuggest system was very well received by our initial set of pilot merchants. We were driving a great user experience while also making an impact on the revenue per visit, and that led to very happy merchants. Over the years, the system was tweaked to allow features such as blacklisting suggestions, boosting suggestions, and tweaking ranking algorithms, while also making performance improvements across the board. However, it was not long before we started hitting the limitations of the Solr Suggester, which formed the crux of our system. So it was time for some re-engineering of the system. This gave us an opportunity to look at Autosuggest from the ground up. We used data and feedback we gathered from our merchants to break down requirements of the new system into 3 major buckets: quality, speed and extensibility.

  1. Quality of Suggestions: Since Autosuggest is the first interface that a user interacts with, the quality of suggestions can be the difference between a good and an awful experience. We define good quality suggestions as those which are:

    1. Relevant to the merchant catalogue.

    2. Valid suggestions for the prefix typed by the user.

    3. Ranked such that the suggestions shown higher up the list are more likely to be clicked on than the ones shown lower on the list.
       

  2. The speed of Autosuggest: The human mind finds a lag greater than 150ms between typing a query and a change in Autosuggest very jarring. As such, it is of critical importance to have a highly performant system. This means that, even with Bloomreach’s globally distributed data centers, we need to keep a strong focus on performance numbers to ensure the best Autosuggest experience for our users. The existing system was fast (in the sub-50ms TP95 range) but we knew we could do much better.
     

  3. System Extensibility: There were multiple improvements we had in mind to better identify and surface customer intent. With the initial system, we simply could not explore these ideas. So from the bottom up, we wanted to design the system to allow further exploration of ways to improve Autosuggest.
     

For quality, we went to the best source of user intent we have: actual queries typed in by users on the merchant website. As it turned out, user queries cover a massive percentage of the merchant catalogue, so quality and coverage are both really good. Running a set of noise-removal, frequency-analysis and auto-correction algorithms on these gives us a set of queries which are both relevant and orderable, since we have statistics on how often users search for a product, or even how often they misspell it (the quintessential example we saw was ‘gucci’ misspelled as ‘guci’).
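The frequency-analysis and correction pass described above can be sketched minimally as follows; the function name, the correction map, and the cutoff threshold are illustrative assumptions, not the actual Bloomreach pipeline:

```python
from collections import Counter

def build_suggestion_set(raw_queries, min_count=3, corrections=None):
    """Keep queries seen often enough, folding known misspellings
    (e.g. 'guci' -> 'gucci') into their corrected form. Frequency acts
    as both a noise filter and a ranking signal."""
    corrections = corrections or {}
    counts = Counter()
    for q in raw_queries:
        q = q.strip().lower()
        counts[corrections.get(q, q)] += 1
    return {q: c for q, c in counts.items() if c >= min_count}

suggestions = build_suggestion_set(
    ["gucci", "guci", "gucci", "gucci", "gucci", "asdfgh"],
    corrections={"guci": "gucci"},
)
# Noise ('asdfgh') is filtered out; 'guci' folds into 'gucci'.
```

Because the corrected frequency counts are kept alongside each suggestion, the same pass yields an ordering signal for free.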

For speed, we decided to push almost all the processing to the backend, leaving the frontend to simply perform a lookup on a pre-generated prefix map. The generated map also took into account misspellings and corrections, so if you searched for “guci” you would be shown the same suggestions as if you had typed “gucci”. Solr continued to power our serving infrastructure, happily performing the prefix matching for user queries and lookup.
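A pre-generated prefix map of this kind can be sketched as below; the names and the per-prefix cap are assumptions for illustration, and misspelling handling can be layered in by mapping corrected prefixes to the same buckets:

```python
from collections import defaultdict

def build_prefix_map(ranked_suggestions, max_per_prefix=8):
    """Precompute prefix -> ranked suggestions offline, so that serving
    is a pure lookup with no per-request ranking work."""
    prefix_map = defaultdict(list)
    for suggestion in ranked_suggestions:
        for i in range(1, len(suggestion) + 1):
            bucket = prefix_map[suggestion[:i]]
            if len(bucket) < max_per_prefix:
                bucket.append(suggestion)
    return prefix_map

# Suggestions are supplied in ranked order, so each bucket stays ranked.
prefix_map = build_prefix_map(["gucci shoes", "gucci bags"])
```

The trade-off is classic space-for-time: the map is large, but every keystroke resolves with a single lookup.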

 

Getting Smarter

Now we had a stable system powering Autosuggest, delivering a delightful experience to our users. The speed and intent identification of Autosuggest was much appreciated, both by merchants and our sales teams! ;)

So it was the perfect time to improve quality even further. The Autosuggest team decided to focus on the following areas:

  • A/B Testing: There is no better way to prove that system A is better than B than by having data to back the claim. A/B testing goes a long way to gather data which can help test theories. The Autosuggest system was extended to support A/B testing to allow different Autosuggest experiences to be provided to a user based on cookies.
     

  • Smart Query Ranking: So far, the major factor used for ranking suggestions was the frequency with which it was searched. But with an A/B testing framework in place, we could test out ranking algorithms that took into account signals including, but not limited to, (a) revenue generated by a query, which depends on products purchased once the query is typed in, and (b) bounce rate of a query, which is the number of times a user didn’t buy anything after searching for a particular query.
     

  • Increasing Coverage with Structured Queries: The Autosuggest system worked great for high-traffic merchants, but for low-traffic merchants, we simply did not have enough user queries to create a list of quality suggestions which would cover the entire merchant catalogue. We built a system we call Structured Queries, which generates queries for product and category pages using ontologies and grammars combined with attributes extracted from each product.

    Suppose we had a product P1 with {color: “red”, product: “shoes”, brand: “nike”, style: “sporty”} and an ontology containing <color> <product>, <color> <brand> <style> <product> and <style> <brand> <product>. We would generate “red shoes”, “red nike sporty shoes” and “sporty nike shoes” as valid inputs to the Autosuggest system.
     

  • Smarter Prefix Matching: Prefix matching for Autosuggest presents its own challenges, which are amplified with multi-word suggestions. When a user types “iph” in a search box, “apple iphone” would be considered a good suggestion, but if a user types “it” in the same search box then “post it notes” is a bad suggestion. In both these cases the search query is a prefix of the second word of the suggestion, but the end-user experience is vastly different.

To solve such problems, we built a model which captures the prefix along with the suggestion clicked from the list. This model helps us understand the relative importance of a prefix for a particular suggestion. We updated our ranking algorithm to incorporate this prefix-importance signal. This meant that for “iph”, “apple iphone” was boosted up, while for “it”, “post it notes” was pushed much lower. This allowed the system to perform smarter prefix matching in multi-word suggestions.
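A toy version of such a prefix-importance model might look like the following; the class and method names are hypothetical, and a real ranker would blend click counts with the other signals described above rather than sorting on raw tuples:

```python
from collections import defaultdict

class PrefixImportanceModel:
    """Counts (prefix, clicked suggestion) pairs from click logs, then
    re-ranks candidates by how often each one is actually chosen for a
    given prefix, falling back to a base score for unseen pairs."""

    def __init__(self):
        self.clicks = defaultdict(int)

    def record_click(self, prefix, suggestion):
        self.clicks[(prefix, suggestion)] += 1

    def rank(self, prefix, candidates, base_scores):
        return sorted(
            candidates,
            key=lambda s: (self.clicks[(prefix, s)], base_scores.get(s, 0)),
            reverse=True,
        )

model = PrefixImportanceModel()
model.record_click("iph", "apple iphone")
model.record_click("iph", "apple iphone")
ranked = model.rank(
    "iph",
    ["post it notes", "apple iphone"],
    base_scores={"post it notes": 10, "apple iphone": 5},
)
# "apple iphone" wins for "iph" despite a lower base score.
```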

 

 

Up, Up and Away

When the Autosuggest system was built, it was meant to handle only text-based suggestions. However, over time, we noticed that Autosuggest systems also play the role of an instant-search by taking a user from the home page directly to a product page they are interested in. This epiphany led to the platform being extended to provide an instant-search solution, known as Product Suggestions. We didn’t want to modify the base Autosuggest response, so we developed a Django-based web service which implemented the Decorator pattern on top of the base Autosuggest response, allowing suggestions to be enriched with data from external systems. Architecturally, it looked like this:

 

 

With this service in place, developing Product Suggestions was straightforward. The Autosuggest system could easily be extended to use the existing Bloomreach search infrastructure to generate a list of products which are the top search results for the first suggestion in the base Autosuggest response. We used only the first suggestion to promote similarity among the suggested products. The system also trusted that the top suggestion was the most likely to be clicked, so products relevant to it were also more likely to be clicked.
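The decorator-style enrichment can be sketched minimally as below, with a hypothetical `search_top_products` callable standing in for the search infrastructure; none of these names come from the actual service:

```python
def decorate_with_products(base_response, search_top_products, num_products=4):
    """Decorator-style enrichment: leave the base Autosuggest response
    untouched and attach top products for its first suggestion, which
    the system trusts as the most likely to be clicked."""
    enriched = dict(base_response)  # the base response stays unmodified
    suggestions = base_response.get("suggestions", [])
    if suggestions:
        enriched["products"] = search_top_products(suggestions[0])[:num_products]
    return enriched

resp = decorate_with_products(
    {"suggestions": ["red shoes", "red dress"]},
    lambda q: [q + " product-1", q + " product-2"],
)
```

Keeping the base response immutable is what lets enrichment fail or be skipped without ever degrading the plain text suggestions.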

The architecture now looked like this:

 

 

At its core, Autosuggest with Product Suggestions meant there were 2 API requests being made behind the scenes: one to get the base Autosuggest response, and the other to get the products to show alongside the base response. However, these APIs have very different latencies. Our Autosuggest system had a TP95 latency of < 30ms while our search API request had a TP95 latency of ~300ms. It would be impossible to meet Suggest API Service Level Agreement (SLA) requirements for merchants if we made a search API call for every query and waited for the response. Similarly, making a search API call alongside every suggest API call would overload the search API system since, on average, the suggest API sees about 4x the number of search API requests.

When the search calls were analyzed, we observed that the frequency of different search queries made by the Search API from the Autosuggest system follows a long-tail distribution. This discovery implied that most search API calls are repetitive, allowing us to perform massive caching of search API responses within the Autosuggest system. We use Redis, a tried and tested caching system that scales beautifully. When a cache miss happens, we return the base Autosuggest response as is, make an asynchronous call to the Search API, and populate the results in our cache. The next API request for the same query finds products in the cache and can decorate the base response with product suggestions without the API latency taking a hit. With this solution, 98% of suggest requests are enriched with product suggestions in the response, while latencies remain well within the < 30ms range.
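The cache-aside flow described above can be sketched as follows; `enqueue_search` stands in for whatever asynchronous mechanism calls the Search API and writes the result back into Redis, and the dict-backed cache here is only for illustration:

```python
import json

def get_product_suggestions(query, cache, enqueue_search):
    """Cache-aside lookup. Hit: return cached products for decoration.
    Miss: fire an asynchronous search to warm the cache and return None,
    so the caller serves the base Autosuggest response as-is."""
    cached = cache.get(query)
    if cached is not None:
        return json.loads(cached)
    enqueue_search(query)  # async: calls Search API, then populates cache
    return None

cache = {"red shoes": json.dumps(["product-1", "product-2"])}
pending = []
hit = get_product_suggestions("red shoes", cache, pending.append)
miss = get_product_suggestions("blue bag", cache, pending.append)
# The miss still returns instantly; the next "blue bag" request
# can be decorated once the async search has filled the cache.
```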

 

 

Harder, Better, Faster, Stronger

A massive amount of product design and engineering development has gone into making Bloomreach Autosuggest the system it is today. It has come very far from the ~1000 lines of code written over 48 hours in the Bloomreach hackathon not so long ago. The system has grown from its humble beginnings to a production-grade workhorse, handling an Autosuggest API load of upwards of 1500 QPS and an assured uptime of 99.99%. Today, nearly 41% of all user sessions on our merchant websites begin with a clicked Autosuggest suggestion, which is a testament to the quality of suggestions served. Every change, every iteration, every architectural decision has grown the system bit by bit to ensure Bloomreach can power Autosuggest on a vast variety of eCommerce sites, with vastly differing catalogue sizes, traffic, and revenue, with no manual tweaking required. The next frontier for the Autosuggest product is to serve personalized suggestions and to improve suggestions based on user context. If I were to describe the story of the growth of Autosuggest within Bloomreach, then I don’t think I could find better words than those spoken by our eternally helmeted French friends who said it best: “Harder, Better, Faster, Stronger”.

Special thanks to Naveen Vardhi, for the massive help in writing and editing this post, and for being an amazing mentor throughout! :)

Acknowledgments: The autosuggest system was worked on by Pratik Patre, Naveen Pai, Ankesh Sengar, Shaifali Gupta, Naveen Vardhi, Jasvinder Singh, Vaibhav Rastogi and others.
