Rory Warin for Bloomreach

Discovery | Bloomreach's System Performance during the 2019 Holiday Season

Holiday preparedness is a yearly practice in which the engineering team at Bloomreach scales its already highly available production systems beyond their previous limits without compromising performance or latency for customers.

The Bloomreach Serving Infrastructure

In last year's blog, the team gave a glimpse of the dependency graph between different services at Bloomreach. A lot has changed since then, and this article focuses on those changes.

Bloomreach has dockerized the autosuggest service and now runs it inside Kubernetes. The Search API server, which runs the Django Python app, now auto-scales based on traffic demand.
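
This article does not describe how the auto-scaling is implemented. As one illustration only, the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler can be sketched as below. The function name, the replica bounds, and the example numbers are assumptions for the sketch, not Bloomreach's actual configuration.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 100) -> int:
    """Proportional scaling: grow or shrink the replica count by the
    ratio of the observed metric to its target, rounding up.

    The bounds (min_replicas, max_replicas) are hypothetical defaults.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds so traffic spikes cannot scale without limit.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 servers each seeing 150 requests/s against a target of 100 requests/s.
    print(desired_replicas(4, 150, 100))
```

With this rule, load above the target adds servers in proportion to the overshoot, and load below the target removes them, which is the general behavior "auto-scale based on traffic demand" implies.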

Below is a bird's-eye view of the Bloomreach Search & Merchandising (brSM) serving infrastructure. During the holidays, we scale up our API servers along with the hot backup that we maintain in our West Coast data center.

Load Testing

Capacity provisioning is of little use unless we load test our systems to validate our assumptions and find any failure points.

Until last year, we ran load tests using Vegeta alone, but this year we spiced that up with a touch of Kubernetes.

We created a prebaked Docker image with Vegeta for running load tests. The image takes log files from S3 and the endpoint to test as inputs. From there, it was simply a matter of writing a Kubernetes Job, and we were good to go. We ran the load tests for a couple of days. A sample Dockerfile is shown below for reference:

###
# python 3 has some issues with s3cmd, currently 2.0.2 is latest
# https://github.com/s3tools/s3cmd/issues/930
###

FROM python:2.7.17-slim-buster

ARG VERSION="12.7.0"
ARG PLATFORM="linux-amd64"

# Add new user to run the whole thing as non-root
RUN addgroup mobile \
    && useradd -g mobile -d /loadtest mobile

RUN apt-get update \
    && apt-get install --no-install-recommends -y vim-tiny wget lzop procps \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN pip install s3cmd=="2.0.2"

# install vegeta
RUN wget -O "./vegeta.tar.gz" "https://github.com/tsenart/vegeta/releases/download/v${VERSION}/vegeta-${VERSION}-${PLATFORM}.tar.gz" \
    && tar -xvf "./vegeta.tar.gz" \
    && mv "./vegeta" "/usr/local/bin/" \
    && rm "./vegeta.tar.gz"

COPY . /loadtest
RUN chown -R mobile:mobile /loadtest
WORKDIR /loadtest
USER mobile
ENTRYPOINT ["/loadtest/docker/entrypoint.sh"]
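
The contents of entrypoint.sh are not shown in this post. As a rough sketch of the replay step, the snippet below turns logged request paths into Vegeta's plain-text target format (one "GET <url>" per line), which Vegeta can read as attack targets. The log layout (one request path per line), the function name, and the example host are assumptions made for illustration. The sketch is written for Python 3, so it would run outside the Python 2.7 image.

```python
from typing import Iterable, Iterator


def to_vegeta_targets(paths: Iterable[str], endpoint: str) -> Iterator[str]:
    """Yield one Vegeta HTTP target line ("GET <url>") per logged request path.

    Assumes each log line holds only a request path plus query string;
    real access logs would need parsing first.
    """
    base = endpoint.rstrip("/")
    for raw in paths:
        path = raw.strip()
        if not path:
            continue  # skip blank lines in the log
        if not path.startswith("/"):
            path = "/" + path
        yield f"GET {base}{path}"


if __name__ == "__main__":
    # Hypothetical paths and host, for illustration only.
    sample = ["/api/v1/core/?q=shoes", "", "api/v1/suggest/?q=sho"]
    for line in to_vegeta_targets(sample, "http://search.example.internal/"):
        print(line)
```

Replaying real logged requests, rather than synthetic ones, keeps the mix of queries in the load test close to production traffic.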

Delivering a great experience

The stakes, and consequently the expectations, are higher during the holiday period, as any downtime has a revenue impact for the customer. By maintaining 100% uptime for Search, Autosuggest & Dashboard, we upheld customer confidence in Bloomreach.

Search

Search observed a peak of ~1600 queries per second (QPS) on Cyber Monday.

Autosuggest

The Autosuggest service observed a peak QPS of ~3000, again on Cyber Monday.

Organic (Related Searches, Related Products)

Organic had a peak QPS of ~2400.
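
The post does not say how the peak figures were measured. One common approach, sketched here under that caveat, is to bucket request timestamps into one-second bins and take the busiest bin. The function name and sample timestamps are hypothetical.

```python
from collections import Counter
from typing import Iterable


def peak_qps(timestamps: Iterable[float]) -> int:
    """Return the largest number of requests seen in any one-second bucket.

    Timestamps are assumed to be non-negative seconds (e.g. Unix epoch).
    """
    per_second = Counter(int(ts) for ts in timestamps)
    return max(per_second.values(), default=0)


if __name__ == "__main__":
    # Two requests in second 0, three in second 1, one in second 2.
    print(peak_qps([0.1, 0.5, 1.2, 1.3, 1.9, 2.0]))
```

A monitoring system may instead average over longer windows (for example, one minute), which smooths out short bursts and reports a lower peak.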

Global Latencies

The graph below shows the latency during this year's holiday period. Average latency is the mean latency across all API calls in a given time window. The average latency was under 150 ms for the brSM API, under 30 ms for Suggest, and under 2 ms for Organic RS/RP.
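
To make the definition concrete, here is a minimal sketch that groups latency samples into fixed windows and averages each one. The window size, the (timestamp, latency) tuple layout, and the sample values are assumptions for illustration, not Bloomreach's monitoring setup.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def average_latency_by_window(samples: Iterable[Tuple[float, float]],
                              window_s: int = 60) -> Dict[int, float]:
    """Map each window's start time (seconds) to the mean latency (ms)
    of all API calls whose timestamp falls inside that window."""
    totals: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for ts, latency_ms in samples:
        window_start = int(ts // window_s) * window_s
        totals[window_start] += latency_ms
        counts[window_start] += 1
    return {w: totals[w] / counts[w] for w in sorted(totals)}


if __name__ == "__main__":
    # (timestamp in seconds, latency in ms); hypothetical values.
    calls = [(0, 100.0), (30, 200.0), (60, 50.0)]
    print(average_latency_by_window(calls))
```

An average hides outliers, so a window with a few very slow calls can still report a low mean.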

2019 Holiday Trends

The Bloomreach BA team also did an analysis of shopping trends for this year's holiday season. The major observations are as follows:

  • Mobile traffic exceeded desktop traffic in both the US and EU.
  • Mobile users spent more than desktop users.
  • Search trends generally stayed the same, i.e., consumers are still buying the same things.

Acknowledgements

  1. Purshotam for leading the effort together with Raunak, Naveen, Jyoti, Abhishek & Mayank.
  2. All other internal teams - Connect, Search Quality, Metal, Dashboard & Analytics for making the holiday preparedness a success.
  3. Special thanks to the Bloomreach support team for helping us have an incident-free holiday.

Blog written by: Abishek & Purshotam from Bloomreach, 2020
