Elasticsearch Data Store
Introduction
The Relevance features Experiments and Trends require Elasticsearch.
Bloomreach Experience Manager 14 supports Elasticsearch version 6.x and, since Bloomreach Experience Manager 14.3.0, version 7.x.
Bloomreach Experience Manager 15 supports Elasticsearch versions 6.x, 7.x, and 8.1.
See System Requirements for the exact supported version number.
Install Elasticsearch
Download and install Elasticsearch.
Choose a Stale Data Removal Strategy
To control the data volume of the Elasticsearch index, choose one of the following two strategies:
- Use a scheduled cleanup job provided by the Relevance Module.
When choosing this strategy, the relevance engine automatically deletes entries older than a certain number of days. Configure the maximum age in the targeting datasource in the application context configuration (see below).
- Use the Rollover Index API provided by Elasticsearch.
When the application connects to Elasticsearch, it uploads an index template containing the mapping for the visit type. When rolling over to a new index, Elasticsearch automatically applies this mapping and moves the alias to the new index. Configure the template name and alias name in the targeting datasource in the application context configuration (see below).
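As a rough illustration of the second strategy, the sketch below builds a call to the Elasticsearch Rollover Index API (POST <alias>/_rollover) using only the Python standard library. The function name build_rollover_request, the alias visits, the node URL, and the 30d max_age condition are assumptions chosen for illustration, not values prescribed by the Relevance Module. The request is only constructed, not sent, so it can be inspected without a running cluster. Note that a rollover creates a new index; removing older indices is a separate step that is not shown here.

```python
import json
import urllib.request


def build_rollover_request(base_url, alias, max_age="30d"):
    """Build (but do not send) a POST <alias>/_rollover request."""
    # Elasticsearch rolls over only if at least one condition is met.
    # The max_age value is an assumption for this sketch.
    body = {"conditions": {"max_age": max_age}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{alias}/_rollover",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_rollover_request("http://localhost:9200", "visits")
print(request.get_method(), request.full_url)

# To execute the rollover against a running cluster:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Such a call would typically be scheduled outside the application, for example from a cron job.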
Configure Visits Data Store
A Relevance Elasticsearch Data Store connects to its database through a JNDI data source lookup, which must be defined at the container level (for example, in Apache Tomcat).
Depending on your stale data removal strategy, add one of the following environment entries in conf/context.xml in your project.
When using the scheduled cleanup job stale data removal strategy:
<Environment name="elasticsearch/targetingDS" type="java.lang.String" value="{'indexName':'visits','maxAgeDays':'60', 'locations':['url-1','url-2',...]}" />
When using the rollover index stale data removal strategy:
<Environment name="elasticsearch/targetingDS" type="java.lang.String" value="{'templateName':'myproject-hippo_relevance_visit', 'aliasName':'visits', 'locations':['url-1','url-2',...]}" />
This registers a JNDI environment resource under java:comp/env/elasticsearch/targetingDS when the site web application is started. The JSON string contains the properties needed to instantiate a client that can connect to an Elasticsearch cluster.
Change ['url-1','url-2',...] to the list of URLs of your Elasticsearch cluster nodes. For local development, you can set locations to ['http://localhost:9200'].
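Because the value attribute holds single-quoted JSON inside an XML attribute, it is easy to misplace a quote or bracket when editing it by hand. The following is a minimal sketch that generates the entry from a Python dictionary instead. The helper name targeting_ds_environment is hypothetical, and the sketch assumes that no setting contains a quote character itself.

```python
import json
import xml.etree.ElementTree as ET


def targeting_ds_environment(settings):
    """Render an <Environment> entry whose value is single-quoted JSON."""
    # Serialize compactly, then swap double quotes for single quotes so the
    # value can sit inside a double-quoted XML attribute. This assumes no
    # setting contains a quote character.
    value = json.dumps(settings, separators=(",", ":")).replace('"', "'")
    element = ET.Element(
        "Environment",
        {
            "name": "elasticsearch/targetingDS",
            "type": "java.lang.String",
            "value": value,
        },
    )
    return ET.tostring(element, encoding="unicode")


entry = targeting_ds_environment(
    {
        "indexName": "visits",
        "maxAgeDays": "60",
        "locations": ["http://localhost:9200"],
    }
)
print(entry)
```

The printed element can be pasted into conf/context.xml. Parsing it back, as a quick check, should yield the same settings.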
The table below lists all available JSON fields:
Field | Type | Default | Description |
indexName (1) | String | n/a | The name of the Elasticsearch index (use with the scheduled cleanup job stale data removal strategy). |
templateName (2) | String | n/a | The name of the index template (use with the rollover index stale data removal strategy). You are free to choose any name, but a descriptive name is advised to prevent name collisions and confusion. |
aliasName (2) | String | n/a | The name of the alias. |
locations (3) | String array | n/a | URL locations of nodes in the Elasticsearch cluster to connect to. One location is enough to connect to the cluster. Specifying multiple locations makes the startup process more robust. |
username | String | n/a | Optional. Username to use if Elasticsearch requires authenticated access. |
password | String | n/a | Optional. Password to use if Elasticsearch requires authenticated access. |
maxConnections | Long | 20 | Optional. Maximum number of client threads in the connection pool that will be used to connect to Elasticsearch. |
maxAgeDays | Long | 397 | Records older than this number of days are deleted. 0 means records are never deleted. |
cleanupJobCronTrigger (4) | String | n/a | Valid cron expression defining the interval at which data store cleanup jobs run. The cleanup jobs only run if maxAgeDays is greater than 0. If the cleanupJobCronTrigger property is absent, the jobs execute with a fixed delay of one hour. |
(1) Required when using the scheduled cleanup job stale data removal strategy.
(2) Required when using the rollover index stale data removal strategy.
(3) Required regardless of stale data removal strategy.
(4) Available since version 13.4.0.
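The required-field rules in the notes above can be checked before deployment. The sketch below is one way to do so. The function name missing_fields and the strategy labels "cleanup" and "rollover" are made up for this example and are not part of the product.

```python
def missing_fields(settings, strategy):
    """Return the required field names that are absent from settings.

    strategy is "cleanup" (scheduled cleanup job) or "rollover"
    (rollover index); both labels are hypothetical names for this sketch.
    """
    required = {
        "cleanup": ["indexName", "locations"],
        "rollover": ["templateName", "aliasName", "locations"],
    }
    if strategy not in required:
        raise ValueError(f"unknown strategy: {strategy!r}")
    # A field counts as missing when it is absent or empty.
    return [name for name in required[strategy] if not settings.get(name)]


rollover_settings = {
    "templateName": "myproject-hippo_relevance_visit",
    "locations": ["http://localhost:9200"],
}
print(missing_fields(rollover_settings, "rollover"))  # ['aliasName']
```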
Configure the visits store to use this JNDI environment resource, as in the default bootstrapped configuration below.
Elasticsearch 6:
/targeting:targeting/targeting:datastores/targeting:visits:
  targeting:storefactoryclass: com.onehippo.cms7.targeting.storage.elastic6.ElasticStoreFactory
  dataSource: elasticsearch/targetingDS
Elasticsearch 7 (Bloomreach Experience Manager 14.3.0 and later):
/targeting:targeting/targeting:datastores/targeting:visits:
  targeting:storefactoryclass: com.onehippo.cms7.targeting.storage.elastic7.ElasticStoreFactory
  dataSource: elasticsearch/targetingDS
Configure Elasticsearch
When using the scheduled cleanup job stale data removal strategy, create the index configured above (referred to by the indexName property) in Elasticsearch, e.g. using curl:
curl -s -S -XPUT http://localhost:9200/visits
If the index does not exist when the CMS is started, the application attempts to create the configured index.
When using the rollover stale data removal strategy, create the aliased index configured above (referred to by the aliasName property) in Elasticsearch, e.g. using curl:
curl -XPUT 'localhost:9200/%3Cvisits-%7Bnow%2Fd%7D-000001%3E' -H 'Content-Type: application/json' -d '{ "aliases": { "visits": {} } }'
This creates an initial index named visits-YYYY.MM.dd-000001 where YYYY.MM.dd are the current year, month and day. The alias for this index is visits. Make sure that the index you create is prefixed with the alias because the application will use ${alias}* for queries.
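The URL path in the curl command is the URL-encoded form of the Elasticsearch date math expression <visits-{now/d}-000001>. The sketch below shows how that encoding and the resulting index name can be derived for a given alias. The function names are hypothetical, and the resolved name assumes Elasticsearch's default date format and that the given date is the current day as seen by the cluster (date math resolves in UTC unless a time zone is specified).

```python
from datetime import date
from urllib.parse import quote


def initial_index_path(alias):
    """URL-encode the date math expression <alias-{now/d}-000001>."""
    # safe="" makes quote() encode "/" as well, as the curl example does.
    return quote(f"<{alias}-{{now/d}}-000001>", safe="")


def resolved_index_name(alias, day):
    """Index name the expression resolves to when created on `day`."""
    return f"{alias}-{day.strftime('%Y.%m.%d')}-000001"


print(initial_index_path("visits"))
# %3Cvisits-%7Bnow%2Fd%7D-000001%3E

name = resolved_index_name("visits", date(2024, 3, 5))
print(name)
# visits-2024.03.05-000001

# The index name starts with the alias, so a query on "visits*" matches it.
print(name.startswith("visits"))
# True
```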
The index must be readable and writable by the user configured with the username and password properties. How to set this up is outside the scope of this document because it depends on the deployment scenario of your Elasticsearch instance. Consult your administrator to find out how to create the index in your Elasticsearch instance.