Node name encoding
Introduction
By default, documents and folders in the CMS have two name values:
- display name - a translatable string value that is meant for display on a webpage (like a breadcrumb value, a link name or a page title) or in the document listing in the CMS.
- node name - the actual node name as stored in the repository, which is also the value used by the HST for constructing a URL.
Both values are encoded using an implementation of the org.hippoecm.repository.api.StringCodec interface, which ensures that unsupported characters are either removed or replaced with a supported character.
For display names, the class org.hippoecm.repository.api.StringCodecFactory$IdentEncoding is used, which simply returns the input value as is. For node names, the class org.hippoecm.repository.api.StringCodecFactory$UriEncoding is used. As its name implies, it performs a one-way encoding (no decoding is possible) that translates any UTF-8 string into a set of characters suitable for use in URIs. See org/hippoecm/repository/api/doc-files/encoding.html in the Repository API for a detailed explanation.
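To make the difference between the two behaviors concrete, here is a minimal, self-contained sketch. It does not use the Hippo classes: NameCodec, CodecSketch, IDENTITY and URL_FRIENDLY are hypothetical names, and the URL-friendly rules (strip diacritics, lower-case, hyphenate) are a simplified assumption, not the actual UriEncoding algorithm.

```java
import java.text.Normalizer;
import java.util.Locale;

/** Hypothetical stand-in for the encode side of a codec; not the Hippo StringCodec interface. */
interface NameCodec {
    String encode(String plain);
}

final class CodecSketch {
    /** Returns the input unchanged, like the identity behavior described for display names. */
    static final NameCodec IDENTITY = plain -> plain;

    /** Simplified, lossy URL-friendly encoding; an assumption, NOT the real UriEncoding rules. */
    static final NameCodec URL_FRIENDLY = plain -> {
        // Decompose accented characters into a base letter plus combining marks.
        String decomposed = Normalizer.normalize(plain, Normalizer.Form.NFD);
        // Drop the combining marks so only the base letters remain.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Lower-case, collapse every run of other characters into one hyphen,
        // then trim leading and trailing hyphens.
        return stripped.toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "-")
                .replaceAll("^-+|-+$", "");
    };

    public static void main(String[] args) {
        // Escapes keep this source file ASCII-only: U+00DC and U+00E4 are U and a with diaeresis.
        String title = "\u00DCber uns: K\u00E4se & Brot";
        System.out.println(IDENTITY.encode(title));     // display name: unchanged
        System.out.println(URL_FRIENDLY.encode(title)); // node name: uber-uns-kase-brot
    }
}
```

Because the second codec discards information (case, punctuation, diacritics), the original title cannot be recovered from the encoded value, which is what "one-way" means here.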
Configuration
Both codecs can be configured in the repository. The repository location depends on the version of Hippo CMS.
| Hippo CMS version | Repository location | 
|---|---|
| 12.1 and older | /hippo:configuration/hippo:frontend/cms/cms-services/settingsService/codecs | 
| 12.2 and newer | /hippo:configuration/hippo:modules/stringcodec/hippo:moduleconfig | 
Both locations accept two properties named encoding.display and encoding.node. As the property value, use the name returned by Class.getName() for the codec class, e.g. org.hippoecm.repository.api.StringCodecFactory$UriEncoding.
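The '$' in the example value comes from how Class.getName() names nested classes. The sketch below illustrates this with hypothetical classes (OuterFactory and InnerCodec are placeholders, not Hippo classes); it only uses standard java.lang.Class methods.

```java
final class OuterFactory {
    /** Hypothetical nested class, standing in for a codec nested inside a factory class. */
    static final class InnerCodec { }
}

final class BinaryNameDemo {
    public static void main(String[] args) {
        // getName() separates a nested class from its enclosing class with '$'.
        System.out.println(OuterFactory.InnerCodec.class.getName());
        // getCanonicalName() uses '.' instead, so it yields a different string.
        System.out.println(OuterFactory.InnerCodec.class.getCanonicalName());
    }
}
```

When writing the property value for a nested codec class of your own, copy the getName() form rather than the dotted form you would write in Java source.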
Different node name encoding per locale
In some cases it is desirable to have a different StringCodec for encoding node names per locale. This way, the URLs constructed by the HST will be in the format that users (and machines) expect for that locale. For example, 'ä' is generally encoded as 'a', but in German it should be 'ae'.
To support this, the configuration option for setting a node name codec has been extended with an optional locale suffix. For example, a StringCodec for the German language can be configured with a property named encoding.node.de. If you need to be more specific (for example, different StringCodecs for the Austrian and German locales), add two properties named encoding.node.de_de and encoding.node.de_at.
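One way to picture how such properties could be resolved is sketched below. The lookup order (full locale, then language only, then the plain encoding.node property) and the lower-casing of the locale are assumptions made for illustration; the text above does not specify them. LocaleCodecLookup, com.example.GermanCodec and com.example.AustrianCodec are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class LocaleCodecLookup {
    private static final String BASE = "encoding.node";

    /**
     * Returns the configured codec class name for a locale such as "de_at".
     * Assumed order: encoding.node.<locale>, encoding.node.<language>, encoding.node.
     */
    static String resolve(Map<String, String> config, String locale) {
        if (locale != null && !locale.isEmpty()) {
            String normalized = locale.toLowerCase(Locale.ROOT);
            // 1. Most specific: language plus country, e.g. encoding.node.de_at
            String exact = config.get(BASE + "." + normalized);
            if (exact != null) {
                return exact;
            }
            // 2. Language only, e.g. encoding.node.de
            int sep = normalized.indexOf('_');
            if (sep > 0) {
                String languageOnly = config.get(BASE + "." + normalized.substring(0, sep));
                if (languageOnly != null) {
                    return languageOnly;
                }
            }
        }
        // 3. Locale-independent default (null if not configured)
        return config.get(BASE);
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("encoding.node", "org.hippoecm.repository.api.StringCodecFactory$UriEncoding");
        config.put("encoding.node.de", "com.example.GermanCodec");      // hypothetical class
        config.put("encoding.node.de_at", "com.example.AustrianCodec"); // hypothetical class

        System.out.println(resolve(config, "de_at")); // exact match
        System.out.println(resolve(config, "de_de")); // falls back to encoding.node.de
        System.out.println(resolve(config, "fr"));    // falls back to encoding.node
    }
}
```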