SiteMapItem Matching
After the URL has been matched to a Mount, the HST tries to match the remainder of the URL (the part after the Mount) to a SiteMapItem. For example, if the URL http://localhost:8080/site/fr/home matches the Mount fr, the part left to match against the sitemap is home.
The SiteMap is configured by default at /hst:hst/hst:configurations/{myproject}/hst:sitemap.
The basic idea behind SiteMap matching is to provide flexible rules for matching specific URLs or complete URL spaces, and to map those URLs to component configurations. The SiteMap (more precisely, its inverse) is also used to generate page URLs for links to documents. The SiteMap is a composite structure containing a hierarchy of SiteMapItems. It can contain SiteMapItems with explicit names, or with wildcards that match any name. A SiteMapItem named _index_ has a special meaning. See the usage of _index_ SiteMapItems below.
From here on, a path segment means the part of a URL between two slashes. The SiteMap supports the following wildcards:
1. _default_: equivalent to *, matching any single path segment
2. _any_: equivalent to **, matching any ending of a URL
3. _default_.ext: where 'ext' can be some extension, for example *.html
4. _any_.ext: where 'ext' can be some extension, for example **.xml
The ** and **.ext matchers (_any_ and _any_.ext) are only allowed as leaf SiteMapItems in the composite structure.
During the SiteMapItem matching phase, the HST tries to match the remainder of the URL after the Mount to the best SiteMapItem. The best SiteMapItem is the one that matches the earliest path segments most specifically, according to these rules:
1. An exact (explicit) match is more specific than a wildcard match
2. * is more specific than **
3. *.html is more specific than *
4. **.html is more specific than **
5. * is more specific than **.html
Rules 1 to 4 are straightforward. Rule 5 is debatable, but the choice was made to treat * as more specific than **.html. For example, consider the following (contrived) SiteMap:
/hst:hst:
  /hst:configurations:
    /example:
      /hst:sitemap:
        /home:
        /news:
          /_any_.html:
          /_any_:
        /agenda:
          /_any_.html:
          /_any_:
          /2011:
            /_default_:
              /_default_:
        /_any_:
The following URLs (after the Mount part) match to the following SiteMapItems:
/home --> home
/news --> news
/news/2011 --> news/_any_
/news/2011/myNewsItem.html --> news/_any_.html
/agenda/2010 --> agenda/_any_
/agenda/2011/foo --> agenda/2011/_default_
/agenda/2011/foo/bar --> agenda/2011/_default_/_default_
/agenda/2011/foo/myAgendaItem.html --> agenda/2011/_default_/_default_
/agenda/2011/foo/bar/lux --> agenda/_any_
/agenda/2011/foo/bar/myAgendaItem.html --> agenda/_any_.html
/home/foo/bar --> _any_
Pay special attention to /agenda/2011/foo/bar/lux and /agenda/2011/foo/bar/myAgendaItem.html. Both have one path segment more than agenda/2011/_default_/_default_ can match, so that branch does not fit and the matcher falls back to agenda/_any_ and agenda/_any_.html respectively. The _any_ matcher at the root of the SiteMap is typically the catch-all matcher that produces a 404 page.
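The fallback behaviour can be made concrete with a small sketch. The Python below is not the HST implementation. It is a simplified backtracking matcher, written under the assumption that at each level exact names are tried first, then _default_.ext, then _default_, and that _any_.ext and _any_ are consulted only after the more specific branches at that level have failed. The names SITEMAP, match_segments and match_url are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# The contrived sitemap from the example, as nested dicts (name -> children).
SITEMAP: Dict[str, dict] = {
    "home": {},
    "news": {"_any_.html": {}, "_any_": {}},
    "agenda": {
        "_any_.html": {},
        "_any_": {},
        "2011": {"_default_": {"_default_": {}}},
    },
    "_any_": {},
}

WILDCARD_PREFIXES = ("_default_", "_any_")


def _with_extension(node: dict, prefix: str) -> List[Tuple[str, str]]:
    """Return (child name, '.ext') for children such as '_default_.html'."""
    return [(n, n[len(prefix):]) for n in node if n.startswith(prefix + ".")]


def match_segments(node: dict, segments: List[str],
                   path: Tuple[str, ...] = ()) -> Optional[str]:
    if not segments:
        # All segments consumed: the current item is the match.
        return "/".join(path) if path else None
    head, rest = segments[0], segments[1:]

    # Single-segment candidates, most specific first (rules 1 and 3).
    candidates: List[str] = []
    if head in node and not head.startswith(WILDCARD_PREFIXES):
        candidates.append(head)
    candidates += [n for n, ext in _with_extension(node, "_default_")
                   if head.endswith(ext)]
    if "_default_" in node:
        candidates.append("_default_")

    for name in candidates:
        found = match_segments(node[name], rest, path + (name,))
        if found is not None:
            return found

    # Fallback: leaf wildcards swallow the rest of the URL (rules 2, 4, 5).
    for name, ext in _with_extension(node, "_any_"):
        if segments[-1].endswith(ext):
            return "/".join(path + (name,))
    if "_any_" in node:
        return "/".join(path + ("_any_",))
    return None


def match_url(url_after_mount: str) -> Optional[str]:
    segments = [s for s in url_after_mount.split("/") if s]
    return match_segments(SITEMAP, segments)
```

In this sketch, match_url("/agenda/2011/foo/bar/lux") first descends into agenda/2011/_default_/_default_, finds no child for the remaining segment lux, backtracks to agenda, and only then returns agenda/_any_.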
After a SiteMapItem is matched, the HST request processing is invoked with a flyweight runtime instance of this SiteMapItem. The most important properties of a SiteMapItem are:
- hst:componentconfigurationid: The relative path to the hst:component (tree) below /hst:hst/hst:configurations/{myproject}. For example, for the SiteMapItem home it might be hst:pages/home, and for news/_any_ it might be hst:pages/newsoverview. REST pipelines that use a sitemap do not use hst:componentconfigurationid: the property is only used for website development that aggregates content through HST Components.
- hst:relativecontentpath: The content path relative to /hst:hst/hst:sites/{myproject}/hst:content. For example, for the SiteMapItem home it might be common/homepage. The relativecontentpath property can contain references to wildcards from the SiteMap. References are written as property placeholders with the syntax ${integer} or ${parent}, where ${parent} means: use the relativecontentpath of the parent SiteMapItem. For example, the relativecontentpath for the following SiteMapItems could be:
- news/_any_ : news/${1}
- news/_default_/_default_/_any_ : news/${1}/${2}/${3}
- Note that ${1} always refers to the topmost matched wildcard ancestor (a _default_ or, as in the examples above, an _any_), ${2} to the second, and so on.
- Also note that if one of the property placeholders cannot be resolved for a request, the entire value is resolved to null.
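The placeholder rules above can be illustrated with a short sketch. This is not HST code: resolve_content_path is a hypothetical helper, and it assumes the values matched by the wildcards are passed in order from the topmost wildcard down, with the parent's relativecontentpath supplied separately.

```python
import re
from typing import Optional, Sequence

# Matches ${1}, ${2}, ... and ${parent}.
PLACEHOLDER = re.compile(r"\$\{(\d+|parent)\}")


def resolve_content_path(template: str,
                         wildcard_values: Sequence[str],
                         parent_path: Optional[str] = None) -> Optional[str]:
    """Fill in placeholders; return None if any of them cannot be resolved."""
    unresolved = False

    def substitute(m: "re.Match[str]") -> str:
        nonlocal unresolved
        key = m.group(1)
        if key == "parent":
            value = parent_path
        else:
            index = int(key) - 1  # ${1} is the first matched wildcard
            in_range = 0 <= index < len(wildcard_values)
            value = wildcard_values[index] if in_range else None
        if value is None:
            unresolved = True
            return ""
        return value

    result = PLACEHOLDER.sub(substitute, template)
    # One unresolved placeholder makes the entire value null.
    return None if unresolved else result
```

For example, a request matched by news/_default_/_default_/_any_ with wildcard values 2011, 05 and item would resolve news/${1}/${2}/${3} to news/2011/05/item, while a template that refers to ${2} when only one wildcard was matched resolves to None.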
SiteMapItem _index_
_index_ is a special SiteMapItem name. It is somewhat comparable to the Apache DirectoryIndex directive, although the pathInfo is not required to end with a '/'. The _index_ SiteMapItem works as follows:
- If a requested URL matches a SiteMapItem called foo that has a child SiteMapItem called _index_, and the hst:relativecontentpath of that _index_ item points to an existing document or folder, then the final matched SiteMapItem is the _index_ SiteMapItem.
- If an _index_ SiteMapItem exists below the SiteMapItem matched by the pathInfo, but its hst:relativecontentpath does not point to an existing document or folder, then the SiteMapItem matched by the pathInfo is returned.
- Link rewriting for a document that matches an _index_ SiteMapItem produces a link to the parent of the _index_ SiteMapItem. When that link is requested, the matching phase ends up at the _index_ SiteMapItem anyway.
- The _index_ SiteMapItem is supported below both explicit SiteMapItems and * SiteMapItems. It is not supported directly below hst:sitemap, nor below **, **.html or *.html SiteMapItems.
- The hst:relativecontentpath of _index_ SiteMapItems can use property placeholders like ${1}, ${2} and ${parent}.
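As a rough illustration of the _index_ rules, the sketch below applies them after a SiteMapItem has already been matched. It is an assumption-laden model, not the HST API: the Item class, apply_index and the content_exists callback are all hypothetical, and the content path is assumed to have its placeholders resolved already.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Item:
    name: str
    relative_content_path: Optional[str] = None
    children: Dict[str, "Item"] = field(default_factory=dict)


def supports_index(item: Item) -> bool:
    # _index_ is not supported below **, **.ext or *.ext items.
    return not (item.name.startswith("_any_")
                or item.name.startswith("_default_."))


def apply_index(matched: Item,
                content_exists: Callable[[str], bool]) -> Item:
    """Return the _index_ child if its content exists, else the matched item."""
    index = matched.children.get("_index_")
    if index is None or not supports_index(matched):
        return matched
    path = index.relative_content_path
    if path is not None and content_exists(path):
        return index
    return matched
```

A caller would pass a function that checks the repository for a document or folder at the given path. In a test, a set of known paths and its __contains__ method are enough.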
Additional properties of a SiteMapItem
Property name | Example | Description
hst:namedpipeline | JaxrsRestContentPipeline | The pipeline to use for further HST request processing. If not present, the namedpipeline of the parent is used; if there is no parent, the namedpipeline of the Mount is used. If it is not configured on the Mount either, the HST uses the default DefaultSitePipeline, a pipeline that invokes the HstComponent-based request processing.
hst:refId | homeId | Optional property. It must be unique within a single sitemap item tree. It lets you link to the SiteMapItem by its refId value instead of by its path. For example, instead of a path you can configure a refId value in 'hst:referencesitemapitem' of a SiteMenuItem, or in 'hst:homepage' or 'hst:pagenotfound' of a Mount configuration. This is very useful if you have a separate SiteMapItem node for each language and those nodes are multilingual variants of the same sitemap item: each can carry the same 'hst:refId' value, such as 'home'. HST link creation components look up a SiteMapItem configuration by refId first, and by path only if none is found by refId.
hst:excludedforlinkrewriting | true | If set to true, this sitemap item is not used for link rewriting. This is an important property if you want to support REST sitemap items next to normal website sitemap items.
hst:locale | en_US | The locale for the sitemap item and its descendants. If not configured, the value is inherited from the Mount.
hst:parameternames | pageSize | Keys that can be retrieved during HST request processing. The multi-valued properties parameternames and parametervalues must have an equal number of items; otherwise they are all skipped.
hst:parametervalues | 5 | Values that can be retrieved during HST request processing. Property placeholders like ${1}, ${2} are supported. The multi-valued properties parameternames and parametervalues must have an equal number of items; otherwise they are all skipped.
hst:authenticated / hst:roles / hst:users | | For securing the sitemap item.
hst:responseheaders | ["Access-Control-Allow-Origin: http://localhost:3000", "Access-Control-Allow-Credentials: true"] | Custom HTTP response header(s) that are always written for a request on this sitemap item and on its descendant sitemap items, unless the property is overridden. For example, when Cross-Origin Resource Sharing (CORS) is required for this sitemap item, you can configure the related response headers through this property. The property must be set to a string array in which each element has the form ( header_name + ':' + header_value ), as in the example.
hst:hiddeninchannelmanager | true | If true, the sitemap item does not show up in the Sitemap navigation in the Experience manager. This can be used to hide sitemap items that are not relevant to the end user, for example a sitemap item representing an RSS feed.
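Two of the multi-valued properties above have a shape that is easy to get wrong, so here is a hedged sketch of how their values could be interpreted. The helpers build_parameters and parse_response_headers are hypothetical and only model the rules stated in the table. Skipping a malformed header entry is an assumption of this sketch, not documented HST behaviour.

```python
from typing import Dict, List, Tuple


def build_parameters(names: List[str], values: List[str]) -> Dict[str, str]:
    """Pair hst:parameternames with hst:parametervalues by position."""
    if len(names) != len(values):
        # Unequal lengths: all parameters are skipped.
        return {}
    return dict(zip(names, values))


def parse_response_headers(entries: List[str]) -> List[Tuple[str, str]]:
    """Split each 'header_name: header_value' entry at the first colon."""
    headers: List[Tuple[str, str]] = []
    for entry in entries:
        name, separator, value = entry.partition(":")
        if not separator or not name.strip():
            continue  # assumption: ignore entries without a header name
        headers.append((name.strip(), value.strip()))
    return headers
```

Splitting at the first colon only matters for values such as http://localhost:3000, which contain further colons that belong to the header value.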