HST Rewriting rich text field runtime
Introduction
Goal
Rewrite rich text content in documents at runtime.
Background
As a developer, you might want to modify the content of a document's rich text field at runtime. You might even want these modifications to be context-aware.
For example:
- When an internal link cannot be resolved, remove the entire <a> element
- When a link is created, add a tooltip
- When the channel is a mobile channel, take images of lower resolution
- Create a lightbox for images (show some small variant that is clickable to show a large one)
- Create a context-aware lightbox for images: depending on the context, show a different sized image when clicking
- Etc.
To use your own content rewriter, use one of the following options:
- configure it on a per template basis (HST Front Ends)
- override the global default (HST Front Ends)
- override the default content rewriter Spring bean (Delivery API)
Rewrite Rich Text Content in HST Front End
Configure on a Per Template Basis
Normally, when displaying rich text content, you use something like
JSP
<hst:html hippohtml="${requestScope.document.html}"/>
Freemarker
<@hst.html hippohtml=document.html />
This assumes content rewriting is done with the built-in HST SimpleContentRewriter. However, you can use your own custom content rewriter as well. In that case, your script becomes something like:
JSP
<hst:html hippohtml="${requestScope.document.html}" contentRewriter="${requestScope.myContentRewriter}"/>
Freemarker
<@hst.html hippohtml=document.html contentRewriter=myContentRewriter />
You also need to set myContentRewriter as an attribute on the request. For example, your BaseComponent could contain something like:
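import org.hippoecm.hst.component.support.bean.BaseHstComponent;
import org.hippoecm.hst.core.component.HstRequest;
import org.hippoecm.hst.core.component.HstResponse;

public abstract class BaseComponent extends BaseHstComponent {

    // one shared instance for all requests
    public static final MyContentRewriter myContentRewriter = new MyContentRewriter();

    @Override
    public void doBeforeRender(HstRequest request, HstResponse response) {
        // always have the custom content rewriter available
        request.setAttribute("myContentRewriter", myContentRewriter);
    }
}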
Configure Global Default
By default, the HST uses the SimpleContentRewriter. You can override this default by configuring default.hst.contentrewriter.class in the file hst-config.properties, which is typically located in the site/webapp/src/main/webapp/WEB-INF directory in a project.
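For illustration, such an override might look like the fragment below. The property name is the one mentioned above; the class name org.example.BrokenLinksMarkerContentRewriter is a hypothetical placeholder for your own implementation.

```properties
# hst-config.properties
# org.example.BrokenLinksMarkerContentRewriter is a hypothetical class name;
# replace it with the fully qualified name of your own content rewriter
default.hst.contentrewriter.class = org.example.BrokenLinksMarkerContentRewriter
```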
Writing a Custom Content Rewriter
Next, your custom content rewriter needs to be written. It needs to implement org.hippoecm.hst.content.rewriter.ContentRewriter. The easiest way is to extend from org.hippoecm.hst.content.rewriter.impl.AbstractContentRewriter, or even from SimpleContentRewriter which gives you many rewriting utilities already.
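Whichever base class you choose, the core of a rewriter is a transformation from one HTML string to another. The following is a minimal sketch of such a string-based transformation for the "add a tooltip" use case. It uses only the JDK and deliberately no HST API, so the class LinkTooltips and the method addTooltips are hypothetical names, not part of the HST. It assumes href values are enclosed in double quotes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the HST: adds a title attribute (tooltip)
// to every <a> element that has an href but no title yet.
final class LinkTooltips {

    // an opening <a ...> tag; group 1 captures its attribute section
    private static final Pattern ANCHOR =
            Pattern.compile("<a\\b([^>]*)>", Pattern.CASE_INSENSITIVE);
    // a double-quoted href attribute; group 1 captures its value
    private static final Pattern HREF =
            Pattern.compile("\\bhref\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
    private static final Pattern TITLE =
            Pattern.compile("\\btitle\\s*=", Pattern.CASE_INSENSITIVE);

    private LinkTooltips() {
    }

    static String addTooltips(String html) {
        if (html == null) {
            return null;
        }
        Matcher anchor = ANCHOR.matcher(html);
        StringBuffer out = new StringBuffer();
        while (anchor.find()) {
            String attrs = anchor.group(1);
            Matcher href = HREF.matcher(attrs);
            String replacement = anchor.group();
            if (href.find() && !TITLE.matcher(attrs).find()) {
                // use the link target itself as the tooltip text
                replacement = "<a" + attrs + " title=\"" + href.group(1) + "\">";
            }
            anchor.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        anchor.appendTail(out);
        return out.toString();
    }
}
```

In a real rewriter you would probably call a helper like this from your rewrite implementation, after the links have been rewritten, so that the tooltip shows the final URL.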
Assume you want to write a content rewriter that adds style="color:red" to broken internal links. The easiest way to achieve this is to extend SimpleContentRewriter. The SimpleContentRewriter does string-based rewriting of the rich text field, but for this use case it is easier to use org.htmlcleaner.HtmlCleaner. Hence, the content rewriter needs to override the rewrite method of SimpleContentRewriter. The BrokenLinksMarkerContentRewriter below should do pretty much what we want. One important note: the SimpleContentRewriter rewrites both HTML links and images. If you override rewrite to change only the way links are rewritten, you need to call super.rewrite(..) at the end, unless you make sure that you also rewrite images yourself. The example below also rewrites images and therefore does not need to call super.rewrite(..).
If you also need access to the HstRequest / HstResponse in your ContentRewriter, then you can use the following code
HstRequest hstRequest = HstRequestUtils.getHstRequest(
        requestContext.getServletRequest());
HstResponse hstResponse = HstRequestUtils.getHstResponse(
        requestContext.getServletRequest(),
        requestContext.getServletResponse());
BrokenLinksMarkerContentRewriter:
import javax.jcr.Node;

import org.apache.commons.lang.StringUtils;
import org.hippoecm.hst.configuration.hosting.Mount;
import org.hippoecm.hst.content.rewriter.impl.SimpleContentRewriter;
import org.hippoecm.hst.core.linking.HstLink;
import org.hippoecm.hst.core.request.HstRequestContext;
import org.htmlcleaner.CleanerProperties;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class BrokenLinksMarkerContentRewriter extends SimpleContentRewriter {

    private static final Logger log =
            LoggerFactory.getLogger(BrokenLinksMarkerContentRewriter.class);

    // volatile, so that the unsynchronized check in getHtmlCleaner() is safe
    private static volatile boolean htmlCleanerInitialized;
    private static HtmlCleaner cleaner;

    private static synchronized void initCleaner() {
        if (!htmlCleanerInitialized) {
            cleaner = new HtmlCleaner();
            CleanerProperties properties = cleaner.getProperties();
            properties.setOmitHtmlEnvelope(true);
            properties.setTranslateSpecialEntities(false);
            properties.setOmitXmlDeclaration(true);
            properties.setRecognizeUnicodeChars(false);
            properties.setOmitComments(true);
            htmlCleanerInitialized = true;
        }
    }

    protected static HtmlCleaner getHtmlCleaner() {
        if (!htmlCleanerInitialized) {
            initCleaner();
        }
        return cleaner;
    }

    @Override
    public String rewrite(final String html, final Node node,
                          final HstRequestContext requestContext,
                          final Mount targetMount) {
        if (html == null) {
            return null;
        }

        try {
            TagNode rootNode = getHtmlCleaner().clean(html);

            // Rewrite the links. This is what the example is about: when a
            // link cannot be resolved, we remove the href and set a style.
            TagNode[] links = rootNode.getElementsByName("a", true);
            for (TagNode link : links) {
                String documentPath = link.getAttributeByName("href");
                if (isExternal(documentPath)) {
                    continue;
                }

                // resolve the link without its query string
                String queryString = StringUtils.substringAfter(documentPath, "?");
                boolean hasQueryString = !StringUtils.isEmpty(queryString);
                if (hasQueryString) {
                    documentPath = StringUtils.substringBefore(documentPath, "?");
                }

                HstLink href = getDocumentLink(documentPath, node,
                        requestContext, targetMount);
                // if the link is null, marked as notFound or has an empty
                // path, we mark the link element with style=color:red
                if (href == null || href.isNotFound() || href.getPath() == null) {
                    // mark the element and remove the href
                    link.removeAttribute("href");
                    setAttribute(link, "style", "color:red");
                } else {
                    String rewrittenHref = href.toUrlForm(
                            requestContext, isFullyQualifiedLinks());
                    if (hasQueryString) {
                        rewrittenHref += "?" + queryString;
                    }
                    // override the href attribute
                    setAttribute(link, "href", rewrittenHref);
                }
            }

            // Below we rewrite the image src attributes, which results in the
            // very same behavior as SimpleContentRewriter. We could skip this
            // code altogether and instead pass the result of
            // getHtmlCleaner().getInnerHtml(bodyNode) to super.rewrite()
            // from SimpleContentRewriter.
            TagNode[] images = rootNode.getElementsByName("img", true);
            for (TagNode image : images) {
                String srcPath = image.getAttributeByName("src");
                if (isExternal(srcPath)) {
                    continue;
                }
                HstLink binaryLink = getBinaryLink(srcPath, node,
                        requestContext, targetMount);
                if (binaryLink != null && binaryLink.getPath() != null) {
                    String rewrittenSrc = binaryLink.toUrlForm(
                            requestContext, isFullyQualifiedLinks());
                    setAttribute(image, "src", rewrittenSrc);
                } else {
                    log.warn("Skip src because url is null");
                }
            }

            // Everything is rewritten. Now write the "body" element as result.
            TagNode[] targetNodes = rootNode.getElementsByName("body", true);
            if (targetNodes.length > 0) {
                TagNode bodyNode = targetNodes[0];
                return getHtmlCleaner().getInnerHtml(bodyNode);
            } else {
                log.warn("Cannot rewrite content for '{}' because there is no 'body' element",
                        node.getPath());
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return null;
    }

    private void setAttribute(TagNode tagNode, String attrName, String attrValue) {
        if (tagNode.hasAttribute(attrName)) {
            tagNode.removeAttribute(attrName);
        }
        tagNode.addAttribute(attrName, attrValue);
    }
}
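The link rewriting above strips the query string from the href before resolving the link and re-appends it to the rewritten URL afterwards. That step can be sketched in isolation as follows. This is a plain JDK illustration only: the class PathAndQuery and its methods are hypothetical names and not part of the HST or of the example above.

```java
// Hypothetical helper, not part of the HST: splits an href into its path and
// query string, and re-attaches the query string to a rewritten path.
final class PathAndQuery {

    final String path;
    final String query; // empty when the href has no query string

    private PathAndQuery(String path, String query) {
        this.path = path;
        this.query = query;
    }

    static PathAndQuery split(String href) {
        int idx = href.indexOf('?');
        if (idx < 0) {
            return new PathAndQuery(href, "");
        }
        return new PathAndQuery(href.substring(0, idx), href.substring(idx + 1));
    }

    // Appends the original query string, if any, to the rewritten path.
    String applyTo(String rewrittenPath) {
        return query.isEmpty() ? rewrittenPath : rewrittenPath + "?" + query;
    }
}
```

Keeping the query string out of the lookup matters because the link is resolved from the document path alone; a path that still carries ?page=2 would most likely not resolve to a document.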
Rewrite Rich Text Content in Delivery API
In headless implementations using the Delivery API, the default content rewriter is org.hippoecm.hst.pagemodelapi.v10.content.rewriter.HtmlContentRewriter. It can be overridden using Spring bean configuration.
Extend HtmlContentRewriter
Make sure your custom content rewriter class extends org.hippoecm.hst.pagemodelapi.v10.content.rewriter.HtmlContentRewriter, for example:
site/components/src/main/java/org/example/CustomRewriter.java
package org.example;

import javax.jcr.Node;

import org.hippoecm.hst.configuration.hosting.Mount;
import org.hippoecm.hst.core.request.HstRequestContext;
import org.hippoecm.hst.pagemodelapi.v10.content.rewriter.HtmlContentRewriter;
import org.htmlcleaner.HtmlCleaner;

public class CustomRewriter extends HtmlContentRewriter {

    // implementation here
}
Override HtmlContentRewriter Spring Bean
Override the org.hippoecm.hst.pagemodelapi.v10.content.rewriter.HtmlContentRewriter Spring bean in an XML file placed in classpath*:META-INF/hst-assembly/overrides/addon/org/hippoecm/hst/pagemodelapi/v10/*.xml, for example:
site/components/src/main/resources/META-INF/hst-assembly/overrides/addon/org/hippoecm/hst/pagemodelapi/v10/custom.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">

  <bean id="org.hippoecm.hst.pagemodelapi.v10.content.rewriter.HtmlContentRewriter"
        class="org.example.CustomRewriter">
    <constructor-arg>
      <bean class="org.hippoecm.hst.content.rewriter.HtmlCleanerFactoryBean" />
    </constructor-arg>
    <property name="removeAnchorTagOfBrokenLink"
              value="${pagemodelapi.v10.removeAnchorTagOfBrokenLink:false}" />
  </bean>

</beans>
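The removeAnchorTagOfBrokenLink property in the bean definition above takes its value from the placeholder pagemodelapi.v10.removeAnchorTagOfBrokenLink and falls back to false when that placeholder is not set. Assuming the placeholder is resolved from hst-config.properties like other HST properties (the text above does not state this), it could be switched on as sketched here:

```properties
# hst-config.properties
# assumption: the placeholder used in the bean definition is resolved from this file
pagemodelapi.v10.removeAnchorTagOfBrokenLink = true
```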