Namespace Migration
Introduction
Goal
Make backwards-incompatible changes to a namespace's node type definitions.
Use Case
Changes to existing namespaces fall into two categories:
- Backward-compatible: those changes that can be applied without worrying about existing nodes in the repository that use the namespace's node type definitions.
- Backward-incompatible: those changes that might lead to problems with such existing nodes.
For instance, if you add a non-mandatory property to a node type definition, the existing content is still valid. You may want to provide existing content with a default value, but this can be done with a separate updater script. If you remove a property from a node type definition, however, existing nodes of that type might still contain that property. Those nodes will fail to load after you change the type definition, because their contents now violate the constraints the type definition puts on them.
Similarly, introducing a new node type to a namespace does not affect existing content. However, removing a node type from a namespace renders instances of that type invalid.
For this reason Jackrabbit limits the changes you can make to existing namespaces to backward-compatible changes. Making backward-incompatible changes requires migration to a new namespace.
Migrating Namespaces
Making such changes to a namespace requires the following procedure:
- Remap the namespace URI of the namespace you want to change to another prefix.
- Register a new namespace URI with the prefix previously associated with the old namespace URI.
- Import the existing namespace's node type definitions in the new namespace and make the required changes.
- Rewrite all the nodes in the repository that refer to the old namespace to use the new namespace.
A simple example makes this clearer.
Say you have the following node type definitions in a namespace http://example.com/1.0:
<nt='http://www.jcp.org/jcr/nt/1.0'>
<example='http://example.com/1.0'>

[example:foo] > nt:base
- example:bar (string)
The namespace URI http://example.com/1.0 is mapped to the prefix example. The first step is to remap this namespace to a different prefix, for instance example_1. The JCR namespace registry is repository-wide. All actual content, including the node types, is stored with the namespace URI instead of the namespace prefix. After the remapping, the node type that was previously called example:foo is therefore called example_1:foo, and a property called example:bar is now called example_1:bar. JCR even has a property type 'Name': if the value of a Name type property was example:name, internally this was stored as {http://example.com/1.0}name, and so after the remap the value will be example_1:name. This is all straightforward and in accordance with what one would expect when using namespace mappings, but it is worth realizing the impact of remapping a namespace like this.
The second step is to register a new namespace with the old prefix. Typically you choose a URI similar to the old one and simply increment the version: http://example.com/2.0.
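The effect of remapping can be illustrated with a small, self-contained sketch. It is a toy model only: the class PrefixRegistrySketch and its methods are hypothetical and are not part of the JCR or Jackrabbit API. It assumes that names are stored in expanded form ({uri}localName) and that the prefix is only looked up when a name is rendered.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a prefix registry; hypothetical, not the JCR or Jackrabbit API. */
class PrefixRegistrySketch {
    // URI -> prefix. Stored content only ever contains the URI.
    private final Map<String, String> prefixByUri = new HashMap<>();

    /** Maps a URI to a prefix, replacing any prefix the URI had before. */
    void map(String prefix, String uri) {
        boolean usedByOtherUri = prefixByUri.containsValue(prefix)
                && !prefix.equals(prefixByUri.get(uri));
        if (usedByOtherUri) {
            throw new IllegalArgumentException("prefix already in use: " + prefix);
        }
        prefixByUri.put(uri, prefix);
    }

    /** Renders a stored name such as {http://example.com/1.0}foo as prefix:foo. */
    String toPrefixedName(String expandedName) {
        int end = expandedName.indexOf('}');
        if (!expandedName.startsWith("{") || end < 0) {
            throw new IllegalArgumentException("not an expanded name: " + expandedName);
        }
        String uri = expandedName.substring(1, end);
        String localName = expandedName.substring(end + 1);
        String prefix = prefixByUri.get(uri);
        if (prefix == null) {
            throw new IllegalArgumentException("unmapped namespace: " + uri);
        }
        return prefix + ":" + localName;
    }
}
```

In this model, a stored name {http://example.com/1.0}foo renders as example:foo while the URI is mapped to example, and as example_1:foo once the same URI is remapped to example_1. The stored name itself never changes, which is why the prefix example becomes free for the new URI.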
Now we can import the node types and make the required changes:
<nt='http://www.jcp.org/jcr/nt/1.0'>
<example='http://example.com/2.0'>

[example:foo] > nt:base
The last step is to rewrite all the nodes in the old namespace http://example.com/1.0. In this case this involves removing the property example_1:bar from all nodes of type example_1:foo and changing their primary node type to example:foo.
The Migrator Tool
Hippo provides a stand-alone tool for migrating namespaces that performs such incompatible changes using the procedure described above. Like the checker tool, this migration tool is an executable jar, but unlike the checker tool it requires you to write a custom updater to rewrite the nodes affected by the namespace change. The idea is to build your own migrator jar that includes this custom updater. The following describes how to build such a custom migrator jar.
Implement a Custom Updater
To rewrite the nodes affected by the namespace change you need to write a custom Updater. First declare the Maven dependency on the migrator tool, so that the Updater interface you need to implement is on the compile classpath:
<dependencies>
  <dependency>
    <groupId>org.onehippo.cms7</groupId>
    <artifactId>hippo-migrator</artifactId>
    <version>1.01.01</version>
  </dependency>
</dependencies>
A default implementation of Updater, called BasicUpdater, is provided that you can extend as a starting point for your own custom implementation. BasicUpdater contains all the code required to do a vanilla migration in which no node types have actually changed and only the namespace URI is updated. This includes:
- Changing the primary type if it is in the old namespace.
- Renaming any mixin types defined on the node that are in the old namespace.
- Renaming the node itself if it is prefixed with the old namespace.
- Renaming any properties that are prefixed with the old namespace.
- Changing any Name and URI type properties that use the old namespace.
You only need to add custom code for the specific changes that were made to your node type definitions. Continuing with the example introduced above, your custom ExampleUpdater might look something like the following:
package com.example.update;

import javax.jcr.Node;
import javax.jcr.RepositoryException;

import org.onehippo.cms7.repository.migration.BasicUpdater;

public class ExampleUpdater extends BasicUpdater {

    @Override
    public void update(final Node node) throws RepositoryException {
        final String primaryNodeTypeName = node.getPrimaryNodeType().getName();
        if (primaryNodeTypeName.equals(getOldNamespacePrefix() + ":foo")) {
            // Temporarily relax the type constraints so the type switch is allowed.
            node.addMixin("hipposys:unstructured");
            node.setPrimaryType(getNewNamespacePrefix() + ":foo");
            // bar is not mandatory, so only remove it when it is present.
            final String barName = getOldNamespacePrefix() + ":bar";
            if (node.hasProperty(barName)) {
                node.getProperty(barName).remove();
            }
            node.removeMixin("hipposys:unstructured");
        }
        // Let BasicUpdater do the generic renaming.
        super.update(node);
    }
}
Several things are worth highlighting in this code snippet. First of all, the BasicUpdater provides getters for the old and new namespace URIs and prefixes. In this case the old namespace prefix would be example_1. While you may know the new prefix in advance, this may not be the case for the old prefix, because the migrator generates this prefix automatically. Second, in order to make the transition from the primary type example_1:foo to example:foo it is necessary to temporarily loosen the node type constraints. This is done by setting the mixin hipposys:unstructured. This mixin contains residual node and property definitions that allow any property and child node to be present. Had we not set this mixin first, we would have gotten a ConstraintViolationException when attempting to set the primary node type to example:foo, because that type does not allow the property example_1:bar to be present. Once you are done changing the node you must remove the temporary mixin again.
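The ordering of these steps can be made concrete with a toy in-memory model. The sketch below is not JCR code: ToyNode, its type table and the unstructured flag are hypothetical stand-ins for a node, its node type definitions and the hipposys:unstructured mixin. It also assumes that constraints are checked immediately on every change, which a real repository may handle differently.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy in-memory node; it only mimics the constraint check, not a real JCR node. */
class ToyNode {
    // Hypothetical type table: type name -> property names that type allows.
    private final Map<String, Set<String>> allowedByType;
    private final Set<String> properties = new HashSet<>();
    private String primaryType;
    private boolean unstructured; // stands in for the hipposys:unstructured mixin

    ToyNode(Map<String, Set<String>> allowedByType, String primaryType, Set<String> props) {
        this.allowedByType = allowedByType;
        this.primaryType = primaryType;
        this.properties.addAll(props);
    }

    void setUnstructured(boolean value) {
        unstructured = value;
        validate();
    }

    void setPrimaryType(String type) {
        String previous = primaryType;
        primaryType = type;
        try {
            validate();
        } catch (IllegalStateException e) {
            primaryType = previous; // roll back the rejected change
            throw e;
        }
    }

    void removeProperty(String name) {
        properties.remove(name);
    }

    String getPrimaryType() {
        return primaryType;
    }

    private void validate() {
        if (unstructured) {
            return; // residual definitions allow any property
        }
        Set<String> allowed =
                allowedByType.getOrDefault(primaryType, Collections.<String>emptySet());
        for (String property : properties) {
            if (!allowed.contains(property)) {
                throw new IllegalStateException(primaryType + " does not allow " + property);
            }
        }
    }
}
```

With a node of type example_1:foo that carries example_1:bar, calling setPrimaryType("example:foo") directly is rejected in this model. Setting the unstructured flag first, then switching the type, removing the property and clearing the flag succeeds, which mirrors the order used in ExampleUpdater.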
Building the Executable Jar
Now that you have implemented the custom updater you can build the executable migrator jar. This is done using the Maven Shade plugin:
<plugin>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>
Run mvn package to generate the jar.
Running the Migrator
You can run the migrator as follows:
java -jar migrator.jar <command>
Setting up the Migrator
First try the help command to get a basic explanation of the tool.
You will first have to create a migration.properties file. You can do so by running
java -jar migrator.jar props > migration.properties
Open this file and specify the name of the repository.xml to use, the CND of your new node type definitions, and the fully qualified class name of the updater.
# point to your custom repository.xml
rep.config=example-repository.xml
# your new cnd
migration.cnd=example.cnd
# your Updater to migrate the nodes
migration.updater=com.example.update.ExampleUpdater
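Before starting a migration it can help to verify that the properties file is complete. The following is a minimal sketch of such a check using only java.util.Properties. The class MigrationPropsCheck is hypothetical and not part of the migrator; it only assumes the three keys shown in the template.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical pre-flight check for migration.properties; not part of the migrator. */
class MigrationPropsCheck {
    // The three keys that appear in the generated template.
    static final String[] REQUIRED = {"rep.config", "migration.cnd", "migration.updater"};

    /** Loads the properties and fails fast when a required key is missing or blank. */
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String key : REQUIRED) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                throw new IllegalArgumentException("missing required property: " + key);
            }
        }
        return props;
    }
}
```

You could call MigrationPropsCheck.load(new FileReader("migration.properties")) from a small test or build step. The value of migration.updater must be the fully qualified class name, including the package of your updater.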
Next you will need a repository configuration file. You can generate a template configuration file by running
java -jar migrator.jar config > example-repository.xml
The template contains example settings for a MySQL database. Edit this file according to your setup.
The new CND should define the new node type definitions in the new namespace. The migrator tool reads the prefix and the new namespace URI of your namespace from this CND, looks up the namespace URI that the prefix is currently mapped to, and generates a new prefix to move the old namespace to. The migrator will refuse to run if the namespace URI in the CND is already mapped.
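The check described here can be sketched roughly as follows. This is an illustration under assumptions, not the migrator's actual code: the class CndNamespaceCheck is hypothetical, the registry is modelled as a plain prefix-to-URI map, the prefix to migrate is passed in by the caller, and only single-quoted namespace declarations as used in the examples on this page are recognised.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of the pre-flight check; the real migrator's logic may differ. */
class CndNamespaceCheck {
    // Matches a CND namespace declaration such as <example='http://example.com/2.0'>.
    private static final Pattern NS_DECL =
            Pattern.compile("<\\s*([\\w.-]+)\\s*=\\s*'([^']*)'\\s*>");

    /**
     * Returns the URI the prefix is currently mapped to, i.e. the namespace to migrate.
     * The registry argument maps prefix -> URI.
     */
    static String findOldUri(String cnd, String prefix, Map<String, String> registry) {
        Matcher m = NS_DECL.matcher(cnd);
        while (m.find()) {
            if (!m.group(1).equals(prefix)) {
                continue; // declaration of another namespace, e.g. nt
            }
            String newUri = m.group(2);
            if (registry.containsValue(newUri)) {
                // Mirrors the rule that an already mapped URI stops the migration.
                throw new IllegalStateException("namespace URI already mapped: " + newUri);
            }
            String oldUri = registry.get(prefix);
            if (oldUri == null) {
                throw new IllegalStateException("prefix is not registered: " + prefix);
            }
            return oldUri;
        }
        throw new IllegalArgumentException("CND does not declare prefix " + prefix);
    }
}
```

For the example CND, with example currently mapped to http://example.com/1.0, this returns http://example.com/1.0 as the namespace to migrate. If http://example.com/2.0 were already registered under any prefix, the sketch would refuse to continue.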
Run it
You should now be able to run the migrator.
java -jar migrator.jar migrate
Recovering from Errors
It is highly recommended to test the migration thoroughly before running it on the production database, and always to make a backup first. If you do run into errors in your Updater, it is possible to continue the migration without starting all over again. To do so, start the migrator with the --continue flag:
java -jar migrator.jar migrate --continue
This will skip the namespace remapping phase and go straight to the content migration phase.
Reindexing the Cluster Nodes
After running a namespace migration it is essential that all cluster nodes be reindexed: in the index, as in the database, names are stored with the namespace URI, not with the prefix. Remove the Lucene indices on each cluster node to trigger a reindex on startup.
Limitations
Be aware that if you have migrated a namespace that was used for document types then the version history of those documents up to the migration point is no longer accessible. That is to say, previous versions of your documents that have a type in a previous namespace can no longer be restored!