Form Data Cleanup
Introduction
Goal
Schedule automatic cleanup of the form data stored in the content repository.
Background
Form data submitted to the site are stored in the repository under /formdata. To prevent the /formdata storage from growing indefinitely, a cleanup job can be configured with the repository scheduler that automatically purges the (temporary) data.
Configure the Form Data Cleanup Repository Job
Cleanup of old hst:formdata nodes is done by the FormDataCleanupJob registered with the repository scheduler. It can be configured using the Console at:
/hippo:configuration/hippo:modules/scheduler/hippo:moduleconfig/system/FormDataCleanup
The following configuration attributes are recognized:
- minutestolive
Time to live in minutes. Formdata nodes older than this are deleted. The default value is 1440, i.e. 24 hours. A value of -1 disables cleanup of formdata nodes.
- excludepaths
Paths to exclude from cleanup. Multiple exclusion paths can be configured by separating them with a pipe (|). Formdata nodes below excluded paths are not deleted. Defaults to /formdata/permanent/.
- batchsize
Specifies how often (i.e. after every batchsize nodes) the JCR session is saved while iterating through the form data nodes. The default value is 100. A value between 10 and 1000 is recommended. This advanced option allows performance tuning in cases where a very large number (i.e. millions) of nodes must be processed. Use it only when you know what you are doing.
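To make the meaning of minutestolive and excludepaths concrete, the following is a minimal Java sketch of how such values could be interpreted. It is not the FormDataCleanupJob source: the class and method names are hypothetical, and two details are assumptions, namely that excluded paths are matched as plain path prefixes and that any negative time to live (not only -1) disables cleanup.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative only; hypothetical names, not the actual job implementation. */
final class CleanupSettingsSketch {

    /** Splits a pipe-separated excludepaths value into individual paths. */
    static List<String> parseExcludePaths(String excludePaths) {
        return Arrays.stream(excludePaths.split("\\|"))
                .map(String::trim)
                .filter(path -> !path.isEmpty())
                .collect(Collectors.toList());
    }

    /** Assumption: a node is excluded when its path starts with an excluded path. */
    static boolean isExcluded(String nodePath, List<String> excludes) {
        return excludes.stream().anyMatch(nodePath::startsWith);
    }

    /** True when the node is older than minutesToLive; negative values disable cleanup (assumption). */
    static boolean isExpired(Instant created, Instant now, long minutesToLive) {
        if (minutesToLive < 0) {
            return false;
        }
        return created.isBefore(now.minus(Duration.ofMinutes(minutesToLive)));
    }
}
```

With the default values, a node at /formdata/permanent/survey/entry1 would be skipped regardless of its age, while a node elsewhere under /formdata would qualify for deletion once it is more than 1440 minutes old.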
The job runs according to the cron expression configured in the cron trigger below the job node. When triggered, it searches for nodes of type hst:formdata, iterates over the results, and deletes the nodes that are older than the configured time to live, skipping nodes below excluded paths.
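The interplay between this iteration and the batchsize attribute can be sketched in plain Java as follows. This is only an illustration of the batching idea, not the job's code: the class, method, and parameter names are hypothetical, the JCR session is replaced by generic callbacks, and counting deleted nodes (rather than visited nodes) towards the batch is an assumption.

```java
import java.util.Iterator;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Illustrative only; hypothetical names, with callbacks standing in for the JCR session. */
final class BatchedCleanupSketch {

    /**
     * Walks the search result, removes qualifying nodes, and saves after
     * every batchSize removals. Returns the number of removed nodes.
     */
    static <T> int purge(Iterator<T> nodes, Predicate<T> shouldDelete,
                         Consumer<T> remove, Runnable save, int batchSize) {
        int removed = 0;
        while (nodes.hasNext()) {
            T node = nodes.next();
            if (!shouldDelete.test(node)) {
                continue; // too young or below an excluded path
            }
            remove.accept(node);
            removed++;
            if (removed % batchSize == 0) {
                save.run(); // persist a full batch
            }
        }
        if (removed % batchSize != 0) {
            save.run(); // persist the final, partial batch
        }
        return removed;
    }
}
```

A smaller batch size keeps fewer pending changes in memory at the cost of more save calls, while a larger one does the opposite, which is why the attribute is mainly relevant for very large result sets.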
To change the schedule of the job, adjust the cron expression on the trigger. See the Quartz cron trigger tutorial for instructions on the syntax.
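As a quick orientation on the syntax, a Quartz cron expression consists of six space-separated fields (seconds, minutes, hours, day-of-month, month, day-of-week) plus an optional seventh field for the year. The small Java helper below, with hypothetical names, only labels the fields of an expression; it does not validate or evaluate them. The expression 0 0 3 * * ? is an arbitrary example that fires every day at 03:00 and is not necessarily the schedule configured on the trigger.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only; labels the fields of a Quartz-style cron expression. */
final class QuartzCronFieldsSketch {

    private static final String[] FIELD_NAMES = {
        "seconds", "minutes", "hours", "day-of-month", "month", "day-of-week", "year"
    };

    /** Maps each field name to its value, in expression order. */
    static Map<String, String> describe(String cronExpression) {
        String[] parts = cronExpression.trim().split("\\s+");
        if (parts.length < 6 || parts.length > 7) {
            throw new IllegalArgumentException("Expected 6 or 7 fields, got " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < parts.length; i++) {
            fields.put(FIELD_NAMES[i], parts[i]);
        }
        return fields;
    }
}
```

Calling describe("0 0 3 * * ?") yields the hours field as 3 and the day-of-week field as ?, which in Quartz means "no specific value".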