Data Retention Defaults
There are several levels of retention (keep clauses) you can set, from the general System_Defaults_System Process Definition (which lets you configure global settings) all the way down to retention per object or status. It's important to remember these settings, especially when working with Chain Definitions.
System_Defaults_System
At minimum, every customer should create the System_Defaults_System Process Definition at the beginning of their implementation. Doing so will help you avoid creating Process Definitions that are configured for unlimited retention.
This Process Definition cannot be submitted, but simply creating it sets a few system-wide defaults. If your RunMyJobs instance does not have this definition, create it in the GLOBAL Partition (it does not matter which Definition Type you pick when creating it). RunMyJobs will automatically recognize it and change the settings after you click Save & Close.
Note: When creating or updating System_Defaults_System you are only able to set the Name, Description, Documentation, restart behavior, Retention, and Security settings.
As a baseline, Redwood recommends setting the Keep clause to 32 days, which ensures one month plus one day of retention. This applies to all statuses.
Note: Setting this system retention does not mean other retention limits should not be set. After you set up System_Defaults_System, you will most likely need to fine-tune retention at the Process Definition level.
System_Defaults_Partition
The System_Defaults_Partition Process Definition lets you set default retention at the Partition level. This can be useful if you have a Partition with a lot of processes that run multiple times a day.
Note: Partition-level retention defaults override system-level retention defaults.
This Process Definition works the same way as System_Defaults_System: Simply create a new Process Definition with this name and set its defaults.
For example, assume the EUROPE Partition contains Month End Close processes that run monthly, and assume that you want to be able to retain those processes for six months, regardless of the System_Defaults_System setting. You can do this by configuring the System_Defaults_Partition Process Definition as follows.
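Separately from the configuration itself, the override rule between the two default levels can be pictured in code. The following is a minimal Java sketch, not RunMyJobs code: the class `RetentionDefaults` and its methods are hypothetical names introduced for illustration, and the values are the 32-day system baseline and the six-month EUROPE example used in this document.

```java
import java.time.Period;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of retention defaults; none of these names are RunMyJobs APIs.
final class RetentionDefaults {
    // Stands in for the Keep clause on System_Defaults_System.
    private final Period systemKeep;
    // Stands in for the Keep clause on System_Defaults_Partition, one per Partition.
    private final Map<String, Period> partitionKeep = new HashMap<>();

    RetentionDefaults(Period systemKeep) {
        this.systemKeep = systemKeep;
    }

    void setPartitionDefault(String partition, Period keep) {
        partitionKeep.put(partition, keep);
    }

    // A Partition-level default, when present, overrides the system-level default.
    Period defaultFor(String partition) {
        return partitionKeep.getOrDefault(partition, systemKeep);
    }

    public static void main(String[] args) {
        RetentionDefaults defaults = new RetentionDefaults(Period.ofDays(32));
        defaults.setPartitionDefault("EUROPE", Period.ofMonths(6));
        System.out.println("EUROPE: " + defaults.defaultFor("EUROPE"));
        System.out.println("GLOBAL: " + defaults.defaultFor("GLOBAL"));
    }
}
```

In this sketch, only the EUROPE Partition gets six months; every other Partition falls back to the system-wide value.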
Chain and Process Definition Retention
Generally speaking, the more frequently a process runs, the shorter its retention period should be. When you create a new Chain or Process Definition, confirm the retention you want to use for it by asking the following questions:
- Is the system-wide retention okay according to the defined standards for its frequency?
- Is the Partition-wide retention okay according to the defined standards for its frequency?
- Is there a Chain on the top level where you will set retention?
If none of these options apply, Redwood strongly recommends that you configure the Retention tab for the Chain or Process Definition.
When a Chain Definition is used inside of another Chain Definition, the highest-level Chain determines the retention of everything in the Chain Definition. Consequently, you can select No retention configured for child Chains and Process Definitions as long as you configure retention for the parent Chain Definition.
If you check Keep Force in the Retention tab for a child Process Definition or Chain Definition:
- If the sub-process has a lower retention than the Chain, the Chain will still display, but you might miss sub-process executions that were already purged.
- If the sub-process has a higher retention than the Chain, it will stay visible for a longer period, but you will not see any relation to the parent Chain.
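The relationship between the top-level Chain and Keep Force can be sketched as a small lookup. This is an illustrative Java model under stated assumptions, not the RunMyJobs object model: `ChainRetention`, `Node`, and `effectiveKeep` are hypothetical names, and the sketch returns `null` when nothing is configured at the Chain level instead of falling back to the system or Partition defaults.

```java
import java.time.Period;

// Hypothetical model of a run-time process hierarchy; not the RunMyJobs object model.
final class ChainRetention {
    static final class Node {
        final String name;
        final Period keep;       // null stands for "No retention configured"
        final boolean keepForce; // Keep Force checked on the Retention tab
        final Node parent;       // null for the top-level Chain

        Node(String name, Period keep, boolean keepForce, Node parent) {
            this.name = name;
            this.keep = keep;
            this.keepForce = keepForce;
            this.parent = parent;
        }
    }

    // Returns the retention that applies to a node in this simplified model.
    static Period effectiveKeep(Node node) {
        if (node.keepForce && node.keep != null) {
            return node.keep; // Keep Force: the child follows its own retention
        }
        Node top = node;
        while (top.parent != null) {
            top = top.parent; // walk up to the highest-level Chain
        }
        return top.keep;
    }

    public static void main(String[] args) {
        Node top = new Node("TOP_CHAIN", Period.ofMonths(3), false, null);
        Node sub = new Node("SUB_CHAIN", null, false, top);
        Node plain = new Node("CHILD", Period.ofDays(8), false, sub);
        Node forced = new Node("FORCED_CHILD", Period.ofDays(8), true, sub);
        System.out.println(plain.name + " -> " + effectiveKeep(plain));
        System.out.println(forced.name + " -> " + effectiveKeep(forced));
    }
}
```

In the sketch, `CHILD` inherits the three months of `TOP_CHAIN`, while `FORCED_CHILD` is purged after eight days and can therefore disappear before its parent does.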
Best Practice: Set Retention at the Highest Level
The following image shows a Chain with Chain Processes and several sub-Chains. Setting the retention on TOP_CHAIN applies that retention to all underlying Chains and Process Definitions.
Best Practice: Frequent Processes - Every 15 Minutes
You might need to schedule a process very frequently, for example every 15 minutes. In these situations, ascertain whether the process can be event-based (for example, triggered via a file event or Web Service). If this is not possible, configure retention as shown below.
As a base rule, you can set the retention for frequently run processes to only keep the last 100 executions. If you start to run many of these processes (50 or more), consider lowering this number to keep only ten executions.
If you want to be able to troubleshoot, you can keep processes that have status Error. The screenshot below shows a scenario in which completed processes are not kept, but errors are kept using the Keep Process in Status code E. This is helpful if you want to be able to trace processes in a certain status for longer than processes in other statuses in order to troubleshoot, find trends, and so forth.
Note: Keep Process In Status means these statuses will not be automatically deleted. Be sure to specify a keep duration, delete the processes manually, or create a script to delete them.
The processes per dropdown list can be used to store the specified number of processes per User or Key. Keep in mind that setting this to something other than System can increase the number of processes kept.
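To make the effect of count-based retention, Keep Process in Status, and the processes per setting more concrete, here is a minimal Java sketch. It is a simplified model, not RunMyJobs purge logic: `KeepLastN`, `Run`, and `retain` are hypothetical names, the status code `C` for completed processes is an assumption, and the sketch assumes that processes kept by status do not count toward the keep count.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical model of count-based retention; not a RunMyJobs API.
final class KeepLastN {
    static final class Run {
        final int id;
        final char status; // 'E' = Error; 'C' = Completed (assumed code)
        final String user;

        Run(int id, char status, String user) {
            this.id = id;
            this.status = status;
            this.user = user;
        }
    }

    // Keeps the newest keepCount runs per group, plus every run in a kept status.
    static <K> List<Run> retain(List<Run> runsOldestFirst, int keepCount,
                                Set<Character> keepStatuses, Function<Run, K> processesPer) {
        Map<K, Integer> keptPerGroup = new HashMap<>();
        Deque<Run> retained = new ArrayDeque<>();
        for (int i = runsOldestFirst.size() - 1; i >= 0; i--) { // newest first
            Run run = runsOldestFirst.get(i);
            if (keepStatuses.contains(run.status)) {
                retained.addFirst(run); // never purged automatically
                continue;
            }
            K group = processesPer.apply(run);
            int kept = keptPerGroup.getOrDefault(group, 0);
            if (kept < keepCount) {
                keptPerGroup.put(group, kept + 1);
                retained.addFirst(run);
            }
        }
        return new ArrayList<>(retained);
    }

    public static void main(String[] args) {
        List<Run> runs = new ArrayList<>();
        for (int id = 1; id <= 6; id++) {
            runs.add(new Run(id, id == 2 ? 'E' : 'C', id % 2 == 0 ? "bob" : "alice"));
        }
        Set<Character> keepErrors = Set.of('E');
        System.out.println("per System: " + retain(runs, 2, keepErrors, run -> "System").size());
        System.out.println("per User:   " + retain(runs, 2, keepErrors, run -> run.user).size());
    }
}
```

With the same keep count of two, grouping per user retains more runs than grouping per system, because each user gets their own quota. The run in status Error is retained in both cases, which is why it needs a keep duration or manual cleanup.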
Example Retention Configurations
This section gives an overview of common and best-practice retention configurations when you are developing Process and Chain Definitions.
Note: Data older than three months should be archived and then removed from RunMyJobs, even for lower-frequency processes.
The table below provides some examples and housekeeping rules to use as a guideline. It shows an ideal retention based on the above three-month statement and general usage, and a best practice retention that will be workable in almost all environments.
Process or Chain Frequency | Ideal Retention | Best Practice Retention |
---|---|---|
Monthly | 3 months | 12 months |
Weekly | 3 months | 6 months |
Multiple times per week (less than daily) | 3 months | 3 months |
Daily | 8 days (using Keep Process in Status to keep error processes for longer) | 1 month |
2 runs per day | 8 days (using Keep Process in Status to keep error processes for longer) | 1 month |
8 runs per day | 8 days (using Keep Process in Status to keep error processes for longer) | 8 days (using Keep Process in Status to keep error processes for longer) |
Hourly or less | Last 25 executions | Last 100 executions |
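If you want to check definitions against these guidelines in a script or review tool, the table can be encoded as a simple lookup. The following Java sketch is only one possible encoding: `RetentionGuideline` and `Frequency` are hypothetical names, and the values are copied from the table above as plain text rather than parsed into durations.

```java
// Hypothetical encoding of the guideline table; not part of RunMyJobs.
final class RetentionGuideline {
    enum Frequency {
        MONTHLY("3 months", "12 months"),
        WEEKLY("3 months", "6 months"),
        MULTIPLE_TIMES_PER_WEEK("3 months", "3 months"),
        DAILY("8 days", "1 month"),
        TWO_RUNS_PER_DAY("8 days", "1 month"),
        EIGHT_RUNS_PER_DAY("8 days", "8 days"),
        HOURLY_OR_LESS("Last 25 executions", "Last 100 executions");

        final String ideal;
        final String bestPractice;

        Frequency(String ideal, String bestPractice) {
            this.ideal = ideal;
            this.bestPractice = bestPractice;
        }
    }

    public static void main(String[] args) {
        // Print the guideline for every frequency in table order.
        for (Frequency f : Frequency.values()) {
            System.out.println(f + ": ideal " + f.ideal + ", best practice " + f.bestPractice);
        }
    }
}
```

The "8 days" entries in the table additionally rely on Keep Process in Status to keep error processes for longer; the sketch does not model that.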
Reviewing Retention Settings
You can review the retention settings of all your processes by navigating to Definitions > Retention.
This screen shows No retention configured if you set retention at the Chain level. This is because RunMyJobs cannot be sure whether you also submit a process outside of its Chain Definition.
You can use the code below in a RedwoodScript Process Definition to run an additional check on the retention of parent Chains. The resulting list should be much shorter, and can be used to review retention in your environment. The script assumes that every process without its own retention configuration runs only in Chains where retention is set.
It is normal that you will still see System_ and SAP_ Process Definitions (which are either templates or configuration objects) in this list.
{
  // Select every Process Definition whose UniqueId equals its MasterJobDefinition
  // and that has no retention (KeepUnits) of its own.
  String query = "select jd.* from JobDefinition jd where (jd.UniqueId = jd.MasterJobDefinition) and (jd.KeepUnits is StringNull) order by jd.Partition, jd.Name";
  for (Iterator it = jcsSession.executeObjectQuery(query, null); it.hasNext();)
  {
    JobDefinition jd = (JobDefinition) it.next();
    // Report the definition only if no calling Chain supplies retention.
    if (!parentHasRetention(jd))
    {
      jcsOut.println(jd.getPartition().getName() + "." + jd.getName());
    }
  }
}

// Returns true if any Chain that calls this definition, directly or through
// a higher-level Chain, has retention configured.
boolean parentHasRetention(JobDefinition jd)
{
  boolean result = false;
  String query = "select jcc.* from JobChainCall jcc where (jcc.JobDefinition = ?)";
  for (Iterator it = jcsSession.executeObjectQuery(query, new Object[] {jd.getUniqueId()}); it.hasNext();)
  {
    JobChainCall jcc = (JobChainCall) it.next();
    JobDefinition parent = jcc.getJobChainStep().getJobChain().getJobDefinition();
    if (parent.getKeepUnits() != null)
    {
      // The calling Chain has its own retention.
      result = true;
      break;
    }
    else
    {
      // Otherwise, check the Chains that call the calling Chain.
      result = parentHasRetention(parent);
      if (result)
      {
        break;
      }
    }
  }
  return result;
}