Version 10.2.01P10
Configuring Risk Mitigation Rules
APTARE IT Analytics provides a set of risk mitigation rules to assess areas within your enterprise that may be at risk of failing to meet data protection objectives. These rules include parameters that can be configured to isolate conditions specific to your environment. For example, the Sources with No Recent Backups rule can be modified to specify the number of days for which no backups occurred and to exclude retired clients. While various use cases drive how you configure a rule, the goal is analytics that help you identify areas at risk and trends that require attention. This ongoing process should periodically assess trends and codify business practices. See Risk Mitigation Solution Overview and Risk Mitigation Reports.
Once configured, a scheduled process gathers historical data for these categories so that you can identify areas that require further scrutiny. Accompanying reports present data that can be monitored over time, enabling an actionable process to reduce risk.
Best Practice
When initially configuring values for parameters, be as liberal as possible. Then, over time, tighten the parameters to produce a narrower, actionable list. For optimal data comparisons, avoid frequent parameter modifications.
To edit a risk mitigation rule
If risk mitigation rules are not modified, the historical data process uses an active rule’s default settings to collect the historical data.
1. Select Admin > Solutions > Risk Mitigation.
 
Rule
Rules are listed within relevant categories, such as Cloud and Storage.
Availability
If a particular type of collection is not licensed or collected, risk mitigation data will not be available, regardless of how a rule is configured. In some cases, a Portal may have the necessary license, but collection may not have been enabled or may not have completed.
Data Protection rules require a Backup Manager license.
Storage requires a Capacity Manager license.
Description
Place your mouse over the description to view the full description of the risk mitigation rule.
Notes
Enter operational notes for future reference.
Status
A green check mark indicates successful collection of risk mitigation historical data for enabled rules.
A red X indicates a failed historical data collection. The collection may be attempting to access data for a product module that is not in your Portal environment. Click the red icon to view the Database Error Aggregation report.
A non-colored circle indicates that the background process did not run, typically because the rule is not enabled.
State
Indicates if the rule is Enabled or Disabled.
Last Run
The date and time that the background process ran and evaluated the collected data against the rule’s configured parameters.
2. Select a rule in the Risk Mitigation grid and click Edit. Or, simply double-click the rule to access the edit dialog.
Risk Mitigation Rule
Description
Data Protection Rules
Backup Job Size Variance
Compares each client's average job size, which may help to identify backup issues.
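The variance comparison can be sketched as follows — a minimal, illustrative check of a client's latest job size against its historical average. The function name and percentage threshold are hypothetical; the rule's actual parameters are configured in the Portal:

```python
def job_size_variance(job_sizes_kb, threshold_pct=50):
    """Flag a client whose latest backup job size deviates from its
    historical average by more than threshold_pct percent.
    (Illustrative sketch; not the product's internal logic.)"""
    if len(job_sizes_kb) < 2:
        return False  # not enough history to compare
    history, latest = job_sizes_kb[:-1], job_sizes_kb[-1]
    avg = sum(history) / len(history)
    deviation_pct = abs(latest - avg) / avg * 100
    return deviation_pct > threshold_pct

# A client whose last job shrank sharply is flagged for review.
print(job_size_variance([100, 110, 90, 10]))   # True
print(job_size_variance([100, 110, 90, 105]))  # False
```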
Compliance RTO RPO
Assists in computing RTO (Recovery Time Objective) and RPO (Recovery Point Objective) for backups by determining when/if the last full backup was performed, then adding the time it takes to apply any incremental backups, to determine if you are meeting your SLAs.
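The calculation described above can be sketched as follows. This is a hedged illustration, not the rule's actual computation: the function, its parameters, and the restore-time inputs are all hypothetical stand-ins.

```python
from datetime import datetime, timedelta

def meets_slas(last_full_time, incr_apply_times, now,
               rpo_sla, rto_sla, full_restore_time):
    """Rough SLA check for one source (hypothetical helper).

    RPO: how old the last full backup is versus the allowed
    data-loss window.
    RTO: time to restore the full backup plus the time to apply
    each incremental, versus the allowed recovery window."""
    rpo_ok = (now - last_full_time) <= rpo_sla
    rto_ok = (full_restore_time
              + sum(incr_apply_times, timedelta())) <= rto_sla
    return rpo_ok and rto_ok

# Full backup 8 hours ago, two incrementals to apply on restore.
print(meets_slas(datetime(2024, 1, 7, 22, 0),
                 [timedelta(minutes=20), timedelta(minutes=15)],
                 datetime(2024, 1, 8, 6, 0),
                 rpo_sla=timedelta(hours=24),
                 rto_sla=timedelta(hours=4),
                 full_restore_time=timedelta(hours=2)))  # True
```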
NetBackup Disk Pool Forecast
Examines NetBackup disk pool statistics for the number of weeks in the selected period to forecast the date when storage will run out within the next three years.
If the prediction is beyond three years, a status is returned rather than a specific date.
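A forecast of this kind can be sketched as a least-squares linear trend over the weekly samples, with a three-year horizon. This is an assumption about the method, not a description of the product's algorithm, and the function name and units are hypothetical:

```python
def forecast_runout(weekly_used_gb, capacity_gb, horizon_weeks=156):
    """Fit a least-squares line to weekly usage samples and return
    the week index at which usage is projected to reach capacity,
    or None if growth is flat/negative or the run-out date falls
    beyond the horizon (~three years).
    (Illustrative sketch, not the product's internal forecast.)"""
    n = len(weekly_used_gb)
    if n < 2:
        return None  # not enough samples to fit a trend
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(weekly_used_gb) / n
    slope = (sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(xs, weekly_used_gb))
             / sum((x - mean_x) ** 2 for x in xs))
    if slope <= 0:
        return None  # usage is flat or shrinking
    intercept = mean_y - slope * mean_x
    runout_week = (capacity_gb - intercept) / slope
    return runout_week if runout_week <= horizon_weeks else None

# 10 GB/week growth toward a 1000 GB pool: run-out near week 90.
print(forecast_runout([100, 110, 120, 130], 1000))  # 90.0
```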
Source Overall Status Summary
Considers sources for which backup jobs were not successful to determine risk. This rule helps find such sources by providing a status summary.
Determining whether source backups were successful is complicated, especially if there are multiple policies and schedules defined for a source and multiple streams per backup set. Also, a cutoff time must be established to determine what to do if a source is still running or has not made all of its attempts.
The following criteria are considered:
1. If a source fails all of its jobs, it is failed.
2. If a source successfully completes all of its jobs, it is successful.
3. If a source completes all of its jobs with status 1 (skipped files), it was partially successful and probably OK.
4. If a source has a mixture of successful and failed jobs, it needs further examination to determine if the jobs were truly successful.
Logic can be applied to case #4 to programmatically determine whether a source was successful, but that logic varies from customer to customer.
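The four criteria can be sketched as a roll-up of per-job exit statuses, assuming status 0 means success and status 1 means skipped files (as the criteria above state). The function and status labels are illustrative:

```python
def source_status(job_statuses):
    """Roll up per-job exit statuses (0 = success, 1 = skipped
    files, anything else = failure) into an overall source status,
    following the four criteria. Mixed results fall through to
    'needs review', since the logic for deciding them varies from
    customer to customer."""
    if not job_statuses:
        return "no jobs"
    if all(s not in (0, 1) for s in job_statuses):
        return "failed"
    if all(s == 0 for s in job_statuses):
        return "successful"
    if all(s in (0, 1) for s in job_statuses):
        return "partially successful"
    return "needs review"

print(source_status([0, 0, 0]))   # successful
print(source_status([0, 1]))      # partially successful
print(source_status([0, 6, 0]))   # needs review
print(source_status([6, 6]))      # failed
```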
Sources Consecutive Failure
Evaluates sources where consecutive backups have failed or no backups have occurred for consecutive days. This rule examines the past 14 days of history, providing insights to possible problematic areas.
Best Practice: Schedule this rule to run every day at the end of the backup window. This rule works with any backup product.
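The 14-day scan can be sketched as a run-length check over per-day outcomes, assuming those outcomes have already been collected. The outcome labels and the run-length threshold are illustrative, not product parameters:

```python
def consecutive_failures(daily_results, min_run=3):
    """Scan the last 14 days of per-day outcomes for a source
    ('ok', 'failed', or 'none' for no backup) and report whether
    the longest run of failed/missing days meets the threshold.
    (Illustrative sketch of the rule's evaluation.)"""
    longest = current = 0
    for day in daily_results[-14:]:
        current = current + 1 if day in ("failed", "none") else 0
        longest = max(longest, current)
    return longest >= min_run, longest

# Three consecutive bad days in the middle of the window.
print(consecutive_failures(
    ["ok", "ok", "failed", "none", "failed", "ok", "ok"]))  # (True, 3)
```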
Sources with No Recent Backups
Reviews details of sources that have not been backed up in a defined number of days to help determine if the sources are at risk.
Specify the number of days for which backups have not occurred to determine the risk.
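The check can be sketched as a simple age filter on each source's last backup date, with the retired-source exclusion the rule's parameters allow. The function name and data shape are hypothetical:

```python
from datetime import date, timedelta

def sources_with_no_recent_backups(last_backup_by_source, days,
                                   today, retired=frozenset()):
    """List sources whose last backup is older than the configured
    number of days (or that have never been backed up), excluding
    retired sources. (Illustrative sketch of the rule's filter.)"""
    cutoff = today - timedelta(days=days)
    return sorted(
        src for src, last in last_backup_by_source.items()
        if src not in retired and (last is None or last < cutoff))

print(sources_with_no_recent_backups(
    {"db01": date(2024, 1, 14),   # backed up yesterday: fine
     "web02": date(2023, 12, 1),  # stale: at risk
     "old03": None},              # never backed up, but retired
    days=7, today=date(2024, 1, 15),
    retired={"old03"}))  # ['web02']
```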
Storage Rules
Host Multi-Pathing Exposure
Identifies hosts that are at risk because they have fewer than the specified number of paths. Examines LUN mappings of hosts that do not have multiple HBA ports and array ports configured between a host and a LUN. Typically, two HBA ports and two array ports are required so that if any HBA or array port fails, another port maintains the connection between the host and the LUN.
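The path-count check can be sketched as grouping LUN mappings by host/LUN pair and flagging pairs below the path threshold. The data shape and function name are hypothetical:

```python
def underpathed_luns(path_table, min_paths=2):
    """Given (host, lun, hba_port, array_port) mappings, flag
    host/LUN pairs with fewer than min_paths distinct paths,
    i.e. no redundancy if one HBA or array port fails.
    (Illustrative sketch of the rule's evaluation.)"""
    paths = {}
    for host, lun, hba, array in path_table:
        paths.setdefault((host, lun), set()).add((hba, array))
    return {pair for pair, ports in paths.items()
            if len(ports) < min_paths}

mappings = [
    ("hostA", "lun1", "hba0", "ap0"),
    ("hostA", "lun1", "hba1", "ap1"),  # redundant second path
    ("hostB", "lun2", "hba0", "ap0"),  # single path: at risk
]
print(underpathed_luns(mappings))  # {('hostB', 'lun2')}
```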
Hot Array Ports
Identifies overactive array ports, which may indicate a risk to application performance.
Array port performance data is examined to identify spikes in data transferred.
Hot LUNs by Read IO
Reveals spikes in Read I/O performance metrics, which may indicate an area of risk. This rule uses a unique yet simple algorithm to identify abnormal performance patterns.
Hot LUNs by Read Response
Reveals spikes in Read Response Time metrics, which may indicate an area of risk. This rule uses a unique yet simple algorithm to identify abnormal performance patterns.
Hot LUNs by Write IO
Reveals spikes in Write I/O activity, which may indicate an area of risk. This rule uses a unique yet simple algorithm to identify abnormal performance patterns.
Hot LUNs by Write Response
Reveals spikes in Write Response Time metrics, which may indicate an area of risk. This rule uses a unique yet simple algorithm to identify abnormal performance patterns.
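The spike detection in the Hot Array Ports and Hot LUN rules can be sketched as a simple deviation-from-the-mean threshold over a performance series. This is an assumption standing in for the product's undocumented algorithm; the z-score threshold is illustrative:

```python
from statistics import mean, stdev

def find_spikes(samples, z_threshold=3.0):
    """Flag sample indices whose value deviates from the series
    mean by more than z_threshold standard deviations.
    (A simple stand-in, not the product's actual algorithm.)"""
    if len(samples) < 2:
        return []
    mu, sigma = mean(samples), stdev(samples)
    if sigma == 0:
        return []  # perfectly flat series: no spikes
    return [i for i, v in enumerate(samples)
            if abs(v - mu) / sigma > z_threshold]

# Steady read IOPS with one abnormal burst at index 5.
print(find_spikes([200, 210, 190, 205, 195, 900],
                  z_threshold=1.5))  # [5]
```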
Thin Pool Forecast
Uses multi-vendor and multi-metric pool capacity and forecast data to identify storage at risk.
Virtualization Rules
VM Datastore Forecast
Examines VMware datastore statistics for the number of weeks in the defined period to forecast the date when storage will run out within a three-year period.
VM Guest Disk Forecast
Examines VMware guest disk statistics for the number of weeks in the defined period to forecast the date when storage will run out within a three-year period.