Data Collection Overview
The Data Collector is a centralized and remotely managed Java application that interfaces with enterprise objects, such as backup servers and storage arrays, and gathers information related to storage resource management.
The Data Collector continuously collects data and sends it, over an HTTP or HTTPS connection, to another Java application, the Data Receiver. The Data Receiver runs on the Portal Server and stores the data that it receives in the Reporting Database. When you use the Portal to generate a report, the Portal requests this information from the Reporting Database, then returns the results in one of the many available reports.
The Data Collector obtains all of its monitoring rules from a Data Collector configuration file. This file resides in the Reporting Database in XML format. When the Data Collector first starts, it downloads this file from the Reporting Database. The Data Collector uses this file to determine the list of enterprise objects that are to be monitored and included in its data collection process.
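As an illustration of how a collector could turn such an XML file into a list of objects to monitor, here is a minimal Python sketch. The element and attribute names (`collectorConfig`, `enterpriseObject`, `type`, `host`) and the sample hosts are hypothetical; this guide does not describe the real schema of the Data Collector configuration file.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration document. The real file's schema is not
# described in this guide; these names are assumptions for illustration.
SAMPLE_CONFIG = """
<collectorConfig>
  <enterpriseObject type="backupServer" host="backup01.example.com"/>
  <enterpriseObject type="storageArray" host="array01.example.com"/>
</collectorConfig>
"""


def monitored_objects(config_xml):
    """Return a (type, host) pair for every enterprise object in the config."""
    root = ET.fromstring(config_xml.strip())
    return [
        (obj.get("type"), obj.get("host"))
        for obj in root.iter("enterpriseObject")
    ]


if __name__ == "__main__":
    for obj_type, host in monitored_objects(SAMPLE_CONFIG):
        print(f"monitor {obj_type} at {host}")
```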
Data Collection Component Configuration
1. On the Portal Server:
• Create a Data Collector in the Portal so that the Portal Server can receive data from the Data Collector Server. You must first create the Data Collector and then populate it with product-specific or host data collection policies. A single Data Collector can be installed for multiple capacity, fabric, virtualization, file analytics, and backup products. This Data Collector configuration in the Portal contains the details for communicating with the corresponding Data Collector Server.
• Add subsystem-specific Data Collector Policies. A Data Collector Policy provides the configuration details required to communicate with a subsystem to retrieve data that will be stored in the APTARE IT Analytics database. These details are specific to the vendor of the enterprise object from which data is collected. Policies also let you set the schedule for data collection. A Portal Data Collector must exist before you can create Data Collector Policies.
2. On the Data Collector Server:
• Add the Portal IP address to the local hosts file on the Data Collector Server or on any available client with web-browsing capabilities. Note: Edit the local hosts file only if a DNS entry hasn’t already been set up in your enterprise to resolve both http://aptareportal.yourdomain.com and http://aptareagent.yourdomain.com to the Portal IP address.
• Install the Data Collector software. This software component, installed on the Data Collector Server, interfaces with each of the supported subsystems to extract metadata about the underlying environment. For example, backup data can include job details and tape inventory information. In the case of Capacity Manager, the Data Collector communicates with the storage arrays in your SAN (Storage Area Network) to collect metadata.
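The hosts-file step above can be checked before editing anything. The following Python sketch is only an illustration: it inspects hosts-file text that you pass in, reports which of the two Portal hostnames are not yet mapped to the Portal IP address, and builds the line that would need to be added. The function names and the address 192.0.2.10 are placeholders, and the sketch does not check DNS, which should be ruled out first as the note above says.

```python
PORTAL_HOSTNAMES = (
    "aptareportal.yourdomain.com",
    "aptareagent.yourdomain.com",
)


def missing_hosts_entries(hosts_text, portal_ip, hostnames=PORTAL_HOSTNAMES):
    """Return the hostnames not yet mapped to portal_ip in the hosts text."""
    mapped = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments
        if not line:
            continue
        fields = line.split()
        if fields[0] == portal_ip:
            mapped.update(name.lower() for name in fields[1:])
    return [name for name in hostnames if name.lower() not in mapped]


def hosts_line(portal_ip, hostnames):
    """Build one hosts-file line mapping portal_ip to the given hostnames."""
    return portal_ip + "\t" + " ".join(hostnames)


if __name__ == "__main__":
    # Placeholder content and address, not values from a real installation.
    sample = "127.0.0.1 localhost\n192.0.2.10 aptareportal.yourdomain.com\n"
    missing = missing_hosts_entries(sample, "192.0.2.10")
    if missing:
        print(hosts_line("192.0.2.10", missing))
```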
About Data Collection Tasks
A Data Collector regularly queries your enterprise objects for specific information, and each information type is called a collection task. Each collection task runs at specific intervals, and not all collection tasks run at the same intervals.
A collection task does not always return data because sometimes there isn’t any data to return. However, when the collection task returns data, this historical information is used to determine the collection task’s activity pattern or threshold.
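This guide does not say how the activity pattern or threshold is computed from the historical data. As one possible illustration, the Python sketch below takes the median gap between past postings and multiplies it by a tolerance factor. The function name `expected_gap`, the use of the median, and the factor of 2.0 are all assumptions, not the product's algorithm.

```python
from datetime import datetime, timedelta
from statistics import median


def expected_gap(post_times, tolerance=2.0):
    """Estimate how long a task may stay silent before that looks unusual.

    post_times: datetimes at which the task previously returned data.
    tolerance:  assumed multiplier applied to the median gap.
    Returns a timedelta, or None when there is too little history.
    """
    ordered = sorted(post_times)
    if len(ordered) < 2:
        return None
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    return timedelta(seconds=median(gaps) * tolerance)


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 0, 0)
    # A task that returned data every 20 minutes.
    history = [start + timedelta(minutes=20 * i) for i in range(4)]
    print(expected_gap(history))
```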
Backup Collection Tasks
Most collection tasks run at intervals ranging from every 20 minutes to every 24 hours. However, one collection task, the Backup Job Completed Event, can post data several times a second at the height of the backup window, thereby setting the historical period for posting data to a very short interval. Subsequently, when the backup window is closed and no backups are being performed, the status monitoring might indicate an alert for this data collection component. If you have a backup window with heavy activity followed by little or no activity, you may encounter some false positives for this component. If this component has not captured any data for more than 24 hours, however, this likely indicates an issue that requires investigation.
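The rule of thumb above can be written as a small triage function. This is a sketch for reasoning about alerts, not the product's status-monitoring logic: the 24-hour limit comes from the text, while the function name, the `expected_gap` parameter, and the three status labels are hypothetical.

```python
from datetime import datetime, timedelta

INVESTIGATE_AFTER = timedelta(hours=24)  # limit taken from the text above


def triage(last_data_time, now, expected_gap):
    """Classify an alert for the Backup Job Completed Event task.

    expected_gap is the (possibly very short) interval learned from
    activity during the backup window.
    """
    silence = now - last_data_time
    if silence > INVESTIGATE_AFTER:
        return "investigate"
    if silence > expected_gap:
        # Longer than the learned interval but under 24 hours: likely
        # just a closed backup window.
        return "possible false positive"
    return "ok"


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0)
    one_second = timedelta(seconds=1)
    print(triage(now - timedelta(hours=6), now, one_second))
    print(triage(now - timedelta(hours=30), now, one_second))
```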
Backup Event Data Collector
The Event Data Collector is the software component responsible for capturing backup event data. It is started for the following subsystems: Commvault Simpana, EMC Data Domain, EMC NetWorker, HP Data Protector, and Veritas Backup Exec. Event collection is logged by processing thread to enable troubleshooting and isolation of collection issues. You can access the logs via the Support Tools utility:
Admin > Advanced > Support Tools
See also Data Collector Log Files for a description of the log file naming convention.