The Availability Management module gives operators and administrators the tools and information they need to:
Check the correct operation of the infrastructure
Prevent potential failures
Identify possible malfunctions and issue appropriate alerts
Incident management includes the following features: mapping, active and passive monitoring, and alerting.
Monitoring and alerts
The monitoring and alerting module is the operational tool with the most immediate effect on the availability of the information system. Its role is to verify that the equipment, and the IT services that the equipment provides, are working properly. It detects failures and, whenever possible, prevents them.
Based on the Nagios scheduler, SmartReport uses two complementary methods to describe the operating status of a piece of equipment or an application.
Active monitoring: SmartReport acts as a user of the equipment and exercises its services as a user would.
Passive monitoring: SmartReport uses the information that the equipment itself provides about its operating condition.
Using tests specific to each piece of equipment and each application, SmartReport regularly checks, at a frequency that is also specific to each, that the equipment responds as expected. These are typically network tests (e.g. an ICMP ping to check connectivity to a router) or application tests (e.g. HTTP or SQL). Active monitoring tests are carried out “as a user” of the equipment or application, so the result of a test is generally the same as what a user would obtain.
Active monitoring requires no agent or software module on the monitored equipment.
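As an illustration of an active test, the following minimal Python sketch fetches a page as a user would and checks that the response contains an expected string. The function names, the default timeout, and the messages are hypothetical and are not part of SmartReport; only the numeric status codes follow the standard Nagios plugin convention.

```python
import urllib.error
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_response(body, expected):
    """Map a response body (or None if no response arrived) to (status, message)."""
    if body is None:
        return CRITICAL, "CRITICAL - no response within the time limit"
    if expected not in body:
        return CRITICAL, f"CRITICAL - expected string {expected!r} not found"
    return OK, "OK - expected string found"


def fetch(url, timeout=10.0):
    """Fetch a URL as a user would; return the body text, or None on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        # Covers connection errors, HTTP error statuses, and timeouts.
        return None


def check_http(url, expected, timeout=10.0):
    """Hypothetical active check: fetch the page, then evaluate its content."""
    return evaluate_response(fetch(url, timeout), expected)
```

Keeping the evaluation separate from the network call makes the decision logic easy to test without reaching a real server.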
If a failure is detected, e.g. if there is no response within the configured time limit or if the response does not contain the expected character string, one or more tests are run again at a shorter or longer interval, depending on the application. This confirms the diagnosis and avoids false positives, that is, alerts issued while the equipment is working properly.
If the failure is confirmed, an alert is issued by email or by SMS via a third-party SMS gateway (not supplied). As in Nagios, the alert modes are flexible and can be extended to anything that can be “scripted”.
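The confirmation step can be sketched as a simple retry loop. This is an illustration under assumptions, not SmartReport's actual logic: the function name, the default number of attempts, and the retry interval are hypothetical, loosely analogous to the `max_check_attempts` and `retry_interval` settings in Nagios.

```python
import time


def confirm_failure(check, max_attempts=3, retry_interval=30.0, sleep=time.sleep):
    """Return True only if check() fails on every one of max_attempts tries.

    check is a callable that returns True when the service responds as expected.
    sleep is injectable so the loop can be exercised without real waiting.
    """
    for attempt in range(1, max_attempts + 1):
        if check():
            return False  # the service answered correctly: no alert
        if attempt < max_attempts:
            sleep(retry_interval)  # wait before re-testing
    return True  # every attempt failed: the failure is confirmed
```

An alert would be raised only when `confirm_failure` returns `True`, so a single lost packet or slow response does not notify anyone.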
Publishing services describing the condition of a piece of equipment.
Advantages of Nagios
The active monitoring engine of SmartReport is based on the scheduler of Nagios, the open-source reference in monitoring software. This choice makes it possible both to benefit from all the intelligence of the engine, particularly in reducing false positives, and to easily integrate most of the monitoring plugins that already exist for Nagios and its user community.
SmartReport can also build on an existing Nagios installation and reuse what is already in place, which saves time when integrating the solution.
Passive monitoring is a monitoring method based on the SNMP protocol, and more particularly on the SNMP “traps” that the active equipment regularly sends. SmartReport acts as an SNMP trap server and continuously receives the SNMP events sent by the equipment. If an event matches one that SmartReport recognizes and has been configured for, an alert is issued.
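The matching of received traps against configured events can be pictured with the short sketch below. It assumes the trap has already been received and decoded into a source address and a trap OID; the table layout, severities, and function name are hypothetical. The three OIDs are the standard `coldStart`, `linkDown`, and `linkUp` notifications from SNMPv2-MIB.

```python
# Standard SNMPv2-MIB notification OIDs mapped to hypothetical alert settings.
CONFIGURED_TRAPS = {
    "1.3.6.1.6.3.1.1.5.1": ("coldStart", "WARNING"),
    "1.3.6.1.6.3.1.1.5.3": ("linkDown", "CRITICAL"),
    "1.3.6.1.6.3.1.1.5.4": ("linkUp", "OK"),
}


def handle_trap(source, trap_oid, configured=CONFIGURED_TRAPS):
    """Return an alert dict if the trap is a configured event, else None."""
    # Accept OIDs written with or without a leading dot.
    entry = configured.get(trap_oid.lstrip("."))
    if entry is None:
        return None  # unrecognized event: ignored, no alert
    name, severity = entry
    return {"source": source, "event": name, "severity": severity}
```

For example, `handle_trap("192.0.2.10", ".1.3.6.1.6.3.1.1.5.3")` yields a `linkDown` alert for that address, while a trap with an unconfigured OID returns `None`.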
Monitoring through NRPE
The NRPE protocol allows SmartReport to delegate the execution of tests to third-party equipment, most commonly a server. That equipment runs the test it is configured for and returns the result to SmartReport through the NRPE agent.
NRPE is particularly relevant when tests are specific to an application or system and cannot necessarily be performed remotely through an application server. In that case, “local” tests or scripts are run directly on the monitored equipment.
NRPE is sometimes also needed to run tests from specific points in the network rather than centrally from SmartReport, particularly across a WAN.
Using NRPE requires installing an agent on the monitored server.
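A “local” test of the kind the NRPE agent could run might look like the sketch below, which reports disk usage using the Nagios plugin convention of one line of output plus an exit code. The thresholds, function names, and message wording are hypothetical, and how the script is registered with the agent depends on the installation.

```python
import shutil
import sys

# Standard Nagios plugin exit codes used by this check.
OK, WARNING, CRITICAL = 0, 1, 2


def classify_usage(percent_used, warn=80.0, crit=90.0):
    """Map a disk usage percentage to (exit_code, one-line message)."""
    if percent_used >= crit:
        return CRITICAL, f"DISK CRITICAL - {percent_used:.1f}% used"
    if percent_used >= warn:
        return WARNING, f"DISK WARNING - {percent_used:.1f}% used"
    return OK, f"DISK OK - {percent_used:.1f}% used"


def check_disk(path="/", warn=80.0, crit=90.0):
    """Measure usage of the filesystem containing path, on the local machine."""
    usage = shutil.disk_usage(path)
    percent_used = 100.0 * usage.used / usage.total
    return classify_usage(percent_used, warn, crit)


def main():
    """Script entry point: print the status line and exit with the status code."""
    status, message = check_disk()
    print(message)
    sys.exit(status)
```

When deployed as a script that calls `main()`, the agent would execute it locally and relay the printed line and exit code back to the monitoring server.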
Upon detection of an incident (active or passive), when a threshold is reached, or simply upon a return to normal, SmartReport can send an alert email to a contact or a contact group that you define. Contacts are specific to each piece of equipment and/or each monitored service. As with Nagios, the frequency at which alerts are re-sent is customizable.
Alerts are sent by email or through “net send”. SmartReport can also send SMS messages through third-party SMS gateways (not provided). The alert modes are flexible and can be extended to anything that can be “scripted”.
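One way to picture the notification rules is the decision function below. It is a sketch under assumptions rather than SmartReport's implementation: the state names, the use of seconds for times, and the function name are hypothetical.

```python
def should_notify(previous_state, current_state, last_sent, now, renotify_interval):
    """Decide whether a notification should be sent for a monitored item.

    States are strings such as "OK", "WARNING" or "CRITICAL".
    last_sent and now are timestamps in seconds; last_sent may be None.
    """
    if current_state != previous_state:
        return True  # new incident, threshold crossed, or return to normal
    if current_state == "OK":
        return False  # nothing to repeat while everything is fine
    # Ongoing problem: re-send only once the configured interval has elapsed.
    return last_sent is None or now - last_sent >= renotify_interval
```

With a one-hour interval, an unchanged `CRITICAL` state would be re-notified every 3600 seconds, while a return to `OK` is reported once.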
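The idea of alert modes that can be “scripted” can be sketched as a set of interchangeable channel functions. Everything here is illustrative: the function names, the alert fields, and the channel table are hypothetical, and actually sending the email (for instance with `smtplib`) or calling an SMS gateway is left to each channel.

```python
from email.message import EmailMessage


def build_email(sender, recipient, host, service, state, output):
    """Build an alert email for one host/service state change."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[{state}] {service} on {host}"
    msg.set_content(
        f"Host: {host}\nService: {service}\nState: {state}\nOutput: {output}\n"
    )
    return msg


def dispatch(alert, channels):
    """Send the alert through every channel; return names of channels that failed.

    channels maps a name to any callable taking the alert, so a new alert
    mode only requires writing one more function or wrapper script.
    """
    failed = []
    for name, send in channels.items():
        try:
            send(alert)
        except Exception:
            failed.append(name)  # one broken channel must not block the others
    return failed
```

A channel could wrap an SMTP client, a command-line tool, or an HTTP call to an SMS gateway without any change to `dispatch`.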