The Uptime Infrastructure Monitor Agent monitor determines whether an agent is running on a system that you are monitoring. For a list of the currently-supported platforms, see Monitored Application Platform Support.
To configure Uptime Infrastructure Monitor Agent monitors, do the following:
The Exchange 2003 and Exchange monitors identify when certain performance counters for Microsoft Exchange servers have exceeded user-defined thresholds. These thresholds can flag, for example, an inordinately high number of inbound connections or a rapidly growing message queue. Whenever a counter exceeds its warning or critical threshold, Uptime Infrastructure Monitor generates an alert.
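The warning and critical logic can be pictured with a short sketch. This is an illustration under assumptions, not the monitor's implementation: the `Threshold` class, the `classify` function, and the sample values are hypothetical, and "exceeds" is taken to mean strictly greater than.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """User-defined limits for one performance counter (hypothetical structure)."""
    warning: float
    critical: float


def classify(value: float, threshold: Threshold) -> str:
    """Return 'CRIT', 'WARN', or 'OK' for a single counter reading."""
    # Check the critical limit first so that a very high value is not reported as WARN.
    if value > threshold.critical:
        return "CRIT"
    if value > threshold.warning:
        return "WARN"
    return "OK"


# Hypothetical limits for an inbound-connections counter.
inbound_connections = Threshold(warning=200, critical=500)
for reading in (120, 350, 800):
    print(reading, classify(reading, inbound_connections))
```

Checking the critical limit before the warning limit keeps the result unambiguous when a reading is above both.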
Use Uptime Infrastructure Monitor’s Exchange 2003 monitor if you are using and monitoring Microsoft Exchange 2000 or 2003; use the Exchange monitor for later versions (e.g., Microsoft Exchange 2007 and 2010).
To configure an Exchange 2003 monitor for your Microsoft Exchange 2000 or 2003 server, do the following:
To configure an Exchange monitor for your Microsoft Exchange 2007 or 2010 server, do the following:
The IIS (Internet Information Server) service monitor checks the performance of an IIS Web server, based on thresholds that you set against common IIS performance counters. You can use this monitor to determine whether IIS is running on a defined port, and according to the thresholds you have set on common performance counters.
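As a rough illustration of the "running on a defined port" part of this check, the following sketch tests whether anything accepts TCP connections on a given host and port. It is not how the IIS monitor is implemented; the function name and the example host are hypothetical, and an open port alone does not prove that IIS is the listener.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (hypothetical host name): port_is_open("iis01.example.com", 80)
```

A performance-counter check would still be needed on top of this to learn anything about how well the server is responding.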
To configure IIS monitors, do the following:
Non-anonymous Users / Sec.
The rate, per second, at which registered users have made non-anonymous requests to the IIS service.
IIS 6.0+ treats both an anonymous and a non-anonymous user request as a new user.
The WebLogic monitor collects data that enables you to determine whether there is a performance problem or a failure on a WebLogic application server. Using the data that the WebLogic monitor collects, you can determine the root cause of the issue by generating a report (see Reports for J2EE Applications for more information).
The WebLogic monitors collect the following metrics from a WebLogic server:
Variables | Metric | Description |
---|---|---|
Connection Pools | FailuresToReconnectCount | The number of times that the connection pool failed to reconnect to a data store. |
Connection Pools | ConnectionDelayTime | The average time that was required to connect to a connection pool. |
Connection Pools | ActiveConnectionsCurrentCount | The current number of active connections in a JDBC connection pool. |
Connection Pools | ActiveConnectionsHighCount | The highest number of active connections in a JDBC connection pool. |
Connection Pools | LeakedConnectionsCount | The total number of connections that are checked out of, but not returned to, the connection pool. |
Connection Pools | CurrCapacity | The current number of database connections in the JDBC connection pool. |
Connection Pools | NumAvailable | The number of available sessions in the session pool that are not currently used. |
Connection Pools | WaitingForConnectionCurrentCount | The current number of requests that are waiting for a connection to the connection pool. |
Per EJB | AccessTotalCount | The total number of times an attempt was made to get an EJB instance from the free pool. |
Per EJB | BeansInCurrentUseCount | The number of EJB instances in the free pool which are currently in use. |
Per EJB | CachedBeansCurrentCount | The total number of EJBs that are in the execution cache. |
Per EJB | ActivationCount | The number of EJBs that are activated. |
Other | HeapSizeCurrent | The amount of memory, in bytes, that is in the WebLogic server’s JVM heap. |
Other | HeapFreeCurrent | The current amount of free memory, in bytes, that is in the WebLogic server’s JVM heap. |
Other | OpenSocketsCurrentCount | The current number of sockets on the server that are open and receiving requests. |
Other | AcceptBacklog | The number of requests that are waiting for a TCP connection. |
Other | ExecuteThreadCurrentIdleCount | The number of threads in the server’s execution queue that are idle or which are not used to process data. |
Other | PendingRequestCurrentCount | The number of pending requests that are in the server’s execution queue. |
Other | TransactionCommittedTotalCount | The total number of transactions that are processed by the WebLogic server. |
Other | TransactionRolledBackTotalCount | The total number of transactions that are rolled back. |
Other | InvocationTotalCount | The total number of times that a servlet running on the WebLogic server was invoked. |
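
Several of these metrics are easier to read when combined. As a minimal sketch, the heap metrics HeapSizeCurrent and HeapFreeCurrent from the table can be turned into a percentage of heap in use. The function name and the sample byte values are hypothetical, and this is not a calculation that Uptime Infrastructure Monitor is stated to perform.

```python
def heap_used_percent(heap_size_current: int, heap_free_current: int) -> float:
    """Percentage of the JVM heap in use, from HeapSizeCurrent and HeapFreeCurrent (bytes)."""
    if heap_size_current <= 0:
        raise ValueError("heap size must be positive")
    if not 0 <= heap_free_current <= heap_size_current:
        raise ValueError("free heap must be between 0 and the heap size")
    used = heap_size_current - heap_free_current
    return 100.0 * used / heap_size_current


# Hypothetical readings: a 512 MiB heap with 128 MiB free.
print(heap_used_percent(512 * 1024 * 1024, 128 * 1024 * 1024))
```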
Before you can use the WebLogic monitors, you must perform additional steps outside of Uptime Infrastructure Monitor. The steps performed depend on the version of your WebLogic server: WebLogic monitoring requires that you enable the Internet Inter-Orb Protocol (IIOP) on your WebLogic server.
In order for Uptime Infrastructure Monitor to collect information from a WebLogic server, the Internet Inter-Orb Protocol (IIOP) must be enabled on your WebLogic server.
To prepare your WebLogic server for monitoring, do the following:
Enter and confirm a Default IIOP Password for the user.
The user name and password created here are used when configuring a WebLogic service monitor in Uptime Infrastructure Monitor.
To configure monitors for WebLogic, do the following:
WebSphere is a software platform that provides firms with an environment for developing and deploying Web services and E-Commerce applications. Because WebSphere is large and complex, it can be difficult to pinpoint the source of a problem, especially when that problem is intermittent.
The Uptime Infrastructure Monitor WebSphere monitor collects data that you can use to generate a report, which gives you a historical view of problems that occur on a WebSphere server. For more information about viewing WebSphere issues, see WebSphere Report.
The WebSphere monitor enables you to collect data so that you can:
The following table lists the counters the WebSphere monitor collects from a WebSphere Application Server.
Variable | Counter | Description |
---|---|---|
Connection pools | PoolSize | The size of the connection pool to the data source. |
Connection pools | FreePoolSize | The number of free connections in the pool. |
Connection pools | PercentUsed | The percentage of the connection pool that is currently in use. |
Connection pools | WaitTime | The average time, in milliseconds, that a connection is used. The average time is the difference between the time at which the connection is allocated and the time at which it is returned. |
Connection pools | CreateCount | The total number of connections that were created. |
Connection pools | CloseCount | The total number of connections that were closed. |
Connection pools | WaitingThreadCount | The number of threads that are currently waiting for a connection. |
Connection pools | UseTime | The average time, in milliseconds, that a connection is used. The average use time is the difference between the time at which the connection is allocated and the time at which it is returned. |
Per EJB | CreateCount | The number of times that the Enterprise JavaBeans that are running on the server were created. |
Per EJB | RemoveCount | The number of times that the EJBs were removed. |
Per EJB | PassivateCount | The number of times that EJBs were removed from the cache. Note that passivation preserves the state of the EJBs on the disk. |
Per EJB | MethodCallCount | The total number of method calls that were made to the EJBs. |
Per EJB | MethodResponseTime | The average response time, in milliseconds, on the bean methods. |
Java Virtual Machine | cpuUsage | The percent of CPU resources that were used since the last query. |
Java Virtual Machine | HeapSize | The total amount of memory that is available for the JVM. |
Java Virtual Machine | UsedMemory | The amount of memory that is used by the JVM. |
Other | ActiveCount | The number of global transactions which are concurrently active. |
Other | CommittedCount | The total number of global transactions that are committed. |
Other | RolledBackCount | The total number of global transactions that were rolled back. |
Other | LiveCount | The number of servlet sessions that are currently cached in memory. |
Other | PoolSize | The average number of threads in the servlet connection thread pool. |
Other | TimeSinceLastActivated | The difference, in milliseconds, between the previous and current access time stamps of a servlet session. This counter does not include session time out values. |
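
Counters such as CommittedCount and RolledBackCount are running totals, so a single reading says little about recent behavior. The sketch below assumes two successive samples and derives the share of transactions rolled back between them. The `TxSample` type, the function name, and the sample numbers are hypothetical; this is one way to interpret the counters, not a calculation the monitor is documented to perform.

```python
from typing import NamedTuple


class TxSample(NamedTuple):
    """One reading of the cumulative transaction counters."""
    committed: int    # CommittedCount
    rolled_back: int  # RolledBackCount


def rollback_ratio(previous: TxSample, current: TxSample) -> float:
    """Fraction of transactions rolled back between two samples (0.0 to 1.0)."""
    committed = current.committed - previous.committed
    rolled_back = current.rolled_back - previous.rolled_back
    if committed < 0 or rolled_back < 0:
        # Totals went backwards, for example after a server restart.
        raise ValueError("counters were reset between samples")
    total = committed + rolled_back
    return rolled_back / total if total else 0.0


# Hypothetical samples taken one polling interval apart.
print(rollback_ratio(TxSample(100, 5), TxSample(190, 15)))
```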
Before Uptime Infrastructure Monitor can start collecting performance data from a WebSphere server, you must deploy the WebSphere performance servlet.
The WebSphere performance servlet uses WebSphere’s Performance Monitor Interface (PMI) infrastructure to retrieve performance information from a WebSphere Application Server. The information that the servlet collects is saved to an XML file.
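To give a feel for consuming counters from an XML document, here is a minimal parsing sketch. The XML layout below is a simplified, hypothetical stand-in and not the performance servlet's actual schema; only the counter names PoolSize and FreePoolSize come from this document. Adjust the element and attribute names to match the real output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample; the real servlet output is structured differently.
SAMPLE_XML = """
<PerformanceMonitor>
  <Stat name="connectionPool">
    <Counter name="PoolSize" value="10"/>
    <Counter name="FreePoolSize" value="4"/>
  </Stat>
</PerformanceMonitor>
"""


def read_counters(xml_text: str) -> dict:
    """Collect every <Counter name=... value=...> element into a name -> float mapping."""
    root = ET.fromstring(xml_text)
    return {c.get("name"): float(c.get("value")) for c in root.iter("Counter")}


counters = read_counters(SAMPLE_XML)
# Share of the pool in use, derived from the two counters above.
percent_used = 100 * (counters["PoolSize"] - counters["FreePoolSize"]) / counters["PoolSize"]
print(counters, percent_used)
```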
By default, the PMI is enabled on the WebSphere server and is set to collect the performance metrics that Uptime Infrastructure Monitor supports. Before Uptime Infrastructure Monitor can begin collecting information from a WebSphere server, you must deploy the performance servlet in the WebSphere directory that contains your Web application.
The following steps must be completed for each Web application server that you want to monitor with Uptime Infrastructure Monitor.
To deploy the performance servlet, do the following:
If you are using WebSphere Application Server, you must change two settings in the WebSphere management console to avoid an Access Denied error when Uptime Infrastructure Monitor attempts to connect to the performance servlet to collect metrics.
To make the changes, do the following:
To configure a WebSphere monitor, do the following:
A Web transaction is a series of Web pages that together fulfill a specific function for end users. A common Web transaction example is the checkout process on an e-commerce site, during which end users select a shipping option, pay for their items, and have their credit card verified. During this transaction, many calls are made to the application and data layers as the end-user provides, and the servers process, information.
Although the type of Web application that is monitored by Uptime Infrastructure Monitor users is typically different (e.g., intranet applications), the structure of the transaction is the same: an end user steps through a sequence of Web pages that take inputted information and initiate appropriate actions with application or database servers.
The Uptime Infrastructure Monitor Web Application Transaction monitor tests the speed and availability of an end-user Web transaction. Specifically, the Web Application Transaction monitor performs two roles:
Both the availability and speed of Web transactions can be used in reports and as triggers for alerts.
Use the Web Application Transaction monitor to record a series of URLs that together make up a transaction. This recording should be of a transaction that acts as a suitable test of your Web application delivery infrastructure.
During the recording process, declare checkpoints that demarcate significant stages in the Web transaction. Isolating the different stages in an end-user transaction allows you to view stage-specific speed tests in reports, which ultimately helps you identify where problem areas exist.
For example, if a transaction relies on processing on the application layer, makes multiple calls to the data layer, and is accessible worldwide, creating checkpoints during the recording phase helps you ascertain whether the application server, database management server, or network may be the reason behind a poorly performing transaction.
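The idea of timing each stage separately can be sketched as follows. This is a simplified illustration, not the monitor's recorder or playback engine: the function name, the `(checkpoint, URL)` step format, and the example URLs are hypothetical, and the page fetch is passed in as a parameter so that the sketch runs without network access.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


def time_checkpoints(
    steps: Iterable[Tuple[str, str]],
    fetch: Callable[[str], object],
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, float]:
    """Play back (checkpoint, url) steps and total the seconds spent per checkpoint."""
    timings: Dict[str, float] = {}
    for checkpoint, url in steps:
        started = clock()
        fetch(url)  # a real playback might call urllib.request.urlopen here
        timings[checkpoint] = timings.get(checkpoint, 0.0) + (clock() - started)
    return timings


# Hypothetical recording: several URLs can belong to one checkpoint.
recording = [
    ("login", "https://shop.example.com/login"),
    ("cart", "https://shop.example.com/cart"),
    ("cart", "https://shop.example.com/cart/shipping"),
    ("payment", "https://shop.example.com/pay"),
]
print(time_checkpoints(recording, fetch=lambda url: None))  # stub fetch, no network
```

Grouping URLs under a checkpoint name is what makes a per-stage comparison possible later: a slow total can then be traced to one stage instead of the whole transaction.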
The following sample checkpoints could be created for an e-commerce transaction:
The following sample checkpoints could be created for an internal office transaction:
You can define Web application transactions by manually stepping through one and declaring checkpoints at key stages:
In the Uptime Infrastructure Monitor Add Service window, select the Web Application Transaction monitor, then click Continue.
The Web Application Transaction Recorder is displayed, and the monitor is now listening on port 8001 for traffic.
Ensure your browser's Java plugin is updated to the latest, most secure version.
Begin stepping through the Web transaction as an end user, providing the required data or actions.
Every URL visited during the transaction is logged and displayed in the recorder.
The Web Application Transaction monitor records all data entered during recording, including any login information. It is recommended that you use a test account for the Web application; otherwise, any user data is visible in the recorded script.
At each major step in the Web transaction that signals a new analysis point, enter a checkpoint name in the text box at the top of the window, then click Mark Checkpoint.
For example, create a checkpoint at a transaction step where the application takes user-inputted data and makes database calls.
You can later set Warning and Critical thresholds that apply to every segment declared in your recording. It is recommended that the divisions between your checkpoint intervals are reasonably consistent.
Complete the monitor information fields.
To learn how to configure monitor information fields, see Monitor Identification.
Even though the Web application performance is not directly tied to an Element’s performance, making this selection is still required: the service based on this monitor needs to be associated with an Element in order to be viewed in areas such as Global Scan or Infrastructure.
To view Web transaction performance via playback, create a Service Metrics graph for the Web Application Transaction monitor’s system. To generate a Service Metrics graph, either select the system to which the Web Application Transaction monitor is associated in Infrastructure, or the monitor itself in the main Services panel. Click the Graphics tab, then click Service Metrics.
The Service Metrics graph shows how long each transaction segment took to complete during playback, and in doing so, provides an end-to-end performance snapshot of the components of your infrastructure that deliver applications to users. For example, the following metrics graph shows that the execution of the commands found in checkpoint3 took excessively long to complete:
Because other checkpoints performed well, the poor performance of a single checkpoint indicates possible issues with a particular server, and not the network infrastructure. This theory can be further investigated by looking at the performance metrics for the server in question.
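The same reasoning can be applied to exported timings. The sketch below flags checkpoints that took several times longer than the median checkpoint. The function name, the factor of 3, and the sample timings are assumptions chosen for illustration, not values used by Uptime Infrastructure Monitor.

```python
from statistics import median
from typing import Dict, List


def slow_checkpoints(timings: Dict[str, float], factor: float = 3.0) -> List[str]:
    """Return checkpoints whose duration exceeds factor times the median duration."""
    if not timings:
        return []
    typical = median(timings.values())
    return [name for name, seconds in timings.items() if seconds > factor * typical]


# Hypothetical playback timings, in seconds.
playback = {"checkpoint1": 0.8, "checkpoint2": 1.1, "checkpoint3": 9.4, "checkpoint4": 0.9}
print(slow_checkpoints(playback))
```

If every checkpoint were slow, nothing would stand out against the median, which is consistent with the text: a uniform slowdown points toward the network or a shared component, while a single outlier points toward a particular server.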
Use the Web Application Transaction monitor’s playback script to verify which servers are used during a problem checkpoint. In the Service Instances panel, click the monitor to view the script, then locate the system that is accessed (e.g., with GET and POST commands). Use this as an investigative starting point: although an application or Web server is often referenced in the script, the problem may be found deeper in the application stack (e.g., a database server to which the referenced Web server makes calls during the checkpoint).
Your Web applications typically call on systems on the application and database tiers, as well as make use of internal- and external-facing network devices. Because the Web Application Transaction monitor directly reports on the performance of a Web transaction, it in effect indirectly reports on the health of your IT infrastructure as a whole.
This broad reporting coverage makes the Web Application Transaction monitor an ideal monitor to include in service level agreement reports.
For more information on SLA reports, see Reports for Service Level Agreements.
Although specific Uptime Infrastructure Monitor monitors are available for your POP, IMAP, and SMTP servers, their monitoring duties focus on availability and response time. To test your IT infrastructure’s ability to send or receive emails within a reasonable amount of time, use the Email Delivery monitor.
Typically, email delivery tests include a server that is part of your IT infrastructure and monitored by Uptime Infrastructure Monitor. In these cases, you must test either incoming mail delivery times by supplying information about a monitored POP3 or IMAP server, or test outgoing mail delivery times by supplying information about a monitored SMTP server.
The Email Delivery monitor executes several steps in order to calculate mail delivery and retrieval time:
Define the Email Delivery monitor by providing information about the outgoing and incoming mail servers:
Complete the monitor information fields.
To learn how to configure monitor information fields, see Monitor Identification.
Once created, the Email Delivery monitor service can be included with status reports for the system or group you select. If this monitor is reporting outgoing mail delivery times, the system should be a monitored SMTP server; if incoming mail delivery times are measured, the system should be a monitored POP3/IMAP mail server.
Destination Email Address
Enter the test email address used by the monitor. The monitor sends an email to this address, and this address is checked for receipt of the test email.
Although the Email Delivery monitor attempts to promptly find and delete test emails, network issues may prevent timely cleanups. To avoid potential Inbox clutter, it is recommended that you create a dedicated test email account as the destination address.
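The send, check, and clean-up cycle described above can be sketched with Python's standard `smtplib` and `poplib` modules. This is an illustration under assumptions, not the monitor's implementation: all function names are hypothetical, SMTP on port 25 without authentication and a POP3 mailbox are assumed, "delivery" is taken to mean the time needed to hand the message to the SMTP server, and "retrieval" the time until the message is found in the mailbox.

```python
import poplib
import smtplib
import time
import uuid
from email.message import EmailMessage


def send_test_email(smtp_host, sender, recipient, token):
    """Send a short message whose subject carries a unique token."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"delivery-test {token}"
    msg.set_content("Email delivery test message.")
    with smtplib.SMTP(smtp_host, 25, timeout=30) as smtp:
        smtp.send_message(msg)


def find_and_delete(pop_host, user, password, token):
    """Return True if a message with the token was found in the mailbox and deleted."""
    box = poplib.POP3(pop_host, timeout=30)
    try:
        box.user(user)
        box.pass_(password)
        count = len(box.list()[1])
        for number in range(1, count + 1):
            header_lines = box.top(number, 0)[1]  # headers only, as bytes
            if any(token.encode() in line for line in header_lines):
                box.dele(number)  # clean up the test email
                return True
        return False
    finally:
        box.quit()  # deletions are committed on quit


def measure_delivery(send, check, timeout=120.0, poll_interval=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Time one send/check cycle; send(token) and check(token) are supplied by the caller."""
    token = uuid.uuid4().hex
    started = clock()
    send(token)
    sent = clock()
    while clock() - sent < timeout:
        if check(token):
            return {"delivery": sent - started, "retrieval": clock() - sent}
        sleep(poll_interval)
    raise TimeoutError("test email was not retrieved before the timeout")
```

A caller would wrap the two helpers, for example `measure_delivery(lambda t: send_test_email(host, sender, recipient, t), lambda t: find_and_delete(pop_host, user, password, t))`, with every argument supplied from a dedicated test account.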
If the Email Delivery monitor reaches a Critical state, the first investigation step is to review the message produced by Uptime Infrastructure Monitor. In the System Status panel, view the message belonging to the system to which the monitor is attached, which should point you in the right direction. For example, the status message below indicates the monitor reached a critical state because the retrieval time from an external POP3 server exceeded the defined threshold; your SMTP server is most likely not responsible for the delay:
Speculation based on the status message can be confirmed using a Service Metrics graph for the Email Delivery monitor’s system. This graph indicates whether the delivery and retrieval time are within acceptable limits (below left), or if one or both are unusually long (below right):
To generate a Service Metrics graph, either select the system to which the Email Delivery monitors are associated in Infrastructure, or the monitor itself in the main Services panel. Click the Graphics tab, then click Service Metrics.
Even if the Service Metrics graph indicates delivery and retrieval times are not exceeding defined thresholds (and Uptime Infrastructure Monitor is not sending out critical alerts), it is still an ideal investigative starting point if you are getting critical feedback from your users about email delivery times.
If the Email Delivery monitor’s Service Metrics graph confirms that there are delays somewhere within your network infrastructure, you can investigate further by using the service monitor you created for your mail server. Co-ordinate your Email Delivery monitor’s metrics graphs or reports with those from a service monitor you have assigned to your mail server (e.g. Exchange) while focusing on metrics that may be related to outgoing or incoming mail time delays. For example, in the Exchange service monitor metrics graph below, the mail server experienced a high SMTP Local Queue Length that did not always coincide with the SMTP Messages Per Second count:
Splunk is a third-party search engine that indexes log files and data from the devices, servers, and applications in your network. Using Splunk, you can quickly analyze your logs to pinpoint problems on a server or in a network, or ensure that you are in compliance with a regulatory mandate or service level agreement. You install Splunk on a server in your datacenter.
When you integrate Uptime Infrastructure Monitor and Splunk, a Splunk icon appears beside any service that is in a WARN or CRIT state, when viewing the service in a monitoring or diagnostic view (e.g., the My Alerts section of My Portal, the Outages list or Status page for an Element):
Clicking this icon takes you to your Splunk search page.
You can use the Splunk Query monitor to perform Splunk queries on log files to pinpoint an error condition.
Before you can use a Splunk Query monitor, you must add settings to your Uptime Infrastructure Monitor Configuration panel that allow Uptime Infrastructure Monitor to interface with your Splunk installation. See Splunk Settings for more information.
To configure a Splunk Query monitor, do the following:
Splunk query
The Splunk query string that is used to search the log file for an error condition. You can enter any Splunk query string in this field. For example, the following searches the log files for any instances of sendmail and error in relation to the mailServer host within the last two hours: host::mailServer sendmail error hoursago::2
For more information on the syntax of Splunk queries, refer to search information found in the Splunk User Manual.
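If query strings are generated by a script instead of typed by hand, a small helper can make sure each one is bounded in time. The sketch below only reuses the `host::` and `hoursago::` syntax shown in the example above; the function name is hypothetical, and the helper does not validate anything else about Splunk's search language.

```python
from typing import Sequence


def build_splunk_query(host: str, terms: Sequence[str], hours_ago: int) -> str:
    """Build a time-bounded query in the form used in the example above."""
    if hours_ago < 1:
        raise ValueError("hours_ago must be at least 1 to keep the search bounded")
    return " ".join([f"host::{host}", *terms, f"hoursago::{hours_ago}"])


print(build_splunk_query("mailServer", ["sendmail", "error"], 2))
# prints: host::mailServer sendmail error hoursago::2
```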
To minimize the risk of the monitor timing out, avoid using open-ended queries; instead use relative time ranges for a block of time, or with the snap-to-time modifier such as |
Live Splunks are scheduled searches of Splunk queries that are saved on the Splunk server. A Live Splunk automatically runs a search, can initiate an alert, and can perform actions based on that alert. You can, for example, set up a Live Splunk to search for all critical error conditions.
The Live Splunk Listener monitor enables you to capture the information generated by a Live Splunk (from Splunk 4.x only). This monitor is very similar to the External Check monitor, and uses scripts that are bundled with Uptime Infrastructure Monitor (found in the /scripts subdirectory) to return Live Splunk information to the Monitoring Station.
To use this monitor, you must first modify the two Splunk scripts that are included with Uptime Infrastructure Monitor:
This pair of scripts takes the following options:
The following is an example of the script with all of its options specified:
alertUptimeStatusHandler.sh --message="sendmail has some traffic going through new command!" --status=2 --monitorName="Live Splunk"
Uptime Infrastructure Monitor captures the output from the script, which appears in the service status section of the Global Scan dashboard (see Understanding the Status of Services). The Uptime Infrastructure Monitor monitoring framework picks up any error codes and triggers the appropriate monitoring action.
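When the script is invoked from another program, message text that contains spaces or punctuation needs careful quoting. The sketch below assembles the argument list for the three options shown in the example above and prints a shell-safe form of it. The helper name is hypothetical, no other options are assumed, and the meaning of each status value is not interpreted here.

```python
import shlex
from typing import List


def handler_args(message: str, status: int, monitor_name: str,
                 script: str = "alertUptimeStatusHandler.sh") -> List[str]:
    """Build the argument list for the handler script using the options from the example."""
    return [
        script,
        f"--message={message}",
        f"--status={status}",
        f"--monitorName={monitor_name}",
    ]


args = handler_args("sendmail has some traffic going through new command!", 2, "Live Splunk")
# shlex.join quotes each argument so that the line can be pasted into a POSIX shell.
print(shlex.join(args))
```

Passing `args` as a list to `subprocess.run` would avoid shell quoting altogether, because each element reaches the script as a single argument.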
Before you can monitor Live Splunks generated on a Splunk server, you must do the following:
Find the following entry in the file:
host = "uptime-host"
port = "9996"
To configure a Live Splunk Listener monitor, do the following: