Splunk Query

Splunk is a third-party search engine that indexes log files and data from the devices, servers, and applications in your network. Using Splunk, you can quickly analyze your logs to pinpoint problems on a server or in a network, or to verify that you are in compliance with a regulatory mandate or service level agreement. You install Splunk on a server in your datacenter.

When you integrate Uptime Infrastructure Monitor and Splunk, a Splunk icon appears beside any service that is in a WARN or CRIT state when you view the service in a monitoring or diagnostic view (e.g., the My Alerts section of My Portal, the Outages list, or the Status page for an Element).

Clicking this icon takes you to your Splunk search page.

You can use the Splunk Query monitor to perform Splunk queries on log files to pinpoint an error condition.

Info

Before you can use a Splunk Query monitor, you must add settings to your Uptime Infrastructure Monitor Configuration panel that allow Uptime Infrastructure Monitor to interface with your Splunk installation. See Splunk Settings for more information.

Configuring Splunk Query Monitors

To configure a Splunk Query monitor, do the following:

  1. Complete the monitor information fields.
    See Monitor Identification for more information.
  2. Complete the Splunk Query Settings:
    • Splunk query
      The Splunk query string that is used to search the log file for an error condition. You can enter any Splunk query string in this field. For example, the following searches the log files for any instances of sendmail and error in relation to the mailServer host within the last two hours:
      host::mailServer sendmail error hoursago::2
      For more information on the syntax of Splunk queries, refer to search information found in the Splunk User Manual.

      Info

      To minimize the risk of the monitor timing out, avoid open-ended queries. Instead, restrict the search to a block of time with a relative time range, or with a snap-to-time modifier such as -1d@d (yesterday, starting from 12:00:00 AM).

    • Result count of splunk query
      Enables Uptime Infrastructure Monitor to alert you when the number of results that match your Splunk query exceeds the defined warning and critical thresholds.
      For example, you can configure the monitor to issue a Warning alert when five or more Splunk results matching your query are returned, and a Critical alert when 10 or more results for your query are returned.
    • Response Time
      Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds.
  3. To save the result-count or response-time data for graphing or reporting, click the Save for Graphing checkbox beside the appropriate option(s).
  4. Complete the following settings:
  5. Click Finish.
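
The query and threshold settings above can be sketched in Python. This is only an illustration, not how Uptime Infrastructure Monitor evaluates the monitor: the helper names bounded_query and classify_result_count are hypothetical, the earliest and latest time modifiers are assumed from standard Splunk search syntax (your Splunk version may expect a different form), and the thresholds of 5 and 10 mirror the example in the text.

```python
def bounded_query(terms: str, earliest: str = "-1d@d", latest: str = "@d") -> str:
    """Append an explicit time window to a Splunk search string.

    The defaults cover yesterday, from midnight to midnight, so the
    search is never open-ended.
    """
    return f"{terms} earliest={earliest} latest={latest}"


def classify_result_count(count: int, warning: int = 5, critical: int = 10) -> str:
    """Map a result count to a state using warning and critical thresholds."""
    if count >= critical:
        return "CRIT"
    if count >= warning:
        return "WARN"
    return "OK"


# Hypothetical usage: the search terms come from the example in the text.
query = bounded_query("host::mailServer sendmail error")
state = classify_result_count(7)
```

Checking the critical threshold first keeps the two ranges from overlapping, so a count of 10 or more is never reported as a warning.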

Live Splunk Listener

Live Splunks are scheduled searches of Splunk queries that are saved on the Splunk server. A Live Splunk automatically runs a search, can initiate an alert, and can perform actions based on that alert. You can, for example, set up a Live Splunk to search for all critical error conditions.

The Live Splunk Listener monitor enables you to capture the information generated by a Live Splunk (from Splunk 4.x only). This monitor is very similar to the External Check monitor, and uses scripts that are bundled with Uptime Infrastructure Monitor (found in the /scripts subdirectory) to return Live Splunk information to the Monitoring Station.

To use this monitor, you must first modify the two Splunk scripts that are included with Uptime Infrastructure Monitor:

  • alertUptimeStatusHandler.sh
  • alertUptime.py

This pair of scripts takes the following options:

  • --message
    A message that is returned to the Uptime Infrastructure Monitor Monitoring Station. For example, if the Live Splunk is configured to search for warning conditions, you can enter the message “Changed to WARN”.
  •  --status
    The script can return the following status codes:
    •  0 - OK
      The services are functioning properly.
    •  1 - Warning
      There is a potential problem with one or more of the monitored services.
    •  2 - Critical
      There is a critical problem with one or more of the monitored services.
    •  3 - Unknown
      There is an error in the configuration of the monitor itself, or Uptime Infrastructure Monitor cannot execute the service check.
  •  --monitorName
    The name of the Uptime Infrastructure Monitor monitor to which the information from the Live Splunk is directed.
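
The four status codes above can be modeled as a small enumeration. The following is a sketch for illustration only: ServiceStatus and parse_status are hypothetical names and are not part of the bundled scripts.

```python
from enum import IntEnum


class ServiceStatus(IntEnum):
    """Status codes accepted by the --status option, as listed above."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def parse_status(value: str) -> ServiceStatus:
    """Convert a --status argument to a ServiceStatus, rejecting other values."""
    try:
        return ServiceStatus(int(value))
    except ValueError as exc:
        # Raised both for non-numeric text and for numbers outside 0-3.
        raise ValueError(f"status must be 0, 1, 2, or 3, got {value!r}") from exc
```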

The following is an example of the script with all of its options specified:

alertUptimeStatusHandler.sh --message="sendmail has some traffic going through new command!" \
                            --status=2 --monitorName="Live Splunk"
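
If you generate this command from another program, building it as an argument list avoids quoting mistakes in the message text. The sketch below assumes the --name=value option form shown in the example; build_alert_command is a hypothetical helper, and how the script itself parses its options has not been verified here.

```python
import shlex
from typing import List


def build_alert_command(message: str, status: int, monitor_name: str,
                        script: str = "alertUptimeStatusHandler.sh") -> List[str]:
    """Build the argument list for the status handler script."""
    if status not in (0, 1, 2, 3):
        raise ValueError("status must be 0 (OK), 1 (Warning), 2 (Critical), or 3 (Unknown)")
    return [
        script,
        f"--message={message}",
        f"--status={status}",
        f"--monitorName={monitor_name}",
    ]


argv = build_alert_command(
    "sendmail has some traffic going through new command!", 2, "Live Splunk"
)
# shlex.join produces a shell-safe, copy-and-paste form of the same command.
print(shlex.join(argv))
```

The list form can be passed directly to subprocess.run without involving a shell, so spaces and punctuation in the message need no extra escaping.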

Uptime Infrastructure Monitor captures the output from the script, which appears in the service status section of the Global Scan dashboard (see Understanding the Status of Services). The Uptime Infrastructure Monitor monitoring framework picks up any error codes and triggers the appropriate monitoring action.

Before You Begin

Before you can monitor Live Splunks generated on a Splunk server, you must do the following:

  1. Edit the alertUptime.py script to point to the Uptime Infrastructure Monitor Monitoring Station:
    • Navigate to the /scripts directory on the Monitoring Station.
    • Open the file alertUptime.py in a text editor.
    • Find the following entry in the file:

      host = "uptime-host"
      port = "9996"
    • Change the values for host and port to the host name and port of the Monitoring Station.
    • Save and close the file.
  2. Edit the alertUptimeStatusHandler.sh script to configure how the Live Splunk is reported on the Monitoring Station:
    • Open alertUptimeStatusHandler.sh (found in the /scripts directory on the Monitoring Station) in a text editor.
    • For the message option, enter a diagnostic message that accompanies a Live Splunk captured by the Uptime Infrastructure Monitor service monitor.
    • For the status option, enter the status of the monitored service.
    • For the monitorName option, enter the name of the service monitor that is listening to the Live Splunk.
    • Save and close the file.
  3. Copy the alertUptimeStatusHandler.sh and alertUptime.py scripts from the Monitoring Station’s /scripts directory to the /data/splunk/bin/scripts directory on the Splunk server.
  4. Configure a Live Splunk. For information on configuring Live Splunks, see the Splunk user manual.
    When setting up your Live Splunk, select the Run the shell script option on the configuration page. Then, enter the path to alertUptimeStatusHandler.sh in the field.
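
The edit to alertUptime.py in step 1 can also be scripted when several copies need the same change. The following is a minimal sketch that assumes the file contains exactly one host and one port assignment in the quoted form shown above; set_monitoring_station and the host name monitor01.example.com are hypothetical.

```python
import re


def set_monitoring_station(script_text: str, host: str, port: int) -> str:
    """Return script_text with the host and port assignments replaced."""
    updated, n_host = re.subn(
        r'(?m)^(\s*host\s*=\s*)"[^"]*"',
        lambda m: f'{m.group(1)}"{host}"',
        script_text,
    )
    updated, n_port = re.subn(
        r'(?m)^(\s*port\s*=\s*)"[^"]*"',
        lambda m: f'{m.group(1)}"{port}"',
        updated,
    )
    # Refuse to guess if the file does not look the way this sketch assumes.
    if n_host != 1 or n_port != 1:
        raise ValueError("expected exactly one host and one port assignment")
    return updated


# The sample text mirrors the entry shown in step 1.
sample = 'host = "uptime-host"\nport = "9996"\n'
print(set_monitoring_station(sample, "monitor01.example.com", 9996))
```

To apply it to the real file, read alertUptime.py into a string, pass it through the function, and write the result back after keeping a backup copy.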

Configuring the Live Splunk Listener Monitor

To configure a Live Splunk Listener monitor, do the following:

  1. Complete the monitor information fields.
    To learn how to configure monitor information fields, see Monitor Identification.
  2. Complete the following settings:
  3. Click Finish.