Virtual systems restarted (uptime low)-cisco-nxos

Vendor: cisco

OS: nxos

Description:
Indeni will alert when a virtual system has restarted.

Remediation Steps:
Determine why the virtual system(s) restarted.
1. Use the “show version” or “show system reset-reason” NX-OS command to display the reason for the reload.
2. Use the “show cores” command to determine whether a core file was recorded during the unexpected reboot.
3. Run the “show process log” command to list the processes and check whether a core was created.
4. Use the “show logging” command to review the events that occurred close to the time of the reboot.

How does this work?
This script logs in to the Cisco Nexus switch using SSH and retrieves the output of the “show version” command. The output includes the device’s uptime as well as additional hardware and software related details.
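As a rough illustration of the parsing step, the sketch below converts the uptime line of "show version" into milliseconds. It is not the actual show_version.parser.1.awk parser. The function name `uptime_milliseconds` is hypothetical, and the line format "Kernel uptime is N day(s), N hour(s), N minute(s), N second(s)" is an assumption about the NX-OS output.

```python
import re

# Assumed format of the uptime line in NX-OS "show version" output.
UPTIME_RE = re.compile(
    r"Kernel uptime is (\d+) day\(s\), (\d+) hour\(s\), "
    r"(\d+) minute\(s\), (\d+) second\(s\)"
)


def uptime_milliseconds(show_version_output):
    """Return the uptime in milliseconds, or None if no uptime line is found."""
    match = UPTIME_RE.search(show_version_output)
    if match is None:
        return None
    days, hours, minutes, seconds = (int(group) for group in match.groups())
    total_seconds = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    return total_seconds * 1000


sample = "Kernel uptime is 0 day(s), 0 hour(s), 12 minute(s), 30 second(s)"
print(uptime_milliseconds(sample))  # 750000
```

Returning None when no uptime line is found lets the caller skip the sample instead of reporting a misleading uptime of zero.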

Why is this important?
Capture the uptime of the device. If the uptime is lower than the previous sample, the device must have reloaded.
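The comparison described here can be sketched as follows. This is a minimal illustration of the idea that a drop in uptime between two consecutive samples implies a reload. The helper name `reloaded_between` and the sample values are hypothetical.

```python
def reloaded_between(previous_ms, current_ms):
    """A device that kept running can only show a growing uptime."""
    return current_ms < previous_ms


# Hypothetical uptime samples in milliseconds, taken 5 minutes apart.
samples = [3_600_000, 3_900_000, 120_000, 420_000]

# Indexes of samples whose uptime is lower than the one before them.
reload_points = [
    index
    for index in range(1, len(samples))
    if reloaded_between(samples[index - 1], samples[index])
]
print(reload_points)  # [2]
```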

Without Indeni how would you find this?
The administrator would have to manually log in to the Nexus switch and run the “show version” NX-OS command to review the uptime and see when and why the device was last reloaded.

nexus-show-version

name: nexus-show-version
description: Nexus show version
type: monitoring
monitoring_interval: 5 minutes
requires:
    vendor: cisco
    os.name: nxos
comments:
    uptime-milliseconds:
        why: |
            Capture the uptime of the device. If the uptime is lower than the previous sample, the device must have reloaded.
        how: |
            This script logs in to the Cisco Nexus switch using SSH and retrieves the output of the "show version" command. The output includes the device's uptime as well as additional hardware and software related details.
        without-indeni: |
            The administrator would have to manually log in to the Nexus switch and run the "show version" NX-OS command to review the uptime and see when and why the device was last reloaded.
        can-with-snmp: true
        can-with-syslog: true
    vendor:
        why: |
            Capture the device vendor name.
        how: |
            This script logs in to the Cisco Nexus switch using SSH and retrieves the output of the "show version" command. The output includes the device's hardware and software related details.
        without-indeni: |
            The administrator would have to manually log into the Nexus switch and run the "show version" NX-OS command to capture hardware/software information such as vendor name, OS name, OS version and hostname.
        can-with-snmp: true
        can-with-syslog: false
    os-name:
        why: |
            Capture the device operating system name.
        how: |
            This script logs in to the Cisco Nexus switch using SSH and retrieves the output of the "show version" command. The output includes the device's hardware and software related details.
        without-indeni: |
            The administrator would have to manually log into the Nexus switch and run the "show version" NX-OS command to capture hardware/software information such as vendor name, OS name, OS version and hostname.
        can-with-snmp: true
        can-with-syslog: false
    os-version:
        why: |
            Capture the device operating system version. The version should be the same across all members of a vPC (cluster).
        how: |
            This script logs in to the Cisco Nexus switch using SSH and retrieves the output of the "show version" command. The output includes the device's hardware and software related details.
        without-indeni: |
            The administrator would have to manually log into the Nexus switch and run the "show version" NX-OS command to capture hardware/software information such as vendor name, OS name, OS version and hostname.
        can-with-snmp: true
        can-with-syslog: false
    hostname:
        why: |
            Capture the device hostname.
        how: |
            This script logs in to the Cisco Nexus switch using SSH and retrieves the output of the "show version" command. The output includes the device's hardware and software related details.
        without-indeni: |
            The administrator would have to manually log into the Nexus switch and run the "show version" NX-OS command to capture hardware/software information such as vendor name, OS name, OS version and hostname.
        can-with-snmp: true
        can-with-syslog: false
steps:
-   run:
        type: SSH
        command: show version
    parse:
        type: AWK
        file: show_version.parser.1.awk

cross_vendor_uptime_low_vsx

// Deprecation warning : Scala template-based rules are deprecated. Please use YAML format rules instead.

package com.indeni.server.rules.library.templatebased.crossvendor

import com.indeni.apidata.time.TimeSpan
import com.indeni.apidata.time.TimeSpan.TimePeriod
import com.indeni.server.rules.RuleContext
import com.indeni.server.rules.library.templates.TimeThresholdOnDoubleMetricWithItemsTemplateRule
import com.indeni.server.sensor.models.managementprocess.alerts.dto.AlertSeverity
import com.indeni.server.rules.ThresholdDirection
import com.indeni.server.rules.RemediationStepCondition

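/**
  * Alerts when the "uptime-milliseconds" metric of a virtual system is below
  * 60 minutes, which indicates that the virtual system has restarted recently.
  */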
case class cross_vendor_uptime_low_vsx() extends TimeThresholdOnDoubleMetricWithItemsTemplateRule(
      ruleName = "cross_vendor_uptime_low_vsx",
      ruleFriendlyName = "All Devices (VSX): Virtual systems restarted (uptime low)",
      ruleDescription = "Indeni will alert when a virtual system has restarted.",
      severity = AlertSeverity.CRITICAL,
      metricName = "uptime-milliseconds",
      threshold = TimeSpan.fromMinutes(60),
      metricUnits = TimePeriod.MILLISECOND,
      thresholdDirection = ThresholdDirection.BELOW,
      applicableMetricTag = "vs.name",
      alertItemsHeader = "Affected Virtual Systems",
      alertItemDescriptionFormat = "The current uptime is %.0f seconds which seems to indicate the virtual system has restarted.",
      alertItemDescriptionUnits = TimePeriod.SECOND,
      alertDescription = "Some virtual systems on this device have restarted recently. Review the list below.",
      baseRemediationText = "Determine why the virtual system(s) was restarted."
    )(
      RemediationStepCondition.VENDOR_CISCO ->
        """|
       |1. Use the "show version" or "show system reset-reason" NX-OS commands to display the reason for the reload
       |2. Use the "show cores" command to determine if a core file was recorded during the unexpected reboot
       |3. Run the "show process log" command to display the processes and if a core was created.
       |4. With the "show logging" command, review the events that happened close to the time of reboot
    """.stripMargin
    )
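
The threshold check configured in the rule above can be sketched in Python as follows. This is only an illustration of the comparison, under the assumption that each virtual system is flagged when its uptime is below the 60 minute threshold. It is not the implementation of TimeThresholdOnDoubleMetricWithItemsTemplateRule, and the function name `recently_restarted` and the virtual system names are hypothetical.

```python
# Threshold from the rule: 60 minutes, expressed in milliseconds.
THRESHOLD_MS = 60 * 60 * 1000


def recently_restarted(uptime_ms_by_vs, threshold_ms=THRESHOLD_MS):
    """Map each virtual system below the threshold to its uptime in seconds."""
    return {
        name: uptime_ms / 1000
        for name, uptime_ms in sorted(uptime_ms_by_vs.items())
        if uptime_ms < threshold_ms
    }


# Hypothetical "uptime-milliseconds" samples keyed by the "vs.name" tag.
samples = {"vs-a": 750_000, "vs-b": 86_400_000}

for name, seconds in recently_restarted(samples).items():
    print("%s: The current uptime is %.0f seconds" % (name, seconds))
# vs-a: The current uptime is 750 seconds
```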