errpt disk errors

SC_DISK_PCM_ERR1 Subsystem Component Failure

The storage subsystem has returned an error indicating that some component (hardware or software) of the storage subsystem has failed. The detailed sense data identifies the failing component and the recovery action that is required. Failing hardware components should also be shown in the Storage Manager software, so the placement of these errors in the error log is advisory and is an aid for your technical-support representative.

SC_DISK_PCM_ERR2 Array Active Controller Switch

The active controller for one or more hdisks associated with the storage subsystem has changed. This is in response to some direct action by the AIX host (failover or autorecovery). This message is associated with either a set of failure conditions causing a failover or, after a successful failover, with the recovery of paths to the preferred controller on hdisks with the autorecovery attribute set to yes.

SC_DISK_PCM_ERR3 Array Controller Switch Failure

An attempt to switch active controllers has failed. This leaves one or more hdisks with no working path to a controller. The AIX MPIO PCM retries the switch several times in an attempt to find a working path to a controller.

SC_DISK_PCM_ERR4 Array Configuration Changed

The active controller for an hdisk has changed, usually due to an action not initiated by this host. Possible causes include another host initiating failover or recovery (for shared LUNs), a redistribute operation from the Storage Manager software, a change to the preferred path in the Storage Manager software, a controller being taken offline, or any other action that changes active controller ownership.

SC_DISK_PCM_ERR5 Array Cache Battery Drained

The storage subsystem cache battery has drained. Any data remaining in the cache is being dumped and remains vulnerable to loss until the dump completes. Caching is not normally allowed with drained batteries unless the administrator takes action to enable it within the Storage Manager software.

SC_DISK_PCM_ERR6 Array Cache Battery Charge Is Low

The storage subsystem cache batteries are low and need to be charged or replaced.

SC_DISK_PCM_ERR7 Cache Mirroring Disabled

Cache mirroring is disabled on the affected hdisks. Normally, any cached write data is kept within the cache of both controllers so that if either controller fails there is still a good copy of the data. This is a warning message stating that loss of a single controller will result in data loss.

SC_DISK_PCM_ERR8 Path Has Failed

The I/O path to a controller has failed or gone offline.

SC_DISK_PCM_ERR9 Path Has Recovered

The I/O path to a controller has resumed and is back online.

SC_DISK_PCM_ERR10 Array Drive Failure

A physical drive in the storage array has failed and should be replaced.

SC_DISK_PCM_ERR11 Reservation Conflict

A PCM operation has failed due to a reservation conflict. This error is not currently issued.

SC_DISK_PCM_ERR12 Snapshot™ Volume’s Repository Is Full

The snapshot volume repository is full. Write actions to the snapshot volume will fail until the repository problems are fixed.

SC_DISK_PCM_ERR13 Snapshot Op Stopped By Administrator

The administrator has halted a snapshot operation.

SC_DISK_PCM_ERR14 Snapshot Repository Metadata Error

The storage subsystem has reported that there is a problem with snapshot metadata.

SC_DISK_PCM_ERR15 Illegal I/O – Remote Volume Mirroring

The I/O is directed to an illegal target that is part of a remote volume mirroring pair (the target volume rather than the source volume).

SC_DISK_PCM_ERR16 Snapshot Operation Not Allowed

A snapshot operation that is not allowed has been attempted.

SC_DISK_PCM_ERR17 Snapshot Volume’s Repository Is Full

The snapshot volume repository is full. Write actions to the snapshot volume will fail until the repository problems are fixed.

SC_DISK_PCM_ERR18 Write Protected

The hdisk is write-protected. This can happen if a snapshot volume repository is full.

SC_DISK_PCM_ERR19 Single Controller Restarted

The I/O to a single-controller storage subsystem is resumed.

SC_DISK_PCM_ERR20 Single Controller Restart Failure

The I/O to a single-controller storage subsystem is not resumed. The AIX MPIO PCM will continue to attempt to restart the I/O to the storage subsystem.
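
To see which of the labels above actually appear in the error log, the detailed report from errpt -a can be tallied by its LABEL: lines. The following is a minimal sketch, not part of the original notes: count_pcm_labels is a hypothetical helper name, and it assumes each detailed entry carries a line of the form "LABEL: SC_DISK_PCM_ERRn".

```shell
# Hypothetical helper: tally SC_DISK_PCM_ERR* labels from `errpt -a` style
# text on stdin. Assumes every entry has a "LABEL:  <name>" line.
count_pcm_labels() {
  awk '$1 == "LABEL:" && $2 ~ /^SC_DISK_PCM_ERR[0-9]+$/ { n[$2]++ }
       END { for (l in n) printf "%s %d\n", l, n[l] }' | sort
}
```

On an AIX host this would be run as errpt -a | count_pcm_labels. To look at a single label in detail, errpt -a -J SC_DISK_PCM_ERR8 restricts the report to that label.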


AIX 2020 PCMPATH Replacement

With AIX PCM, we no longer have pcmpath or datapath.
Most of the info you’d want for disks is available from lsmpio or lspath.
Some adapter-level queries have no direct equivalent, so here is a function that generates a simulated adapter summary.
It is not super efficient (it reruns lspath and lsmpio for every adapter), but it’s on par with the kind of things many AIX scripts do.

pcmpath_query() {
  printf "%-8s %8s %15s %6s %6s %6s\n" Adapter Status Selects Errors Paths Failed
  for fscsi in $(lsdev -C | grep fscsi | cut -f 1 -d ' ') ; do
    # Count Enabled and non-Enabled paths that go through this adapter
    enabled=$(lspath -p "$fscsi" | grep -c Enabled)
    failed=$(lspath -p "$fscsi" | grep -vc Enabled)
    if [[ $(( enabled + failed )) -eq 0 ]] ; then status=UNUSED
    elif [[ $failed -eq 0 ]] ; then status=NORMAL
    elif [[ $enabled -eq 0 ]] ; then status=FAILED
    else status=DEGRADED ; fi
    # Parent fcs adapter; the sum of its fcstat request counters stands in for "Selects"
    fcs=$(lsdev -CFparent -l "$fscsi")
    selects=$(fcstat "$fcs" | grep Requests | tr -d ' ' | cut -f 2 -d : | paste -sd+ - | bc)
    errors=$(lsmpio -ae | grep -p "$fscsi" | grep Total | tr -d ' ' | cut -f 2 -d : )
    # Number of remote-port rows (lines containing 0x) listed for this adapter
    paths=$(lsmpio -ar | grep -p "$fscsi" | grep -c 0x)
    # Sum columns 3-5 of each remote-port row for this adapter
    failed=$(lsmpio -ar | grep -p "$fscsi" | grep 0x | awk '{print $3 "\n" $4 "\n" $5;}' | paste -sd+ - | bc)
    printf "%-8s %8s %15s %6s %6s %6s\n" "$fscsi" "$status" "$selects" "$errors" "$paths" "$failed"
  done
}
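
The status decision and the counter summing in pcmpath_query can be pulled out into small pure functions, which makes them easy to check on a machine without lspath or fcstat. This is only a sketch: adapter_status and sum_requests are hypothetical names, and sum_requests swaps the paste/bc pipeline for awk, which assumes the counters fit in awk's numeric range.

```shell
# adapter_status ENABLED FAILED -> prints UNUSED, NORMAL, FAILED or DEGRADED.
# Hypothetical helper mirroring the if/elif chain in pcmpath_query.
adapter_status() {
  if [ $(( $1 + $2 )) -eq 0 ]; then echo UNUSED
  elif [ "$2" -eq 0 ]; then echo NORMAL
  elif [ "$1" -eq 0 ]; then echo FAILED
  else echo DEGRADED
  fi
}

# sum_requests: add up the value after the colon on every line containing
# "Requests" in the text read from stdin (for example, fcstat output).
sum_requests() {
  awk -F: '/Requests/ { gsub(/ /, "", $2); total += $2 } END { print total + 0 }'
}
```

For example, adapter_status 2 2 prints DEGRADED, and fcstat "$fcs" | sum_requests would yield the same total that the function above stores in selects.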