OS/400 / i5/OS / IBM i: remove stale LUN paths

This may need to happen if you remove and re-add an NPIV mapping, if the SAN topology changes, or if you have removed maps, ports, cables, or entire LUNs.
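Before changing anything, it is worth capturing the current path counts so you have a before/after picture. From memory (menu numbers may differ by release): STRSST, then 3. Work with disk units, 1. Display disk configuration, 9. Display disk path status.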

1. Change the configuration on the array to remove the extra ports, etc.
2. Physically remove any fibre cables if needed.
3. IPL the server.
4. Run the STRSST command, then press ENTER.
5. Option 1, Start a service tool, then press ENTER.
6. Option 4, Display/Alter/Dump, then press ENTER.
7. Option 1, Display/Alter storage, then press ENTER.
8. Option 2, Licensed Internal Code (LIC) data, then press ENTER.
9. Option 14, Advanced analysis (scroll down to see this option), then press ENTER.
a. Scroll down and type 1 next to MULTIPATHRESETTER, then press ENTER.
b. For Options, type -RESETMP -ALL -CONFIRM, then press ENTER.

DISPLAY/ALTER/DUMP
Running macro: MULTIPATHRESETTER -CONFIRM -ALL
Reset the paths for Multiple Connections

********************************************************************
***CONFIRM RESET MULTIPATH UNIT PATHS TO NUMBER CURRENTLY ENLISTED***
********************************************************************

This service function should be run only under the direction of
IBM Hardware Service Support.

You have selected to reset the number of paths on a multipath unit
to equal the number of paths that have currently enlisted.

Attempting to reset path for resource name: DMP004
Attempting to reset path for resource name: DMP075
...

*** Your request completed successfully ***
The number of paths connected to your multipath unit have been reset
to match the number of paths that are currently enlisted.
NOTE: If ALL paths are removed, the disk resource name will still show as DMPxxx rather than DDxxxx.
NOTE: DMPxxx resources for removed paths that remain in Hardware Service Manager will need manual cleanup.
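For that manual cleanup, the usual route (again from memory, so verify on your release) is Hardware Service Manager: STRSST, then 1. Start a service tool, 7. Hardware service manager, 4. Failed and non-reporting hardware resources, and option 4=Remove next to the stale DMPxxx entries.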

REF: https://www.ibm.com/support/pages/reducing-or-removing-paths-multipath-lun
REF: https://www.ibm.com/support/pages/san-disks-missing-paths


PowerHA holds my disks

I did some testing and needed to document the command syntax, even though I was not successful.
node01 / node02 – cannot remove EMC disks
Apps are stopped.

The fuser command will not detect processes holding mmap regions whose associated file descriptor has since been closed.

lsof | grep hdisk   ### nothing
fuser -fx /dev/hdisk2   ### nothing
fuser -d /dev/hdisk2   ### nothing
sudo filemon -O all -o 2.trc ; sleep 10 ; sudo trcstop   ### only shows the hottest 2 disks
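A few more places worth looking when the userspace tools come up empty (a sketch; lscluster is part of CAA, so it should be present wherever PowerHA 7.x is):

sudo lsvg -o                ### any varied-on VG still referencing the disk?
sudo lspv | grep hdisk2     ### PVID / VG membership as the ODM sees it
sudo lspath -l hdisk2       ### MPIO path states for the disk
sudo lscluster -d           ### does CAA still count the disk as cluster storage?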

### Cannot remove disks after removing them from HA; this is related to this defect:
http://www-01.ibm.com/support/docview.wss?uid=isg1IV65140
/usr/es/sbin/cluster/events/utils/cl_vg_fence_term -c vgname

In PowerHA 7.1.3, with the shared VG varied off and the
disk in the closed state, rmdev may fail and return a
busy error, e.g.:

# rmdev -dl hdisk2
Method error (/usr/lib/methods/ucfgdevice):
0514-062 Cannot perform the requested function because
         the specified device is busy.
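As I read IV65140, the suggested workaround is to tear down the VG's fence group first and then retry the remove. Roughly (vgname is a placeholder; as documented below, this did not work for me):

sudo /usr/es/sbin/cluster/events/utils/cl_vg_fence_term -c vgname   ### drop the VG's fence group
sudo rmdev -dl hdisk2                                               ### then retry the remove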

# cl_set_vg_fence_height
Usage: cl_set_vg_fence_height [-c]  [rw|ro|na|ff]

JDSD NOTE: The levels are:
* rw = read/write
* ro = read-only
* na = no access
* ff = fail access
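Going by that usage string, the invocation is just the VG name plus the level, e.g. (journalvg as used later in these notes):

sudo /usr/es/sbin/cluster/events/utils/cl_set_vg_fence_height -c journalvg rw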

jdsd@node01  /home/jdsd
$ sudo ls -laF /usr/es/sbin/cluster/events/utils/cl*fence*
-rwxr--r--    1 root     system        12832 Nov  7 2013  /usr/es/sbin/cluster/events/utils/cl_fence_vg*
-rwxr--r--    1 root     system        15624 Nov  7 2013  /usr/es/sbin/cluster/events/utils/cl_set_vg_fence_height*
-r-x------    1 root     system         5739 Nov  7 2013  /usr/es/sbin/cluster/events/utils/cl_ssa_fence*
-rwxr--r--    1 root     system        22508 Nov  7 2013  /usr/es/sbin/cluster/events/utils/cl_vg_fence_init*
-rwxr--r--    1 root     system         4035 Feb 26 2015  /usr/es/sbin/cluster/events/utils/cl_vg_fence_redo*
-rwxr--r--    1 root     system        15179 Oct 21 2014  /usr/es/sbin/cluster/events/utils/cl_vg_fence_term*


jdsd@node01  /home/jdsd
$ sudo ls -laF /usr/es/sbin/cluster/cspoc/cl*disk*
-r-x------    1 root     system       109726 Feb 26 2015  /usr/es/sbin/cluster/cspoc/cl_diskreplace*
-rwxr-xr-x    1 root     system        20669 Nov  7 2013  /usr/es/sbin/cluster/cspoc/cl_getdisk*
-r-x------    1 root     system       105962 Feb 26 2015  /usr/es/sbin/cluster/cspoc/cl_lsreplacementdisks*
-r-x------    1 root     system       103433 Feb 26 2015  /usr/es/sbin/cluster/cspoc/cl_lsrgvgdisks*
-rwxr-xr-x    1 root     system        12259 Feb 26 2015  /usr/es/sbin/cluster/cspoc/cl_pviddisklist*
-rwxr-xr-x    1 root     system         4929 Nov  7 2013  /usr/es/sbin/cluster/cspoc/cl_vg_non_dhb_disks*


jdsd@node01  /home/jdsd
$ sudo /usr/es/sbin/cluster/cspoc/cl_lsrgvgdisks
#Volume Group   hdisk    PVID             Node                          Cluster Nodes Resource Group
#---------------------------------------------------------------------------------------------------
caavg_private   hdisk38  00deadbeefcaff53 node01                        node01,node02 
datavg          hdisk22  00deadbeefca8643 node02                        node01,node02 demo_rg
datavg          hdisk23  00deadbeefca86f9 node02                        node01,node02 demo_rg
datavg          hdisk24  00deadbeefca8752 node02                        node01,node02 demo_rg
datavg          hdisk25  00deadbeefca87ac node02                        node01,node02 demo_rg
datavg          hdisk26  00deadbeefca880e node02                        node01,node02 demo_rg
datavg          hdisk27  00deadbeefca886c node02                        node01,node02 demo_rg
datavg          hdisk28  00deadbeefca88d7 node02                        node01,node02 demo_rg
datavg          hdisk29  00deadbeefca8965 node02                        node01,node02 demo_rg
datavg          hdisk30  00deadbeefca89c5 node02                        node01,node02 demo_rg
datavg          hdisk31  00deadbeefca8a52 node02                        node01,node02 demo_rg
datavg          hdisk32  00deadbeefca8ad2 node02                        node01,node02 demo_rg
datavg          hdisk33  00deadbeefca8b50 node02                        node01,node02 demo_rg
datavg          hdisk34  00deadbeefca8c26 node02                        node01,node02 demo_rg
datavg          hdisk35  00deadbeefca8c9a node02                        node01,node02 demo_rg
datavg          hdisk36  00deadbeefca8cf7 node02                        node01,node02 demo_rg
journalvg       hdisk37  00deadbeefca8d53 node02                        node01,node02 demo_rg


jdsd@node01  /home/jdsd
$ sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
Disk name:                      hdisk2
Disk UUID:                      1edeadbeefcafe04 b512d9e3b580fb13
Fence Group UUID:               0000000000000000 0000000000000000 - Not in a Fence Group
Disk device major/minor number: 18, 2
Fence height:                   2 (Read/Only)
Reserve mode:                   0 (No Reserve)
Disk Type:                      0x01 (Local access only)
Disk State:                     32785

This is a concurrent VG, so an update made on node02 shows up on node01.
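A quick way to compare the fence state across nodes (a sketch; assumes the same hdisk numbering on both, and uses the cl_getdisk output fields shown above):

for h in hdisk2 hdisk37; do sudo /usr/es/sbin/cluster/cspoc/cl_getdisk $h | grep 'Fence height'; done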

From node 2

sudo extendvg journalvg hdisk2 hdisk3 hdisk4 hdisk5 hdisk6 hdisk7 hdisk8 hdisk9 hdisk10 hdisk11 hdisk12
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk37
# Shows RW

From node 1

sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk37
# Shows RW

From node 1

sudo /usr/es/sbin/cluster/events/utils/cl_set_vg_fence_height -c journalvg rw
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
# Shows RW

From node 2

sudo reducevg journalvg hdisk2 hdisk3 hdisk4 hdisk5 hdisk6 hdisk7 hdisk8 hdisk9 hdisk10 hdisk11 hdisk12
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
# Shows RO

### OK, try again
From node 1

sudo mkvg -y dummyvg hdisk2 hdisk3 hdisk4 hdisk5 hdisk6 hdisk7 hdisk8 hdisk9 hdisk10 hdisk11 hdisk12
sudo varyoffvg dummyvg

From node 2

sudo importvg  -y dummyvg hdisk2
sudo /usr/es/sbin/cluster/events/utils/cl_set_vg_fence_height -c dummyvg rw
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
### Still RO
sudo /usr/es/sbin/cluster/events/utils/cl_vg_fence_term -c dummyvg
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
### Still RO
sudo varyoffvg dummyvg
sudo rmdev -Rl hdisk2

Both nodes

sudo exportvg dummyvg
sudo importvg -c -y dummyvg hdisk2
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
### Still RO
sudo /usr/es/sbin/cluster/events/utils/cl_set_vg_fence_height -c dummyvg rw
sudo /usr/es/sbin/cluster/events/utils/cl_vg_fence_init -c dummyvg rw hdisk2
cl_vg_fence_init[279]: sfwAddFenceGroup(dummyvg, 1, hdisk2): No such device
sudo chvg -c dummyvg
sudo varyonvg -n -c -A -O dummyvg
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk3
### Still RO
sudo varyoffvg dummyvg

From node 2
sudo rmdev -Rl hdisk2
Method error (/etc/methods/ucfgdevice):
        0514-062 Cannot perform the requested function because the
                 specified device is busy.

sudo /usr/es/sbin/cluster/events/utils/cl_vg_fence_redo -c dummyvg rw hdisk2
 /usr/es/sbin/cluster/events/utils/cl_vg_fence_redo: line 109: cl_vg_fence_init: not found
 cl_vg_fence_redo: Volume group dummyvg fence height could not be set to read/write

This is related to this defect, but on a later version:
http://www-01.ibm.com/support/docview.wss?uid=isg1IV52444

sudo su -
export PATH=$PATH:/usr/es/sbin/cluster/utilities:/usr/es/sbin/cluster/events/utils/:/usr/es/sbin/cluster/cspoc/:/usr/es/sbin/cluster/sbin:/usr/es/sbin/cluster
/usr/es/sbin/cluster/events/utils/cl_vg_fence_redo -c dummyvg rw hdisk2
 cl_vg_fence_init[279]: sfwAddFenceGroup(dummyvg, 11, hdisk2, hdisk3, hdisk4, hdisk5, hdisk6, hdisk7, hdisk8, hdisk9, hdisk10, hdisk11, hdisk12): No such device
 cl_vg_fence_redo: Volume group dummyvg fence height could not be set to read/write
cd /dev
/usr/es/sbin/cluster/events/utils/cl_vg_fence_redo -c dummyvg rw hdisk2
 cl_vg_fence_init[279]: sfwAddFenceGroup(dummyvg, 11, hdisk2, hdisk3, hdisk4, hdisk5, hdisk6, hdisk7, hdisk8, hdisk9, hdisk10, hdisk11, hdisk12): No such device
 cl_vg_fence_redo: Volume group dummyvg fence height could not be set to read/write

SIGH!

I give up. We will probably have to reboot.
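For completeness, the post-reboot cleanup I expect to run once nothing is holding the fence group (an untested sketch):

sudo varyoffvg dummyvg      ### in case it came up varied on
sudo exportvg dummyvg       ### forget the VG definition
sudo rmdev -dl hdisk2       ### repeat for each of hdisk2-hdisk12
sudo cfgmgr                 ### only if the LUNs are meant to come back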