IT42905 TSM / Spectrum Protect SSL cert expiration

TSM 7.1.8 / Spectrum Protect 8.1.2 and later create SSL certs with a 10 year expiration.

IBM reference: https://www.ibm.com/support/pages/apar/IT42905

The fix is:

Delete the instance keystore (cert.kdb)
Set all clients, admins, servers to SESSIONSECURITY=TRANSITIONAL
HALT the server and restart it – this will make a NEW key.
FORCESYNC for server connections.
Delete the client keystore (dsmcert.idx)
Restart the client and make sure it connects for the new key.
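
The server-side steps can be staged as a list of admin commands to review before anything is run. This is only a sketch: it prints commands and sends nothing to the server. The function name and the node, admin, and server names are hypothetical, and the command spellings (UPDATE NODE / ADMIN / SERVER with SESSIONSECURITY, UPDATE SERVER with FORCESYNC=YES) should be checked against the admin reference for your server level.

```shell
#!/bin/sh
# Print (not run) the admin commands for the SESSIONSECURITY and FORCESYNC steps.
# All object names passed in are hypothetical examples.
print_cert_cmds() {
    kind=$1
    shift
    for name in "$@"; do
        case "$kind" in
            node)      echo "UPDATE NODE $name SESSIONSECURITY=TRANSITIONAL" ;;
            admin)     echo "UPDATE ADMIN $name SESSIONSECURITY=TRANSITIONAL" ;;
            server)    echo "UPDATE SERVER $name SESSIONSECURITY=TRANSITIONAL" ;;
            forcesync) echo "UPDATE SERVER $name FORCESYNC=YES" ;;
            *)         echo "unknown kind: $kind" >&2; return 1 ;;
        esac
    done
}

# Before the HALT:
print_cert_cmds node   CLIENT1 CLIENT2
print_cert_cmds admin  ADMIN
print_cert_cmds server TSM2
# After the restart:
print_cert_cmds forcesync TSM2
```

The output can be pasted into dsmadmc or saved as a macro once it has been checked.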

If you don’t wipe the client keystore, the client fails with:

ANS1695E The certificate is not valid.
ANS8023E Unable to establish session with server.
ANS8002I Highest return code was -370.

From the actlog/SERVER_CONSOLE

ANR8599W The connection with someserver:port failed due to an untrusted server certificate. An attempt to reconnect and establish certificate trust might follow.

IBM is considering automating this.

There is no automation yet as of 8.1.18.0 in 2023-03, and once the key expires, you’re stuck doing it manually.
You may be able to use DEFINE CLIENTACTION to delete the keystores on the clients if you use dsmcad.
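
A rough sketch of what such a client action might look like follows. It only prints the command. The node name and the keystore path are assumptions (the path is a guess based on the usual Linux BA client directory), and whether the scheduler can still pick the action up depends on timing, which is untested here.

```shell
#!/bin/sh
# Print (not run) a one-time client action that removes a keystore file.
# $1 = node name, $2 = full path to the keystore file (both hypothetical here).
print_keystore_cleanup() {
    printf "DEFINE CLIENTACTION %s ACTION=COMMAND OBJECTS='rm %s'\n" "$1" "$2"
}

print_keystore_cleanup CLIENT1 /opt/tivoli/tsm/client/ba/bin/dsmcert.idx
```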


0x8007003B timeout copying large file to Samba server

PROBLEM: 0x8007003B timeout copying large file to Samba server

SOLUTION: This is an SMB.CONF issue, solved / fixed with this line:

strict allocate = no
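
To see what a given smb.conf currently sets, something like the sketch below may help. It is a simplification: it ignores sections and includes and just reports the last value it finds, so `testparm -s` from the Samba package remains the authoritative check. The function name `smb_param` and the temporary file are made up for the example.

```shell
#!/bin/sh
# Print the last value set for a parameter in an smb.conf-style file.
# $1 = config file, $2 = parameter name in lower case with single spaces.
smb_param() {
    awk -F= -v want="$2" '
        /^[ \t]*[#;]/ { next }                 # skip comment lines
        NF >= 2 {
            key = tolower($1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim the name
            gsub(/[ \t]+/, " ", key)           # collapse inner whitespace
            val = $0
            sub(/^[^=]*=[ \t]*/, "", val)      # keep everything after the first "="
            sub(/[ \t]+$/, "", val)
            if (key == want) last = val
        }
        END { if (last != "") print last }
    ' "$1"
}

# Demo against a throwaway file.
conf=$(mktemp)
printf '[global]\n   strict allocate = no\n' > "$conf"
smb_param "$conf" "strict allocate"    # prints: no
rm -f "$conf"
```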

DESCRIPTION: I had this issue for a long time, and mostly the web mocks people, tells them to do stupid things, or generally is unhelpful.  Lots of 2GB, or “your network” or “your firewall” or “turn off DPI” or whatever, none of it applicable to me.  I just accepted it, but decided to dig a little deeper today.

The amount of data written before the failure varied, but the size reported by ls was always the full file size.  On higher-performance filesystems (XFS, EXT4, JFS, all on NVMe arrays), I could get about 55GB allocated before the timeout.  On spinning disk, it was much less, which is probably why many people fell down the rabbit hole of claiming 2GB limits, etc.

Strict Allocate = YES tells it to allocate the whole file upon request, which is what Windows does.  Samba says “OK, hold on”, and then times out.  Some people used powershell on a client to change the smbclient timeout to 600 seconds, or whatever, but that’s not really ideal, since it does not scale.

Strict Allocate = NO says to use normal UNIX semantics, where the file has no pre-allocated blocks, and allocates blocks only as the data comes in.  This starts with a fully sparse file, and data copy status on the windows client shows it processing immediately.  This is what we want for large files.  If it was only small files, then we don’t care.
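
The difference between the size a file claims and the blocks it actually holds is easy to see on the server side. The sketch below is only an illustration of sparse files in general, not of Samba itself; the function name is made up, and it assumes the temp directory sits on a filesystem that supports sparse files.

```shell
#!/bin/sh
# Create a sparse file of $1 bytes and print "apparent_bytes allocated_kb".
sparse_demo() {
    f=$(mktemp) || return 1
    # Seek to the target size without writing any data blocks.
    dd if=/dev/zero of="$f" bs=1 count=0 seek="$1" 2>/dev/null
    apparent=$(( $(wc -c < "$f") ))
    allocated_kb=$(du -k "$f" | awk '{print $1}')
    rm -f "$f"
    echo "$apparent $allocated_kb"
}

# 64 MiB apparent size; the allocated figure should be far smaller.
sparse_demo 67108864
```

With strict allocate = yes, the server would instead have to allocate every block before the copy starts, which is where the time goes on very large files.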

I made this a global change.  I don’t need fully pre-allocated, non-sparse files on my file server.  It’s possible someone writing databases might need this, and you’d want to make sure you didn’t feed data faster than the kernel can allocate blocks.  Another one of those multiple filesystems kind of solutions.

When you play with tunables, you run into things that people don’t really know how to troubleshoot.  That’s what this is for, just so it shows up in web searches.


SW/FS/SVC Volume Mobility

SAN Volume Controller / Storwize / Flash System version 8.4.2 allows you to non-disruptively migrate a LUN between array controller clusters. It’s set up like remote copy, except you can map the remote copy to the same host at the same time. The remote copy becomes non-preferred paths for the same vdisk and vdisk ID. Then you can switch who is primary. Then you can remove the old copy.

Here is someone who did a demo video: https://www.youtube.com/watch?v=NpcOoshkm4w


errpt disk errors

SC_DISK_PCM_ERR1 Subsystem Component Failure

The storage subsystem has returned an error indicating that some component (hardware or software) of the storage subsystem has failed. The detailed sense data identifies the failing component and the recovery action that is required. Failing hardware components should also be shown in the Storage Manager software, so the placement of these errors in the error log is advisory and is an aid for your technical-support representative.

SC_DISK_PCM_ERR2 Array Active Controller Switch

The active controller for one or more hdisks associated with the storage subsystem has changed. This is in response to some direct action by the AIX host (failover or autorecovery). This message is associated with either a set of failure conditions causing a failover or, after a successful failover, with the recovery of paths to the preferred controller on hdisks with the autorecovery attribute set to yes.

SC_DISK_PCM_ERR3 Array Controller Switch Failure

An attempt to switch active controllers has failed. This leaves one or more paths with no working path to a controller. The AIX MPIO PCM will retry this error several times in an attempt to find a successful path to a controller.

SC_DISK_PCM_ERR4 Array Configuration Changed

The active controller for an hdisk has changed, usually due to an action not initiated by this host. This might be another host initiating failover or recovery, for shared LUNs, a redistribute operation from the Storage Manager software, a change to the preferred path in the Storage Manager software, a controller being taken offline, or any other action that causes the active controller ownership to change.

SC_DISK_PCM_ERR5 Array Cache Battery Drained

The storage subsystem cache battery has drained. Any data remaining in the cache is dumped and is vulnerable to data loss until it is dumped. Caching is not normally allowed with drained batteries unless the administrator takes action to enable it within the Storage Manager software.

SC_DISK_PCM_ERR6 Array Cache Battery Charge Is Low

The storage subsystem cache batteries are low and need to be charged or replaced.

SC_DISK_PCM_ERR7 Cache Mirroring Disabled

Cache mirroring is disabled on the affected hdisks. Normally, any cached write data is kept within the cache of both controllers so that if either controller fails there is still a good copy of the data. This is a warning message stating that loss of a single controller will result in data loss.

SC_DISK_PCM_ERR8 Path Has Failed

The I/O path to a controller has failed or gone offline.

SC_DISK_PCM_ERR9 Path Has Recovered

The I/O path to a controller has resumed and is back online.

SC_DISK_PCM_ERR10 Array Drive Failure

A physical drive in the storage array has failed and should be replaced.

SC_DISK_PCM_ERR11 Reservation Conflict

A PCM operation has failed due to a reservation conflict. This error is not currently issued.

SC_DISK_PCM_ERR12 Snapshot™ Volume’s Repository Is Full

The snapshot volume repository is full. Write actions to the snapshot volume will fail until the repository problems are fixed.

SC_DISK_PCM_ERR13 Snapshot Op Stopped By Administrator

The administrator has halted a snapshot operation.

SC_DISK_PCM_ERR14 Snapshot repository metadata error

The storage subsystem has reported that there is a problem with snapshot metadata.

SC_DISK_PCM_ERR15 Illegal I/O – Remote Volume Mirroring

The I/O is directed to an illegal target that is part of a remote volume mirroring pair (the target volume rather than the source volume).

SC_DISK_PCM_ERR16 Snapshot Operation Not Allowed

A snapshot operation that is not allowed has been attempted.

SC_DISK_PCM_ERR17 Snapshot Volume’s Repository Is Full

The snapshot volume repository is full. Write actions to the snapshot volume will fail until the repository problems are fixed.

SC_DISK_PCM_ERR18 Write Protected

The hdisk is write-protected. This can happen if a snapshot volume repository is full.

SC_DISK_PCM_ERR19 Single Controller Restarted

The I/O to a single-controller storage subsystem is resumed.

SC_DISK_PCM_ERR20 Single Controller Restart Failure

The I/O to a single-controller storage subsystem is not resumed. The AIX MPIO PCM will continue to attempt to restart the I/O to the storage subsystem.
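
To get a quick feel for which of these are firing, the labels can be counted from the detailed report. This is a sketch that assumes the detailed output carries one "LABEL:" line per entry; the function name and the sample input are made up for illustration.

```shell
#!/bin/sh
# Count SC_DISK_PCM_ERRn labels in detailed error-report text read from stdin.
count_pcm_labels() {
    awk '$1 == "LABEL:" && $2 ~ /^SC_DISK_PCM_ERR[0-9]+$/ { n[$2]++ }
         END { for (k in n) print k, n[k] }' | sort
}

# Hypothetical sample input; on AIX this would be: errpt -a | count_pcm_labels
printf 'LABEL:\tSC_DISK_PCM_ERR8\nLABEL:\tSC_DISK_PCM_ERR9\nLABEL:\tSC_DISK_PCM_ERR8\n' |
    count_pcm_labels
```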


os/400 i5os IBM i remove stale LUN paths

This may need to happen if you remove/re-add NPIV mapping, if SAN topology changes, or if you have removed maps, ports, cables, or entire LUNs.

1. Change the configuration on the array to remove the extra ports, etc.
2. Physically remove any fibre cables if needed.
3. IPL the server.
4. Run the STRSST command, then press ENTER
5. Option 1, Start a service tool, then Press ENTER
6. Option 4, Display/Alter/Dump, then Press ENTER
7. Option 1, Display/Alter storage, then Press ENTER
8. Option 2, Licensed Internal Code (LIC) data, then Press ENTER
9. Option 14, Advanced analysis (scroll down to see this option), then Press ENTER
a. Scroll down and type 1 by MULTIPATHRESETTER, then press ENTER
b. Options, -RESETMP -ALL -CONFIRM, then press ENTER

DISPLAY/ALTER/DUMP
Running macro: MULTIPATHRESETTER -CONFIRM -ALL
Reset the paths for Multiple Connections

********************************************************************
***CONFIRM RESET MULTIPATH UNIT PATHSTO NUMBER CURRENTLY ENLISTED***
********************************************************************

This service function should be run only under the direction of the
IBM Hardware Service Support.

You have selected to reset the number of paths on a multipath unit
to equal the number of paths that have currently enlisted.

Attempting to reset path for resource name: DMP004
Attempting to reset path for resource name: DMP075
...

*** Your request completed successfully ***
The number of paths connected to your multipath unit have been reset
to match the number of paths that are currently enlisted.
NOTE: If ALL paths are removed, the disk resource name will still show as DMPxxx rather than DDxxxx.
NOTE: DMPxxx resources that remain in Hardware Service Manager for removed paths will need manual cleanup.

REF: https://www.ibm.com/support/pages/reducing-or-removing-paths-multipath-lun
REF: https://www.ibm.com/support/pages/san-disks-missing-paths


TSM/ISP Recovering from “SKIP UPGRADING THIS INSTANCE”

Recovering from “SKIP UPGRADING THIS INSTANCE”

REFS:
https://www.ibm.com/support/pages/manually-upgrading-ibm-spectrum-protect-server-instances
https://www.ibm.com/support/pages/anr0187e-failure-during-server-startup
http://issen007.blogspot.com/2017/05/manual-upgrade-ibm-spectrum-protect-71x.html

###################################################
### Stop the instance completely
su - tsminst1

### This may not work if your environment or links are bad.
db2 list db directory

### If db2sysc is still running
ps -ef | grep db2
db2stop
db2stop force
db2 terminate ### Kill off db2bp fragments

### Kill anything else that is still running
ps -ef | grep db2

### Remove IPC
ipcrm -a


###################################################
### Clean up remainders
su - root
/opt/tivoli/tsm/db2/instance/db2ilist
/opt/tivoli/tsm/db2/instance/db2idrop tsminst1

### Verify nothing left
/opt/tivoli/tsm/db2/instance/db2ilist


###################################################
### Redefine the instance
su - root
#/opt/tivoli/tsm/db2/instance/db2icrt -a server -u tsminst1 tsminst1
/opt/tivoli/tsm/db2/instance/db2icrt -u tsminst1 tsminst1

DBI1446I The db2icrt command is running.
DB2 installation is being initialized.
Total number of tasks to be performed: 4
Total estimated time for all tasks to be performed: 309 second(s)

Task #1 start
Description: Setting default global profile registry variables
Estimated time 1 second(s)
Task #1 end

Task #2 start
Description: Initializing instance list
Estimated time 5 second(s)
Task #2 end

Task #3 start
Description: Configuring DB2 instances
Estimated time 300 second(s)
Task #3 end

Task #4 start
Description: Updating global profile registry
Estimated time 3 second(s)
Task #4 end

The execution completed successfully.
For more information see the DB2 installation log at "/tmp/db2icrt.log.21176".
DBI1070I Program db2icrt completed successfully.


###################################################
### Set up Db2 environment variables
# NOTE: userprofile and db2profile get reset after db2icrt
su - tsminst1
/opt/tivoli/tsm/db2/adm/db2set -i tsminst1 "DB2_SKIPINSERTED=ON"
/opt/tivoli/tsm/db2/adm/db2set -i tsminst1 "DB2_KEEPTABLELOCK=ON"
/opt/tivoli/tsm/db2/adm/db2set -i tsminst1 "DB2_EVALUNCOMMITTED=ON"
/opt/tivoli/tsm/db2/adm/db2set -i tsminst1 "DB2_SKIPDELETED=ON"
/opt/tivoli/tsm/db2/adm/db2set -i tsminst1 "DB2CODEPAGE=819"
/opt/tivoli/tsm/db2/adm/db2set -i tsminst1 "DB2_PARALLEL_IO=*"

# NOTE: EOF is quoted so ${HOME}, $PATH and $LD_LIBRARY_PATH expand at login, not now.
cat <<'EOF' >>${HOME}/sqllib/userprofile
export LD_LIBRARY_PATH=${HOME}/sqllib/lib64/gskit:${HOME}/sqllib/lib32:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/opt/tivoli/tsm/server/bin/dbbkapi:/opt/ibm/lib:/opt/ibm/lib64:/usr/lib64:${HOME}/sqllib/lib64:$LD_LIBRARY_PATH
export PATH=$PATH:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/tivoli/tsm/server/bin64
export PATH=$PATH:/opt/tivoli/tsm/server/bin:/usr/tivoli/tsm/server/bin64:/usr/tivoli/tsm/server/bin
export PATH=$PATH:/opt/tivoli/tsm/client/ba/bin64:/opt/tivoli/tsm/client/ba/bin:/usr/tivoli/tsm/client/ba/bin64
export PATH=$PATH:/usr/tivoli/tsm/client/ba/bin:/usr/tivoli/tsm/client/api/bin64:/usr/tivoli/tsm/client/api/bin
export PATH=$PATH:/opt/tivoli/tsm/client/api/bin64:/opt/tivoli/tsm/client/api/bin:/opt/tivoli/tsm/db2/bin
export PATH=$PATH:${HOME}/sqllib/bin:${HOME}/sqllib/adm:${HOME}/sqllib/misc

DSMI_CONFIG=${HOME}/tsmdbmgr.opt
DSMI_DIR=/opt/tivoli/tsm/server/bin/dbbkapi
DSMI_LOG=${HOME}
export DSMI_CONFIG DSMI_DIR DSMI_LOG 
EOF

cat <<EOF >>${HOME}/.profile
. ${HOME}/sqllib/db2profile
. ${HOME}/sqllib/userprofile

alias ll='ls -laF --color=auto'
set -o vi
EOF

. ./.profile


###################################################
### Catalog the DB to make sure it is okay.
db2start

### Find the TSMDB1 instance(s).
DBDIR=$(find /home /sp /tsm -name sqldbdir -exec strings {} \; 2>/dev/null | grep inst | cut -c 2-99 | sort | uniq)
echo $DBDIR

### Register the instance(s).
for i in $DBDIR ; do db2 catalog db TSMDB1 on $i ; done
# SQL6028N Catalog database failed because database "tsminst1" was not found in the local database directory.

### List the instances.
db2 list db directory


###################################################
### Upgrade the DB2 system catalog tables
#db2 upgrade db tsminst1
#SQL1013N The database alias name or database name "TSMINST1" could not be found. SQLSTATE=42705
db2 upgrade db TSMDB1
SQL1103W The UPGRADE DATABASE command was completed successfully.

### Stop DB2 to make sure it flushes everything.
db2stop
SQL1064N DB2STOP processing was successful.


###################################################
### Upgrade the TSM database schema
/opt/tivoli/tsm/server/bin/dsmserv upgradedb
ANR7800I DSMSERV generated at 18:03:03 on Nov 19 2021.

IBM Spectrum Protect for AIX
Version 8, Release 1, Level 13.000

Licensed Materials - Property of IBM

(C) Copyright IBM Corporation 1990, 2021.
All rights reserved.
U.S. Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corporation.

ANR7801I Subsystem process ID is 64684398.
ANR7811I Using instance directory /tsm/tsminst1/
ANR3339I Default Label in key database is TSM Server SelfSigned SHA Key.
ANR4726I The ICC support module has been loaded.
ANR0990I Server restart-recovery in progress.

###################################################
### Make sure the server accepts workload

# start the server normally (rc script, systemd, or inittab line run from an at-job).

### Run these from dsmadmc
REG LIC FILE=*ee.lic
enable ses all
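
Before running the upgrade step, it can help to confirm that the catalog step really registered TSMDB1. A minimal sketch, assuming the directory listing prints one "Database alias = NAME" line per entry; the function name and the sample text are made up.

```shell
#!/bin/sh
# Succeed if stdin (directory listing text) contains the given database alias.
has_db_alias() {
    awk -v want="$1" '
        /Database alias/ {
            val = $0
            sub(/^[^=]*=[ \t]*/, "", val)   # keep the text after "="
            sub(/[ \t]+$/, "", val)
            if (toupper(val) == toupper(want)) found = 1
        }
        END { exit found ? 0 : 1 }'
}

# Hypothetical sample; as the instance user: db2 list db directory | has_db_alias TSMDB1
printf ' Database alias                       = TSMDB1\n Database name                        = TSMDB1\n' |
    has_db_alias TSMDB1 && echo "TSMDB1 is cataloged"
```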

 


ext keeps going read only

During backups, I get this, and the root filesystem goes read only.

I’ve replaced disks, rebuilt filesystems, arrays, LVM, etc.

 

dmesg shows:

[19501.355932] EXT4-fs error (device dm-7): ext4_get_verity_descriptor_location:295: inode #6032: comm dsmc: verity file doesn't use extents
[19501.403496] Aborting journal on device dm-7-8.
[19501.414969] EXT4-fs (dm-7): Remounting filesystem read-only
[19501.414974] fs-verity (dm-7, inode 6032): Error -117 getting verity descriptor size

 

Find the file by inode:

find / -xdev -inum 6032 -print

 

This showed that there is junk in lost+found from when I had FS corruption.  I rebuilt the root filesystem, but never cleaned out lost+found.  Deleting the damaged files should solve the problem.
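
If this comes back with a different inode, the number can be pulled straight out of the kernel message instead of being copied by hand. A small sketch, with a made-up function name, fed the dmesg line from above:

```shell
#!/bin/sh
# Pull the inode number out of an "EXT4-fs error" line read from stdin.
inode_from_log() {
    sed -n 's/.*inode #\([0-9][0-9]*\).*/\1/p' | head -n 1
}

line="[19501.355932] EXT4-fs error (device dm-7): ext4_get_verity_descriptor_location:295: inode #6032: comm dsmc: verity file doesn't use extents"
ino=$(printf '%s\n' "$line" | inode_from_log)
echo "$ino"    # prints 6032
# Then, as root on the affected system: find / -xdev -inum "$ino" -print
```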


Ubuntu LTS UEFI NVME Mirror Boot

This is super touchy, and here is what I did to make it happy and stable.

This does not address what happens if the UEFI firmware writes to one member of the mirror.  Someone else has a systemd unit that reassembles the array with a resync.

In the past, I used someone else’s bypass script, but this is cleaner, and works in Ubuntu 18.04 and 20.04 LTS.

 

### Filesystem / Mirror for EFI / UEFI booting:
mdadm raid 1, metadata 1.0
vfat filesystem for /boot/EFI

### Proper GRUB package
apt-get purge grub\*
apt-get install grub-efi
apt-get autoremove

### Settings that seem to not stick
dpkg-reconfigure -p low grub-efi-amd64
# Answer NO to the prompt: Update NVRAM variables to automatically boot into Debian?
echo "grub-pc grub2/update_nvram boolean false" | debconf-set-selections
echo "grub-pc grub-efi/install_devices multiselect /dev/md0" | debconf-set-selections

### Grub config
update-grub
grub-install --no-nvram /dev/md0

### UEFI boot list (variables)
[root@tsm2: /root]
/bin/bash# efibootmgr -?
efibootmgr: invalid option -- '?'
efibootmgr version 17
usage: efibootmgr [options]
-a | --active sets bootnum active
-A | --inactive sets bootnum inactive
-b | --bootnum XXXX modify BootXXXX (hex)
-B | --delete-bootnum delete bootnum
-c | --create create new variable bootnum and add to bootorder
-C | --create-only create new variable bootnum and do not add to bootorder
-D | --remove-dups remove duplicate values from BootOrder
-d | --disk disk (defaults to /dev/sda) containing loader
-r | --driver Operate on Driver variables, not Boot Variables.
-e | --edd [1|3|-1] force EDD 1.0 or 3.0 creation variables, or guess
-E | --device num EDD 1.0 device number (defaults to 0x80)
-g | --gpt force disk with invalid PMBR to be treated as GPT
-i | --iface name create a netboot entry for the named interface
-l | --loader name (defaults to "\EFI\ubuntu\grub.efi")
-L | --label label Boot manager display label (defaults to "Linux")
-m | --mirror-below-4G t|f mirror memory below 4GB
-M | --mirror-above-4G X percentage memory to mirror above 4GB
-n | --bootnext XXXX set BootNext to XXXX (hex)
-N | --delete-bootnext delete BootNext
-o | --bootorder XXXX,YYYY,ZZZZ,... explicitly set BootOrder (hex)
-O | --delete-bootorder delete BootOrder
-p | --part part partition containing loader (defaults to 1 on partitioned devices)
-q | --quiet be quiet
-t | --timeout seconds set boot manager timeout waiting for user input.
-T | --delete-timeout delete Timeout.
-u | --unicode | --UCS-2 handle extra args as UCS-2 (default is ASCII)
-v | --verbose print additional information
-V | --version return version and exit
-w | --write-signature write unique sig to MBR if needed
-y | --sysprep Operate on SysPrep variables, not Boot Variables.
-@ | --append-binary-args file append extra args from file (use "-" for stdin)
-h | --help show help/usage

[root@tsm2: /root]
/bin/bash# efibootmgr -v
BootCurrent: 0019
Timeout: 5 seconds
BootOrder: 0005,0006,0007,000C,0019,0018
Boot0000 Startup Menu FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)....ISPH
Boot0001 System Information FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0002 Bios Setup FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0003 3rd Party Option ROM Management FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0004 System Diagnostics FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0005* nvme0_grub HD(1,GPT,e41eb9e0-6606-411a-bb83-bed7577f29b3,0x800,0x8e800)/File(\EFI\ubuntu\grub.efi)
Boot0006* nvme1_grub HD(1,GPT,aa23256a-95c6-4148-b56c-c8861fc7966a,0x800,0x8e800)/File(\EFI\ubuntu\grub.efi)
Boot0007* nvme2_grub HD(1,GPT,1f7f7f5b-2a89-4d87-a617-6ccaf15078dd,0x800,0x8e800)/File(\EFI\ubuntu\grub.efi)
Boot0008 Boot Menu FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0009* Kingston DataTraveler 3.0 408D5CE57214E331293064F6 BBS(USB,USB1,0x900)/PciRoot(0x0)/Pci(0x1d,0x0)......ISPH
Boot000B Network Boot FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot000C* nvme3_grub HD(1,GPT,cb8bc8b4-affc-4765-97c2-72af0c615d44,0x800,0x8e800)/File(\EFI\ubuntu\grub.efi)
Boot000E* IPV6 Network - Aquantia AQtion 10Gbit Network Adapter PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/MAC(88c9b3bfa1e9,0)/IPv6([::]:<->[::]:,0,0)N.....YM....R,Y.....ISPH
Boot0010* IBA GE Slot 00C8 v1550 BBS(Network,Network1,0x0)/PciRoot(0x0)/Pci(0x19,0x0)......ISPH
Boot0011 USB: PciRoot(0x0)/Pci(0x1d,0x0)N.....YM....R,Y.....ISPH
Boot0012 HP Recovery FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0013* hp PLDS DVDRW DU8AESH PciRoot(0x0)/Pci(0x1f,0x2)/Sata(0,0,0)N.....YM....R,Y.....ISPH
Boot0014* hp PLDS DVDRW DU8AESH BBS(CDROM,CDROM1,0x400)/PciRoot(0x0)/Pci(0x1f,0x2)......ISPH
Boot0018* ubuntu HD(1,GPT,e41eb9e0-6606-411a-bb83-bed7577f29b3,0x800,0x8e800)/File(\EFI\ubuntu\shimx64.efi)....ISPH
Boot0019* ubuntu HD(1,GPT,e41eb9e0-6606-411a-bb83-bed7577f29b3,0x800,0x8e800)/File(\EFI\grub\shimx64.efi)....ISPH
Boot001A* IPV4 Network - Aquantia AQtion 10Gbit Network Adapter PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/MAC(88c9b3bfa1e9,0)/IPv4(0.0.0.00.0.0.0,0,0)N.....YM....R,Y.....ISPH

[root@tsm2: /root]
/bin/bash# efibootmgr -B -b 0005
/bin/bash# efibootmgr -B -b 0006
/bin/bash# efibootmgr -B -b 0007
/bin/bash# efibootmgr -B -b 0009
/bin/bash# efibootmgr -B -b 000c
/bin/bash# efibootmgr -B -b 0018
BootCurrent: 0019
Timeout: 5 seconds
BootOrder: 0019
Boot0000 Startup Menu
Boot0001 System Information
Boot0002 Bios Setup
Boot0003 3rd Party Option ROM Management
Boot0004 System Diagnostics
Boot0008 Boot Menu
Boot000B Network Boot
Boot000E* IPV6 Network - Aquantia AQtion 10Gbit Network Adapter
Boot0010* IBA GE Slot 00C8 v1550
Boot0011 USB:
Boot0012 HP Recovery
Boot0013* hp PLDS DVDRW DU8AESH
Boot0014* hp PLDS DVDRW DU8AESH
Boot0019* ubuntu
Boot001A* IPV4 Network - Aquantia AQtion 10Gbit Network Adapter

[root@tsm2: /root]
/bin/bash# efibootmgr -v
BootCurrent: 0019
Timeout: 5 seconds
BootOrder: 0019
Boot0000 Startup Menu FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)....ISPH
Boot0001 System Information FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0002 Bios Setup FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0003 3rd Party Option ROM Management FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0004 System Diagnostics FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0008 Boot Menu FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot000B Network Boot FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot000E* IPV6 Network - Aquantia AQtion 10Gbit Network Adapter PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/MAC(88c9b3bfa1e9,0)/IPv6([::]:<->[::]:,0,0)N.....YM....R,Y.....ISPH
Boot0010* IBA GE Slot 00C8 v1550 BBS(Network,Network1,0x0)/PciRoot(0x0)/Pci(0x19,0x0)......ISPH
Boot0011 USB: PciRoot(0x0)/Pci(0x1d,0x0)N.....YM....R,Y.....ISPH
Boot0012 HP Recovery FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0013* hp PLDS DVDRW DU8AESH PciRoot(0x0)/Pci(0x1f,0x2)/Sata(0,0,0)N.....YM....R,Y.....ISPH
Boot0014* hp PLDS DVDRW DU8AESH BBS(CDROM,CDROM1,0x400)/PciRoot(0x0)/Pci(0x1f,0x2)......ISPH
Boot0019* ubuntu HD(1,GPT,e41eb9e0-6606-411a-bb83-bed7577f29b3,0x800,0x8e800)/File(\EFI\grub\shimx64.efi)....ISPH
Boot001A* IPV4 Network - Aquantia AQtion 10Gbit Network Adapter PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/MAC(88c9b3bfa1e9,0)/IPv4(0.0.0.00.0.0.0,0,0)N.....YM....R,Y.....ISPH

[root@tsm2: /root]
/bin/bash# efibootmgr -c -d /dev/nvme0n1 -L nvme0_grub -l '\EFI\grub\shimx64.efi'
/bin/bash# efibootmgr -c -d /dev/nvme1n1 -L nvme1_grub -l '\EFI\grub\shimx64.efi'
/bin/bash# efibootmgr -c -d /dev/nvme2n1 -L nvme2_grub -l '\EFI\grub\shimx64.efi'
/bin/bash# efibootmgr -c -d /dev/nvme3n1 -L nvme3_grub -l '\EFI\grub\shimx64.efi'
/bin/bash# efibootmgr -o 0005,0006,0007,0009,0019
BootCurrent: 0019
Timeout: 5 seconds
BootOrder: 0005,0006,0007,0009,0019
Boot0000 Startup Menu
Boot0001 System Information
Boot0002 Bios Setup
Boot0003 3rd Party Option ROM Management
Boot0004 System Diagnostics
Boot0005* nvme0_grub
Boot0006* nvme1_grub
Boot0007* nvme2_grub
Boot0008 Boot Menu
Boot0009* nvme3_grub
Boot000B Network Boot
Boot000E* IPV6 Network - Aquantia AQtion 10Gbit Network Adapter
Boot0010* IBA GE Slot 00C8 v1550
Boot0011 USB:
Boot0012 HP Recovery
Boot0013* hp PLDS DVDRW DU8AESH
Boot0014* hp PLDS DVDRW DU8AESH
Boot0019* ubuntu
Boot001A* IPV4 Network - Aquantia AQtion 10Gbit Network Adapter

[root@tsm2: /root]
/bin/bash# efibootmgr -v
BootCurrent: 0019
Timeout: 5 seconds
BootOrder: 0005,0006,0007,0009,0019
Boot0000 Startup Menu FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)....ISPH
Boot0001 System Information FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0002 Bios Setup FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0003 3rd Party Option ROM Management FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0004 System Diagnostics FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0005* nvme0_grub HD(1,GPT,e41eb9e0-6606-411a-bb83-bed7577f29b3,0x800,0x8e800)/File(\EFI\grub\shimx64.efi)
Boot0006* nvme1_grub HD(1,GPT,aa23256a-95c6-4148-b56c-c8861fc7966a,0x800,0x8e800)/File(\EFI\grub\shimx64.efi)
Boot0007* nvme2_grub HD(1,GPT,1f7f7f5b-2a89-4d87-a617-6ccaf15078dd,0x800,0x8e800)/File(\EFI\grub\shimx64.efi)
Boot0008 Boot Menu FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0009* nvme3_grub HD(1,GPT,cb8bc8b4-affc-4765-97c2-72af0c615d44,0x800,0x8e800)/File(\EFI\grub\shimx64.efi)
Boot000B Network Boot FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot000E* IPV6 Network - Aquantia AQtion 10Gbit Network Adapter PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/MAC(88c9b3bfa1e9,0)/IPv6([::]:<->[::]:,0,0)N.....YM....R,Y.....ISPH
Boot0010* IBA GE Slot 00C8 v1550 BBS(Network,Network1,0x0)/PciRoot(0x0)/Pci(0x19,0x0)......ISPH
Boot0011 USB: PciRoot(0x0)/Pci(0x1d,0x0)N.....YM....R,Y.....ISPH
Boot0012 HP Recovery FvVol(cdbb7b35-6833-4ed6-9ab2-57d2acddf6f0)/FvFile(9d8243e8-8381-453d-aceb-c350ee7757ca)......ISPH
Boot0013* hp PLDS DVDRW DU8AESH PciRoot(0x0)/Pci(0x1f,0x2)/Sata(0,0,0)N.....YM....R,Y.....ISPH
Boot0014* hp PLDS DVDRW DU8AESH BBS(CDROM,CDROM1,0x400)/PciRoot(0x0)/Pci(0x1f,0x2)......ISPH
Boot0019* ubuntu HD(1,GPT,e41eb9e0-6606-411a-bb83-bed7577f29b3,0x800,0x8e800)/File(\EFI\grub\shimx64.efi)....ISPH
Boot001A* IPV4 Network - Aquantia AQtion 10Gbit Network Adapter PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/MAC(88c9b3bfa1e9,0)/IPv4(0.0.0.00.0.0.0,0,0)N.....YM....R,Y.....ISPH

 

[root@tsm2: /root]
/bin/bash# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS
Release: 20.04
Codename: focal

[root@tsm2: /root]
/bin/bash# uname -a
Linux tsm2 5.4.0-97-generic #110-Ubuntu SMP Thu Jan 13 18:22:13 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
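
The four efibootmgr -c commands above follow one pattern, so they can be generated for review rather than typed. This sketch only prints the commands. The function name is made up, and it assumes the disks are passed in the order the nvmeN_grub labels should be numbered.

```shell
#!/bin/sh
# Print (not run) one "efibootmgr -c" command per disk.
# $1 = loader path; remaining arguments = disks, labelled nvme0_grub, nvme1_grub, ...
print_boot_entries() {
    loader=$1
    shift
    i=0
    for disk in "$@"; do
        printf "efibootmgr -c -d %s -L nvme%d_grub -l '%s'\n" "$disk" "$i" "$loader"
        i=$((i + 1))
    done
}

print_boot_entries '\EFI\grub\shimx64.efi' /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
```

The boot numbers that the firmware assigns are not predictable, so the efibootmgr -o step still has to be written by hand after checking efibootmgr -v.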


Spectrum Protect (TSM) Operations Center on Ubuntu LTS

Per IBM, the Spectrum Protect server is supported on Ubuntu LTS 14, 16, 18, and 20 (that is, 14.04, 16.04, 18.04, and 20.04).

https://www.ibm.com/support/pages/overview-ibm-spectrum-protect-supported-operating-systems

However, Operations Center (web GUI) is not supported on Ubuntu, only RHEL and SLES.

https://www.ibm.com/support/pages/ibm-spectrum-protect-operations-center-software-and-hardware-requirements

./install.sh -c
Validating package prerequisites...
=====> IBM Installation Manager> Update> Prerequisites
Validation results:
* [ERROR] IBM Spectrum Protect Operations Center 8.1.12000.20210326_0723 contains validation errors.
1. ERROR: The operating system on which you are installing the product is not supported. For more information, see http://www.ibm.com/support/docview.wss?uid=swg21243309.

Enter the number of the error or warning message above to view more details.

To skip the OS and platform checks, and downgrade the ERROR to a WARNING:

./install.sh -c -vmargs "-DBYPASS_TSM_REQ_CHECKS=true"
Validation results:
* [WARNING] IBM Spectrum Protect Operations Center 8.1.12000.20210326_0723 contains validation warning.
1. WARNING: The operating system on which you are installing the product is not supported. For more information, see http://www.ibm.com/support/docview.wss?uid=swg21243309.

Enter the number of the error or warning message above to view more details.

I recommend using this ONLY to install or update Operations Center, then exiting and going back in normally to make sure the other filesets validate okay.


ANR1812E DELETE FILESPACE VMFULL failed because replication

ERROR:

ANR1812E DELETE FILESPACE VMFULL failed because replication

DESCRIPTION:

Decommed VMs fail to auto-delete during expiration because replication is happening. In an ideal world, there would be enough system resources to perform DB Backup in 2 hours, expiration in 2 hours, and replication in 4-8 hours. In this environment, replication overlaps a lot of other processes, and can get in the way. 

ANR1812E DELETE FILESPACE VMFULL-SOMENODENAME for node failed deletion because of a replication in progress. (SESSION: 123456)

 

WORKAROUND:

Identify the server

 

Cancel replication
CANCEL REPLICATION

 

Identify the filespace
VMFULL-SOMENODENAME in the example

 

Find the node that owns the filespace.
Protect: TSM>q occ * *VMFULL-SOMENODENAME *
NODE_NAME       Type   FILESPACE_NAME         FSID   Files   Phys MB   Logical MB
VM_DATACENTER   Bkup   \VMFULL-SOMENODENAME      4   53084         -    6,782,908

 

Delete the filespace on both local and replica:
DELETE FI VM_DATACENTER '\VMFULL-SOMENODENAME'
TSM2: DELETE FI VM_DATACENTER '\VMFULL-SOMENODENAME'
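
When several filespaces need the same treatment, the local and routed commands can be generated as a pair. This sketch only prints them. The function name is made up, and the node, filespace, and server names are the ones from the example.

```shell
#!/bin/sh
# Print (not run) DELETE FILESPACE for the local server and, via command
# routing, for each replica server listed after the filespace name.
print_delete_fs() {
    node=$1
    fs=$2
    shift 2
    printf "DELETE FILESPACE %s '%s'\n" "$node" "$fs"
    for server in "$@"; do
        printf "%s: DELETE FILESPACE %s '%s'\n" "$server" "$node" "$fs"
    done
}

print_delete_fs VM_DATACENTER '\VMFULL-SOMENODENAME' TSM2
```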

 

Monitor Progress until complete
Protect: TSM>q occ * *VMFULL-SOMENODENAME *
NODE_NAME       Type   FILESPACE_NAME         FSID   Files   Phys MB   Logical MB
VM_DATACENTER   Bkup   \VMFULL-SOMENODENAME      4   50848         -    6,469,955

Protect: TSM>q act search=ANR1812E
03/07/21   23:14:23      ANR2017I Administrator ADMIN issued command: QUERY ACTLOG search=ANR1812E  (SESSION: 438180)

Protect: TSM>q proc
Process      Process Description      Job Id         Process Status
--------     --------------------     ----------     -------------------------------------------------
     395     DELETE FILESPACE                        Deleting file space \VMFULL-SOMENODENAME
                                                     (fsId=4) (which can include backup and archive
                                                     data) for node VM_DATACENTER: 0 objects deleted,
                                                     0 objects retained, and 0 objects skipped.

Protect: TSM>TSM2: q proc
ANR1699I Resolved TSM2 to 1 server(s) - issuing command Q PROC against server(s).
ANR1687I Output for command 'Q PROC' issued against server TSM2 follows:
Process      Process Description      Job Id         Process Status
--------     --------------------     ----------     -------------------------------------------------
   5,756     DELETE FILESPACE                        Deleting file space \VMFULL-SOMENODENAME
                                                     (fsId=4) (which can include backup and archive
                                                     data) for node VM_DATACENTER: 0 objects deleted,
                                                     0 objects retained, and 0 objects skipped.
ANR1688I Output for command 'Q PROC' issued against server TSM2 completed.
ANR1694I Server TSM2 received the request to process command 'Q PROC'.
ANR1697I Command 'Q PROC' processed by 1 server(s):  1 successful, 0 with warnings, and 0 with errors.

 

CAUSE:

Replicate Node, a normal operation, creates locks on any filespace to be processed.

The long-term resolution would be to have enough system resources to not have to overlap daily operations processes.

The benchmark set by IBM for this would be the ability to complete BACKUP DB in 2 hours.  This environment takes 8-12 hours for most servers.

