SMB/CIFS 3 on AIX

Mounting should be vaguely similar to the SMB1 mounting you had before.

Download and install SMB Client 3 and “Network Authentication Service” (aka Kerberos 5) from here:

https://www-01.ibm.com/marketing/iwm/iwm/web/pickUrxNew.do?source=aixbp

Ensure your Windows 201x server has SMB v3 enabled.

You want a service account in AD to use for your SMB3 mounts on AIX.

 

Notes about options:

encryption and secure_negotiate should both be set to desired.
signing should be enabled.
pver should be 3.0.2.
The kerberos realm specified in the “wrkgrp” option must be in all UPPERCASE if your domain is in uppercase.
The username provided for mounting is used for all read/write permissions/access.  
UID and GID default to root.system, but you can specify others.
fmode is the inverse of umask, and what the files’ permissions look like across the whole share.  Default is 755.
port can be 139 (ipv4) or 445 (ipv4 or ipv6).  Default is 445.
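
Since fmode is described above as the inverse of a umask, the relationship can be sketched as simple octal arithmetic. This is only an illustration of the idea, not something the SMB client needs; the helper name fmode_from_umask is hypothetical.

```shell
# Hypothetical helper: convert a umask (octal, e.g. 022) into the
# equivalent fmode value (octal, e.g. 755) by masking it out of 777.
fmode_from_umask() {
    printf '%o\n' $(( 0777 & ~0$1 ))
}

fmode_from_umask 022   # prints 755
fmode_from_umask 027   # prints 750
```

So a share that should look like umask 027 would be mounted with fmode=750.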

 

/etc/filesystems format:

/mnt:
     dev = /corpshare
     vfs = smbc
     mount = true
     options = "wrkgrp=CORP.DOMAIN,signing=enabled,pver=3.0.2,encryption=desired,secure_negotiate=desired"
     nodename = win2016server.corp.domain/sambauser

 

Command line example

mount -v smbc -n win2016server.corp.domain/sambauser/Passw0rd! \
-o "wrkgrp=CORP.DOMAIN,port=445,signing=required,encryption=required,secure_negotiate=desired,pver=auto" \
/corpshare /mnt

 

Store the samba credentials

mksmbcred -s win2016server.corp.domain -u sambauser [-p password]

See also lssmbcred, chsmbcred, and rmsmbcred.
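
With the credentials stored, the mount should be able to leave the password off the command line. That is my reading of the IBM documentation, so verify it on your own system. The sketch below only prints a mount command for review rather than running it; the helper name smb_mount_cmd is hypothetical, and the option values are simply the ones recommended in the notes above.

```shell
# Hypothetical helper: print (not run) an smbc mount command that relies
# on credentials already stored with mksmbcred, so no password appears.
smb_mount_cmd() {
    server=$1 user=$2 share=$3 mnt=$4
    opts="wrkgrp=CORP.DOMAIN,signing=required,encryption=desired,secure_negotiate=desired,pver=3.0.2"
    printf 'mount -v smbc -n %s/%s -o "%s" %s %s\n' \
        "$server" "$user" "$opts" "$share" "$mnt"
}

smb_mount_cmd win2016server.corp.domain sambauser /corpshare /mnt
```

Printing first makes it easy to eyeball the option string for stray spaces before pasting it into a root shell.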

 

Reference 2021:

https://www.ibm.com/docs/en/aix/7.2?topic=protocol-server-message-block-smb-client-file-system 


GPSD / NTPD / Debian 10 Buster

I think I finally have my GPS NTP server tweaked. Average deviation yesterday was 0.33ms.
 
I just threw that last adjustment in, and we’ll see tomorrow how it aligns (e.g., am I just at -0.33 now, or am I close to 0.03 off?)
 
Without the time1 offset, the GPS source was getting silently ignored. Obscure.
 
Config is simple once you understand it, but for me, the understanding part was tough.
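
One way to get a number like that daily average is to read it out of ntpd's peerstats file, where field 3 is the source address and field 5 is the offset in seconds. The sketch below is an assumption about how to measure it, not necessarily how the figure above was obtained; the helper name mean_offset_ms is hypothetical.

```shell
# Hypothetical helper: average the offset column (field 5, in seconds)
# of an ntpd peerstats file for one source address, printed in ms.
mean_offset_ms() {
    awk -v addr="$2" '
        $3 == addr { sum += $5; n++ }
        END { if (n) printf "%.3f\n", sum / n * 1000 }
    ' "$1"
}

# Example (path assumes the statsdir from the ntp.conf in this post):
# mean_offset_ms /var/log/ntpstats/peerstats 127.127.28.0
```

A persistent non-zero mean for the GPS source is a hint that the time1 fudge value needs another nudge.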
 
Quick-Reference:
apt update && apt install gpsd ntp ntpdate
ntpdate time.nist.gov
 
Plug in the VK* or uBlox style GPS receiver and put it in a window.
dmesg | grep tty
cat <<'EOF' >> /etc/default/gpsd
START_DAEMON="true"
USBAUTO="false"
DEVICES="/dev/ttyACM0"
EOF
systemctl disable gpsd.socket
systemctl enable gpsd.service
systemctl restart gpsd.service
 
cat <<'EOF' > /etc/ntp.conf
### Public servers and permissions
pool time.nist.gov burst minpoll 5 maxpoll 5
pool us.pool.ntp.org burst minpoll 5 maxpoll 5
pool pool.ntp.org burst minpoll 5 maxpoll 5
server ntp01.frontier.com burst minpoll 5 maxpoll 5
server ntp02.frontier.com burst minpoll 5 maxpoll 5
restrict source notrap nomodify noquery
restrict default kod limited nomodify notrap nopeer noquery
restrict -6 default kod limited nomodify notrap nopeer noquery
restrict 127.0.0.1
restrict -6 ::1
 
### Stats needed for accuracy
driftfile /var/lib/ntp/ntp.drift
leapfile /usr/share/zoneinfo/leap-seconds.list
statsdir /var/log/ntpstats/
statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable
 
### GPS time service (PPS does not work on my device)
server 127.127.28.0 minpoll 3 maxpoll 4
fudge 127.127.28.0 time1 0.0445 refid GPS
server 127.127.28.1 prefer minpoll 3 maxpoll 4
fudge 127.127.28.1 refid PPS
broadcast 192.168.1.255 minpoll 4 maxpoll 4
EOF
systemctl enable ntp
systemctl restart ntp

Supermicro BMC firmware

While setting up the new replica server, I kept running into problems during the initial firmware update.  All IPMI settings and hardware data were inaccessible after the BMC firmware update, though the system otherwise worked as expected.  This condition persisted whether I went into DEL setup, the F11 boot menu, or the UEFI shell.

It was not that the sensor data was “not present”; the list of sensors itself was missing.  The system MAC address and BIOS version info were also missing.  The FRU data was empty, as was the Hardware Information.  All of the IPMI settings were blank and could not be set.  The diagnostic data page gave “File not found”.  The iKVM was inaccessible, and the system could not be put into maintenance mode, powered off, powered on, or reset from the IPMI interface.  All of the system logs were blank and inaccessible.

Support was not super helpful, but they were responsive.  Supermicro is one of the top tier system makers.  They OEM for IBM, but that equipment is not quite as touchy.

The solution is to be very finicky about the BMC firmware update.

Get the right version for your system here:
https://www.supermicro.com/support/resources/bios_ipmi.php

My system used this code:
https://www.supermicro.com/Bios/softfiles/12085/X11SDVN_BIOS_1_3a_IPMI_1_31_03.zip
Manufacturer Name: Supermicro
Product PartNum: SYS-E300-9D
Chassis Part Number: CSE-E300
Board Product Name: X11SDV-4C-TLN2F
BIOS Vendor: American Megatrends Inc.
Processor: Intel(R) Xeon(R) D-2123IT CPU @ 2.20GHz

Update the code with extra patience
Use the AwUpdate utility to update the IPMI/BMC firmware
.\AwUpdate -f ......\WS_X11AST2500_131_03.bin -i lan -h 192.168.1.210 -u ADMIN -p ADMIN
NOTE: This can be on some other system as long as you can connect TCP between the two.
NOTE: No -r, and in the web UI, we would uncheck all of the “preserve settings” options.

Let all 5 parts (0 through 4) complete
Wait for the “New firmware is updating” to complete
Wait for the system to reboot.

Monitor the console
Wait for a longer version of IPMI Initialization
Wait for a longer than usual DXE -- ACPI Initialization
Wait for the red LED to come on
Wait at least 5 more minutes (try 10)

At this point, you should see that it responds to F11 or DEL, but stays hung.
Press CTRL-ALT-DEL to reboot, and everything should be populated and working.

The Unit IDentity LED may be stuck red.
You cannot clear the UID red state any way other than pulling the power cord.
Let it drain for 30 seconds, and plug back in.

After this, everything works, AND the UID LED setting in the IPMI web interface will switch from blink blue to off.


OVM CPU Pinning

If you clone or recover an Oracle VM guest, and the source used CPU pinning (Hard Partitioning), the target may not work.  The error is entirely non-intuitive, and I could not find it on the interwebs, so here is a sanitized version.

OVMAPI_5001E Job: 1416254413024/QueuedVmStartDbImpl_1416254413023/OVMJOB_1500J Start/resume vm: PRODVM, on server: PRODSERVER, failed. 
Job Failure Event: 1416254413902/Server Async Command Failed/OVMEVT_00C014D_001 Async command failed on server: PRODSERVER. 
Object: PRODVM, PID: 15431, Server error: 
Command: ['xm', 'create', '/OVS/Repositories/000dead000beef00cafe0421cab55bad/VirtualMachines/000dead000beef00cafef207cabdbbad/vm.cfg'] failed (1): 
stderr: Error: (22, 'Invalid argument') 
stdout: Using config file "/OVS/Repositories/000dead000beef00cafe0421cab55bad/VirtualMachines/000dead000beef00cafef207cabdbbad/vm.cfg". , 
on server: PRODSERVER, associated with object: 000dead000beef00cafef207cabdbbad [Thu Apr 15 00:12:19 EDT 2021]

 

You can remove the “cpus = '#-#'” line from vm.cfg to reset this.
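
A minimal sketch of that edit, assuming GNU sed on the OVM server and a vm.cfg where the pinning sits on its own line. The helper name strip_cpu_pinning is hypothetical, and the original file is kept as vm.cfg.bak.

```shell
# Hypothetical helper: delete the cpus = '...' pinning line from a vm.cfg,
# keeping a .bak copy. Lines such as vcpus = ... are left untouched.
strip_cpu_pinning() {
    sed -i.bak '/^[[:space:]]*cpus[[:space:]]*=/d' "$1"
}

# Example (repository and VM IDs elided; use your own path):
# strip_cpu_pinning /OVS/Repositories/.../VirtualMachines/.../vm.cfg
```

The pattern is anchored to the start of the line so that the vcpus setting is not removed by accident.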

Commands useful for checking OVM hard partitioning include:

xm info

xm list

xenpm get-cpu-topology

xm vcpu-list

# cd /u01/app/oracle/ovm-manager-3/ovm_utils
# ./ovm_vmcontrol -u admin -p YourPassword -h ovm-manager -v my-first-vm -c vcpuset -s 0-7
Oracle VM VM Control utility 0.6.3.
Connected.
Command : vcpuset
Pinning virtual CPUs
Pinning of virtual CPUs to physical threads  '0-7' 'my-first-vm' completed.

After that, vcpu-list will show VM names in column 1 for dedicated CPUs.

NDMP TOC failure – datamover type incorrect

ERROR:
ANR4950E The server is unable to retrieve NDMP file history information while building table of contents for node NASNODE01, file space /SVM_NASNODE01_VIRTUALFS. NDMP node ID is 90156245149. Table of contents creation fails.

CAUSE:
One possible cause of this can be if the datamover was defined with the wrong scope (TYPE).  
TYPE can be NAS, NASVSERVER, or NASCLUSTER.  NAS is for node context.  NASVSERVER is for SVM context.  NASCLUSTER is for the whole cluster context.

NOTE: There are other possible causes, such as corrupt inodes, or other issues; however, this one bit me and was not clearly defined anywhere else.

CORRECTION:
You cannot UPDATE DATAMOVER TYPE=blah, but you can simply DELETE DATAMOVER and DEFINE DATAMOVER to fix.

DELETE DATAMOVER NASNODE01
DEFINE datamover NASNODE01 type=nascluster dataformat=netappdump hla=192.168.128.1 user=NDMPADMIN password=PASSWORDHERE

TRACING INFO:

trace disable
trace enable spi spid toc
trace begin /tmp/server.trc

Once tracing is enabled, run another backup of the affected volume (/SVM_SBNAS01_OU_ABOD in my case). When the backup completes or fails, issue the following commands to disable tracing:

trace flush
trace end
trace disable
QUERY ACTLOG

grep NDMP dsmffdc.log

NASNODE01::> node run -node SBNAS01-01
Type 'exit' or 'Ctrl-D' to return to the CLI
NASNODE01> rdfile /etc/log/backup


mdadm fewer number of larger devices

I could not find anyone confident that you can reshape an mdadm array onto fewer, larger devices.  Plenty of recent posts say you cannot do this.  I made it happen, and the biggest concern is making sure you provide enough space on the new devices.  There are some safety warnings that help with this.  I did have to resize my new partitions a couple of times during the process.

I did this because my rootvg needed to move to NVMe, and I only had room for 4 devices, vs the 5 on SATA.  The OS I used was Debian 10 Buster, but this should work on any vaguely contemporary GNU/Linux distribution.

There are always risks with reshaping arrays and LVM, so I recommend you back up your data.

First, build the new NVME partitions
I have p1 for /boot (not UEFI yet, and I’m on LILO still, so unused right now).
I have p2 for rootvg, and p3 for ssddatavg

parted /dev/nvme0n1
mklabel gpt
y
mkpart boot ext4 4096s 300MB
set 1 raid on
set 1 boot on
mkpart root 300MB 80GB
set 2 raid on
### Resizing last
rm 3
resizepart 2 80G
mkpart datassd 80G 100%
set 3 raid on
print
quit

Repeat for the other devices so they match.
My devices looked like this after:

Model: INTEL SSDPEKNW020T8 (nvme)
Disk /dev/nvme3n1: 2048GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name     Flags
 1      2097kB  300MB   298MB                boot     boot, esp
 2      300MB   80.0GB  79.7GB               root     raid
 3      80.0GB  2048GB  1968GB               datassd  raid

Clear superblocks if needed
If you are retrying after 37 attempts, these commands may come in handy:

### wipe superblock
for i in /dev/nvme?n1p1 ; do mdadm --zero-superblock $i ; done

### Wipe FS
for i in /dev/nvme?n1p1 ; do dd bs=256k count=4k if=/dev/zero of=$i ; done

Rebuild /boot – high level
This is incomplete, because I have not changed my host to UEFI mode.  The reference is good, but incomplete.

### Make new array and filesystem
mdadm --create --verbose /dev/md3 --level=1 --raid-devices=4 /dev/nvme*p1
mkfs.ext4 /dev/md3
mount /dev/md3 /mnt
rsync -avSP /boot/ /mnt/

### Install GRUB2
mkdir /boot/grub

apt update
apt-get install grub2
### From dpkg-reconfigure: kopt=nvme_core.default_ps_max_latency_us=0

### Make the basic config
[root@ns1: /root]

/bin/bash# grub-mkconfig -o /boot/grub/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.19.0-10-amd64
Found initrd image: /boot/initrd.img-4.19.0-10-amd64
Found linux image: /boot/vmlinuz-4.19.0-5-amd64
Found initrd image: /boot/initrd.img-4.19.0-5-amd64
done

### Install the bootloader
[root@ns1: /root]

/bin/bash# grub-install /dev/md3
Installing for i386-pc platform.
grub-install: warning: File system `ext2' doesn't support embedding.
grub-install: error: embedding is not possible, but this is required for cross-disk install.

[root@ns1: /root]
/bin/bash# grub-install /dev/nvme0n1
Installing for i386-pc platform.
grub-install: warning: this GPT partition label contains no BIOS Boot Partition; embedding won't be possible.
grub-install: error: embedding is not possible, but this is required for RAID and LVM install.

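The bootloader install will not work until I either convert to UEFI or give each GPT disk a small BIOS Boot Partition (parted flag bios_grub), which is what the second error is asking for.  I also rsync’d my old /boot into the new one, etc.  That is moot until this is corrected.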

Swap out my SATA members with SSD

Original members are 37GB, and new are 77GB.  It was time to go bigger, and I found that I kept coming up a few gigs short trying to match size (5×37 vs 4×57).

The goal is to fail a drive, remove a drive, then add a larger SSD replacement. After the last drive is removed, we reshape the array while it is degraded, because we don’t have a 5th device to add.

### Replace the first device
mdadm -f /dev/md1 /dev/sda2

mdadm -r /dev/md1 /dev/sda2
mdadm --add /dev/md1 /dev/nvme0n1p2

### wait until it's done rebuilding
#mdadm --wait /dev/md1
while grep re /proc/mdstat ; do sleep 20 ; date ; done
mdadm -f /dev/md1 /dev/sdb2
mdadm -r /dev/md1 /dev/sdb2
mdadm --add /dev/md1 /dev/nvme1n1p2

### wait until it's done rebuilding
#mdadm --wait /dev/md1
sleep 1 ; while grep re /proc/mdstat ; do sleep 20 ; date ; done
mdadm -f /dev/md1 /dev/sdc2
mdadm -r /dev/md1 /dev/sdc2
mdadm --add /dev/md1 /dev/nvme2n1p2

### wait until it's done rebuilding
#sleep 1 ; while grep re /proc/mdstat ; do sleep 20 ; date ; done
mdadm --wait /dev/md1
mdadm -f /dev/md1 /dev/sdd2
mdadm -r /dev/md1 /dev/sdd2
mdadm --add /dev/md1 /dev/nvme3n1p2

### Remove last smaller device
#sleep 1 ; while grep re /proc/mdstat ; do sleep 20 ; date ; done
mdadm --wait /dev/md1
mdadm -f /dev/md1 /dev/sde2
mdadm -r /dev/md1 /dev/sde2
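
The grep loops above work because recovery, resync, and reshape all begin with “re”. A slightly more explicit variant is sketched below; the function names md_busy and wait_md_idle are hypothetical, and the optional file argument exists only so the check can be tried against a saved copy of /proc/mdstat.

```shell
# Hypothetical helpers: report whether any md array is still rebuilding,
# based on the progress keywords that appear in /proc/mdstat.
md_busy() {
    grep -Eq 'recovery|resync|reshape' "${1:-/proc/mdstat}"
}

# Poll every 20 seconds until no array reports activity.
wait_md_idle() {
    while md_busy "$@"; do sleep 20; done
}
```

Note that a delayed or pending resync also counts as busy here, so this waits for every array on the host, not just md1.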

Reshape the array

Check to make sure your required array resize is larger than the LVM space used in your PV.  

[root@ns1: /root]
/bin/bash# mdadm --grow /dev/md1 --raid-devices=4 --backup-file=/storage/backup
mdadm: this change will reduce the size of the array.
use --grow --array-size first to truncate array.
e.g. mdadm --grow /dev/md1 --array-size 155663872

[root@ns1: /root]
/bin/bash# pvs /dev/md1
PV VG Fmt Attr PSize PFree
/dev/md1 rootvg lvm2 a-- 102.50g 8.75g

If you come up short, you can shrink a PV a little, but often, there are used blocks scattered around.  There is no defrag for LVM, so you would have to manually migrate extents.  I was too lazy to do that, and instead, grew my PV from 103GB to 155GB.  I kind of need the space anyway.

# pvresize --setphysicalvolumesize 102G /dev/md1
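
The size check can be sketched as arithmetic. RAID6 keeps two members' worth of space for parity, so a 4-member array offers twice the per-member size. The per-member figure below (77831936 KiB) is simply the suggested array size of 155663872 divided by two, and the helper names are hypothetical.

```shell
# Usable RAID6 capacity in KiB: two members' worth of space goes to parity.
raid6_kib() {   # usage: raid6_kib <member_count> <per_member_kib>
    echo $(( ($1 - 2) * $2 ))
}

# Succeeds when the PV (size in KiB) fits inside the reshaped array.
pv_fits() {     # usage: pv_fits <pv_kib> <array_kib>
    [ "$1" -le "$2" ]
}

new_size=$(raid6_kib 4 77831936)        # 155663872, the size mdadm suggested
pv_kib=$(( 1025 * 1024 * 1024 / 10 ))   # 102.50 GiB from pvs, in KiB
pv_fits "$pv_kib" "$new_size" && echo "PV fits: $pv_kib <= $new_size KiB"
```

If the check fails, either the new partitions need to be larger or the PV has to shrink before the array is truncated.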

Final reshape here

Now that I know the size MDADM wants to use, I use that exactly (or smaller, but larger than the PV size currently set.)

mdadm --grow /dev/md1 --array-size 155663872
mdadm --grow /dev/md1 --raid-devices=4 --backup-file=/storage/backup1
sleep 1 ; while grep re /proc/mdstat ; do sleep 20 ; date ; done

One of the drives was stuck as a spare.

This is not guaranteed to happen, but it does happen sometimes.  Just an annoyance, and one of the many reasons using RAID6 is much better than RAID5.  Also, errors can be identified better than with RAID5, among various other things.  Just use RAID6 for 4 drives and up.  I promise, it’s worth it.  3 drives can be RAID5, or RAID10 on Linux, but it’s not ideal.  Also, if you have a random-write-intensive workload, then you can use RAID10 to save some IOPS at the expense of more drives used to protect larger arrays, and inferior protection.  (e.g., it is possible to lose 2 drives on a 6 drive RAID10 and still lose data, if they are both copies of the same data.)

[root@ns1: /root]
/bin/bash# mdadm /dev/md1 --remove faulty

[root@ns1: /root]
/bin/bash# mdadm --detail /dev/md1
/dev/md1:
State : active, degraded

    Number   Major   Minor   RaidDevice State
       0     259       15        0      active sync   /dev/nvme2n1p2
       1     259       17        1      active sync   /dev/nvme3n1p2
       2     259       11        2      active sync   /dev/nvme0n1p2
       -       0        0        3      removed

       4     259       13        -      spare   /dev/nvme1n1p2

Remove and re-add the spare

The fix was easy.  I just removed and re-added the drive that was stuck as a spare.

[root@ns1: /root]
/bin/bash# mdadm /dev/md1 --remove /dev/nvme1n1p2
mdadm: hot removed /dev/nvme1n1p2 from /dev/md1

[root@ns1: /root]
/bin/bash# mdadm /dev/md1 --add /dev/nvme1n1p2
mdadm: hot added /dev/nvme1n1p2

Check status on rebuilding
[root@ns1: /root]
/bin/bash# mdadm --detail /dev/md1
/dev/md1:
State : active, degraded, recovering

    Number   Major   Minor   RaidDevice State
       0     259       15        0      active sync   /dev/nvme2n1p2
       1     259       17        1      active sync   /dev/nvme3n1p2
       2     259       11        2      active sync   /dev/nvme0n1p2
       4     259       13        3      spare rebuilding   /dev/nvme1n1p2

Alternatively, this might have been frozen
cat /sys/block/md1/md/sync_action
frozen
echo idle > /sys/block/md1/md/sync_action
echo recover > /sys/block/md1/md/sync_action

Grow to any extra space

Once it is done recovering and/or resyncing, then you can grow into any additional space.  Since we used the value above to set the size “smaller”, we do not have to do this.  Note, when resizing “UP”, it is technically possible to overrun the bitmap.  This example drops the bitmap during the resize.  That is a risk you’ll have to weigh.  A power outage during restructure without a bitmap could be a bad day.

mdadm --grow /dev/md1 --bitmap none
mdadm --grow /dev/md1 --size max
mdadm --wait /dev/md1
mdadm --grow /dev/md1 --bitmap internal

Expand LVM to use the new space

[root@ns1: /root]
/bin/bash# pvresize /dev/md1

[root@ns1: /root]
/bin/bash# pvs
PV VG Fmt Attr PSize PFree
/dev/md1 rootvg lvm2 a-- <148.38g 54.62g

Other Notes 1:

I also dropped/readded a drive with pending reallocation sectors.  That is entirely unrelated to the reshaping above, but I’ll dump the log here for my own reference.

### See the errors
/bin/bash# smartctl -a /dev/sda
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-10-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red
Device Model: WDC WD30EFRX-68EUZN0
Firmware Version: 82.00A82
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm

Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       1
196 Reallocated_Event_Count 0x0032   199   199   000    Old_age   Always       -       1
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0

### See what arrays use this disk
[root@ns1: /root]
/bin/bash# cat /proc/mdstat | grep -p sda
md0 : active raid1 sda1[4] sdd1[1] sde1[3] sdc1[2] sdb1[0]
271296 blocks [5/5] [UUUUU]
bitmap: 0/1 pages [0KB], 65536KB chunk

md2 : active raid6 sda3[3] sdb3[2] sdd3[4] sdc3[1] sde3[0]
8682399744 blocks level 6, 512k chunk, algorithm 2 [5/5] [UUUUU]
bitmap: 0/11 pages [0KB], 131072KB chunk

### Remove and re/add so it re-writes
[root@ns1: /root]
/bin/bash# mdadm /dev/md0 --fail /dev/sda1
mdadm: set /dev/sda1 faulty in /dev/md0

[root@ns1: /root]
/bin/bash# mdadm /dev/md0 --remove /dev/sda1
mdadm: hot removed /dev/sda1 from /dev/md0

[root@ns1: /root]
/bin/bash# mdadm /dev/md0 --add /dev/sda1
mdadm: hot added /dev/sda1

[root@ns1: /root]
/bin/bash# cat /proc/mdstat | grep -p sda
md0 : active raid1 sda1[5] sdd1[1] sde1[3] sdc1[2] sdb1[0]
271296 blocks [5/4] [UUUU_]
[=================>...] recovery = 87.3% (237440/271296) finish=0.0min speed=118720K/sec
bitmap: 0/1 pages [0KB], 65536KB chunk

md2 : active raid6 sda3[3] sdb3[2] sdd3[4] sdc3[1] sde3[0]
8682399744 blocks level 6, 512k chunk, algorithm 2 [5/5] [UUUUU]
bitmap: 0/11 pages [0KB], 131072KB chunk

### Remove/Readd the bigger array member
[root@ns1: /root]
/bin/bash# mdadm /dev/md2 --fail /dev/sda3
mdadm: set /dev/sda3 faulty in /dev/md2

[root@ns1: /root]
/bin/bash# mdadm /dev/md2 --remove /dev/sda3
mdadm: hot removed /dev/sda3 from /dev/md2

[root@ns1: /root]
/bin/bash# mdadm /dev/md2 --add /dev/sda3
mdadm: hot added /dev/sda3

Other Notes 2:

I also made a new array on partition 3.  That is entirely unrelated to the reshaping above, but I’ll dump the log here for my own reference.

[root@ns1: /root]
/bin/bash# mdadm /dev/md4 --create -l 6 -n 4 /dev/nvme?n1p3
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md4 started.

[root@ns1: /root]
/bin/bash# pvcreate /dev/md4
Physical volume "/dev/md4" successfully created.

[root@ns1: /root]
/bin/bash# vgcreate ssdvg /dev/md4 -Ay -Zn
Volume group "ssdvg" successfully created

[root@ns1: /root]
/bin/bash# vgs
VG     #PV #LV #SN Attr   VSize    VFree
datavg   1   7   0 wz--n-   <8.09t 704.12g
rootvg   1   7   0 wz--n- <148.38g  54.62g
ssdvg    1   0   0 wz--n-    3.58t   3.58t

[root@ns1: /root]
/bin/bash# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10]
md4 : active raid6 nvme3n1p3[3] nvme2n1p3[2] nvme1n1p3[1] nvme0n1p3[0]
3844282368 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/4] [UUUU]
[>....................] resync = 1.0% (20680300/1922141184) finish=216.5min speed=146336K/sec
bitmap: 15/15 pages [60KB], 65536KB chunk

md3 : active raid1 nvme3n1p1[3] nvme2n1p1[2] nvme1n1p1[1] nvme0n1p1[0]
289792 blocks super 1.2 [4/4] [UUUU]

md0 : active raid1 sda1[4] sdd1[1] sde1[3] sdc1[2] sdb1[0]
271296 blocks [5/5] [UUUUU]
bitmap: 0/1 pages [0KB], 65536KB chunk

md1 : active raid6 nvme1n1p2[3] nvme3n1p2[1] nvme2n1p2[0] nvme0n1p2[2]
155663872 blocks level 6, 256k chunk, algorithm 2 [4/4] [UUUU]
bitmap: 1/1 pages [4KB], 65536KB chunk

md2 : active raid6 sda3[5] sdb3[2] sdd3[4] sdc3[1] sde3[0]
8682399744 blocks level 6, 512k chunk, algorithm 2 [5/4] [UUU_U]
[=>...................] recovery = 6.6% (191733452/2894133248) finish=349.9min speed=128713K/sec
bitmap: 0/11 pages [0KB], 131072KB chunk

unused devices: <none>


lancache

TLDR: I now only have to download Microsoft and Steam updates once for all 13 systems in the house.
 
I finally set up a LAN Cache. I got tired of Windows Update sneaking in and eating all of my bandwidth, killing movies, etc. We have 4 regular Steam clients, plus 3 that don’t run very often; and we have 13 Windows 10 systems. It seems like settings always revert, and they update whenever they want, or at 100% bandwidth a few months after setting the throttles low.
 
https://lancache.net/ caches Steam, Windows updates, and several others. It was much easier to set up than a squid web proxy on my router. This should make it so anything that is downloaded only downloads once. I only have 200GB to throw at it right now, but that should help a bunch. I need to set it to auto-start on boot, and to give it more space eventually, but I’m just really happy it’s working now. Apache, SSLH, and dnsmasq on the same host are all still working.
 
My router already pointed to my server for DNS (dnsmasq), so I could manually override things. I added a second IP address to the server, and modified lancache.yml to put all services only on the new IP address. I updated dnsmasq.conf to forward to lancache only, because it was not obeying the fallback rules.
 
This means if lancache dies, I have to edit dnsmasq to keep the home network functional. So many layers.
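
The dnsmasq side of this can be sketched as below. This is an assumption about the shape of the change, not the exact config used here: 192.168.1.11 is a hypothetical second IP address for the lancache DNS service, and no-resolv plus a single server= line is one way to make dnsmasq forward to lancache only. The helper name lancache_dnsmasq_conf is also hypothetical.

```shell
# Hypothetical helper: print dnsmasq.conf lines that send every upstream
# query to a single resolver (here, the lancache DNS service).
lancache_dnsmasq_conf() {
    printf 'no-resolv\n'         # ignore the resolvers in /etc/resolv.conf
    printf 'server=%s\n' "$1"    # forward everything to this address
}

lancache_dnsmasq_conf 192.168.1.11
```

Keeping the old upstream servers in a commented-out line next to these makes the emergency edit quicker when lancache is down.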

reducevg very slow

This is an APAR, but really it’s a description. Reducevg sends the equivalent of TRIM commands, but on a storage array, this is writing nulls. On a big LUN, or with a busy array, this can take a long time. If you do not need to worry about this, then you can disable that space reclaim.

ioo -o dk_lbp_enabled=0

Here is the IBM doc about it.

 

IJ23045: REDUCEVG UNCLEAR ON DELAY WHEN WAITING FOR INFLIGHT RECLAIM REQ APPLIES TO AIX 7100-05

 

A fix is available

APAR status

  • Closed as program error.

Error description

  • reducevg may be unclear, why there is some delay
    when waiting on inflight reclaim requests.
    

Local fix

  • Disable space reclamation by running:
    ioo -o dk_lbp_enabled=0
    

Problem summary

  • reducevg may be unclear, why there is some delay
    when waiting on inflight reclaim requests.
    

Problem conclusion

  • reducevg displays message incase there are space reclamation
    IOs inflight to indicate reducevg may take some time to
    complete.

TSM SP Remove ReplServer

PROBLEM:
Every 5.5 minutes, this shows up in the actlog

08/13/20 08:05:25 ANR1663E Open Server: Server OLDSERVER not defined
08/13/20 08:05:25 ANR1651E Server information for OLDSERVER is not available.
08/13/20 08:05:25 ANR4377E Session failure, target server OLDSERVER is not defined on the source server.
08/13/20 08:05:25 ANR1663E Open Server: Server OLDSERVER not defined
08/13/20 08:05:25 ANR1651E Server information for OLDSERVER is not available.
08/13/20 08:05:25 ANR4377E Session failure, target server OLDSERVER is not defined on the source server.
08/13/20 08:05:26 ANR1663E Open Server: Server OLDSERVER not defined
08/13/20 08:05:26 ANR1651E Server information for OLDSERVER is not available.
08/13/20 08:05:26 ANR4377E Session failure, target server OLDSERVER is not defined on the source server.
08/13/20 08:05:28 ANR1663E Open Server: Server OLDSERVER not defined
08/13/20 08:05:28 ANR1651E Server information for OLDSERVER is not available.
08/13/20 08:05:28 ANR4377E Session failure, target server OLDSERVER is not defined on the source server.

SOLUTION:
QUERY REPLSERVER shows the GUID
REMOVE REPLSERVER (GUID) to cause the errors to stop.