HOWTO: AIX support for R/W filesystem on USBMS

JFS2 Unsupported
Putting JFS2 on non-LVM block devices has been working for a long time. I wrote up how to put JFS2 on a ramdisk back at AIX 4.3.3. I lost the techdoc from back then, but IBM has a newer rewrite dated 2008 here: http://www-01.ibm.com/support/docview.wss?uid=isg3T1010722

JFS2 requires the underlying system to tell it if something goes away, or for it to stay there as long as the filesystem is mounted. LVM does this for disk, and the ramdisk drivers do this as well (mostly because if the ramdisk fails, likely the system has failed). The key there is that with JFS2, the ramdisk pages are pinned.

In January of 2010, I wrote this up, including performance on USB 1 and USB 2 ports, as “HOWTO: JFS2 on USB device on AIX 5.3.11.1”. Everything is fine and dandy, even mount on boot, except it’s not supported by AIX Development.

JFS2 Problems
The problem for USB Mass Storage Devices is that the device can just go away unexpectedly. If a disk goes into deep sleep, or resets because of a loose connection, the JFS drivers do not get notified. So, they take writes, and JFS2 saves them up until it’s time to flush. It goes to flush, and the I/O channel is gone. Sometimes, this is just loss of everything in cache. If it’s an important file, then the system crashes.

Because of that, we still cannot put LVM on a USB Mass Storage Device. This would take changes to notification of device availability, perhaps changes to the sync daemon, etc. Who knows, but there’s not been enough push from paying customers to make it a priority for AIX Development. Until that happens, don’t expect formal support for JFS2 on these devices.

UDF is the solution
AIX development supports read/write and even booting from USB Mass Storage Devices, but only with UDFS. The purpose is for writing a mksysb (system boot) image, or tar/cpio files, etc, and exists because of the RDX USB Internal Dock sold with some systems.
https://www.ibm.com/support/knowledgecenter/en/ssw_aix_61/com.ibm.aix.files/usbms_fileref.htm

Boot support is provided as well. REF: http://www-01.ibm.com/support/docview.wss?uid=isg1IZ66737

More info on the RDX USB Internal Dock: https://www.ibm.com/support/knowledgecenter/POWER7/p7hdt/fc1103.htm

RDX is just a hot-swap USB to SATA drive bay. Any current USB drive should work fine (USB3 is preferred for performance).

HOWTO: Create, Read, and Write UDF on AIX

Create bootable filesystem

  mksysb -eXpi /dev/usbms0

Create empty filesystem

  udfcreate -d /dev/usbms0

Create UDF 2.01 filesystem

  udfcreate -f3 -d/dev/usbms0

NOTE: UDF 2.01 supports a real-time filesystem. It’s still UDF, so don’t try to put a database, or a million files on there.

Access read/write

  mount -vudfs /dev/usbms0 /USBDRIVE

NOTE: The mksysb is a SPOT, plus a mksysb image, so adding files to the UDF will not make the restore huge.

USB Adapters on AIX
Add-in USB3 XHCI adapter for POWER8 is:

  • CCIN 58F9 – PCIE2 4-port USB3 adapter
  • FC EC45 and FRU 00E2932 for Low Profile
  • FC EC46 and FRU 00E2934 for full height.
  • driver is 4c1041821410b204 internal or 4c10418214109e04 PCIe

Add-in USB2 EHCI adapter for POWER7 is:

  • CCIN 57D1 – PCI-E 4-port USB2 adapter
  • driver is 33103500 integrated or 3310e000 PCIe
  • FC 2728 or FRU 46K7394

Add-in USB2 EHCI adapter for POWER6/POWER5 is:

  • CCIN 28EF – PCI 2-port USB2 adapter
  • FC 2738 or FRU 80P2994
  • Belkin F5U219 – exact same card without the sticker.
  • driver is 99172604 internal or 99172704 PCI

Original USB1 OHCI/UHCI adapter for POWER5 and earlier was:

  • driver 22106474 on blades or c1110358 PCI
  • This device is not really available anymore.

AIX and PowerHA levels

Research shows these dates for AIX:

  • AIX 7.2.1.3 should come out around October, 2017 (est Week 46)
  http://www-01.ibm.com/support/docview.wss?uid=isg1IV95390   ### 7200-01-03-1720
  • AIX 7.1.4.5 should come out around October, 2017 (est Week 46)
  http://www-01.ibm.com/support/docview.wss?uid=isg1IV95393   ### 7100-04-05-1720
  • AIX 7.1.5.0 may come out around January, 2018 (est Week 5); however, it may be cancelled.
  http://www-01.ibm.com/support/docview.wss?uid=isg1IV86307   ### 7100-05-00-1731

It’s generally 26 weeks, plus or minus, from the initial YYWW date. Once a TLSP APAR is released, the YYWW code is updated.
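
The 26-week rule of thumb can be sketched in a few lines of Python. This is only an illustration: it assumes the last field of a level string such as 7200-01-03-1720 is a two-digit year plus an ISO week number, and the function name `estimate_release` is made up for this example.

```python
from datetime import date, timedelta


def estimate_release(level: str, offset_weeks: int = 26) -> tuple[int, int]:
    """Return (year, ISO week) roughly offset_weeks after the YYWW build code.

    Assumptions: the last dash-separated field is YYWW, and WW is an
    ISO week. The 26-week default is the rule of thumb from the text,
    not anything IBM commits to.
    """
    yyww = level.rsplit("-", 1)[-1]
    year, week = 2000 + int(yyww[:2]), int(yyww[2:])
    start = date.fromisocalendar(year, week, 1)  # Monday of the build week
    target = start + timedelta(weeks=offset_weeks)
    iso = target.isocalendar()
    return iso[0], iso[1]


for level in ("7200-01-03-1720", "7100-04-05-1720", "7100-05-00-1731"):
    print(level, estimate_release(level))
```

For the three levels listed above, this lands on week 46 of 2017 for the 1720 builds and week 5 of 2018 for the 1731 build, which matches the week estimates in the list.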

My PowerHA selection process would be:

  • 7.1.3 SP06 if I needed to deploy quickly, because I have build docs for that. However, it may be withdrawn from marketing in 2018.
  • 7.2.0 SP03 if they wanted longer support, but had time for me to work up the new procedures during the install.
  • 7.2.1 SP01 when it comes out, but not 7.2.1 base.

My AIX selection process would be:

  • 7.2.1.2 for any NIM server or POWER9. Next updates should be Oct 2017.
  • 7.1.4.4 or later for customer preference. Next updates should be Oct 7.1.4.5 and Jan 7.1.5.0.
  • 6.1.9.9 Minimum level for application compatibility. This is the final TLSP.
  • For anything POWER6 or older, I push hard for p710 to p740 or s81x/s82x as replacements (cost).
  • For anything AIX 5.3 or older, I push hard for app testing on newer AIX (EoS).
    • PTF U866665.bff (bos.mp64.5.3.12.10.U) enables POWER8. AIX must be 5.3.12.9. Must be patched before p8 (install nim or mksysb). p8 must be 840 firmware. VIO must be 2.2.4.10 or later.
    • PTF U866665 requires an active extended support agreement AND p8 systems on file. No free access for business partners.

Code sources:

  • rpm.rte and yum ezinstall, then deploy tar, wget, and rsync:
  http://public.dhe.ibm.com/aix/freeSoftware/aixtoolbox/ezinstall/ppc/
  • openssh from the IBM Web Download expansion:
  https://www-01.ibm.com/marketing/iwm/iwm/web/reg/pick.do?source=aixbp&lang=en_US
  • AIX security patches for any DMZ hosts
  http://public.dhe.ibm.com/aix/efixes/security/?C=M;O=D
  ftp://ftp.software.ibm.com/aix/efixes/security/
  • Base media, if I were certain the customer was entitled, but didn’t want to wait for them to provide media, Partnerworld SWAC:
   https://www-304.ibm.com/partnerworld/partnertools/eorderweb/ordersw.do
  • Latest service pack for AIX from Fix Central:
  https://www-945.ibm.com/support/fixcentral/
  https://www-945.ibm.com/support/fixcentral/aix/selectFixes?release=7.2&function=release
  https://www-945.ibm.com/support/fixcentral/aix/selectFixes?release=7.1&function=release
  • Latest service pack for PowerHA from Fix Central:
  https://www-945.ibm.com/support/fixcentral/swg/selectFixes?parent=Cluster%20software&product=ibm/Other+software/PowerHAClusterManager&release=7.2.0&platform=All&function=all
  https://www-945.ibm.com/support/fixcentral/swg/selectFixes?parent=Cluster%20software&product=ibm/Other+software/PowerHAClusterManager&release=7.1.3&platform=All&function=all

Reference: PowerHA to AIX Support Matrix:

   http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD101347

AIX and PowerHA versions 2017-06

This changes periodically, but for today, here is what I would do.

My PowerHA selection process would be:

  • 7.1.3 SP06 if I needed to deploy quickly, because I have build docs for that.
  • 7.1.4 doesn’t exist, but if it came out before deployment, I would consider it. I would take whichever was the newer release: latest 7.1.3 SP or latest 7.1.4 SP.
  • 7.2.0 SP03 if they wanted longer support, but had time for me to work up the new procedures during the install.
  • 7.2.1 SP01 if SP01 came out before I deployed, and had chosen 7.2.0 prior. 7.2.1.0 base is available, but that’s from Dec 2016, and 7.2.0.3 is from May 2017. Newer by date is better.

My AIX selection process would be:

  • Any NIM server would be AIX 7.2, latest TLSP.
  • Any application support limits would win down to AIX 6.1, plus latest TLSP.
  • For POWER9, I would push 7.2, latest TLSP.
  • For POWER8, I would push 7.1 or later, latest TLSP.
  • For POWER7, I would push 6.1 or later, latest TLSP.
  • For POWER6 or older, or AIX 5.3 or older, I would push strongly against due to support and parts limitations.

Code sources:

  • I would make sure to install yum from ezinstall, and deploy GNU tar and rsync:
  http://public.dhe.ibm.com/aix/freeSoftware/aixtoolbox/ezinstall/ppc/
  • I would update openssh from the IBM Web Download expansion:
  https://www-01.ibm.com/marketing/iwm/iwm/web/reg/pick.do?source=aixbp&lang=en_US
  • If any exposure to the public net, or a high-sensitivity system, I would check AIX security patches also.
  http://public.dhe.ibm.com/aix/efixes/security/?C=M;O=D
  ftp://ftp.software.ibm.com/aix/efixes/security/
  • I would get the latest service pack for both AIX and PowerHA from Fix Central:
  https://www-945.ibm.com/support/fixcentral/
  • Base media, if I were certain the customer was entitled, but didn’t want to wait for them to provide media, Partnerworld SWAC:
   https://www-304.ibm.com/partnerworld/partnertools/eorderweb/ordersw.do

Reference: PowerHA to AIX Support Matrix:

   http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD101347

Zero Momentum

QUESTION:
If time slows to a near stop for objects travelling close to the speed of light, what happens to time when all momentum is at a dead stop?

ANSWER:
The short answer is, with true zero momentum, you would cease to exist. If you had very small momentum, then time would pass very very fast for you. This is because relativistic momentum is much more complicated than just a car on a highway.

###
QUESTION:
Can you be relative to nothing?

ANSWER:
Every mass affects every other mass in the universe via gravity. There is no point of zero *inside* the universe. That would be past the margins of the expanding universe, which doesn’t have spacetime, so we can’t exist there. *There* doesn’t even exist.

###
QUESTION:
At what point is a body its own body, and not part of the big thing with gravity it’s sitting on top of?

ANSWER:
When/where do you want it to be? This is not a binary transition. It’s gradual, from the center of a black hole, out to two photons spiraling across the universe in opposite directions.

###
BRAIN DUMP:

Everything is energy.
* Mass (rest mass) is an invariant scalar, the length of the energy-momentum 4-vector, and relates directly to energy.
* Energy is the time component of that 4-vector, and relates directly to momentum.

Because of this, time is affected by both:
* More velocity = slower time, shorter length in the direction of travel
* More mass = slower time, shorter length radial to the mass.
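
For the velocity case, the standard special-relativity factor is gamma = 1/sqrt(1 - v^2/c^2). A minimal Python sketch (the function names are just for illustration) shows how time and length scale with it. The mass case needs general relativity and is not covered here.

```python
import math

C = 299_792_458.0  # speed of light in a vacuum, m/s


def lorentz_gamma(v: float) -> float:
    """Lorentz factor for a speed v in m/s; only defined for |v| < c."""
    if abs(v) >= C:
        raise ValueError("speed must be below the speed of light")
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def dilated_time(proper_seconds: float, v: float) -> float:
    """Seconds that pass for a stationary observer while a clock
    moving at v ticks off proper_seconds."""
    return proper_seconds * lorentz_gamma(v)


def contracted_length(rest_length: float, v: float) -> float:
    """Length along the direction of travel as seen by a stationary observer."""
    return rest_length / lorentz_gamma(v)


print(lorentz_gamma(0.6 * C))
print(contracted_length(10.0, 0.6 * C))
```

At 60% of light speed, gamma is 1.25, so a moving clock runs at 80% of the observer's rate and a 10 m ship measures 8 m.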

Spacetime is a foam.
* The speed of time, like the size of space, is the size of the bubbles.
* The stretch of the foam is gravity.
* The more energy/mass on the skin of a bubble, the smaller it gets (and the more it pulls on its neighbors).
* Less energy (and mass) means bigger bubbles (ie, more time and space).

Bosons are energy carriers, and they live on a bubble.
* To move a boson, you have to input energy.
* When they have enough energy to move, they move at the speed of light.
* Photons are the most familiar bosons.

Speed of light is actually “speed of light in a perfect vacuum”.
* Put light into a ceramic crystal, and it’s slower.
* Spacetime foam is more dense, so more bubbles to transit.

To travel faster, you have to input more energy.
* More energy means you compress the foam.
* That means more bubbles to transit, which means more energy.
* As a baryonic mass approaches the speed of light, the energy inputs approach infinity.
* Infinite energy (and mass and spacetime) do not exist, so we are constrained.

Bosons and some small particles can seem to violate this on very small scales (tunnelling).
* This is because they can slide through the skin of the bubble rather than having to compress the bubble.
* You cannot do that as baryonic mass, but maybe if your pattern was translated into bosons.
* That high of an energy density would probably condense AND disperse, so you’d lose the pattern along the way.

Special relativity covers “objects at rest”:
* energy-momentum relation: E^2 = (pc)^2 + (m0c^2)^2
* energy-mass relation: E = m0c^2 (p is zero, so E^2 = (m0c^2)^2, which becomes E = m0c^2)

So, if you were to come to a complete rest relative to the fabric of spacetime,
* the passage of time is still affected by your own mass/energy.
* You could decrease your energy, reduce your mass, disperse your mass, and you would expand the bubbles.
* This would cause time to pass more quickly for you, if “you” could exist that way.

At zero energy, the bubbles would be infinitely large.
* How do you pump energy out of the bubbles (vacuum fluctuations)?
* Time would pass at infinite speed (same issue as photons at infinite velocity).
* Just as there is not infinite energy, there is also not infinite time velocity.

Imaginary mass/energy is described by tachyons.
* They do not travel faster than light,
* nor do they travel backwards in time.

To travel backwards in time:
* You need negative energy.
* This is also the principle behind the Alcubierre Warp Drive.
* This would cause spacetime to move around an object, instead of the object through spacetime.
* There is no known way to form negative mass/energy:

This is not the same as antimatter, which is just ordinary positive-energy matter with opposite charges.
* Basically, you’d have to pump energy out of the bubbles.
* The excess energy generated would accumulate at the margins of the bubble, trying to get back in.
* When the bubble is allowed to collapse, it would be a giant explosion of radiation.
* If you had a way to direct this to one side, perhaps travel would be possible, leaving a radiation wake.
* Perhaps it would lead to a spike of radiation that pierced the ship, or whatever was in front of it.

That’s a theoretical exercise, which I don’t believe is likely to happen.
* We’re more likely to find a way to connect the quantum foam in different places (wormholes).
* Would a wormhole unravel spacetime, or collapse instantly?

Other thoughts:
* The margins of the universe are probably expanding at the speed of light.
* The volume grows more rapidly as time passes, even though the mass/energy is constant.
* Eventually, the universe will be so dispersed as to be useless (heat death = cold death).
* Even solid matter will disperse given enough time. Bosons trickle away, and atomic forces will decay.


Patching Drywall

I’ve got reasonable experience with it, but nowhere near pro level. I’ll fix things in my house, though I don’t like the painting part. It’s just… it takes longer than I want it to, so small jobs suck. Big jobs are fine, because you do all the first-pass things, then come back for second pass without having to wait too much.

Anyway, here are some things that come to mind when I think about doing this.

0. Preparation! Scrape, clean, and mask the area with twice as much effort as you think it deserves. Also, um, those clothes that “you probably won’t get anything on” will totally have splatters.

1. Don’t be afraid to peel back some of the paper backing to keep it from being humped too high.

2. For a big patch, finding a way to support it from inside the wall really helps stability. Strings tied around things can press from the inside while a weight hangs on the outside. If the hole is large (can you fit your hand and a putty knife through it?), you can also mount a support inside it (paper tape, mesh tape, wood strips, whatever is appropriate; just don’t mount tree branches in there), let it dry, then use that as backing when you come back to put in the plug.

3. Fiberglass mesh tape is sometimes so much better than paper tape. It’s strings! It takes several coats to cover up though.

4. Once it looks dry-ish, stop messing with it. Once it starts peeling up or crumbling, you really just have to scrape it all out and start over. You can spritz it with water before and during to keep it from drying too fast if needed.

5. Sometimes you have to do a little, let it dry for half a day, then come back for the next part. There are limits to how much can be done at once and not have it crack.

6. Use as wide of a putty knife as you can. If you have a 3mm hump spread over 2″, you will notice it. If it’s spread over 6″, maybe not. I have a 12″ mud knife, and have actually used it before.

7. Texture often needs to be thinned. Paint works better than pure water for this, because it’s sticky, and not as thin. 50% paint+texture is a good starting point for a crow’s foot brush.

8. Overlapping is your friend. When spraying orange peel texture, I start small, and adjust until the blob sizes look just a little smaller than I want. Then, I go back and forth, overlapping the edges, until I cannot see the true edge anymore.


PowerHA holds my disks

I did some testing and needed to document command syntaxen, even though I was not successful.
node01 / node02 – cannot remove EMC disks
apps are stopped

The fuser command will not detect processes that have mmap regions where that associated file descriptor has since been closed.

lsof | grep hdisk   ### nothing
fuser -fx /dev/hdisk2 ### nothing
fuser -d /dev/hdisk2 ### nothing
sudo filemon -O all -o 2.trc ; sleep 10 ; sudo trcstop   ### only shows hottest 2 disks

### Cannot remove disks after removing them from HA; this is related to this defect.
http://www-01.ibm.com/support/docview.wss?uid=isg1IV65140
/usr/es/sbin/cluster/events/utils/cl_vg_fence_term -c vgname

In PowerHA 7.1.3, with the shared VG varied off, and the
disk in closed state, rmdev may fail and return a
busy error, eg:

# rmdev -dl hdisk2
Method error (/usr/lib/methods/ucfgdevice):
0514-062 Cannot perform the requested function because
         the specified device is busy.

# cl_set_vg_fence_height
Usage: cl_set_vg_fence_height [-c] <volume group> [rw|ro|na|ff]

JDSD NOTE: The levels are:

  • rw = readwrite
  • ro = read only
  • na = no access
  • ff = fail access
jdsd@node01  /home/jdsd
$ sudo ls -laF /usr/es/sbin/cluster/events/utils/cl*fence*
-rwxr--r--    1 root     system        12832 Nov  7 2013  /usr/es/sbin/cluster/events/utils/cl_fence_vg*
-rwxr--r--    1 root     system        15624 Nov  7 2013  /usr/es/sbin/cluster/events/utils/cl_set_vg_fence_height*
-r-x------    1 root     system         5739 Nov  7 2013  /usr/es/sbin/cluster/events/utils/cl_ssa_fence*
-rwxr--r--    1 root     system        22508 Nov  7 2013  /usr/es/sbin/cluster/events/utils/cl_vg_fence_init*
-rwxr--r--    1 root     system         4035 Feb 26 2015  /usr/es/sbin/cluster/events/utils/cl_vg_fence_redo*
-rwxr--r--    1 root     system        15179 Oct 21 2014  /usr/es/sbin/cluster/events/utils/cl_vg_fence_term*


jdsd@node01  /home/jdsd
$ sudo ls -laF /usr/es/sbin/cluster/cspoc/cl*disk*
-r-x------    1 root     system       109726 Feb 26 2015  /usr/es/sbin/cluster/cspoc/cl_diskreplace*
-rwxr-xr-x    1 root     system        20669 Nov  7 2013  /usr/es/sbin/cluster/cspoc/cl_getdisk*
-r-x------    1 root     system       105962 Feb 26 2015  /usr/es/sbin/cluster/cspoc/cl_lsreplacementdisks*
-r-x------    1 root     system       103433 Feb 26 2015  /usr/es/sbin/cluster/cspoc/cl_lsrgvgdisks*
-rwxr-xr-x    1 root     system        12259 Feb 26 2015  /usr/es/sbin/cluster/cspoc/cl_pviddisklist*
-rwxr-xr-x    1 root     system         4929 Nov  7 2013  /usr/es/sbin/cluster/cspoc/cl_vg_non_dhb_disks*


jdsd@node01  /home/jdsd
$ sudo /usr/es/sbin/cluster/cspoc/cl_lsrgvgdisks
#Volume Group   hdisk    PVID             Cluster Node
#---------------------------------------------------------------------
caavg_private   hdisk38  00deadbeefcaff53 node01                        node01,node02 <not in a Resource Group>
datavg          hdisk22  00deadbeefca8643 node02                        node01,node02 demo_rg
datavg          hdisk23  00deadbeefca86f9 node02                        node01,node02 demo_rg
datavg          hdisk24  00deadbeefca8752 node02                        node01,node02 demo_rg
datavg          hdisk25  00deadbeefca87ac node02                        node01,node02 demo_rg
datavg          hdisk26  00deadbeefca880e node02                        node01,node02 demo_rg
datavg          hdisk27  00deadbeefca886c node02                        node01,node02 demo_rg
datavg          hdisk28  00deadbeefca88d7 node02                        node01,node02 demo_rg
datavg          hdisk29  00deadbeefca8965 node02                        node01,node02 demo_rg
datavg          hdisk30  00deadbeefca89c5 node02                        node01,node02 demo_rg
datavg          hdisk31  00deadbeefca8a52 node02                        node01,node02 demo_rg
datavg          hdisk32  00deadbeefca8ad2 node02                        node01,node02 demo_rg
datavg          hdisk33  00deadbeefca8b50 node02                        node01,node02 demo_rg
datavg          hdisk34  00deadbeefca8c26 node02                        node01,node02 demo_rg
datavg          hdisk35  00deadbeefca8c9a node02                        node01,node02 demo_rg
datavg          hdisk36  00deadbeefca8cf7 node02                        node01,node02 demo_rg
journalvg       hdisk37  00deadbeefca8d53 node02                        node01,node02 demo_rg


jdsd@node01  /home/jdsd
$ sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
Disk name:                      hdisk2
Disk UUID:                      1edeadbeefcafe04 b512d9e3b580fb13
Fence Group UUID:               0000000000000000 0000000000000000 - Not in a Fence Group
Disk device major/minor number: 18, 2
Fence height:                   2 (Read/Only)
Reserve mode:                   0 (No Reserve)
Disk Type:                      0x01 (Local access only)
Disk State:                     32785
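
The fence height is the one field that matters through the rest of these tests, so a small parser can save some squinting. This is a sketch only: it assumes the cl_getdisk output format shown above, and `parse_cl_getdisk` and `fence_height` are hypothetical helpers, not PowerHA tools.

```python
import re

# Sample copied from the cl_getdisk output above.
SAMPLE = """\
Disk name:                      hdisk2
Disk UUID:                      1edeadbeefcafe04 b512d9e3b580fb13
Fence Group UUID:               0000000000000000 0000000000000000 - Not in a Fence Group
Disk device major/minor number: 18, 2
Fence height:                   2 (Read/Only)
Reserve mode:                   0 (No Reserve)
Disk Type:                      0x01 (Local access only)
Disk State:                     32785
"""


def parse_cl_getdisk(text: str) -> dict[str, str]:
    """Split each 'label: value' line into a dict entry."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def fence_height(text: str):
    """Return (numeric level, label) from the 'Fence height' line, or None."""
    value = parse_cl_getdisk(text).get("Fence height", "")
    match = re.match(r"(\d+)\s+\((.+)\)", value)
    return (int(match.group(1)), match.group(2)) if match else None


print(fence_height(SAMPLE))
```

In practice the text would come from running cl_getdisk on each node and capturing its output, rather than from a pasted sample.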

Concurrent vg, so updating on node2 shows up on node1.

From node 2

sudo extendvg journalvg hdisk2 hdisk3 hdisk4 hdisk5 hdisk6 hdisk7 hdisk8 hdisk9 hdisk10 hdisk11 hdisk12
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk37
# Shows RW

From node 1

sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk37
# Shows RW

From node1

sudo /usr/es/sbin/cluster/events/utils/cl_set_vg_fence_height -c journalvg rw
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
# Shows RW

From node2

sudo reducevg journalvg hdisk2 hdisk3 hdisk4 hdisk5 hdisk6 hdisk7 hdisk8 hdisk9 hdisk10 hdisk11 hdisk12
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
# Shows RO
# OK, try again

From node 1

sudo mkvg -y dummyvg hdisk2 hdisk3 hdisk4 hdisk5 hdisk6 hdisk7 hdisk8 hdisk9 hdisk10 hdisk11 hdisk12
sudo varyoffvg dummyvg

From node 2

sudo importvg  -y dummyvg hdisk2
sudo /usr/es/sbin/cluster/events/utils/cl_set_vg_fence_height -c dummyvg rw
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
### Still RO
sudo /usr/es/sbin/cluster/events/utils/cl_vg_fence_term -c dummyvg
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
### Still RO
sudo varyoffvg dummyvg
sudo rmdev -Rl hdisk2

Both nodes

sudo exportvg dummyvg
sudo importvg -c -y dummyvg hdisk2
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
### Still RO
sudo /usr/es/sbin/cluster/events/utils/cl_set_vg_fence_height -c dummyvg rw
sudo /usr/es/sbin/cluster/events/utils/cl_vg_fence_init -c dummyvg rw hdisk2
cl_vg_fence_init[279]: sfwAddFenceGroup(dummyvg, 1, hdisk2): No such device
sudo chvg -c dummyvg
sudo varyonvg -n -c -A -O dummyvg
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk2
sudo /usr/es/sbin/cluster/cspoc/cl_getdisk hdisk3
### Still RO
sudo varyoffvg dummyvg

From Node 2
sudo rmdev -Rl hdisk2
Method error (/etc/methods/ucfgdevice):
        0514-062 Cannot perform the requested function because the
                 specified device is busy.

sudo /usr/es/sbin/cluster/events/utils/cl_vg_fence_redo -c dummyvg rw hdisk2
 /usr/es/sbin/cluster/events/utils/cl_vg_fence_redo: line 109: cl_vg_fence_init: not found
 cl_vg_fence_redo: Volume group dummyvg fence height could not be set to read/write

This is related to this defect, but later version:
http://www-01.ibm.com/support/docview.wss?uid=isg1IV52444

sudo su -
export PATH=$PATH:/usr/es/sbin/cluster/utilities:/usr/es/sbin/cluster/events/utils/:/usr/es/sbin/cluster/cspoc/:/usr/es/sbin/cluster/sbin:/usr/es/sbin/cluster
/usr/es/sbin/cluster/events/utils/cl_vg_fence_redo -c dummyvg rw hdisk2
 cl_vg_fence_init[279]: sfwAddFenceGroup(dummyvg, 11, hdisk2, hdisk3, hdisk4, hdisk5, hdisk6, hdisk7, hdisk8, hdisk9, hdisk10, hdisk11, hdisk12): No such device
 cl_vg_fence_redo: Volume group dummyvg fence height could not be set to read/write#
cd /dev
/usr/es/sbin/cluster/events/utils/cl_vg_fence_redo -c dummyvg rw hdisk2
 cl_vg_fence_init[279]: sfwAddFenceGroup(dummyvg, 11, hdisk2, hdisk3, hdisk4, hdisk5, hdisk6, hdisk7, hdisk8, hdisk9, hdisk10, hdisk11, hdisk12): No such device
 cl_vg_fence_redo: Volume group dummyvg fence height could not be set to read/write#

SIGH!

I give up. We will probably have to reboot.


Dell PowerEdge SC 440

This thing is still chugging along. Some of the ones for work needed motherboard caps replaced.
Most of them lost 2 of the SATA ports.
All of them are on replacement power supplies.

Well, I couldn’t remember why I didn’t have a quad core in this one, and I tried a Q9550, 12MB, Quad 2.83GHz.
Intel server boards based on the 3000 series chipset support those.

I forgot the BIOS limitations. Anything with a 533, 800, or 1066MHz FSB is fine (65nm process), which tops out at the Q6700, 8MB, Quad 2.66GHz. Nothing with a 1333 or 1600MHz FSB (45nm process) will work.

Some of the Core2 Extreme chips work in there too, but really, they just run hotter, without much practical difference. Only a very small program running entirely in cache would benefit.

RAM is 533 or 667 MHz DDR2

Anyway, this post is so when I go googling again next time, I’ll find it.

The Optiplex 755 still supports add-in video, as does a T100, and works with the 1333MHz FSB chips, so when this finally dies, that’s what will go in there.


Cleaning up Google space

If your Google quota looks to be filling up, and you’re considering buying more space, check your usage first.

Drive and Gmail both have Trash folders, and what’s in them counts against your quota. Drive doesn’t automatically delete things either. I found 3 year old cruft in there. YMMV

See your Google usage here:
https://www.google.com/settings/storage

Drive Trash is here:
https://drive.google.com/drive/trash

Gmail trash is here:
https://mail.google.com/mail/u/0/#trash
Alternate accounts will be /u/1, /u/2, etc.

Jumbo non-trash emails are here:
https://mail.google.com/mail/u/0/#search/larger_than%3A10mb

Videos and Movies in “Photos” are here:
https://photos.google.com/search/_tv_Videos
https://photos.google.com/movies

You cannot sort Photos by size,
but you can find them in your drive, sorted by size here:
https://drive.google.com/drive/quota


SATA chipset reference

The SIL3132 card (SATA-II, PCIe 1.0) ran at 122MB/sec.

The 88SE9128 card (SATA-III, PCIe 2.0) ran at 75MB/sec, or 35MB/sec with FIS disabled.

The 88SE9235 card runs at 195MB/sec.

My two test enclosures are:

  • SIL3726 based enclosure (RSV-5S)
  • 88SM9715 based enclosure (TR5M6G)

Test setup: Linux, MDADM, RAID6, sequential read, 256k blocks.

Ableton said I should go with a single SSD behind a JMS575 port multiplier to get best performance out of the 88SE9128.

I pointed out that a single drive is not the same as multiple (switching delays),
and that replacing all of my spinning disks with SSD is not a valid solution.


Tesla 3 Solar Roof

Not much detail yet, but my guess is it’s something like this:
http://onlinelibrary.wiley.com/doi/10.1002/adom.201400103/abstract

Imagine this:

  • Center layer contains IR fluorescing organic salts
  • Refractors bend the new IR out to the edges
  • Edges are high efficiency NIR photovoltaic cells.
  • Inside layer would be reflective coated on the outer face.

This would reduce IR ingress during sunny days, and convert IR and UV to electricity.
The black edges could also be monocrystalline PV cells in the visible spectrum.

I think you could expect a few hundred watts during a sunny TX day, which would be enough to keep things topped off between short commutes.


Bad Subnet Kills DHCPD

One, single bad IP in DHCPD config will kill the entire config file. :(

On an EdgeRouter, and probably anything with Ubiquiti, and maybe anything using the same config style (Brocade and others have the same command set)….

If you add a static reservation outside of the DHCP server’s subnet,
as in, if you typo one octet, or decide to do another subnet just because,
your DHCP server will be offline after reboot. No errors, just silently not serving.

It can be outside of the start/stop range, and that’s fine.

Really, this should give you a warning from the webUI, or it should just say “OKAY, We’ll let you hand out stupid IP addresses.” I mean, what if I wanted this to be my DHCP server, but I had a different router and subnet on the same segment?

From command line, you’ll see the error though:

admin@gw1# commit
[ service dhcp-server ]
Static DHCP lease IP '192.169.1.79' under mapping 'CustomerLaptop'
under shared network name 'LAN' is outside of the DHCP lease network '192.168.1.0/24'.
DHCP server configuration commit aborted due to error(s).
[edit]
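
Since the webUI won't warn you, the same check is easy to run yourself before committing. A minimal sketch using Python's standard `ipaddress` module: the function name is made up for this example, and the "Printer" entry is a hypothetical second lease added for contrast.

```python
import ipaddress


def leases_outside_subnet(subnet: str, static_leases: dict[str, str]) -> dict[str, str]:
    """Return the static leases whose IP is not inside the DHCP subnet."""
    net = ipaddress.ip_network(subnet)
    return {
        name: ip
        for name, ip in static_leases.items()
        if ipaddress.ip_address(ip) not in net
    }


bad = leases_outside_subnet(
    "192.168.1.0/24",
    {
        "CustomerLaptop": "192.169.1.79",  # the typo from the commit error above
        "Printer": "192.168.1.50",         # hypothetical, valid entry
    },
)
print(bad)
```

Anything this returns would make the commit abort, and would leave the DHCP server silently offline after a reboot if it got in through the webUI.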

Convert EXT3 to EXT4

### Change to EXT4 mount mode (OKAY before conversion)
vi /etc/fstab

### Reboot into single user mode
shutdown -r now
LILO: linux S

### Unmount or read-only every filesystem
umount -a
mount -oremount,ro /usr
mount -oremount,ro /

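### Convert every filesystem listed as ext4 in fstab to the new metadata format
### (tune2fs takes the device as an argument, not on stdin, so feed it through xargs)
awk '$1 !~ /^#/ && $3 == "ext4" {print $1}' /etc/fstab | xargs -n1 tune2fs -O extents,uninit_bg,dir_index

### Build the directory index and verify metadata
awk '$1 !~ /^#/ && $3 == "ext4" {print $1}' /etc/fstab | xargs -n1 fsck.ext4 -yfD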

### Reboot back to multiuser mode
shutdown -r now

### Convert all files in EXT4 filesystems to extent mode (they were block-mapped under EXT3)
for dir in $(mount | grep ext4 | cut -f 3 -d ' ') ; do LC_ALL=C find "$dir" -xdev -type d -print0 | LC_ALL=C xargs -r0 -P3 chattr +e ; done
for dir in $(mount | grep ext4 | cut -f 3 -d ' ') ; do LC_ALL=C find "$dir" -xdev -type f -print0 | LC_ALL=C xargs -r0 -P3 chattr +e ; done


apt sandbox permissions

Every repo was giving signature errors in apt:
Err:6 http://security.debian.org stretch/updates InRelease
  At least one invalid signature was encountered.

This was pretty recent. My updates in May were fine.
This ONLY affected apt* update. Not clean, install, purge, etc.

I could bypass the error by telling the sandbox to become root:
apt -o APT::Sandbox::User=root update

/tmp was still 1777. I did find /var/tmp was linked to /tmp, which killed dovecot install.
No idea why that’s a problem, because my /tmp is persistent across reboots.
A snotty developer somewhere indicated it was the end of the universe.
Now, /var/tmp is just part of /var. Whatever.

So, someone did a hard cleanup of cache, and that fixed it for me:
sudo apt-get clean
sudo mv /var/lib/apt/lists /tmp
sudo mkdir -p /var/lib/apt/lists/partial
sudo apt-get clean
sudo apt-get update

Then I compared /tmp/lists and /var/lib/apt/lists.
Exactly the same for everything, except top level permissions.
The old one was 755 and the new one is 750.
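
For the curious, this kind of comparison can be scripted instead of eyeballed. A rough Python sketch, assuming both trees are readable by whoever runs it; `mode_map` and `mode_differences` are names made up for this example.

```python
import os
import stat


def mode_map(root: str) -> dict[str, int]:
    """Map every path under root (relative; '.' is root itself) to its permission bits."""
    modes = {".": stat.S_IMODE(os.stat(root).st_mode)}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            modes[os.path.relpath(full, root)] = stat.S_IMODE(os.lstat(full).st_mode)
    return modes


def mode_differences(old_root: str, new_root: str) -> dict[str, tuple[str, str]]:
    """Return {path: (old mode, new mode)} for paths present in both trees
    whose permission bits differ."""
    old, new = mode_map(old_root), mode_map(new_root)
    return {
        path: (oct(old[path]), oct(new[path]))
        for path in sorted(old.keys() & new.keys())
        if old[path] != new[path]
    }


# Example call for the two directories compared above:
# print(mode_differences("/tmp/lists", "/var/lib/apt/lists"))
```

It only looks at permission bits, not ownership or file contents, so it is a narrower check than a full diff.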

WTF?!?!? Why do we care if “other” can read the package lists?
There is ZERO sensitive data in there?

I decided someone was intoxicated, watching Rick and Morty, making out with their significant other, and coding with their non-dominant hand, just to see if they could maintain focus on a dare.


FIXED – NotePad++ not saving

I FINALLY found out why NP++ was not saving my files properly. There’s a newish “Session snapshot & periodic backup” feature that saves a backup copy of all of your open but unsaved files, and any file changes. It also saves the current state when you exit NPP so if you close without saving, all of that is back.

However, it does not work properly. Once the backup interval passes, no further snapshots are saved, so whatever you had when you first created the file is all that will be saved. But, since the dirty flag is cleared, you cannot save the file normally either. Ctrl-S does nothing, silently. Closing a file does not warn you of unsaved changes. Closing NP++ does not warn you of unsaved files. You re-open, and it is back to what it was.

The way around this was to copy the contents, close the file, re-open the file then paste the contents, THEN save. OR, you could save as a new file.

But now that I know it’s this newish feature, I turned it off, and everything works properly.

Thanks to AdiranHHH from here:
http://stackoverflow.com/questions/24447786/notepad-doesnt-save-document-on-exit
And this open bug:
https://github.com/notepad-plus-plus/notepad-plus-plus/issues/337


PPC64 Linux on Intel

QEMU on Windows will run ppc64 and ppc64le emulation.
It emulates the same as what PowerKVM on an S812L would provide.
It’s kind of slow because there is no KVM module, AND Intel vs PPC,
AND emulator mode is single-core/proc/thread.

You can get Windows installer here:
https://qemu.weilnetz.de/

You really want ANSI/VT100 escape codes on your “cmd.exe” also:
https://github.com/adoxa/ansicon

To build a blank disk:
qemu-img create -f qcow2 qemu-disk-ppc64.img 32G

You can boot with this:
set SDL_STDIO_REDIRECT=NO
qemu-system-ppc64 -M type=pseries -m 1G,slots=4,maxmem=8G
  -cpu POWER8E -smp 1 -vga none -nographic
  -netdev user,id=net0 -device spapr-vlan,netdev=net0 
  -device spapr-vscsi -device scsi-hd,drive=drive0 
  -drive id=drive0,if=none,file=qemu-disk-ppc64.img
  -cdrom D:\Downloads\debian-testing-ppc64el-DVD-1.iso

The QEMU part is all one line. The cdrom image is up to you. I like Debian.

Other Notes:
If the cursor keys misbehave, use ctrl-i for TAB, and ctrl-n and ctrl-p for next/previous.

Emulation mode is flaky with more than one core.

There is a QEMU AIX build on PERZL.ORG which would be faster, especially for ppc64 BigEndian.

PowerKVM is just PPC Linux, QEMU, KVM, and LIBVIRT. KVM is just a kernel module for speed-up. LIBVIRT is just a GUI and CLI tool to build VM definitions. QEMU is the emulator. Works best on POWER8, with hypervisor disabled (OPAL mode).

QEMU still does not have enough RTAS and NVRAM to boot AIX. AIX hangs during “Starting AIX”, and Diags just says it’s an unsupported machine type. There is a little bit of dev for this, but not much.


iPhone, Garmin, Live Tracking

This is a write-up I made for a friend having problems with Garmin Live Tracking on an iPhone. It would get interrupted all the time, and show negative, or tiny percentages, of the real stats, though the map and track would look correct.

iPhone viruses / bugs:
iPhones generally don’t get malware unless they have been jailbroken / hacked.
This can only happen hands-on, and is not currently possible at the current OS version.

Any concerns can be fixed with a reinstall/restore of the phone.
I do this for any major upgrade (iOS 8 to iOS 9), but no more often.

This brings you to latest level, and replaces anything that got messed up.
Takes a couple of hours to finish the restore.

iPhone Restore / Reinstall:

  • Back up to iTunes, plugged in is best.
  • Disable your pin-code lock from Settings on the phone.
  • Do a restore from iTunes.
  • Wait for the OS install to finish (15-20 mins)
  • Answer the 5 “new phone” questions to get back to the home screen.
  • Re-Enable your pin-code lock
  • Re-Enroll your fingerprints if you use that
  • Wait for iTunes to finish restoring your apps and photos.

Here is info about the LiveTracking problem specifically:

The stats error:
This is a design issue with the garmin app. They really need to fix it.
Reference: https://forums.garmin.com/archive/index.php/t-329984.html

The stats fix:
Replace the livetrack exercise with an upload of the activity.

  • Finish the activity on the device.
  • Delete the bad one from the Connect app calendar.
  • Sync/Upload from your device.

Stability improvement:
The stability during tracking can be helped with:

  • Disable WiFi while livetracking.
  • Make sure all other apps are closed while livetracking
  • Make sure the phone has been hard rebooted in the last week or two.

General Garmin stuff that *may* help:

  • Update the Garmin Connect app from the App Store. Again.
  • Update Garmin Express. Mine doesn’t auto-update anymore.
  • Update Garmin device firmware. Maybe there is an unreleased version from support?
  • Clear off activities every week. You can save the files to dropbox, or upload to Garmin Connect, or both.
  • Maybe do a master reset as a last resort.

Master Reset of the Garmin Device
Plug in the USB cable
Copy all of the files off of your device.
Delete activities from the device
Unplug the USB cable
Power off the device if not already off
Hold Lap/Reset and Start/Stop buttons
Press power button
Wait for splash screen showing Garmin brand
Release Lap/Reset and Start/Stop buttons
Wait for power-up
Take outside for 5-20 minutes so it can get the initial satellite fix.
Power off the device
Plug in USB cable
Copy the settings.fit, totals.fit, and records.fit back to “NewFiles”
Unplug USB
Power on and make sure all of your settings are there.

Here are the things I have done that have helped my phone be less crashy in general:

#1 Limit what can use GPS in the background.

  • Settings -> Privacy -> Location Services -> Purple are running now or in the last few minutes. Grey are in the last day. Disable anything that should not be allowed. Keep garmin, strava, etc.

#2 Limit what can run in the background

  • Settings -> General -> Background App Refresh -> Disable anything that should never stay running when not up on the screen. Keep music, maps, chat/messenger, and similar enabled.

#3 Close apps when you’re not using them.
iOS 8 and later seem to have memory control issues. Lots of apps just get killed when they ask for memory, rather than being denied. If you close out everything first, then start the one GPS app, that often helps.

#4 Hard reboot your phone once every week or two.
I find sometimes my phone gets crashy, and only a hard reboot helps:

  • close all of the apps running – double-click home, then swipe or close from there
  • hold power button and swipe off when prompted
  • Power on with both power and home button held down at the same time.
  • Keep both buttons held down until the apple logo appears, then disappears again.
  • Normal power on with 1 second on power button.

unpacking .deb

Reminder to self:
Debian packages are stored in ar format, the same archive format used for static libraries.
http://www.tldp.org/HOWTO/Debian-Binary-Package-Building-HOWTO/x60.html
https://www.debian.org/doc/debian-policy/ap-pkg-binarypkg.html

ar -xv file.deb
This returns three files, in this specific order:
debian-binary # A small text file. Always “2.0\n” for now.
control.tar.gz # control, md5sums, and pre/post scripts
data.tar.gz # All of the filesystem bits that get deployed

Note also that data.tar can be .xz format as well.

There are dpkg tools for this (dpkg-deb), but all of this can be done manually for more control if desired.
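
A rough sketch of the manual route, so I don't have to think about it next time. The function name and the control/ and data/ directory layout are my own invention, and it assumes ar (binutils) plus a tar that auto-detects compression, which GNU tar does.

```shell
# Unpack a .deb by hand into <dest>/control and <dest>/data.
# Prints the format version from debian-binary when done.
unpack_deb() {
    deb=$1
    dest=$2
    mkdir -p "$dest/control" "$dest/data" || return 1
    # ar extracts into the current directory, so resolve the path first.
    case $deb in
        /*) abs=$deb ;;
        *)  abs=$PWD/$deb ;;
    esac
    ( cd "$dest" && ar -x "$abs" ) || return 1
    # Member names vary with compression: control.tar.gz, control.tar.xz, ...
    for f in "$dest"/control.tar*; do
        if [ -e "$f" ]; then tar -xf "$f" -C "$dest/control" || return 1; fi
    done
    for f in "$dest"/data.tar*; do
        if [ -e "$f" ]; then tar -xf "$f" -C "$dest/data" || return 1; fi
    done
    cat "$dest/debian-binary"
}
```

Something like `unpack_deb file.deb /tmp/unpacked` should leave the control files and maintainer scripts under control/ and the filesystem payload under data/.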


oslevel wrong

I always forget instfix and oslevel -rl….
tags: aix oslevel incorrect backlevel wrong upgrade update

When these things show nothing:
lppchk -v
oslevel -sl `oslevel -sq 2>/dev/null | head -1`

and your bos.rte.install and bos.mp64 show the correct level compared to:
https://www-304.ibm.com/support/docview.wss?uid=isg1fileset2063572681

You should see the correct level here as well:
oslevel -sq | head

Check these other two things.
oslevel -r -l `oslevel -rq 2>/dev/null | sed -n '1p'`
and
instfix -icqk 6100-09-06-1543 | grep ":-:"
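
To make the instfix output easier to read than a raw grep, something like this might help. It is a sketch only: the function name is made up, and it assumes the colon-separated layout is keyword:fileset:required:installed:status:abstract, with a status of "-" meaning down level. Check that against your own output before trusting it.

```shell
# Read `instfix -icqk <level>` output on stdin and list down-level filesets.
# Assumed field layout: keyword:fileset:required:installed:status:abstract
# A status of "-" is what the grep ":-:" matches.
downlevel_filesets() {
    awk -F: '$5 == "-" { printf "%s installed=%s required=%s\n", $2, $4, $3 }'
}
```

Usage would be `instfix -icqk 6100-09-06-1543 | downlevel_filesets`. No output means nothing is behind at that level.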


PowerHA Quickbuild

Because Facebook notes editor has zero formatting functionality in the new version.

####################################
### POWERHA QUICKBUILD - SANITIZED
####################################
This is a list of all the commands I'm using to build this cluster.
It's been sanitized of any customer information.


####################################
### Cleanup
####################################
clrmclstr
rmcluster -n MYCLUSTER
y | rmcluster -r hdisk2
rmdev -Rdl cluster0
/usr/sbin/rsct/bin/cthagsctrl -z
/usr/sbin/rsct/bin/cthagsctrl -d
echo "cthags 12348/udp" >> /etc/services
/usr/sbin/rsct/bin/cthagsctrl -a
/usr/sbin/rsct/bin/cthagsctrl -s
stopsrc -s clcomd ; sleep 2 ; startsrc -s clcomd
rm /var/hacmp/adm/* /var/hacmp/log/* /var/hacmp/clverify/* /usr/es/sbin/cluster/etc/auto_versync.pid
no -po nonlocsrcroute=1
no -po ipsrcrouterecv=1
shutdown -Fr now


####################################
### System config
####################################
# oslevel -s
7100-04-01-1543

# halevel -s
7.1.3 SP4

# emgr -P
PACKAGE INSTALLER LABEL
======================================================== =========== ==========
openssl.base installp 101a_fix
bos.net.tcp.client installp IV79944s1a
openssh.base.server installp IV80743m9a
openssh.base.client installp IV80743m9a
bos.net.tcp.client installp IV80191s1a
bos.rte.control installp IV80586s1a

# cat /etc/hosts
127.0.0.1 localhost
10.0.0.1 gateway
10.0.0.10 mycluster MYCLUSTER
10.0.0.11 node1
10.0.0.12 node2


####################################
### Cluster communication
####################################
echo node1 > /etc/cluster/rhosts
echo node2 >> /etc/cluster/rhosts
cat /etc/cluster/rhosts > /usr/es/sbin/cluster/etc/rhosts
echo 10.0.0.11 >> /usr/es/sbin/cluster/etc/rhosts
echo 10.0.0.12 >> /usr/es/sbin/cluster/etc/rhosts
echo 10.0.0.1 > /usr/es/sbin/cluster/netmon.cf
stopsrc -s clcomd ; sleep 2 ; startsrc -s clcomd
sleep 10
cl_rsh -n node1 date
cl_rsh -n node2 date

####################################
### Basic cluster build
####################################
export CLUSTER=MYCLUSTER
export NODES="node2 node1"
export HBPVID=deadbeefcafe1234
clmgr add cluster ${CLUSTER} NODES="$NODES"
clmgr modify cluster $CLUSTER REPOSITORY=$HBPVID HEARTBEAT_TYPE=unicast
cldare -rt


####################################
### Add the service address
####################################
/usr/es/sbin/cluster/utilities/claddnode -Tservice -Bmycluster -wnet_ether_01 # -zignore
cllsif
cldare -rt


####################################
### file collections
####################################
clfilecollection -o coll -c Configuration_Files -'' -'AIX and HACMP config files' yes yes
clfilecollection -o coll -c HACMP_Files -'' -'HACMP Resource Group Files' yes yes
clfilecollection -o time -c 10
clfilecollection -o coll -a User_Files 'System user config' yes yes
clfilecollection -o file -a User_Files /etc/passwd
clfilecollection -o file -a User_Files /etc/group
clfilecollection -o file -a User_Files /etc/security/passwd
clfilecollection -o file -a User_Files /etc/security/limits
clfilecollection -o file -a User_Files /.profile
clfilecollection -o file -a User_Files /etc/environment
clfilecollection -o file -a User_Files /etc/profile
clfilecollection -o file -a User_Files /etc/exports
clfilecollection -o file -a User_Files /etc/sudoers
clfilecollection -o file -a User_Files /etc/qconfig
clfilecollection -o file -l Configuration_Files
clfilecollection -o file -l HACMP_Files
clfilecollection -o file -l User_Files


####################################
### mail events
####################################
/usr/es/sbin/cluster/utilities/claddcustom -t event -n'mail_event' \
-I'mail out when event occurs' -v'/usr/local/cluster/mail_event'
for EVENT in `cat /usr/local/cluster/mail_event.list`; do
/usr/es/sbin/cluster/utilities/clchevent -O"$EVENT" \
-s /usr/es/sbin/cluster/events/$EVENT -b mail_event -c 0
done
/usr/es/sbin/cluster/utilities/clacdNM -MA -nLVM_IO_FAIL -p0 -lLVM_IO_FAIL -m/usr/local/cluster/LVM_IO_FAIL
/usr/es/sbin/cluster/utilities/claddserv -s'my_app' \
-b'/usr/local/cluster/APP_start.ksh' -e'/usr/local/cluster/APP_stop.ksh'
/usr/es/sbin/cluster/utilities/claddserv -s'my_dsmc' \
-b'/usr/local/cluster/DSMC_start.ksh' -e'/usr/local/cluster/DSMC_stop.ksh'
cllsserv


####################################
### Resource group
####################################
/usr/es/sbin/cluster/utilities/claddgrp -g 'myclster_rg' -n 'node2 node1' -S 'OFAN' -O 'FNPN' -B 'FBHPN'
cllsgrp


####################################
### Resources
####################################
/usr/es/sbin/cluster/utilities/claddres -g 'myclster_rg' SERVICE_LABEL='mycluster' \
APPLICATIONS='my_app my_dsmc' VOLUME_GROUP='prdappvg prdvg prdjrnvg' \
FORCED_VARYON='false' VG_AUTO_IMPORT='false' FILESYSTEM= FSCHECK_TOOL='fsck' \
RECOVERY_METHOD='sequential' PPRC_REP_RESOURCE='' FS_BEFORE_IPADDR='false' \
EXPORT_FILESYSTEM='' ERCMF_REP_RESOURCE='' MOUNT_FILESYSTEM='' \
NFS_NETWORK='' SHARED_TAPE_RESOURCES='' DISK='' AIX_FAST_CONNECT_SERVICES='' \
COMMUNICATION_LINKS='' MISC_DATA='' WPAR_NAME='' GMD_REP_RESOURCE='' SVCPPRC_REP_RESOURCE=''
cllsres
cllsres -g myclster_rg


####################################
### Application monitor
####################################
/usr/es/sbin/cluster/utilities/claddappmon MONITOR_TYPE=process name=my_dsmc_mon \
RESOURCE_TO_MONITOR=my_dsmc INVOCATION='longrunning' PROCESSES='dsm.opt.cluster' \
PROCESS_OWNER=root STABILIZATION_INTERVAL='60' RESTART_COUNT='3' FAILURE_ACTION='notify' \
INSTANCE_COUNT=1 RESTART_INTERVAL=360 NOTIFY_METHOD='/usr/local/cluster/mail_event' \
CLEANUP_METHOD='/usr/local/cluster/DSMC_stop.ksh' \
RESTART_METHOD='/usr/local/cluster/DSMC_start.ksh'
/usr/es/sbin/cluster/utilities/claddappmon name=my_app_mon \
RESOURCE_TO_MONITOR=my_app INVOCATION='both' MONITOR_TYPE=user \
STABILIZATION_INTERVAL=120 MONITOR_INTERVAL=120 \
RESTART_COUNT=3 RESTART_INTERVAL=800 FAILURE_ACTION=fallover \
NOTIFY_METHOD=/usr/local/cluster/mail_event FAILURE_ACTION='notify' \
CLEANUP_METHOD='/usr/local/cluster/APP_stop.ksh' \
RESTART_METHOD='/usr/local/cluster/APP_start.ksh' \
MONITOR_METHOD=/usr/local/cluster/APP_check.ksh HUNG_MONITOR_SIGNAL=9
cllsappmon
cllsappmon my_app_mon
cllsappmon my_dsmc_mon


####################################
### Sync all the changes
####################################
cldare -rt -C interactive


####################################
### Verify both nodes see it fine
####################################
cllsclstr
lscluster -m


####################################
### Start the cluster
####################################
smitty clstart

This is where it complains that hags is not up.
Rebooting does not bring up hags.
Start it manually, and it will die after 20 mins or so.
Very little logging.

EVERY time I try to mess with HA, it’s broken. It’s always something different. Such a pain. Truly, I don’t know why people do not just use their own scripts.


Owncloud filled /var/lib/mysql!

I installed owncloud, and set it to indexing a pile of files I wanted easier access to.

Well, /var filled, and the DB stopped. :o

I was on Debian Jessie (stable), and needed some updates to continue.

### Expand /var since I'm not ready to move /var/lib/mysql to its own filesystem
lvextend -L 16G /dev/rootvg/hd9
resize2fs /dev/rootvg/hd9
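
Since a full /var is what stopped the DB in the first place, a small check like this could warn before it happens again. A sketch only: the function names and the default threshold of 90 are my own choices, and it relies on the POSIX `df -P` layout where column 5 is the capacity.

```shell
# Print the used percentage (integer, no % sign) of the filesystem holding a path.
fs_pct_used() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed only while usage is below the threshold (second argument, default 90).
fs_has_room() {
    pct=$(fs_pct_used "$1")
    [ -n "$pct" ] && [ "$pct" -lt "${2:-90}" ]
}
```

For example, `fs_has_room /var 85 || echo "/var is getting full"` could go in cron.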


### Stop services using mysql
/etc/init.d/apache2 stop


### Dump all databases
mysqldump --all-databases --opt --routines --complete-insert -uroot -p | gzip -9 > /storage/test/mysqldump.2016-03-03.gz
-- Warning: Skipping the data of table mysql.event. Specify the --events option explicitly.
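
Before dropping anything, it is worth checking that the dump is actually usable. A minimal sketch, with a made-up function name: it tests the gzip stream, then looks for the "-- Dump completed" comment that mysqldump normally writes as its last line (it will be missing if comments were turned off, for example with --skip-comments or --compact).

```shell
# Sanity-check a gzipped mysqldump before anything is dropped.
verify_dump() {
    dump=$1
    # A truncated or corrupt archive fails here.
    gzip -t "$dump" 2>/dev/null || { echo "corrupt gzip: $dump" >&2; return 1; }
    # A dump that died partway will not have the completion marker.
    if gunzip -c "$dump" | tail -n 1 | grep -q '^-- Dump completed'; then
        echo "dump looks complete: $dump"
    else
        echo "no completion marker: $dump" >&2
        return 1
    fi
}
```

Here that would be `verify_dump /storage/test/mysqldump.2016-03-03.gz`. It does not prove the data is right, only that the file was written to the end.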


### Drop all databases except mysql and information_schema
cd /var/lib/mysql
tar -czvf /storage/test/mysql_var_minus_innodb.tgz [dm-z]*
mysql -u root -p
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| owncloud           |
| performance_schema |
| phpmyadmin         |
| roundcube          |
| test               |
+--------------------+
7 rows in set (0.00 sec)

mysql> drop database owncloud;
mysql> drop database performance_schema;
mysql> drop database phpmyadmin;
mysql> drop database roundcube;
mysql> drop database test;
mysql> SET GLOBAL innodb_fast_shutdown = 0;
mysql> exit

### Or for the brave
mysql -e "SELECT DISTINCT CONCAT ('DROP DATABASE ',TABLE_SCHEMA,' ;') FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA <> 'mysql' AND TABLE_SCHEMA <> 'information_schema';" | tail -n+2 | mysql -u root -p
mysql -e "SELECT table_name, table_schema, engine FROM information_schema.tables WHERE engine = 'InnoDB';"


### Stop mysql
/etc/init.d/mysql stop

### Remove the InnoDB files
rm /var/lib/mysql/ib*


### changed from jessie to stretch to get MySQL 5.6
### Not quite ready for MariaDB 1x
vi /etc/apt/sources.list
# Standard repo
deb http://ftp.us.debian.org/debian stretch main contrib non-free
deb-src http://ftp.us.debian.org/debian stretch main contrib non-free

### Volatile
deb http://ftp.debian.org/debian/ stretch-updates main contrib non-free
deb-src http://ftp.debian.org/debian/ stretch-updates main contrib non-free

### Debian Backports
deb http://http.debian.net/debian stretch-backports main

### security updates
deb http://security.debian.org/ stretch/updates main contrib non-free
deb-src http://security.debian.org/ stretch/updates main contrib non-free


####################################
apt-get update
apt-get install mysql-server-5.6
apt-get install mysql-server-5.6  ## going from jessie to stretch, so it was a little tweaky


### Increased log and memory size for mysql from defaults (log 25% of buffer pool)
### Changed to barracuda (supports compressed tables)
### Changed to one file per table for various reasons.
vi /etc/mysql/my.cnf
[mysqld]
# * InnoDB
# InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/.
# Read the manual for more InnoDB related options. There are many!
innodb_file_per_table = ON
innodb_file_format = barracuda
innodb_flush_method=O_DIRECT
innodb_log_file_size=256M
innodb_buffer_pool_size=1G
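
The 25% rule of thumb for the log size is just arithmetic, sketched here with a hypothetical helper name. Input and output are in MB.

```shell
# Suggest innodb_log_file_size as 25% of the buffer pool.
# Argument: buffer pool size in MB (1G = 1024).
log_file_size_mb() {
    echo $(( $1 * 25 / 100 ))
}
```

With the 1G buffer pool above, `log_file_size_mb 1024` gives 256, which matches the 256M setting.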


#####################################
### it recreates the IB files on start
/etc/init.d/mysql start


### Make sure barracuda is set for real
mysql -u root -p
mysql> set global innodb_file_format = 'Barracuda';
mysql> exit


### Import the dump
gunzip < /storage/test/mysqldump.2016-03-03.gz | mysql -u root -p


###########################################################################
###########################################################################
### Repair a problem with MySQL installer / conversion / upgrade
### See http://bugs.mysql.com/bug.php?id=67179
/* 
  temporary fix for problem with windows installer for MySQL 5.6.10 on Windows 7 machines.
  I did the procedure on a clean installed MySql, and it worked for me, at least it stopped
  lines of innodb errors in the log and the use of transient innodb tables. So, do it at
  your own risk..
  
  1. drop these tables from "use mysql":
     innodb_index_stats
     innodb_table_stats
	 slave_master_info
     slave_relay_log_info
     slave_worker_info
	 
  2. delete all .frm & .ibd of the tables above.
  
  3. run this file to recreate the tables above (source five-tables.sql).
  
  4. restart mysqld.
  
  Cheers, 
  CNL
*/

CREATE TABLE `innodb_index_stats` (
  `database_name` varchar(64) COLLATE utf8_bin NOT NULL,
  `table_name` varchar(64) COLLATE utf8_bin NOT NULL,
  `index_name` varchar(64) COLLATE utf8_bin NOT NULL,
  `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `stat_name` varchar(64) COLLATE utf8_bin NOT NULL,
  `stat_value` bigint(20) unsigned NOT NULL,
  `sample_size` bigint(20) unsigned DEFAULT NULL,
  `stat_description` varchar(1024) COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`database_name`,`table_name`,`index_name`,`stat_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0;

CREATE TABLE `innodb_table_stats` (
  `database_name` varchar(64) COLLATE utf8_bin NOT NULL,
  `table_name` varchar(64) COLLATE utf8_bin NOT NULL,
  `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `n_rows` bigint(20) unsigned NOT NULL,
  `clustered_index_size` bigint(20) unsigned NOT NULL,
  `sum_of_other_index_sizes` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`database_name`,`table_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0;

CREATE TABLE `slave_master_info` (
  `Number_of_lines` int(10) unsigned NOT NULL COMMENT 'Number of lines in the file.',
  `Master_log_name` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL COMMENT 'The name of the master binary log currently being read from the master.',
  `Master_log_pos` bigint(20) unsigned NOT NULL COMMENT 'The master log position of the last read event.',
  `Host` char(64) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '' COMMENT 'The host name of the master.',
  `User_name` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The user name used to connect to the master.',
  `User_password` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The password used to connect to the master.',
  `Port` int(10) unsigned NOT NULL COMMENT 'The network port used to connect to the master.',
  `Connect_retry` int(10) unsigned NOT NULL COMMENT 'The period (in seconds) that the slave will wait before trying to reconnect to the master.',
  `Enabled_ssl` tinyint(1) NOT NULL COMMENT 'Indicates whether the server supports SSL connections.',
  `Ssl_ca` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The file used for the Certificate Authority (CA) certificate.',
  `Ssl_capath` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The path to the Certificate Authority (CA) certificates.',
  `Ssl_cert` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The name of the SSL certificate file.',
  `Ssl_cipher` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The name of the cipher in use for the SSL connection.',
  `Ssl_key` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The name of the SSL key file.',
  `Ssl_verify_server_cert` tinyint(1) NOT NULL COMMENT 'Whether to verify the server certificate.',
  `Heartbeat` float NOT NULL,
  `Bind` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'Displays which interface is employed when connecting to the MySQL server',
  `Ignored_server_ids` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The number of server IDs to be ignored, followed by the actual server IDs',
  `Uuid` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The master server uuid.',
  `Retry_count` bigint(20) unsigned NOT NULL COMMENT 'Number of reconnect attempts, to the master, before giving up.',
  `Ssl_crl` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The file used for the Certificate Revocation List (CRL)',
  `Ssl_crlpath` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The path used for Certificate Revocation List (CRL) files',
  `Enabled_auto_position` tinyint(1) NOT NULL COMMENT 'Indicates whether GTIDs will be used to retrieve events from the master.',
  PRIMARY KEY (`Host`,`Port`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 STATS_PERSISTENT=0 COMMENT='Master Information';

CREATE TABLE `slave_relay_log_info` (
  `Number_of_lines` int(10) unsigned NOT NULL COMMENT 'Number of lines in the file or rows in the table. Used to version table definitions.',
  `Relay_log_name` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL COMMENT 'The name of the current relay log file.',
  `Relay_log_pos` bigint(20) unsigned NOT NULL COMMENT 'The relay log position of the last executed event.',
  `Master_log_name` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL COMMENT 'The name of the master binary log file from which the events in the relay log file were read.',
  `Master_log_pos` bigint(20) unsigned NOT NULL COMMENT 'The master log position of the last executed event.',
  `Sql_delay` int(11) NOT NULL COMMENT 'The number of seconds that the slave must lag behind the master.',
  `Number_of_workers` int(10) unsigned NOT NULL,
  `Id` int(10) unsigned NOT NULL COMMENT 'Internal Id that uniquely identifies this record.',
  PRIMARY KEY (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 STATS_PERSISTENT=0 COMMENT='Relay Log Information';

CREATE TABLE `slave_worker_info` (
  `Id` int(10) unsigned NOT NULL,
  `Relay_log_name` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `Relay_log_pos` bigint(20) unsigned NOT NULL,
  `Master_log_name` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `Master_log_pos` bigint(20) unsigned NOT NULL,
  `Checkpoint_relay_log_name` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `Checkpoint_relay_log_pos` bigint(20) unsigned NOT NULL,
  `Checkpoint_master_log_name` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `Checkpoint_master_log_pos` bigint(20) unsigned NOT NULL,
  `Checkpoint_seqno` int(10) unsigned NOT NULL,
  `Checkpoint_group_size` int(10) unsigned NOT NULL,
  `Checkpoint_group_bitmap` blob NOT NULL,
  PRIMARY KEY (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 STATS_PERSISTENT=0 COMMENT='Worker Information';
###########################################################################
###########################################################################
###########################################################################


### Regenerate performance_schema
mysql_upgrade --force -u root -p


### Make sure tables are okay
mysqlcheck -p


### Grow mysql temporary space to prevent:
#### ERROR 1034 (HY000): Incorrect key file for table 'oc_filecache'; try to repair it
lvextend -L 16G /dev/rootvg/hd1
resize2fs /dev/rootvg/hd1


### Set to compressed tables
# gzipped, the dump is 319MB, and deployed, the one table is 6GB, for read mostly data.
mysql -u root -p
mysql> alter table owncloud.oc_filecache ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;
mysql> exit


### Clean up free space
mysql -u root -p
mysql> OPTIMIZE TABLE owncloud.oc_filecache;
mysql> exit


#####################################
### fix roundcube since it was unhappy with some of the updates
apt-get install roundcube;


### Cleanup some old stuff amplified by partial updates
apt-get autoremove


### Reboot since we had a new dbus installed, and apache2 is still down
shutdown -fr now

bos.rte.security broken

In a couple of instances, I’ve found bos.rte.* filesets broken during upgrade, perhaps with the root part missing.

It’s always a pain, and I always forget how to fix it.

The problem is that the AIX base media does not include base install images for these. They are S (single) updates instead of I (install) images. This is because, during install, a bff called “bos” is laid down first, and that includes 10-20 core filesets, /usr, /, and all the core stuff. It’s basically a prototype mksysb. Sort of.

Anyway, in rare instances, when there is a known defect, IBM will release a fileset as a patch through support/ztrans to get you fixed. If you don’t have time to wait, or if you are a biz partner, working with a customer who hasn’t yet approved you using their support, then you might have to fix it yourself.

# install_all_updates -cYd /export/lppsource/AIX_7.1.4.1_TLSP
+-----------------------------------------------------------------------------+
                   BUILDDATE Verification ...
+-----------------------------------------------------------------------------+
Verifying build dates...
0503-466 installp: The build date requisite check failed for fileset     bos.rte.security.
Installed fileset build date is 1415.  Selected fileset does not have a build date, but one is required.
installp: Installation failed due to BUILDDATE requisite failure.

install_all_updates: Checking for recommended maintenance level 7100-04.
install_all_updates: Executing /usr/bin/oslevel -rf, Result = 7100-03
install_all_updates: ATTENTION, the system recommended maintenance level
does not correspond to the highest level known to install_all_updates.
For more details, execute /usr/bin/oslevel -rl 7100-04.

install_all_updates: Log file is /var/adm/ras/install_all_updates.log
install_all_updates: Result = FAILURE


# installp -acXYd /export/lppsource/AIX_7.1.4.0_Base/installp/ppc bos.rte.security
+-----------------------------------------------------------------------------+
                    Pre-installation Verification...
+-----------------------------------------------------------------------------+
Verifying selections...done
Verifying requisites...Verifying requisites...done
Results...

FAILURES
--------
  Filesets listed in this section failed pre-installation verification
  and will not be installed.

  Requisite Failures
  ------------------
  SELECTED FILESETS:  The following is a list of filesets that you asked to
  install.  They cannot be installed until all of their requisite filesets
  are also installed.  See subsequent lists for details of requisites.

    bos.rte.security 7.1.4.0                  # Base Security Function

  CONFLICTING REQUISITES:  The following filesets are required by one or
  more of the selected filesets listed above.  There are other versions of
  these filesets which are already installed (or which were selected to be
  installed during this installation session).  A base level fileset cannot
  be installed automatically as another fileset's requisite when a different
  version of the requisite is already installed.  You must explicitly select
  the base level requisite for installation.

    bos.64bit 7.1.4.0                         # Base Operating System 64 bit...
    bos.acct 7.1.4.0                          # Accounting Services
    bos.adt.include 7.1.4.0                   # Base Application Development...
    bos.mp64 7.1.4.0                          # Base Operating System 64-bit...
    bos.perf.libperfstat 7.1.4.0              # Performance Statistics Libra...
    bos.perf.perfstat 7.1.4.0                 # Performance Statistics Inter...
    bos.perf.proctools 7.1.4.0                # Proc Filesystem Tools
    bos.perf.tools 7.1.4.0                    # Base Performance Tools
    bos.pmapi.pmsvcs 7.1.4.0                  # Performance Monitor API Kern...
    bos.wpars 7.1.4.0                         # AIX Workload Partitions
    mcr.rte 7.1.4.0                           # Metacluster Checkpoint and R...
    perfagent.tools 7.1.4.0                   # Local Performance Analysis &...

  MISCELLANEOUS FAILING REQUISITES:  The following filesets are requisites
  of one or more of the selected filesets listed above.  Various problems
  associated with these requisites are preventing the selected filesets
  from installing.  See the "Requisite Failure Key" for failure reasons and
  possible recovery hints.

    < bos.rte.security 7.1.3.30               # Base Security Function

  Requisite Failure Key:
  "<" superseded fileset that is applied on the "usr" part which must
      also be applied on the "root" part for consistency.  Select this
      fileset explicitly or use the option to automatically include
      requisite software (-g flag).

  AVAILABLE REQUISITES:  The following filesets are requisites of one or
  more of the selected filesets listed above.  They are available on
  the installation media.  To install these requisites with the selected
  filesets, specify the option to automatically install requisite
  software (-g flag).

    bos.rte.control 7.1.4.0                   # System Control Commands
    bos.rte.libc 7.1.4.0                      # libc Library

  << End of Failure Section >>

+-----------------------------------------------------------------------------+
                   BUILDDATE Verification ...
+-----------------------------------------------------------------------------+
Verifying build dates...done
FILESET STATISTICS
------------------
    1  Selected to be installed, of which:
        1  FAILED pre-installation verification
  ----
    0  Total to be installed


Pre-installation Failure/Warning Summary
----------------------------------------
Name                      Level           Pre-installation Failure/Warning
-------------------------------------------------------------------------------
bos.rte.security          7.1.4.0         Requisite failure

# installp -acXYFd /export/lppsource/AIX_7.1.4.0_Base/installp/ppc bos.rte.security
+-----------------------------------------------------------------------------+
                    Pre-installation Verification...
+-----------------------------------------------------------------------------+
Verifying selections...

Pre-installation Failure/Warning Summary
----------------------------------------
0503-500 installp:  After completion of pre-installation processing,
        there were no installable base level filesets found on the
        installation media.  Note that use of the force install option
        (-F flag) will cause installp to consider only base level filesets
        (fileset updates will be ignored).  No installation has occurred.

So then I installed these, thinking maybe….

  bos.64bit 7.1.4.0                           # Base Operating System 64 bit...
  bos.acct 7.1.4.0                            # Accounting Services
  bos.adt.include 7.1.4.0                     # Base Application Development...
  bos.mp64 7.1.4.0                            # Base Operating System 64-bit...
  bos.perf.libperfstat 7.1.4.0                # Performance Statistics Libra...
  bos.perf.perfstat 7.1.4.0                   # Performance Statistics Inter...
  bos.perf.proctools 7.1.4.0                  # Proc Filesystem Tools
  bos.perf.tools 7.1.4.0                      # Base Performance Tools
  bos.pmapi.pmsvcs 7.1.4.0                    # Performance Monitor API Kern...
  bos.wpars 7.1.4.0                           # AIX Workload Partitions
  mcr.rte 7.1.4.0                             # Metacluster Checkpoint and R...
  perfagent.tools 7.1.4.0                     # Local Performance Analysis &...
  bos.rte.control 7.1.4.0                     # System Control Commands

But no joy. bos.rte.libc and bos.rte.security depend on each other, and the install still fails with the same errors as above.

<b># installp -acXYd /export/lppsource/AIX_7.1.4.0_Base/installp/ppc bos.rte.security</b>
+-----------------------------------------------------------------------------+
                    Pre-installation Verification...
+-----------------------------------------------------------------------------+
Verifying selections...done
Verifying requisites...Verifying requisites...done
Results...

FAILURES
--------
  Filesets listed in this section failed pre-installation verification
  and will not be installed.

  Requisite Failures
  ------------------
  SELECTED FILESETS:  The following is a list of filesets that you asked to
  install.  They cannot be installed until all of their requisite filesets
  are also installed.  See subsequent lists for details of requisites.

    bos.rte.security 7.1.4.0                  # Base Security Function

  MISCELLANEOUS FAILING REQUISITES:  The following filesets are requisites
  of one or more of the selected filesets listed above.  Various problems
  associated with these requisites are preventing the selected filesets
  from installing.  See the "Requisite Failure Key" for failure reasons and
  possible recovery hints.

    < bos.rte.security 7.1.3.30               # Base Security Function

  Requisite Failure Key:
  "<" superseded fileset that is applied on the "usr" part which must
      also be applied on the "root" part for consistency.  Select this
      fileset explicitly or use the option to automatically include
      requisite software (-g flag).

  AVAILABLE REQUISITES:  The following filesets are requisites of one or
  more of the selected filesets listed above.  They are available on
  the installation media.  To install these requisites with the selected
  filesets, specify the option to automatically install requisite
  software (-g flag).

    bos.rte.libc 7.1.4.0                      # libc Library

  << End of Failure Section >>

+-----------------------------------------------------------------------------+
                   BUILDDATE Verification ...
+-----------------------------------------------------------------------------+
Verifying build dates...done
FILESET STATISTICS
------------------
    1  Selected to be installed, of which:
        1  FAILED pre-installation verification
  ----
    0  Total to be installed


Pre-installation Failure/Warning Summary
----------------------------------------
Name                      Level           Pre-installation Failure/Warning
-------------------------------------------------------------------------------
bos.rte.security          7.1.4.0         Requisite failure



<b># installp -acXYd /export/lppsource/AIX_7.1.4.0_Base/installp/ppc bos.rte.libc</b>
+-----------------------------------------------------------------------------+
                    Pre-installation Verification...
+-----------------------------------------------------------------------------+
Verifying selections...done
Verifying requisites...Verifying requisites...done
Results...

FAILURES
--------
  Filesets listed in this section failed pre-installation verification
  and will not be installed.

  Requisite Failures
  ------------------
  SELECTED FILESETS:  The following is a list of filesets that you asked to
  install.  They cannot be installed until all of their requisite filesets
  are also installed.  See subsequent lists for details of requisites.

    bos.rte.libc 7.1.4.0                      # libc Library

  MISCELLANEOUS FAILING REQUISITES:  The following filesets are requisites
  of one or more of the selected filesets listed above.  Various problems
  associated with these requisites are preventing the selected filesets
  from installing.  See the "Requisite Failure Key" for failure reasons and
  possible recovery hints.

    < bos.rte.security 7.1.3.30               # Base Security Function

  Requisite Failure Key:
  "<" superseded fileset that is applied on the "usr" part which must
      also be applied on the "root" part for consistency.  Select this
      fileset explicitly or use the option to automatically include
      requisite software (-g flag).

  AVAILABLE REQUISITES:  The following filesets are requisites of one or
  more of the selected filesets listed above.  They are available on
  the installation media.  To install these requisites with the selected
  filesets, specify the option to automatically install requisite
  software (-g flag).

    bos.rte.security 7.1.4.0                  # Base Security Function

  << End of Failure Section >>

+-----------------------------------------------------------------------------+
                   BUILDDATE Verification ...
+-----------------------------------------------------------------------------+
Verifying build dates...done
FILESET STATISTICS
------------------
    1  Selected to be installed, of which:
        1  FAILED pre-installation verification
  ----
    0  Total to be installed


Pre-installation Failure/Warning Summary
----------------------------------------
Name                      Level           Pre-installation Failure/Warning
-------------------------------------------------------------------------------
bos.rte.libc              7.1.4.0         Requisite failure


<b># installp -acXYFd /export/lppsource/AIX_7.1.4.0_Base/installp/ppc bos.rte.libc</b>
+-----------------------------------------------------------------------------+
                    Pre-installation Verification...
+-----------------------------------------------------------------------------+
Verifying selections...

Pre-installation Failure/Warning Summary
----------------------------------------
0503-500 installp:  After completion of pre-installation processing,
        there were no installable base level filesets found on the
        installation media.  Note that use of the force install option
        (-F flag) will cause installp to consider only base level filesets
        (fileset updates will be ignored).  No installation has occurred.



<b># installp -acXYd /export/lppsource/AIX_7.1.4.0_Base/installp/ppc bos.rte.security bos.rte.libc</b>
+-----------------------------------------------------------------------------+
                    Pre-installation Verification...
+-----------------------------------------------------------------------------+
Verifying selections...done
Verifying requisites...Verifying requisites...done
Results...

SUCCESSES
---------
  Filesets listed in this section passed pre-installation verification
  and will be installed.

  Selected Filesets
  -----------------
  bos.rte.libc 7.1.4.0                        # libc Library
  bos.rte.security 7.1.4.0                    # Base Security Function

  Requisites
  ----------
  (being installed automatically;  required by filesets listed above)
  bos.rte.security 7.1.3.30                   # Base Security Function

  << End of Success Section >>

+-----------------------------------------------------------------------------+
                   BUILDDATE Verification ...
+-----------------------------------------------------------------------------+
Verifying build dates...
0503-466 installp: The build date requisite check failed for fileset     bos.rte.security.
Installed fileset build date is 1415.  Selected fileset does not have a build date, but one is required.
installp: Installation failed due to BUILDDATE requisite failure.




<b># installp -C</b>
0503-439 installp:  No filesets were found in the Software Vital
        Product Database that could be cleaned up.



<b># installp -c all</b>
+-----------------------------------------------------------------------------+
                        Pre-commit Verification...
+-----------------------------------------------------------------------------+
Verifying selections...done
Verifying requisites...done
Results...

WARNINGS
--------
  Problems described in this section are not likely to be the source of any
  immediate or serious failures, but further actions may be necessary or
  desired.

  Nothing to Commit
  -----------------
  There is nothing in the APPLIED state that needs to be committed.

  << End of Warning Section >>



<b># lslpp -h bos.rte.security</b>
  Fileset         Level     Action       Status       Date         Time
  ----------------------------------------------------------------------------
Path: /usr/lib/objrepos
  bos.rte.security
                  7.1.3.0   COMMIT       COMPLETE     07/25/14     09:44:45
                 7.1.3.15   COMMIT       COMPLETE     11/20/14     11:25:13
                 7.1.3.30   COMMIT       COMPLETE     12/08/14     02:47:41

Path: /etc/objrepos
  bos.rte.security
                  7.1.3.0   COMMIT       COMPLETE     07/25/14     09:44:45
                 7.1.3.15   COMMIT       COMPLETE     11/20/14     11:25:14
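
The mismatch is visible in the lslpp -h output above: 7.1.3.30 is recorded under /usr/lib/objrepos but not under /etc/objrepos. As a rough sketch (not an IBM tool), the awk filter below lists levels that show up in the usr part but not in the root part. The function name missing_root_levels is made up for illustration, and the parsing assumes the output layout shown above.

```shell
# Hypothetical helper (not part of AIX): reads `lslpp -h <fileset>` output on
# stdin and prints levels recorded under /usr/lib/objrepos that are absent
# from /etc/objrepos. Assumes the layout shown in the lslpp -h output above.
missing_root_levels() {
    awk '
        $1 == "Path:" { part = $2; next }            # remember which ODM we are in
        $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {    # history lines start with a level
            if (part == "/usr/lib/objrepos") usr[$1] = 1
            if (part == "/etc/objrepos")     root[$1] = 1
        }
        END { for (lvl in usr) if (!(lvl in root)) print lvl }
    ' | sort
}

# Intended use on the affected system:
#   lslpp -h bos.rte.security | missing_root_levels
```

Fed the output above, this should print only the level that lppchk later complains about.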



<b># installp -rBXJw bos.rte.security</b>
+-----------------------------------------------------------------------------+
                        Pre-reject Verification...
+-----------------------------------------------------------------------------+
Verifying selections...done
Verifying requisites...done
Results...

WARNINGS
--------
  Problems described in this section are not likely to be the source of any
  immediate or serious failures, but further actions may be necessary or
  desired.

  Not Rejectable
  --------------
  No software could be found installed on the system that could be rejected
  for the following requests:

    bos.rte.security

  (Possible reasons for failure:  1. the selected software has been
   committed, i.e., cannot be rejected, 2. the selected software is not
   installed, 3. the pre-reject script failed, or 4. a typographical
   error was made.)

  << End of Warning Section >>

FILESET STATISTICS
------------------
    1  Selected to be rejected, of which:
        1  FAILED pre-reject verification
  ----
    0  Total to be rejected


Pre-installation Failure/Warning Summary
----------------------------------------
Name                      Level           Pre-installation Failure/Warning
-------------------------------------------------------------------------------
bos.rte.security                          Failed pre-rejection check



<b># lppchk -vm3</b>
lppchk:  The following filesets need to be installed or corrected to bring
         the system to a consistent state:

  bos.rte.security 7.1.3.30               (usr: COMMITTED, root: not installed)


<b># installp -acXYFd /export/lppsource/AIX_7.1.4.0_Base/installp/ppc bos.rte.security bos.rte.libc</b>
+-----------------------------------------------------------------------------+
                    Pre-installation Verification...
+-----------------------------------------------------------------------------+
Verifying selections...

Pre-installation Failure/Warning Summary
----------------------------------------
0503-500 installp:  After completion of pre-installation processing,
        there were no installable base level filesets found on the
        installation media.  Note that use of the force install option
        (-F flag) will cause installp to consider only base level filesets
        (fileset updates will be ignored).  No installation has occurred.


The solution was ODM surgery.

First, I took a mksysb and copied it to somewhere safe (another server with NIM installed).

Then, I looked into the ODM, and found the product class in /etc/objrepos was missing the entry for 7.1.3.30.
You might be able to copy it from /usr/lib/objrepos, but I copied it from a valid clone of this system.

# export ODMDIR=/etc/objrepos
# ssh goodserver odmget -q lpp_name=bos.rte.security product | odmadd

Then, I needed to add the history line, which was identical between root and usr:

# odmget -q name=bos.rte.security lpp     (note the lpp_id; it was 39 on this system)
# ODMDIR=/usr/lib/objrepos odmget -q lpp_id=39 history | ODMDIR=/etc/objrepos odmadd

The “inventory” ODM is accessed with lpp_id also, but that had a long list of files already. I did not mess with any of that.
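
The "note the lpp_id" step can be scripted rather than read by eye. This is a minimal sketch, assuming odmget prints the usual stanza format with a line like `lpp_id = 39`. The function name lpp_id_from_stanza is hypothetical.

```shell
# Hypothetical helper: pull the numeric lpp_id out of an `odmget ... lpp`
# stanza read from stdin. Assumes the "field = value" stanza layout.
lpp_id_from_stanza() {
    awk '$1 == "lpp_id" && $2 == "=" { print $3; exit }'
}

# Intended use on the affected system (ODMDIR as exported earlier):
#   id=$(odmget -q name=bos.rte.security lpp | lpp_id_from_stanza)
#   echo "lpp_id is $id"
```

It may be worth running the same query against both /usr/lib/objrepos and /etc/objrepos and confirming the id matches before copying history records between them, since nothing here verifies that.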

Now, install_all_updates from my TLSP worked fine.


How to show respect when bestowing honors…

It’s great to announce milestones when employees reach a certain number of years. However, if you’re going to do this verbally, it’s important to find out from the person, or their manager, how to pronounce their name.

It’s not acceptable for a CEO or other executive to claim they are honoring someone, but to say “I’m sorry I don’t know how to pronounce these.” If some of the names are really too tough, it’s fine to send out a list via email, and maybe a temporary blurb on the company page. Even having someone else read the list who can pronounce names is acceptable.

Also, if your company is a conglomerate, it’s not okay for the executive to announce only people in the business unit that promoted them, when it’s a call for the entire company. The list really needs to be complete for the audience selected. It is entirely acceptable to thank only a specific unit when only addressing that unit. It’s entirely acceptable to put a list up somewhere and ask people to review it, as long as they are given access and time.

Further, communication really needs to be targeted. If you have several business units, do not spam XYZ with things only related to PDQ, and vice versa. Technical people for one product do not need, and do not want, sales information for other, mostly unrelated products. By the same token, sales people do not want, nor do they need, in-depth details about technical matters.

Lastly, when concerns about respect are brought up, it’s important to directly address them. Do not put them off to a later date, or assume they are okay. Put the issue on a list, and put follow up dates on your calendar. Make sure you understand the issue, and that it’s been resolved. Usually, it’s simply a communication error, or sometimes it’s a cultural difference.

Remember, honor and respect are key components. These little things are the pillars of any company. If their expression is hollow or incomplete, then what does that say about the foundation of your business?


cl_rsh fails

PROBLEM: On some migrations, we found the rpdomain would not stay running on one node.
The cluster was up, and SEEMED to operate normally, but errpt got CONFIGRM stop/start messages every minute.

lsrpdomain would show Offline, or “Pending online”.

lsrpnode would show:
2610-412 A Resource Manager terminated while attempting to enumerate resources for this command.
2610-408 Resource selection could not be performed.
2610-412 A Resource Manager terminated while attempting to enumerate resources for this command.
2610-408 Resource selection could not be performed.

On the other node, lsrpnode only showed itself, and lsrpdomain showed Online.

“cl_rsh node1 date” worked from both nodes
“cl_rsh node2 date” worked only from node2.
/etc/hosts, cllsif, hostname, /etc/cluster/rhosts… everything was spotless.
clcomd was running, even after refresh.
Same subnet, and ports were not filtered.
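
The cl_rsh checks above are easy to repeat from each node with a small loop. This is only a sketch: the function name check_cl_rsh and the CL_RSH override variable are made up for illustration, and the default path assumes cl_rsh lives in the usual PowerHA utilities directory.

```shell
# Path to cl_rsh; the default is an assumption about the install location.
CL_RSH=${CL_RSH:-/usr/es/sbin/cluster/utilities/cl_rsh}

# Hypothetical helper: try "cl_rsh <node> date" for each node given and
# report OK or FAIL per node. Returns non-zero if any node failed.
check_cl_rsh() {
    rc=0
    for node in "$@"; do
        if "$CL_RSH" "$node" date >/dev/null 2>&1; then
            echo "$node OK"
        else
            echo "$node FAIL"
            rc=1
        fi
    done
    return $rc
}

# Intended use, run once on every node in the cluster:
#   check_cl_rsh node1 node2
```

Running it on both nodes gives the full matrix. In the failing state described above, node2 would be expected to show FAIL only when the check is run from node1.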

Importing a snapshot said:
Warning: unable to verify inbound clcomd communication from node "node1" to the local node, "node2".

I applied PowerHA 7.1.3 SP4, but that did not fix it. I think this is a problem with clmigcheck or mkcluster in AIX.

SOLUTION
I saved a snapshot, blew away the cluster, and imported the snapshot.
/usr/es/sbin/cluster/utilities/clsnapshot -c -i -nmysnapshot -d "Snapshot before clrmcluster"
clstop -g -N
stopsrc -g cluster
clrmclstr
rmcluster -r hdisk10

# Note: one node's SSHd died here.

rmdev -dl cluster0
cfgmgr
cl_rsh works all the way around now.
/usr/es/sbin/cluster/utilities/clsnapshot -a -n'mysnapshot' -f'false'
cllsclstr ; lscluster -m ; lsrpdomain ; lsrpnode

works fine all around, before and after reboot.
Cluster starts normally.

Error Reference
---------------------------------------------------------------------------
LABEL: CONFIGRM_STOPPED_ST
IDENTIFIER: 447D3237

Date/Time: Tue Nov 24 04:18:36 EST 2015
Sequence Number: 42614
Class: O
Type: INFO
WPAR: Global
Resource Name: ConfigRM

Description
IBM.ConfigRM daemon has been stopped.

Probable Causes
The RSCT Configuration Manager daemon(IBM.ConfigRMd) has been stopped.

User Causes
The stopsrc -s IBM.ConfigRM command has been executed.

Recommended Actions
Confirm that the daemon should be stopped. Normally, this daemon should
not be stopped explicitly by the user.

Detail Data
DETECTING MODULE
RSCT,ConfigRMDaemon.C,1.25.1.1,219
ERROR ID

REFERENCE CODE

---------------------------------------------------------------------------
LABEL: CONFIGRM_MESSAGE_ST
IDENTIFIER: F475ABC7

Date/Time: Tue Nov 24 04:18:32 EST 2015
Sequence Number: 42613
Class: O
Type: INFO
WPAR: Global
Resource Name: ConfigRM

Detail Data
DETECTING MODULE
RSCT,ConfigRMGroup.C,1.337.1.1,6951

DIAGNOSTIC EXPLANATION
get_adapter_info_by_addr(192.168.0.12) FAILED rc=28
---------------------------------------------------------------------------
LABEL: CONFIGRM_MESSAGE_ST
IDENTIFIER: F475ABC7

Date/Time: Tue Nov 24 04:18:32 EST 2015
Sequence Number: 42612
Class: O
Type: INFO
WPAR: Global
Resource Name: ConfigRM

Detail Data
DETECTING MODULE
RSCT,ConfigRMGroup.C,1.337.1.1,6951

DIAGNOSTIC EXPLANATION
get_adapter_info_by_addr(192.168.0.12) FAILED rc=28
---------------------------------------------------------------------------
LABEL: CONFIGRM_MESSAGE_ST
IDENTIFIER: F475ABC7

Date/Time: Tue Nov 24 04:18:32 EST 2015
Sequence Number: 42611
Class: O
Type: INFO
WPAR: Global
Resource Name: ConfigRM

Detail Data
DETECTING MODULE
RSCT,ConfigRMGroup.C,1.337.1.1,6951

DIAGNOSTIC EXPLANATION
get_adapter_info_by_addr(10.0.0.12) FAILED rc=28
---------------------------------------------------------------------------
LABEL: CONFIGRM_MESSAGE_ST
IDENTIFIER: F475ABC7

Date/Time: Tue Nov 24 04:18:32 EST 2015
Sequence Number: 42610
Class: O
Type: INFO
WPAR: Global
Resource Name: ConfigRM

Detail Data
DETECTING MODULE
RSCT,ConfigRMGroup.C,1.337.1.1,6951

DIAGNOSTIC EXPLANATION
get_adapter_info_by_addr(192.168.0.11) FAILED rc=28
---------------------------------------------------------------------------
LABEL: CONFIGRM_MESSAGE_ST
IDENTIFIER: F475ABC7

Date/Time: Tue Nov 24 04:18:32 EST 2015
Sequence Number: 42609
Class: O
Type: INFO
WPAR: Global
Resource Name: ConfigRM

Detail Data
DETECTING MODULE
RSCT,ConfigRMGroup.C,1.337.1.1,6951

DIAGNOSTIC EXPLANATION
get_adapter_info_by_addr(192.168.0.11) FAILED rc=28
---------------------------------------------------------------------------
LABEL: CONFIGRM_MESSAGE_ST
IDENTIFIER: F475ABC7

Date/Time: Tue Nov 24 04:18:32 EST 2015
Sequence Number: 42608
Class: O
Type: INFO
WPAR: Global
Resource Name: ConfigRM

Detail Data
DETECTING MODULE
RSCT,ConfigRMGroup.C,1.337.1.1,6951

DIAGNOSTIC EXPLANATION
get_adapter_info_by_addr(10.0.0.11) FAILED rc=28
---------------------------------------------------------------------------
LABEL: CONFIGRM_PENDINGQUO
IDENTIFIER: A098BF90

Date/Time: Tue Nov 24 04:18:32 EST 2015
Sequence Number: 42607
Class: S
Type: PERM
WPAR: Global
Resource Name: ConfigRM

Description
The operational quorum state of the active peer domain has changed to PENDING_QUORUM.
This state usually indicates that exactly half of the nodes that are defined in the
peer domain are online. In this state cluster resources cannot be recovered although
none will be stopped explicitly.

Failure Causes
One or more nodes in the active peer domain have failed.
One or more nodes in the active peer domain have been taken offline by the user.
A network failure is disrupted communication between the cluster nodes.

Recommended Actions
Ensure that more than half of the nodes of the domain are online.
Ensure that the network that is used for communication between the nodes is functioning correctly.
Ensure that the active tie breaker device is operational and if it set to
'Operator' then resolve the tie situation by granting ownership to one of
the active sub-domains.

Detail Data
DETECTING MODULE
RSCT,PeerDomain.C,1.99.30.8,19713

---------------------------------------------------------------------------
LABEL: STORAGERM_STARTED_S
IDENTIFIER: EDFF8E9B

Date/Time: Tue Nov 24 04:17:53 EST 2015
Sequence Number: 42606
Node Id: node1
Class: O
Type: INFO
WPAR: Global
Resource Name: StorageRM

Detail Data
DETECTING MODULE
RSCT,IBM.StorageRMd.C,1.49,147

---------------------------------------------------------------------------
LABEL: CONFIGRM_ONLINE_ST
IDENTIFIER: 3B16518D

Date/Time: Tue Nov 24 04:17:52 EST 2015
Sequence Number: 42605
Node Id: node1
Class: S
Type: INFO
WPAR: Global
Resource Name: ConfigRM

Detail Data
DETECTING MODULE
RSCT,PeerDomain.C,1.99.30.8,24950

Peer Domain Name
mycluster
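
With entries repeating every minute, a count per label is quicker to read than the full listing. This is a minimal sketch that tallies the LABEL: lines from detailed errpt output on stdin. The function name count_errpt_labels is hypothetical, and the parsing assumes the "LABEL: name" layout shown in the entries above.

```shell
# Hypothetical helper: count error log entries per LABEL from detailed
# errpt output on stdin, most frequent first.
count_errpt_labels() {
    awk '$1 == "LABEL:" { n[$2]++ }
         END { for (label in n) print n[label], label }' |
        sort -k1,1nr -k2,2
}

# Intended use on the affected node:
#   errpt -a | count_errpt_labels
```

A steadily growing count for CONFIGRM_STOPPED_ST and the matching start entries would match the stop/start cycle described in the problem.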


clmigcheck bypass hitachi

With Hitachi disks, clmigcheck always says there are no matching disks.
This may be due to spaces in the udid, or other problems with the awk line in list_common_free_disks.
The easiest fix is to:
A) Verify you have absolutely picked the right disk. Set a PVID and remove/re-add it to make sure.
B) Comment out the “mv” line for the RESULT files.
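
To show why a space in the udid could trip up whitespace-based awk parsing, here is a small sketch. It does not reproduce the real awk line in list_common_free_disks, which I have not quoted here. The "disk name, then udid" record layout, the sample udid value, and both function names are made up for the demonstration.

```shell
# Naive: assumes the udid is a single whitespace-free token in field 2.
udid_naive() {
    awk '{ print $2 }'
}

# Safer: treat everything after the first field as the udid.
udid_rest_of_line() {
    awk '{ sub(/^[^ \t]+[ \t]+/, ""); print }'
}

# Invented sample record: disk name followed by a udid containing a space.
sample='hdisk10 240C50 0402OPEN-VHITACHIfcp'
printf '%s\n' "$sample" | udid_naive          # truncated at the first space
printf '%s\n' "$sample" | udid_rest_of_line   # full udid preserved
```

If the two nodes' udid values are compared after being truncated like this, or split into the wrong fields, disks that really are shared could fail to match.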

