Archive for the ‘EMC’ Category

With EMC PowerPath, it is often useful to check the status of the adapters and paths that PowerPath knows about.
Use the “powermt display” command for this.

EXAMPLE:

# powermt display
total emcpower devices: 94, highest device number: 93
==============================================================================
----------- Adapters -----------   --- Device Paths ---  -- Queued --
##  Switch   Name                    Summary      Total Closed  IOs   Blocks
==============================================================================
0  Enabled  sbus@1f/fcaw@0          Optimal      94    0       0     0
1  Enabled  sbus@1f/fcaw@2          Optimal      94    0       0     0
This shows that PowerPath sees two adapters (HBAs); each is enabled, carries 94 device paths with none closed, and reports an Optimal summary.
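As a rough sketch, the same check can be scripted by filtering the adapter rows. The function name powermt_bad_adapters is made up for this example, and the column positions (Switch in column 2, Summary in column 4) are assumed from the sample output above; other PowerPath versions may lay the columns out differently.

```shell
# Reads "powermt display" output on stdin and prints the number and name of
# every adapter whose Switch column is not "Enabled" or whose Summary column
# is not "Optimal". Prints nothing when all adapters look healthy.
# Assumption: adapter rows start with a number and have 8 fields, as in the
# sample output in this post.
powermt_bad_adapters() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 8 && ($2 != "Enabled" || $4 != "Optimal") { print $1, $3 }'
}
```

Typical use would be "powermt display | powermt_bad_adapters"; empty output means every adapter is Enabled and Optimal.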
To drill down to a single device, use the “powermt display dev=<dev#|cNtNdN|all>” command.

EXAMPLE:

# powermt display dev=c1t0d93
emcpower91: state=operational, policy=symm_opt, priority=0, IOs in progress=0
Symmetrix ID=0184502576
==============================================================================
----- Adapter -----    ------- Device Path -------  Serial  Queued Errs
## Name                  Mode    Link      ID       State  Number    IOs
==============================================================================
0 sbus@1f/fcaw@0        Active  c1t0d93   sd2681   Open   7611D000  0     0
1 sbus@1f/fcaw@2        Active  c2t1d93   sd2572   Open   7611D000  0     0
To see all devices, use the command “powermt display dev=all”.
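With many devices, the dev=all listing gets long. A minimal sketch of a summary filter follows; the function name powermt_paths_per_device is hypothetical, and it assumes the nine-column path rows shown in the example above (Mode in column 3, State in column 6).

```shell
# Reads "powermt display dev=..." output on stdin and prints, for each
# emcpower device, how many of its paths are both Active and Open,
# in the form "<device> <good>/<total>".
# Assumption: path rows start with the adapter number and have 9 fields.
powermt_paths_per_device() {
    awk '
        /^emcpower[0-9]+:/ { dev = $1; sub(/:$/, "", dev) }
        $1 ~ /^[0-9]+$/ && NF == 9 && dev != "" {
            total[dev]++
            if ($3 == "Active" && $6 == "Open") ok[dev]++
        }
        END { for (d in total) printf "%s %d/%d\n", d, ok[d], total[d] }
    '
}
```

For example: "powermt display dev=all | powermt_paths_per_device". The order of the output lines is not defined, so pipe through sort if that matters.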


Details:

As a supplement to TechAlert 274518 (VERITAS Volume Manager 3.5 MP2 may not failover to alternate paths making loss of data availability and data corruption possible), this TechNote illustrates examples of dynamic multipathing (DMP) not failing over to alternate paths and the procedure to recover from such errors.
System Configuration
This problem can occur either on an Active/Active (A/A) array such as EMC Symmetrix, or on an Active/Passive (A/P) array such as EMC CLARiiON. A CLARiiON array is used here with VERITAS Volume Manager ™ 3.5 MP2.

1. Volume layout
A simple 2-column stripe volume, using two LUNS belonging to a CLARiiON array:

v testvol - ENABLED ACTIVE 8382464 SELECT testvol-01 fsgen
pl testvol-01 testvol ENABLED ACTIVE 8382464 STRIPE 2/128 RW
sd d2-01 testvol-01 d2 0 4191232 0/0 EMC_CLARiiON0_9 ENA
sd d1-01 testvol-01 d1 0 4191232 1/0 EMC_CLARiiON0_18 ENA

2. File system
A VERITAS File System ™ file system resides on this volume :

# df
/testvol (/dev/vx/dsk/clariiondg/testvol): 3738854 blocks 467355 files

3. Multipathing
Each LUN has two paths; one LUN has its primary path on controller c2, and the other LUN has its primary path on controller c3:

# vxdisk list EMC_CLARiiON0_18
Device: EMC_CLARiiON0_18
devicetag: EMC_CLARiiON0_18
...
Multipathing information:
numpaths: 2
c2t50060169102041E8d1s2 state=enabled type=secondary
c3t50060160102041E8d1s2 state=enabled type=primary
# vxdisk list EMC_CLARiiON0_9
Device: EMC_CLARiiON0_9
devicetag: EMC_CLARiiON0_9
...
Multipathing information:
numpaths: 2
c2t50060169102041E8d10s2 state=enabled type=primary
c3t50060160102041E8d10s2 state=enabled type=secondary
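To see at a glance which controller carries the primary path of each LUN, the multipathing section can be filtered. This is only a sketch: the function name vxdisk_primary_paths is invented here, and it relies on the "Device:" and "state=... type=..." line formats shown in the output above.

```shell
# Reads one or more "vxdisk list <device>" outputs on stdin and prints
# "<device> <primary path> <state>" for every path marked type=primary.
vxdisk_primary_paths() {
    awk '
        $1 == "Device:" { dev = $2 }
        $2 ~ /^state=/ && $3 == "type=primary" { print dev, $1, $2 }
    '
}
```

For example: "vxdisk list EMC_CLARiiON0_18 | vxdisk_primary_paths".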

Examples of Failures
Two examples of path failures on the CLARiiON array are illustrated here.
1. Path failure due to transient failure such as a Service Processor (SP) going temporarily offline (for example, due to a reboot of the SP):
An excerpt of some of the SCSI errors recorded in syslog :

Feb 1 15:38:23 soliton.veritas.com scsi: [ID 107833 kern.warning] WARNING: /pci@9,600000/pci@2/SUNW,qlc@5/fp@0,0/ssd@w50060160102041e8,1 (ssd460):

Feb 1 15:38:23 soliton.veritas.com Error for Command: write(10) Error Level: Fatal

Feb 1 15:38:23 soliton.veritas.com scsi: [ID 107833 kern.notice] Requested Block: 367872 Error Block: 367872

Feb 1 15:38:23 soliton.veritas.com scsi: [ID 107833 kern.notice] Vendor: DGC Serial Number: 0100002F9CCL

Feb 1 15:38:23 soliton.veritas.com scsi: [ID 107833 kern.notice] Sense Key: Not Ready

Feb 1 15:38:23 soliton.veritas.com scsi: [ID 107833 kern.notice] ASC: 0x4 (<vendor unique code 0x4>), ASCQ: 0x3, FRU: 0x0

A little later, you will see vxio errors :

Feb 1 15:38:23 soliton.veritas.com vxio: [ID 663439 kern.warning] WARNING: vxvm:vxio: Subdisk d1-01 block 365568: Uncorrectable write error

Feb 1 15:38:23 soliton.veritas.com vxio: [ID 663439 kern.warning] WARNING: vxvm:vxio: Subdisk d1-01 block 365696: Uncorrectable write error

After this, the file system gets disabled, and you will see VERITAS File System errors similar to those below :

Feb 1 15:38:29 soliton.veritas.com vxfs: [ID 702911 kern.warning] WARNING: msgcnt 2 vxfs: mesg 037: vx_metaioerr - vx_logbuf_write - /dev/vx/dsk/clariiondg/testvol file system meta data write error in block 2573

Feb 1 15:38:29 soliton.veritas.com vxfs: [ID 702911 kern.warning] WARNING: msgcnt 3 vxfs: mesg 031: vx_disable - /dev/vx/dsk/clariiondg/testvol file system disabled

Feb 1 15:38:29 soliton.veritas.com vxfs: [ID 702911 kern.warning] WARNING: msgcnt 4 vxfs: mesg 017: vx_delbuf_flush - /testvol file system inode 4 marked bad incore

Feb 1 15:38:29 soliton.veritas.com vxfs: [ID 885974 kern.info] vxfs msgcnt 4 offset 0x00000000 41ed 4 0 1

...

Feb 1 15:38:29 soliton.veritas.com vxfs: [ID 214594 kern.info] vxfs msgcnt 4 offset 0x000000b0 0 0

Feb 1 15:38:29 soliton.veritas.com vxfs: [ID 702911 kern.warning] WARNING: msgcnt 5 vxfs: mesg 017: vx_delbuf_flush - /testvol file system inode 2880 marked bad incore

2. Error messages from permanent path failure (such as due to SAN cable failure):
An excerpt of some of the SCSI errors recorded in syslog :

Feb 1 16:55:47 soliton.veritas.com scsi: [ID 243001 kern.info] /pci@9,600000/pci@2/SUNW,qlc@5/fp@0,0 (fcp2):

Feb 1 16:55:47 soliton.veritas.com offlining lun=13 (trace=0), target=601200 (trace=2800004)

Feb 1 16:55:47 soliton.veritas.com scsi: [ID 243001 kern.info] /pci@9,600000/pci@2/SUNW,qlc@5/fp@0,0 (fcp2):

Feb 1 16:55:47 soliton.veritas.com offlining lun=12 (trace=0), target=601200 (trace=2800004)

A little later, you will see vxio errors :

Feb 1 16:55:47 soliton.veritas.com vxio: [ID 663439 kern.warning] WARNING: vxvm:vxio: Subdisk d1-01 block 4040960: Uncorrectable write error

Feb 1 16:55:47 soliton.veritas.com vxio: [ID 663439 kern.warning] WARNING: vxvm:vxio: Subdisk d1-01 block 4041984: Uncorrectable write error

After this, the file system gets disabled, and you will see VERITAS File System errors similar to those below :

Feb 1 16:55:47 soliton.veritas.com vxfs: [ID 702911 kern.warning] WARNING: msgcnt 1309 vxfs: mesg 037: vx_metaioerr - vx_logbuf_write - /dev/vx/dsk/clariiondg/testvol file system meta data write error in block 8852

Feb 1 16:55:47 soliton.veritas.com vxfs: [ID 702911 kern.warning] WARNING: msgcnt 1310 vxfs: mesg 031: vx_disable - /dev/vx/dsk/clariiondg/testvol file system disabled

Feb 1 16:55:47 soliton.veritas.com vxfs: [ID 702911 kern.warning] WARNING: msgcnt 1337 vxfs: mesg 037: vx_metaioerr - vx_inode_iodone - /dev/vx/dsk/clariiondg/testvol file system meta data write error in block 941184

...

Feb 1 16:55:47 soliton.veritas.com vxfs: [ID 702911 kern.warning] WARNING: msgcnt 1349 vxfs: mesg 017: vx_ilock - /testvol file system inode 782 marked bad incore

Feb 1 16:55:47 soliton.veritas.com vxfs: [ID 885974 kern.info] vxfs msgcnt 1347 offset 0x00000090 0 0 0 0

...

Feb 1 16:55:47 soliton.veritas.com vxfs: [ID 702911 kern.warning] WARNING: msgcnt 1350 vxfs: mesg 017: vx_ilock - /testvol file system inode 788 marked bad incore
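In both cases the vxio "Uncorrectable write error" lines are the quickest sign that I/O was failed back to the volume instead of being rerouted. A small sketch for counting them per subdisk is shown below; the function name vxio_errors_by_subdisk is made up, and the message format is taken from the excerpts above.

```shell
# Reads syslog text on stdin and prints "<subdisk> <count>" for every subdisk
# that logged vxio "Uncorrectable read/write error" messages.
vxio_errors_by_subdisk() {
    awk '
        /vxvm:vxio: Subdisk/ && /Uncorrectable (read|write) error/ {
            # The subdisk name is the word right after "Subdisk".
            for (i = 1; i < NF; i++) if ($i == "Subdisk") count[$(i + 1)]++
        }
        END { for (s in count) print s, count[s] }
    '
}
```

For example: "vxio_errors_by_subdisk < /var/adm/messages".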

System Status after the Failure
In either of the above cases, the end result is that the volume is in a DISABLED state. If a file system resided on that volume, that file system is no longer accessible. The procedure to recover from such a situation is the same in both cases above.
1. Assume that path c3 fails (either a transient or a permanent failure). After the single-path failure and subsequent sequence of errors shown in the above two cases, the volume goes into a DISABLED state :

v testvol - DISABLED ACTIVE 8382464 SELECT - fsgen

pl testvol-01 testvol DISABLED NODEVICE 8382464 STRIPE 2/128 RW

sd d2-01 testvol-01 d2 0 4191232 0/0 EMC_CLARiiON0_9 ENA

sd d1-01 testvol-01 d1 0 4191232 1/0 - NDEV

2. The vxdisk list command shows one LUN (EMC_CLARiiON0_18, which had c3 as its PRIMARY path) is in a FAILED state:

DEVICE TYPE DISK GROUP STATUS

EMC_CLARiiON0_9 sliced d2 clariiondg online

- - d1 clariiondg failed was:EMC_CLARiiON0_18

3. The df command will show that the file system is in an I/O error state:

# df -k /testvol

Filesystem kbytes used avail capacity Mounted on

df: cannot statvfs /testvol: I/O error

Recovery Procedure
1. First, unmount the file system:

# /usr/sbin/umount /testvol

(If this fails, use the “-f” option to force the unmount; since the volume is in a DISABLED state, it is safe to use this option at this point.)
2. Next, run the following command to force Volume Manager to rescan all paths:

# /usr/sbin/vxdctl enable

a. If a transient path error had occurred, and the path is fully functional now, Volume Manager will rediscover this path and re-enable it. All functional paths will be in an ENABLED state.

# vxdmpadm getsubpaths dmpnodename=EMC_CLARiiON0_18

NAME STATE PATH-TYPE CTLR-NAME ENCLR-TYPE ENCLR-NAME

====================================================================

c2t50060169102041E8d1s2 ENABLED SECONDARY c2 EMC_CLARiiON EMC_CLARiiON0

c3t50060160102041E8d1s2 ENABLED PRIMARY c3 EMC_CLARiiON EMC_CLARiiON0

b. If a permanent path failure had occurred, Volume Manager will correctly discover the loss of this path and will mark all non-functional paths as DISABLED.

# vxdmpadm getsubpaths dmpnodename=EMC_CLARiiON0_18

NAME STATE PATH-TYPE CTLR-NAME ENCLR-TYPE ENCLR-NAME

====================================================================

c2t50060169102041E8d1s2 ENABLED SECONDARY c2 EMC_CLARiiON EMC_CLARiiON0

c3t50060160102041E8d1s2 DISABLED PRIMARY c3 EMC_CLARiiON EMC_CLARiiON0

3. Use the following command to reattach the failed LUN:

# /etc/vx/bin/vxreattach

Once this command completes, the disk should now be in an ONLINE state:

DEVICE TYPE DISK GROUP STATUS

EMC_CLARiiON0_9 sliced d2 clariiondg online

EMC_CLARiiON0_18 sliced d1 clariiondg online

However, the volume is now in a DISABLED/RECOVER state :

v testvol - DISABLED ACTIVE 8382464 SELECT - fsgen

pl testvol-01 testvol DISABLED RECOVER 8382464 STRIPE 2/128 RW

sd d2-01 testvol-01 d2 0 4191232 0/0 EMC_CLARiiON0_9 ENA

sd d1-01 testvol-01 d1 0 4191232 1/0 EMC_CLARiiON0_18 ENA

4. Start the volume. Use the “-f” option to force start this volume:

# /usr/sbin/vxvol -f start testvol

The volume is now in an ENABLED/ACTIVE state:

v testvol - ENABLED ACTIVE 8382464 SELECT testvol-01 fsgen

pl testvol-01 testvol ENABLED ACTIVE 8382464 STRIPE 2/128 RW

sd d2-01 testvol-01 d2 0 4191232 0/0 EMC_CLARiiON0_9 ENA

sd d1-01 testvol-01 d1 0 4191232 1/0 EMC_CLARiiON0_18 ENA

5. If this was a raw volume, use the appropriate application utilities to check the consistency of the data in that volume.
If a file system resided on this volume, first check whether the file system consistency check reports any errors (use the “-n” option to check for errors without committing any changes):

# fsck -F vxfs -n /dev/vx/rdsk/clariiondg/testvol

If no errors are reported, proceed to step 7.
6. If the fsck utility reports errors, do the following:
- First, capture a metasave of the file system. See TechNote 208020 for details on how to download and use the metasave utility.
- After completing the metasave operation, run the fsck utility (with the “-o full” flag) on this file system:

# fsck -F vxfs -o full /dev/vx/rdsk/clariiondg/testvol

If this fails with an error, contact VERITAS Technical Support for further assistance.
7. If the above fsck checks completed successfully, you can now mount the file system:

# /usr/sbin/mount -F vxfs /dev/vx/dsk/clariiondg/testvol /testvol
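The clean path through the procedure (steps 1 to 5 and 7) can be sketched as a wrapper that only prints the commands by default. The names run, recover_volume and DRY_RUN are invented for this example. It also assumes that fsck -n signals problems through a non-zero exit status, which should be confirmed by reading its output; the metasave and full fsck of step 6 are deliberately left as manual work.

```shell
# run: print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# recover_volume DISKGROUP VOLUME MOUNTPOINT
# Walks through the recovery steps in order and stops at the first failure.
recover_volume() {
    dg=$1 vol=$2 mnt=$3
    run /usr/sbin/umount "$mnt" || return 1            # step 1
    run /usr/sbin/vxdctl enable || return 1            # step 2: rescan all paths
    run /etc/vx/bin/vxreattach || return 1             # step 3: reattach the failed LUN
    run /usr/sbin/vxvol -f start "$vol" || return 1    # step 4: force-start the volume
    # Step 5: read-only check. On failure stop here and follow step 6 by hand.
    run fsck -F vxfs -n "/dev/vx/rdsk/$dg/$vol" || return 1
    run /usr/sbin/mount -F vxfs "/dev/vx/dsk/$dg/$vol" "$mnt"   # step 7
}
```

Running "recover_volume clariiondg testvol /testvol" prints the six commands for review; setting DRY_RUN=0 executes them, which should only be done after checking the printed list against the system at hand.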


Today I ran into a problem while configuring the Hitachi HTC-USP1000 SAN devices presented to my host, which runs Solaris 8 and Veritas Volume Manager 3.5. There were 2 paths to each disk, but Veritas did not seem to realize that these disks were multipathed and showed 2 disks instead of one disk with two paths.

 

[The fact that Veritas shows the disks twice doesn't mean that multipathing isn't working. You can explicitly tell Veritas not to show the redundant disk aliases visible through either path (for instance, through vxdiskadm menu option 17 and then option 2 from the next menu). Note that you may need a reboot to make the excluded disks visible again if you change your mind.

The best way to check whether VxVM uses multipathing is vxdisk list DISKNAME; near the bottom you'll see "numpaths: 2", which means VxVM knows about both paths to this device. This still doesn't mean DMP works. Make sure the DMP restore daemon is running (vxdmpadm help, vxdmpadm stat restored) and check how long the DMP reconfiguration interval is. Then take the machine to single-user mode, unmount all VxVM volumes on the SAN, and start some sort of read-only stress test against the raw Vx volumes. While it is running, pull out one of the fibers; make sure the test is still running and that the only complaint you get is a message from DMP that one path is gone. Stick the fiber back in. Wait for the DMP recovery interval (usually 5 minutes), then pull out the second fiber. Ensure the test is still running, and stick the fiber back in.]

 

Later on I found the following solution for supporting Hitachi disks with VxVM DMP (Dynamic Multipathing):

http://support.veritas.com/docs/271477

Array Support Library for Hitachi TagmaStore Universal Storage Platform disk array and Sun StorEdge 9990 on VERITAS Volume Manager ™ for Solaris

Details:

This TechFile provides information about the array support library (ASL) for Hitachi TagmaStore (HTC) Universal Storage Platform (USP) disk array and Sun StorEdge 9990.
For general information about ASLs, refer to TechNote 249446 (link in the Related Documents section of this TechFile)
Package Name: Array Support Library for HTC USP
Package Version: 1.0,REV=08.06.2004.11.43
Supported versions of VERITAS Volume Manager: 3.2 and 3.5
Notes :
1. This library is not supported on Volume Manager 4.0.
2. This ASL will not support the premium features of the array: ECOPY and Hitachi Shadow Image (hardware mirror recognition).
For more information on ECOPY, refer to TechNote 265477 (link in the Related Documents section of this TechFile). For more information on Hitachi Shadow Image, contact Hitachi.
If you need support for the above premium features, you will need to upgrade to Volume Manager 4.0.
If you need the ASL for Volume Manager 4.0, look up TechFile 272025 for the download link (in the Related Documents section of this TechFile).

Supported Arrays: Hitachi USP and Sun StorEdge 9990 (Active/Active)
At the bottom of this TechFile, click on “Download Now” to download the ASL package
After downloading HTC-USP_271477.tar.Z, extract the file:
# zcat HTC-USP_271477.tar.Z | tar xvf -
Before adding any ASL package, make sure VERITAS Volume Manager ™ is installed and enabled (the vxdctl mode command should return “enabled”):
# vxdctl mode
mode: enabled
To install the package type:
# pkgadd -d . HTC-USP
After the package is installed, you must run the vxdctl enable command to claim the disk array as a Hitachi array:
# vxdctl enable
After running vxdctl enable, run the following commands to ensure that ASL is correctly installed. You should see libhtcusp.so in output of the vxddladm listsupport command:
# vxddladm listsupport | grep -i htcusp
libhtcusp.so A/A HITACHI OPEN-*
For additional information about ASLs, refer to Installing an Array Support Library (ASL) Guide for Solaris, TechNote 249325 (link in Related Documents section of this TechFile)

Additional Information
Package information:
# pkginfo -l HTC-USP
PKGINST: HTC-USP
NAME: Array Support Library for HTC USP
CATEGORY: system
ARCH: sparc
VERSION: 1.0,REV=08.06.2004.11.43
BASEDIR: /etc/vx
VENDOR: HITACHI
DESC: Array Support library for HTC USP
INSTDATE: Sep 21 2004 18:18
HOTLINE: Please contact Hitachi tech support
STATUS: completely installed
FILES: 4 installed pathnames
2 shared pathnames
2 directories
2 executables
40 blocks used (approx)

Download Now – 14 K
File Name: HTC-USP_271477.tar.Z
File Type: Driver
Click Below to Browse the FTP files by Product:
ftp.support.veritas.com/pub/support/products

Lala!!! That did the trick. DMP worked like a charm.


EMC PowerPath

Install the PowerPath package on Solaris (or another supported OS); it provides commands for managing disk and path access to EMC CLARiiON arrays.

The following manual procedure removes a ghost device (emcpower6 in this example):

# cfgadm -c configure c3
# powermt check
# rm /dev/*dsk/emcpower6*
# rm /devices/pseudo/emcp*6*
# powermt check
# powercf -q
# powermt display dev=all     # and check things are okay
# powermt save

Reboot if desired, to ensure the spurious device doesn't pop up again.
powercf -q patches /kernel/etc/emcp.conf and removes any spurious devices.
It should run at boot time, but somehow this manual procedure was needed
to remove the ghost device.


To display which HBAs are installed:

  • prtdiag -v
  • dmesg
  • cat /var/adm/messages | grep -i wwn | more

To set the configuration you must carry out the following:

  • changes to the /etc/system file
  • HBA driver modifications
  • Persistent binding (HBA and SD driver config file)
  • EMC recommended changes
  • Install the Sun StorEdge SAN Foundation package

Changes to /etc/system

SCSI throttle            set sd:sd_max_throttle=20
Enable wide SCSI         set scsi_options=0x7F8
SCSI I/O timeout value   set sd:sd_io_time=0x3c (with PowerPath)
                         set sd:sd_io_time=0x78 (without PowerPath)
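The sd_io_time values are given in hexadecimal and are measured in seconds. A quick way to read them is shell arithmetic; hex_to_seconds below is just an illustrative helper name.

```shell
# Print the decimal value of a hexadecimal constant such as 0x3c.
hex_to_seconds() {
    echo "$(($1))"
}

# hex_to_seconds 0x3c   prints 60  (timeout with PowerPath)
# hex_to_seconds 0x78   prints 120 (timeout without PowerPath)
```

So the timeout is 60 seconds with PowerPath and 120 seconds without it.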

Changes to HBA driver (/kernel/drv/lpfc.conf)

fcp-bind-WWNN=16
automap=2
fcp-on=1
lun-queue-depth=20
tgt-queue-depth=512
no-device-delay=1 (without PP/DMP) 0 (with PP/DMP)
xmt-que-size=256
scan-down=0
linkdown-tmo=0 (without PP/DMP) 60 (with PP/DMP)

Persistent Binding

Both the lpfc.conf and sd.conf files need to be updated. The general format of an sd.conf entry is:

name="sd" parent="lpfc" target=X lun=Y hba="lpfcZ";

X is the target number that corresponds to the fcp_bindWWNID lpfcZtX
Y is the LUN number that corresponds to the Symmetrix volume mapping on the Symmetrix port WWN, or to the HLU on the CLARiiON
Z is the lpfc driver instance number that corresponds to the fcp_bind_WWID lpfcZtX
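Typing one sd.conf line per LUN is tedious, so here is a rough generator sketch. The function name gen_sd_entries is made up, and the output assumes the usual driver.conf conventions (integer values unquoted, each entry terminated by a semicolon); compare it with the existing entries in your own sd.conf before using it.

```shell
# gen_sd_entries INSTANCE TARGET LUN_COUNT
# Prints sd.conf entries for LUNs 0 .. LUN_COUNT-1 on target TARGET of lpfcINSTANCE.
gen_sd_entries() {
    inst=$1 tgt=$2 luns=$3
    lun=0
    while [ "$lun" -lt "$luns" ]; do
        printf 'name="sd" parent="lpfc" target=%s lun=%s hba="lpfc%s";\n' "$tgt" "$lun" "$inst"
        lun=$((lun + 1))
    done
}
```

For example, "gen_sd_entries 0 1 4" prints four entries for target 1 on lpfc0. Review the lines by hand before appending them to /kernel/drv/sd.conf.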

To discover the SAN devices

  • disk;devlinks;devalias (Solaris 2.6)
  • devfsadm (Solaris 8)
  • /usr/sbin/update_drv -f sd (Solaris 9 and later)


The following packages need to be downloaded from http://www.emulex.com

  • solaris-3.1a12-6.11c-1b
  • lpfc-6.02f-sparc.tar
  • EmlxApps300a39-Solaris.tar
  • hd192a1.all

Patches required

Solaris 8

108528-29   SunOS 5.8: kernel update and Apache patch
117000-05   SunOS 5.8: Kernel Patch
117350-46   SunOS 5.8: kernel patch
111792-13   SunOS 5.8: PICL plugins patch
108974-54   SunOS 5.8: dada, uata, dad, sd, ssd and scsi drivers patch

Solaris 9

112233-12   SunOS 5.9: Kernel Patch
117171-17   SunOS 5.9: Kernel Patch
118558-39   SunOS 5.9: Kernel Patch
122300-08   SunOS 5.9: Kernel patch
113277-52   SunOS 5.9: sd and ssd drivers Patch
112834-06   SunOS 5.9: scsi patch

Solaris 10

118822-30   SunOS 5.10: Kernel Patch
125100-09   SunOS 5.10: Kernel patch
118833-36   SunOS 5.10: sd and ssd driver

1. Copy configuration files
# cp -p /kernel/drv/lpfc.conf /kernel/drv/lpfc.conf.date
# cp -p /kernel/drv/sd.conf /kernel/drv/sd.conf.date
# cp -p /kernel/drv/st.conf /kernel/drv/st.conf.date
# cp -p /etc/path_to_inst /etc/path_to_inst.date

2. Copy driver and firmware updates from the shared area to local disk
# mkdir /var/tmp/emulex
# cp -p /proj/gissmo/HBA/EMC/Emulex/* /var/tmp/emulex/

3. Shut down the server to single-user mode
# reboot -- -rs

4. Remove the HBAnyware package
# pkgrm HBAnyware

5. Remove the lpfc driver
# pkgrm lpfc

6. Copy back the saved path_to_inst file
# cp -p /etc/path_to_inst.date /etc/path_to_inst

7. Untar the files containing the driver and the Emulex Application Kit
# tar xvf solaris-2.1a18-6.02f-1a.tar
# tar xvf lpfc-6.02f-sparc.tar
# pkgadd -d .
# tar xvf EmlxApps300a39-Solaris.tar
# gunzip HBAnyware-*-sparc.tar.gz
# tar xvf HBAnyware-*-sparc.tar
# pkgadd -d .
Note: Select the package for HBAnyware

8. Revert the sd.conf file
# cp -p /kernel/drv/sd.conf /kernel/drv/sd.conf.post_upgrade
# cp -p /kernel/drv/sd.conf.date /kernel/drv/sd.conf

9. Convert the lpfc.conf file from version 5 to version 6
# /usr/sbin/lpfc/update_lpfc /kernel/drv/lpfc.conf.date /kernel/drv/lpfc.conf > /kernel/drv/lpfc.conf.updated
# cp -p /kernel/drv/lpfc.conf /kernel/drv/lpfc.conf_post_upgrade
# cp /kernel/drv/lpfc.conf.updated /kernel/drv/lpfc.conf

10. Reboot the system back into single-user mode
# reboot -- -rs

11. Copy firmware into /usr/sbin/lpfc
# cd /var/tmp/emulex
# unzip cd392a2.zip
# cp -p cd392a3.awc /usr/sbin/lpfc/

12. Update firmware
# cd /usr/sbin/lpfc
# ./lputil
> Select option 3 for Firmware Maintenance
> Select the adapter number to update
> Select option 1 for Load Firmware Image
> Type in the full name of the image: cd392a3.awc

Repeat the above steps for all Emulex HBAs.

13. Reboot into single-user mode and ensure that devices can be seen
# reboot -- -rs
# /etc/powermt display

14. Reboot the server
# reboot
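The ".date" suffix in steps 1, 6, 8 and 9 is presumably a placeholder for the current date; that reading is an assumption, not something the procedure states. Under that assumption, the backups in step 1 could be taken with a small helper like the sketch below (backup_with_date is a made-up name).

```shell
# backup_with_date FILE...
# Copies each file to FILE.YYYYMMDD, preserving mode and timestamps (cp -p).
# Stops and returns non-zero at the first copy that fails.
backup_with_date() {
    stamp=$(date +%Y%m%d)
    for f in "$@"; do
        cp -p "$f" "$f.$stamp" || return 1
    done
}
```

For example: "backup_with_date /kernel/drv/lpfc.conf /kernel/drv/sd.conf /kernel/drv/st.conf /etc/path_to_inst". The later steps would then refer to the same dated file names.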
