Thursday, January 4, 2018

Isilon firmware upgrade

Before you install a node firmware package, make sure that you have enough free
space in your /var directory.
To install a node firmware package successfully, you must have at least
250 MB of free space in the /var directory of every node you are working on.
You can check the free space in the /var directory by running the following command:
df -h
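As a minimal sketch, the 250 MB check can be scripted; the helper function and the sample figure below are illustrative (not an Isilon command), and the arithmetic assumes the available-space column of `df -k` (KiB):

```shell
#!/bin/sh
# Minimal sketch of the 250 MB /var check; check_var_free and the
# sample figure are illustrative, not an Isilon command.
min_kb=$((250 * 1024))   # 250 MB expressed in KiB

check_var_free() {
    # $1: available space in KiB, as in the 4th column of `df -k /var`
    [ "$1" -ge "$min_kb" ]
}

# On a live node you would feed it the real number, e.g.:
#   avail_kb=$(df -k /var | awk 'NR==2 {print $4}')
if check_var_free 300000; then
    echo "enough free space in /var"
else
    echo "insufficient free space in /var"
fi
```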

1. Download the new firmware package.
a. Visit EMC online support and download the latest firmware package.
b. Open a secure shell (SSH) connection to any node in the cluster and log in using
the "root" account.
c. Copy the firmware package to the /ifs/data directory on the cluster.
2. Install the firmware package. Depending on your version of OneFS, run one of the
following commands:
OneFS 8.0 or later
isi upgrade patches install IsiFw_Package_<versionnumber>.tar
Earlier than OneFS 8.0
isi pkg install IsiFw_Package_<version-number>.tar
The cluster displays a message stating that the firmware package was successfully
installed.
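The version split in step 2 can be sketched in shell; the script below only prints the command to run, and the version string and package filename are placeholders, not a real release:

```shell
#!/bin/sh
# Sketch: pick the install command by OneFS major version.
# onefs_ver and pkg are placeholders for illustration only.
onefs_ver="8.0.0.5"
pkg="IsiFw_Package_example.tar"

major="${onefs_ver%%.*}"   # leading component of the dotted version
if [ "$major" -ge 8 ]; then
    cmd="isi upgrade patches install $pkg"
else
    cmd="isi pkg install $pkg"
fi
echo "$cmd"
```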

After the node restarts, confirm that the NVRAM firmware matches the installed
firmware package. Depending on your version of OneFS, run one of the following
commands:
OneFS 8.0 or later
isi upgrade cluster firmware devices
Earlier than OneFS 8.0
isi firmware status
If the NVRAM firmware does not match the installed firmware package, you must
run the firmware update a second time. Depending on your version of OneFS, run one
of the following commands:
OneFS 8.0 or later
isi upgrade cluster firmware start
Earlier than OneFS 8.0
isi firmware update
update status

node1# while :; do
    clear; date
    isi upgrade view
    isi up no li | grep "LNN\|State" | paste - -
    echo
    isi up no fi pr li | awk '{if (($4 != "-" && $4 != $3) || ($5 != "-" && $4 != "-")) print}'
    sleep 30
done

Wed Nov 29 15:22:08 CST 2017

Upgrade Status:

   Cluster Upgrade State: committed
Current Upgrade Activity: Firmware
      Upgrade Start Time: 2017-11-28T14:18:45
   Upgrade Finished Time: 2017-11-28T14:28:25
      Current OS Version: 8.0.0.5_build(81)style(5)
      Upgrade OS Version: N/A

Nodes Progress:

     Total Cluster Nodes: 3
       Nodes On Older OS: 3
          Nodes Upgraded: 0
Nodes Transitioning/Down: 0

If there are any errors please run "isi_upgrade_logs" to gather more information.

             Node LNN: 1           Node Upgrade State: committed
             Node LNN: 2           Node Upgrade State: upgrade ready
             Node LNN: 3           Node Upgrade State: upgrade ready

Lnns  Device               Old Version             New Version             Status
------------------------------------------------------------------------------------
1     CMC_Yeti             02.07                   02.07                   upgraded
2     CMC_Yeti             02.05                   02.07                   upgrading
3     CMC_Yeti             02.05                   02.07                   -
------------------------------------------------------------------------------------
Total: 36

node1# isi upgrade patches uninstall IsiFw_Package_<versionnumber>.tar

node1# isi upgrade patches list
Patch Name Description Status
-----------------------------

-----------------------------

node1# cd /ifs/data/Isilon_Support


node1# vi PRSisiHealth

node1# perl PRSisiHealth -u 8.0.0.5

Dell EMC Remote Proactive Health Check            0.1163
Live Cluster Analysis                             Wed Nov 29 15:47:39 2017
Cluster Name                                      gucfs38d
Node Count                                        3
Current OneFS Version                             8.0.0.5
Target OneFS Version                              WARN
  WARN: OneFS target version 8.0.0.5 is less than the current OneFS version, performing checks with no target OneFS version.
OneFS Version                                     PASS
Highly Recommended Patches                        PASS
Cluster Capacity                                  PASS
Cluster Health Status                             FAIL
  FAIL: The cluster health is ATTN
  FAIL: Node 1 is reporting ATTENTION
  FAIL: Node 2 is reporting ATTENTION
  FAIL: Node 3 is reporting ATTENTION
  INFO: Refer to KB210505 (https://support.emc.com/kb/210505) for details.
Critical Events                                   FAIL
  FAIL: Critical event 2059 for node 2: External network link ext-1 (igb0) down
  FAIL: Critical event 2061 for node -1: External network link ext-1 (igb0) down
  FAIL: Critical event 2147 for node 3: External network link ext-1 (igb0) down
  FAIL: Critical event 2168 for node 1: External network link ext-1 (igb0) down
  INFO: Refer to KB210506 (https://support.emc.com/kb/210506) for details.
Jobs Status                                       PASS
System Partition Free Space                       PASS
Cluster Services                                  PASS
Processes                                         PASS
Node Uptime                                       PASS (0 days)
Upgrade Status                                    PASS
Hardware Status                                   PASS
BMC/CMC Hardware Monitoring                       PASS
Boot Disks Life Remaining                         PASS
Mirror Status                                     PASS
Memory                                            PASS
Drives Health                                     PASS
Drives Firmware (DFP 1.18/DSP 1.21)               INFO
  INFO: Model                          Firmware   DSP(1.21)  DFP(1.18)  Count Nodes
  INFO: HGST HUSMM1640ASS200           A204       -          -          3     1-3
  INFO: ST2000NM0055-1V4104            BL06       -          -          105   1-3
  INFO: Refer to KB210512 (https://support.emc.com/kb/210512) for details.
Node Firmware (10.1.1)                            PASS
Node Compatibility                                PASS
SmartConnect Service IP                           PASS
Duplicate Gateway Priority                        PASS
SyncIQ                                            PASS
Authentication Status                             PASS
Licenses                                          PASS
Access Zones                                      PASS (1)
Aspera                                            PASS
Cluster Encoding                                  PASS (utf-8)
Time Zone                                         PASS (Asia/Taipei)
DialHome & Remote Connectivity                    INFO
  INFO: Current states:
  INFO:    ConnectEMC is Disabled
  INFO:    ESRS is not enabled
ETAs                                              PASS
BXE Nodes                                         INFO (3)
  INFO: Nodes that have BXE interfaces: 1-3


UPGRADE ISSUE DETECTED

Monday, August 24, 2015

SafeSync 3.0: troubleshooting a management UI that fails to open after certificate import



1. In a single-server environment:
a. Log in to the server console and switch to root.
b. Go to the system path: root@appliance1:/opt/SingleInstaller/MgmtUI/SSL
c. List the files to check whether both "mgmt.key" and "mgmt.key " (the new key has a trailing space in its filename) are there.
d. The "mgmt.key " file (ending with a blank character) is the new one, so you need to purge the old one and rename the new one.
e. root@appliance1: rm "mgmt.key"   --->  delete the old key file
f. root@appliance1: mv "mgmt.key " "mgmt.key"  ----> rename the new key file to the correct name.
g. root@appliance1: supervisorctl restart mgmtui   -----> restart the management UI service to apply the change.
h. Sign in to the management console again in a browser and verify the HTTPS site.
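A hedged sketch of steps e-g as a script; the function name is illustrative, and it is parameterized on a directory so it can be tried safely (on the appliance the directory would be /opt/SingleInstaller/MgmtUI/SSL):

```shell
#!/bin/sh
# Sketch (illustrative helper): replace the old mgmt.key with the new
# "mgmt.key " file, whose name ends with a trailing space.
fix_mgmt_key() {
    dir="$1"
    if [ -f "$dir/mgmt.key " ]; then
        rm -f "$dir/mgmt.key"               # delete the old key file
        mv "$dir/mgmt.key " "$dir/mgmt.key" # rename the new key file
        # supervisorctl restart mgmtui      # then restart the management UI
    fi
}
```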

2. In an HA server environment:
Repeat the same steps on the SSFE server where you applied the new certificate and key file; the corresponding peer SSFE server does not have the defect.
Cheers.


Tuesday, May 19, 2015

SafeSync storage expansion (NFS)

0.1 > NFS (Client side)
Install nfs-common and portmap
 # apt-get install nfs-common portmap

Restart portmap
 # service portmap restart

Create a new mount point on the server
 # mkdir /storage/mogdata/dev12

Mount the new device on that mount point
 # mount -t nfs (serverIP):/tmp /storage/mogdata/dev12

Check the result
 # showmount -e (serverIP)

Change the owner of /storage/mogdata/dev12
 # chown www-data:mogstored  /storage/mogdata/dev12

Change the file mode of /storage/mogdata/dev12, granting the group write permission.
 # chmod g+w  /storage/mogdata/dev12

To get correct usage/free-space information
# vim /usr/local/share/perl/5.10.1/Mogstored/ChildProcess/DiskUsage.pm
  (line 58) change:
    my $rval = `df $gnu_df -l -k $path/$devnum`;
  to:
    my $rval = `df $gnu_df -k $path/$devnum`;
  (i.e. remove the -l parameter)
# /etc/init.d/mogstored restart
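The DiskUsage.pm edit can also be applied non-interactively; a sketch with GNU sed (the function name is illustrative, and it keeps a backup; on the server the file is the DiskUsage.pm path above):

```shell
#!/bin/sh
# Sketch (illustrative helper): remove the -l flag from the df call in
# DiskUsage.pm so df also reports NFS-mounted devices.
remove_local_flag() {
    f="$1"
    cp "$f" "$f.bak"   # keep a backup before editing
    sed -i 's/df \$gnu_df -l -k/df \$gnu_df -k/' "$f"
}
```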

# mogadm --trackers=tracker1:6001 device add osdp-store1 12 --status=alive

Check the result
 # mogadm check

Checking trackers...
  tracker1:6001 ... OK

Checking hosts...
  [ 1] osdp-store1 ... OK

Checking devices...
 host device         size(G)    used(G)    free(G)   use%   ob state   I/O%
 ---- ------------ ---------- ---------- ---------- ------ ---------- -----
 [ 1] dev11            7.027      5.590      1.437  79.55%  writeable   0.0
 [ 1] dev12            7.472      0.018      7.454   0.24%  writeable   0.0
 [ 1] dev13            7.027      5.510      1.517  78.41%  writeable   0.0
              total:   21.526     11.118     10.408  52.73%



Add the NFS mount command to /etc/rc.local (auto-mount the NFS share on reboot)
mount -t nfs (server IP):/tmp /storage/mogdata/dev13
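A sketch of adding that line idempotently, so repeated runs do not duplicate it; the helper name is illustrative and "nfsserver" is a placeholder for the real server IP from the steps above:

```shell
#!/bin/sh
# Sketch (illustrative helper): append a line to a file only if it is
# not already present.
append_once() {
    file="$1"; line="$2"
    grep -qxF "$line" "$file" || echo "$line" >> "$file"
}

# On the server ("nfsserver" is a placeholder for the real server IP):
# append_once /etc/rc.local "mount -t nfs nfsserver:/tmp /storage/mogdata/dev13"
```

If your rc.local ends with "exit 0", place the mount line before it rather than appending at the end.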

Thursday, December 25, 2014

SAN switch configuration backup and restore

1. Set up an FTP server with IP address 192.168.1.101, user: root, password: root

2. Log in to the SAN switch

Back up the SAN switch configuration
switch:admin> configUpload
Server Name or IP Address [citadel]: 192.168.1.101
User Name [None]: root
File Name [config.txt]: config.txt
Protocol (RSHD or FTP) [FTP]: ftp
Password: xxxxxx
upload complete
 
Restore the SAN switch configuration
switch:admin> switchDisable
switch:admin> configDownload
Server Name or IP Address [citadel]: 192.168.1.101
User Name [None]: root
File Name [config.txt]: config.txt
Protocol (RSHD or FTP) [FTP]: ftp
Password: xxxxxx
download complete
switch:admin>switchEnable

Monday, March 31, 2014

AX4 buffer I/O error

The AX4 is EMC's entry-level storage array, so its controller mode is active/passive. When a Linux server boots and loads the HBA driver, it probes the hardware; when it probes controller B while controller A owns the LUN, B does not respond to the Linux server, so a buffer I/O error appears. Once multipath or PowerPath loads, the assigned LUNs can be accessed normally, so this error message can be ignored.


Root Cause
- Storage arrays in a SAN are generally set up in a redundant fashion such that hosts can access logical units (LUNs) over one of many different paths.  Typically these arrays operate in one of two different modes: active/active or active/passive.  With an active/active array, I/O can be sent down any one of the paths to a LUN and it will be processed by that controller. With active/passive arrays, one controller is considered the primary for each LUN, while the other controller is a backup.  Some of these arrays will accept I/O for a LUN over the backup controller, but it will not be optimized (i.e. worse performance).  However other active/passive arrays will not accept any I/O on the backup controller for a LUN, and thus any commands sent to it will result in an I/O error.
- In RHEL, there are a number of different commands and utilities that can send I/O to different devices, such as LVM, udev, fdisk, etc, not to mention applications such as databases, web servers, etc.  If any of these were to issue I/O to a passive path on an array that does not accept it, it would cause an I/O error in the logs.  The messages are harmless and do not indicate a problem, but they may fill up the logs or cause unwarranted concern.  As a result, some may wish to try to avoid these errors by preventing applications from accessing the passive paths.  Typically, filtering devices out from LVM will cause the majority of these errors to go away.  Likewise, avoiding commands like 'fdisk -l' that scan all devices can reduce their frequency.  Finally, configuring any user applications that scan or access multiple devices to only access the appropriate active path or the logical multipath device (/dev/mapper/mpath*, /dev/emcpower*, /dev/sddlma*, etc) can cut down on the errors as well.

Resolution

Note: The following applies only to I/O errors caused by accessing passive paths.  See the Root Cause and Diagnostic Steps for more information on determining whether this applies to your environment.
- One way to cut down on the number of spurious I/O errors in the system logs is to avoid scanning passive paths with LVM commands.  This can be done with a filter in /etc/lvm/lvm.conf that only scans devices from device-mapper-multipath, EMC PowerPath, Hitachi HDLM, or another multipath solution, and avoids the underlying SCSI device nodes.
- I/O errors may be caused by any utility or program that accesses passive storage paths, so it may be necessary to configure or run them in such a way that avoids these devices.  For instance, rather than using 'fdisk -l', specify an individual device such as 'fdisk -l /dev/mapper/mpatha'.
- Some storage arrays, such as the EMC CLARiiON, offer an option to enable a type of active/active mode known as ALUA.  With ALUA, path groups are established with different priorities.  Multipath software such as device-mapper-multipath will recognize these path groups and send I/O to the higher priority paths, but if I/O does end up going down a passive path it may not generate an I/O error.  If your array supports such a mode, enabling it may prevent these I/O errors.  This different access method generally requires a configuration change in the multipath software as well.
Note: I/O errors caused by unintentional access to passive paths are not harmful and should not cause any issues on a system.  They can be safely ignored.
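As an illustration of the LVM filter mentioned above, a sketch of an /etc/lvm/lvm.conf filter line that accepts only multipath pseudo-devices and rejects everything else (the device-name patterns are examples; adjust them to your environment):

```
filter = [ "a|^/dev/mapper/mpath.*|", "a|^/dev/emcpower.*|", "r|.*|" ]
```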

Monday, March 17, 2014

Provisioning a new LUN on Linux

After assigning a new LUN to a Linux server, fdisk -l may not show it. In that case, trigger a port scan on the HBA.
Command: echo "1" > /sys/class/fc_host/host3/issue_lip (single port; for a dual-port HBA also scan /host4)
Then run fdisk -l again to see the newly assigned LUN.
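Instead of hard-coding host3/host4, the scan can loop over every FC host port; a sketch (the function name is illustrative, and it is parameterized on the sysfs base directory so it can be tried safely; on a real host the default /sys/class/fc_host applies and root is required):

```shell
#!/bin/sh
# Sketch (illustrative helper): issue a LIP on every FC host port
# found under the given sysfs directory.
scan_fc_hosts() {
    base="${1:-/sys/class/fc_host}"
    for h in "$base"/host*; do
        if [ -e "$h/issue_lip" ]; then
            echo "1" > "$h/issue_lip"
        fi
    done
}

# On a live host:
# scan_fc_hosts
# fdisk -l   # then re-check for the new LUN
```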

Monday, December 23, 2013

iorate

Usage:
1. vi devices.ior (put this file on the disk under test)
 # File system example
 Device = "/tmp/iorate.tst"  capacity 8GB;

 # UNIX raw device example
 #Device = "/dev/rdsk/c901t0d0s2"  capacity 4.1GB;

2. more patterns.ior (review the R/W test patterns)
Pattern 10 = "8k Random Read"     io size 8KB   random     read;
Pattern 11 = "8k Seq Write"       io size 8KB   sequential write max sequential 10000;
Pattern 12 = "8k Random Write"    io size 8KB   random     write;

3. vi tests.ior (read/write test: 8GB size, 8k blocks, R:W = 7:3, 5 minutes)
Test  1 = " 8k Bus Test"  for 300 sec ignore 30 sec size 8GB 70% pat  10, 30% pat 12;
#Test  2 = " 8k Bus Test"  for 130 sec ignore 30 sec size 1MB 100% pat  9;

#Test  3 = "64k Bus Test"  for 130 sec ignore 30 sec size 1MB 100% pat 21;

4. time dd if=/dev/zero of=/tmp/iorate.tst bs=8k count=1179648 (generate the test file; it must be larger than 8GB)

5. ./iorate -p patterns.ior (run the I/O test)

6. ./gen_sums (convert the test results to an xls file)
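The dd count in step 4 can be sanity-checked with shell arithmetic: 1179648 blocks of 8 KiB is 9 GiB, which satisfies the larger-than-8GB requirement.

```shell
#!/bin/sh
# Sanity check for step 4: 1179648 blocks * 8 KiB = 9 GiB > 8 GB.
blocks=1179648
bytes=$((blocks * 8 * 1024))
gib=$((bytes / 1024 / 1024 / 1024))
echo "$bytes bytes = $gib GiB"
```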