Thursday, January 4, 2018

Isilon firmware upgrade

Before you install a node firmware package, make sure that you have enough free
space in your /var directory.
To install a node firmware package successfully, you must have at least
250 MB of free space in the /var directory of every node you are working on.
You can check the free space in the /var directory by running the following command:
df -h
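As a minimal sketch, the 250 MB check can be scripted; the helper function and the sample figure below are illustrative (not an Isilon command), and the arithmetic assumes the available-space column of `df -k` (KiB):

```shell
#!/bin/sh
# Minimal sketch of the 250 MB /var check; check_var_free and the
# sample figure are illustrative, not an Isilon command.
min_kb=$((250 * 1024))   # 250 MB expressed in KiB

check_var_free() {
    # $1: available space in KiB, as in the 4th column of `df -k /var`
    [ "$1" -ge "$min_kb" ]
}

# On a live node you would feed it the real number, e.g.:
#   avail_kb=$(df -k /var | awk 'NR==2 {print $4}')
if check_var_free 300000; then
    echo "enough free space in /var"
else
    echo "insufficient free space in /var"
fi
```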

1. Download the new firmware package.
a. Visit EMC online support and download the latest firmware package.
b. Open a secure shell (SSH) connection to any node in the cluster and log in using
the "root" account.
c. Copy the firmware package to the /ifs/data directory on the cluster.
2. Install the firmware package. Depending on your version of OneFS, run one of the
following commands:
OneFS 8.0 or later
isi upgrade patches install IsiFw_Package_<versionnumber>.tar
Earlier than OneFS 8.0
isi pkg install IsiFw_Package_<version-number>.tar
The cluster displays a message stating that the firmware package was successfully
installed.
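The version split in step 2 can be sketched in shell; the script below only prints the command to run, and the version string and package filename are placeholders, not a real release:

```shell
#!/bin/sh
# Sketch: pick the install command by OneFS major version.
# onefs_ver and pkg are placeholders for illustration only.
onefs_ver="8.0.0.5"
pkg="IsiFw_Package_example.tar"

major="${onefs_ver%%.*}"   # leading component of the dotted version
if [ "$major" -ge 8 ]; then
    cmd="isi upgrade patches install $pkg"
else
    cmd="isi pkg install $pkg"
fi
echo "$cmd"
```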

After the node restarts, confirm that the NVRAM firmware matches the installed
firmware package. Depending on your version of OneFS, run one of the following
commands:
OneFS 8.0 or later
isi upgrade cluster firmware devices
Earlier than OneFS 8.0
isi firmware status
If the NVRAM firmware does not match the installed firmware package, you must
run the firmware update a second time. Depending on your version of OneFS, run one
of the following commands:
OneFS 8.0 or later
isi upgrade cluster firmware start
Earlier than OneFS 8.0
isi firmware update
update status

node1# while :; do
    clear; date
    isi upgrade view
    isi up no li | grep "LNN\|State" | paste - -
    echo
    isi up no fi pr li | awk '{if (($4 != "-" && $4 != $3) || ($5 != "-" && $4 != "-")) print}'
    sleep 30
done

Wed Nov 29 15:22:08 CST 2017

Upgrade Status:

   Cluster Upgrade State: committed
Current Upgrade Activity: Firmware
      Upgrade Start Time: 2017-11-28T14:18:45
   Upgrade Finished Time: 2017-11-28T14:28:25
      Current OS Version: 8.0.0.5_build(81)style(5)
      Upgrade OS Version: N/A

Nodes Progress:

     Total Cluster Nodes: 3
       Nodes On Older OS: 3
          Nodes Upgraded: 0
Nodes Transitioning/Down: 0

If there are any errors please run "isi_upgrade_logs" to gather more information.

             Node LNN: 1           Node Upgrade State: committed
             Node LNN: 2           Node Upgrade State: upgrade ready
             Node LNN: 3           Node Upgrade State: upgrade ready

Lnns  Device               Old Version             New Version             Status
------------------------------------------------------------------------------------
1     CMC_Yeti             02.07                   02.07                   upgraded
2     CMC_Yeti             02.05                   02.07                   upgrading
3     CMC_Yeti             02.05                   02.07                   -
------------------------------------------------------------------------------------
Total: 36

node1# isi upgrade patches uninstall IsiFw_Package_<versionnumber>.tar

node1# isi upgrade patches list
Patch Name Description Status
-----------------------------

-----------------------------

node1# cd /ifs/data/Isilon_Support


node1# vi PRSisiHealth

node1# perl PRSisiHealth -u 8.0.0.5

Dell EMC Remote Proactive Health Check            0.1163
Live Cluster Analysis                             Wed Nov 29 15:47:39 2017
Cluster Name                                      gucfs38d
Node Count                                        3
Current OneFS Version                             8.0.0.5
Target OneFS Version                              WARN
  WARN: OneFS target version 8.0.0.5 is less than the current OneFS version, performing checks with no target OneFS version.
OneFS Version                                     PASS
Highly Recommended Patches                        PASS
Cluster Capacity                                  PASS
Cluster Health Status                             FAIL
  FAIL: The cluster health is ATTN
  FAIL: Node 1 is reporting ATTENTION
  FAIL: Node 2 is reporting ATTENTION
  FAIL: Node 3 is reporting ATTENTION
  INFO: Refer to KB210505 (https://support.emc.com/kb/210505) for details.
Critical Events                                   FAIL
  FAIL: Critical event 2059 for node 2: External network link ext-1 (igb0) down
  FAIL: Critical event 2061 for node -1: External network link ext-1 (igb0) down
  FAIL: Critical event 2147 for node 3: External network link ext-1 (igb0) down
  FAIL: Critical event 2168 for node 1: External network link ext-1 (igb0) down
  INFO: Refer to KB210506 (https://support.emc.com/kb/210506) for details.
Jobs Status                                       PASS
System Partition Free Space                       PASS
Cluster Services                                  PASS
Processes                                         PASS
Node Uptime                                       PASS (0 days)
Upgrade Status                                    PASS
Hardware Status                                   PASS
BMC/CMC Hardware Monitoring                       PASS
Boot Disks Life Remaining                         PASS
Mirror Status                                     PASS
Memory                                            PASS
Drives Health                                     PASS
Drives Firmware (DFP 1.18/DSP 1.21)               INFO
  INFO: Model                          Firmware   DSP(1.21)  DFP(1.18)  Count Nodes
  INFO: HGST HUSMM1640ASS200           A204       -          -          3     1-3
  INFO: ST2000NM0055-1V4104            BL06       -          -          105   1-3
  INFO: Refer to KB210512 (https://support.emc.com/kb/210512) for details.
Node Firmware (10.1.1)                            PASS
Node Compatibility                                PASS
SmartConnect Service IP                           PASS
Duplicate Gateway Priority                        PASS
SyncIQ                                            PASS
Authentication Status                             PASS
Licenses                                          PASS
Access Zones                                      PASS (1)
Aspera                                            PASS
Cluster Encoding                                  PASS (utf-8)
Time Zone                                         PASS (Asia/Taipei)
DialHome & Remote Connectivity                    INFO
  INFO: Current states:
  INFO:    ConnectEMC is Disabled
  INFO:    ESRS is not enabled
ETAs                                              PASS
BXE Nodes                                         INFO (3)
  INFO: Nodes that have BXE interfaces: 1-3


UPGRADE ISSUE DETECTED

Monday, August 24, 2015

SafeSync 3.0: troubleshooting a management UI that fails to open after certificate import



1. In a single-server environment:
a. Log in to the server console and switch to root.
b. Go to the system path: root@appliance1:/opt/SingleInstaller/MgmtUI/SSL
c. List the files to check whether both "mgmt.key" and "mgmt.key " (the new key has a trailing space in its filename) are there.
d. The "mgmt.key " file (ending with a blank character) is the new one, so you need to purge the old one and rename the new one.
e. root@appliance1: rm "mgmt.key"   --->  delete the old key file
f. root@appliance1: mv "mgmt.key " "mgmt.key"  ----> rename the new key file to the correct name.
g. root@appliance1: supervisorctl restart mgmtui   -----> restart the management UI service to apply the change.
h. Sign in to the management console again in a browser and verify the HTTPS site.
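A hedged sketch of steps e-g as a script; the function name is illustrative, and it is parameterized on a directory so it can be tried safely (on the appliance the directory would be /opt/SingleInstaller/MgmtUI/SSL):

```shell
#!/bin/sh
# Sketch (illustrative helper): replace the old mgmt.key with the new
# "mgmt.key " file, whose name ends with a trailing space.
fix_mgmt_key() {
    dir="$1"
    if [ -f "$dir/mgmt.key " ]; then
        rm -f "$dir/mgmt.key"               # delete the old key file
        mv "$dir/mgmt.key " "$dir/mgmt.key" # rename the new key file
        # supervisorctl restart mgmtui      # then restart the management UI
    fi
}
```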

2. In an HA server environment:
Repeat the same steps on the SSFE server where you applied the new certificate and key file; the corresponding peer SSFE server does not have the defect.
Cheers.


Tuesday, May 19, 2015

SafeSync storage expansion (NFS)

0.1 > NFS (Client side)
Install nfs-common and portmap
 # apt-get install nfs-common portmap

Restart portmap
 # service portmap restart

Create a new mount point on the server
 # mkdir /storage/mogdata/dev12

Mount the new device on that mount point
 # mount -t nfs (serverIP):/tmp /storage/mogdata/dev12

Check the result
 # showmount -e (serverIP)

Change the owner of /storage/mogdata/dev12
 # chown www-data:mogstored  /storage/mogdata/dev12

Change the file mode of /storage/mogdata/dev12, granting the group write permission.
 # chmod g+w  /storage/mogdata/dev12

To get correct usage/free-space information
# vim /usr/local/share/perl/5.10.1/Mogstored/ChildProcess/DiskUsage.pm
  (line 58) change:
    my $rval = `df $gnu_df -l -k $path/$devnum`;
  to:
    my $rval = `df $gnu_df -k $path/$devnum`;
  (i.e. remove the -l parameter)
# /etc/init.d/mogstored restart
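The DiskUsage.pm edit can also be applied non-interactively; a sketch with GNU sed (the function name is illustrative, and it keeps a backup; on the server the file is the DiskUsage.pm path above):

```shell
#!/bin/sh
# Sketch (illustrative helper): remove the -l flag from the df call in
# DiskUsage.pm so df also reports NFS-mounted devices.
remove_local_flag() {
    f="$1"
    cp "$f" "$f.bak"   # keep a backup before editing
    sed -i 's/df \$gnu_df -l -k/df \$gnu_df -k/' "$f"
}
```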

# mogadm --trackers=tracker1:6001 device add osdp-store1 12 --status=alive

Check the result
 # mogadm check

Checking trackers...
  tracker1:6001 ... OK

Checking hosts...
  [ 1] osdp-store1 ... OK

Checking devices...
 host device         size(G)    used(G)    free(G)   use%   ob state   I/O%
 ---- ------------ ---------- ---------- ---------- ------ ---------- -----
 [ 1] dev11            7.027      5.590      1.437  79.55%  writeable   0.0
 [ 1] dev12            7.472      0.018      7.454   0.24%  writeable   0.0
 [ 1] dev13            7.027      5.510      1.517  78.41%  writeable   0.0
              total:   21.526     11.118     10.408  52.73%



Add the NFS mount command to /etc/rc.local (auto-mount the NFS share on reboot)
mount -t nfs (server IP):/tmp /storage/mogdata/dev13
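A sketch of adding that line idempotently, so repeated runs do not duplicate it; the helper name is illustrative and "nfsserver" is a placeholder for the real server IP from the steps above:

```shell
#!/bin/sh
# Sketch (illustrative helper): append a line to a file only if it is
# not already present.
append_once() {
    file="$1"; line="$2"
    grep -qxF "$line" "$file" || echo "$line" >> "$file"
}

# On the server ("nfsserver" is a placeholder for the real server IP):
# append_once /etc/rc.local "mount -t nfs nfsserver:/tmp /storage/mogdata/dev13"
```

If your rc.local ends with "exit 0", place the mount line before it rather than appending at the end.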

Thursday, December 25, 2014

SAN switch configuration backup and restore

1. Set up an FTP server with IP address 192.168.1.101, user: root, password: root

2. Log in to the SAN switch

Back up the SAN switch configuration
switch:admin> configUpload
Server Name or IP Address [citadel]: 192.168.1.101
User Name [None]: root
File Name [config.txt]: config.txt
Protocol (RSHD or FTP) [FTP]: ftp
Password: xxxxxx
upload complete
 
Restore the SAN switch configuration
switch:admin> switchDisable
switch:admin> configDownload
Server Name or IP Address [citadel]: 192.168.1.101
User Name [None]: root
File Name [config.txt]: config.txt
Protocol (RSHD or FTP) [FTP]: ftp
Password: xxxxxx
download complete
switch:admin>switchEnable

Monday, March 31, 2014

AX4 buffer I/O error

The AX4 is EMC's entry-level storage array, so its controller mode is active/passive. When a Linux server boots and loads the HBA driver, it probes the hardware; when it probes controller B while controller A owns the LUN, B does not respond to the Linux server, so a buffer I/O error appears. Once multipath or PowerPath loads, the assigned LUNs can be accessed normally, so this error message can be ignored.


Root Cause
- Storage arrays in a SAN are generally set up in a redundant fashion such that hosts can access logical units (LUNs) over one of many different paths.  Typically these arrays operate in one of two different modes: active/active or active/passive.  With an active/active array, I/O can be sent down any one of the paths to a LUN and it will be processed by that controller. With active/passive arrays, one controller is considered the primary for each LUN, while the other controller is a backup.  Some of these arrays will accept I/O for a LUN over the backup controller, but it will not be optimized (i.e. worse performance).  However other active/passive arrays will not accept any I/O on the backup controller for a LUN, and thus any commands sent to it will result in an I/O error.
- In RHEL, there are a number of different commands and utilities that can send I/O to different devices, such as LVM, udev, fdisk, etc, not to mention applications such as databases, web servers, etc.  If any of these were to issue I/O to a passive path on an array that does not accept it, it would cause an I/O error in the logs.  The messages are harmless and do not indicate a problem, but they may fill up the logs or cause unwarranted concern.  As a result, some may wish to try to avoid these errors by preventing applications from accessing the passive paths.  Typically, filtering devices out from LVM will cause the majority of these errors to go away.  Likewise, avoiding commands like 'fdisk -l' that scan all devices can reduce their frequency.  Finally, configuring any user applications that scan or access multiple devices to only access the appropriate active path or the logical multipath device (/dev/mapper/mpath*, /dev/emcpower*, /dev/sddlma*, etc) can cut down on the errors as well.

Resolution

Note: The following applies only to I/O errors caused by accessing passive paths.  See the Root Cause and Diagnostic Steps for more information on determining whether this applies to your environment.
- One way to cut down on the number of spurious I/O errors in the system logs is to avoid scanning passive paths with LVM commands.  This can be done with a filter in /etc/lvm/lvm.conf that only scans devices from device-mapper-multipath, EMC PowerPath, Hitachi HDLM, or another multipath solution, and avoids the underlying SCSI device nodes.
- I/O errors may be caused by any utility or program that accesses passive storage paths, so it may be necessary to configure or run them in such a way that avoids these devices.  For instance, rather than using 'fdisk -l', specify an individual device such as 'fdisk -l /dev/mapper/mpatha'.
- Some storage arrays, such as the EMC CLARiiON, offer an option to enable a type of active/active mode known as ALUA.  With ALUA, path groups are established with different priorities.  Multipath software such as device-mapper-multipath will recognize these path groups and send I/O to the higher priority paths, but if I/O does end up going down a passive path it may not generate an I/O error.  If your array supports such a mode, enabling it may prevent these I/O errors.  This different access method generally requires a configuration change in the multipath software as well.
Note: I/O errors caused by unintentional access to passive paths are not harmful and should not cause any issues on a system.  They can be safely ignored.
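As an illustration of the LVM filter mentioned above, a sketch of an /etc/lvm/lvm.conf filter line that accepts only multipath pseudo-devices and rejects everything else (the device-name patterns are examples; adjust them to your environment):

```
filter = [ "a|^/dev/mapper/mpath.*|", "a|^/dev/emcpower.*|", "r|.*|" ]
```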

Monday, March 17, 2014

Provisioning a new LUN on Linux

After assigning a new LUN to a Linux server, fdisk -l may not show it. In that case, trigger a port scan on the HBA.
Command: echo "1" > /sys/class/fc_host/host3/issue_lip (single port; for a dual-port HBA also scan /host4)
Then run fdisk -l again to see the newly assigned LUN.
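Instead of hard-coding host3/host4, the scan can loop over every FC host port; a sketch (the function name is illustrative, and it is parameterized on the sysfs base directory so it can be tried safely; on a real host the default /sys/class/fc_host applies and root is required):

```shell
#!/bin/sh
# Sketch (illustrative helper): issue a LIP on every FC host port
# found under the given sysfs directory.
scan_fc_hosts() {
    base="${1:-/sys/class/fc_host}"
    for h in "$base"/host*; do
        if [ -e "$h/issue_lip" ]; then
            echo "1" > "$h/issue_lip"
        fi
    done
}

# On a live host:
# scan_fc_hosts
# fdisk -l   # then re-check for the new LUN
```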

Monday, December 23, 2013

iorate

Usage:
1. vi devices.ior (put this file on the disk under test)
 # File system example
 Device = "/tmp/iorate.tst"  capacity 8GB;

 # UNIX raw device example
 #Device = "/dev/rdsk/c901t0d0s2"  capacity 4.1GB;

2. more patterns.ior (review the R/W test patterns)
Pattern 10 = "8k Random Read"     io size 8KB   random     read;
Pattern 11 = "8k Seq Write"       io size 8KB   sequential write max sequential 10000;
Pattern 12 = "8k Random Write"    io size 8KB   random     write;

3. vi tests.ior (read/write test: 8GB size, 8k blocks, R:W = 7:3, 5 minutes)
Test  1 = " 8k Bus Test"  for 300 sec ignore 30 sec size 8GB 70% pat  10, 30% pat 12;
#Test  2 = " 8k Bus Test"  for 130 sec ignore 30 sec size 1MB 100% pat  9;

#Test  3 = "64k Bus Test"  for 130 sec ignore 30 sec size 1MB 100% pat 21;

4. time dd if=/dev/zero of=/tmp/iorate.tst bs=8k count=1179648 (generate the test file; it must be larger than 8GB)

5. ./iorate -p patterns.ior (run the I/O test)

6. ./gen_sums (convert the test results to an xls file)
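The dd count in step 4 can be sanity-checked with shell arithmetic: 1179648 blocks of 8 KiB is 9 GiB, which satisfies the larger-than-8GB requirement.

```shell
#!/bin/sh
# Sanity check for step 4: 1179648 blocks * 8 KiB = 9 GiB > 8 GB.
blocks=1179648
bytes=$((blocks * 8 * 1024))
gib=$((bytes / 1024 / 1024 / 1024))
echo "$bytes bytes = $gib GiB"
```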