Storage Knowledge Base: Replacing a drive in a NetApp filer

The NetApp filer in the lab recently encountered a failed disk. With the failed disk confirmed dead and removed, and the replacement disk inserted, this is how I brought the new disk into service. My first attempt to assign it failed:

fas3050clow*> disk assign 0a.29
disk 0a.29 (S/N 3HY0T1GG00007342W9NJ) is already owned by system cr2conffd03 (ID 84173417).
disk assign: Assign failed for one or more disks in the disk list.

Detour. The following trimmed output confirmed that the disk still carried ownership information from a previous filer:

fas3050clow*> disk show -a
DISK         OWNER                   POOL   SERIAL NUMBER
------------ -------------           -----  -------------
0a.29        cr2conffd03(84173417)   Pool0  3HY0T1GG00007342W9NJ
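If there are many disks to check, the same comparison can be done on pasted console text. The following is only a rough sketch: it assumes rows look like the one above (disk, owner name with the ID in parentheses, pool, serial number), and the names `ROW`, `foreign_disks` and `local_name` are my own inventions, not anything NetApp provides. Rows in any other shape are simply skipped.

```python
import re

# Matches data rows shaped like the one in this post:
#   0a.29        cr2conffd03(84173417)   Pool0 3HY0T1GG00007342W9NJ
# Header and separator lines have no "name(id)" owner field, so they do not match.
ROW = re.compile(
    r"^(?P<disk>\S+)\s+(?P<owner>[^\s(]+)\((?P<owner_id>\d+)\)\s+"
    r"(?P<pool>\S+)\s+(?P<serial>\S+)\s*$"
)


def foreign_disks(disk_show_output, local_name):
    """Return the parsed rows whose owner is not local_name (hypothetical helper)."""
    found = []
    for line in disk_show_output.splitlines():
        match = ROW.match(line.strip())
        if match and match.group("owner") != local_name:
            found.append(match.groupdict())
    return found


# Sample text copied from the output above.
sample = """\
DISK         OWNER                   POOL   SERIAL NUMBER
------------ -------------           -----  -------------
0a.29        cr2conffd03(84173417)   Pool0  3HY0T1GG00007342W9NJ
"""

for row in foreign_disks(sample, "fas3050clow"):
    print(row["disk"], "is owned by", row["owner"], "ID", row["owner_id"])
```

With the sample above this reports 0a.29 as owned by cr2conffd03, which is exactly the mismatch that made the first disk assign fail.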

Quick help from the community pointed me in the right direction. A few commands accomplished the required task:

fas3050clow*> priv set advanced
fas3050clow*> disk assign 0a.29 -s unowned -f
Note: Disks may be automatically assigned to this node, since option disk.auto_assign is on.
fas3050clow*> disk assign 0a.29
Thu May 13 13:30:56 CDT [fas3050clow: diskown.changingOwner:info]: changing ownership for disk 0a.29 (S/N 3HY0T1GG00007342W9NJ) from unowned (ID -1) to fas3050clow (ID 101175198)
Thu May 13 13:30:56 CDT [fas3050clow: HTTPPool00:warning]: HTTP XML Authentication failed from 192.168.110.71.
fas3050clow*> Thu May 13 13:30:56 CDT [fas3050clow: diskown.RescanMessageFailed:warning]: Could not send rescan message to fas3050clow. Please type disk show on the console of fas3050clow for it to scan the newly inserted disks.
Thu May 13 13:30:56 CDT [fas3050clow: raid.assim.label.upgrade:info]: Upgrading RAID labels.
Thu May 13 13:30:57 CDT [fas3050clow: disk.fw.downrevWarning:warning]: 1 disks have downrev firmware that you need to update.
Thu May 13 13:30:57 CDT [fas3050clow: monitor.globalStatus.ok:info]: The system's global status is normal.
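The line that matters in all that noise is the diskown.changingOwner message, which shows the disk moving from unowned to this filer. As a rough sketch, here is one way to pull that message out of a saved console capture. The regular expression is based only on the single message shown above, so treat it as an assumption about the format; `CHANGE` and `ownership_change` are names I made up. Newlines are dropped before matching because console captures are often hard-wrapped in the middle of a word.

```python
import re

# Pattern derived from the one changingOwner message in this post (an assumption,
# not a documented format).
CHANGE = re.compile(
    r"diskown\.changingOwner:info\]: changing ownership for disk (?P<disk>\S+) "
    r"\(S/N (?P<serial>\w+)\) from (?P<old>\S+) \(ID (?P<old_id>-?\d+)\) "
    r"to (?P<new>\S+) \(ID (?P<new_id>-?\d+)\)"
)


def ownership_change(console_text):
    """Return the first ownership change found in console_text, or None."""
    # Undo hard wrapping: an 80-column console breaks lines mid-word, so
    # removing the newlines restores the original message text.
    joined = console_text.replace("\n", "")
    match = CHANGE.search(joined)
    return match.groupdict() if match else None


# The message from this post, wrapped the way an 80-column console shows it.
sample = (
    "Thu May 13 13:30:56 CDT [fas3050clow: diskown.changingOwner:info]: changing owne\n"
    "rship for disk 0a.29 (S/N 3HY0T1GG00007342W9NJ) from unowned (ID -1) to fas3050c\n"
    "low (ID 101175198)\n"
)

change = ownership_change(sample)
if change:
    print(change["disk"], change["old"], "->", change["new"])
```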

Shortly after, the firmware on the replacement disk was automatically upgraded:

Thu May 13 13:31:18 CDT [fas3050clow: dfu.firmwareDownloading:info]: Now downloading firmware file /etc/disk_fw/X274_SCHT6146F10.NA16.LOD on 1 disk(s) of plex [Pool0]…

I confirmed via NetApp System Manager (my GUI crutch) that the replacement disk is now a spare for the two aggregates owned by the head. I then updated the storage array spreadsheet I maintain, which tracks disks, spares, arrays, LUNs, aggregates, volumes, exports, groups, pools, etc. for the various lab storage.

One additional item I learned from a NetApp engineer is that spares are not meant to remain static. Rather, the spare role is designed to float from disk to disk as failures occur. That runs against a habit I’m still learning to break, carried over from older storage arrays, where a spare pressed into active duty was returned to spare status once the failed disk was replaced.

Don’t forget to exit privileged mode when done:

fas3050clow*> priv set
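For my own notes, the whole sequence from this post can be written down as a small helper that prints the commands for a given disk. This is a sketch only: it just builds strings to paste into the console and never talks to the filer. The function name `reassign_commands` and the disk ID check are my own assumptions, the latter based purely on IDs that look like 0a.29.

```python
import re


def reassign_commands(disk_id):
    """Return the console commands used in this post, in order, for one disk."""
    # Assumed ID shape (digits, a letter, a dot, digits), modeled on "0a.29".
    if not re.fullmatch(r"[0-9]+[a-z]\.[0-9]+", disk_id):
        raise ValueError(f"unexpected disk ID: {disk_id!r}")
    return [
        "priv set advanced",                      # enter advanced mode
        f"disk assign {disk_id} -s unowned -f",   # force-clear the old owner
        f"disk assign {disk_id}",                 # assign to this filer
        "priv set",                               # leave advanced mode
    ]


for command in reassign_commands("0a.29"):
    print(command)
```

Since the middle step forces an ownership change, it is worth double-checking the disk ID against the disk show -a output before pasting anything.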

Tuesday, August 6, 2013