Thursday, August 22, 2013

NetApp – Common Deduplication Misconfigurations

NetApp deduplication misconfigurations are quite common, and nothing to be ashamed of. NetApp systems have a lot of features to work with, as does the tech world as a whole, and getting all of these parts working well together can be tricky. Implementing a simple dedupe schedule in your environment can save you time and, most importantly, disk. But some configurations are not optimal and need to be planned out carefully. This is a quick, high-level look at what some of these misconfigurations are.
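For anyone who wants to see what that looks like in practice, here is a minimal sketch of enabling dedupe and a nightly schedule, assuming Data ONTAP 7-Mode syntax (the same CLI used in the examples below); the volume name is just a placeholder.
Enable deduplication on the volume:
 > sis on /vol/my_vol_2b_deduped
Run a deduplication pass every night at midnight:
 > sis config -s sun-sat@0 /vol/my_vol_2b_deduped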

Never Enabling Deduplication on VMware Implementations!

The first big one is never turning on dedupe for VMware workloads when the systems are initially set up, and along with it, forgetting the -s (scan) option. It is recommended that deduplication be enabled on all VMware configurations. When the Virtual Storage Console (VSC) plugin for vCenter is used to create VMware datastores, the plugin always enables deduplication, and it is strongly recommended that dedupe be enabled right away for an assortment of reasons. When you enable deduplication on a NetApp volume, the controller starts tracking any new blocks written to that volume, and following a scheduled deduplication pass it looks at those new blocks and eliminates any duplicates.
But if you already had some VMs on that volume before deduplication was enabled, those VMs will never be examined or deduplicated, which results in very poor deduplication savings. There is still a simple solution: the administrator can start a deduplication pass using the VSC with the "scan" option enabled. This can also be done from the command line with the "-s" switch.
The fix is to scan the volume using the -s flag:
 > sis start -s /vol/my_vol_2b_deduped
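To confirm the scan ran and see what it bought you, the status and savings can be checked afterwards; again this assumes 7-Mode syntax and an example volume name.
Check the state of the deduplication operation:
 > sis status /vol/my_vol_2b_deduped
Show the space saved on the volume:
 > df -s /vol/my_vol_2b_deduped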

Misaligned VMware VMs

Misaligned VMs with differing operating systems on the same underlying storage subsystem are a well-documented cause of lower-than-expected deduplication results. At a low level, if the starting offset of one guest operating system type is different from the starting offset of another, then almost no blocks will line up, and they will not deduplicate effectively. Not only will your deduplication be less efficient, there will also be more load on your storage controller, and not just the NetApp controller: any storage array controller will be affected. NetApp customers have a free tool called MBRalign, part of the VSC, that helps remedy this problem. As you align your VMs, you will see your deduplication savings rise and your array controller's workload decrease.
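If you are not sure whether a guest is aligned, a quick sanity check from inside the VM is to look at the partition starting offset and confirm it is an even multiple of 4 KB, which is WAFL's block size; the device name below is just an example.
Windows guest:
 > wmic partition get Index, Name, StartingOffset
Linux guest (the start sector multiplied by 512 should divide evenly by 4096):
 > fdisk -lu /dev/sda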

LUN Reservations and Thin Provisioning

Over the years, thin provisioning has been beaten down and has gotten a bad reputation, though not necessarily a justified one. NetApp controllers offer multiple levels of reservations that can be combined depending on the requirements of the VMware environment. One type is the volume reservation, which carves space out of the large storage pool that is the aggregate and ensures that whatever object you place onto that volume has the space it needs. We then create a LUN for VMware inside this volume. In some cases, storage admins also reserve space for the LUN, which in turn removes that space from the available space in the volume. But there is no need to do this: you have already reserved the space with the volume reservation, so there is no need to reserve it again with the LUN reservation. If you use LUN reservation, the unused space in the LUN will always consume the reserved space. That is, a 500 GB LUN with space reservation turned on will consume 500 GB of space with no data in it. Deduplicating a space-reserved LUN will win you back some space from the data that was consumed, but the unused space will remain reserved.
Here is an interesting worked example:
600 GB LUN on a 1 TB volume – the 600 GB LUN is space reserved with no data on it.
Volume shows 600 GB used.
Add 270 GB of data onto the LUN.
Volume still shows 600 GB used.
Dedupe the 270 GB of data down to 100 GB.
Volume reports 430 GB used, since it reclaimed 170 GB from the operation.
Remove the LUN reservation: the data only takes up 100 GB and the volume reports 900 GB free.
Simply removing the LUN reservation turns the deduplication into an actual saving. This can be done on a live volume with VMs running. Once the final deduplication savings are visible, typically in the 60-70% range, the volume size can be adjusted to match the actual amount of data on the LUN. By the way, like all things NetApp, volumes can be resized on a live system too.
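For reference, here is a rough sketch of those two steps from the 7-Mode command line; the LUN path, volume name and new size are placeholders, so adjust them to your environment.
Remove the space reservation from the LUN:
 > lun set reservation /vol/my_vol_2b_deduped/vmware_lun disable
Shrink the volume to match the actual amount of data (example size):
 > vol size my_vol_2b_deduped 500g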

Large Amounts of Data in the VMs

As a design option, there are times when VMDK files are used for both boot and data. This is not necessarily a misconfiguration; it is more of a design choice, and it is much simpler to keep a VM's boot and data disks together in a single folder. With this configuration, systems can still achieve high deduplication ratios when application data is mixed with the operating system data blocks. The problem arises when there are large data files, such as the ones used by databases, image repositories, or mail server mailboxes. These large data files do not deduplicate very well and lower the overall efficiency. The NetApp will still deduplicate the operating system blocks as well as the data blocks around these large sections of used blocks.
