Deduplication is the elimination of redundant data in storage. During the deduplication process, duplicate data is deleted so that only one copy is stored. The index of all data is retained, however, in case that data is ever required. Deduplication reduces the required storage capacity because only unique data is stored.
NetApp supports deduplication in which only the unique blocks in a flexible volume are stored, at the cost of a small amount of additional metadata created during the dedup process. NetApp deduplication can remove duplicate 4KB blocks anywhere in the flexible volume, keeping a single unique copy of each.
The core enabling technology of deduplication is fingerprints. A fingerprint is a digital signature computed for every 4KB data block in the flexible volume.
When deduplication runs for the first time on a flexible volume with existing data, it scans the blocks in the volume and creates a fingerprint database, which contains a sorted list of the fingerprints of all used blocks. Once the fingerprint database is created, the fingerprints are checked for duplicates. When a duplicate is found, a byte-by-byte comparison of the blocks is done first to make sure that they are indeed identical. If they are, the block’s pointer is updated to reference the already existing data block, the duplicate data block is released, and the inode is updated.
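The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not NetApp's implementation: the use of SHA-256 as the fingerprint function, the in-memory list of blocks, and the names `fingerprint` and `deduplicate` are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # 4KB blocks, as described in the text


def fingerprint(block):
    # Assumption: SHA-256 stands in for the real fingerprint function,
    # which the text does not specify.
    return hashlib.sha256(block).hexdigest()


def deduplicate(blocks):
    """Return (pointers, released) for a list of equally sized blocks.

    pointers[i] is the index of the block that logical block i now refers to;
    released is the set of block indices that are no longer needed.
    """
    # Step 1: build the "fingerprint database", a sorted list of
    # (fingerprint, block index) pairs.
    fp_db = sorted((fingerprint(b), i) for i, b in enumerate(blocks))

    pointers = list(range(len(blocks)))
    released = set()
    current_fp = None
    keepers = []  # retained blocks that share the current fingerprint

    # Step 2: after sorting, equal fingerprints sit next to each other.
    for fp, i in fp_db:
        if fp != current_fp:
            current_fp, keepers = fp, []
        for k in keepers:
            # Step 3: byte-by-byte comparison before trusting the match.
            if blocks[k] == blocks[i]:
                pointers[i] = k   # point at the already existing block
                released.add(i)   # the duplicate block can be freed
                break
        else:
            keepers.append(i)     # first block seen with this content
    return pointers, released


if __name__ == "__main__":
    a, b = b"a" * BLOCK_SIZE, b"b" * BLOCK_SIZE
    pointers, released = deduplicate([a, b, a, a])
    print(pointers, sorted(released))
```

Sorting the fingerprints is what makes duplicate detection cheap: candidates end up adjacent, so one pass over the list is enough, and the byte-by-byte check guards against two different blocks that happen to share a fingerprint.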
NetApp deduplication commands:
- Enable the dedup (ASIS) license.
fractal-design> sis on /vol/demovol
- If you have a new flex volume that was just created, follow this step to enable ASIS deduplication.
fractal-design> sis on /vol/demovol
Deduplication for "/vol/demovol" is enabled.
Already existing data could be processed by running "sis start -s /vol/demovol"
- If you already have a flex volume with data in it, follow this step.
fractal-design> sis start -s /vol/demovol
- Check the status of deduplication.
fractal-design> vol status demovol
Volume       State    Status           Options
VolArchive   online   raid_dp, flex    nosnap=on
             Containing aggregate: 'aggr0'
fractal-design> sis status /vol/demovol
Path           State     Status   Progress
/vol/demovol   Enabled   Idle     Idle for 00:02:12
- Check the storage space saved due to deduplication
fractal-design> df -s /vol/demovol
Filesystem      used      saved   %saved
/vol/demovol/   9316052   0       0%
- To run deduplication on this volume again at a later time, run “sis start /vol/demovol”.
- Deduplication can be scheduled using the “sis config” command.
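To track savings over time, the output of “df -s” shown earlier can be read by a small script. The sketch below is a hypothetical helper, not a NetApp tool: the name `parse_df_s` is made up for the example, it assumes the four-column layout shown in this post, and it assumes that the percentage is computed as saved / (used + saved).

```python
SAMPLE = """\
Filesystem used saved %saved
/vol/demovol/ 9316052 0 0%
"""


def parse_df_s(output):
    """Parse 'df -s' style output into a list of dicts, one per volume."""
    rows = []
    for line in output.strip().splitlines():
        parts = line.split()
        # Skip the header and anything that is not a 4-column data row.
        if len(parts) != 4 or not parts[1].isdigit() or not parts[2].isdigit():
            continue
        path, used, saved = parts[0], int(parts[1]), int(parts[2])
        total = used + saved
        # Assumption: percent saved = saved / (used + saved).
        pct = 100.0 * saved / total if total else 0.0
        rows.append({"path": path, "used": used, "saved": saved, "pct_saved": pct})
    return rows


if __name__ == "__main__":
    for row in parse_df_s(SAMPLE):
        print(row["path"], row["used"], row["saved"], row["pct_saved"])
```

Recomputing the percentage from the raw counters rather than reading the rounded “%saved” column keeps fractional savings visible on volumes where little has been deduplicated yet.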