Data Deduplication in Windows 8.1

I got a hot tip from a friend about the Data Deduplication feature that is (officially) only available in Windows Server 2012 (R2) (thanks Mats 🙂). With a few small tricks you can also use this cool feature in Windows 8/8.1. I currently use a lot of disk space for virtual machines and ISO files, so for me this is (was) the best thing ever. With deduplication enabled, I was able to free up a lot of my precious disk space. Yes, A LOT. You’ll love it on your small SSD. So, what is Data Deduplication, you may ask? Here’s a short answer:

Quote

“Data deduplication involves finding and removing duplication within data without compromising its fidelity or integrity. The goal is to store more data in less space by segmenting files into small variable-sized chunks (32–128 KB), identifying duplicate chunks, and maintaining a single copy of each chunk. Redundant copies of the chunk are replaced by a reference to the single copy. The chunks are compressed and then organized into special container files in the System Volume Information folder.”

Source: http://technet.microsoft.com/en-us/library/hh831602.aspx

//Quote

And while you’re at it, also have a look at http://blogs.technet.com/b/filecab/archive/2012/05/21/introduction-to-data-deduplication-in-windows-server-2012.aspx
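
To make the quoted description a bit more concrete, here is a toy sketch of the chunk-and-reference idea in PowerShell. It is only an illustration under loud assumptions: fixed 4-byte chunks instead of variable-sized 32–128 KB ones, SHA-256 as the chunk identifier, an in-memory hashtable instead of container files, and no compression. The function name Add-ToyFile and all variable names are made up for this example; this is not how Windows implements the feature.

```powershell
# Toy chunk-level deduplication. NOT the Windows implementation.
# Assumptions: fixed 4-byte chunks, SHA-256 chunk IDs, in-memory store, no compression.
$chunkSize  = 4
$sha        = [System.Security.Cryptography.SHA256]::Create()
$chunkStore = @{}   # chunk hash -> the single stored copy of that chunk

function Add-ToyFile {
    param([byte[]]$Data)
    $refs = @()
    for ($i = 0; $i -lt $Data.Length; $i += $chunkSize) {
        $end = [Math]::Min($i + $chunkSize, $Data.Length) - 1
        [byte[]]$chunk = $Data[$i..$end]
        $hash = [BitConverter]::ToString($sha.ComputeHash($chunk))
        # Store the chunk only the first time it is seen
        if (-not $chunkStore.ContainsKey($hash)) { $chunkStore[$hash] = $chunk }
        $refs += $hash   # the "file" itself keeps references only
    }
    return ,$refs
}

# Two hypothetical "files" that share their first two chunks
$fileA = Add-ToyFile -Data ([System.Text.Encoding]::ASCII.GetBytes('AAAABBBBCCCC'))
$fileB = Add-ToyFile -Data ([System.Text.Encoding]::ASCII.GetBytes('AAAABBBBDDDD'))

$referenced = $fileA.Count + $fileB.Count
$stored     = $chunkStore.Count
"Chunks referenced: $referenced, chunks actually stored: $stored"
```

The two files reference six chunks in total, but only four distinct chunks end up in the store. That gap is where the space savings come from.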

 

The GOOD…

I have to say that after finding out that Deduplication was doable on Windows 8.1, it wasn’t that hard to find information on HOW to do it. I’ll start by linking to the original source as usual. Thanks to the author for the guide!

http://weikingteh.wordpress.com/2013/01/15/how-to-enable-data-deduplication-in-windows-8/

Well, not much to say really. I followed the guide (for Win 8.1) and it worked 🙂 Here are some screenshots of the process:


Fig 1. Adding Deduplication packages and enabling the Deduplication feature.

 


Fig 2. Enabling Deduplication and starting a Deduplication job on the D: drive. Also checking the status with Get-DedupJob (currently at 0% because the process has just started).
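
For reference, the PowerShell side of Fig 2 looks roughly like the sketch below. It assumes that the Deduplication packages from the linked guide are already installed, that the volume in question is D:, and that the prompt is elevated. Adjust the drive letter to taste.

```powershell
# Sketch only: assumes the Deduplication feature is already installed (see the linked guide)
# and that D: is a non-system data volume. Run from an elevated PowerShell prompt.
Import-Module Deduplication

# Enable deduplication on the volume
Enable-DedupVolume -Volume "D:"

# Start an optimization job right away instead of waiting for a schedule
Start-DedupJob -Volume "D:" -Type Optimization

# Check progress (shows 0% right after the job has been queued)
Get-DedupJob
```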

 

My D: drive consists of 2 x Western Digital Black 2TB 7200 RPM drives configured in software RAID-0 (stripe). The Deduplication job ran forever the first time (4.5h), but it was well worth the wait. Check the before and after screenshots below:


Fig 3. Before Deduplication (1.54TB free on the D: drive).

 


Fig 4. After Deduplication (2.77TB free on the D: drive 🙂).

 

And here is the same thing checked with Get-DedupStatus:


Fig 5. Get-DedupStatus

 

As you can see, the space savings are HUGE (1.25TB saved space). All I can say is that I’d recommend Deduplication for everyone. Now the same procedure is waiting for my E: drive (a 512GB SSD) 🙂 And remember folks, do NOT run/use Data Deduplication on your system drive! You have been warned.
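
As a quick sanity check, the free-space numbers from Fig 3 and Fig 4 can be compared with the savings figure. A small sketch using only the values quoted above (the variable names are just for illustration):

```powershell
# Values taken from the figure captions and text above, in TB
$freeBeforeTB    = 1.54   # Fig 3
$freeAfterTB     = 2.77   # Fig 4
$savedReportedTB = 1.25   # saved space quoted in the text

$gainedTB = [Math]::Round($freeAfterTB - $freeBeforeTB, 2)
"Free space gained between the two screenshots: $gainedTB TB"
"Saved space quoted in the text: $savedReportedTB TB"
```

The two figures are close but not identical (1.23 vs 1.25 TB). My assumption is that the difference comes from rounding in the displayed values and from other files changing on the drive in the meantime.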

For some more Deduplication PowerShell commands, have a look at: http://technet.microsoft.com/en-us/library/hh831434.aspx

For information about Deduplication and backups, have a look at: https://social.technet.microsoft.com/Forums/windowsserver/en-US/ff22c1f5-f704-4173-9aed-e57ad4d3508d/data-deduplication-and-file-backup?forum=winserverfiles

 

…and the BAD

Well, all of this seemed too good to be true. And for me, apparently it was. I didn’t notice any slowdowns in normal usage, but when I started playing around with my virtual machines things got slow. Resuming a virtual machine from a suspended state, for example, took FOREVER. My guess is that Microsoft’s version of Deduplication is best suited for Hyper-V, NOT VMware (they even have an option for it in the Hyper-V settings). What a shame. For now it’s not usable for me, so I decided to disable it altogether. Luckily this was an easy (but slooooow) process: it took about 7-8h to de-deduplicate my 4TB RAID-0 striped volume. Steps for reversing the deduplication process:

In PowerShell:

Start-DedupJob -Type Unoptimization -Volume D:

That’s it. No other command was needed, even though I found some articles saying that I should also disable deduplication and run garbage collection after unoptimization. However, this turned out to be unnecessary (and in fact impossible):

PS C:\> Start-DedupJob -Volume D: -Type GarbageCollection
Start-DedupJob : MSFT_DedupVolume.Volume='D:' - HRESULT 0x80565323, The specified volume is not enabled for deduplication.
At line:1 char:1
+ Start-DedupJob -Volume D: -Type GarbageCollection
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : NotSpecified: (MSFT_DedupJob:ROOT/Microsoft/…n/MSFT_DedupJob) [Start-DedupJob], CimException
+ FullyQualifiedErrorId : HRESULT 0x80565323,Start-DedupJob

 

PS C:\> Disable-DedupVolume D:
Disable-DedupVolume : MSFT_DedupVolume.Volume='D:' - HRESULT 0x80565323, The specified volume is not enabled for deduplication.
At line:1 char:1
+ Disable-DedupVolume D:
+ ~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo         : NotSpecified: (MSFT_DedupVolume:ROOT/Microsoft/…SFT_DedupVolume) [Disable-DedupVolume], CimException
+ FullyQualifiedErrorId : HRESULT 0x80565323,Disable-DedupVolume
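
If you want to keep an eye on the (slow) unoptimization job instead of guessing when it’s done, a small polling loop along these lines should do. This is a sketch under assumptions: the volume is D:, the unoptimization job has already been started with Start-DedupJob, and the one-minute polling interval is arbitrary.

```powershell
# Sketch only: polls the dedup job queue until no job is left for D:.
# Assumes an unoptimization job was already started with Start-DedupJob.
Import-Module Deduplication

while ($true) {
    # Errors are silenced because the volume may no longer be dedup-enabled once the job is done
    $jobs = @(Get-DedupJob -Volume "D:" -ErrorAction SilentlyContinue)
    if ($jobs.Count -eq 0) { break }
    $jobs | Format-Table Type, State, Progress, Volume -AutoSize
    Start-Sleep -Seconds 60   # arbitrary polling interval
}
"No dedup jobs left on D:"
```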

 

Finally I also removed the Deduplication (and File Server) feature from Windows Features (in “Turn Windows features on or off”). After a test run I can confirm that my virtual machines now start and resume more rapidly again.

Sources:

http://nickwhittome.com/2014/10/01/disabling-data-deduplication-on-windows-server-2012r2/
http://jeffwouters.nl/index.php/2012/01/disk-deduplication-in-windows-8-explained-from-a-to-z/
https://translate.google.com/translate?sl=auto&tl=en&js=y&prev=_t&hl=en&ie=UTF-8&u=http%3A%2F%2Fwww.amaxing.de%2Fwindows-server-2012-deduplizierung-der-festplatten-rueckgaengig-machen%2F2014%2F11%2F22%2F&edit-text=

 

Final words

Unfortunately, luck wasn’t completely on my side with Deduplication 😦

However, I’d still recommend deduplication for people running low on disk space, as long as the data isn’t used for disk-intensive stuff. A good place for deduplication would probably be your non-system stash disk filled with music, photos, videos and so on.

That said, happy deduplication after all 🙂
