Use the Windows Data Deduplication to Free Up Disk Space

Using the Windows Data Deduplication

There are many ways to deduplicate data, but the result should always be the same: free up disk space by eliminating data that is stored twice (or more) on the storage medium. For Windows Storage Server 2012, Windows Server 2012 R2, and Windows Server 2012, Microsoft offers a powerful built-in feature for saving server space: Windows Data Deduplication.

Even though storage space is cheap, administrators should consider data deduplication to optimize server space usage. However, the two methods presented in this article should not be combined: we recommend using either block-based or file-based deduplication, not both.

Deduplicate Data

Windows Data Deduplication is block-based (as opposed to file-based): it works on chunks of files rather than on whole files. Here is a short summary of the process:

  • Deduplication splits all files selected for the process into smaller chunks and compares these chunks with one another.
  • It removes duplicate chunks until only one copy of each chunk remains.
  • It then replaces each removed duplicate with a reference to the remaining copy.
  • Afterwards, it organizes the chunks into container files located in the System Volume Information folder.
  • Each file is now a collection of pointers to different chunks.

Since the deduplicated files and the chunk store together require less disk space than the unoptimized files, this frees up server space.
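The process summarized above can be illustrated with a minimal Python sketch. It is not Microsoft's implementation: the fixed chunk size, the SHA-256 comparison, and the names deduplicate, chunk_store, and pointers are assumptions chosen only to keep the example readable.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical and deliberately tiny so the demo stays readable


def deduplicate(files, chunk_size=CHUNK_SIZE):
    """Split files into chunks and keep only one copy of each distinct chunk.

    files: dict mapping a file name to its content as bytes.
    Returns (chunk_store, pointers).
    """
    chunk_store = {}  # digest -> chunk bytes; every distinct chunk is stored once
    pointers = {}     # file name -> ordered list of digests (the "references")
    for name, data in files.items():
        refs = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            chunk_store.setdefault(digest, chunk)  # a duplicate chunk is not stored again
            refs.append(digest)
        pointers[name] = refs
    return chunk_store, pointers


files = {
    "a.txt": b"AAAABBBBCCCC",
    "b.txt": b"AAAABBBBDDDD",
}
chunk_store, pointers = deduplicate(files)

raw_bytes = sum(len(data) for data in files.values())
stored_bytes = sum(len(chunk) for chunk in chunk_store.values())
print(raw_bytes, stored_bytes)  # prints: 24 16
```

The two sample files share their first two chunks, so only four of the six chunks need to be stored. Each file is then just a list of references into the chunk store.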

If a user accesses a file, the file system assembles it in the background. For the user, nothing changes: they can still access and work with all files, and there is no noticeable difference in behavior.
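Conceptually, assembling a file means following its pointers into the chunk store and concatenating the chunks in order. The sketch below shows only this idea; the chunk IDs, file names, and the reassemble helper are hypothetical and do not reflect the on-disk format Windows uses.

```python
def reassemble(refs, chunk_store):
    """Rebuild a file's content by following its pointers into the chunk store."""
    return b"".join(chunk_store[ref] for ref in refs)


# Hypothetical chunk store; short IDs stand in for real chunk identifiers.
chunk_store = {"c1": b"Hello, ", "c2": b"world", "c3": b"team"}

# Each file is a list of pointers; both files share chunk "c1".
pointers = {
    "greeting1.txt": ["c1", "c2"],
    "greeting2.txt": ["c1", "c3"],
}

restored = reassemble(pointers["greeting1.txt"], chunk_store)
print(restored)  # prints: b'Hello, world'
```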

Microsoft's documentation describes Windows Data Deduplication in more detail.

Other Forms of Deduplication

TreeSize and SpaceObServer offer a different way to deduplicate data: file-based deduplication.

Both disk space managers scan storage media to find duplicate files (for example, by comparing file names or hash values). If multiple instances of a file are found on the same partition, TreeSize and SpaceObServer can replace these duplicates with NTFS hard links (several directory entries that point to the same content). If the files are stored on different partitions, administrators can decide to simply keep them or opt to replace them with symbolic links.
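The general idea of file-based deduplication can be sketched in a few lines of Python. This is not how TreeSize or SpaceObServer are implemented: the function names, the SHA-256 comparison, and the ".dedup-tmp" suffix are assumptions made for illustration, and the sketch only handles files on the same partition.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, block_size=65536):
    """Return the SHA-256 hex digest of a file, read block by block."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()


def find_duplicates(root):
    """Group regular files under root by content hash; return groups of 2 or more."""
    groups = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in sorted(names):
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


def replace_with_hard_links(paths):
    """Keep the first file and turn every other path into a hard link to it."""
    original, *duplicates = paths
    for dup in duplicates:
        if os.path.samefile(original, dup):
            continue  # already a hard link to the same content
        tmp = dup + ".dedup-tmp"  # hypothetical temporary name
        os.link(original, tmp)    # create the new link first ...
        os.replace(tmp, dup)      # ... then swap it in for the duplicate
```

A caller would pass each group returned by find_duplicates(root) to replace_with_hard_links. Because all hard links share one copy of the content, a change made through one path is visible through every other path, so it is worth trying this on a test folder first.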

Again, users will see no difference: they can access or modify the files just like before. The only change is more free disk space on the computer or server.