Disadvantages and limitations of using Data Deduplication techniques

IBM Tivoli Storage Manager offers two types of data deduplication: server-side and client-side. However, like any other deduplication technique, each has limitations and conditions. The following are the limitations documented by IBM that you should consider before implementing data deduplication.

TSM Data Deduplication Limitations & Disadvantages

TSM Version Support
Server-side data deduplication is available only with IBM Tivoli Storage Manager V6.1 or later servers. For optimal efficiency when using server-side data deduplication, upgrade to the backup-archive client V6.1 or later. Client-side data deduplication is available only with Tivoli Storage Manager V6.2 or later servers and backup-archive clients V6.2 or later.

Eligible Storage Pools
Data on random-access disk or on tape cannot be deduplicated. Only data in storage pools that are associated with sequential-access disk devices (FILE) can be deduplicated. You must enable FILE storage pools for data deduplication. Client files must be bound to a management class that specifies a deduplication-enabled storage pool.
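
The pool rule above can be expressed as a small check. This is a minimal Python sketch for illustration only, not a Tivoli Storage Manager API: the StoragePool class, its fields, and the device-type labels are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class StoragePool:
    """Hypothetical model of a storage pool; not a real TSM object."""
    name: str
    device_type: str            # illustrative labels such as "FILE", "DISK", "LTO"
    dedup_enabled: bool = False


def pool_can_deduplicate(pool: StoragePool) -> bool:
    """True only for a sequential-access disk (FILE) pool with deduplication enabled."""
    return pool.device_type.upper() == "FILE" and pool.dedup_enabled


pools = [
    StoragePool("FILEPOOL", "FILE", dedup_enabled=True),
    StoragePool("DISKPOOL", "DISK", dedup_enabled=True),  # random-access disk: not eligible
    StoragePool("TAPEPOOL", "LTO"),                       # tape: not eligible
]
eligible = [p.name for p in pools if pool_can_deduplicate(p)]
print(eligible)
```

Only the FILE pool with deduplication enabled passes the check. A FILE pool that has not been enabled for deduplication is rejected as well.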


Encrypted files
The Tivoli Storage Manager server and the backup-archive client cannot deduplicate encrypted files. If an encrypted file is encountered during data deduplication processing, the file is not deduplicated, and a message is logged.

You do not have to process encrypted files separately from files that are eligible for client-side data deduplication. Both types of files can be processed in the same operation. However, they are sent to the server in different transactions. As a security precaution, you can take one or more of the following steps:
  • Enable storage-device encryption together with client-side data deduplication.
  • Use client-side data deduplication only for nodes that are secure.
  • If you are uncertain about network security, enable Secure Sockets Layer (SSL).
  • If you do not want certain objects (for example, image objects) to be processed by client-side data deduplication, you can exclude them on the client. If an object is excluded from client-side data deduplication and it is sent to a storage pool that is set up for data deduplication, the object is deduplicated on the server.
  • Use the SET DEDUPVERIFICATIONLEVEL command to detect possible security attacks on the server during client-side data deduplication. Using this command, you can specify a percentage of client extents for the server to verify. If the server detects a possible security attack, a message is displayed.

File size
Only files that are more than 2 KB are deduplicated. Files that are 2 KB or less are not deduplicated.
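
The two file-level rules described so far (encrypted files are skipped, and files of 2 KB or less are skipped) can be combined into one predicate. This is a hedged sketch: the function name and the assumption that 1 KB means 1024 bytes are illustrative choices, not taken from the product documentation.

```python
# Assumption: "2 KB" is interpreted as 2 * 1024 bytes.
MIN_DEDUP_SIZE_BYTES = 2 * 1024


def file_is_dedup_candidate(size_bytes: int, encrypted: bool) -> bool:
    """Return True if a file passes both file-level rules.

    - Encrypted files are never deduplicated.
    - Only files strictly larger than 2 KB are deduplicated.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return (not encrypted) and size_bytes > MIN_DEDUP_SIZE_BYTES


print(file_is_dedup_candidate(2048, encrypted=False))   # exactly 2 KB
print(file_is_dedup_candidate(2049, encrypted=False))   # just over 2 KB
print(file_is_dedup_candidate(10_000, encrypted=True))  # large but encrypted
```

Note the strict comparison: a file of exactly 2 KB does not qualify.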

Operations that preempt client-side data deduplication
The following operations take precedence over client-side data deduplication:
  • LAN-free data movement
  • Subfile backup operations
  • Simultaneous-write operations
  • Server-initiated sessions
Do not schedule or enable any of those operations during client-side data deduplication. If any of those operations occur during client-side data deduplication, client-side data deduplication is turned off, and a message is issued to the error log.
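
The precedence rule above can be modeled as a simple conflict check, for example in a script that reviews a planned schedule. This is a sketch under stated assumptions: the operation labels and the function name are hypothetical and do not correspond to Tivoli Storage Manager options or commands.

```python
from typing import Iterable, List, Tuple

# Hypothetical labels for the four operations that take precedence.
PREEMPTING_OPERATIONS = frozenset({
    "lan-free",
    "subfile-backup",
    "simultaneous-write",
    "server-initiated",
})


def client_side_dedup_active(requested: bool,
                             active_operations: Iterable[str]) -> Tuple[bool, List[str]]:
    """Return (is_active, conflicts).

    Client-side deduplication stays active only if it was requested and
    none of the preempting operations is in use.
    """
    conflicts = sorted(PREEMPTING_OPERATIONS & set(active_operations))
    return (requested and not conflicts, conflicts)


active, conflicts = client_side_dedup_active(True, ["lan-free", "incremental"])
print(active, conflicts)
```

Returning the list of conflicts alongside the flag makes it easy to report which operation caused client-side deduplication to be turned off.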


Data deduplication of hierarchical storage management data
HSM data from UNIX and Linux clients is ignored by client-side data deduplication. Server-side data deduplication of HSM data from UNIX and Linux clients is allowed. HSM uses the Tivoli Storage Manager application programming interface (API), which can deduplicate client data. Server-side deduplication of HSM data from Microsoft Windows clients is allowed.

Collocation
You can use collocation for storage pools that are set up for data deduplication. However, collocation might not have the same benefit as it does for storage pools that are not set up for data deduplication.

By using collocation with storage pools that are set up for data deduplication, you can control the placement of data on volumes. However, the physical location of duplicate data might be on different volumes. No-query restore and other processes remain efficient in selecting volumes that contain non-deduplicated data. However, the efficiency declines when additional volumes are required to provide the duplicate data.

Moving or copying data from a deduplicated storage pool to a non-deduplicated storage pool
When you copy or move data from a deduplicated storage pool to a non-deduplicated storage pool, the data is reconstructed. However, after the data movement or copy operation, the amount of data that is reported as moved or copied is the amount of deduplicated data. For example, suppose that a storage pool contains 20 GB of deduplicated data that represents 50 GB of total file data. If the data is moved or copied, the server reports that 20 GB was moved or copied, even though 50 GB of data was sent.
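
The arithmetic in this example can be worked through in a few lines. The sketch below is illustrative only: the function and key names are hypothetical, and the figures are the 20 GB and 50 GB values from the example above.

```python
def dedup_report(stored_gb: float, logical_gb: float) -> dict:
    """Compare the deduplicated (stored) size with the total (logical) file data."""
    if stored_gb <= 0 or logical_gb <= 0:
        raise ValueError("sizes must be positive")
    saved_gb = logical_gb - stored_gb
    return {
        "reported_moved_gb": stored_gb,      # what the server reports after the move or copy
        "actually_sent_gb": logical_gb,      # reconstructed data that was really transferred
        "space_saved_gb": saved_gb,
        "savings_percent": 100 * saved_gb / logical_gb,
        "dedup_ratio": logical_gb / stored_gb,
    }


report = dedup_report(stored_gb=20, logical_gb=50)
print(report["reported_moved_gb"], report["actually_sent_gb"])
print(report["savings_percent"], report["dedup_ratio"])
```

For the 20 GB and 50 GB figures, the report understates the transferred data by 30 GB, which corresponds to a 60% space saving, or a 2.5:1 deduplication ratio. Keep this gap in mind when sizing the target pool or estimating how long the move will take.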

When should you not enable data deduplication?

Using Tivoli Storage Manager data deduplication can provide several advantages. However, data deduplication is not appropriate in the following situations:

  • Your primary storage of backup data is on a Virtual Tape Library or physical tape. If regular migration to tape is required, the benefits of using data deduplication are lessened, since the purpose of data deduplication is to reduce disk storage as the primary location of backup data.
  • You have no flexibility with the backup processing window. Tivoli Storage Manager data deduplication processing requires additional resources, which can extend backup windows or server processing times for daily backup activities.
  • Your restore processing times must be fast. Restore performance from deduplicated storage pools is slower than from a comparable disk storage pool that does not use data deduplication. If fast restore performance from disk is a high priority, restore performance benchmarking must be done to determine whether the effects of data deduplication can be accommodated.
