Deduplication research papers

When evaluating data deduplication, trial vendors' products in your own environment with your own data over several backup cycles to determine each product's impact on your backup/recovery environment. Product selection should focus less on reduction ratios as a decision factor. ESG research (ESG Research Report, "Data Protection Market Trends," January 2008) found that, not surprisingly, the cost of the deduplication solution was the most frequently cited factor (although savings from capacity reduction often overcome financial objections to deploying deduplication). Beyond cost, the survey data suggests that ease of deployment, ease of use, and the impact on backup/recovery performance were more important considerations than technical characteristics such as the deduplication ratio.

Your explanation of the difference between ‘size’ and ‘size on disk’ for the Program Files folder is incorrect. The ‘size’ value will not decrease due to hard links from deduplication. Quite the opposite: with deduplication, ‘size on disk’ becomes smaller than ‘size’. What you’re seeing with the Program Files folder is due to the allocation unit size of the volume: every file smaller than one allocation unit still consumes a full unit. Thus the ‘ GB’ of data actually need ‘ GB’ to be stored on that volume.
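To make the allocation-unit effect concrete, here is a minimal Python sketch. The function names and the 4096-byte cluster size are assumptions chosen for illustration, and the model is simplified: it ignores filesystem-specific cases such as very small files stored inside metadata records, compression, and deduplication.

```python
def logical_size(file_sizes):
    """Sum of the files' byte lengths (what 'size' reports)."""
    return sum(file_sizes)


def size_on_disk(file_sizes, cluster_size=4096):
    """Sum of per-file allocations, each rounded up to whole clusters.

    cluster_size is an assumed allocation unit size in bytes.
    """
    total = 0
    for size in file_sizes:
        clusters = (size + cluster_size - 1) // cluster_size  # ceiling division
        total += clusters * cluster_size
    return total


# Hypothetical file sizes in bytes: many small files inflate 'size on disk'.
sizes = [100, 4096, 5000, 0]
print(logical_size(sizes), size_on_disk(sizes))
```

In this simplified model, a folder with many small files shows a ‘size on disk’ larger than its ‘size’, which matches the behavior described above.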

If the disks are connected to a RAID controller, it is most efficient to configure it as an HBA in JBOD mode (i.e. turn off the RAID function). If a hardware RAID card is used, ZFS always detects all data corruption but cannot always repair it, because the hardware RAID card will interfere. Therefore, the recommendation is to not use a hardware RAID card, or to flash a hardware RAID card into JBOD/IT mode. For ZFS to be able to guarantee data integrity, it needs either access to a RAID set (so all data is copied to at least two disks) or, if a single disk is used, redundancy enabled through copies, which duplicates the data on the same logical drive. ZFS copies is a good feature to use on notebooks and desktop computers, since the disks are large and it provides at least some limited redundancy with just a single drive.
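The difference between detecting and repairing corruption can be illustrated with a toy Python model. This is not how ZFS is implemented: the `Block` class, its methods, and the use of SHA-256 are assumptions made for this sketch. It only shows why a checksum alone allows detection, while repair needs at least one intact redundant copy.

```python
import hashlib


class Block:
    """Toy model of a checksummed block stored as one or more replicas."""

    def __init__(self, data: bytes, copies: int = 1):
        self.checksum = hashlib.sha256(data).digest()
        self.replicas = [bytearray(data) for _ in range(copies)]

    def _is_intact(self, replica) -> bool:
        return hashlib.sha256(replica).digest() == self.checksum

    def read(self) -> bytes:
        # Find the first replica whose contents still match the checksum.
        good = next((bytes(r) for r in self.replicas if self._is_intact(r)), None)
        if good is None:
            # Corruption is detected, but there is nothing to repair from.
            raise OSError("corruption detected, no intact copy available")
        # Overwrite any damaged replica with the known-good data.
        for i, replica in enumerate(self.replicas):
            if not self._is_intact(replica):
                self.replicas[i] = bytearray(good)
        return good


# Two copies: a damaged replica is detected and rewritten on read.
block = Block(b"payload", copies=2)
block.replicas[0][0] ^= 0xFF  # simulate silent corruption
print(block.read())
```

With `copies=1`, the same corruption still raises an error on read, but there is no second copy to restore the data from.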
