What is the difference between Data Domain and Avamar?

Let's first look at what duplicate data is.

What is duplicate data?

There are two ways to understand duplicate data. The first is at the file level: two files are exactly the same. For example, when we send a file by email, we keep one copy and each recipient gets another. The second is at the data-block level: a file or database is cut into small blocks of a few KB each. The blocks are not of fixed length. The system analyzes the data and cuts it into variable-length blocks, so the blocks cut from a single file may be 4 KB, 8 KB, or 10 KB, and deduplication is then performed on those blocks according to the algorithm.

Classification of deduplication technology

1) Divided by the location where the deduplication operation occurs:

1. Deduplication at the data source (host-based): data is deduplicated on the host first and then backed up. Avamar is an example.

2. Inline (online) deduplication on the backup target: Data Domain is the representative of this technology. Incoming data is deduplicated before it is saved to disk, so backup and deduplication happen at the same time. The advantage of inline processing is that it saves disk space and deduplicates in a single step, which is very simple. The disadvantage is that it consumes a lot of CPU resources, which can degrade performance.

End users should first figure out where most of their duplicate data arises, and then decide whether to deduplicate at that location. For example, when someone in an enterprise sends an email with attachments to all employees and the data is stored on the host, host-based deduplication is a good fit.

2) Divided by the technology used for deduplication:

1. File-level deduplication: duplicate files are removed, generally by hash or byte-by-byte comparison. EMC Celerra's deduplication technology is an example.

2. Fixed-block deduplication: the file is first cut into fixed-size blocks and each block is then deduplicated, so the CPU requirements are very high.

3. Variable-block deduplication: similar to fixed-block deduplication, except that the cutting method is more flexible. Both Data Domain (DD) and Avamar use this technology.

4. Data compression: compression and deduplication are generally regarded as two different concepts, but compression can be seen as variable-size, bit-level deduplication. When deduplication is used on EMC Celerra, compression is turned on by default, although it can also be turned off.

The difference between Data Domain and Avamar

With that foundation in place, we can get to the point. The difference between DD and Avamar is very clear.

Avamar

EMC Avamar software reduces the amount of backup data at the source before it is transferred across the network and stored to disk, enabling fast, efficient, and reliable data protection. Unlike traditional solutions, Avamar finds redundant sub-file data segments across all servers, desktops, laptops, and offices distributed around the world. Avamar's patented global deduplication technology ensures that each backup data segment is stored only once globally. This reduces the total amount of data moved and stored each day to as little as 1/500, and makes daily full backups possible over existing LAN/WAN bandwidth. Avamar's deduplication also reduces the disk space required to protect primary data to as little as 1/50, which lets enterprises keep backups on disk for longer. Additionally, freeing up storage space reduces the need to add more physical storage, significantly lowering capital expenditures and operating costs in areas such as floor space, power, and cooling.

Data Domain

Data Domain's distinguishing technology is inline (online) deduplication. Data Domain performs deduplication in CPU and memory: data is not written to disk first, and only the deduplicated result is written to the SATA disks. This is the most essential difference from traditional VTLs that offer deduplication. Other VTLs are less efficient at deduplication because the data is first written to the SATA disks, then read back, deduplicated, and written again. Compared with Data Domain, that puts roughly three times the read/write load on the back-end disks, which is where Data Domain's advantage lies.
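The variable-length block deduplication described in the classification above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm used by Avamar or Data Domain: the names `split_variable` and `DedupStore`, the window size, the mask, and the chunk limits are all hypothetical, and the window hash is recomputed at every position for clarity where a real system would use a rolling hash.

```python
import hashlib
import random

WINDOW = 16        # trailing bytes that decide whether a position is a boundary
MASK = 0x3FF       # cut when the low 10 bits of the window hash are zero
MIN_CHUNK = 256    # illustrative limits, much smaller than production values
MAX_CHUNK = 8192


def split_variable(data):
    """Split data into variable-length chunks at content-defined boundaries."""
    chunks, start = [], 0
    for i in range(len(data)):
        length = i - start + 1
        if length < MIN_CHUNK:
            continue
        # The boundary decision depends only on the last WINDOW bytes,
        # so it moves with the content when bytes are inserted earlier.
        window = data[i - WINDOW + 1 : i + 1]
        fp = int.from_bytes(hashlib.blake2b(window, digest_size=4).digest(), "big")
        if (fp & MASK) == 0 or length >= MAX_CHUNK:
            chunks.append(data[start : i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks


class DedupStore:
    """Keeps each distinct chunk once, keyed by its SHA-256 digest."""

    def __init__(self):
        self.blocks = {}

    def write(self, data):
        """Store data and return its recipe (ordered list of chunk digests)."""
        recipe = []
        for piece in split_variable(data):
            digest = hashlib.sha256(piece).hexdigest()
            self.blocks.setdefault(digest, piece)  # duplicates are not stored again
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from its recipe."""
        return b"".join(self.blocks[d] for d in recipe)

    def stored_bytes(self):
        return sum(len(piece) for piece in self.blocks.values())


# Two versions of a file: the second has ten bytes inserted in the middle.
rng = random.Random(0)
v1 = bytes(rng.getrandbits(8) for _ in range(65536))
v2 = v1[:30000] + b"ten bytes!" + v1[30000:]

store = DedupStore()
recipe1 = store.write(v1)
recipe2 = store.write(v2)
print(len(v1) + len(v2), "bytes written,", store.stored_bytes(), "bytes stored")
```

Because boundaries are chosen from the content rather than from fixed offsets, the chunks before the insertion are unchanged and the chunks after it usually line up again after a short distance, so the second version adds little new data to the store. With fixed-size blocks, the same insertion would shift every later block and defeat deduplication for the rest of the file.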

In EMC's data deduplication technology blueprint, Avamar and Data Domain are given different goals. Avamar focuses on the source end and is geared toward areas such as VMware virtualization environments, backup servers, and online replication; most recently, EMC has extended Avamar into the desktop and mobile office fields. Data Domain focuses on the target end, that is, the storage, backup and archiving, and disaster recovery equipment connected to the back end of the business system. Avamar software and the Data Domain deduplication storage system are the core of the current EMC deduplication solution, as shown in the figure below.

There are good reasons for combining the Avamar and Data Domain deduplication solutions. Avamar is software and the Data Domain products are hardware. Avamar deduplicates at the data source, while the core of the Data Domain products is deduplication on the target side. The two are highly complementary and can form a comprehensive deduplication solution.

EMC Avamar 6.x can use upgraded Data Domain systems as centralized data storage, as well as Avamar's own Data Store (now doubled in capacity). Avamar software deduplicates data files at the source before they are sent to the Data Store over a network link. Traditional deduplication systems act as targets and deduplicate either as the data arrives, before it is written to disk (inline deduplication, as in Data Domain), or after it has been written (post-process deduplication). With DD Boost, part of the deduplication work is done on the backup server before backup files are sent to the Data Domain system, which speeds up deduplication. Because DD Boost is policy-based and built into the Avamar client, Avamar 6.x can use it when sending data to Data Domain systems.
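The idea of deduplicating at the source, before data crosses the network, can be illustrated with a small sketch. It assumes a simplified exchange in which the client sends chunk digests first and uploads only the chunks the target does not already hold. This is not the actual Avamar or DD Boost protocol; `BackupTarget`, `source_side_backup`, and the fixed 4 KB chunk size are hypothetical names and values chosen to keep the example short, and the small amount of traffic used by the digests themselves is ignored.

```python
import hashlib

CHUNK_SIZE = 4096  # fixed size for brevity; the products use variable-length blocks


class BackupTarget:
    """Stands in for the backup store on the far side of the network."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes

    def missing(self, digests):
        """Tell the client which digests the target has never seen."""
        return [d for d in digests if d not in self.chunks]

    def store(self, digest, chunk):
        self.chunks[digest] = chunk


def source_side_backup(data, target):
    """Back up data and return the number of payload bytes actually sent."""
    pieces = [data[i : i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    by_digest = {hashlib.sha256(p).digest(): p for p in pieces}
    sent = 0
    # Only chunks the target lacks are transferred.
    for digest in target.missing(list(by_digest)):
        target.store(digest, by_digest[digest])
        sent += len(by_digest[digest])
    return sent


# Day 1: 100 distinct chunks. Day 2: the same data with one chunk changed.
day1 = b"".join(bytes([i]) * CHUNK_SIZE for i in range(100))
day2 = day1[: 7 * CHUNK_SIZE] + b"\xff" * CHUNK_SIZE + day1[8 * CHUNK_SIZE :]

target = BackupTarget()
first_sent = source_side_backup(day1, target)
second_sent = source_side_backup(day2, target)
print("day 1 sent:", first_sent, "bytes; day 2 sent:", second_sent, "bytes")
```

In this sketch the second day's backup is logically a full backup of the whole data set, yet only the one changed chunk travels over the link. A target-side appliance fed by traditional backup software would receive the full stream and discard the duplicates on arrival.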
Avamar 6.x can use the built-in DD Boost to back up Exchange, Oracle, SharePoint, SQL Server, and VMware images to Data Domain targets. The effective capacity of Avamar's own Data Store has doubled to 124 TB; by comparison, the DD890 target system has 285 TB of usable capacity. Users can send backups of applications supported by DD Boost to Data Domain targets, while backups of other applications go to the Avamar Data Store, maximizing overall backup performance and speeding up Avamar clients.

Avamar's key advantages

Data Domain is a target-based deduplication appliance whose approach is very different from Avamar's, yet many people lump the two together as "data deduplication" tools. Avamar has several key advantages over Data Domain, which fall into the following four categories:

1. Avamar deduplicates data on the client side, before it is transmitted over the network. This speeds up daily full backups by up to 10x while reducing the network bandwidth required each day by up to 500x. Data Domain relies on traditional backup software, which moves approximately 200% of the protected primary data through the network and backup servers each week (very inefficient). Avamar therefore avoids the backup problems caused by network congestion and overloaded backup servers. Avamar is particularly well suited to VMware environments, because backup data is reduced before it is transferred across the underlying physical server infrastructure. In addition, smaller remote offices can deploy only the Avamar software agent (no additional hardware required), whereas Data Domain requires additional hardware at each location.

2. Avamar provides server high availability through patented RAIN technology, so customers can confidently and cost-effectively store years of accumulated backups on disk. Data Domain relies on backup servers that can become single points of failure; if the backup server fails, recovering data from the Data Domain appliance is impractical. Data Domain appliances can also become a performance bottleneck, because data is deduplicated inline before being stored to disk. Data Domain also lacks automated daily integrity checks of backup data and servers.

3. Avamar enables simple one-step recovery, eliminating the need to restore a full backup and then apply subsequent incremental backups to reach the desired recovery point. Avamar also provides centralized management and at-a-glance dashboard monitoring of Avamar environments (this feature is not available with Data Domain).

4. Avamar provides full replication for disaster recovery. Unlike Data Domain, which requires a backup server at the target site for functional disaster recovery, Avamar's replication includes both servers and storage, so it can perform immediate recovery from disk storage at any time. Additionally, Avamar is a software solution that uses an open storage platform and can scale to higher capacities than Data Domain appliances.

The combination of Data Domain, Avamar and NetWorker

At present, EMC will not combine Data Domain and Avamar, but Avamar can be combined with NetWorker. For example, the NetWorker backup client now offers two options: when it runs a backup, you can choose whether or not to deduplicate.
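The one-step recovery mentioned in the list of advantages above can be contrasted with a traditional full-plus-incremental restore in a short sketch. It assumes a hash-addressed store in which every backup is recorded as a complete map from path to content digest. This is an illustrative model only, not Avamar's actual implementation; `restore_from_chain`, `ChunkStore`, `take_snapshot`, and `restore_snapshot` are hypothetical names, and whole files stand in for sub-file segments to keep the code short.

```python
import hashlib


def restore_from_chain(full, incrementals):
    """Traditional restore: load the full backup, then replay each incremental."""
    files = dict(full)
    for inc in incrementals:
        for path, content in inc.items():
            if content is None:
                files.pop(path, None)   # None marks a deleted file
            else:
                files[path] = content
    return files


class ChunkStore:
    """Stores each distinct piece of content once, keyed by SHA-256 digest."""

    def __init__(self):
        self.chunks = {}

    def put(self, content):
        digest = hashlib.sha256(content).hexdigest()
        self.chunks.setdefault(digest, content)
        return digest


def take_snapshot(store, files):
    """Record a complete view of the file set as path -> digest."""
    return {path: store.put(content) for path, content in files.items()}


def restore_snapshot(store, snapshot):
    """One-step restore: every snapshot is already a logical full backup."""
    return {path: store.chunks[d] for path, d in snapshot.items()}


day0 = {"a.txt": b"alpha", "b.txt": b"bravo"}
day1 = {"a.txt": b"alpha v2", "b.txt": b"bravo"}
day2 = {"a.txt": b"alpha v2", "c.txt": b"charlie"}

# Traditional approach: one full backup plus two incrementals.
inc1 = {"a.txt": b"alpha v2"}
inc2 = {"b.txt": None, "c.txt": b"charlie"}
chain_result = restore_from_chain(day0, [inc1, inc2])

# Deduplicated approach: three snapshots that share unchanged content.
store = ChunkStore()
snapshots = [take_snapshot(store, day) for day in (day0, day1, day2)]
one_step_result = restore_snapshot(store, snapshots[2])
```

Both approaches reach the same day-2 state, but the chain restore has to process every backup since the last full, while the snapshot restore reads a single record. Deduplication is what makes keeping a full record for every day affordable, since unchanged content is stored only once.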