Creating a copy of data is known as a backup.
Why do we need backups?
Backups are needed as an insurance policy against loss of data, which can occur because of:
- Hardware failures
- Human error
- Application failures
- Security breaches, such as hackers or viruses
High-availability storage arrays have reduced the need to recover data after hardware failures, because their hardware availability features can protect data from loss due to such failures. However, these availability features cannot protect against the other factors that can result in loss of data. Backups are also sometimes used as an archive. For instance, government regulations require that certain financial data be kept for a specific number of years; in this context, a backup also becomes an archive.
Backup Architecture
This section looks at traditional backup architectures, which use two methods:
- Direct-attached backups
- LAN-based backups
Direct-attached backups
Many organizations started with a simple backup infrastructure called direct-attached. This topology is also sometimes referred to as host-based or server-tethered backup. Each backup client has dedicated tape devices. Backups are performed directly from a backup client’s disk to a backup client’s tape devices.
Advantages
The key advantage of direct-attached backups is speed. The tape devices can operate at the speed of the channels. Direct-attached backups optimize backup and restore speed, since the tape devices are close to the data source and dedicated to the host.
Disadvantages
- Large numbers of tape devices might be underutilized.
- A wide variety of backup media might be in use.
- Operators could find it difficult to manage tape. Tape devices might be scattered between floors, buildings, or entire metropolitan areas.
- Each server might have unique (and possibly locally created) backup processes and tools, which can complicate backup management and operation.
- It might be difficult to determine if everything is being backed up properly.
- Dispersed backups, multiple media types, diverse tools, and operational complexity can complicate business continuance recovery.
LAN-based backups
LAN-based backup process overview
The backup process is as follows:
1. The metadata server invokes backup client processes on the backup client.
2. The tape control server places tapes into the tape drives.
3. The backup client determines which files require backup.
4. The backup client reads the backup data from disk and writes the backup data to the LAN.
5. The tape control server reads the backup data from the LAN and writes the backup data to the tape.
6. The backup client and the tape control server send metadata to the metadata server, including what was backed up and which tapes the backups used.
7. The metadata server stores the metadata on disk.
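The seven steps above can be sketched as a small in-memory simulation. This is only an illustration of the data flow, not the interface of any real backup product: the class names (`BackupClient`, `TapeControlServer`, `MetadataServer`), the `changed` flag used to pick files, and the Python list standing in for the LAN are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class BackupClient:
    """Hypothetical backup client; files maps name -> (content, changed)."""
    name: str
    files: dict

    def select_files(self):
        # Step 3: the client determines which files require backup.
        return [f for f, (_, changed) in self.files.items() if changed]

    def send(self, filename, lan):
        # Step 4: read the data from disk and write it to the LAN.
        content, _ = self.files[filename]
        lan.append((self.name, filename, content))


@dataclass
class TapeControlServer:
    """Hypothetical tape control server that owns the pooled tape drives."""
    mounted_tape: str = ""
    tape: list = field(default_factory=list)

    def mount(self, label):
        # Step 2: place a tape into a tape drive.
        self.mounted_tape = label

    def drain(self, lan):
        # Step 5: read the data from the LAN and write it to tape.
        written = []
        while lan:
            client, filename, content = lan.pop(0)
            self.tape.append(content)
            written.append((client, filename, self.mounted_tape))
        return written


@dataclass
class MetadataServer:
    """Hypothetical metadata server that records what went to which tape."""
    catalog: list = field(default_factory=list)

    def run_backup(self, client, tape_server, tape_label):
        lan = []                                  # stand-in for the network
        tape_server.mount(tape_label)             # step 2
        for filename in client.select_files():    # steps 1 and 3
            client.send(filename, lan)            # step 4
        records = tape_server.drain(lan)          # step 5
        self.catalog.extend(records)              # steps 6 and 7
        return records


client = BackupClient("db01", {"a.dat": (b"alpha", True), "b.dat": (b"beta", False)})
server = MetadataServer()
records = server.run_backup(client, TapeControlServer(), "TAPE001")
print(records)  # [('db01', 'a.dat', 'TAPE001')]
```

Only `a.dat` is marked as changed, so it is the only file that crosses the simulated LAN and lands in the catalog. In a real deployment the three roles run on separate hosts, which is where the extra data movement discussed later comes from.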
The key advantages of LAN-based backups compared to direct-attached backups are:
- Reduced costs: Pooling tape resources improves tape device utilization and reduces the number of tape drives required, which also results in fewer host bus adapters. Some small servers require backups, but tape drive cost or limited card slot availability can make it impractical to dedicate a tape drive to each of them. LAN backups can address these issues.
- Improved management and operability: Centralized backups reduce management complexity; there are fewer resources to manage, and they are all in one place. Centralizing tape resources into tape control servers improves the productivity of the operations staff, especially when backup clients are scattered across floors of a building, campuses, or cities. Operability can be improved further by utilizing automated, robotic tape libraries.
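A rough way to see the cost argument is to compare drive counts. The sketch below assumes a deliberately simple model that is not from the text: every direct-attached client gets exactly one drive, while a pooled setup only needs enough drives to move the total nightly volume within the backup window. The client sizes, drive rate, and window length are made-up values, and the model ignores the LAN bottleneck discussed under the disadvantages.

```python
import math


def drives_dedicated(client_sizes_gb):
    # Direct-attached: each backup client has its own tape drive.
    return len(client_sizes_gb)


def drives_pooled(client_sizes_gb, drive_gb_per_hour, window_hours):
    # Pooled: drives are shared, so only the aggregate volume matters.
    per_drive_gb = drive_gb_per_hour * window_hours
    return math.ceil(sum(client_sizes_gb) / per_drive_gb)


sizes_gb = [20, 35, 50, 15, 80, 50]   # hypothetical nightly volumes per client
dedicated = drives_dedicated(sizes_gb)
pooled = drives_pooled(sizes_gb, drive_gb_per_hour=20, window_hours=4)
print(dedicated, pooled)  # 6 4
```

With these assumed numbers, six dedicated drives sit mostly idle, while four shared drives could cover the same 250 GB in the window.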
Disadvantages
- Backups impact the host and the application.
- A LAN-based backup adds two additional data movement steps.
- Backups consume host I/O bandwidth, memory, LAN, and CPU resources.
- There could be network issues.
- A LAN-based backup might require dedicated media servers.
- There could be restore and cloning issues.
Additional data movement steps
LAN backups require two additional data movement steps to put the backup data on tape, as illustrated in the figure below.
Additional CPU and memory resources are required on the backup client (compared to directly connected tape devices) to comply with network protocols, format the data, and transmit the data over the network. Note that restore processing in a LAN environment is identical except that the data flows in the opposite direction.
Resource consumption
Like direct-attached backups, LAN backups consume CPU, I/O bandwidth, and memory. Since the final destination of the backup data resides elsewhere on the LAN, additional CPU is required on a tape control server. LAN bandwidth is also required.
Network issues
LAN backups will generally not perform as well as direct-attached backups. Additional data movement steps, network protocol overhead, and network bandwidth limits reduce the speed of backups. If the network segment is not dedicated to backups, backup performance can be erratic, since it is vulnerable to other network activity such as large FTP transfers, video, audio, and email. Even the fastest available network connections can be overwhelmed by a few disk connections. Backup disk I/O consists of intense read activity. Modern cached disk arrays, like the EMC Symmetrix system, process I/O as fast as the channels will allow. Cached arrays with two Ultra SCSI or Fibre Channel connections are capable of exceeding the theoretical and practical rates of even the faster networking technologies. A single logical disk per path can impact the network for lengthy bursts, and multiple logical disks can saturate the network for long periods.
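The bottleneck argument above comes down to simple arithmetic: the backup can run no faster than the slower of the combined disk paths and the LAN. The following sketch shows that calculation. The function names and every number in it (two disk paths at 100 MB/s, a 125 MB/s LAN, 450 GB of data) are hypothetical placeholders rather than measurements of any particular hardware, so substitute figures from your own environment.

```python
def effective_rate(disk_paths_mb_s, lan_mb_s):
    """Return the end-to-end rate and whether the LAN is the bottleneck."""
    disk_total = sum(disk_paths_mb_s)
    return min(disk_total, lan_mb_s), disk_total > lan_mb_s


def backup_hours(data_gb, rate_mb_s):
    # 1 GB is taken as 1000 MB to keep the arithmetic simple.
    return data_gb * 1000 / rate_mb_s / 3600


# Two disk paths feeding one LAN link (all rates are assumed values).
rate, lan_bound = effective_rate([100, 100], lan_mb_s=125)
print(rate, lan_bound)             # 125 True
print(backup_hours(450, rate))     # 1.0   (over the LAN)
print(backup_hours(450, 200))      # 0.625 (at the full disk rate)
```

Under these assumptions the disks could supply 200 MB/s, but the LAN caps the transfer at 125 MB/s, so the same data takes 1.0 hour instead of 0.625 hours. Protocol overhead and competing traffic, which the sketch ignores, would widen the gap further.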
Environments that back up many logical disks to many tape libraries will constrain even the fastest network technologies. Adding LAN bandwidth may not always be technically feasible, since there are often limits on how many high-speed NICs (network interface cards) a server can support. LAN backups can also increase management and troubleshooting complexity. Performing backups through firewalls can be a challenge, and resolving a problem may require the involvement of operations personnel, system administrators, storage administrators, and network administrators.