1. What does RAID stand for?
In 1987, Patterson, Gibson and Katz at
the University of California, Berkeley, published a paper entitled "A
Case for Redundant Arrays of Inexpensive Disks (RAID)". This paper
described various types of disk arrays, referred to by the acronym
RAID. The basic idea of RAID was to combine multiple small, inexpensive
disk drives into an array of disk drives which yields performance
exceeding that of a Single Large Expensive Drive (SLED). Additionally,
this array of drives appears to the computer as a single logical
storage unit or drive.
The Mean Time Between Failures (MTBF) of
a non-redundant array is approximately the MTBF of an individual drive, divided by
the number of drives in the array. Because of this, the MTBF of such an
array would be too low for many application requirements.
However, disk arrays can be made fault-tolerant by redundantly storing
information in various ways.
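
The MTBF rule above can be tried out with a few lines of Python. This is only a sketch of the simple division described in the text; the function name and the 500,000 hour drive rating are made up for illustration.

```python
def array_mtbf(drive_mtbf_hours, num_drives):
    """MTBF of a non-redundant array: single-drive MTBF divided by drive count."""
    if num_drives < 1:
        raise ValueError("num_drives must be at least 1")
    return drive_mtbf_hours / num_drives

# Hypothetical drive rating of 500,000 hours, used only as an example.
for n in (1, 4, 8):
    print(n, "drives:", array_mtbf(500_000, n), "hours")
```

The more drives the array has, the sooner some drive in it can be expected to fail, which is why redundancy is needed.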
Five types of array architectures,
RAID-1 through RAID-5, were defined by the Berkeley paper, each
providing disk fault-tolerance and each offering different trade-offs
in features and performance. In addition to these five redundant array
architectures, it has become popular to refer to a non-redundant array
of disk drives as a RAID-0 array.
2. Data Striping
Fundamental to RAID is "striping", a
method of concatenating multiple drives into one logical storage unit.
Striping involves partitioning each drive's storage space into stripes
which may be as small as one sector (512 bytes) or as large as several
megabytes. These stripes are then interleaved round-robin, so that the
combined space is composed alternately of stripes from each drive. In
effect, the storage space of the drives is shuffled like a deck of
cards. The type of application environment, I/O or data intensive,
determines whether large or small stripes should be used.
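
The round-robin interleaving can be sketched as a mapping from a logical block number to a drive and a block on that drive. The function and parameter names are illustrative only, and real implementations may lay out their stripes differently.

```python
def locate(logical_block, stripe_blocks, num_drives):
    """Map a logical block to (drive index, block number on that drive).

    stripe_blocks is the stripe size in blocks; stripes are dealt out
    to the drives round-robin, like cards from a deck.
    """
    stripe_index, offset = divmod(logical_block, stripe_blocks)
    drive = stripe_index % num_drives    # which drive gets this stripe
    row = stripe_index // num_drives     # how many stripes precede it on that drive
    return drive, row * stripe_blocks + offset

# Three drives, stripes of four blocks each.
for block in (0, 3, 4, 11, 12, 13):
    print(block, "->", locate(block, 4, 3))
```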
Most multi-user operating systems today,
like NT, Unix and Netware, support overlapped disk I/O operations
across multiple drives. However, in order to maximize throughput for
the disk subsystem, the I/O load must be balanced across all the drives
so that each drive can be kept busy as much as possible. In a multiple
drive system without striping, the disk I/O load is never perfectly
balanced. Some drives will contain data files which are frequently
accessed and some drives will only rarely be accessed. In I/O intensive
environments, performance is optimized by striping the drives in the
array with stripes large enough so that each record potentially falls
entirely within one stripe. This ensures that the data and I/O will be
evenly distributed across the array, allowing each drive to work on a
different I/O operation, and thus maximize the number of simultaneous
I/O operations which can be performed by the array.
In data intensive environments and
single-user systems which access large records, small stripes
(typically one 512-byte sector in length) can be used so that each
record will span across all the drives in the array, each drive storing
part of the data from the record. This causes long record accesses to
be performed faster, since the data transfer occurs in parallel on
multiple drives. Unfortunately, small stripes rule out multiple
overlapped I/O operations, since each I/O will typically involve all
drives. However, operating systems like DOS, which do not allow
overlapped disk I/O, will not be negatively impacted. Applications such
as on-demand video/audio, medical imaging and data acquisition, which
utilize long record accesses, will achieve optimum performance with
small stripe arrays.
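
The effect of the stripe size can be illustrated by counting how many drives a single record touches. This is a rough sketch; the function name and the sizes used are assumptions chosen for the example.

```python
def drives_touched(start_byte, length, stripe_bytes, num_drives):
    """Number of drives involved in reading one record."""
    first_stripe = start_byte // stripe_bytes
    last_stripe = (start_byte + length - 1) // stripe_bytes
    # A record spanning more stripes than there are drives uses every drive.
    return min(last_stripe - first_stripe + 1, num_drives)

# A 4 KB record on a four-drive array.
print(drives_touched(0, 4096, 512, 4))        # one-sector stripes
print(drives_touched(0, 4096, 64 * 1024, 4))  # 64 KB stripes
```

With one-sector stripes the record occupies all four drives, so no other I/O can overlap with it. With 64 KB stripes it usually stays on one drive and leaves the other three free for other requests.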
A potential drawback to using small
stripes is that synchronized spindle drives are required in order to
keep performance from being degraded when short records are accessed.
Without synchronized spindles, the drives in the array will be at
different random rotational positions. Since an I/O cannot be completed
until every drive has accessed its part of the record, the drive which
takes the longest will determine when the I/O completes. The more
drives in the array, the more the average access time for the array
approaches the worst case single-drive access time. Synchronized
spindles assure that every drive in the array reaches its data at the
same time. The access time of the array will thus be equal to the
average access time of a single drive rather than approaching the worst
case access time.
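
A simple model shows how quickly the array approaches the worst case. It assumes, purely for illustration, that the rotational delay of each unsynchronized drive is independent and uniformly distributed between zero and one full revolution, and it ignores seek time. Under that assumption the expected delay of the slowest of n drives is n/(n+1) of a revolution.

```python
def expected_rotational_delay(revolution_ms, num_drives):
    """Expected wait for the slowest of num_drives unsynchronized drives.

    Assumes independent, uniformly distributed rotational positions;
    the expected maximum of n such delays is n / (n + 1) revolutions.
    """
    return revolution_ms * num_drives / (num_drives + 1)

# Hypothetical drive with a 10 ms revolution time.
for n in (1, 2, 4, 9):
    print(n, "drives:", expected_rotational_delay(10, n), "ms")
```

For a single drive (or a set of synchronized spindles, which behaves like one) the result is the familiar half revolution. With more drives it climbs toward the full revolution time.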
3. The different RAID levels
RAID-0
RAID Level 0 is
not redundant, hence does not truly fit the "RAID" acronym. In level 0,
data is split across drives, resulting in higher data throughput. Since
no redundant information is stored, performance is very good, but the
failure of any disk in the array results in data loss. This level is
commonly referred to as striping.
RAID-1
RAID Level 1
provides redundancy by writing all data to two or more drives. The
performance of a level 1 array tends to be faster on reads and slower
on writes compared to a single drive, but if one drive fails, no
data is lost. This is a good entry-level redundant system, since only
two drives are required; however, since one drive is used to store a
duplicate of the data, the cost per megabyte is high. This level is
commonly referred to as mirroring.
RAID-2
RAID Level 2,
which uses Hamming error correction codes, is intended for use with
drives which do not have built-in error detection. All SCSI drives
support built-in error detection, so this level is of little use when
using SCSI drives.
RAID-3
RAID Level 3
stripes data at a byte level across several drives, with parity stored
on one drive. It is otherwise similar to level 4. Byte-level striping
requires hardware support for efficient use.
RAID-4
RAID Level 4
stripes data at a block level across several drives, with parity stored
on one drive. The parity information allows recovery from the failure
of any single drive. The performance of a level 4 array is very good
for reads (the same as level 0). Writes, however, require that parity
data be updated each time. This slows small random writes, in
particular, though large writes or sequential writes are fairly fast.
Because only one drive in the array stores redundant data, the cost per
megabyte of a level 4 array can be fairly low.
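
The parity idea can be sketched in a few lines, assuming the usual byte-wise XOR parity (the text above does not spell out the parity function). The block contents and names are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data drives, one block each, plus the parity block.
data = [b"RAID", b"disk", b"unit"]
parity = xor_blocks(data)

# Drive 1 fails: XOR the surviving blocks with the parity to rebuild it.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]

# Small write to drive 1: the new parity follows from the old parity,
# the old data and the new data, without reading the other data drives.
new_block = b"DISK"
new_parity = xor_blocks([parity, data[1], new_block])
assert new_parity == xor_blocks([data[0], new_block, data[2]])
```

The second half shows why small writes are slow: each one needs the old data and the old parity to be read before the new data and the new parity can be written.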
RAID-5
RAID Level 5 is
similar to level 4, but distributes parity among the drives. This can
speed small writes in multiprocessing systems, since the parity disk
does not become a bottleneck. Because parity data must be skipped on
each drive during reads, however, read performance tends to be
considerably lower than that of a level 4 array. The cost per megabyte is the
same as for level 4.
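
How the parity moves from drive to drive can be shown with a small sketch. The rotation rule below is only one possible layout, chosen for illustration; actual implementations differ in where they place the parity.

```python
def parity_drive(stripe_row, num_drives):
    """Drive holding the parity for one stripe row (assumed simple rotation)."""
    return (num_drives - 1 - stripe_row) % num_drives

def layout(rows, num_drives):
    """'P' marks the parity stripe and 'D' a data stripe, one list per row."""
    table = []
    for row in range(rows):
        p = parity_drive(row, num_drives)
        table.append(["P" if d == p else "D" for d in range(num_drives)])
    return table

for row in layout(4, 4):
    print(" ".join(row))
```

In a level 4 array the "P" column would be the same drive in every row, so every write would queue up on that one drive.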
Summary:
* RAID-0 is the
fastest and most efficient array type but offers no fault-tolerance.
* RAID-1 is the
array of choice for performance-critical, fault-tolerant environments.
In addition, RAID-1 is the only choice for fault-tolerance if no more
than two drives are desired.
* RAID-2 is
seldom used today since ECC is embedded in almost all modern disk
drives.
* RAID-3 can be
used in data intensive or single-user environments which access long
sequential records to speed up data transfer. However, RAID-3 does not
allow multiple I/O operations to be overlapped and requires
synchronized-spindle drives in order to avoid performance degradation
with short records.
* RAID-4 offers
no advantages over RAID-5 and does not support multiple simultaneous
write operations.
* RAID-5 is the
best choice in multi-user environments which are not sensitive to write
performance. However, at least three, and more typically five, drives are
required for RAID-5 arrays.
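
The cost per megabyte of the different levels follows from how much of the raw capacity is left for data. The sketch below assumes equal-size drives. The minimum drive counts for levels 0, 3 and 4 are assumptions of this example, and RAID-2 is left out because its overhead depends on the Hamming code used.

```python
def usable_fraction(level, num_drives):
    """Fraction of the raw capacity available for data (equal-size drives)."""
    if level == 0:
        minimum, data_drives = 2, num_drives       # no redundancy
    elif level == 1:
        minimum, data_drives = 2, 1                # every drive holds a full copy
    elif level in (3, 4, 5):
        minimum, data_drives = 3, num_drives - 1   # one drive's worth of parity
    else:
        raise ValueError("level not covered by this sketch")
    if num_drives < minimum:
        raise ValueError("too few drives for this level")
    return data_drives / num_drives

print(usable_fraction(0, 4), usable_fraction(1, 2), usable_fraction(5, 5))
```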
4. Possible approaches to RAID
* Hardware RAID
The
hardware based system manages the RAID subsystem independently from the
host and presents to the host only a single disk per RAID array. This
way the host doesn't have to be aware of the RAID subsystem(s).
o The controller based hardware solution
DPT's SCSI controllers are a good example of a controller based RAID
solution.
The intelligent controller manages the RAID subsystem independently from
the host. The advantage over an external SCSI---SCSI RAID subsystem is
that the controller is able to span the RAID subsystem over multiple
SCSI channels and thereby remove the limiting factor external RAID
solutions have: the transfer rate over the SCSI bus.
o The external hardware solution (SCSI---SCSI RAID)
An external RAID box moves all RAID handling "intelligence" into a
controller that sits in the external disk subsystem. The whole
subsystem is connected to the host via a normal SCSI controller and
appears to the host as a single disk or as multiple disks.
This solution has drawbacks compared to the controller based solution:
The single SCSI channel used in this solution creates a bottleneck.
Newer technologies like Fibre Channel can ease this problem, especially
if they allow multiple channels to be trunked into a Storage Area Network.
4 SCSI drives can already completely flood a parallel SCSI bus, since
the average transfer size is around 4KB and the command transfer
overhead - which even in Ultra SCSI is still done asynchronously - takes
most of the bus time.
* Software RAID
o The MD driver in the Linux kernel is an example of a RAID solution
that is completely hardware independent.
The Linux MD driver currently supports RAID levels 0/1/4/5 + linear
mode.
o Under Solaris you have the Solstice DiskSuite and Veritas Volume
Manager which offer RAID-0/1 and 5.
o Adaptec's AAA-RAID controllers are another example. They have no RAID
functionality whatsoever on the controller; they depend on external
drivers to provide all RAID functionality.
They are basically only multiple single AHA2940 controllers which have
been integrated on one card. Linux detects them as AHA2940 and treats
them accordingly.
Every OS needs its own special driver for this type of RAID solution,
which is error prone and not very compatible.
* Hardware vs. Software RAID
Just
like any other application, software-based arrays occupy host system
memory, consume CPU cycles and are operating system dependent. By
contending with other applications that are running concurrently for
host CPU cycles and memory, software-based arrays degrade overall
server performance. Also, unlike hardware-based arrays, the performance
of a software-based array is directly dependent on server CPU
performance and load.
Except for the array functionality, hardware-based RAID schemes have
very little in common with software-based implementations. Since the
host CPU can execute user applications while the array adapter's
processor simultaneously executes the array functions, the result is
true hardware multi-tasking. Hardware arrays also do not occupy any
host system memory, nor are they operating system dependent.
Hardware arrays are also highly fault tolerant. Since the array logic
is based in hardware, software is NOT required to boot. Some software
arrays, however, will fail to boot if the boot drive in the array
fails. For example, an array implemented in software can only be
functional when the array software has been read from the disks and is
memory-resident. What happens if the server can't load the array
software because the disk that contains the fault tolerant software has
failed? Software-based implementations commonly require a separate boot
drive, which is NOT included in the array.
5. What are the advantages of a multichannel controller?
6. Hardware vs. Software caching?