May 4, 2004
iSCSI: Storage Networking sans the SAN?
Storage Area Networks (SANs) have proven to be a promising technology,
simplifying the management of large and complex storage systems. But
they come at a high cost: First-generation SANs depend on Fibre Channel
(FC) networks, which require laying new cables, learning new skills,
and buying specialized switches. Unfortunately, this has made SANs hard
to justify for all but the largest installations.
Internet SCSI (iSCSI) could change all of that. Because iSCSI runs
over standard TCP/IP links, it's easier to build, manage, and justify
than FC, making SAN technology more affordable and accessible for everyone.
A small iSCSI SAN can be built on top of an existing network infrastructure
and even use ordinary Windows or Linux servers as remote storage arrays.
Larger networks can also benefit since iSCSI turns the SAN into a low-cost
commodity.
If you already use FC, there's probably no reason to move to iSCSI
yet. But if you're about to build a new SAN, iSCSI is a strong contender.
Even if you've hitherto ignored or rejected SAN technology, iSCSI's low
cost means it may be time to reconsider.
Remote RAID
The principle behind a SAN is simple: Transport storage commands and
data over long distances so that a host can connect to a remote storage
device as easily as it connects to internal drives. With iSCSI and FC
alike, this is handled by a local storage interface that communicates
with storage devices across a network of some kind, with local commands
and data being relayed into and out of the network as needed. The host
computer is still responsible for managing the logical drives that have
been exposed to it, but those drives are usually somewhere else, not
inside the host's cabinet.
At first glance, this architecture doesn't appear compelling. Compared
to a local drive, a storage device located on the other side of a network
introduces additional latency and overhead. But a SAN can also offer
powerful benefits, the most important of which is simplified management.
Instead of dealing with local drives on each server, you can consolidate
storage into a single massive RAID unit that presents logical drives
to each system on the network. Backup is also simplified, allowing disk-to-disk
backups and block-level replication between storage devices.
None of these features are unique to iSCSI. However, iSCSI does provide
the ability to run a SAN over TCP/IP, which can scale to longer distances
than rival SAN technologies. And whereas FC networks have very specific
requirements, iSCSI is more flexible. TCP/IP can run over almost any
physical medium, so it offers a choice of topologies. For example, since
iSCSI uses IP, it can use standard Ethernet switches and routers, which cost
less than their FC counterparts, and IP packets can travel across alternative
LAN topologies just as easily. iSCSI can also be easier to manage because
it's based on familiar standards and doesn't involve learning as many
new skills.
IP Channel
From the network's perspective, iSCSI is just another service that
runs over TCP/IP. It can use the same networking stack as other applications,
with clients requesting data from servers. The main difference is that
its function is more specialized. Whereas other layer-7 protocols such
as SMTP are agnostic toward the technologies used at their endpoints,
iSCSI is designed as a way to extend an existing storage technology across
IP networks.
The iSCSI drafts and RFCs are published by the IETF, but are based on the
SCSI specifications from T10, the technical committee of the
ANSI-accredited INCITS that develops and maintains the core
SCSI standard. To the committee, iSCSI is just another SCSI transport
and just as officially sanctioned (though technically it's a superset
of SCSI, providing additional functionality through unique commands and
data formats used for secondary services such as authentication). iSCSI
can be used in exactly the same way as FC, InfiniBand, or local SCSI
cables.
For iSCSI purposes, the SCSI protocol is conceptually similar to TCP/IP's
client/server architecture. Every SCSI link involves a host adapter,
called a SCSI initiator, and a storage device, called a SCSI target.
A local SCSI bus usually connects a single initiator to up to seven targets,
but a SAN allows an unlimited number of each. In iSCSI, the initiator
acts as a client, and the target a server (see figure). The initiator's
iSCSI stack packs SCSI commands and data into IP packets, which are then
unpacked by the target for processing as if they had originated locally.
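To make the packing step concrete, here is a minimal Python sketch of how
an initiator might wrap a SCSI READ(10) command in an iSCSI SCSI Command
header. The field layout follows the 48-byte Basic Header Segment described
in RFC 3720. The function names, the simplified single-level LUN encoding,
and the sample values are illustrative assumptions, and a real initiator
would also handle login, digests, and sequence tracking.

```python
import struct

ISCSI_OP_SCSI_CMD = 0x01   # SCSI Command opcode (RFC 3720)
FLAG_FINAL = 0x80          # F bit: last PDU of the command
FLAG_READ = 0x40           # R bit: data flows from target to initiator
ATTR_SIMPLE = 0x01         # task attribute: simple queueing


def read10_cdb(lba: int, blocks: int) -> bytes:
    """Build a 10-byte SCSI READ(10) command descriptor block."""
    # opcode, flags, 32-bit LBA, group number, 16-bit block count, control
    return struct.pack(">BBIBHB", 0x28, 0, lba, 0, blocks, 0)


def scsi_command_pdu(cdb: bytes, lun: int, task_tag: int,
                     expected_len: int, cmd_sn: int, exp_stat_sn: int) -> bytes:
    """Wrap a CDB in a 48-byte iSCSI Basic Header Segment (hypothetical helper)."""
    flags = FLAG_FINAL | FLAG_READ | ATTR_SIMPLE
    # Assumption: simplified single-level LUN, number in the second byte.
    lun_field = bytes([0, lun & 0xFF]) + bytes(6)
    header = struct.pack(
        ">BB2xB3s8sIIII",
        ISCSI_OP_SCSI_CMD, flags,
        0,                       # TotalAHSLength: no additional headers
        (0).to_bytes(3, "big"),  # DataSegmentLength: a read sends no data out
        lun_field, task_tag, expected_len, cmd_sn, exp_stat_sn,
    )
    return header + cdb.ljust(16, b"\x00")  # CDB field is padded to 16 bytes
```

Calling scsi_command_pdu(read10_cdb(lba=2048, blocks=8), lun=0, task_tag=1,
expected_len=8 * 512, cmd_sn=1, exp_stat_sn=1) yields a 48-byte header (the
512-byte block size is assumed). The initiator would write it to an
established TCP connection, and the target would strip the header and hand
the CDB to its own SCSI layer.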
Virtualized Storage, Virtual Network
iSCSI's most obvious benefit is that it can create a virtual SAN using
existing network infrastructure. However, this benefit is easily overstated.
If a network already has gigabit switches in place with bandwidth to
spare, the benefits of iSCSI are immediate. But most networks aren't
this overprovisioned, and since SCSI involves transporting large
amounts of data very quickly, you'll likely need to spend big bucks on
upgrades to deal with the additional traffic.
Infrastructure reuse is still important, however. Upgrading an existing
network is less expensive than building a new one from scratch, making
SANs cost-effective for small networks that can't justify FC's high price
tag. The resultant mass market will push down iSCSI costs further, to
the point where it replaces FC as the dominant SAN technology. Server
vendors will likely jump on the iSCSI bandwagon just as they did with
FC, offering 1U and blade servers with integrated iSCSI ports.
A lightly burdened departmental server may be able to use a single
100-Mbit/sec Ethernet link for both storage and application traffic,
but a busy database server might easily require a dedicated Gigabit Ethernet
connection for storage traffic alone. If there are many such servers,
the cost of iSCSI can approach that of FC, though IP and Ethernet do
have the advantage that they can be reused in future projects if the
SAN is abandoned. The costs are even higher when storage traffic is sent
over a WAN. However, iSCSI still scores over alternatives here. FC depends
on dedicated cabling and therefore can't run over a public network at
all. Its range limit is about 10km using single-mode fiber.
FC can be made to cover longer distances, but only by using FC over
IP (FCIP), a technology that's much like iSCSI and even uses some of
the same Internet drafts (see "State of the Standards"). The difference
is that whereas iSCSI takes raw SCSI and packs it into IP packets, FCIP
requires that the SCSI data first be packed into Fibre Channel Protocol
(FCP) frames. This adds processing and bandwidth overhead,
so FCIP compares poorly to iSCSI as a standalone means of transferring
storage through IP. Its main use is in interconnecting existing FC networks
without installing new FC hardware.
Traffic Trap
Infrastructure reuse also depends on the types of nonstorage applications
running over the network. iSCSI uses TCP for its reliability and flow-control
services, which are highly sensitive to packet loss and delay. For example,
congestion that results in a packet loss of only 2 to 3 percent is sometimes
enough to cause TCP sessions to drop to very low utilization rates. This
means that much of the network's bandwidth is wasted because of the way
TCP tries to fill the available capacity.
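A rough feel for this sensitivity comes from the well-known Mathis
approximation for steady-state TCP throughput, which is not part of iSCSI
itself and is used here only as an illustrative model. The sketch below
assumes a 1,460-byte segment size and a 1-millisecond round-trip time,
both hypothetical values for a switched LAN.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Approximate steady-state TCP throughput in bits per second."""
    c = math.sqrt(3 / 2)  # constant from the Mathis model, about 1.22
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


# Assumed values: 1,460-byte segments, 1 ms round trip.
for loss in (0.0001, 0.001, 0.02, 0.03):
    mbps = mathis_throughput_bps(1460, 0.001, loss) / 1e6
    print(f"loss {loss:.2%}: about {mbps:,.0f} Mbit/s per connection")
```

Under these assumptions, a 2 percent loss rate caps a single connection at
roughly 100 Mbit/sec, about a tenth of a Gigabit Ethernet link, no matter
how much capacity is nominally available.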
If there's any packet loss or delay, TCP blames congestion and cuts the
data rate in half before gradually ramping back up again. As a result,
a minor amount of periodic congestion can cause the data rate to halve
several times in a row. If the existing network is used for multimedia
or other applications that cause traffic spikes, storage may be better
off on a physically separate network.
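The halving behavior can be sketched with a toy model of TCP's
additive-increase, multiplicative-decrease rule. This is a deliberate
simplification: it ignores slow start, fast recovery, and timeouts, and
the starting window and loss pattern are made-up values.

```python
def aimd_window(rounds: int, loss_rounds: set[int], start: float = 64.0,
                increase: float = 1.0) -> list[float]:
    """Return the congestion window (in segments) after each round trip."""
    window = start
    history = []
    for rnd in range(rounds):
        if rnd in loss_rounds:
            window = max(1.0, window / 2)   # multiplicative decrease on loss
        else:
            window += increase              # additive increase otherwise
        history.append(window)
    return history


# Three consecutive lossy round trips, then seven clean ones.
trace = aimd_window(rounds=12, loss_rounds={2, 3, 4})
print(trace)
```

In this run the window falls from 66 segments to 8.25 after three lossy
round trips, and seven clean round trips later it has recovered only to
15.25. The slow climb after each cut is why brief, repeated congestion
hurts storage throughput so badly.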
There are other reasons to build a dedicated SAN, particularly for
large installations. Networks that increase throughput with Jumbo Frames
(which allow up to 9,000 bytes of data per Ethernet frame, instead of
the traditional 1,500 bytes) will likely want to restrict these to hosts
capable of using them. This avoids performance penalties associated with
fallback algorithms. Security is also enhanced by physical separation,
especially if many different departments or groups have access to the
network.
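The throughput gain from Jumbo Frames can be estimated with simple
arithmetic. The sketch below assumes standard Ethernet framing overhead
(preamble, header, frame check sequence, and inter-frame gap) plus IPv4 and
TCP headers without options. It ignores iSCSI's own headers, so treat the
results as ballpark figures.

```python
ETHERNET_OVERHEAD = 38   # preamble 8 + header 14 + FCS 4 + inter-frame gap 12
TCP_IP_HEADERS = 40      # IPv4 20 + TCP 20, no options (assumed)


def frame_stats(mtu: int, link_bps: float = 1e9) -> tuple[float, float]:
    """Return (payload efficiency, frames per second) for a saturated link."""
    wire_bytes = mtu + ETHERNET_OVERHEAD
    efficiency = (mtu - TCP_IP_HEADERS) / wire_bytes
    frames_per_sec = link_bps / (wire_bytes * 8)
    return efficiency, frames_per_sec


for mtu in (1500, 9000):
    eff, fps = frame_stats(mtu)
    print(f"MTU {mtu}: {eff:.1%} payload, {fps:,.0f} frames/s")
```

On a saturated Gigabit Ethernet link, payload efficiency rises from about
95 percent to about 99 percent, but the larger effect is on frame count:
roughly 81,000 frames per second drop to about 14,000, which means far
fewer per-frame operations for hosts and adapters.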
Infrastructure reuse is often more theoretical than practical, but
even networks that do need to invest in additional infrastructure can
benefit from the lower costs of TCP/IP and Ethernet. Because they're
familiar and use existing skills, most network managers will also find
them easier to manage than FC.
Services as Software
iSCSI's advantages over alternative SAN technologies aren't limited
to the network infrastructure. Because every server already has a network
connection and a TCP/IP stack, iSCSI can use a software-based SCSI initiator.
With FC, servers require a dedicated FC host bus adapter, which is more
expensive than an off-the-shelf Ethernet NIC.
Reusing a server's network connection isn't always feasible, however.
Software initiators can place a high load on a server's own processor
and other system resources, so they're most suitable for lightly burdened
departmental servers.
A data center server that's already working full time will need to
off-load storage processing to dedicated iSCSI hardware. Software initiators
are likely to be useful in a small SAN that shares Ethernet cables with
other IP packets, but not in a larger one with its own dedicated infrastructure.
An iSCSI host interface must provide several services. At the top of
the stack is a SCSI driver. The driver or a subordinate protocol engine
encapsulates raw SCSI commands and data into iSCSI messages, maps the
local drive assignment to remote devices, and is responsible for related
tasks such as authentication. Further down, iSCSI messages must be transferred
over TCP/IP sessions, which may involve services such as error detection
and encryption. At the bottom, the host interface must manage low-level
network media functions.
Except for the Physical-layer requirements, any of these service sets
can be provided in either hardware or software. Each additional software
service requires incrementally greater amounts of processor overhead.
Conversely, hardware services result in lower CPU overhead, but are less
flexible and prevent the network interface from being reused for nonstorage
purposes.
Soft Hosts
At one extreme, Microsoft provides a software-based iSCSI driver for
Windows 2000 and derivative systems. SourceForge.net also hosts a project
that's actively involved in developing an iSCSI driver and initiator
for Linux. Both of these drivers are capable of accessing remote iSCSI
devices through the OS's own TCP/IP stack, but impose significant processor
overhead. They're most useful for high-horsepower servers not already
burdened with significant processing demands.
A step up from the software-only approach, "smart" network adapters
can take on some common network processing tasks. These are designed
to help any server busy with TCP/IP processing, so they don't perform
any iSCSI-specific functions. For example, Alacritech's 100Mbit/sec and
Gigabit Ethernet adapters can perform TCP fast-path computations on the
NIC itself, lightening the load on the host CPU.
Hard Hosts
At the other extreme, dedicated iSCSI adapters from Adaptec, Intel,
or QLogic can provide all necessary TCP/IP and SCSI functions in hardware,
with only a minimalist driver needed in software. To the OS, hardware
initiators are indistinguishable from FC or local SCSI adapters. For
example, Adaptec's 7211 card is effectively a traditional SCSI host adapter,
but one that accesses networked devices through TCP/IP instead of providing
an internal storage bus.
The biggest caveat with hardware initiators is that they typically
operate independently of the host OS. The host's TCP/IP stack can't be
bound to the network adapter, making it difficult or impossible to use
with other network applications and services. The adapter's internal
TCP/IP stack doesn't usually support extended features such as RIP, OSPF,
or other dynamic routing protocols. Similarly, it might support static
DNS, but not Lightweight Directory Access Protocol (LDAP). Finally, hardware
initiators are often unable to use the failover and load-balancing protocols
provided in the host OS.
One advantage of dedicated hardware is that it enables booting from
a SAN-connected drive, something not possible using a software initiator
that requires an OS to be loaded before it can connect. This gives it
the potential for use in diskless workstations, not just in servers,
thus simplifying desktop management.
Soft Targets
iSCSI also entails changes for SCSI targets. Major storage vendors
such as EMC and Network Appliance are already pushing iSCSI versions
of their existing storage products, letting FC customers stick with familiar
suppliers when moving over to iSCSI.
As an alternative, a new breed of vendors is offering software-based
targets that run on off-the-shelf computing platforms. These are currently
aimed at the low end of the market, with low prices to match. For example,
String Bean Software is beta-testing a software target that will allow
Windows 2000 and 2003 servers to share their locally attached storage
over iSCSI. Ardis Technologies offers a similar product for Linux, though
this requires a modified kernel. Other vendors have taken Ardis' approach
a step further, selling iSCSI appliances that consist of several hard
drives in a customized Linux PC.
Software targets can't yet compete with high-end storage products,
but they may be adequate for a departmental or small-office SAN in which
a dedicated appliance can't be justified. The combination of multigigahertz
processors, network accelerator cards, RAID-aware OSs, and warm-swappable
drives allows software targets to deliver decent functionality at a low
cost. While a RAID cabinet can run up to six or seven figures, these
no-frills offerings have the potential to reach megabytes-per-penny.
In the short term, this kind of product won't significantly impact
high-end offerings; it's useful only where advanced software and support
aren't necessary. Over time, however, both Moore's Law and maturing products
should allow software targets to compete head-to-head with Cadillac-class
systems. The breakout will occur within two to four years, when Microsoft
or one of the mainstream Linux distributors includes iSCSI target functionality
in a server OS.
Though currently curiosities for the adventurous, software targets
represent the future of the SAN. Even without them, iSCSI will ultimately
result in accelerated commoditization, lower costs, and wider SAN adoption.
Written by Eric A. Hall.
Copyright © 2004 CMP Media, Inc. Used with permission.