January 1, 1997
Lessons From the Biggest Site on Earth
Netscape's site today could be your site tomorrow. Here's what you
need to know now.
In mid-September, Netscape became the undisputed king of the hill with
the most-often visited Web site on the Internet. According to I/Pro's
latest Web traffic audits, Netscape was receiving more than 100 million
hits per day. And by early October, Netscape broke the 110 million hit-per-day
mark, which consisted of more than 3 million independent sessions, 10
million pages, and 230 gigabytes of data per day.
Yikes! That level of traffic exceeds that of most local-area networks (LANs),
let alone most Internet sites. How do you even go about putting together
a Web site that can handle that much traffic? To find out, we went to
Netscape's headquarters in Mountain View, Calif., to speak to Robert
Andrews, the company's chief webmaster.
Andrews told us many of the management techniques Netscape uses are
applicable to all Web sites. Netscape manages its infrastructure, servers,
and content in a fairly conservative manner, minimizing the chances of
connection failure while simultaneously optimizing for performance. All
told, Netscape's techniques can be applied to any organization's Web
site, whether small or gigantic, with relative benefit.
The Infrastructure
To pump out 230-plus gigabytes of data a day, you first need lots and
lots of pipe for both local and Internet traffic. Netscape achieves external
connectivity through four T3 (45-Mbps) circuits. These lines are connected
directly to MCI's and Sprint's Internet backbone networks. Since these
companies carry much of the world's Internet backbone traffic, connecting
them directly to Netscape's network also provides the shortest and quickest
connection to Netscape users. Additional circuits dedicated to AT&T,
Uunet, and other aggregate carriers are planned.
There also are OC3 (155-Mbps) circuits dedicated to FTP traffic (the
majority of Netscape's traffic is from downloads). One of the circuits
is installed at MAE (Metropolitan Area Exchange) West in the Silicon
Valley area, and another at MAE East in Reston, Va., where Netscape keeps
a handful of FTP servers for East Coast users to access. "We wanted to
bring the servers closer to the users," Andrews says,
"and MAE East serves a large geographic community." Indeed, MAE East
provides primary connectivity for most of the northeastern United States,
as well as international connectivity for several European network providers.
Each of the FTP installations gets 100 Mbps of Internet connectivity,
and Netscape's Web site (which runs out of Netscape's headquarters) has
access to a cumulative 180 Mbps of Internet connectivity. All of this
bandwidth, while not necessarily required for the level of traffic seen
today, is designed to bear the burden when traffic to Netscape's Web site
reaches terabyte levels, which is expected to happen within the next
few months.
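A back-of-the-envelope calculation helps put these figures in perspective. The sketch below converts a daily data volume into an average sustained bit rate. It assumes decimal units (1 gigabyte = 10^9 bytes) and traffic spread evenly over 24 hours; real traffic peaks well above the average, and the article does not say by how much. The function name is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_mbps(gigabytes_per_day: float) -> float:
    """Average sustained rate in megabits per second for a daily volume.

    Assumes decimal units: 1 gigabyte = 10**9 bytes, 1 Mbps = 10**6 bits/s.
    """
    bits_per_day = gigabytes_per_day * 1e9 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


if __name__ == "__main__":
    # 230 GB/day is the article's current figure; 1,000 GB/day stands in
    # for the "terabyte levels" Netscape expects.
    for label, gb in [("230 GB/day", 230), ("1 TB/day", 1000)]:
        print(f"{label}: about {average_mbps(gb):.1f} Mbps averaged over 24 hours")
```

Under those assumptions, 230 gigabytes a day averages roughly 21 Mbps and a terabyte a day roughly 93 Mbps, which suggests the extra capacity is there mainly to absorb peaks and growth.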
Supporting all of these pipes are dedicated, top-end 7500-class Cisco
Systems Inc. routers. The routers themselves are interconnected using
two sets of local LAN cabling. The primary path is a fast Ethernet LAN
that uses a Cisco Catalyst switch, providing 100 Mbps of dedicated bandwidth
to each system. This is backed up by an FDDI (Fiber Distributed Data
Interface) ring that provides a second 100-Mbps network. The routers
are configured to use BGP (Border Gateway Protocol) for automatic rollover,
so if the Ethernet switch fails for any reason, the routers and devices
will use the FDDI network automatically.
The Servers
The Web and FTP servers used on Netscape's site are optimized for their
separate functions. Each system has a full RAID (Redundant Array of Independent
Disks) for local data storage, 256 MB of system RAM, and highly optimized
system kernels, which do nothing but what they're configured for. They
don't even run SNMP monitoring agents, as that would steal system and
TCP/IP stack time. They're "lean and mean, for maximum performance," Andrews
says. Even the log files are kept on local disks so as to avoid generating
any of the network traffic incurred by using SYSLOG.
Over time, Netscape has been able to work with OS vendors to improve
the reliability and efficiency of the native TCP/IP stacks and kernels
so they're more robust than ever. Whereas some of the systems could support
only a few hundred simultaneous sessions in the past, they're now able
to support several thousand simultaneously. This roughly tenfold improvement
at the OS level has boosted Netscape's ability to keep up with demand. "These
new advances in IP stack technology are allowing us to run more services
on the systems," Andrews says.
Distributing the Load
Although Netscape advertises 20 FTP servers, in reality there are only
six systems: SGI Challenge L servers and HP 9000 H-class systems, each
capable of supporting 4,000 sessions.
The new systems aren't maxed out like the ones Netscape previously
used, so the 4,000th user gets just as much CPU time as the first. The
FTP server software used by all of these boxes is a slightly modified
version of Washington University's freeware FTP server, chosen for its
flexible configuration and logging options.
Of course, all of Netscape's Web servers run Netscape software. A variety
of vendor systems are used as Web servers, including Digital Equipment
Corp., IBM, Silicon Graphics Inc., and Sun Microsystems Inc. Most of
these systems run Netscape's Enterprise Server, except for those requiring
transaction processing like Netscape's General Store.
"Netscape's site is really made up of a variety of specialized systems,
though this isn't readily apparent to most users," Andrews says. Just
as there is a specialized system for the General Store, there are other
systems for online registration and even for the general purpose sites.
The registration system, for example, is used for people who buy Navigator
Personal Edition at a computer store and then need to find an ISP for
their Internet access. The first time a user installs the Personal Edition,
the browser locates the Netscape registration server and prompts the
user through a series of locale and pricing questions. Once the back-end
system has the necessary information, it locates an ISP and creates the
account information on the remote system automatically. All of this is
handled invisibly as far as the user is concerned.
All told, there are almost 20 different systems acting as Web servers.
Some are for the special-purpose systems, but most serve home.netscape.com,
which is the default URL used by the Netscape Navigator client
software.
DNS Naming Issues
Obviously, there isn't just one system that supports home.netscape.com.
There couldn't be, as it alone is responsible for 6.5 million hits in
just one hour. Thus, Netscape has the home host name referenced to 32
separate host names through a combination of round-robin DNS and client
spoofing. These 32 host names are served by eight individual systems.
"Originally, the site was set up using round-robin DNS lookups, meaning
that clients would use whatever system was randomly returned by the DNS
servers," Andrews says. "However, not all DNS clients worked well this
way." Some PC stacks didn't implement support for round-robin at all;
they'd simply use the same IP address all the time, which put too much
of a load on any single server.
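The difference between the two client behaviors can be illustrated with a toy simulation. This is a minimal sketch, not Netscape's DNS configuration: the class and function names are made up for illustration, the four addresses come from the reserved documentation range, and real resolvers and caches behave in more varied ways.

```python
from collections import Counter, deque


class RoundRobinZone:
    """Toy name server that rotates its address list on every query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        answer = list(self._addresses)
        self._addresses.rotate(-1)  # the next query starts with the next address
        return answer


def simulate(queries: int, sticky: bool) -> Counter:
    """Count which server receives each of `queries` requests from one client.

    sticky=False: the client resolves every time and uses the first address.
    sticky=True:  the client resolves once and reuses that address forever.
    """
    # Hypothetical addresses (192.0.2.0/24 is reserved for documentation).
    zone = RoundRobinZone(["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"])
    hits = Counter()
    pinned = None
    for _ in range(queries):
        if sticky:
            if pinned is None:
                pinned = zone.resolve()[0]
            hits[pinned] += 1
        else:
            hits[zone.resolve()[0]] += 1
    return hits


if __name__ == "__main__":
    print("rotating client:", dict(simulate(400, sticky=False)))
    print("sticky client:  ", dict(simulate(400, sticky=True)))
```

In this simplified model the rotating client spreads its requests evenly, while the sticky client sends every request to one address, which is the kind of imbalance Andrews describes.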
To get around these problems, Netscape embedded a bypass technology
directly into the Navigator client. Now when a user enters home.netscape.com
as the URL, Navigator randomly chooses a number between 1 and 32 and
sends a request to homeNN. The result is that 32 host names (home1
through home32) act as the destination for any Navigator client that
is connecting to home.netscape.com. In essence, a Navigator user will
never be able to connect to home.netscape.com, as the client will always
intercept the request and convert it to homeNN.netscape.com.
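The client-side rewrite described above can be sketched in a few lines. This is an illustration of the idea only, not Navigator's actual code: the function names are hypothetical, and the only details taken from the article are the host name home.netscape.com and the range home1 through home32.

```python
import random
from collections import Counter

HOST_COUNT = 32  # home1 through home32, per the article


def pick_home_host(rng: random.Random) -> str:
    """Choose one of the numbered home hosts uniformly at random."""
    n = rng.randint(1, HOST_COUNT)  # randint is inclusive on both ends
    return f"home{n}.netscape.com"


def rewrite_host(host: str, rng: random.Random) -> str:
    """Rewrite home.netscape.com to a numbered host; leave other hosts alone."""
    if host == "home.netscape.com":
        return pick_home_host(rng)
    return host


if __name__ == "__main__":
    rng = random.Random()
    counts = Counter(rewrite_host("home.netscape.com", rng) for _ in range(32_000))
    print("distinct hosts chosen:", len(counts))
    print("busiest and quietest:", max(counts.values()), min(counts.values()))
```

Because each client draws its own number, the load evens out across the 32 names regardless of how any individual DNS resolver behaves, which is the property Netscape was after.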
However, there is a DNS entry for home.netscape.com, which is used
exclusively for non-Navigator clients since they won't be rewriting the
host name. Although the home server has the same content as the other
hosts, it also runs extensive monitoring software that logs the different
types of browsers used by the visiting clients, providing demographic
data to Netscape's webmasters on the types of browsers in use.
Although Netscape's primary site is home.netscape.com, many people
enter www into their Web browsers out of habit, so Netscape's webmasters
have set up a system explicitly for www.netscape.com. Since this system
is accessible to both Navigator and non-Navigator clients, it provides
another data collection point on the distribution of browsers on their
site.
Andrews admits it's highly unlikely that other webmasters will be able
to implement DNS load-balancing into their Web browsers directly, but
he doesn't apologize for what Netscape does. "We had to deal with these
load issues six months before anybody else even had to think about it," he
says. And even though DNS technologies have come a long way since then,
Netscape will continue to use the client-based lookups for the foreseeable
future. "It certainly guarantees evenly balanced servers," Andrews notes.
Content Management
Just as Netscape's servers are split up for the different types of
access, so is the content managed according to function. All of the core
materials are created and managed by a development team that's part of
the marketing organization, which even has a handful of programmers on
staff for CGI and JavaScript development.
Web pages and programs are developed using a variety of tools. Once
the pages are completed and tested, they are checked into a source code
management tool, just like a real development project. The source library
can be used to archive files, check for inconsistencies, compare versions,
and even restore past versions of pages for re-use.
Having a distributed, decentralized development environment is "really
the only way to build and manage super-scalable content," Andrews claims.
Otherwise the process would become too bureaucratic, with the various
departments fighting for the centralized developers' resources. "The
downside is that there is a decreased level of control," he says, which
can result in problems when one department steps on another's online
efforts by changing a URL or by removing a file.
Once the pages are checked into the archive, the webmaster-on-duty
pulls the pages to a master Web server that feeds all of the other systems.
Once a night, the master server sends any changed files in the directory
tree to all of the other servers using the standard Unix rdist (remote file
distribution) utility. At that point, the individual servers archive their daily log
files and send them to the internal log server. This host expands the
archives and then runs the daily analysis tools against them, generating
graphs and charts showing the number of hits, amount of data transferred,
and other useful statistics.
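The "send only what changed" step can be illustrated with a short sketch. It is a stand-in for the idea, not a reproduction of the distribution tool Netscape uses: it compares content hashes between two local directory trees, whereas the real job runs across the network, and the function names and directory layout are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return a SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(master: Path, replica: Path) -> list[Path]:
    """Return paths (relative to master) that are new or differ on the replica."""
    changed = []
    for src in sorted(master.rglob("*")):
        if not src.is_file():
            continue  # directories are implied by the files inside them
        rel = src.relative_to(master)
        dst = replica / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            changed.append(rel)
    return changed


if __name__ == "__main__":
    # Hypothetical paths; substitute real directory trees to try it out.
    for rel in changed_files(Path("master_tree"), Path("replica_tree")):
        print("would push:", rel)
```

Pushing only the differences is what keeps the nightly content update small compared with the log traffic flowing the other way.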
Incidentally, this is one of the current bottlenecks for the entire
layout. Although the nightly submissions of new pages don't consume much
bandwidth, the processing of the log files does. Since the connection
between the external site and the internal network relies on 10-Mbps
Ethernet, the transfer times for the log files can be huge. The simple
act of uncompressing a single log file can take well over an hour due
to the huge volumes of traffic that each one records. "As the volume edges
toward terabyte levels, the need for higher-bandwidth, on-the-fly compression
will be required for the daily log file analysis to continue," Andrews
says.
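A rough estimate shows why a 10-Mbps link becomes the bottleneck. The article does not give the size of the log files, so the sketch below assumes about 100 bytes per logged hit, a purely hypothetical figure, and ignores protocol overhead and compression. Only the 110 million hits per day and the 10-Mbps link speed come from the article.

```python
def transfer_seconds(size_bytes: float, link_mbps: float) -> float:
    """Ideal time to move size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / (link_mbps * 1e6)


ASSUMED_BYTES_PER_LOG_LINE = 100  # hypothetical; not stated in the article
HITS_PER_DAY = 110_000_000        # early-October figure from the article

if __name__ == "__main__":
    raw_log_bytes = HITS_PER_DAY * ASSUMED_BYTES_PER_LOG_LINE
    for mbps in (10, 100):
        hours = transfer_seconds(raw_log_bytes, mbps) / 3600
        print(f"{mbps:>3} Mbps link: about {hours:.1f} hours for uncompressed logs")
```

With those assumptions, a day of uncompressed logs comes to about 11 gigabytes and would need roughly 2.4 hours on an otherwise idle 10-Mbps link. Archiving the logs before sending them shrinks the transfer, but the estimate gives a sense of how little headroom that link leaves.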
Learning Curve
While it seems unlikely that any of us will ever have a Web site that
generates as much traffic as Netscape's, it's not as far-fetched as you
may think. What we're seeing from Netscape's site today is probably going
to be very common within the next two or three years, especially for
many of the larger consumer-related Web sites that are just coming online
now.
There's a lot to be learned from Netscape's experience. First of all,
setting up a strong infrastructure makes a big difference in overall
performance. Balance the bandwidth equally between the Internet connection
and the local server pool. If the devices can't talk to each other quickly,
then a high-speed connection is irrelevant. Next, provide direct links
to the aggregate carriers to improve performance for the access providers
underneath them. Finally, set up a multilayered, fault-tolerant wiring
scheme. You don't want to lose all of your servers simply because a hub
is accidentally unplugged.
Don't try to put too much load on any one server, unless you really
don't expect too much traffic. Optimizing the platform for performance
will really make a difference in user satisfaction. For example, out
of the 350 gigabytes of data transferred from Netscape's site, more than
70 gigabytes came from the server caches, resulting in low disk times
and high performance. Make sure to work with your vendor on optimizing
the system kernel so that TCP/IP traffic gets the highest priority, if
possible.
Finally, push the content development and management off to the groups
that stand to benefit most, probably your marketing department. These
people have a much higher level of commitment to good design than you
do, so make them responsible for it. Your job as webmaster should be
to make sure that everything works smoothly, not to do the layout.
Taken all together, these tips provide a strong overall success strategy
that will help you maximize your company's Web presence.
So you wanna be a webmaster?
Managing the largest Internet site on the planet is not as simple as
it sounds; it requires a myriad of skills. "It's not just programming
HTML and CGI," says Robert Andrews, Netscape's Web site director. "You
have to take a holistic approach," looking at every aspect of the process,
from how applications generate data down to how the user connects to
your site, and all of the layers in between.
With a formal education in physics, Andrews prepared for his position
by working in a variety of systems management roles over the years. In
his previous job, he worked as a network manager for a large semiconductor
company, and before that he was a Unix systems manager for another technology
company in Silicon Valley.
This combination of Internet infrastructure and general systems management
experience has proved to be his most valuable asset in dealing with his
day-to-day management issues, as well as with the strategic design efforts.
Andrews points out two requirements for being a successful webmaster: "You
have to know [system administration and Internet networking] from top
to bottom."
And it's only going to get more complex as the Internet continues to
grow. "The future of the Web is one of truly dynamic content," Andrews
says. "Not movies or static files, but real-time, dynamically generated
material that reflects current events and activity. Web sites are going
to be capturing data from other dynamic sources, marking them up and
displaying them automatically."
Written by Eric A. Hall.
Copyright © 1997 CMP Media, Inc. Used with permission.