ISPadmin
Web Hosting
November, 2001
Robert D. Haskins
Introduction
In this installment of ISPadmin, I examine how ISPs implement their web
infrastructure to support retail (tilde or "~" accounts) and hosted domains.
Web hosting is an integral part of most if not all ISPs, and many companies
(Exodus and ServInt, to name two) focus exclusively on web hosting and
collocation as their core business. Web hosting was the first application to
be hosted in the area now known as "application hosting".
It is worthwhile discussing the typical migration of a web hosting customer
at a retail ISP. A traditional dial ISP customer starts out buying a "standard"
dial up account which usually consists of the following:
1 dial up (PPP) account
1 mail (POP) account
1 web hosting account (of the form www.isp.net/~username, commonly referred
to as the "tilde" account)
Figure 1
Figure 1 illustrates a typical migration path of a web hosting customer.
The subscriber starts utilizing their standard PPP account, and the "tilde"
web account if they want to have some sort of a web presence. If it is a
business account, or a retail subscriber with more than a passing interest
in hosting web content, they will probably outgrow the "tilde" account and
want to move to a "real" hosted domain (www.mydomain.org). The ISP needs
to have a web hosting offering, or else they will lose the customer and associated
revenue.
Once the hosted domain owner needs to sell something, they will want to have
a shopping basket, secure site, credit card payment mechanism, etc. Once
again, unless the ISP wants to risk losing the business, they need to make
sure they can support electronic commerce.
The final step is the case where the web site has so much traffic that
it needs to be hosted on a dedicated server. The ISP must have a collocation
(colo) offering, or else the domain will have to be moved to a provider which
offers colo, and the ISP in question loses the customer and their business.
In many senses, the web hosting business is just an offshoot of the real
estate business. In order to support a large number of domains, data center
space is required. Of course, there is more to web hosting than just real
estate (for example, UPS and backup generator power, fire suppression, network
connectivity, monitoring, etc.) but a large component of the cost will be
the "bricks and mortar" and similar fixed non-IT related components.
Web Hosting Infrastructure
In most cases, a small provider would likely have a very similar setup to
a larger provider. Some differences might be:
· The smaller ISP would probably use a shared machine
where the machine doing web hosting might also perform other functions (such
as mail, RADIUS and/or DNS)
· A larger provider would likely have more automated
web hosting related provisioning and billing mechanisms than a smaller ISP
A large ISP would likely have machines dedicated to each specific type of
hosting. For example, a machine or series of machines would be dedicated
to each of the following functions:
· "Tilde" accounts
· Domain hosting
· E-commerce enabled domain hosting.
It is useful to have some idea of what it takes to manage a web infrastructure
for a dial up provider. Ziplink hosted approximately 5000 "tilde" accounts
on a Sun Ultra 10 with 9 GB mirrored disk drives. For the hosted domain side
of the business, there were two Sun Ultra 10's with mirrored 18 GB drives,
each server hosting approximately 150 domains. The load, under normal
circumstances (i.e., no bad CGI scripts running and no extremely active
pornography sites), was always less than 0.5. Of course, if there were runaway
CGI scripts, or someone put up a pornography site or other site that attracted
many hits, then the load would run much higher. This infrastructure was run
by roughly half of one full time equivalent mid level staff member. This might
seem high, but Ziplink did not have much automation in the area of web hosting,
as the business plan focused on other areas (namely wholesale dial up).
Web Server Software
Apache
By far, the web server most commonly used by service providers
is Apache. It has outstanding support for service providers, and is very
configurable. Some of the Apache modules I have found particularly useful
at ISPs include:
· mod_alias
· mod_rewrite
· mod_userdir
· mod_speling (particularly useful for migrating
a VMS or other non-case sensitive O/S web server to Unix)
· mod_vhost_alias
The References contain a link to the virtual domain support page for Apache.
Microsoft
If the ISP wants to provide Microsoft (MS) based services (such as Active
Server Pages [ASP] or FrontPage Server Extensions), this requires some extra
effort if they are a Unix-centered shop. Some service providers may charge more
for an NT based infrastructure. The Unix-based service provider has two options.
They can either:
1. Install a Microsoft based infrastructure -or-
2. Install additional software components to enable the
required Microsoft functionality
If option #1 is taken, a parallel MS web hosting infrastructure must be deployed.
This configuration would take the form of a Windows 2000 (Win2k) server running
IIS (which now ships as part of Windows 2000). A server running Win2k server
(a.k.a. NT) would be able to handle both ASP and FrontPage Server Extensions
(FPSE) functionality. (Note that one must download and install the FPSE for
the Microsoft platforms; Win2k and NT do not ship with FPSE). If the ISP
has a back end billing/provisioning system, then such system(s) must be modified
to provision and bill this infrastructure.
If option #2 is taken, Apache (and most non-Microsoft based web servers)
require additional components. These additional components can often be
engineered on top of the provider's existing web infrastructure. To support
FPSE, Microsoft offers a software package which in its current version
(FrontPage Server Extensions 2002 for Unix) runs under Apache 1.3.19 and
implements the server side of FrontPage. To provide MS ASP functionality
under Apache, a package such as Sun Chili!Soft ASP must be installed. There
is also a Perl ASP (with Perl scripting only) implementation available for
Apache, called Apache::ASP. I do not have any direct experience with either
of these packages.
The other popular web servers (iPlanet and Zeus) are not often
deployed at service providers. This is probably because they are
both commercial products and not free like Apache.
The Netcraft site has a good graphic showing the statistics of the various
web servers out on the Internet. I was particularly interested in the July 2001
report, which showed a rather large downturn (4.29%) in the number of Apache
sites and a significant upturn (5.49%) in the number of Microsoft IIS installations.
According to the Netcraft analysis, this is due to two large sites converting
to Microsoft, and has been masking a larger trend away from Apache to IIS.
The August 2001 report showed a much smaller increase in IIS deployments,
so the July 2001 increase appears to be an anomaly.
Web Server Issues
There are many challenges facing service providers who host web content.
I will briefly discuss each issue and the solutions available to ISPs.
Managing Domains
Managing a large number of domains is a problem. Most larger web hosting
companies have a file naming scheme whereby the top level name space is broken
down by the first letter of the domain, namely:
/www1/a directory would house all domains starting with “a”
/www1/b directory would house all domains starting with “b”
and so forth. Of course, the top level directory name might indicate the
machine name (www1, www2, etc. which are each individual web server machines).
Also, Apache's URL mapping scheme can be used to automatically map each
domain to the correct content directory on the web server via a single
“global” directive, without the need for specific per-domain configuration
entries in the Apache configuration file.
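As a rough illustration of the first-letter naming scheme described above, the
following Python sketch computes the content directory for a domain. The
function name docroot_for, the decision to strip a leading "www.", and the
exact layout /<server>/<first letter>/<domain> are assumptions made for this
example, not a description of any particular provider's setup.

```python
def docroot_for(domain: str, server: str = "www1") -> str:
    """Return the content directory for a hosted domain.

    Assumed (hypothetical) layout: /<server>/<first letter>/<domain>
    """
    # Normalize case and drop any trailing dot from a fully qualified name.
    name = domain.strip().lower().rstrip(".")
    # Treat www.mydomain.org and mydomain.org as the same hosted domain.
    if name.startswith("www."):
        name = name[4:]
    if not name:
        raise ValueError("empty domain name")
    return f"/{server}/{name[0]}/{name}"


print(docroot_for("www.mydomain.org"))  # prints /www1/m/mydomain.org
```

A provisioning script could call a function like this both when creating the
customer's directory and when generating the matching web server
configuration, so the two never disagree.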
Email Aliasing
Email aliasing refers to the ability for email sent to a hosted domain
(info@mydomain.com) to be forwarded to a "real" mailbox, for example
cust@isp.net, where the customer actually picks up the mail through POP3,
IMAP, webmail, etc. Both the Sendmail and Postfix mail transport agents have
good support for virtual mail configurations. The ISP usually adds an
interface (usually web based) for the customer support agent and/or the
customers themselves to edit these mail mappings. These mappings must
forward to external accounts or to the service provider's "regular" dial up
mail accounts. As a result, there is usually an interface between the
provider's billing system and mail infrastructure.
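To make the mail mapping concrete, here is a minimal Python sketch of what the
back end of such an editing interface might write out. Both Sendmail's
virtusertable and Postfix's virtual map use a simple "address, whitespace,
destination" line format, which is all this example relies on. The function
name render_virtual_map and the validation rules are hypothetical.

```python
def render_virtual_map(aliases: dict) -> str:
    """Render {hosted address: real mailbox} as 'address<TAB>target' lines."""
    lines = []
    for address in sorted(aliases):
        target = aliases[address]
        # Very light sanity check (an assumption of this sketch): both sides
        # must look like user@domain and contain no whitespace.
        for value in (address, target):
            if "@" not in value or any(c.isspace() for c in value):
                raise ValueError(f"not a usable mail address: {value!r}")
        lines.append(f"{address}\t{target}")
    return "\n".join(lines) + "\n"


print(render_virtual_map({"info@mydomain.com": "cust@isp.net"}), end="")
```

The resulting text file still has to be compiled into the lookup table format
the mail server expects (for example with makemap for Sendmail or postmap for
Postfix) before the new mappings take effect.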
Legal Liability of Hosted Content
Most providers consider themselves "common carriers" and do not police the
content placed on web sites by their customers. Of course, people can and
do complain. If it is an obvious copyright infringement (for example, posting
illegal software or copyrighted mp3's) or something similar, the provider
usually can and does take swift action without waiting for a court order.
However, if the complaint is not as clear cut as that (for example,
a site that satirizes someone or something), the provider will usually wait
for a properly executed court order before taking action. Most if not all
service providers perform no monitoring of content, because a provider that
monitors some content will then be expected to monitor all content. Please be
aware that policies in this area vary quite a bit from one provider to another,
so it is dangerous to make too many generalizations!
Security
Security of the server as well as security of customer data can be a problem
in a shared web server environment. Many hosting providers will not allow
arbitrary customer written/provided CGI to be run on the machine. Of course,
CGI routines are usually made available for standard functions like web counters,
comment sections, weblogs, etc. However, any non-standard code has to be
reviewed by staff prior to implementation. Also, external programs such as
CGIWrap are used to help ensure CGI programs are run in a secure manner.
The normal file access control mechanisms of the host (Unix or NT/Win2k)
are used as well.
Logs
Most ISPs generate access and error logs for their customers. Apache has
excellent support for automatically generating these logs without human intervention.
The Apache configuration directives used to generate per domain logs are
“ErrorLog” and “CustomLog”; they appear under each VirtualHost section on a
per domain basis. Access to the logs is usually granted via the same FTP
interface the customer uses for uploading their content.
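The per domain logging described above lends itself to generation by script.
The following Python sketch emits a VirtualHost section containing ErrorLog and
CustomLog entries for one domain. The function name render_vhost, the log file
naming, the placeholder address 192.0.2.10, and the use of the "common" log
format nickname (which must be defined by a LogFormat line elsewhere in the
configuration) are all assumptions of this example.

```python
def render_vhost(domain: str, docroot: str, log_dir: str,
                 address: str = "192.0.2.10") -> str:
    """Return an Apache VirtualHost section with per domain logs."""
    return "\n".join([
        f"<VirtualHost {address}>",
        f"    ServerName www.{domain}",
        f"    ServerAlias {domain}",
        f"    DocumentRoot {docroot}",
        # One error log and one access log per hosted domain.
        f"    ErrorLog {log_dir}/{domain}-error_log",
        f"    CustomLog {log_dir}/{domain}-access_log common",
        "</VirtualHost>",
        "",
    ])


print(render_vhost("mydomain.org", "/www1/m/mydomain.org", "/var/log/httpd"))
```

Writing the logs under a directory the customer can reach over FTP is one way
to give them access without staff involvement.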
Bandwidth
Most web hosting plans include limits on the amount of bandwidth their
customers' pages generate. This bandwidth accounting exists so the ISP can
limit their exposure if a customer’s site should suddenly become very popular
(for example, the customer begins to host pornography). If a customer’s site
does become too much of a load on a shared server, the service provider will
request the customer move to a dedicated server. Of course, the ISP will charge
additional money for that additional functionality to cover costs. These costs
are: server hardware (if provided by ISP), staff time (to set up and move the
domain) and transit (network bandwidth to the Internet).
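One simple way to do this kind of bandwidth accounting is to total the byte
counts in a domain's access log. The Python sketch below assumes the log is in
Common Log Format, where the last field is the response size (or "-" when no
body was sent). The function names, the sample log lines, and the quota value
are invented for illustration.

```python
import re

# host ident user [date] "request" status bytes
CLF = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) (\d+|-)$')


def bytes_served(log_lines) -> int:
    """Sum the response sizes of all parseable Common Log Format lines."""
    total = 0
    for line in log_lines:
        match = CLF.match(line.strip())
        if match is None:
            continue  # skip lines this simple pattern cannot parse
        size = match.group(2)
        if size != "-":
            total += int(size)
    return total


def over_quota(log_lines, quota_bytes: int) -> bool:
    """True if the traffic in the log exceeds the plan's byte allowance."""
    return bytes_served(log_lines) > quota_bytes


sample = [
    '192.0.2.1 - - [10/Oct/2001:13:55:36 -0400] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/2001:13:55:40 -0400] "GET /logo.gif HTTP/1.0" 304 -',
    '192.0.2.1 - - [10/Oct/2001:13:56:01 -0400] "GET /big.zip HTTP/1.0" 200 1000000',
]
print(bytes_served(sample))         # prints 1002326
print(over_quota(sample, 1000000))  # prints True
```

Run against each domain's log once per billing period, a report like this
tells the ISP which customers to approach about a dedicated server.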
Billing Integration
A small ISP will usually do everything (setting up DNS, configuring Apache,
etc.) by hand and as a result, no billing integration is required or possible.
Billing for a larger ISP or web hosting provider would be much more integrated,
as web hosting might be a larger part of the ISP’s business. Most providers
have automated mechanisms for domain registration as well as signup for web
hosting service. This would require an automated interface into the provider's
billing system. How this interface is achieved would be a function of the
billing system as well as the provider's business model.
E-commerce
For the purposes of this article, e-commerce (electronic commerce) includes:
a shopping basket, managing an inventory and processing credit cards. The
ISP who wants to retain their customer through the full life cycle and not
lose them must have an e-commerce solution. This could be farmed out to a
third party due to the financial risks something like this poses, or it could
be done in house via commercial software or open source software.
Many third party providers have a complete e-commerce solution. In addition,
several business to business (B2B) and business to consumer (B2C) solutions
exist for ISPs. The most commonly implemented open source shopping basket
and inventory management software is Red Hat Interchange (formerly Akopia
Interchange). Interchange is a full featured B2C software application written
in Perl and is widely used. How credit cards are processed is dictated by the
software the ISP is using for their e-commerce solution. For example, according to
their web site, Red Hat Interchange supports many credit card payment gateways,
including Cybercash/Verisign (Verisign recently acquired Cybercash).
Conclusion
Next time I'll take a look at how ISPs design their backbone networks. In
the meantime, if you have a question about ISPs, or have wondered why something
was done a particular way, I'd love to hear from you!
Thank You
I’d like to thank Vinny Bono of GlobalNAPS for his input to this article.
References
Digex: http://www.digex.com/
ServInt: https://www.servintservers.com/
Apache Web server start page: http://httpd.apache.org/
Apache Virtual Host documentation: http://httpd.apache.org/docs/vhosts/index.html
Apache modules: http://httpd.apache.org/docs/mod/index.html
Microsoft web page: http://www.microsoft.com
Microsoft IIS: http://www.microsoft.com/iis
Microsoft FrontPage: http://www.microsoft.com/frontpage/
Microsoft FrontPage Server Extensions download: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnservext/html/fpovrw.asp
Sun Chili!Soft ASP: http://www.chilisoft.com/
Perl: http://www.perl.com
Apache::ASP: http://www.apache-asp.org/
Netcraft Web Server Survey: http://www.netcraft.com/survey/
iPlanet: http://www.iplanet.com/products/iplanet_web_enterprise/home_2_1_1m.html
Zeus: http://www.zeus.co.uk/
Sendmail virtual domain documentation: http://www.sendmail.org/virtual-hosting.html
Postfix: http://www.postfix.org
CGIWrap: http://cgiwrap.unixtools.org/
Red Hat Interchange: http://interchange.redhat.com
Cybercash/Verisign: http://www.verisign.com