ISPadmin
Web Hosting
November, 2001
Robert D. Haskins

Introduction

In this installment of ISPadmin, I examine how ISPs implement their web infrastructure to support retail (tilde or "~" accounts) and hosted domains. Web hosting is an integral part of most if not all ISPs, and many companies (Exodus and Servint to name two) focus exclusively on web hosting and collocation as their core business. Web hosting was the first application to be hosted in the area now known as "application hosting".

It is worthwhile discussing the typical migration of a web hosting customer at a retail ISP. A traditional dial ISP customer starts out buying a "standard" dial up account which usually consists of the following:

1 dial up (PPP) account
1 mail (POP) account
1 web hosting account (of the form www.isp.net/~username, commonly referred to as the "tilde" account)



Figure 1

Figure 1 illustrates a typical migration path of a web hosting customer. The subscriber starts utilizing their standard PPP account, and the "tilde" web account if they want to have some sort of a web presence. If it is a business account, or a retail subscriber with more than a passing interest in hosting web content, they will probably outgrow the "tilde" account and want to move to a "real" hosted domain (www.mydomain.org). The ISP needs to have a web hosting offering, or else they will lose the customer and associated revenue.

Once the hosted domain owner needs to sell something, they will want to have a shopping basket, secure site, credit card payment mechanism, etc. Once again, unless the ISP wants to risk losing the business, they need to make sure they can support electronic commerce.

The final step is the case where the web site owner has so much traffic that the site needs to be hosted on a dedicated server. The ISP must have a collocation (colo) offering, or else the domain will have to be moved to a provider which offers colo, and the ISP in question loses the customer and their business.

In many senses, the web hosting business is just an offshoot of the real estate business. In order to support a large number of domains, data center space is required. Of course, there is more to web hosting than just real estate (for example, UPS and backup generator power, fire suppression, network connectivity, monitoring, etc.) but a large component of the cost will be the "bricks and mortar" and similar fixed non-IT related components.


Web Hosting Infrastructure

In most cases, a small provider would likely have a very similar setup to a larger provider. Some differences might be:

·    The smaller ISP would probably use a shared machine where the machine doing web hosting might also perform other functions (such as mail, RADIUS and/or DNS)
·    A larger provider would likely have more automated web hosting related provisioning and billing mechanisms than a smaller ISP

A large ISP would likely have machines dedicated to each specific type of hosting. For example, a machine or series of machines would be dedicated to each of the following functions:

·    "Tilde" accounts
·    Domain hosting
·    E-commerce enabled domain hosting.

It is useful to have some idea of what it takes to manage a web infrastructure for a dial up provider. Ziplink hosted approximately 5000 "tilde" accounts on a Sun Ultra 10 with 9 GB mirrored disk drives. For the hosted domain side of the business, there were two Sun Ultra 10's with mirrored 18 GB drives, each server hosting approximately 150 domains. The load under normal circumstances (i.e., no bad CGI scripts running and no extremely active pornography sites) was always less than 0.5. Of course, if there were runaway CGI scripts, or someone put up a pornography site or other site that attracted many hits, then the load would run much higher. This infrastructure was run by the equivalent of half of one full time mid level staff member. This might seem high, but Ziplink did not have much automation in the area of web hosting, as the business plan focused on other areas (namely wholesale dial up).

Web Server Software
Apache

By far the most commonly used web server among service providers is Apache. It has outstanding support for service providers, and is very configurable. Some of the Apache modules I have found particularly useful at ISPs include:

·    mod_alias
·    mod_rewrite
·    mod_userdir
·    mod_speling (particularly useful for migrating a VMS or other non-case sensitive O/S web server to Unix)
·    mod_vhost_alias


The References contain a link to the virtual domain support page for Apache.

Microsoft

If the ISP wants to provide Microsoft (MS) based services (such as Active Server Pages [ASP] or FrontPage Server Extensions), some extra effort is required if it is a Unix-centered shop. Some service providers may charge more for an NT based infrastructure. The Unix-based service provider has two options. They can either:

1.    Install a Microsoft based infrastructure -or-
2.    Install additional software components to enable the required Microsoft functionality

If option #1 is taken, a parallel MS web hosting infrastructure must be deployed. This configuration would take the form of a Windows 2000 (Win2k) server running IIS (which now ships as part of Windows 2000). A server running Win2k server (a.k.a. NT) would be able to handle both ASP and FrontPage Server Extensions (FPSE) functionality. (Note that one must download and install the FPSE for the Microsoft platforms; Win2k and NT do not ship with FPSE). If the ISP has a back end billing/provisioning system, then such system(s) must be modified to provision and bill this infrastructure.

If option #2 is taken, Apache (and most non-Microsoft based web servers) require additional components. These additional components can often be engineered on top of the provider's existing web infrastructure. In order to support FPSE, Microsoft has a software package which in its current version (FrontPage Server Extensions 2002 for Unix) runs under Apache 1.3.19 and implements the server side of FrontPage. In order to implement MS ASP functionality under Apache, a package such as Sun Chili!Soft ASP must be installed. There is also a Perl ASP (with Perl scripting only) implementation for Apache available, called Apache::ASP. I do not have any direct experience with either of these packages.

The other popular web servers out there (iPlanet and Zeus) are not often deployed at service providers. This is probably because they are both commercial products and not free like Apache.

The Netcraft site has a good graphic showing the statistics of the various web servers out on the Internet. I was particularly interested in the July 2001 report, which showed a rather large downturn (4.29%) in the number of Apache sites and a significant upturn (5.49%) in the number of Microsoft IIS installations. According to the Netcraft analysis, this is due to two large sites converting to Microsoft, and has been masking a larger trend away from Apache to IIS. The August 2001 report showed a much smaller increase in IIS deployments, so the July 2001 increase appears to be an anomaly.

Web Server Issues

There are many challenges facing service providers who host web content. I will briefly discuss the issues and solutions for ISPs surrounding each.

Managing Domains

Managing a large number of domains is a problem. Most larger web hosting companies have a file naming scheme whereby the top level name space is broken down by the first letter of the domain, namely:

/www1/a    directory would house all domains starting with “a”
/www1/b    directory would house all domains starting with “b”

and so forth. Of course, the top level directory name might indicate the machine name (www1, www2, etc. which are each individual web server machines). Also, Apache's URL mapping scheme can be utilized to automatically redirect domains to the correct web server domain content via a “global” command, without the need for specific per-domain configuration entries in the Apache configuration file.
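The first-letter layout described above can be sketched in a few lines of Python. This is only an illustration: the function name document_root, the choice to name the final directory after the domain, and the stripping of a leading "www." are assumptions of mine, not a description of any particular provider's scheme.

```python
from pathlib import PurePosixPath


def document_root(domain: str, server: str = "www1") -> PurePosixPath:
    """Return the content directory for a hosted domain.

    Layout assumed here: /<server>/<first letter>/<domain>.
    The per-domain directory name is a hypothetical convention.
    """
    name = domain.strip().lower().rstrip(".")
    # Treat www.mydomain.org and mydomain.org as the same site (assumption).
    if name.startswith("www."):
        name = name[len("www."):]
    if not name:
        raise ValueError("empty domain name")
    return PurePosixPath("/") / server / name[0] / name


if __name__ == "__main__":
    for d in ("apple.org", "www.Banana.com", "mydomain.org"):
        print(d, "->", document_root(d))
```

A provisioning script could call a function like this when creating a new domain, so that the directory layout stays consistent no matter who sets up the account.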

Email Aliasing

Email aliasing refers to the ability for email at a hosted domain (info@mydomain.com) to be forwarded to a "real" mailbox, for example cust@isp.net, where the customer actually picks up the mail through POP3, IMAP, webmail, etc. Both the Sendmail and Postfix mail transport agents have good support for virtual mail configurations. The ISP usually adds an interface (usually web based) for the customer support agent and/or the customers themselves to edit these mail mappings. The mappings must be able to forward to external accounts or to the service provider's "regular" dial up mail accounts. As a result, there is usually an interface between the provider's billing system and mail infrastructure.
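As a rough sketch of what sits behind such an editing interface, the Python below renders a set of mappings as "address, whitespace, destination" lines, which is the general shape of both the Sendmail virtusertable and a Postfix virtual alias map. The function name render_alias_map and the validation rules are my own assumptions; the resulting file would still have to be compiled with the mail transport agent's own tool (makemap for Sendmail, postmap for Postfix).

```python
from typing import Mapping


def render_alias_map(aliases: Mapping[str, str]) -> str:
    """Render hosted-domain addresses and their destinations, one per line.

    Keys are addresses at the hosted domain; values are the "real"
    mailboxes the mail should be delivered to.
    """
    lines = []
    for address in sorted(aliases):
        target = aliases[address]
        # Very light sanity check (an assumption, not full address validation).
        for value in (address, target):
            if value.count("@") != 1 or any(c.isspace() for c in value):
                raise ValueError(f"not a simple address: {value!r}")
        lines.append(f"{address.lower()}\t{target}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_alias_map({
        "info@mydomain.com": "cust@isp.net",
        "sales@mydomain.com": "owner@example.org",
    }), end="")
```

Generating the whole file from the customer database, rather than editing it in place, keeps the mail configuration and the billing records from drifting apart.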

Legal Liability of Hosted Content

Most providers consider themselves "common carriers" and do not police the content placed on web sites by their customers. Of course, people can and do complain. If it is an obvious copyright infringement (for example, posting illegal software or copyrighted mp3's) or something similar, the provider usually can and does take swift action without waiting for a court order. However, if the complaint is not as clear cut as that (for example, a site that satirizes someone or something), the provider will usually wait for a properly executed court order before taking action. Most if not all service providers perform no monitoring of content, because a provider that monitors some content will then be expected to monitor all of it. Please be aware that policies in this area vary quite a bit from one provider to another, so it is dangerous to make too many generalizations!

Security

Security of the server as well as security of customer data can be a problem in a shared web server environment. Many hosting providers will not allow arbitrary customer written/provided CGI to be run on the machine. Of course, CGI routines are usually made available for standard functions like web counters, comment sections, weblogs, etc. However, any non-standard code has to be reviewed by staff prior to implementation. Also, external programs such as CGIwrap are used to help ensure CGI programs are run in a secure manner. The normal file access control mechanisms of the host (Unix or NT/Win2k) are used as well.

Logs

Most ISPs generate access and error logs for their customers. Apache has excellent support for automatically generating these logs without human intervention. The Apache configuration directives used to generate per domain logs are "ErrorLog" and "CustomLog"; they appear under each VirtualHost section on a per domain basis. Access to the logs is usually granted via the same FTP interface the customer uses for uploading their content.
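To make the per domain logging concrete, here is a small Python sketch that prints a VirtualHost section containing ErrorLog and CustomLog for one domain. The function name, the directory paths, and the "*:80" address are hypothetical, and "combined" assumes a log format of that nickname has been defined with LogFormat elsewhere in the server configuration.

```python
def virtual_host_block(domain: str, doc_root: str, log_dir: str) -> str:
    """Return an Apache VirtualHost section with per-domain log files.

    All paths are supplied by the caller; nothing here reflects a
    particular provider's real layout.
    """
    lines = [
        "<VirtualHost *:80>",
        f"    ServerName {domain}",
        f"    DocumentRoot {doc_root}",
        # One error log and one access log per hosted domain.
        f"    ErrorLog {log_dir}/error_log",
        f"    CustomLog {log_dir}/access_log combined",
        "</VirtualHost>",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(virtual_host_block(
        "mydomain.org",
        "/www1/m/mydomain.org/htdocs",
        "/www1/m/mydomain.org/logs",
    ), end="")
```

Keeping the log directory outside the document root, as in this example, means the logs can be offered over FTP without also being served over the web.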

Bandwidth

Most web hosting plans include limits on the amount of bandwidth a customer's pages may generate. This bandwidth accounting lets the ISP limit its exposure if a customer's site should suddenly become very popular (for example, the customer begins to host pornography). If a customer's site does become too much of a load on a shared server, the service provider will request that the customer move to a dedicated server. Of course, the ISP will charge additional money for that additional functionality to cover costs. These costs are: server hardware (if provided by the ISP), staff time (to set up and move the domain) and transit (network bandwidth to the Internet).
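One simple way to do this kind of accounting is to add up the byte counts in a domain's access log. The Python below is a minimal sketch that assumes the log lines begin with the Common Log Format fields; the function names, the sample lines, and the quota figures are invented for illustration. Note that the logged size does not include HTTP headers, so the total slightly understates real traffic.

```python
import re
from typing import Iterable

# host ident user [time] "request" status bytes  (Common Log Format prefix)
CLF = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "(?:[^"\\]|\\.)*" \d{3} (\d+|-)')

# Made-up sample entries for one hosted domain.
SAMPLE = [
    '10.0.0.1 - - [10/Nov/2001:13:55:36 -0500] "GET /index.html HTTP/1.0" 200 2326',
    '10.0.0.2 - - [10/Nov/2001:13:56:01 -0500] "GET /logo.gif HTTP/1.0" 304 -',
    '10.0.0.1 - - [10/Nov/2001:13:57:12 -0500] "GET /big.zip HTTP/1.0" 200 1048576',
]


def bytes_served(lines: Iterable[str]) -> int:
    """Sum the response sizes in an access log; "-" counts as zero bytes."""
    total = 0
    for line in lines:
        match = CLF.match(line)
        if match and match.group(1) != "-":
            total += int(match.group(1))
    return total


def over_quota(lines: Iterable[str], limit_bytes: int) -> bool:
    """Return True if the logged traffic exceeds the plan's limit."""
    return bytes_served(lines) > limit_bytes


if __name__ == "__main__":
    print("bytes served:", bytes_served(SAMPLE))
    print("over a 1,000,000 byte limit:", over_quota(SAMPLE, 1_000_000))
```

Run nightly against each domain's access log, a script along these lines could flag sites approaching their limit before the shared server feels the load.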

Billing Integration

A small ISP will usually do everything (setting up DNS, configuring Apache, etc.) by hand and as a result, no billing integration is required or possible. Billing for a larger ISP or web hosting provider would be much more integrated, as web hosting might be a larger part of the ISP's business. Most providers have automated mechanisms for domain registration as well as signup for web hosting service. This would require an automated interface into the provider's billing system. How this interface is achieved would be a function of the billing system as well as the provider's business model.

E-commerce

For the purposes of this article, e-commerce (electronic commerce) includes: a shopping basket, managing an inventory and processing credit cards. The ISP who wants to retain their customer through the full life cycle and not lose them must have an e-commerce solution. This could be farmed out to a third party due to the financial risks something like this poses, or it could be done in house via commercial software or open source software.

Many third party providers have a complete e-commerce solution. In addition, several business to business (B2B) and business to consumer (B2C) solutions exist for ISPs. The most commonly implemented open source shopping basket and inventory management software is Red Hat Interchange (formerly Akopia Interchange). Interchange is a full featured B2C software application written in Perl and is widely used. How credit cards are processed is dictated by the software the ISP is using for its e-commerce solution. For example, according to their web site, Red Hat Interchange supports many credit card payment gateways, including Cybercash/Verisign (Verisign recently acquired Cybercash).

Conclusion

Next time I'll take a look at how ISPs design their backbone networks. In the meantime, if you have a question about ISPs, or have wondered why something was done a particular way, I'd love to hear from you!

Thank You

I’d like to thank Vinny Bono of GlobalNAPS for his input to this article.

References

Digex: http://www.digex.com/
ServInt: https://www.servintservers.com/
Apache Web server start page: http://httpd.apache.org/
Apache Virtual Host documentation: http://httpd.apache.org/docs/vhosts/index.html
Apache modules: http://httpd.apache.org/docs/mod/index.html
Microsoft web page: http://www.microsoft.com
Microsoft IIS: http://www.microsoft.com/iis
Microsoft FrontPage: http://www.microsoft.com/frontpage/
Microsoft FrontPage Server Extensions download: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnservext/html/fpovrw.asp
Sun Chili!Soft ASP: http://www.chilisoft.com/
Perl: http://www.perl.com
Apache::ASP: http://www.apache-asp.org/
Netcraft Web Server Survey: http://www.netcraft.com/survey/
iPlanet: http://www.iplanet.com/products/iplanet_web_enterprise/home_2_1_1m.html
Zeus: http://www.zeus.co.uk/
Sendmail virtual domain documentation: http://www.sendmail.org/virtual-hosting.html
Postfix: http://www.postfix.org
CGIWrap: http://cgiwrap.unixtools.org/
Red Hat Interchange: http://interchange.redhat.com
Cybercash/Verisign: http://www.verisign.com