ISPadmin

Mail architecture


INTRODUCTION


In this column, I will cover various topics that are in some way unique to the Service Provider (SP) industry. Before working in the ISP industry, I often wondered how SPs handled problems like high-volume mail or news, web hosting, etc. I will attempt to illustrate how many SPs engineer various services for this often high-volume, high-expectation industry. The following topics are to be covered (in no particular order):

·    RADIUS
·    LDAP
·    Provisioning/billing
·    DNS
·    News
·    Security
·    Web caching
·    Web hosting
·    Network monitoring/SLAs

I will use the various Service Providers I have worked for in the past as the primary case studies, including Time Warner Cable of Maine and Ziplink, Inc. I will also attempt to cover alternate case studies where appropriate.


THE PROBLEM OF MAIL AT A SERVICE PROVIDER

In this installment, I will look at how mail solutions are architected. Implementing a robust mail architecture at an SP differs from doing so at a typical enterprise for the following reasons:

·    High volume of mail
·    Many customers utilizing mail
·    High expectations, as this is sometimes a pay-for service

Now, that is not to say that some enterprise mail systems can’t have the above characteristics; they certainly can. It’s just that these characteristics define any SP’s mail architecture.

I would be willing to bet that the reason most people obtain Internet access is first and foremost to read and send email. Sure, they want to surf the web, but ask most subscribers what the most important application they use online is, and I’m sure they’d answer “email.” This popularity translates into lots of email going to and from many subscribers. The proliferation of email-based greeting cards, jokes, hoaxes, spam, etc. only serves to put additional pressure on an SP’s mail infrastructure. Let’s start by examining how an enterprise might engineer its mail system.


A SIMPLE EXAMPLE


A small to medium enterprise has different goals than an ISP when it comes to designing a mail infrastructure. However, it is still worthwhile to compare a typical enterprise mail setup with an SP mail infrastructure. I will assume that this imaginary enterprise is behind a firewall for security purposes. Its mail system might be set up like the diagram in Figure 1.











In most of the enterprises I am familiar with, the firewall only accepts mail connections from the internal mail server on the secure interface, as this limits exposure to potential security problems. However, one could easily set up the firewall to accept outbound connections from any client originating on the secure interface. The firewall must always accept inbound mail from anyone (except perhaps those servers listed in MAPS’ or similar anti-spam “black hole” lists, if the site chooses to subscribe to such a service) coming in on the insecure interface on port 25. In any case, the firewall acts as the inbound and outbound mail relays would in an SP environment, while the single mail server machine handles all other mail functionality. This single mail server machine would be a major bottleneck in an SP environment. To address this shortcoming, the problem of mail is decomposed into smaller pieces, which is the topic of the next section.
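The firewall policy just described can be written down as a small decision function. This is only a sketch under stated assumptions: the interface labels, the internal server address 10.0.0.25, and the blocked-host set are hypothetical values chosen for illustration, not part of any real firewall product.

```python
from dataclasses import dataclass, field

# Hypothetical address of the single internal mail server.
INTERNAL_MAIL_SERVER = "10.0.0.25"


@dataclass
class FirewallMailPolicy:
    internal_mail_server: str = INTERNAL_MAIL_SERVER
    # Hosts listed by an anti-spam "black hole" service, if the site subscribes.
    blocked_hosts: set = field(default_factory=set)

    def accepts(self, interface: str, source_ip: str) -> bool:
        """Return True if an SMTP (port 25) connection should be accepted."""
        if interface == "secure":
            # Outbound: only the internal mail server may hand mail to the firewall.
            return source_ip == self.internal_mail_server
        if interface == "insecure":
            # Inbound: anyone may connect unless the anti-spam list names them.
            return source_ip not in self.blocked_hosts
        return False
```

Relaxing the "secure" branch to accept any internal client gives the looser setup mentioned above, at the cost of more exposure.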


BREAKING DOWN THE PROBLEM OF INBOUND MAIL

SPs engineer mail by decomposing the process into smaller, scalable pieces. Mail functionality can be broken down into these categories:

·    Relaying
·    Storing/end user retrieval of messages
·    Forwarding mail
·    Mailing lists
·    Bouncing mail for unknown users


Figure 2 demonstrates how a relatively large ISP might engineer an inbound mail solution.









Figure 2 requires a bit of explanation prior to going into detail on each particular part. The arrows in Figure 2 illustrate the flow of inbound mail messages. The ellipses indicate that the functionality is scaled depending upon the load; for example, there is no need to have the same number of relay machines as store/forward machines. Each function is scaled depending upon the requirements of that particular service. Mailing list maintenance and bounce handling generate relatively light load and, as a result, would most likely be the last function to require scaling. It is important to note that within a particular class of machine (relay, for example), the servers are essentially clones of one another, and can be brought up and down at will (ensuring appropriate queues get processed, of course). The message store is usually designed to access a shared file system (NFS, SAN, etc.) for the messages. This system is engineered with an appropriate level of redundancy within the file system in order to eliminate any single point of failure.

Inbound Mail Relaying

Most ISPs have one or more machines dedicated to mail relaying. In fact, most very large ISPs split inbound and outbound mail relays and have multiple machines dedicated to each type of functionality, spread across their network. In this context, inbound mail refers to mail coming from other places (i.e., the Internet or another WAN) destined for an end customer of that ISP. Outbound refers to mail originating on the ISP’s network destined for another network.

In Figure 2, mail from the Internet at large would hit a series of dedicated inbound mail relays. These inbound mail relays might perform some sort of basic anti-spam checking (for example, checking whether the originating network is listed in the Mail Abuse Prevention System’s Realtime Blackhole List, a.k.a. the MAPS RBL, or running Blackmail software for domain and other message/header validation). Once these basic checks are performed, the mail is forwarded on.
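A blackhole-list check of the kind mentioned above is conventionally a DNS lookup: the octets of the connecting IPv4 address are reversed and prepended to the list's zone, and the address is treated as listed if that name resolves. The following is a minimal sketch of that convention. The zone name dnsbl.example.org is a placeholder assumption; the real zone depends on the list a site subscribes to.

```python
import ipaddress
import socket

# Placeholder zone for illustration; substitute the zone your list operator publishes.
DNSBL_ZONE = "dnsbl.example.org"


def dnsbl_query_name(ip: str, zone: str = DNSBL_ZONE) -> str:
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str = DNSBL_ZONE, resolver=socket.gethostbyname) -> bool:
    """Return True if the query name resolves, i.e. the address is on the list.

    Any lookup failure (including a temporary DNS error) is treated as
    "not listed" here, so a DNS outage never blocks legitimate mail.
    """
    try:
        resolver(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

The resolver is passed in as a parameter so the check can be exercised without touching the network.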

Typically the server software for relay functionality is Sendmail, although other mail server software can be, and is, used. The setup of such inbound mail relays is relatively straightforward, as sending mail from point “A” to point “B” is a relatively simple problem. The mail relay servers need to know which domains they accept mail for (these would be hosted domains, of course) and forward each message on to the appropriate mailbox. Typically, this is done through a Unix db file and Sendmail setup. However, with the advent of directories, LDAP is a much easier and more scalable way of determining which domain’s mail goes where.
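Conceptually, the db file amounts to a table from hosted domain to next-hop machine. The sketch below models that table as a Python dictionary. It is not Sendmail's actual map format, and the domain and host names are hypothetical.

```python
from typing import Optional

# Hypothetical hosted domains and the store/forward hosts that serve them.
DOMAIN_ROUTES = {
    "example.com": "store1.isp.example",
    "example.net": "store2.isp.example",
}


def route_for(recipient: str, routes: dict = DOMAIN_ROUTES) -> Optional[str]:
    """Return the next-hop host for a recipient, or None if the domain is not hosted."""
    _, sep, domain = recipient.rpartition("@")
    if not sep:
        return None  # no domain part, so there is nothing to route on
    # Domain names are case-insensitive, so normalize before the lookup.
    return routes.get(domain.lower())
```

A relay would refuse or bounce anything for which route_for returns None, since it is not mail for a hosted domain.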


Store/Forward (and a word about Provisioning)

The mail relays would then pass messages to a series of store/forward machines, which accept and deliver mail locally for legitimate users and forward mail on for customers who choose to retrieve their mail from some other server. This is a relatively easy problem to solve for a small network. However, when the number of mail accounts exceeds several thousand or so, the directory lookups can take so much time that an alternate scheme for storing messages must be deployed. The discussion here is centered upon a POP3 solution; the topic of IMAP will not be addressed.

In the past, mail storage was scaled by using POP3 proxy functionality to forward each request to the appropriate machine, based on some sort of database updated by the provisioning process. I must digress here and explain a little about what provisioning is. Provisioning is simply setting up subscriber accounts. It usually means performing the following steps:

·    Creating a Unix account with an invalid shell on a mail machine for mail retrieval by customer
·    Setting up a Unix account on an FTP server so a customer can update his/her web site
·    Configuring an Apache web server home directory for the customer
·    Etc.

A full discussion of provisioning is out of the scope of this article. I may speak to this topic in a future column, if there is interest.
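To make the relationship between provisioning and the POP3 proxy concrete, here is a minimal in-memory sketch of the lookup database: provisioning writes to it, and the proxy reads from it on every login. The class name, host names, and hash-based placement are all assumptions for illustration; a production system would keep this table in a real database.

```python
import zlib

# Hypothetical pool of message store hosts.
STORE_HOSTS = ["pop-a.isp.example", "pop-b.isp.example", "pop-c.isp.example"]


class MailboxLocator:
    """Maps login names to message store hosts."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.location = {}

    def provision(self, user: str) -> str:
        """Called when an account is created: pick a store and record the choice."""
        # A stable hash spreads users across stores. The result is stored rather
        # than recomputed, so adding hosts later does not move existing mailboxes.
        index = zlib.crc32(user.encode("utf-8")) % len(self.hosts)
        self.location[user] = self.hosts[index]
        return self.location[user]

    def deprovision(self, user: str) -> None:
        """Called when an account is removed."""
        self.location.pop(user, None)

    def backend_for(self, user: str):
        """Called by the POP3 proxy on each login; None means unknown user."""
        return self.location.get(user)
```

The proxy would open a POP3 connection to whatever backend_for returns and relay the session, rejecting the login when the result is None.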

Besides utilizing the POP3 proxy functionality mentioned above, a more recent development in the area of scalability is to utilize LDAP to determine exactly which machine a customer’s mail resides upon. The advent of Pluggable Authentication Modules (PAM) makes utilizing LDAP a much easier proposition than it was before PAM arrived on the scene. Once again, a full discussion of PAM and LDAP deserves its own column and is out of the scope of this discussion. The references section contains some links to resources on integrating Sendmail with LDAP and PAM.
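As a rough illustration of the LDAP approach, the sketch below builds a search filter for a recipient address and pulls the mail host out of a returned entry. It deliberately stops short of talking to a server. The object class and attribute names (inetLocalMailRecipient, mailLocalAddress, mailHost) are assumptions modeled on a commonly used mail-routing schema, so substitute whatever your directory actually defines. The entry is represented as a plain dictionary of attribute lists.

```python
def escape_filter_value(value: str) -> str:
    """Escape characters that are special inside an LDAP search filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def mailhost_filter(address: str) -> str:
    """Build a filter that matches the directory entry for one recipient address."""
    safe = escape_filter_value(address)
    # Assumed schema names; adjust to match your directory.
    return f"(&(objectClass=inetLocalMailRecipient)(mailLocalAddress={safe}))"


def mail_host_from_entry(entry: dict):
    """Return the first mailHost value of an entry, or None if it has none."""
    values = entry.get("mailHost", [])
    return values[0] if values else None
```

Escaping the address matters because it arrives from the network; an unescaped "*" or parenthesis would change the meaning of the filter.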

A typical mail store would run Sendmail to receive mail and Qpopper to allow POP3 access by end subscribers. These machines need to be controlled by the provisioning process so they know which subscribers are active and which to bounce. They would also utilize some sort of a shared file system (SAN, NFS, etc.) so that the load on the message stores can be scaled easily.

Mailing lists/bouncing mail


The final step would be to have the mail store machines forward mail destined for unknown recipients to a machine or set of machines dedicated to list processing and to bouncing any message that wasn’t addressed to a hosted list. Typically, this is a machine running vanilla Sendmail and Majordomo list processing software. If the message is addressed to a hosted list, the list is expanded and the message is sent to the mail store and outbound mail relays for final delivery. If the message is not addressed to a hosted list, then it is bounced back to the sender as undeliverable.
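The expand-or-bounce decision described above is simple enough to sketch in a few lines. This is an illustration of the logic only, not of how Sendmail or Majordomo implement it; the list address and subscriber addresses are hypothetical.

```python
# Hypothetical hosted lists and their subscribers.
HOSTED_LISTS = {
    "announce@lists.isp.example": ["alice@isp.example", "bob@elsewhere.example"],
}


def handle_unknown_recipient(sender: str, recipient: str, lists: dict = HOSTED_LISTS):
    """Decide what to do with a message the mail store could not deliver.

    Returns a list of (action, address) pairs: one "deliver" per list member
    if the recipient is a hosted list, otherwise a single "bounce" to the sender.
    """
    members = lists.get(recipient.lower())
    if members is not None:
        return [("deliver", member) for member in members]
    return [("bounce", sender)]
```

Local members would then be handed to the mail store and remote members to the outbound relays, matching the flow described above.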

Typically, this functionality doesn’t take a lot of resources, so this would be the last machine to require scaling. Also, it is relatively straightforward to configure. It does not require access to the provisioning process, and can easily scale without a need for a shared file store or other such complications.


OUTBOUND MAIL RELAYING


Outbound mail refers to clients sending mail to the outside world. Inbound and outbound mail relays can be the same machine. The only additional functionality an outbound mail relay would perform is an address range check. This check is to ensure that only end subscribers of the SP can relay mail through the machine. If this check were not made, any arbitrary user could send mail through the relay, which is known as an “open relay” and is a “Very Bad Thing.”
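The address range check can be sketched with Python's standard ipaddress module. The subscriber networks and local domain below are hypothetical documentation values. The local-domain branch is an added assumption covering the case where the same machine also serves as an inbound relay.

```python
import ipaddress

# Hypothetical subscriber address ranges and hosted domains.
SUBSCRIBER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]
LOCAL_DOMAINS = {"isp.example"}


def may_relay(client_ip: str, recipient_domain: str,
              networks=SUBSCRIBER_NETWORKS, local_domains=LOCAL_DOMAINS) -> bool:
    """Return True if mail from this client for this domain should be accepted."""
    addr = ipaddress.ip_address(client_ip)
    # Subscribers on the SP's own address ranges may send mail anywhere.
    if any(addr in net for net in networks):
        return True
    # Everyone else may only send mail to domains hosted here; accepting
    # anything more would make the machine an open relay.
    return recipient_domain.lower() in local_domains
```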


MAIL SERVER SOFTWARE BESIDES SENDMAIL

As I have previously mentioned, most SP installations utilize Sendmail. I think this is a testament to how robust and flexible Sendmail has proven over the years. However, there are other solutions out there, in use by SPs. Freeware mail server software would include:

·    Qmail
·    Postfix
·    Exim

Commercial solutions include:

·    Intermail Post.Office from Openwave Systems, Inc. (formerly software.com)
·    PMDF from Sun/Netscape Alliance (formerly Innosoft, Inc., now supported/developed by Process Software, Inc.)
·    CommuniGate Pro from Stalker Software, Inc.

While I have no direct experience with any of the above solutions (either freeware or commercial), I am certain they all can be made to work in SP environments.


SPAM

No discussion of SP mail solutions would be complete without including the topic of spam. The problem of spam can be broken down into two parts: inbound and outbound. Most if not all available solutions today address the problem of inbound spam; I am aware of no commercially available solution that tackles the specific problem of outbound spam.

There is some anti-spam support within recent versions of Sendmail. Here is a list of some of the features within Sendmail 8.10:

·    Anti-spam rule sets
·    Content based filtering
·    Built in SMTP authentication
·    RFC2505 support
·    RFC2476 (Message Submission specification)
·    Specific senders/recipients can be allowed or disallowed

However, the Sendmail anti-spam functionality does not go far enough for most ISPs, so additional pieces must be added. Some third party freeware available includes:

·    Blackmail (implements many of the recommendations in RFC2505)
·    Spamshield (counts log file entries for users sending large amounts of mail and can stop them in real time if desired)
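The log-counting idea can be illustrated with a short sketch. This is not Spamshield's implementation. It assumes a simplified log format in which each accepted message produces one line containing a from=<address> token, and it only reports heavy senders rather than stopping them.

```python
from collections import Counter


def heavy_senders(log_lines, threshold: int) -> dict:
    """Count messages per sender and return those above the threshold.

    Each log line is assumed to contain at most one token of the form
    from=<address>, which is a simplification of real mail logs.
    """
    counts = Counter()
    for line in log_lines:
        for token in line.split():
            if token.startswith("from="):
                sender = token[len("from="):].strip("<>,")
                counts[sender] += 1
                break  # one sender per line
    return {sender: n for sender, n in counts.items() if n > threshold}
```

Run periodically over the most recent slice of the log, a check like this gives the near-real-time view of outbound volume that such tools rely on.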

Another methodology for blocking spam is to utilize a service such as Brightmail. The Brightmail Logistical Operations Center has spam forwarded to it from “mail probes” located at SPs around the world. Its staff generates rule sets for Brightmail’s spam blocking software, which works in conjunction with an ISP’s mail infrastructure. These rule sets identify specific pieces of Unsolicited Commercial Email and “sideline” them for later perusal by the end subscriber. This service can be a very effective method of blocking inbound spam. Note that Brightmail also offers a free service which blocks mail via a POP3 proxy. You can find more information under the “Brightmail Individual” heading on the Brightmail web site.


DEALING WITH LARGE AMOUNTS OF MAIL

One of the problems faced by enterprise and SP system administrators alike is how to deal with large volumes of mail. In an SP environment, the abuse mailbox can easily run into the thousands of messages per day. This doesn’t include the mail that system accounts such as “root” generate on a typical day. (Of course, ISPs typically have a NOC or other support personnel who (are supposed to) respond to abuse complaints in a timely fashion.) I have used two methodologies when dealing with system (non-abuse) mail, neither with much success:

1.    Forward all mail to a central location and read it from there
2.    Read all mail locally

The issue with #1 is that under certain conditions, the volume of messages can easily bring down even the most robust mail system. The issue with #2 is how to read system mail on 200 servers each day and get some productive work accomplished. If anyone has any thoughts on methods to deal with this topic, I’d love to hear from you.


CONCLUSION

A Service Provider’s mail infrastructure must be designed for robustness and scalability. Robustness is handled by utilizing time-proven hardware, software and designs. Scalability is achieved by decomposing the problem of handling mail into its component problems: relay, storage, bounce, etc.

Next time I’ll cover the little known topic (outside of the ISP industry) of Remote Authentication Dial In User Service, or RADIUS. In the meantime, please send your questions or comments on this column, Unix systems administration, or other related topics to me! I’d love to hear from you.


REFERENCES

Mail Abuse Prevention System: http://www.mail-abuse.org
Sendmail.net (articles on using Sendmail): http://www.sendmail.net
Sendmail Consortium (freeware): http://www.sendmail.org
Blackmail: http://bitgate.com/spam
LDAP man (articles on configuring LDAP): http://www.ldapman.org
Linux-PAM: http://www.lyre-mit-edu.lkams.kernel.org/pub/linux/libs/pam/
Qpopper: http://www.eudora.com/qpopper/
Majordomo: http://www.greatcircle.com/majordomo/
Qmail: http://www.qmail.org/
Postfix: http://www.postfix.org/
Exim: http://www.exim.org
Intermail Post.Office: http://www.openwave.com/index.html
PMDF: http://www.innosoft.com/
CommuniGate Pro: http://www.stalker.com/
IETF RFC tool: http://www.ietf.org/rfc.html
Spamshield: http://spamshield.conti.nu
Brightmail: http://www.brightmail.com/