Internet services are provided by servers, which answer requests from clients.
Distinction between Clients and Servers
Most of the resources made available via the Internet - and all the major services discussed in this book - use a client/server architecture. The distinction between clients and servers is fairly straightforward:
Clients request information from a server for display on a monitor or other use by, or on behalf of, a user. For example, a Web client (usually called a browser) will request information from a Web server, and then display the data it receives for a user.
Servers provide, or serve, information to clients. A Web server, for example, will listen for requests from clients, and, when it receives one, will send the requested data to the client that asked for it.
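The exchange described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how any particular Web server works: the function names (serve_once, request, demo) and the message text are invented for the example, and the server handles exactly one client before stopping.

```python
import socket
import threading


def serve_once(listener: socket.socket, content: bytes) -> None:
    """Server side: wait for one client, read its request, send the content."""
    conn, _addr = listener.accept()
    with conn:
        conn.recv(1024)        # the request text is ignored in this sketch
        conn.sendall(content)  # closing the connection marks the end of the reply


def request(host: str, port: int, message: bytes) -> bytes:
    """Client side: connect, send a request, and collect the whole reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(message)
        sock.shutdown(socket.SHUT_WR)  # tell the server the request is complete
        chunks = []
        while True:
            data = sock.recv(1024)
            if not data:               # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo() -> bytes:
    """Run a one-shot server and a client on the local machine."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))    # port 0 lets the system pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(
        target=serve_once, args=(listener, b"Hello from the server\n")
    )
    worker.start()
    try:
        return request("127.0.0.1", port, b"send me the page\n")
    finally:
        worker.join(timeout=5)
        listener.close()
```

Calling demo() returns the bytes the server sent. Note that the server is the side that waits and the client is the side that initiates, which is the whole of the distinction.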
Servers, by their nature, require full-time connections to the Internet. They always need to be available to answer clients' requests. Clients, on the other hand, need be connected to the Internet only when in use by a user or a user-agent (computer software acting on behalf of a user). Servers are sometimes also known as daemons, the name for any UNIX program that runs as a background process at all times.
In order for clients to communicate with servers, a protocol must be defined. Just like the protocols, or rules of etiquette, you learned as a child - rules governing your expected behavior in specific situations - Internet protocols are rules governing the expected behavior between servers and clients to guarantee the orderly exchange of data. Client/server protocols for Internet services sit atop the IP protocol.
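To make the idea of a protocol concrete, here is a small sketch of a made-up one. It is not a real Internet protocol: the commands GET and QUIT, the OK/ERR/BYE replies, and the DOCUMENTS table are all assumptions chosen for the example. The point is only that both sides agree in advance on what may be asked and how each request is answered.

```python
# Hypothetical content the toy server can hand out.
DOCUMENTS = {"motd": "Welcome to Freedonia."}


def respond(request_line: str, documents: dict[str, str]) -> str:
    """Apply the toy protocol's rules to one request line and return the reply."""
    parts = request_line.strip().split(maxsplit=1)
    if not parts:
        return "ERR empty request"
    command = parts[0].upper()                    # commands are case-insensitive
    argument = parts[1] if len(parts) > 1 else ""
    if command == "GET":
        if argument in documents:
            return "OK " + documents[argument]
        return "ERR no such document"
    if command == "QUIT":
        return "BYE"
    return "ERR unknown command"                  # anything else breaks the rules
```

A client that sends "GET motd" receives "OK Welcome to Freedonia."; a client that sends something the rules don't cover receives an error rather than an unpredictable answer.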
Many client/server architectures on the Internet are also distributed, meaning that there is no one central server to which all clients must connect and on which they all depend. Distributed services improve both reliability and performance. Reliability, in that there is no single point of failure; with multiple servers, if one server isn't available, others are. Performance, in that it is easier and cheaper to build a fast, efficient service using multiple servers and networks than to scale a service with faster and faster hardware for one monolithic server. Distributed services can also be designed so that there is no centralized authority for the service, allowing anyone to provide a service without going through a central registry or other potential arbiter of content. Most Internet services share this freedom from central authority, with the notable exception of the Domain Name System, or DNS, in which domain names must be registered.
Types of Internet Services
There are many types of Internet services, each suited to particular tasks and particular audiences. In this book, we cover electronic mail, mail-based services, FTP, Gopher, the Web, and DNS, as well as various miscellaneous services.
Electronic Mail
Electronic mail combines the speed and convenience of the telephone with the accountability of postal mail. Most e-mail arrives within minutes of when it is sent but doesn't interrupt the recipient, and e-mail can be filtered, prioritized, and saved. E-mail is used for person-to-person communications, and for workgroup and discussion group activities. In the latter case, a single message can have multiple recipients, so that everyone within the group is part of the discussion.
Mail-Based Services: Mailing Lists and Auto-Reply
A mailing list is a list of e-mail addresses available through a single e-mail address. When a message is sent to the mailing list's address, the message is "exploded" to the entire list, so that each member of the list receives a copy. Lists are usually large discussion groups devoted to a single narrow topic, although some lists are broadcast-only, such as company product announcement lists and newsletters. Discussion lists are also sometimes moderated in order to keep the signal-to-noise ratio high: in a moderated list, all messages sent to the list must first be approved by a moderator before being redistributed to the list.
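The "exploding" and moderation steps can be sketched as follows. This is a simplified model, not the behavior of any particular list server: the MailingList class, its method names, and the addresses are invented for the example, and actual delivery is left out.

```python
from dataclasses import dataclass, field


@dataclass
class MailingList:
    address: str
    members: list[str]
    moderated: bool = False
    pending: list[str] = field(default_factory=list)  # messages awaiting approval

    def submit(self, body: str) -> list[tuple[str, str]]:
        """Handle a message sent to the list address.

        Returns the (recipient, body) pairs to deliver right away.
        """
        if self.moderated:
            self.pending.append(body)   # held until the moderator approves it
            return []
        return self._explode(body)

    def approve(self, index: int = 0) -> list[tuple[str, str]]:
        """Moderator releases one held message to the whole list."""
        return self._explode(self.pending.pop(index))

    def _explode(self, body: str) -> list[tuple[str, str]]:
        # One copy of the message per member.
        return [(member, body) for member in self.members]
```

On an unmoderated list, submit() returns one copy per member immediately. On a moderated list it returns nothing until approve() is called.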
An auto-reply service behaves as you might expect from its name: when a message is sent to a particular address, a reply is automatically generated and mailed to the original sender. The auto-reply might simply return a single message regardless of the content of the mail it receives, or it might attempt to interpret the sender's text as commands, and return a message appropriate to the original request. Although auto-reply is the most basic of information services, it shouldn't be overlooked as you consider other services to deliver your content, for the very reason that it is so basic. Although FTP, Gopher, and Web access are becoming more widespread, virtually everyone with any kind of connection to the Internet can be reached through e-mail. For short, informative, text-only resources, an auto-reply message can sometimes be the quickest, most efficient, and convenient way of delivering that content.
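As a rough sketch of the command-interpreting kind of auto-reply, the following builds a reply from an incoming message using Python's standard email package. The REPLIES table, the command words, and the rule that the first line of the body is the command are all assumptions made for the example; reading and sending the mail are left out.

```python
from email.message import EmailMessage

# Hypothetical canned replies, keyed by command word.
REPLIES = {
    "help": "Send a message containing the word 'info' for a product overview.",
    "info": "Freedonia Widgets makes widgets.",
}


def auto_reply(incoming: EmailMessage, service_address: str) -> EmailMessage:
    """Build a reply to the sender, treating the first body line as a command."""
    lines = incoming.get_content().strip().splitlines()
    command = lines[0].strip().lower() if lines else "help"
    text = REPLIES.get(command, REPLIES["help"])  # unknown commands get the help text

    reply = EmailMessage()
    reply["From"] = service_address
    reply["To"] = str(incoming["From"])
    reply["Subject"] = "Re: " + str(incoming["Subject"] or "")
    reply.set_content(text)
    return reply
```

The simpler kind of auto-reply, which returns the same message no matter what it receives, is the same function with the command lookup removed.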
File Transfer Protocol
The File Transfer Protocol, or FTP, is an Internet standard for transferring files to and from computers on the Internet. Anonymous FTP is probably the most common form of FTP, in which files are available for retrieval by anyone on the Internet who identifies himself or herself as the user "anonymous." Otherwise, FTP is restricted to users with an appropriate user name and password for a site.
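From the client's side, an anonymous retrieval might look like the sketch below, which uses Python's standard ftplib module. The function name and its parameters are invented for the example, and any host or path you pass in is your own; nothing here is specific to a particular site.

```python
from ftplib import FTP


def fetch_anonymous(host: str, remote_path: str, local_path: str) -> None:
    """Download one file from an anonymous FTP site in binary mode."""
    with FTP(host, timeout=30) as ftp:
        ftp.login()  # with no arguments, ftplib logs in as the user "anonymous"
        with open(local_path, "wb") as out:
            # RETR is the FTP command that retrieves a file.
            ftp.retrbinary("RETR " + remote_path, out.write)
```

A call such as fetch_anonymous("ftp.freedonia.com", "pub/README", "README") would copy the remote file to the current directory, assuming such a host and file existed. For a restricted site, the login() call would take a user name and password instead.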
FTP is an obvious choice for making executable programs available to the public, or for exchanging large data files among collaborators. It's far from ideal for publishing information in the traditional sense, however; users have only filenames to rely on for descriptive information on what files might contain, and, in order to view a document, users must download the entire file.
Gopher
Gopher is a simple menu-based information service. Gopher clients present lists of items that can be downloaded and displayed (either within the client itself or with a helper application), as well as items that are links to other directories or servers.
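A Gopher menu, as it travels from server to client, is plain text with one item per line. The sketch below parses such a menu under the usual layout: a one-character item type, then a display string, selector, host, and port separated by tabs, with a lone period ending the menu. The MenuItem type and parse_menu function are names invented for the example, and only the two most common item types are noted.

```python
from typing import NamedTuple


class MenuItem(NamedTuple):
    item_type: str  # "0" is a text file, "1" is another menu (directory)
    display: str    # the text shown to the user
    selector: str   # what the client sends back to fetch the item
    host: str
    port: int


def parse_menu(raw: str) -> list[MenuItem]:
    """Turn the text of a Gopher menu into a list of items."""
    items = []
    for line in raw.splitlines():
        if line == ".":        # a lone period ends the menu
            break
        if not line:
            continue
        fields = line[1:].split("\t")
        if len(fields) < 4:
            continue           # skip lines that don't fit the layout
        items.append(
            MenuItem(line[0], fields[0], fields[1], fields[2], int(fields[3]))
        )
    return items
```

Because each item carries its own host and port, a menu entry can point to a directory on an entirely different server, which is how Gopher links sites together.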
The benefits of presenting information with Gopher are largely those of ease of use: the menu-driven interface of Gopher clients makes Gopher simple to use, and the absence of a specialized markup language or complex protocol makes Gopher sites fairly straightforward to set up and maintain. These benefits also become Gopher's drawbacks, however; since it doesn't have a standard formatting or markup language, and isn't able to handle multiple media types within one document, Gopher services are, by necessity, largely text-based. And Gopher's linking ability is limited to menu items; links from within documents that point to other documents aren't possible.
Although Gopher continues to be extended (Gopher+ includes forms support), and Gopher clients are being built with better integration between the Gopher client and other media players (such as the Adobe Acrobat reader), its appeal compared to Web-based services, described next, may be limited. At the same time, all Web browsers have the ability to access Gopher resources, and making existing documents available via Gopher may be the quickest or most economical way of placing your content on the Internet.
World Wide Web
Much like Gopher, the World Wide Web is a distributed information system, although many of the similarities end there. Most Web documents are written in Hypertext Markup Language, or HTML. HTML gives some page layout capabilities to documents, including the inlining, or inclusion, of images, while making them available across a wide variety of platforms. HTML also makes it possible to place hypertext links anywhere within a document. The concept of hypertext has moved from the esoteric to the everyday in the past decade, but, to put it briefly, hypertext is text that is not constrained to be linear - text with links. Footnotes, which might provide additional information on a topic or refer to other texts on the same subject, are an example of hypertext in traditional media.
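To see how links sit inside the text of an HTML document, the sketch below pulls the link targets out of a page using Python's standard html.parser module. The LinkCollector class and find_links function are names invented for the example, and only anchor tags with an HREF attribute are considered.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the target of every anchor (<a href="...">) in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag and attribute names in lowercase.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_links(html: str) -> list[str]:
    """Return the link targets in the order they appear in the document."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links
```

Given a sentence such as 'See the <a href="faq.html">FAQ</a>.', find_links() returns the single target faq.html. The link is part of the running text rather than an entry in a separate menu, which is the difference from Gopher.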
Outside of HTML, most Web browsers support a multitude of other formats, either directly within the browser itself or through the use of helper applications; in fact, almost any browser can be configured to use additional helper applications in a virtually seamless way. In addition, other description languages, such as Virtual Reality Modeling Language, or VRML, are being developed for use on the Web, as are client-side programming languages, such as Sun's Java, for true distributed processing of multimedia applications.
Domain Name System
The Domain Name System, or DNS, resolves the names of computers - such as www.freedonia.com - into Internet Protocol, or IP, addresses - such as 204.62.130.118. Every computer on the Internet is assigned an IP address, which computers use to locate other computers on the Internet. DNS is a naming scheme primarily intended to let people associate usually easy-to-remember names with almost always hard-to-remember numbers.
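From a program's point of view, name resolution is a single call to the system's resolver. The sketch below uses Python's standard socket module; the function names resolve and resolve_all are invented for the example, and the result for any given name depends on the DNS data in effect when you run it.

```python
import socket


def resolve(name: str) -> str:
    """Return one IPv4 address for a host name, using the system's resolver."""
    return socket.gethostbyname(name)


def resolve_all(name: str) -> list[str]:
    """Return every IPv4 address the resolver reports for a host name."""
    infos = socket.getaddrinfo(
        name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry ends with a (address, port) pair; keep the unique addresses.
    return sorted({info[4][0] for info in infos})
```

A name that is already a dotted-decimal address is returned unchanged, so resolve("204.62.130.118") simply gives back the same string without consulting a name server.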
Depending upon your situation, however, it may not be necessary to run your own domain name server. In most cases, your Internet Service Provider (ISP) will provide DNS for you, and will usually perform such tasks as registering new domain names for a nominal fee. If your ISP is able to provide only primary or secondary service, but not both, or if your local network is large enough that local administration would be greatly simplified by running your own DNS, you may wish to run your own name server.
Miscellaneous Services
The miscellaneous services covered in this book - services not important or widespread enough to warrant their own chapters - are lumped together into a single chapter of this book. We cover Finger, Chat, Talk, and other services. See Chapter 10 for a description of each, to determine if any of these miscellaneous services might fill a particular need.
News
A news server provides access to the online discussion forums available via Usenet, which is a network defined, suitably enough, as the collection of all the machines that carry and distribute news. (Though Usenet largely overlaps with the Internet, there are machines that are considered part of Usenet that aren't on the Internet.)
At the same time, news is a lot like DNS, in that, for most small organizations, news services (a "news feed") will be provided by the organization's Internet Service Provider.
Selecting the Proper Hardware for Your Services
Once you've selected the services you'd like to offer, you'll need to decide on the hardware needed to run them.
Any machine running Windows 95 or NT on a Pentium CPU, Unix on a Sun SPARCstation, or Mac OS 7.5.x with Open Transport on a PowerPC CPU will have ample horsepower to run all but the busiest of Internet sites. Unless you're in a university environment and hope to provide e-mail services for thousands of staff, faculty, and students, or you're providing some high-traffic resource on the Web in excess of 200,000 connections per day, you should have plenty of processing power to run your services.
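To put the figure of 200,000 connections per day in perspective, it helps to convert it to an average rate. The short calculation below does only that; the function name is invented for the example, and it assumes traffic spread evenly over the day, which real traffic never is.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(connections_per_day: int) -> float:
    """Average connections per second, assuming an even spread over the day."""
    return connections_per_day / SECONDS_PER_DAY


rate = average_rate(200_000)  # roughly 2.3 connections per second
```

Since busy periods will run well above the average, treat the result as a floor for what the machine must handle, not a ceiling.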
You will want to ensure that your hardware has adequate memory. You'll need enough real memory in which to run your Internet server applications comfortably; using virtual memory or RAM Doubler memory hurts performance and makes your server less reliable. Most of the server applications discussed here require about 1 megabyte of memory, with Common Gateway Interface (CGI) add-ons to Web services increasing that figure. Depending on the memory footprint of your system and the number of Internet server applications you hope to run, a machine with 16 megabytes of memory or more may be appropriate.
There are two steps you can take to help guarantee your hardware provides the most stable environment possible, described in the following subsections.
1. Use a Dedicated Machine
Although you can run server applications in the background of a machine that is running, say, Photoshop in the foreground, that doesn't really give you an environment robust enough for providing an Internet service, in terms of either stability or speed.
2. Run the Latest System and Network Software and a Minimum of Nonessential Extensions
The reason for running the latest version of your operating system should be obvious: as Apple fixes bugs and increases reliability and performance, you'll want to take advantage of these improvements. The reason for installing a minimum of nonessential extensions should be fairly obvious, as well. Most extensions add load to your system, slowing it down, and introduce additional layers of complexity, making your system less stable.
Multiple Servers
You can also take advantage of multiple machines serving the same content, if you use a round-robin domain name server, which will point first to one machine, then to another, for the same host name. For example, you can set up two machines as Web servers, both housing the same content with the same directory structure, named www1.freedonia.com and www2.freedonia.com, respectively. You can then configure your DNS to point first to www1.freedonia.com, and then to www2.freedonia.com, in turn, when clients request www.freedonia.com, effectively splitting the load between the two machines. The problem with this scheme? At the time of this writing, the software needed in order to mirror - or copy - the content of one server from another is still under development. Also, when using multiple machines to provide a single service, you'll need to deal with such issues as synchronization of user-submitted data, although this is often handled by making only one machine responsible for data processing and data storage.
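The rotation itself is simple enough to sketch. The class below is a toy stand-in for a name server, not real DNS software: the RoundRobinResolver name and its table format are invented for the example, and it answers with machine names rather than addresses to match the example in the text.

```python
from itertools import cycle


class RoundRobinResolver:
    """Toy name server that hands out a host's machines in rotation."""

    def __init__(self, records: dict[str, list[str]]) -> None:
        # One endlessly repeating iterator per public host name.
        self._rotations = {name: cycle(targets) for name, targets in records.items()}

    def resolve(self, name: str) -> str:
        """Return the next machine in turn for the given host name."""
        return next(self._rotations[name])


resolver = RoundRobinResolver(
    {"www.freedonia.com": ["www1.freedonia.com", "www2.freedonia.com"]}
)
answers = [resolver.resolve("www.freedonia.com") for _ in range(3)]
```

Three requests in a row are answered with www1, www2, and then www1 again. Note that the rotation takes no account of how busy each machine is, so the load is split evenly only if the requests themselves are roughly alike.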