8.1 The Nature of Web Pages

The World Wide Web’s architecture is simple. It has a large number of Web servers and a much larger number of Web clients. The Web clients most people use are Web browsers, although it is possible to develop simpler Web clients for very specific purposes. A Web client and a Web server communicate using HTTP (the HyperText Transfer Protocol), which is a protocol, or language, for communication between the two. The HTTP communication between a Web server and a Web client is translated into lower-level communication to travel over the network, but this translation is of no relevance to us here. Many books, primarily on computer networks, discuss HTTP in great depth. Two such books are Computer Networks and Internets [Com99] and Data and Computer Communication [Sta97]. An easy and readable discussion of HTTP is found in Webmaster in a Nutshell [SQ96].
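
To make the idea of HTTP as a "language" concrete, the following minimal Python sketch builds the text of a simple request and takes apart a hand-written response. The host name, the path, and the sample response are placeholders chosen for illustration, and the helper names are our own; real clients and servers handle many more details.

```python
# Sketch of the text a Web client and a Web server exchange over HTTP.
# The host, path, and sample response are illustrative placeholders.

def build_get_request(host, path):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, version
        f"Host: {host}",          # header naming the server being addressed
        "Connection: close",
        "",                       # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw):
    """Split a raw HTTP response into status code, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ")[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


request = build_get_request("www.example.com", "/index.html")

# A response such as a server might send back (written by hand here).
sample_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body>Hello</body></html>"
)
status, headers, body = parse_response(sample_response)
print(status, headers["content-type"])
```

The request and the response are both plain text with a header section, a blank line, and an optional body, which is why simple special-purpose Web clients are feasible.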

The general perception is that a Web page is a simple textual file served by a Web server to a requesting Web client. In fact, Web documents are of three types.

•    Static: A static Web page is a textual file that resides on a Web server. It is composed by a Web developer. Every time such a page is requested by a Web client, the Web server sends the same unvarying page.

•    Dynamic: A dynamic Web page is not composed a priori and stored in a file. It is created by a Web server, with the assistance of an additional program, when a client requests it. When a request for such a page arrives at a Web server, the server hands over the responsibility of creating the page to an auxiliary application program. Usually, the application program produces an HTML or XML page and gives it to the Web server, which in turn returns it to the requesting Web client. The client neither knows nor cares whether it is a static page or a dynamic page. Since a new document is created to respond to each request from a client, the page served for the same request can vary from time to time. Quite often, the dynamically created Web page is based on a template into which data values are inserted for each request.

•    Active: An active Web page is a program that resides on the server. A Web server sends this program to the Web browser in response to a request. The program runs on the browser, communicates with the Web server and/or other programs, and displays the results on the browser.
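
The difference between the first two types can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the page contents, the template fields (city and temp_c), and the function names are invented for the example and do not correspond to any particular server.

```python
from string import Template

# A static page: fixed text, as it would sit in a file on the server.
STATIC_PAGE = "<html><body><h1>Welcome</h1></body></html>"

# A template for a dynamic page; $city and $temp_c are filled in per request.
PAGE_TEMPLATE = Template(
    "<html><body><p>Temperature in $city: $temp_c C</p></body></html>"
)


def serve_static():
    """Return the same unvarying page for every request."""
    return STATIC_PAGE


def serve_dynamic(city, temp_c):
    """Create a fresh page by inserting current data values into the template."""
    return PAGE_TEMPLATE.substitute(city=city, temp_c=temp_c)


print(serve_static())
print(serve_dynamic("Denver", 21.5))
```

Calling serve_static twice always yields identical text, whereas two calls to serve_dynamic for the same city can differ whenever the temperature supplied to it has changed.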

A static Web page is easy to create. It is usually created using HTML, although XML is increasingly used as well. One does not need to know HTML to produce HTML-based Web pages; GUI-based tools sold by various companies can be used for this purpose. An example of such a tool is Dreamweaver by Macromedia, Inc. Such a tool produces HTML automatically. There are a few XML-based GUI tools as well, but in general, producing an XML page requires more work. A static Web page can be served quickly and reliably by a Web server. A Web browser can also place a static Web page in its cache for a short period of time and display it again without contacting the server. The main disadvantage of a static Web page is that its contents are fixed: it cannot contain data that changes from one request to the next.

A dynamic Web page can present time-varying data, or data that is dynamic in some other fashion, such as being dependent on a browser’s characteristics. Time-varying information includes the current time, the current temperature and other weather-related information, prices of plane tickets, and stock market quotes. When a browser requests a dynamic Web page, the server contacts an application program that creates the Web page by inserting the current values of the dynamic data. The server then sends this page to the client. The main disadvantage of a dynamic document is that it places an additional burden on the Web server compared to returning a static Web page. A dynamic document also needs a programmer to create it, and it requires more extensive testing than a static document because the data can come from various sources. One very common source of data is HTML forms.
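
As a small illustration of form data driving a dynamic page, the Python sketch below decodes the kind of encoded string a browser sends when a form is submitted and inserts one field into a page. The field name "name", the default value, and the function greeting_page are assumptions made for this example.

```python
from html import escape
from urllib.parse import parse_qs


def greeting_page(query_string):
    """Build a page from URL-encoded form data such as 'name=Ada+Lovelace'."""
    fields = parse_qs(query_string)
    # parse_qs maps each field name to a list of values; take the first one.
    name = fields.get("name", ["stranger"])[0]
    # Escape the value so that user input cannot inject markup into the page.
    return f"<html><body><p>Hello, {escape(name)}!</p></body></html>"


print(greeting_page("name=Ada+Lovelace"))
print(greeting_page(""))
```

Even this tiny program has to cope with a missing field and with input that contains markup, which hints at why dynamic documents need more testing than static ones.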

A dynamic document has another problem. It places the computational responsibility solely on the Web server. It does not place any additional burden on the Web client or the machine on which the Web client runs. Nowadays, most users have reasonably fast computers, and therefore some of the computing burden can easily be shifted to the client’s computer. This is exactly what happens in the case of an active Web page. The server sends the Web client a program that runs on the client’s computer. Such an active Web page can perform animation, display time-varying graphs, and produce sound, among other things. This is useful for displaying data that changes very quickly, such as stock quotes or weather information in the middle of a fierce storm. If we depended on dynamic pages for such situations, the server would be overburdened and the updates on the client would be slow. An active Web page must be able to communicate with the Web server or other server programs to continuously retrieve data and update the presentation.

The main disadvantage of an active Web page is that it requires programming and thus cannot be produced by a person without programming skills. An active Web page is also difficult to test because it can be requested by any computer on the Internet, and computers on the Internet vary widely in computing power, operating system, and installed applications. Finally, an active Web page can be a security risk. For example, an active Web page may contain a computer virus that can run on the client machine unless checked and deterred. An active Web page may also be able to export data from the client’s computer to other machines. If mischievous, such a program may be able to scan the client machine’s hard drive and report on the preferences of the user, such as what programs reside on it, or the presence or absence of specific kinds of files such as bootlegged programs, music files, or pornographic images or video files. Thus, an active Web page is not always the boon it is made out to be by its advocates. It can be a bane as well, compromising security and privacy.

In this chapter, we discuss how to create dynamic Web documents. We focus on one type of dynamic document: those produced by CGI (Common Gateway Interface) programs. A Web server that serves only static Web pages must be modified in the following ways to be able to serve dynamic Web pages.

•    The server program must be extended so that it can execute another application that creates the dynamic Web page for each client request. This application program may, in turn, communicate with other programs such as a Web server.

•    A separate application program must be written for each dynamic document. These programs normally reside on the same machine as the Web server.

•    The Web server must be configured so that it knows which URL corresponds to a static Web page and which to a dynamic Web page. For each dynamic Web page, the URL itself specifies the application program to run.
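
The three requirements above can be sketched together in Python. This is a simplified illustration, not how any particular Web server is implemented: the /cgi-bin/ prefix, the in-memory tables standing in for files on disk, and the names is_dynamic and handle are all assumptions made for the example. The environment variables REQUEST_METHOD and QUERY_STRING are the ones CGI uses to pass request data to a program.

```python
import os
import subprocess
import sys

# Assumed configuration: URLs under this prefix name programs to run;
# every other URL names a static page.
CGI_PREFIX = "/cgi-bin/"

# Stand-in for a CGI program stored on the server, inlined to keep the
# sketch self-contained. It reads request data from the environment and
# writes a header, a blank line, and then the page to standard output.
HELLO_PROGRAM = (
    "import html, os\n"
    "query = html.escape(os.environ.get('QUERY_STRING', ''))\n"
    "print('Content-Type: text/html')\n"
    "print()\n"
    "print('<p>Query was: ' + query + '</p>')\n"
)
PROGRAMS = {"/cgi-bin/hello": HELLO_PROGRAM}              # one program per dynamic page
STATIC_FILES = {"/index.html": "<p>Static home page</p>"}  # stand-in for files on disk


def is_dynamic(path):
    """Decide from the URL alone whether a program must be run."""
    return path.startswith(CGI_PREFIX)


def handle(url):
    """Return the page body for a URL, running a program when required."""
    path, _, query = url.partition("?")
    if not is_dynamic(path):
        return STATIC_FILES[path]
    # Pass the request to the program through environment variables.
    env = dict(os.environ, REQUEST_METHOD="GET", QUERY_STRING=query)
    result = subprocess.run(
        [sys.executable, "-c", PROGRAMS[path]],
        env=env, capture_output=True, text=True, check=True,
    )
    # The program's output is headers, a blank line, then the page itself.
    _headers, _, body = result.stdout.partition("\n\n")
    return body.strip()


print(handle("/index.html"))
print(handle("/cgi-bin/hello?city=Denver"))
```

Starting a new process for each request is what makes the approach simple, and it is also one source of the additional burden that dynamic pages place on the server.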