The World Wide Web is a system of Internet servers that supports hypertext to access several Internet protocols on a single interface. The World Wide Web is often abbreviated as the Web, WWW, or W3.
The World Wide Web was developed in 1989 by Tim Berners-Lee of the European Particle Physics Lab (CERN) in Switzerland. The initial purpose of the Web was to use networked hypertext to facilitate communication among CERN's members, who were located in several countries. Word soon spread beyond CERN, and rapid growth in the number of both developers and users ensued. In addition to hypertext, the Web began to incorporate graphics, video, and sound. Use of the Web has since reached global proportions.
Almost every protocol type available on the Internet is accessible on the Web. Internet protocols are sets of rules that allow for intermachine communication on the Internet. The following major protocols are accessible on the Web:
E-mail (Simple Mail Transfer Protocol or SMTP)
Distributes electronic messages and files to one or more electronic mailboxes
Telnet (Telnet Protocol)
Facilitates login to a computer host to execute commands
FTP (File Transfer Protocol)
Transfers text or binary files between an FTP server and client
Usenet (Network News Transfer Protocol or NNTP)
Distributes Usenet news articles derived from topical discussions on newsgroups
HTTP (HyperText Transfer Protocol)
Transmits hypertext over networks. This is the protocol of the WWW.
Many other protocols are available on the Web. To name just one example, the Voice over Internet Protocol (VoIP) allows users to place a telephone call over the Web.
The World Wide Web provides a single interface for accessing all these protocols. This creates a convenient and user-friendly environment. It is no longer necessary to be conversant in these protocols within separate, command-level environments. The Web gathers together these protocols into a single system. Because of this feature, and because of the Web's ability to work with multimedia and advanced programming languages, the World Wide Web is the fastest-growing component of the Internet.
Hypertext: The Motion Of The Web
The operation of the Web relies primarily on hypertext as its means of information retrieval. Hypertext is a document containing words that connect to other documents. These words are called links and are selectable by the user. A single hypertext document can contain links to many documents. In the context of the Web, words or graphics may serve as links to other documents, images, video, and sound. Links may or may not follow a logical path, as each connection is programmed by the creator of the source document. Overall, the WWW contains a complex virtual web of connections among a vast number of documents, graphics, videos, and sounds.
Producing hypertext for the Web is accomplished by creating documents with a language called HyperText Markup Language, or HTML. With HTML, tags are placed within the text to accomplish document formatting, visual features such as font size, italics and bold, and the creation of hypertext links. Graphics may also be incorporated into an HTML document. HTML is an evolving language, with new tags being added as each upgrade of the language is developed and released. The World Wide Web Consortium, led by Tim Berners-Lee, coordinates the efforts of standardizing HTML.
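As a rough illustration of how tags carry both formatting and links, the sketch below uses Python's standard html.parser module to pull the link targets out of a small HTML document. The sample page and the LinkCollector class are made up for this example; the URLs in it are simply borrowed from elsewhere in this article.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the target of every <a href="..."> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical sample page: tags mark up formatting and hypertext links.
SAMPLE_HTML = """
<html>
  <body>
    <p>This text is <b>bold</b> and this is <i>italic</i>.</p>
    <p>See the <a href="http://www.house.gov/agriculture/schedule.htm">schedule</a>
       or the <a href="ftp://ftp.uu.net/graphics/picasso">picture</a>.</p>
  </body>
</html>
"""

collector = LinkCollector()
collector.feed(SAMPLE_HTML)
print(collector.links)
```

The b and i tags only change how the text looks, while each a tag with an href attribute creates a hypertext link that a browser would make selectable.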
Pages On The Web
The World Wide Web consists of files, called pages or Web pages, containing information and links to resources throughout the Internet.
Web pages can be created by user activity. For example, if you visit a Web search engine and enter keywords on the topic of your choice, a page will be created containing the results of your search. In fact, an increasing amount of information found on the Web today is served from databases, creating temporary Web pages "on the fly" in response to user queries.
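The idea of a page created "on the fly" can be sketched in a few lines of Python. Everything here is hypothetical: the PAGES dictionary stands in for a real database, the example.org addresses are placeholders, and build_results_page is an illustrative name, not part of any search engine.

```python
from html import escape

# Hypothetical stand-in for a database of indexed pages.
PAGES = {
    "http://www.example.org/cern.html": "History of hypertext at CERN",
    "http://www.example.org/lynx.html": "Using the Lynx text browser",
    "http://www.example.org/html.html": "Writing hypertext with HTML",
}


def build_results_page(query):
    """Return an HTML page listing every entry whose title contains the query."""
    needle = query.lower()
    items = [
        f'<li><a href="{escape(url)}">{escape(title)}</a></li>'
        for url, title in PAGES.items()
        if needle in title.lower()
    ]
    body = "\n".join(items) if items else "<li>No results found</li>"
    return (
        f"<html><body><h1>Results for {escape(query)}</h1>\n"
        f"<ul>\n{body}\n</ul></body></html>"
    )


print(build_results_page("hypertext"))
```

No file for this page exists anywhere on the server; the HTML is assembled only when the query arrives and is discarded afterwards.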
Access to Web pages may be accomplished by:
Entering an Internet address and retrieving a page directly
Browsing through pages and selecting links to move from one page to another
Searching through subject directories linked to organized collections of Web pages
Entering a search statement at a search engine to retrieve pages on the topic of your choice
Retrieving Documents On The Web: The URL
URL stands for Uniform Resource Locator. The URL specifies the Internet address of a file stored on a host computer connected to the Internet. Every file on the Internet, no matter what its access protocol, has a unique URL. Web software programs use the URL to retrieve the file from the host computer and the directory in which it resides. This file is then displayed on the monitor connected to the user's local machine. The host name in a URL is translated into a numeric address using the Internet Domain Name System (DNS). The numeric address is the "real" address of the host computer. Since numeric strings are difficult for humans to use, alphanumeric addresses are employed by end users. Once the translation is made, the Web server can send the requested page to the user's Web browser.
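The translation step can be tried from Python's standard library. This is only a sketch: resolve_host is a made-up helper name, and it returns a single IPv4 address, whereas a real browser does considerably more. The example call uses a numeric host so that it works without a network connection.

```python
import socket
from urllib.parse import urlsplit


def resolve_host(url):
    """Return the numeric (IPv4) address of the host named in a URL."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host name in URL: {url!r}")
    # gethostbyname asks the system resolver, which normally consults DNS.
    # A host that is already numeric is returned unchanged.
    return socket.gethostbyname(host)


print(resolve_host("http://127.0.0.1/index.html"))
```

Calling resolve_host with a named host, such as the www.house.gov address used in the next section, performs a real lookup, so the result depends on your network connection and on the current DNS records.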
Anatomy of a URL
This is the format of the URL:
protocol://host/path/filename
For example, this is a URL on the home page of the House Committee on Agriculture of the U.S. House of Representatives:
http://www.house.gov/agriculture/schedule.htm
This URL is typical of addresses hosted in domains in the United States.
Structure of this URL:
1. Protocol: http
2. Host computer name: www
3. Second-level domain name: house
4. Top-level domain name: gov
5. Directory name: agriculture
6. File name: schedule.htm
Note how much information about the content of the file is present in this well-constructed URL.
Other examples:
telnet://library.albany.edu - the University at Albany library text-based catalog
ftp://ftp.uu.net/graphics/picasso - a file at an ftp site
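The same breakdown can be reproduced with Python's urllib.parse module. The describe_url function and the dictionary keys below are illustrative names chosen for this sketch; it assumes a simple protocol://host/path/filename address like the ones shown above.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the parts named in the numbered list above."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    # Everything before the last "/" is the directory; the rest is the file.
    directory, _, filename = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "host_computer": labels[0],
        "second_level_domain": labels[-2] if len(labels) > 1 else "",
        "top_level_domain": labels[-1],
        "directory": directory.strip("/"),
        "filename": filename,
    }


for url in (
    "http://www.house.gov/agriculture/schedule.htm",
    "telnet://library.albany.edu",
    "ftp://ftp.uu.net/graphics/picasso",
):
    print(url)
    for name, value in describe_url(url).items():
        print(f"  {name}: {value}")
```

For the telnet address the directory and file name come back empty, since that URL names only a host.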
Several top-level domains (TLDs) are common in the United States:
com - commercial enterprise
edu - educational institution
gov - U.S. government entity
mil - U.S. military entity
net - network access provider
org - usually nonprofit organizations
In addition, dozens of domain names have been assigned to identify and locate files stored on host computers in countries around the world. These are referred to as two-letter Internet country codes, and have been standardized by the International Organization for Standardization as ISO 3166. For example:
ch - Switzerland
de - Germany
jp - Japan
uk - United Kingdom
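A small lookup shows how the last label of a host name can be read against the two lists above. The tables contain only the entries given in this article, and describe_tld is a hypothetical helper name, so anything else is reported as unknown.

```python
# Only the domains listed in this article; real registries are far larger.
GENERIC_TLDS = {
    "com": "commercial enterprise",
    "edu": "educational institution",
    "gov": "U.S. government entity",
    "mil": "U.S. military entity",
    "net": "network access provider",
    "org": "usually nonprofit organizations",
}
COUNTRY_CODES = {
    "ch": "Switzerland",
    "de": "Germany",
    "jp": "Japan",
    "uk": "United Kingdom",
}


def describe_tld(hostname):
    """Describe the top-level domain (the last label) of a host name."""
    tld = hostname.lower().rstrip(".").rsplit(".", 1)[-1]
    if tld in GENERIC_TLDS:
        return GENERIC_TLDS[tld]
    if tld in COUNTRY_CODES:
        return COUNTRY_CODES[tld]
    return "unknown"


for host in ("www.house.gov", "library.albany.edu", "Beginners.co.uk"):
    print(host, "->", describe_tld(host))
```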
It has been proposed that new top-level domains be added to the existing domain names. The U.S. Government has formed the Internet Corporation for Assigned Names and Numbers (ICANN) to work out these and other issues relating to domain names.
How To Access The World Wide Web: Web Browsers
To access the World Wide Web, you must use a Web browser. A browser is a software program that allows users to access and navigate the World Wide Web. There are two types of browsers:
Graphical: Text, images, audio, and video are retrievable through a graphical software program such as Netscape Navigator and Internet Explorer. These browsers are available for both Windows-based and Macintosh computers. Navigation is accomplished by pointing and clicking with a mouse on highlighted words and graphics.
You can install a graphical browser such as Netscape Navigator on your Windows-based or Macintosh machine. Navigator is available for downloading on the Netscape Web site: http://home.netscape.com/. Microsoft's Internet Explorer is available from the Microsoft Web site: http://www.microsoft.com/. To use these programs to access the Web, you need an Ethernet connection or a dialup connection known as SLIP or PPP. The latter may be obtained from an Internet Service Provider. For more information, see How to Connect to the Internet.
Text: Lynx is a browser that provides access to the Web in text-only mode. Navigation is accomplished by highlighting emphasized words on the screen with the up and down arrow keys, and then pressing the forward arrow (or Enter) key to follow the link. This browser is available through your personal VAX or UNIX account on campus. For more information, see Guide to Using Lynx.
Extending the Browser: Plug-Ins
Software programs may be configured to a Web browser in order to enhance its capabilities. When the browser encounters a sound, image or video file, it hands off the data to other programs, called plug-ins, to run or display the file. Working in conjunction with plug-ins, browsers can offer a seamless multimedia experience. Many plug-ins are available for free.
File formats requiring plug-ins are known as MIME types. MIME stands for Multipurpose Internet Mail Extensions, and was originally developed to help e-mail software handle a variety of binary (non-ASCII) file attachments. The use of MIME has expanded to the Web. For example, the basic MIME type handled by Web browsers is text/html, associated with the file extension .html.
A common plug-in utilized on the Web is the Adobe Acrobat Reader. The Acrobat Reader allows you to view documents created in Adobe's Portable Document Format. These documents are the MIME type application/pdf and are associated with the file extension .pdf. When the Acrobat Reader has been configured to your browser, the program will open and display the file requested when you click on a hyperlinked file name with the suffix .pdf. The latest versions of the Acrobat Reader allow for the viewing of documents within the browser window.
Web browsers are often standardized with a small suite of plug-ins, especially for playing multimedia content. Additional plug-ins may be obtained at the browser's Web site, at special download sites on the Web, or from the Web sites of the companies that created the programs. The number of available plug-ins is increasing rapidly.
Once a plug-in is configured to your browser, it will automatically launch when you choose to access a file type that it uses.
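The hand-off from browser to plug-in can be pictured as a table lookup keyed by MIME type. The sketch below uses Python's standard mimetypes module to map a file extension to its type. The HANDLERS table and the choose_handler function are invented for illustration, and guessing from the file extension alone is a simplification: a browser normally also takes into account the type reported by the Web server.

```python
import mimetypes

# Hypothetical handler table: which program displays which MIME type.
HANDLERS = {
    "text/html": "the browser itself",
    "application/pdf": "Adobe Acrobat Reader plug-in",
}

# MimeTypes() starts from Python's built-in table of file extensions.
_types = mimetypes.MimeTypes()


def choose_handler(filename):
    """Return (MIME type, handler) for a file, judged by its extension."""
    mime_type, _encoding = _types.guess_type(filename)
    return mime_type, HANDLERS.get(mime_type, "no plug-in configured")


for name in ("schedule.htm", "index.html", "report.pdf", "data.unknownext"):
    print(name, "->", choose_handler(name))
```

Installing a new plug-in corresponds to adding one more entry to the table, after which files of that type are handed to it automatically.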
Beyond Plug-Ins: ActiveX
ActiveX is a technology developed by Microsoft which may make plug-ins less necessary. ActiveX offers the opportunity to embed animated objects, data, and computer code on Web pages. A Web browser supporting ActiveX can render most items encountered on a Web page. For example, ActiveX allows users to view three-dimensional VRML worlds in a Web browser without the use of a VRML plug-in. As another example of the power of ActiveX, this technology can allow you to view and edit PowerPoint presentations directly within your Web browser. ActiveX works best with Microsoft's Internet Explorer browser.
Author: Laura Cohen lcohen@albany.edu. Special thanks to Laura for allowing Beginners.co.uk to reproduce this article.