Web Components (Introduction)
Chapter 1
Web Protocols and Practice
INTRODUCTION

Topics
• Web History
• Web Definition
• Semantic Components of the Web
• Content on the Web
• Software Components
• Underlying Network
• Standardization
• Web Traffic and Performance
• Web Applications

Web History
• 1945: Vannevar Bush proposed Memex, a device to extend human memory by providing large-scale indexing of text.
• 1965: Hypertext: non-sequential writing that presents information as a collection of linked nodes.
• 1960-1970: The U.S. Department of Defense extended the use of its communication infrastructure (ARPANET) to connect computers. In the early 1980s TCP/IP was deployed, which caused rapid growth in the size and scope of ARPANET.
• 1989: Tim Berners-Lee proposed using hypertext to access the information on the computers at CERN.
• During 1980-1990 the following systems were widely used on the Internet to access information:
  » FTP: file transfer. The user must already know which FTP server holds the file.
  » Gopher: provided a way for users to search the servers in the network.
  » WAIS (Wide Area Information Servers): allowed users to send queries to databases around the network.
  » Archie: a global index of FTP servers that allowed users to search by file name.
• 1992: The first official release of the web browser.
• 1993: First graphical web browser (Mosaic).

Web Definition
• The World Wide Web, or simply the Web, is the universe of information accessible via networked computers.
• The Internet is different from the Web. It is a network of computers, and a computer on it does not necessarily act as a Web client or a Web server.

Semantic Components of the Web
• The three main semantic components of the Web are:
  » A naming infrastructure (URI)
  » A document language (HTML)
  » A message exchange protocol (HTTP)

URI (Uniform Resource Identifier)
• Accessing and manipulating resources distributed throughout the Web requires a way to identify them. A URI is a universal naming mechanism that identifies a resource on the Web independently of its current location or value.
• A URI can be thought of as a pointer to a black box to which a request method can be applied, generating different responses at different times. A request method is a simple operation such as fetching, changing, or deleting a resource. For example, at a high level, a string such as http://www.foo.com/coolpic.gif is a URI.
• Later we will see how a URI differs from a URL.

HTML (Hypertext Markup Language)
• HTML provides a standard representation for hypertext documents in ASCII format.

HTTP (Hypertext Transfer Protocol)
• HTTP is the most common way of transferring resources on the Web.
• HTTP defines the format and meaning of the messages exchanged between Web components, such as clients and servers.
• HTTP is simply a language, with a specific syntax and with semantics associated with the use of its elements.
• HTTP is a request-response protocol.
  » The client sends a request message, and the server replies with a response message.
• HTTP is a stateless protocol.
  » Clients and servers treat each message exchange independently and are not required to maintain any state across requests and responses.

Table 1.1.
Common Web terms
Term | Definition
WWW/Web | World Wide Web, the universe of information accessible via networked computers
Hypertext | Nonlinear writing, or linking of related documents for navigation
Internet | Worldwide collection of interconnected networks using the Internet Protocol (IP)
Web page | Document accessible on the Web via a URI
Web site | Collection of related Web pages
Browser | Application for requesting and displaying Web resources

Content on the Web
• Each resource may be available in different formats, for example:
  » HTML
  » PostScript
• A resource may be:
  » A static file on a machine
  » Generated dynamically at the time of the request
• Each HTTP transfer consists of two messages:
  » The request message, sent by the client
  » The response message, sent by the server

Table 1.2. Terminology related to Web resources and HTTP messages
Term | Definition
Resource | Network data object or service identified by a URI
Message | Basic unit of communication in HTTP
Sender/receiver | Component responsible for sending/receiving a message
Header | Control portion of a message
Entity | Information transferred in the body of a message

Software Components
• User agent
  » A user agent can be a Web browser that generates requests on behalf of a user and performs a variety of other tasks, such as displaying Web pages and storing the user's bookmarks.
• Proxy
  » A proxy is an intermediary between clients and servers that performs a variety of functions:
      » Filtering requests to undesirable Web sites
      » Providing a degree of anonymity to clients
      » Caching popular resources
• Server
  » The server may instruct the user agent to retain state across a series of requests and responses by storing a cookie.
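As a rough illustration of the terms in Table 1.2 and of the cookie mechanism, the sketch below splits a hand-written HTTP response into its header and entity, stores the cookie as a user agent might, and sends it back on the next request. The response text, the cookie name `session`, and the helpers `split_message` and `build_request` are assumptions made up for this example; real user agents apply many more rules (cookie scope, expiry, and so on).

```python
from http.cookies import SimpleCookie


def split_message(raw: str):
    """Split an HTTP message into start line, header fields, and entity body."""
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


def build_request(path: str, host: str, jar: SimpleCookie) -> str:
    """Build a GET request that returns any stored cookies to the server."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    cookie_header = "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())
    if cookie_header:
        lines.append(f"Cookie: {cookie_header}")
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical response message from an origin server (not captured traffic).
response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Set-Cookie: session=abc123\r\n"
    "\r\n"
    "<html>hello</html>"
)

status_line, headers, entity = split_message(response)

# The user agent retains state by storing the cookie the server supplied.
jar = SimpleCookie()
jar.load(headers["set-cookie"])

# HTTP itself stays stateless: the state travels inside the next request.
next_request = build_request("/coolpic.gif", "www.foo.com", jar)
print(next_request)
```

The server keeps no connection-level memory of the client here; it recognizes the returning user agent only because the `Cookie` header is repeated in each later request.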
We will discuss cookies later.

Table 1.3. Terminology related to the software components of the Web
Term | Definition
User agent | Client program that initiates a request (e.g., a browser)
Web client | Program that sends an HTTP request to a Web server
Web server | Program that receives an HTTP request from a Web client and transmits a response
Origin server | Server where the requested resource resides or is created
Intermediary | Web component in the path between the user agent and an origin server (e.g., a proxy, gateway)
Proxy | Intermediary program that functions as a server to a client and as a client to a server
Cookie | State information passed between the user agent and the origin server

Underlying Network
• A Web client identifies the Web server by its hostname (e.g., www.att.com) rather than by an IP address. The Domain Name System (DNS) translates the hostname into an IP address.
• The two applications exchange HTTP messages over the Transmission Control Protocol (TCP): the client and the server establish a TCP connection.

Table 1.4. Terminology related to the Internet and its protocols
Term | Definition
Host | Computer or machine connected to the network
Packet | Basic unit of communication in the Internet
IP | Internet Protocol, a protocol that coordinates the delivery of individual packets between hosts
IP address | 32-bit numerical address identifying an Internet host
Hostname | Case-insensitive string identifying an Internet host
DNS | Domain Name System, a distributed infrastructure for translating between hostnames and IP addresses
TCP | Transmission Control Protocol, a protocol that provides the abstraction of a reliable, bidirectional connection
Connection | Logical communication channel between two hosts

Standardization
• A protocol standard is needed for interoperation of the components.
• The Internet Engineering Task Force (IETF) is an open community that deals with Internet standardization through a series of official publications called Requests for Comments (RFCs).
• Not all Internet Drafts become RFCs. RFCs are divided into different tracks: standards, historic, informational, and experimental.
• Standards documents state compliance requirements at the following levels:
  » Any compliant implementation has to meet all the MUST-level requirements.
  » An implementation that meets all the MUST-level requirements but not all the SHOULD-level requirements is considered conditionally compliant; one that meets both is unconditionally compliant.
  » The MAY-level requirements are optional for an implementation to meet.
• A standards document proceeds through three stages:
  » Proposed Standard
  » Draft Standard
  » Internet Standard
• Some RFCs reflect Best Current Practices (BCP).
• Standards do not last forever; they can be retired and replaced by a superior specification.
• The World Wide Web Consortium (W3C) was founded in 1994 to encourage the growth of the Web.
• The W3C works on:
  » The representation of Web content, such as the HTML language, rather than the networking aspects
  » Architectural issues
  » User-interface issues
      » Formats
      » Languages
  » Social issues
  » Legal and public policy matters
  » Accessibility issues, to ensure that people with disabilities are able to have access to the technology

Table 1.5.
Terminology related to Internet protocol standards
Term | Definition
IETF | Internet Engineering Task Force, an open community contributing to the evolution of the Internet
Working Group | IETF group chartered to work on a particular standards specification
Internet Draft | Informal version of a standards document reflecting work in progress
RFC | Request for Comments, an official document related to Internet standards

Web Traffic and Performance
• User expectations for quick responses have focused attention on performance issues.
• High user-perceived latency can result from a variety of factors, such as:
  » DNS overhead
  » Network congestion
  » Load on the server
• Analysis of logs is useful for learning workload characteristics, such as the time between requests, the size of requests, and resource popularity, all of which have important implications for Web performance.

Table 1.6. Terminology related to Web traffic and performance
Term | Definition
Latency | Time between the initiation of an action and the first indication of a response
User-perceived latency | Time between a user action and the initial display of the content
Bandwidth | Amount of traffic that can be carried per unit time
Workload | Inputs received by a Web component over time
Log | Record of transactions performed by a Web component

Web Applications
• Important applications are:
  » Web caching
      » Caching moves content closer to the user.
      » A cache can be located at:
          - A user's browser
          - An origin server
          - A machine in the path between the user and the origin server
  » Multimedia streaming
      » The client plays the samples and frames as they arrive from the server, rather than downloading the content in its entirety before beginning playout.

Table 1.7.
Terminology related to Web caching and multimedia streaming
Term | Definition
Cache | Store of messages used to reduce user-perceived latency and load on the network and server
Cache coherency | Mechanism to lower the possibility of returning out-of-date messages from the cache
Replication | Duplication of resources on multiple origin servers
Content distribution | Delivery of resources on behalf of an origin server
Audio/video stream | Sequence of audio samples or video frames
Streaming | Overlap of the server transmission and client playback of audio/video data
Media player | Helper application for playing multimedia streams
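
To make the cache and cache coherency entries more concrete, here is a minimal sketch of a cache that stores response bodies by URI and refetches an entry once it is older than a fixed lifetime. The class name `TtlCache`, the time-to-live policy, and the `fake_origin` function are assumptions chosen for illustration; the slides do not prescribe a particular coherency mechanism, and real Web caches use richer rules.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Cache keyed by URI; an entry older than ttl_seconds is refetched."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # stands in for a request to the origin server
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, uri: str) -> str:
        now = self._clock()
        entry = self._store.get(uri)
        if entry is not None:
            stored_at, body = entry
            if now - stored_at < self._ttl:
                self.hits += 1       # fresh copy: no load on network or server
                return body
        # Missing or stale entry: go back to the origin and store the result.
        body = self._fetch(uri)
        self._store[uri] = (now, body)
        self.misses += 1
        return body


# Hypothetical origin server that returns a new version on every fetch.
origin_calls = []

def fake_origin(uri: str) -> str:
    origin_calls.append(uri)
    return f"{uri} (version {len(origin_calls)})"


# A controllable clock keeps the demonstration deterministic.
current_time = [0.0]
cache = TtlCache(fake_origin, ttl_seconds=10, clock=lambda: current_time[0])

uri = "http://www.foo.com/coolpic.gif"
first = cache.get(uri)     # miss: fetched from the origin
second = cache.get(uri)    # hit: served from the cache
current_time[0] = 11.0     # the stored copy is now older than the lifetime
third = cache.get(uri)     # stale: fetched again
print(cache.hits, cache.misses, third)
```

A shorter lifetime lowers the chance of returning an out-of-date message at the cost of more requests to the origin server, which is the trade-off the cache coherency entry points at.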