Chapter 13: TCP Implementation
High Performance TCP/IP Networking, Hassan-Jain, Prentice Hall

Objectives
- Understand the structure of a typical TCP implementation
- Outline the implementation of extended standards for TCP over high-performance networks
- Understand the sources of end-system overhead in typical TCP implementations, and techniques to minimize them
- Quantify the effect of end-system overhead and buffering on TCP performance
- Understand the role of Remote Direct Memory Access (RDMA) extensions for high-performance IP networking

Contents
- Overview of TCP implementation
- High-performance TCP
- End-system overhead
- Copy avoidance
- TCP offload

Implementation Overview

Overall Structure (RFC 793)
- The internal structure of TCP is specified in RFC 793 (Fig. 13.1)

Data Structure of TCP Endpoint
- Transmission control block (TCB): stores the connection state and related variables
- Transmit queue: buffers containing outstanding data
- Receive queue: buffers for data received but not yet forwarded to the higher layer

Buffering and Data Movement
- Buffer queues reside in the protocol-independent socket layer within the operating system kernel
- The TCP sender upcalls to the transmit queue to obtain data; the TCP receiver notifies the receive queue of the correct arrival of incoming data
- BSD-derived kernels implement buffers as mbufs, which move data by reference and reduce the need to copy
- Most implementations commit buffer space to the queues lazily: the queues consume memory only when the bandwidth of the network does not match the rate at which the TCP user produces or consumes data

User Memory Access
- Provides for movement of data to and from the memory of the TCP user
- Copy semantics: SEND and RECEIVE are defined with copy semantics, so the user can modify a send buffer at the time the SEND is issued
- Direct access: allows TCP to access the user buffers directly, bypassing the copying of data

TCP Data Exchange
- TCP endpoints cooperate by exchanging segments
- Each segment carries a sequence number (seg.seq), the segment data length (seg.len), status bits, an acknowledgement sequence number (seg.ack), and the advertised receive window size (seg.wnd) (Fig. 13.3)

Data Retransmissions
- The TCP sender uses a retransmission timer to drive retransmission of unacknowledged data: a segment is retransmitted if the timer fires
- Retransmission timeout (RTO):
  - RTO < RTT: aggressive; too many spurious retransmissions
  - RTO > RTT: conservative; low utilisation because the connection sits idle
- In practice, an adaptive retransmission timer with back-off is used (specified in RFC 2988)

Congestion Control
- A retransmission event indicates to the TCP sender that the network is congested
- Congestion management is a function of the end-systems: RFC 2581 requires that TCP end-systems respond to congestion by reducing their sending rate
- AIMD (Additive Increase, Multiplicative Decrease): the TCP sender probes for available bandwidth on the network path and, upon detecting congestion, multiplicatively reduces its congestion window (cwnd); this achieves fairness among TCP connections

High Performance TCP

TCP Implementation with High Bandwidth-Delay Product
- High bandwidth-delay products arise on high-speed networks (e.g. optical networks) and high-latency networks (e.g. satellite networks), collectively called Long Fat Networks (LFNs)
- LFNs require window sizes larger than the 16 bits originally defined for TCP
- The window scale option allows a TCP sender to advertise a large window size (e.g. 1 Gbyte); it is specified at connection setup and expresses window sizes in units of up to 16 Kbytes

Round Trip Time Estimation
- The accuracy of RTT estimation depends on frequent sample measurements of RTT
- The percentage of segments sampled decreases with larger windows, which may be insufficient for LFNs
- The timestamp option enables the sender to compute RTT samples and provides a safeguard against accepting out-of-sequence (wrapped) sequence numbers

Path MTU Discovery
- Transfer is most efficient when using the largest MSS that avoids segmentation
- Path MTU discovery enables the TCP sender to automatically discover the largest acceptable MSS
- A TCP implementation must correctly handle dynamic changes to the MSS: it never leaves more than 2*MSS bytes of data unacknowledged, and the sender may need to re-segment data for retransmission

End-System Overhead

Reduce End-System Overhead
- TCP imposes processing overhead in the operating system, which adds directly to latency and consumes a significant share of CPU cycles and memory
- Reducing this overhead can improve application throughput

Relationship Between Bandwidth and CPU Utilization

Achievable Throughput for Host-Limited Systems

Sources of Overhead for TCP/IP
- Per-transfer overhead
- Per-packet overhead
- Per-byte overhead (Fig. 13.5)

Per-Packet Overhead
- Increasing packet size can mitigate the impact of per-packet and per-segment overhead (Fig. 13.6)
- Increasing the segment size S increases achievable bandwidth: as packet size grows, the effect of per-packet overhead becomes less significant
- Interrupts are a significant source of per-packet overhead

Relationship between Packet Size and Achievable Bandwidth

Relationship between Packet Overhead and Bandwidth

Checksum Overhead
- Checksumming is a source of per-byte overhead
- Ways to reduce checksum overhead:
  - Complete multiple steps in a single traversal of the data
  - Integrate checksumming with the data copy
  - Compute the checksum in hardware

Copy Avoidance

Copy Avoidance for High-Performance TCP
- Page remapping: uses virtual memory to reduce copying across the TCP/user interface; typically resides at the socket layer in the OS kernel
- Scatter/gather I/O: does not require copy semantics, but entails a comprehensive restructuring of OS and I/O interfaces
- Remote Direct Memory Access (RDMA): steers incoming data directly into user-specified buffers; IETF standards under way

TCP Offload
- Supports TCP/IP protocol functions directly on the network adapter (NIC)
- TCP checksum offloading significantly reduces per-packet overheads for TCP/IP protocol processing and helps to avoid expensive copy operations
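The per-endpoint state described in the slides (a transmission control block plus transmit and receive queues) can be sketched as a small Python data structure. The field names below are illustrative stand-ins, not the book's; a real TCB holds many more variables.

```python
from dataclasses import dataclass, field
from collections import deque

@dataclass
class TcpEndpoint:
    """Illustrative sketch of per-connection TCP endpoint state
    (field names assumed; a real TCB carries many more variables)."""
    # Transmission control block: connection state and related variables
    state: str = "CLOSED"   # e.g. LISTEN, SYN_SENT, ESTABLISHED, ...
    snd_una: int = 0        # oldest unacknowledged sequence number
    snd_nxt: int = 0        # next sequence number to send
    rcv_nxt: int = 0        # next sequence number expected from the peer
    rcv_wnd: int = 65535    # advertised receive window, in bytes
    # Transmit queue: buffers containing outstanding data
    transmit_queue: deque = field(default_factory=deque)
    # Receive queue: data received but not yet forwarded to the TCP user
    receive_queue: deque = field(default_factory=deque)
```

Keeping the two queues separate from the TCB mirrors the slides' point that the buffer queues live in the protocol-independent socket layer, while the TCB belongs to TCP proper.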
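The adaptive retransmission timer with back-off cited above (RFC 2988) keeps a smoothed RTT and an RTT variance, sets RTO = SRTT + 4*RTTVAR with a one-second floor, and doubles the RTO each time the timer fires. A minimal sketch, with the class name chosen here for illustration:

```python
class RtoEstimator:
    """Adaptive retransmission timeout in the style of RFC 2988."""
    ALPHA = 1 / 8   # gain for the smoothed RTT (RFC 2988 recommended value)
    BETA = 1 / 4    # gain for the RTT variance
    MIN_RTO = 1.0   # RFC 2988 lower bound on the RTO, in seconds

    def __init__(self):
        self.srtt = None   # smoothed round-trip time
        self.rttvar = None # round-trip time variance
        self.rto = 3.0     # initial RTO before any sample (RFC 2988)

    def on_rtt_sample(self, r):
        """Fold a new RTT measurement r (seconds) into the estimator."""
        if self.srtt is None:            # first sample
            self.srtt = r
            self.rttvar = r / 2
        else:                            # subsequent samples
            self.rttvar = (1 - self.BETA) * self.rttvar \
                          + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        self.rto = max(self.MIN_RTO, self.srtt + 4 * self.rttvar)

    def on_timeout(self):
        """Exponential back-off: double the RTO when the timer fires."""
        self.rto *= 2
```

For example, a first RTT sample of 0.5 s gives SRTT = 0.5, RTTVAR = 0.25, and RTO = 1.5 s; a timeout then backs the RTO off to 3.0 s.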
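The AIMD behaviour described above can be sketched in two lines of window arithmetic. This deliberately ignores slow start and fast recovery; the MSS value is an assumption for illustration:

```python
MSS = 1460  # assumed maximum segment size in bytes (illustrative)

def on_ack(cwnd):
    """Additive increase (congestion avoidance): grow cwnd by roughly
    one MSS per round trip, i.e. MSS*MSS/cwnd bytes per ACK received."""
    return cwnd + MSS * MSS // cwnd

def on_congestion(cwnd):
    """Multiplicative decrease: halve cwnd on detecting congestion,
    but never shrink it below one MSS."""
    return max(MSS, cwnd // 2)
```

Because every sender halves its window on congestion but grows it additively, competing connections converge toward equal shares of the path, which is the fairness property the slides mention.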
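The window scale option mentioned above keeps the 16-bit seg.wnd field but interprets it in units of 2^shift bytes, with the shift count (at most 14, i.e. 16-Kbyte units) fixed at connection setup. The arithmetic, with function names chosen here for illustration:

```python
MAX_WINDOW_SHIFT = 14  # largest shift count the window scale option allows

def effective_window(wnd_field, shift):
    """Effective receive window: the 16-bit seg.wnd field is interpreted
    in units of 2**shift bytes; shift is negotiated at connection setup."""
    if not 0 <= wnd_field <= 0xFFFF:
        raise ValueError("seg.wnd is a 16-bit field")
    if not 0 <= shift <= MAX_WINDOW_SHIFT:
        raise ValueError("window scale shift must be in 0..14")
    return wnd_field << shift
```

With the maximum shift, the largest advertisable window is 65535 << 14 = 1,073,725,440 bytes, just under 1 Gbyte, which is the figure the slides give for an LFN-sized window.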
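The relationship between segment size and achievable bandwidth for a host-limited system (Figs. 13.6 and the adjacent plots) can be illustrated with a simple cost model: sending one segment of S bytes costs a fixed per-packet amount plus a per-byte amount of host time. The model and its parameter names are an assumption for illustration, not the book's exact formulation:

```python
def achievable_bandwidth(seg_size, per_packet_cost, per_byte_cost):
    """Host-limited throughput model (illustrative): one segment of
    seg_size bytes costs per_packet_cost + seg_size * per_byte_cost
    seconds of host processing, so throughput is seg_size / cost (B/s).
    As seg_size grows, the per-packet term is amortised away and
    throughput approaches the per-byte ceiling 1 / per_byte_cost."""
    return seg_size / (per_packet_cost + seg_size * per_byte_cost)
```

The model captures both points the slides make: larger segments raise achievable bandwidth, and the benefit tapers off once per-byte costs (copies, checksums) dominate.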
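One of the checksum-overhead remedies listed above is integrating checksumming with the data copy, so the payload is traversed once instead of twice. A sketch of the 16-bit one's-complement Internet checksum accumulated during the copy (in a kernel this single pass would be a tuned copy-and-sum loop, not byte-at-a-time Python):

```python
def copy_with_checksum(src):
    """Copy src (bytes) into a fresh buffer while accumulating the
    Internet checksum (16-bit one's-complement sum) in the same pass."""
    dst = bytearray(len(src))
    total = 0
    for i in range(0, len(src) - 1, 2):
        dst[i] = src[i]
        dst[i + 1] = src[i + 1]
        total += (src[i] << 8) | src[i + 1]   # fold in one 16-bit word
    if len(src) % 2:                          # odd trailing byte, zero-padded
        dst[-1] = src[-1]
        total += src[-1] << 8
    while total >> 16:                        # wrap carries (one's complement)
        total = (total & 0xFFFF) + (total >> 16)
    return bytes(dst), (~total) & 0xFFFF
```

For instance, the words 0x0001 and 0xF203 sum to 0xF204, giving a checksum of 0x0DFB, and the copied buffer is byte-identical to the source.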