Chapter 13: TCP Implementation
High Performance TCP/IP Networking, Hassan-Jain, Prentice Hall

Objectives
- Understand the structure of a typical TCP implementation
- Outline the implementation of extended standards for TCP over high-performance networks
- Understand the sources of end-system overhead in typical TCP implementations, and techniques to minimize them
- Quantify the effect of end-system overhead and buffering on TCP performance
- Understand the role of Remote Direct Memory Access (RDMA) extensions for high-performance IP networking

Contents
- Overview of TCP implementation
- High-performance TCP
- End-system overhead
- Copy avoidance
- TCP offload

Implementation Overview

Overall Structure (RFC 793)
- The internal structure of TCP is specified in RFC 793 (Fig. 13.1)

Data Structure of TCP Endpoint
- Transmission control block (TCB): stores the connection state and related variables
- Transmit queue: buffers containing outstanding data
- Receive queue: buffers for data received but not yet forwarded to the higher layer

Buffering and Data Movement
- Buffer queues reside in the protocol-independent socket layer within the operating system kernel
- The TCP sender upcalls to the transmit queue to obtain data; the TCP receiver notifies the receive queue of the correct arrival of incoming data
- BSD-derived kernels implement buffers as mbufs, which move data by reference and reduce the need to copy
- Most implementations commit buffer space to the queues lazily: the queues consume memory only when the bandwidth of the network does not match the rate at which the TCP user produces or consumes data

User Memory Access
- Provides for movement of data to and from the memory of the TCP user
- Copy semantics: SEND and RECEIVE are defined with copy semantics, so the user can modify a send buffer at the time the SEND is issued
- Direct access: allows TCP to access the user buffers directly, bypassing the copying of data

TCP Data Exchange
- TCP endpoints cooperate by exchanging segments
- Each segment carries a sequence number (seg.seq), the segment data length (seg.len), status bits, an acknowledgement sequence number (seg.ack), and the advertised receive window size (seg.wnd) (Fig. 13.3)

Data Retransmissions
- The TCP sender uses a retransmission timer to drive retransmission of unacknowledged data: a segment is retransmitted if the timer fires
- Retransmission timeout (RTO):
  - RTO < RTT: aggressive; too many spurious retransmissions
  - RTO > RTT: conservative; low utilisation because the connection sits idle
- In practice, an adaptive retransmission timer with back-off is used (specified in RFC 2988)

Congestion Control
- A retransmission event indicates to the TCP sender that the network is congested
- Congestion management is a function of the end-systems: RFC 2581 requires that TCP end-systems respond to congestion by reducing their sending rate
- AIMD (Additive Increase, Multiplicative Decrease): the TCP sender probes for available bandwidth on the network path and, upon detecting congestion, multiplicatively reduces its congestion window (cwnd); this achieves fairness among TCP connections

High Performance TCP

TCP Implementation with High Bandwidth-Delay Product
- High bandwidth-delay products arise on high-speed networks (e.g. optical networks) and high-latency networks (e.g. satellite networks), collectively called Long Fat Networks (LFNs)
- LFNs require window sizes larger than the 16 bits originally defined for TCP
- The window scale option allows a TCP sender to advertise a large window size (e.g. 1 Gbyte); it is specified at connection setup and expresses window sizes in units of up to 16 Kbytes

Round Trip Time Estimation
- The accuracy of RTT estimation depends on frequent sample measurements of RTT
- The percentage of segments sampled decreases with larger windows, which may be insufficient for LFNs
- The timestamp option enables the sender to compute RTT samples and provides a safeguard against accepting out-of-sequence (wrapped) sequence numbers

Path MTU Discovery
- Transfer is most efficient when using the largest MSS that avoids segmentation
- Path MTU discovery enables the TCP sender to automatically discover the largest acceptable MSS
- A TCP implementation must correctly handle dynamic changes to the MSS: it never leaves more than 2*MSS bytes of data unacknowledged, and the sender may need to re-segment data for retransmission

End-System Overhead

Reduce End-System Overhead
- TCP imposes processing overhead in the operating system, which adds directly to latency and consumes a significant share of CPU cycles and memory
- Reducing this overhead can improve application throughput

Relationship Between Bandwidth and CPU Utilization

Achievable Throughput for Host-Limited Systems

Sources of Overhead for TCP/IP
- Per-transfer overhead
- Per-packet overhead
- Per-byte overhead (Fig. 13.5)

Per-Packet Overhead
- Increasing packet size can mitigate the impact of per-packet and per-segment overhead (Fig. 13.6)
- Increasing the segment size S increases achievable bandwidth: as packet size grows, the effect of per-packet overhead becomes less significant
- Interrupts are a significant source of per-packet overhead

Relationship between Packet Size and Achievable Bandwidth

Relationship between Packet Overhead and Bandwidth

Checksum Overhead
- Checksumming is a source of per-byte overhead
- Ways to reduce checksum overhead:
  - Complete multiple steps in a single traversal of the data
  - Integrate checksumming with the data copy
  - Compute the checksum in hardware

Copy Avoidance

Copy Avoidance for High-Performance TCP
- Page remapping: uses virtual memory to reduce copying across the TCP/user interface; typically resides at the socket layer in the OS kernel
- Scatter/gather I/O: does not require copy semantics, but entails a comprehensive restructuring of OS and I/O interfaces
- Remote Direct Memory Access (RDMA): steers incoming data directly into user-specified buffers; IETF standards under way

TCP Offload
- Supports TCP/IP protocol functions directly on the network adapter (NIC)
- TCP checksum offloading significantly reduces per-packet overheads for TCP/IP protocol processing and helps to avoid expensive copy operations
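The per-endpoint state described in the slides (a transmission control block plus transmit and receive queues) can be sketched as a small Python data structure. The field names below are illustrative stand-ins, not the book's; a real TCB holds many more variables.

```python
from dataclasses import dataclass, field
from collections import deque

@dataclass
class TcpEndpoint:
    """Illustrative sketch of per-connection TCP endpoint state
    (field names assumed; a real TCB carries many more variables)."""
    # Transmission control block: connection state and related variables
    state: str = "CLOSED"   # e.g. LISTEN, SYN_SENT, ESTABLISHED, ...
    snd_una: int = 0        # oldest unacknowledged sequence number
    snd_nxt: int = 0        # next sequence number to send
    rcv_nxt: int = 0        # next sequence number expected from the peer
    rcv_wnd: int = 65535    # advertised receive window, in bytes
    # Transmit queue: buffers containing outstanding data
    transmit_queue: deque = field(default_factory=deque)
    # Receive queue: data received but not yet forwarded to the TCP user
    receive_queue: deque = field(default_factory=deque)
```

Keeping the two queues separate from the TCB mirrors the slides' point that the buffer queues live in the protocol-independent socket layer, while the TCB belongs to TCP proper.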
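The adaptive retransmission timer with back-off cited above (RFC 2988) keeps a smoothed RTT and an RTT variance, sets RTO = SRTT + 4*RTTVAR with a one-second floor, and doubles the RTO each time the timer fires. A minimal sketch, with the class name chosen here for illustration:

```python
class RtoEstimator:
    """Adaptive retransmission timeout in the style of RFC 2988."""
    ALPHA = 1 / 8   # gain for the smoothed RTT (RFC 2988 recommended value)
    BETA = 1 / 4    # gain for the RTT variance
    MIN_RTO = 1.0   # RFC 2988 lower bound on the RTO, in seconds

    def __init__(self):
        self.srtt = None   # smoothed round-trip time
        self.rttvar = None # round-trip time variance
        self.rto = 3.0     # initial RTO before any sample (RFC 2988)

    def on_rtt_sample(self, r):
        """Fold a new RTT measurement r (seconds) into the estimator."""
        if self.srtt is None:            # first sample
            self.srtt = r
            self.rttvar = r / 2
        else:                            # subsequent samples
            self.rttvar = (1 - self.BETA) * self.rttvar \
                          + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        self.rto = max(self.MIN_RTO, self.srtt + 4 * self.rttvar)

    def on_timeout(self):
        """Exponential back-off: double the RTO when the timer fires."""
        self.rto *= 2
```

For example, a first RTT sample of 0.5 s gives SRTT = 0.5, RTTVAR = 0.25, and RTO = 1.5 s; a timeout then backs the RTO off to 3.0 s.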
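The AIMD behaviour described above can be sketched in two lines of window arithmetic. This deliberately ignores slow start and fast recovery; the MSS value is an assumption for illustration:

```python
MSS = 1460  # assumed maximum segment size in bytes (illustrative)

def on_ack(cwnd):
    """Additive increase (congestion avoidance): grow cwnd by roughly
    one MSS per round trip, i.e. MSS*MSS/cwnd bytes per ACK received."""
    return cwnd + MSS * MSS // cwnd

def on_congestion(cwnd):
    """Multiplicative decrease: halve cwnd on detecting congestion,
    but never shrink it below one MSS."""
    return max(MSS, cwnd // 2)
```

Because every sender halves its window on congestion but grows it additively, competing connections converge toward equal shares of the path, which is the fairness property the slides mention.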
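The window scale option mentioned above keeps the 16-bit seg.wnd field but interprets it in units of 2^shift bytes, with the shift count (at most 14, i.e. 16-Kbyte units) fixed at connection setup. The arithmetic, with function names chosen here for illustration:

```python
MAX_WINDOW_SHIFT = 14  # largest shift count the window scale option allows

def effective_window(wnd_field, shift):
    """Effective receive window: the 16-bit seg.wnd field is interpreted
    in units of 2**shift bytes; shift is negotiated at connection setup."""
    if not 0 <= wnd_field <= 0xFFFF:
        raise ValueError("seg.wnd is a 16-bit field")
    if not 0 <= shift <= MAX_WINDOW_SHIFT:
        raise ValueError("window scale shift must be in 0..14")
    return wnd_field << shift
```

With the maximum shift, the largest advertisable window is 65535 << 14 = 1,073,725,440 bytes, just under 1 Gbyte, which is the figure the slides give for an LFN-sized window.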
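The relationship between segment size and achievable bandwidth for a host-limited system (Figs. 13.6 and the adjacent plots) can be illustrated with a simple cost model: sending one segment of S bytes costs a fixed per-packet amount plus a per-byte amount of host time. The model and its parameter names are an assumption for illustration, not the book's exact formulation:

```python
def achievable_bandwidth(seg_size, per_packet_cost, per_byte_cost):
    """Host-limited throughput model (illustrative): one segment of
    seg_size bytes costs per_packet_cost + seg_size * per_byte_cost
    seconds of host processing, so throughput is seg_size / cost (B/s).
    As seg_size grows, the per-packet term is amortised away and
    throughput approaches the per-byte ceiling 1 / per_byte_cost."""
    return seg_size / (per_packet_cost + seg_size * per_byte_cost)
```

The model captures both points the slides make: larger segments raise achievable bandwidth, and the benefit tapers off once per-byte costs (copies, checksums) dominate.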
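One of the checksum-overhead remedies listed above is integrating checksumming with the data copy, so the payload is traversed once instead of twice. A sketch of the 16-bit one's-complement Internet checksum accumulated during the copy (in a kernel this single pass would be a tuned copy-and-sum loop, not byte-at-a-time Python):

```python
def copy_with_checksum(src):
    """Copy src (bytes) into a fresh buffer while accumulating the
    Internet checksum (16-bit one's-complement sum) in the same pass."""
    dst = bytearray(len(src))
    total = 0
    for i in range(0, len(src) - 1, 2):
        dst[i] = src[i]
        dst[i + 1] = src[i + 1]
        total += (src[i] << 8) | src[i + 1]   # fold in one 16-bit word
    if len(src) % 2:                          # odd trailing byte, zero-padded
        dst[-1] = src[-1]
        total += src[-1] << 8
    while total >> 16:                        # wrap carries (one's complement)
        total = (total & 0xFFFF) + (total >> 16)
    return bytes(dst), (~total) & 0xFFFF
```

For instance, the words 0x0001 and 0xF203 sum to 0xF204, giving a checksum of 0x0DFB, and the copied buffer is byte-identical to the source.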