Internet Measurement Basics
• Measurement Overview and Internet Challenges
  – Why measure? Why model measurements?
  – What to measure? Where to measure?
• Measurement tools
  – Active: ping, traceroute, and pathchar
  – Passive: logs, SNMP, packet, and flow monitoring
• Two Case Studies:
  – traceroute-based routing behavior measurement [Pa97]
  – OSPF-based passive monitoring of intra-domain routing [AG04]
• Operational applications of measurement
Readings: Please do the required readings
CSci5221: Internet Measurement Basics

Why Measure?
• The Internet is a man-made system, so why do we need to measure it?
  – Because we still don’t really understand it
  – Because sometimes things go wrong
• Measurement for network operations
  – Reliability analysis, traffic engineering, capacity planning
    • Better and more efficient management of network resources
    • Detecting, diagnosing and predicting problems
    • What-if analysis of future changes
• Measurement for scientific discovery
  – Characterizing a complex system as an organism
  – Creating accurate models that represent reality
  – Identifying new features and phenomena

Why Build Models of Measurements?
• Compact summary of measurements
  – Efficient way to represent a large data set
  – E.g., exponential distribution with mean 100 sec
• Expose important properties of measurements
  – Reveals underlying cause or engineering question
  – E.g., mean RTT to help explain TCP throughput
• Generate random but realistic data as input
  – Generate new data that agree in key properties
  – E.g., topology models to feed into simulators
“All models are wrong, but some models are useful.” – George Box

What Can be Measured?
• Traffic
  – Load statistics
  – Packet or flow traces
• Performance of paths
  – Application performance, e.g.,
Web download time
  – Transport performance, e.g., TCP bulk throughput
  – Network performance, e.g., packet delay and loss
• Network structure
  – Topology, and paths on the topology
  – Dynamics of the routing protocol

Where to Measure, and How?
• Short answer
  – Anywhere you can!
• End hosts
  – Application logs, e.g., Web server logs
  – Sending active probes to measure performance
• Individual links/routers
  – Load statistics, packet traces, flow traces
  – Configuration state
  – Routing-protocol messages or table dumps
  – Alarms
• How: Active vs. Passive Measurement
  – First understand some measurement challenges

Internet Challenges Make Measurement an Art
• Stateless routers
  – Routers do not routinely store packet/flow state
  – Measurement is an afterthought, adds overhead
• IP narrow waist
  – IP measurements cannot see below network layer
  – E.g., link-layer retransmission, tunnels, etc.
• Violations of end-to-end argument
  – E.g., firewalls, address translators, and proxies
  – Not directly visible, and may block measurements
• Decentralized control
  – Autonomous Systems may block measurements
  – No global notion of time

Active Measurement Example: Ping
• Adding traffic for purposes of measurement
  – Trade-offs between accuracy and overhead
  – Need careful methods to avoid introducing bias
• Ping
  – Host sends an ICMP ECHO packet to a target
  – … and captures the ICMP ECHO REPLY
  – Useful for checking connectivity, and RTT
  – Only requires control of one of the two end-points
• Problems with ping
  – Round-trip rather than one-way delays
  – Some hosts might not respond

Active Measurement Example: Pathchar for Links
$rtt(i+1) - rtt(i) = d + L/c + \epsilon$
  i: initial TTL value
  c: link capacity
  L: packet size
Three delay components of rtt(i+1) - rtt(i):
  d: propagation delay
  L/c: transmission delay
  $\epsilon$: queueing delay + noise
How to infer d, c?
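One way to approach the question above is sketched below. It assumes repeated probes per packet size, keeps the minimum RTT difference for each size (on the assumption that the minimum saw almost no queueing), and fits a straight line through the minima: the slope estimates 1/c and the intercept estimates d. The function names, the synthetic link (5 ms, 10 Mbit/s), and the use of ordinary least squares are illustrative assumptions, not a description of the actual pathchar implementation.

```python
def min_filter(samples):
    """Keep the smallest RTT difference seen for each packet size.

    samples maps packet size in bytes to a list of rtt(i+1) - rtt(i)
    observations in seconds.
    """
    return {size: min(values) for size, values in samples.items()}


def fit_link(min_by_size):
    """Least-squares line through (L, min RTT difference).

    Returns (d, c): intercept d in seconds, capacity c in bits per second.
    Sizes are in bytes, so the slope is in seconds per byte and c = 8 / slope.
    """
    sizes = sorted(min_by_size)
    delays = [min_by_size[s] for s in sizes]
    n = len(sizes)
    mean_l = sum(sizes) / n
    mean_t = sum(delays) / n
    sxx = sum((l - mean_l) ** 2 for l in sizes)
    sxy = sum((l - mean_l) * (t - mean_t) for l, t in zip(sizes, delays))
    slope = sxy / sxx
    d = mean_t - slope * mean_l
    return d, 8.0 / slope


# Synthetic link (hypothetical values): d = 5 ms, c = 10 Mbit/s.
TRUE_D = 0.005
TRUE_C = 10e6
samples = {
    size: [TRUE_D + 8 * size / TRUE_C + queueing
           for queueing in (0.0, 0.0004, 0.0021)]
    for size in (100, 500, 1000, 1500)
}

d_est, c_est = fit_link(min_filter(samples))
print(f"d = {d_est * 1e3:.3f} ms, c = {c_est / 1e6:.3f} Mbit/s")
```

On real paths the minimum filter needs many probes per size, since a single queued sample at every size would bias both estimates upward.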
[Figure: minimum RTT versus packet size L; the line has slope 1/c and intercept d]

Active Measurement Example: Traceroute
• Time-To-Live field in IP packet header
  – Source sends a packet with a TTL of n
  – Each router along the path decrements the TTL
  – “TTL exceeded” sent when TTL reaches 0
• Traceroute tool exploits this TTL behavior
[Figure: source sends probes with TTL=1 and TTL=2 toward the destination; “Time exceeded” messages return to the source]
Send packets with TTL=1, 2, 3, … and record source of “time exceeded” message

Challenges of Traceroute
• Measuring multiple paths
  – Successive probes may traverse different paths
• Non-participating network elements
  – Some routers and firewalls don’t reply
• Inaccurate delay information
  – Includes processing delays on the router CPU
• Round-trip vs. one-way measurements
  – Paths may have asymmetric properties
• Interfaces, not routers
  – Returns IP address of interfaces, not routers

Applications of Traceroute
• Network troubleshooting
  – Identify forwarding loops and black holes
  – Identify long and convoluted paths
  – See how far the probe packets get
• Network topology inference
  – Launch traceroute probes from many places
  – … toward many destinations
  – Join together to fill in parts of the topology
  – … though traceroute undersamples the edges

Paxson Study: Forwarding Loops
• Forwarding loop
  – Packet returns to same router multiple times
• May cause traceroute to show a loop
  – If loop lasted long enough
  – So many packets traverse the loopy path
• Traceroute may reveal false loops
  – Path change that leads to a longer path
  – Causing later probe packets to hit same nodes
• Heuristic solution
  – Require traceroute to return same path 3 times

Paxson Study: Causes of Loops
• Transient vs. persistent
  – Transient: routing-protocol convergence
  – Persistent: likely configuration problem
• Challenges
  – Appropriate time boundary between the two?
  – What about flaky equipment going up and down?
  – Determining the cause of persistent loops?
• Anecdote on recent study of persistent loops
  – Provider has static route for customer prefix
  – Customer has default route to the provider

Paxson Study: Path Fluttering
• Rapid changes between paths
  – Multiple paths between a pair of hosts
  – Load balancing policies inside the network
• Packet-based load balancing
  – Round-robin or random
  – Multiple paths for packets in a single flow
• Flow-based load balancing
  – Hash of some fields in the packet header
  – E.g., IP addresses, port numbers, etc.
  – To keep packets in a flow on one path

Paxson Study: Routing Stability
• Route prevalence
  – Likelihood of observing a particular route
  – Relatively easy to measure with sound sampling
  – Poisson arrivals see time averages (PASTA)
  – Most host pairs have a dominant route
• Route persistence
  – How long a route endures before a change
  – Much harder to measure through active probes
  – Look for cases of multiple observations
  – Typical host pair has path persistence of a week

Paxson Study: Route Asymmetry
• Hot-potato routing
[Figure: Customer A, Provider A, Provider B, Customer B; multiple peering points; early-exit routing]
• Other causes
  – Asymmetric link weights in intradomain routing
  – Cold-potato routing, where an AS requests that traffic enter at a particular place
• Consequences
  – Lots of asymmetry
  – One-way delay is not necessarily half of the round-trip time

Passive Measurement Example: Logs at Hosts
• Web server logs
  – Host, time, URL, response code, content length, …
  – E.g.,
    122.345.131.2 - - [15/Oct/1998:00:00:25 -0400] "GET /images/wwwtlogo.gif HTTP/1.0" 304 "http://www.aflcio.org/home.htm" "Mozilla/2.0 (compatible; MSIE 3.02; Update a; AK; AOL 4.0; Windows 95)" "-"
• DNS logs
  – Request, response, time
• Useful for workload characterization, troubleshooting, etc.
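As a rough illustration of how such a Web server log entry can be turned into fields for workload characterization, the sketch below parses the example line from the slide. The regular expression, the `parse_line` helper, and the choice to treat the content length as optional (the example line shows none after the 304 status) are assumptions made for this sketch; real server log formats vary with configuration.

```python
import re
from datetime import datetime

# Hypothetical pattern for a common-log-style entry; the size field is
# optional because the slide's example has no byte count after the status.
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3})(?: (?P<size>\d+|-))?'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    size = m.group("size")
    return {
        "host": m.group("host"),
        # e.g. 15/Oct/1998:00:00:25 -0400 (day/month/year:time, UTC offset)
        "time": datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "method": m.group("method"),
        "url": m.group("url"),
        "status": int(m.group("status")),
        "size": int(size) if size and size != "-" else None,
    }


SAMPLE = (
    '122.345.131.2 - - [15/Oct/1998:00:00:25 -0400] '
    '"GET /images/wwwtlogo.gif HTTP/1.0" 304 '
    '"http://www.aflcio.org/home.htm" '
    '"Mozilla/2.0 (compatible; MSIE 3.02; Update a; AK; AOL 4.0; Windows 95)" "-"'
)

record = parse_line(SAMPLE)
print(record["host"], record["time"].isoformat(), record["url"], record["status"])
```

Returning None for lines that do not match keeps a batch job running when logs contain truncated or malformed entries.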
“Passive” Traffic Measurement
• Packet-level:
  – Tcpdump: software based
  – Special hardware packet collectors
• Flow-level:
  – Cisco Netflow; other vendors have similar facility
  – 5-tuple flow: srcIP, dstIP, srcPort, dstPort, protocol
    • use a time-out value to “terminate” a flow
    • statistics collected: start/end time, packet/byte counts
  – Sampling may be used for scalability
• Link-level:
  – SNMP traffic statistics, often over 5-min interval
  – IETF MIB (management information base)
    • Byte counts, packet counts, etc.
• Pros and cons of each?

Passive Measurement: SNMP
• Simple Network Management Protocol
  – Coarse-grained counters on the router
  – E.g., byte and packet counts
• Polling
  – Management system can poll the counters
  – E.g., once every five minutes
• Limitations
  – Extremely coarse-grained statistics
  – Delivered over UDP!
• Advantages: ubiquitous

Passive Measurement: Packet Monitoring
• Tapping a link
  – Shared media (Ethernet, wireless)
  – Multicast switch
  – Splitting a point-to-point link
  – Line card that does packet sampling
[Figure: monitor placement for each option, with Hosts A, B, and C, a switch, Routers A and B, and the monitor]

Packet Monitoring: Selecting the Traffic
• Filter to focus on a subset of the packets
  – IP addresses/prefixes (e.g., to/from specific Web sites, client machines, DNS servers, mail servers)
  – Protocol (e.g., TCP, UDP, or ICMP)
  – Port numbers (e.g., HTTP, DNS, BGP, Napster)
• Collect first n bytes of packet (snap length)
  – Medium access control header (if present)
  – IP header (typically 20 bytes)
  – IP+UDP header (typically 28 bytes)
  – IP+TCP header (typically 40 bytes)
  – Application-layer message (entire packet)

Analysis of Packet Traces
• IP header
  – Traffic volume by IP addresses or protocol
  – Burstiness of the stream of packets
  – Packet properties
(e.g., sizes, out-of-order, etc.)
• TCP header
  – Traffic breakdown by application (e.g., Web)
  – TCP congestion and flow control
  – Number of bytes and packets per session
• Application header
  – URLs, HTTP headers (e.g., cacheable response?)
  – DNS queries and responses, user key strokes, …

Packet vs. Flow Measurement
• Basic statistics (available from both techniques)
  – Traffic mix by IP addresses, port numbers, and protocol
  – Average packet size
• Traffic over time
  – Both: traffic volumes on a medium-to-large time scale
  – Packet: burstiness of the traffic on a small time scale
• Statistics per TCP connection
  – Both: number of packets & bytes transferred over the link
  – Packet: frequency of lost or out-of-order packets, and the number of application-level bytes delivered
• Per-packet info (available only from packet traces)
  – TCP seq/ack #s, receiver window, per-packet flags, …
  – Probability distribution of packet sizes
  – Application-level header and body (full packet contents)

Network Topology Measurement
• Use traceroute
  – Pros
    • Can be done at end hosts
    • “router-level” topology
    • Can obtain a “sample” of the “global” Internet topology
  – Cons
    • Active measurement, incurs overhead/load on routers
    • Not all routers respond to traceroutes
    • IP address aliasing problem
      – Also MPLS tunnels may “obscure” real topology
    • Only “sampled”, or “snapshots”
• BGP routing data
  – “global” AS-level topology
  – Partial view, unless you can get BGP data from all BGP routers
• ISP topology
  – If you are the ISP operator, an easier task, but not necessarily an easy task

OSPF Protocol: A Quick Recap
• Link-state protocol
  – Routers flood Link State Advertisements (LSAs)
  – Routers compute shortest paths based on weights
  – Routers identify next-hop to reach other routers
[Figure: example topology annotated with link weights]
CSci5221: Network Failures and Fast Convergence

Measurement: Intradomain Route Monitoring
• OSPF is a flooding protocol
  – Every link-state advertisement is sent on every link
  – Very helpful for simplifying the monitor
• Can participate in the protocol
  – Shared media (e.g., Ethernet)
    • Join multicast group and listen to LSAs
  – Point-to-point links
    • Establish an adjacency with a router
• … or passively monitor packets on a link
  – Tap a link and capture the OSPF packets

Intradomain Route Monitoring
• Construct continuous view of topology
  – Detect when equipment goes up or down
  – Input to traffic-engineering and planning tools
• Detect routing anomalies
  – Identify failures, LSA storms, and route flaps
  – Verify that LSA load matches expectations
  – Flag strange weight settings as misconfigurations
• Analyze convergence delay
  – Monitor LSAs in multiple locations
  – Compare the times when LSAs arrive
• Detect router implementation mistakes

Passive Collection of LSAs
• OSPF is a flooding protocol
  – Every LSA is sent on every participating link
  – Very helpful for simplifying the monitor
• Can participate in the protocol
  – Shared media (e.g., Ethernet)
    • Join multicast group and listen to LSAs
  – Point-to-point links
    • Establish an adjacency with a router
• … or passively monitor packets on a link
  – Tap a link and capture the OSPF packets
• Note LSAs do not tell us the “root causes” of failures!
  – need to gather route configurations, syslogs, …
  – need to dig below IP: link/physical layers, …

Reducing Volume of Information
• Prioritizing the messages
  – Router failure over router recovery
  – Link failure or weight change over a refresh
  – Informational messages about weight settings
• Grouping related messages
  – Link failure: group messages for the two ends
  – Router failure: group the affected links
  – Common failure: group links failing close in time

Anomalies Found in Shaikh04 paper
• Intermittent hardware problem
  – Router periodically losing OSPF adjacencies
  – Risk of network partition if 2nd failure occurred
• External link flaps
  – Congestion on edge link causing lost messages
  – Lost adjacency leading to flapping routes
• Configuration errors
  – Two routers assigned the same IP address
  – Inefficient config leading to duplicate LSAs
• Vendor implementation bug
  – More frequent refreshing of LSAs than specified

Measurement Challenges for Operators
• Network-wide view
  – Crucial for evaluating control actions
  – Multiple kinds of data from multiple locations
• Large scale
  – Large number of high-speed links and routers
  – Large volume of measurement data
• Poor state-of-the-art
  – Working within existing protocols and products
  – Technology not designed with measurement in mind
• The “do no harm” principle
  – Don’t degrade router performance
  – Don’t require disabling key router features
  – Don’t overload the network with measurement data

Network Operations Tasks
• Reporting of network-wide statistics
  – Generating basic information about usage and reliability
• Performance/reliability troubleshooting
  – Detecting and diagnosing anomalous events
• Security
  – Detecting, diagnosing, and blocking security problems
• Traffic engineering
  – Adjusting network configuration to the
prevailing traffic
• Capacity planning
  – Deciding where and when to install new equipment

Basic Reporting
• Producing basic statistics about the network
  – For business purposes, network planning, ad hoc studies
• Examples
  – Proportion of transit vs. customer-customer traffic
  – Total volume of traffic sent to/from each private peer
  – Mixture of traffic by application (Web, Napster, etc.)
  – Mixture of traffic to/from individual customers
  – Usage, loss, and reliability trends for each link
• Requirements
  – Network-wide view of basic traffic and reliability statistics
  – Ability to “slice and dice” measurements in different ways (e.g., by application, by customer, by peer, by link type)

Troubleshooting
• Detecting and diagnosing problems
  – Recognizing and explaining anomalous events
• Examples
  – Why a backbone link is suddenly overloaded
  – Why the route to a destination prefix is flapping
  – Why DNS queries are failing with high probability
  – Why a route processor has high CPU utilization
  – Why a customer cannot reach certain Web sites
• Requirements
  – Network-wide view of many protocols and systems
  – Diverse measurements at different protocol levels
  – Thresholds for isolating significant phenomena

Security
• Detecting and diagnosing problems
  – Recognizing suspicious traffic or disruptions
• Examples
  – Denial-of-service attack on a customer or service
  – Spread of a worm or virus through the network
  – Route hijack of an address block by adversary
• Requirements
  – Detailed measurements from multiple places
  – Including deep-packet inspection, in some cases
  – Online analysis of the data
  – Installing filters to block the offending traffic

Traffic Engineering
• Adjusting resource allocation policies
  – Path selection, buffer management, and link scheduling
• Examples
  – OSPF weights to divert traffic from congested
links
  – BGP policies to balance load on peering links
  – Link-scheduling weights to reduce delay for “gold” traffic
• Requirements
  – Network-wide view of the traffic carried in the backbone
  – Timely view of the network topology and configuration
  – Accurate models to predict impact of control operations (e.g., the impact of RED parameters on TCP throughput)

Capacity Planning
• Deciding whether to buy/install new equipment
  – What? Where? When?
• Examples
  – Where to put the next backbone router
  – When to upgrade a link to higher capacity
  – Whether to add/remove a particular peer
  – Whether the network can accommodate a new customer
  – Whether to install a caching proxy for cable modems
• Requirements
  – Projections of future traffic patterns from measurements
  – Cost estimates for buying/deploying the new equipment
  – Model of the potential impact of the change (e.g., latency reduction and bandwidth savings from a caching proxy)

Examples of Public Data Sets
• Network-wide data
  – Abilene and GEANT backbones
  – Netflow, IGP, and BGP traces
• CAIDA DatCat
  – Data catalogue maintained by CAIDA
  – http://imdc.datcat.org/
• Interdomain routing
  – RouteViews and RIPE-NCC
  – BGP routing tables and update messages
• Traceroute and looking glass servers
  – http://www.traceroute.org/
  – http://www.nanog.org/lookingglass.html