Network Control and Management in the 100x100 Architecture
Rethinking Network Control & Management: The Case for a New 4D Architecture
David A. Maltz, Carnegie Mellon University/Microsoft Research
Joint work with Albert Greenberg, Gisli Hjalmtysson, Andy Myers, Jennifer Rexford, Geoffrey Xie, Hong Yan, Jibin Zhan, Hui Zhang

The Role of Network Control and Management
• Many different network environments
  – Access, backbone networks
  – Data-center networks, enterprise/campus
  – Sizes: 10-10,000 routers/switches
• Many different technologies
  – Longest-prefix routing (IP), fixed-width routing (Ethernet), label switching (MPLS, ATM), circuit switching (optical, TDM)
• Many different policies
  – Routing, reachability, transit, traffic engineering, robustness
The control plane software binds these elements together and defines the network

We Can Change the Control Plane!
• Pre-existing industry trend towards separating router hardware from software
  – IETF: FORCES, GSMP, GMPLS
  – SoftRouter [Lakshman, HotNets’04]
• Incremental deployment path exists
  – Individual networks can upgrade their control planes and gain benefits
  – Small enterprise networks have most to gain
  – No changes to end-systems required

A Clean-slate Design
• What are the fundamental causes of network problems?
• How to secure the network and protect the infrastructure?
• How to provide flexibility in defining management logic?
• What functionality needs to be distributed, and what can be centralized?
• How to reduce/simplify the software in networks?
  – What would a “RISC” router look like?
• How to leverage technology trends?
  – CPU and link-speed growing faster than # of switches

Three Principles for Network Control & Management
• Network-level Objectives: express goals explicitly
  – Security policies, QoS, egress point selection
  – Do not bury goals in box-specific configuration
• Network-wide Views: design network to provide timely, accurate info
  – Topology, traffic, resource limitations
  – Give logic the inputs it needs
• Direct Control: allow logic to directly set forwarding state
  – FIB entries, packet filters, queuing parameters
  – Logic computes desired network state, so let it implement it
[Figure: management logic takes a reachability matrix and traffic engineering rules as input, reads state info from the network, and writes state back to it]

Overview of the 4D Architecture
[Figure: four planes (Decision, Dissemination, Discovery, Data), annotated with network-level objectives, network-wide views, and direct control]
• Decision Plane
  – All management logic implemented on centralized servers making all decisions
  – Decision Elements (DEs) use views to compute data plane state that meets objectives, then directly write this state to routers
• Dissemination Plane
  – Provides a robust communication channel to each router, and robustness is the only goal!
  – May run over same links as user data, but logically separate and independently controlled
• Discovery Plane
  – Each router discovers its own resources and its local environment
  – E.g., the identity of its immediate neighbors
• Data Plane
  – Spatially distributed routers/switches
  – Can deploy with today’s technology
  – Looking at ways to unify forwarding paradigms across technologies

Concerns and Challenges
• Distributed Systems issues
  – How will communication between routers and Decision Elements (DEs) survive failures in the network?
  – Latency means the DE’s view of the network is behind reality. Will the control loop be stable?
  – What is the overhead to/from the DEs?
  – What happens in a network partition?
• Networking issues
  – Does the 4D simplify control and management?
  – Can we create logic to meet multiple objectives?
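
The decision-plane behaviour outlined in the architecture overview (network-wide view and objectives in, forwarding state out) might be sketched as follows. This is a minimal illustration, not the prototype's algorithm: the function names, the dictionary layout of the view, the use of breadth-first shortest paths, and the choice to place drop filters at the source subnet's attachment router are all assumptions made for this example.

```python
from collections import deque


def next_hops_toward(topology, dst_router):
    """Breadth-first search outward from dst_router.

    Returns {router: neighbor to forward to in order to reach dst_router}.
    """
    next_hop = {}
    visited = {dst_router}
    queue = deque([dst_router])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(topology[node]):
            if neighbor not in visited:
                visited.add(neighbor)
                next_hop[neighbor] = node  # neighbor reaches dst via node
                queue.append(neighbor)
    return next_hop


def compute_data_plane_state(topology, attachments, allowed):
    """Hypothetical decision logic.

    topology:    {router: set of neighboring routers} (network-wide view)
    attachments: {subnet: router the subnet hangs off}
    allowed:     set of (src_subnet, dst_subnet) pairs (reachability matrix)
    Returns (fib, filters), the state a DE would write to the routers.
    """
    fib = {router: {} for router in topology}
    filters = {router: [] for router in topology}
    for subnet, home in attachments.items():
        fib[home][subnet] = "local"
        for router, hop in next_hops_toward(topology, home).items():
            fib[router][subnet] = hop
    # Drop disallowed traffic at the router where the source attaches.
    for src, src_router in attachments.items():
        for dst in attachments:
            if src != dst and (src, dst) not in allowed:
                filters[src_router].append((src, dst))
    return fib, filters


# Made-up three-router line network: R1 - R2 - R3.
view = {"R1": {"R2"}, "R2": {"R1", "R3"}, "R3": {"R2"}}
attachments = {"A": "R1", "B": "R3", "C": "R2"}
allowed = {("A", "B"), ("B", "A"), ("A", "C"), ("C", "A")}

fib, filters = compute_data_plane_state(view, attachments, allowed)
print(fib)
print(filters)
```

Because routes and filters are derived from the same view in one pass, a topology change simply means recomputing both together, which is the property the Direct Control principle is after.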

The Feasibility of the 4D Architecture
We designed and built a prototype of the 4D Architecture
• 4D Architecture permits many designs; the prototype is a single, simple design point
• Decision plane
  – Contains logic to simultaneously compute routes and enforce reachability matrix
  – Multiple Decision Elements per network, using simple election protocol to pick master
• Dissemination plane
  – Uses source routes to direct control messages
  – Extremely simple, but can route around failed data links

Evaluation of the 4D Prototype
• Evaluated using Emulab (www.emulab.net)
  – Linux PCs used as routers (650-800 MHz)
• Tested on 9 enterprise network topologies (10-100 routers each)
[Figure: example network with 49 switches and 5 DEs]

Performance of the 4D Prototype
Trivial prototype has performance comparable to well-tuned production networks
• Recovers from single link failure in < 300 ms
  – < 1 s response considered “excellent”
  – Faster forwarding reconvergence possible
• Survives failure of master Decision Element
  – New DE takes control within 1 s
  – No disruption unless second fault occurs
• Gracefully handles complete network partitions
  – Less than 1.5 s of outage

Fundamental Problem: Wrong Abstractions
[Figure: management plane tools (shell scripts, traffic engineering, planning tools, databases, configs, SNMP, netflow, modems) above per-router control planes (OSPF, BGP, link metrics, routing policies) and per-router data planes (FIB, packet filters)]
• Management Plane
  – Figure out what is happening in network
  – Decide how to change it
• Control Plane
  – Multiple routing processes on each router
  – Each router with different configuration program
  – Huge number of control knobs: metrics, ACLs, policy
• Data Plane
  – Distributed routers
  – Forwarding, filtering, queueing
  – Based on FIB or labels

Good Abstractions Reduce Complexity
[Figure: today's management plane, control plane, and data plane (configs; FIBs, ACLs) next to the 4D decision plane, dissemination, and data plane (FIBs, ACLs)]
All decision making logic lifted out of control plane
• Eliminates duplicate logic in management plane
• Dissemination plane provides robust communication to/from data plane switches

Today: Simple Things are Hard to Do
[Figure: destination D reached through access networks and inter-POP links]

Fundamental Problem: Configurations Allow Too Many Degrees of Freedom
• Computing configuration files that cause control plane to compute desired forwarding states is intractable
  – NP-hard in many cases
  – Requires predictive model of control plane behavior
• Configuration files form a program that defines a set of forwarding states
  – Very hard to create program that permits only desired states, and doesn’t transit through bad ones
[Figure: forwarding states allowed by configs; auto-adaptation leads to/through bad states; Direct Control avoids bad states]

Fundamental Problem: Conflation of Issues
• Ideal case: all routing information flooded to all routers inside network
  – Robustness achieved via flooding
• Reality: routing information filtered and aggregated extensively
  – Route filtering used to implement security and resource policies
  – Route aggregation used to achieve scalability

4D Separates Distributed Computing Issues from Networking Issues
• Distributed computing issues -> protocols and network architecture
  – Overhead
  – Resiliency
  – Scalability
• Networking issues -> management logic
  – Traffic engineering and service provisioning
  – Egress point selection
  – Reachability control (VPNs)
  – Precomputation of backup paths

Future Work
• Scalability
  – Evaluate over 1-10K switches, 10-100K routes
  – Networks with backbone-like propagation delays
• Structuring decision logic
  – Arbitrate among multiple, potentially competing objectives
  – Unify control when some logic takes longer than others
• Protocol improvements
  – Better dissemination and discovery planes
• Deployment in today’s networks
  – Data center, enterprise, campus, backbone (RCP)
• Experiment with network appliances
  – Traffic shapers, traffic scrubbers
• Expand relationships with security
  – Using 4D as mechanism for monitoring/quarantine
• Formulate models that establish bounds of 4D
  – Scale, latency, stability, failure models, objectives
• Generate evidence to support/refute principles

Questions?

Direct Control Provides Complete Control
• Zero device-specific configuration
• Supports many models for “pushing” routes
  – Trivial push: convergence requires time for all updates to be received and applied (same as today)
  – Synchronized update: updates propagated, but not applied till agreed time in the future; clock skew defines convergence time
  – Controlled state trajectory: DE serializes updates to avoid all incorrect transient states

Fundamental Problem: Wrong Abstractions
    interface Ethernet0
     ip address 6.2.5.14 255.255.255.128
    interface Serial1/0.5 point-to-point
     ip address 6.2.2.85 255.255.255.252
     ip access-group 143 in
     frame-relay interface-dlci 28
    access-list 143 deny 1.1.0.0/16
    access-list 143 permit any
    route-map 8aTzlvBrbaW deny 10
     match ip address 4
    route-map 8aTzlvBrbaW permit 20
     match ip address 7
    ip route 10.2.2.1/16 10.2.1.7
    router ospf 64
     redistribute connected subnets
     redistribute bgp 64780 metric 1 subnets
     network 66.251.75.128 0.0.0.127 area 0
    router bgp 64780
     redistribute ospf 64 match route-map 8aTzlvBrbaW
     neighbor 66.253.160.68 remote-as 12762
     neighbor 66.253.160.68 distribute-list 4 in

Fundamental Problem: Wrong Abstractions
[Figure: size of configuration files in a single enterprise network (881 routers); x-axis: router ID (sorted by file size), 0 to 881; y-axis: lines in config file, 0 to 2000]

Fundamental Problem: Conflating Distributed Systems Issues with Networking Issues
[Figure: graph of routing processes propagating routes toward destination D]
• Distributed Systems Concern: resiliency to link failures
  – Solution: multiple paths through routing process graph
• Networking Concern: implement resource or security policy
  – Solution: restrict flow of routing information, filter routes, summarize/aggregate routes

4D Supports Network Evolution & Expansion
• Decision logic can be upgraded as needed
  – No need for update of distributed protocols implemented in software distributed on every switch
• Decision Elements can be upgraded as needed
  – Network expansion requires upgrades only to DEs, not every switch

Reachability Example
[Figure: routers R1-R5 connecting Chicago (chi) and New York (nyc), each site with a Data Center and a Front Office; subnets chi-DC, chi-FO, nyc-DC, nyc-FO]
• Two locations, each with data center & front office
• All routers exchange routes over all links
[Figure: the same network with two packet filters installed, one with rules "Drop nyc-FO -> *; Permit *" and one with rules "Drop chi-FO -> *; Permit *"]
• A new short-cut link added between data centers
  – Intended for backup traffic between centers
• Oops: new link lets packets violate security policy!
  – Routing changed, but
  – Packet filters don’t update automatically
[Figure: Prohibiting Packets from chi-FO to nyc-DC]
• Typical response: add more packet filters to plug the holes in security policy
• Packet filters have surprising consequences
  – Consider a link failure
  – chi-FO and nyc-FO still connected
• Network has less survivability than topology suggests
  – chi-FO and nyc-FO still connected
  – But packet filter means no data can flow!
  – Probing the network won’t predict this problem
[Figure: Allowing Packets from chi-FO to nyc-FO]

Multiple Interacting Routing Processes
[Figure: client and server connected across routers running OSPF and BGP instances, each with its own FIB, joined by Policy1 and Policy2, with a link to the Internet]

The Routing Instance Graph of an 881 Router Network
[Figure]

[Figure: Reconvergence Time Under Single Link Failure]
[Figure: Reconvergence Time When Master DE Crashes]
[Figure: Reconvergence Time When Network Partitions]

Many Implementations Possible
• Single redundant decision engine
• Multiple decision engines
  – Hot stand-by
  – Divide network & load share
• Distributed decision engines
  – Up to one per router
Choice can be based on reliability requirements
• Dissemination plane can be in-band, or leverage OOB links
Less need for distributed solutions (harder to reason about)
• More focus on network issues, less on distributed protocols

Direct Expression Enables New Algorithms
• OSPF normally calculates a single path to each destination D
  – OSPF allows load-balancing only for equal-cost paths to avoid loops
  – Using ECMP requires careful engineering of link weights
• Decision Plane with network-wide view can compute multiple paths
  – “Backup paths” installed for free!
  – Bounded stretch, bounded fan-in

Systems of Systems
• Systems are designed as components to be used in larger systems in different contexts, for different purposes, interacting with different components
  – Example: OSPF and BGP are complex systems in their own right, yet they are components in the routing system of a network, interacting with each other and packet filters, interacting with management tools …
• Complex configuration to enable flexibility
  – The glue has tremendous impact on network performance
  – State of art: multiple interactive distributed programs written in assembly language
• Lack of intellectual framework to understand global behavior

Supporting Network Evolution
• Logic for controlling the network needs to change over time
  – Traffic engineering rules
  – Interactions with other networks
  – Service characteristics
• Upgrades to field-deployed network equipment must be avoided
  – Very high cost
  – Software upgrades often require hardware upgrades (more CPU or memory)

Supporting Network Evolution Today
• Today’s “Solution”
  – Vendors stuff their routers with software implementing all possible “features”
      – Multiple routing protocols
      – Multiple signaling protocols (RSVP, CR-LDP)
      – Each feature controlled by parameters set at configuration time to achieve late binding
  – Feature-creep creates configuration nightmare
      – Tremendous complexity for syntax & semantics
      – Mis-interactions between features are common
Our Goal: Separate decision making logic from the field-deployed devices

Supporting Network Expansion
• Networks are constantly growing
  – New routers/switches/links added
  – Old equipment rarely removed
• Adding a new switch can cause old equipment to become overloaded
  – CPU/Memory demands on each device should not scale up with network size

Supporting Network Expansion Today
• Routers run a link-state routing protocol
  – Size of link-state database scales with # of routers
  – Expanding network can exceed memory limits of old routers
• Today’s “Solution”
  – Monitor resources on all routers
  – Predict approach of exhaustion and then:
      – Global upgrade
      – Rearchitecture of routing design to add summarization, route aggregation, information hiding
Our Goal: make demands scale with hardware (e.g., # of interfaces)

Supporting Remote Devices
• Maintaining communication with all network devices is critical for network management
  – Diagnosis of problems
  – Monitoring status and network health
  – Updating configuration or software
• “the chicken or the egg….”
  – Cannot send device configuration/management information until it can communicate
  – Device cannot communicate until it is correctly configured

Supporting Remote Devices Today
• Today’s “Solution”
  – Use PSTN as management network of last resort
  – Connect console of remote routers to phone modem
  – Can’t be used for customer premise equipment (CPE): DSL/cable modems, integrated access devices (IADs)
  – In a converged network, PSTN is decommissioned
Our Goal: Preserve management communication to any device that is not physically partitioned, regardless of configuration state

Recent Publications
• G. Xie, J. Zhan, D. A. Maltz, H. Zhang, A. Greenberg, G. Hjalmtysson, J. Rexford, “On Static Reachability Analysis of IP Networks,” IEEE INFOCOM 2005, Orlando, FL, March 2005.
• J. Rexford, A. Greenberg, G. Hjalmtysson, D. A. Maltz, A. Myers, G. Xie, J. Zhan, H. Zhang, “Network-Wide Decision Making: Toward a Wafer-Thin Control Plane,” Proceedings of ACM HotNets-III, San Diego, CA, November 2004.
• D. A. Maltz, J. Zhan, G. Xie, G. Hjalmtysson, A. Greenberg, H. Zhang, “Routing Design in Operational Networks: A Look from the Inside,” Proceedings of the 2004 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (ACM SIGCOMM 2004), Portland, Oregon, 2004.
• D. A. Maltz, J. Zhan, G. Xie, H. Zhang, G. Hjalmtysson, A. Greenberg, J. Rexford, “Structure Preserving Anonymization of Router Configuration Data,” Proceedings of ACM/Usenix Internet Measurement Conference (IMC 2004), Sicily, Italy, 2004.
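
As a rough illustration of the Reachability Example slides, the sketch below checks whether any filter-free path exists between two subnets, before and after a short-cut link is added. The topology, the placement of the two filters, and the helper name `can_reach` are assumptions made for this example (the slide figure's exact wiring is not recoverable from the text), and the check considers every possible path rather than the path routing would actually select.

```python
from collections import deque


def can_reach(links, filters, src, dst):
    """True if some path from src to dst crosses no filter that drops src.

    links:   iterable of (a, b) undirected links
    filters: {(from_node, to_node): set of source subnets dropped on that hop}
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    visited = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in neighbors.get(node, ()):
            if nxt in visited:
                continue
            if src in filters.get((node, nxt), set()):
                continue  # packets sourced at src are dropped on this hop
            visited.add(nxt)
            queue.append(nxt)
    return False


# Assumed wiring: each subnet hangs off one router; R5 joins the two sites.
base_links = [
    ("chi-DC", "R1"), ("chi-FO", "R3"), ("R1", "R3"),
    ("nyc-DC", "R2"), ("nyc-FO", "R4"), ("R2", "R4"),
    ("R3", "R5"), ("R5", "R4"),
]
# Assumed filter placement: on the hop into each data-center router.
packet_filters = {
    ("R3", "R1"): {"nyc-FO"},  # Drop nyc-FO -> *, Permit *
    ("R4", "R2"): {"chi-FO"},  # Drop chi-FO -> *, Permit *
}

before = can_reach(base_links, packet_filters, "chi-FO", "nyc-DC")
# New short-cut link between the data-center routers; filters are unchanged.
with_shortcut = base_links + [("R1", "R2")]
after = can_reach(with_shortcut, packet_filters, "chi-FO", "nyc-DC")
print(before, after)
```

In this made-up network the filters enforce the policy only as long as the topology stays as it was when they were written, which is the failure mode the slides describe.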