Relaxing Constraints: Thoughts on the Evolution of Computer Architecture
Joel Emer
Alpha Development Group
Compaq Computer Corporation
Better answers

Moore’s Law Alpha-style
[Chart: SPECint95 (log scale, 1 to 100) versus date of introduction. Labeled points: EV4-200, EV45-275, EV5-300, EV56-400, EV56-500, EV56-600, EV6-575, EV67-730. The value 3.73 also appears on the chart.]

Iron Law of Performance
Performance = Frequency / (CPI × Instructions)
• Frequency – largely circuit design/technology
• CPI – largely organization
• Instructions – largely architecture/compiler

Outline
• Review of technology factors
• Retrospective on the quantitative method
• Augmenting the quantitative method
• Recommendation

Power Dissipation Trends
[Charts: "Power Dissipation" plots power (W) and voltage (V) for the 21064, 21164, 21264, and 21364; "Supply Current" plots current (A) and voltage (V) for the same processors.]
• Power consumption is increasing
• Supply current is increasing faster!

Coping With Power Growth
• Technology techniques
  – Better cooling technology needed
  – Accelerate Vdd scaling
  – SOI
  – Clock distribution
• Architectural possibilities
  – Use less power-hungry structures
  – Reduce useless speculation

Clock Distribution Trends
[Pie chart: 21264 Power (Peak). Legend: Global Clock Networks, Instruction Issue Units, Caches, Floating Execution Units, Integer Execution Units, Memory Management Unit, I/O, Miscellaneous Logic. Slice labels, in the order extracted: 2%, 5%, 8%, 32%, 10%, 10%, 15%, 18%.]
• Frequencies will continue to scale
• Clock edge rates are not scaling

Coping With Clock Distribution
• Technology solution
  – Low swing differential clocks
  – Adiabatic clocking
• Architectural possibilities
  – Multiple clock zones
  – Asynchronous design

Communication Delay
[Diagram: microprocessor chip, not drawn to scale]
• 21064 ~ 1 cycle
• 21164 ~ 1.5 cycles
• 21264 ~ 3 cycles
• 21464 ~ 6 cycles

Coping With Communication Delay
• Technology solutions
  – Low K dielectrics
  – Thinner (Cu) interconnect
• Architectural possibilities
  – Deeper pipelining
  – Replication/clustering of structures
  – More autonomous computation

SIA Roadmap
                                        1997      1999      2002      2005      2008      2012*
Technology Node (nm)                    250       180       130       100       70        50
Memory (bit/chip)                       256M      1G        4G        16G       64G       256G
Transistors/chip (MPU)                  11M       21M       76M       200M      520M      1.4G
Chip Frequency (MHz)                    750       1250      2100      3500      6000      10,000
Wiring Levels (max)                     6         6 to 7    7         7 to 8    8 to 9    9
Power Supply Voltage, Vdd (V)           1.8-2.5   1.5-1.8   1.2-1.5   0.9-1.2   0.6-0.9   0.5-0.6
Power - High Performance (W),
  w/Heat sink                           70        90        130       160       170       175
Power - Hand-held (W)                   1.2       1.4       2         2.4       2.8       3.2
*The 2012 figures are directly from the SIA 1997 National Technology Roadmap.

Outline
• Review of technology factors
• Retrospective on the quantitative method
• Augmenting the quantitative method
• Recommendation

Disclaimer
The names used and events depicted in this talk are meant to be real. The events are, however, not an exhaustive enumeration of significant milestones. Any misrepresentations of fact and omissions of contributors are unintentional and solely the responsibility of the presenter. Finally, the interpretations are just that and are mine as well.
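
The roadmap's power and supply-voltage rows can be tied back to the earlier observation that supply current is increasing faster than power. The following is a minimal sketch, not part of the original talk: it assumes the simple relation I = P / V and applies it to the high-performance power row and the Vdd ranges from the SIA table. The function names (`implied_current_range`, `growth_factors`) are illustrative only.

```python
# Roadmap rows copied from the SIA table above.
YEARS = [1997, 1999, 2002, 2005, 2008, 2012]
POWER_W = [70, 90, 130, 160, 170, 175]  # high performance, with heat sink
VDD_RANGE_V = [
    (1.8, 2.5), (1.5, 1.8), (1.2, 1.5),
    (0.9, 1.2), (0.6, 0.9), (0.5, 0.6),
]


def implied_current_range(power_w, vdd_range_v):
    """Return (min, max) supply current in amperes, assuming I = P / V."""
    v_low, v_high = vdd_range_v
    # The highest voltage gives the lowest current, and vice versa.
    return power_w / v_high, power_w / v_low


def growth_factors():
    """Return (power growth, implied minimum-current growth), 1997 to 2012."""
    first_i, _ = implied_current_range(POWER_W[0], VDD_RANGE_V[0])
    last_i, _ = implied_current_range(POWER_W[-1], VDD_RANGE_V[-1])
    return POWER_W[-1] / POWER_W[0], last_i / first_i


if __name__ == "__main__":
    for year, power, vdd in zip(YEARS, POWER_W, VDD_RANGE_V):
        i_min, i_max = implied_current_range(power, vdd)
        print(f"{year}: {power} W at {vdd[0]}-{vdd[1]} V "
              f"-> {i_min:.0f} to {i_max:.0f} A")
    power_growth, current_growth = growth_factors()
    print(f"power x{power_growth:.1f}, implied current x{current_growth:.1f}")
```

Under this assumption, power grows by a factor of 2.5 across the roadmap while the implied supply current grows roughly tenfold, because Vdd falls as power rises.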

Early quantitative method - 1981

uPC Histogram Chart – 1981-5

Paper counts
                  ISCA 1    ISCA 24
No model          22        1
Analytic Model    5         ½
Simulation        1         21½
Measurement       0         7

Scientific Method
• Make hypothesis about behavior
• Design experiment
• Run experiment and quantify
• Interpret results
• New hypothesis

Scientific Method
• Make hypothesis about behavior
• Pick baseline design and workload
• Run experiment and quantify
• Interpret results
• New hypothesis

Scientific Method
• Make hypothesis about behavior
• Pick baseline design and workload
• Run simulation model or measure hardware
• Interpret results
• New hypothesis

Scientific Method
• Make hypothesis about behavior
• Pick baseline design and workload
• Run simulation model or measure hardware
• Interpret results
• Propose new design

Making and Testing Hypothesis
• Cache experiment (Schlansker)
  – 64K word cache
  – 32-way set associative cache/LRU replacement
  – 200x200 matrix subblock of an N x N matrix
  – Read twice
• Sizes
  – N=2727: 0 misses
  – N=2729: 24160 misses
  – N=2731: 36382 misses

Propose new design
• Skewed associative (Seznec)
[Figure panels: Direct mapped, 4-way associative, 4-way skewed]

Quantitative Approach Problems
• Too much abstraction
  – Intra-chip latencies
  – Memory subsystem
• Poor workloads
• Too incremental…

Quantitative -> Incremental
[Chart: vertical axis 0 to 4; horizontal axis categories a through l]

Outline
• Review of technology factors
• Retrospective on the quantitative method
• Augmenting the quantitative method
• Recommendation

Relaxing Constraints
• Select a constraint to relax
• Generate design
• Employ quantitative method
• Evaluate results

Important Steps…
• Before
  – Carefully pick a constraint to relax
• After
  – Find contributions without constraint
  – Preserving results after reinstating the constraint

Extrapolate From Current Trends
• Personal Workstation
  – Xerox PARC – late 70’s

    VAX 11/780       Dorado
    5 MHz            15 MHz
    512 Kilobytes    8 Megabytes
    40+ Users        1 User

• Results
  – Accelerate innovation

Throw Out Standards
• Distributed file system - 1985

Use a Simpler Starting Point
• RISC out-of-order (Johnson, Torng)
[Diagram: pipeline stages Fetch, Decode/Map, Queue, Reg Read, Execute, Dcache/Store Buffer, Reg Write, Retire; structures PC, Icache, Register Map, Regs, Dcache]

CISC-based O-O-O
• K6 (Johnson)
• Pentium Pro (Colwell, Papworth…)
[Diagram: PC, Icache, Convert CISC to RISC, RISC O-O-O Core]

Abandon conventions
• VLIW (Fisher)
  – Relieve hardware of all dependency responsibility
  – Give that responsibility to compiler
• Expected consequences
  – Much simpler implementation
  – Faster cycle time

Sometimes not what you expect
• Compiler scheduling for hardware is a great idea
  – For 21064 – narrow in-order
  – For 21164 – wider in-order
  – For 21264 – wider out-of-order

Issue Logic Critical Loop
[Diagram: Instruction Slot (S2) and Instruction Issue (S3) stages with an Issue Conflict Checker, feeding the floating point multiply pipeline, floating point add pipeline, integer pipeline 0, and integer pipeline 1]

Make a Radical Departure
• Multiscalar research (Sohi, Smith…)

New Mechanism Required
• Dependence prediction (Moshovos)
[Figure: loads and stores shown in program order and in execution order, with a trap marked]

What Was Really Important
• Full hardware management (Sohi)
  – Sequencing
  – Register dependencies
  – Memory dependencies
• Refinement (Mowry and Olukotun)
  – Compiler managed – registers, sequencing
  – Hardware managed memory dependence only

Ignoring Implementation Realities
• SMT - in-order (Tullsen, Eggers, Levy)
[Diagram: pipeline stages Fetch, Issue, Reg Read, Execute, Dcache/Store Buffer, Reg Write; structures PC, Icache, Regs, Dcache]

Solution Already Available
• SMT out-of-order
[Diagram: pipeline stages Fetch, Decode/Map, Queue, Reg Read, Execute, Dcache/Store Buffer, Reg Write, Retire; structures PC, Icache, Register Map, Regs, Dcache]

Outline
• Review of technology factors
• Retrospective on the quantitative method
• Augmenting the quantitative method
• Recommendation

Pay Attention to Reality
• Look at technology trends
  – Power
  – Latency
• Use more realistic models
  – More organizational details
  – Better workloads

Ignore Reality
• Look for revolutionary contributions
  – Decide on a constraint to relax
  – Apply the scientific method
• Revolutionary contributions may arise because
  – Constraint will be relaxed in time
  – Constraint wasn’t fundamental
  – New avenues of exploration will be opened

Acknowledgments
• Bill Bowhill
• Paul Gronowski
• Bill Herrick
• Toni Juan
• Geoff Lowney
• Ellen Piccioli
• Andre Seznec