L21: Lower Power Layout Design
1998. 6. 7
Prof. Jun-Dong Cho, Sungkyunkwan University
http://vlsicad.skku.ac.kr

Device Scaling by a Factor of S
• Constant-scaled wire increases coupling capacitance by S and wire resistance by S
• Supply voltage by 1/S, threshold voltage by 1/S, current drive by 1/S
• Gate capacitance by 1/S, gate delay by 1/S
• Global interconnection delay (RC load + parasitics) by S
• Interconnect delay: 50-70% of the clock cycle
• Area: $1/S^2$
• Power dissipation by $1/S$ to $1/S^2$ ($P = nCV_{dd}^2 f$, where $nC$ is the sum of capacitance times number of transitions)
• SIA (Semiconductor Industry Association): in 2007, physical limitation: 0.1 μm
• 20 billion transistors, 10 square centimeters, 12 or 16 inch wafer

Delay Variations at Low Voltage
• At high supply voltage, the delay increases with temperature (mobility decreases with temperature), while at very low supply voltages the delay decreases with temperature (VT decreases with temperature).
• At low supply voltages, the delay ratio between large and minimum transistor widths W increases by a factor of several.
• Clock trees are delay-balanced by wire snaking in order to avoid clock skew. In this case, at low supply voltages, slight VT variations can significantly modify the delay balancing.
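The scaling rules on the Device Scaling slide can be tabulated in a few lines. The following is a minimal sketch, not part of the original slides: the function names `scaling_factors` and `dynamic_power` and all numeric inputs are hypothetical, and the factors are simply the ones listed on the slide.

```python
def scaling_factors(s: float) -> dict[str, float]:
    """Multiplicative change of each quantity when devices scale by S (S > 1).

    The entries mirror the slide's list; nothing here is derived independently.
    """
    if s <= 1.0:
        raise ValueError("scaling factor S is expected to be greater than 1")
    return {
        "supply_voltage": 1.0 / s,
        "threshold_voltage": 1.0 / s,
        "current_drive": 1.0 / s,
        "gate_capacitance": 1.0 / s,
        "gate_delay": 1.0 / s,
        "wire_coupling_capacitance": s,
        "wire_resistance": s,
        "global_interconnect_delay": s,
        "area": 1.0 / s**2,
    }


def dynamic_power(n_c: float, vdd: float, f: float) -> float:
    """P = nC * Vdd^2 * f, with nC the sum of capacitance times transitions."""
    return n_c * vdd**2 * f


if __name__ == "__main__":
    for name, factor in scaling_factors(2.0).items():
        print(f"{name:28s} x{factor:g}")
    # Hypothetical numbers: 1 nF of switched capacitance at 100 MHz.
    p_high = dynamic_power(1e-9, 3.3, 100e6)
    p_low = dynamic_power(1e-9, 1.65, 100e6)
    print(f"power ratio after halving Vdd: {p_low / p_high:.2f}")
```

The last lines only illustrate the quadratic dependence of $P$ on $V_{dd}$; they do not attempt to reproduce the slide's $1/S$ to $1/S^2$ range, which depends on how capacitance and frequency scale together.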
Quarter Micron Challenge
• Computers/peripherals (SOC): 1996 ($50 billion), 1999 ($70 billion)
• Wiring dominates delay: wire R comparable to gate driver R; wire-to-wire coupling C > C to ground
• Push beyond 0.07 micron
• Quest for area (past), speed-speed (now), power-power-power (future)
• Accelerated increases of clock frequencies
• Signal integrity-based tools
• Design styles (chip + packages)
• System-level design (system partitioning)
• Synthesis with multiple constraints (power, area, timing)
• Partitioning/MCM
• Increasing speed limits complicate clock and power distribution
• Design bounded by wires, vias, via resistance, coupling
• Reverse scaling: adding area/spacing as needed: widening and thickening of wires, metal shielding and noise avoidance by adding metal

Clock Power Consumption
• Clock power consumption is as large as the logic power. Because the clock signal carries the heaviest load and switches at high frequency, clock distribution is a major source of power dissipation.
• In a microprocessor, 18% of the total power is consumed by clocking.
• Clock distribution is designed as a hierarchical clock tree, according to the decomposition principle.

Power Consumption per Block in a Typical Microprocessor

Crosstalk

Solution for Clock Skew
• Dynamic effects on skew:
– Capacitance coupling
– Supply voltage deviation (clock driver and receiver voltage difference)
– Capacitance deviation by circuit operation
– Global and local temperature
• Layout issues: clocks routed first
• Must be aware of all sources of delay
• Increased spacing
• Wider wires
• Insert buffers
• Specialized clocks need net matching
• Two approaches: single driver, H-tree driver

Gated Clocks
• Gated clocks are local clocks that are conditionally enabled so that the registers are clocked only during the write cycles. The clock is partitioned into different blocks and each block is clocked with its own clock.
• Gating the clocks to infrequently used blocks does not provide an acceptable level of power savings.
• Divide the basic clock frequency to provide the lowest clock frequency needed to different parts of the circuit.
• Clock distribution: large clock buffers waste power. Use smaller clock buffers with a well-balanced clock tree.

PowerPC Clocking Scheme

Clock Drivers in the DEC Alpha 21164

Driver for Pads or Large Capacitances
• Off-chip power (drivers and pads) is increasing, and it is very difficult to reduce, because pad and driver sizes cannot be decreased with new technologies.

Layout-Driven Resynthesis for Lower Power

Low Power Process
• Dynamic power dissipation: $P_d = C_L V_{dd}^2 f$
• Drain current: $I_{ds} = \frac{\beta}{2}(V_{gs} - V_t)^2$
• Gate capacitance: $C_{gate} = \sum_{i=1}^{n} C_{ox}(W L)_i$
• Input capacitance: $C_{in} = \sum_{j=1}^{m} (C_{gate})_j$
• Overlap capacitance: $C_{ov} = C_{GD0} \cdot W$
• Drain junction capacitance: $C_{dj} = C_j \cdot AD + C_{jsw} \cdot PD$, with $AD = W \cdot D$ and $PD = 2(W + D)$
[Figure labels: Vdd, Vin, Vo, C_djp, C_ovp, C_ovn, C_djn; drain region of width W and length D with C_jb and C_jsw]

Crosstalk
• In deep-submicron layouts, some of the nets connecting modules can be so long that their resistance is comparable to the resistance of the driver.
• Each net in a mixed analog/digital circuit is classified by its crosstalk sensitivity:
– 1. Noisy = high-impedance signals that can disturb other signals, e.g., clock signals.
– 2. High-Sensitivity = high-impedance analog nets; the most noise-sensitive nets, such as the input nets to operational amplifiers.
– 3. Mid-Sensitivity = low/medium-impedance analog nets.
– 4. Low-Sensitivity = digital nets that directly affect the analog part in some cells, such as control signals.
– 5. Non-Sensitivity = the most noise-insensitive nets, such as pure digital nets.
• The crosstalk between two interconnection wires also depends on the frequencies (i.e., signal activities) of the signals traveling on the wires. Recently, deep-submicron designs require crosstalk-free channel routing.
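The capacitance relations on the Low Power Process slide ($C_{dj} = C_j \cdot AD + C_{jsw} \cdot PD$, $AD = W \cdot D$, $PD = 2(W + D)$, $C_{ov} = C_{GD0} \cdot W$, $P_d = C_L V_{dd}^2 f$) can be evaluated directly. The sketch below is illustrative only: the function names and every process parameter value are assumptions chosen for the example, not data from the slides.

```python
def drain_area(w: float, d: float) -> float:
    """AD = W * D (drain width times drain length)."""
    return w * d


def drain_perimeter(w: float, d: float) -> float:
    """PD = 2 * (W + D)."""
    return 2.0 * (w + d)


def junction_capacitance(c_j: float, c_jsw: float, w: float, d: float) -> float:
    """C_dj = C_j * AD + C_jsw * PD (bottom-plate plus sidewall terms)."""
    return c_j * drain_area(w, d) + c_jsw * drain_perimeter(w, d)


def overlap_capacitance(c_gd0: float, w: float) -> float:
    """C_ov = CGD0 * W."""
    return c_gd0 * w


def dynamic_power(c_load: float, vdd: float, f: float) -> float:
    """P_d = C_L * Vdd^2 * f."""
    return c_load * vdd**2 * f


if __name__ == "__main__":
    # Hypothetical values in SI units, picked only to exercise the formulas.
    w, d = 2e-6, 1e-6                      # drain width and length in metres
    c_j, c_jsw, c_gd0 = 1e-3, 3e-10, 3e-10  # F/m^2, F/m, F/m
    c_dj = junction_capacitance(c_j, c_jsw, w, d)
    c_ov = overlap_capacitance(c_gd0, w)
    print(f"C_dj = {c_dj:.3e} F, C_ov = {c_ov:.3e} F")
    print(f"P_d  = {dynamic_power(c_dj + c_ov, 3.3, 100e6):.3e} W")
```

Because the sidewall term grows with the perimeter rather than the area, narrow drains are dominated by $C_{jsw}$, which is one reason layout shape matters for the load a gate has to drive.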
Power Measure in Layout
• The average dynamic power consumed by a CMOS gate is given below, where $C_l$ is the load capacitance at the output of the node, $V_{dd}$ is the supply voltage, $T_{cycle}$ is the global clock period, $N$ is the number of transitions of the gate output per clock cycle, $C_g$ is the load capacitance due to the input capacitance of fanout gates, and $C_w$ is the load capacitance due to the interconnection tree formed between the driver and its fanout gates.
$P_{av} = \frac{0.5 V_{dd}^2}{T_{cycle}} C_l N = \frac{0.5 V_{dd}^2}{T_{cycle}} (C_g + C_w) N$
• Logic synthesis for low power attempts to minimize $\sum_i C_{g_i} N_i$.
• Physical design for low power tries to minimize $\sum_i C_{w_i} N_i$. Here $C_{w_i}$ consists of $C_{x_i} + C_{s_i}$, where $C_{x_i}$ is the capacitance of net $i$ due to its crosstalk, and $C_{s_i}$ is the substrate capacitance of net $i$.
• For low power layout applications, power dissipation due to crosstalk is minimized by ensuring that wires carrying high-activity signals are placed sufficiently far from the other wires.
• Similarly, power dissipation due to substrate capacitance is proportional to the wire length and its signal activity.
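The power measure above separates into a gate term (the target of logic synthesis) and a wire term (the target of physical design). The sketch below evaluates both for a small set of nets; the `Net` class, the function names, and the numbers are hypothetical and serve only to illustrate the formula.

```python
from dataclasses import dataclass


@dataclass
class Net:
    name: str
    c_gate: float       # C_g: input capacitance of the fanout gates
    c_wire: float       # C_w: interconnect capacitance (crosstalk + substrate)
    transitions: float  # N: output transitions per clock cycle


def gate_cost(nets: list[Net]) -> float:
    """Sum of C_g,i * N_i, the quantity logic synthesis tries to minimize."""
    return sum(n.c_gate * n.transitions for n in nets)


def wire_cost(nets: list[Net]) -> float:
    """Sum of C_w,i * N_i, the quantity physical design tries to minimize."""
    return sum(n.c_wire * n.transitions for n in nets)


def average_power(nets: list[Net], vdd: float, t_cycle: float) -> float:
    """P_av = (0.5 * Vdd^2 / T_cycle) * sum((C_g + C_w) * N) over all nets."""
    return 0.5 * vdd**2 / t_cycle * (gate_cost(nets) + wire_cost(nets))


if __name__ == "__main__":
    # Hypothetical nets: a busy clock-like net and a quiet control net.
    nets = [
        Net("clk_branch", c_gate=20e-15, c_wire=80e-15, transitions=2.0),
        Net("ctrl", c_gate=10e-15, c_wire=40e-15, transitions=0.1),
    ]
    print(f"wire cost = {wire_cost(nets):.3e} F")
    print(f"P_av      = {average_power(nets, vdd=3.3, t_cycle=10e-9):.3e} W")
```

With these made-up numbers the high-activity net dominates the wire cost, which matches the slide's advice to give such wires short routes and extra spacing first.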
Layout Using Dual Supply Voltages
• A dual-voltage layout technique is proposed to reduce the power consumption of combinational circuits.
• When dual-voltage cells are used, the technique reduces the extra wiring and tracks that arise when cells of the same voltage must be placed in the same cell row.
• A level converter circuit is implemented with the minimum number of transistors.
• Clustered Voltage Scaling [Usami, '95], which uses dual voltages while maintaining device performance, is applied.
• The proposed Mix-And-Match Power Supply layout structure improves on the existing Row by Row Power Supply [Usami, '97] layout structure, reducing power and area.

Clustered Voltage Scaling
• Generates a low-power netlist
• Slack: $S_i = R_i - A_i$
[Figure: example netlist with gates G1 to G11, flip-flops (F/F), and level converters LC1 and LC2. Slack per gate: S1 > 0, S2 < 0, S3 > 0, S4 > 0, S5 > 0, S6 > 0, S7 < 0, S8 < 0, S9 > 0, S10 < 0, S11 < 0. Legend: VDDL, VDDH, Level Converter]

Row by Row Power Supply Structure
[Figure: standard-cell module with rows of VDDL cells and VDDH cells; rails labeled VDDL, VDDH, and VSS]

Mix-And-Match Power Supply Structure
[Figure: standard-cell module with VDDL cells and VDDH cells mixed in the same rows; rails labeled VDDL, VDDH, and VSS]

Structure Comparison
[Figure: modules for the conventional circuit, RRPS, and MAMPS, with their VDDL and VDDH rails]

Level Converter Structure
• Number of transistors: 6 → 4
• Effective in terms of power and area
[Figure: existing circuit (Vth = 1.5 V) and proposed circuit (Vth = 2.0 V); labels: IN, OUT, VDDH, VDDL, VSS/VDDL, VSS/VDDH]

Mix-And-Match Power Supply Design Flow
• Single voltage netlist
• Multiple voltage scaling
• Netlist with multiple supply voltage (OPUS)
• Assign supply voltage to each cell
• Physical placement (Aquarius XO)
• Routing
• Synthesis timing, power and area (PowerMill)

Experimental Results
[Figure: bar charts of total power (%) and total area (%) for the conventional circuit, RRPS, and MAMPS, with the conventional circuit at 100; values annotated in the source: 47%, 10%, 15%, 2%]

Low Power Design Tools
• Transistor Level Tools (5-10% of silicon)
– SPICE, PowerMill (Epic), ADM (Avanti/Anagram), Lsim Power Analyst (Mentor)
• Logic Level Tools (10-15%)
– Design Power and PowerGate (Synopsys), WattWatcher/Gate (Sente), PowerSim (System Sciences), POET (Viewlogic), and QuickPower (Mentor)
• Architectural (RTL) Level Tools (20-25%)
– WattWatcher/Architect (Sente): 20-25% accuracy
• Behavioral (spreadsheet) Level Tools (50-100%)
– Active area of academic research
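Returning to the Clustered Voltage Scaling slide, the slack relation $S_i = R_i - A_i$ suggests a simple per-gate rule. The sketch below is a deliberately simplified illustration and not the clustering algorithm of [Usami, '95]: it assumes that $R_i$ is a required time and $A_i$ an arrival time computed as if gate $i$ already ran at the low supply, it assigns the low supply wherever that slack is positive, and it flags the low-to-high boundaries where a level converter would be needed. All gate names, timing numbers, and function names are hypothetical.

```python
def slack(required: float, arrival: float) -> float:
    """S_i = R_i - A_i."""
    return required - arrival


def assign_supply(timing: dict[str, tuple[float, float]]) -> dict[str, str]:
    """Give VDDL to gates whose slack stays positive, VDDH to the rest.

    `timing` maps a gate name to (required, arrival), assumed to be evaluated
    with the gate at the low supply. No clustering or re-timing is performed.
    """
    return {
        gate: "VDDL" if slack(req, arr) > 0 else "VDDH"
        for gate, (req, arr) in timing.items()
    }


def level_converter_edges(
    edges: list[tuple[str, str]], supply: dict[str, str]
) -> list[tuple[str, str]]:
    """Return driver-to-load edges where a VDDL gate drives a VDDH gate."""
    return [
        (src, dst)
        for src, dst in edges
        if supply[src] == "VDDL" and supply[dst] == "VDDH"
    ]


if __name__ == "__main__":
    # Hypothetical three-gate chain: G1 -> G2 -> G3.
    timing = {"G1": (5.0, 3.0), "G2": (4.0, 6.0), "G3": (9.0, 7.5)}
    edges = [("G1", "G2"), ("G2", "G3")]
    supply = assign_supply(timing)
    print(supply)
    print("level converters needed on:", level_converter_edges(edges, supply))
```

A per-gate rule like this can scatter low-voltage gates and require many converters; grouping the low-voltage gates into clusters, as the slide title indicates, is what keeps the converter count and its power overhead down.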
Commercial synthesis systems

Research synthesis systems
A - Architectural synthesis. L - Logic synthesis.

Low-Power CAD Sites
• Alternative System Concepts, Inc.: 7X power reduction through optimization; contact http://www.ee.princeton.edu and Jake Karrfalt at jake@ascinc.com or (603) 437-2234.
• Reduction of glitch and clock power; modeling and optimization of interconnect power; power optimization for data-dominated designs with limited control flow.
• Mentor Graphics QuickPower: hierarchical determination of the overall benefit of exchanging blocks for lower power; powering down or disabling blocks when not in use by gated clock; choosing candidates for power-down; calculating the effect of the power-down logic. http://www.mentorg.com
• Synopsys's Power Compiler: http://www.synopsys.com/products/power/power_ds
• Sente's WattWatcher/Architect: first commercial tool operating at the architecture level (20-25% accuracy). http://www.powereda.com
• Behavioral tools: Hyper-LP (optimization), Explore (estimation) by J. Rabaey

Design Power (Synopsys)
• DesignPower(TM) provides a single, integrated environment for power analysis in multiple phases of the design process:
– Early, quick feedback at the HDL or gate level through probabilistic analysis.
– Improved accuracy through simulation-based analysis for gate level and library exploration.
• DesignPower estimates switching, internal cell, and leakage power. It accepts user-defined probabilities, simulation toggle data, or a combination of both as input.
• DesignPower propagates switching information through sequential devices, including flip-flops and latches. It supports sequential, hierarchical, gated-clock, and multiple-clock designs. For simulation toggle data, it links directly to Verilog and VHDL simulators, including Synopsys' VSS.

References
[1] Gary K. Yeap, "Practical Low Power Digital VLSI Design", Kluwer Academic Publishers.
[2] Jan M. Rabaey, Massoud Pedram, "Low Power Design Methodologies", Kluwer Academic Publishers.
[3] Abdellatif Bellaouar, Mohamed I. Elmasry, "Low-Power Digital VLSI Design: Circuits and Systems", Kluwer Academic Publishers.
[4] Anantha P. Chandrakasan, Robert W. Brodersen, "Low Power Digital CMOS Design", Kluwer Academic Publishers.
[5] Dr. Ralph Cavin, Dr. Wentai Liu, "1996 Emerging Technologies: Designing Low Power Digital Systems".
[6] Muhammad S. Elrabaa, Issam S. Abu-Khater, Mohamed I. Elmasry, "Advanced Low-Power Digital Circuit Techniques", Kluwer Academic Publishers.

References
• [BFKea94] R. Bechade, R. Flaker, B. Kauffmann, et al. "A 32b 66 MHz 1.8W Microprocessor". In IEEE Int. Solid-State Circuit Conference, pages 208-209, 1994.
• [BM95] Bohr and T. Mark. "Interconnect Scaling - The Real Limiter to High Performance ULSI". In Proceedings of the 1995 IEEE International Electron Devices Meeting, pages 241-242, 1995.
• [BSM94] L. Benini, P. Siegel, and G. De Micheli. "Saving Power by Synthesizing Gated Clocks for Sequential Circuits". IEEE Design and Test of Computers, 11(4):32-41, 1994.
• [GH95] S. Ganguly and S. Hojat. "Clock Distribution Design and Verification for PowerPC Microprocessor". In International Conference on Computer-Aided Design, page Issues in Clock Designs, 1995.
• [MGR96] R. Mehra, L. M. Guerra, and J. Rabaey. "Low Power Architecture Synthesis and the Impact of Exploiting Locality". In Journal of VLSI Signal Processing, 1996.