

# Low Power Techniques in Digital Systems

SOCRATES'04 Joan Oliver ETSE-UAB

# Power Analysis Estimation on VLSI Circuits

- Current microprocessor trends in power consumption
- Sources of power consumption in CMOS circuits
  - Power consumption in CMOS circuits
  - Delay in MOS devices
  - Scaling principles for low power
  - Architecture-driven voltage scaling


# Power consumption microprocessor trends

- The continuing decrease in feature size, and the corresponding increase in chip density and operating frequency, have made power consumption a major concern in VLSI design. Modern microprocessors and microcontrollers are indeed hot. For example, the following tables show the evolution of power consumption and clock speed for the most widely used microprocessors.
- In 1965 Gordon Moore stated that the complexity of integrated circuits would double every 12 months. This has had to be adjusted to every 24 months, but...
- The data published in January 2003 by the International Technology Roadmap for Semiconductors (ITRS) predict the continuation of this trend. For instance, current microprocessors have feature sizes (physical gate lengths) of 37 nm and clock speeds of 4.2 GHz. The gate length is projected to shrink to 7 nm, and the microprocessor clock speed to reach 53 GHz, by the year 2018.

ITRS technology nodes and chip capabilities

|                                          | 2004 | 2007 | 2010 | 2018  |
|------------------------------------------|------|------|------|-------|
| DRAM half-pitch (nm)                     | 90   | 65   | 45   | 18    |
| DRAM memory size (bits)                  | 1G   | 2G   | 4G   | 32G   |
| DRAM cost/bit (micro-cents)              | 2.7  | 0.96 | 0.34 | 0.021 |
| Microprocessor physical gate length (nm) | 37   | 25   | 18   | 7     |
| Microprocessor speed (GHz)               | 4.2  | 9.3  | 15   | 53    |






| CPU                   | Clock (MHz) | SPECint95     | SPECfp95 | Power          | Year |
|-----------------------|-----|--------------------|--------|----------------|------|
| PowerPC 603e          | 300 | 7.7                | 6.1    | (250MHz) 3.5W  | 1997 |
| PowerPC 604           | 180 | 6.2                | 5.3    | (120MHz) 16.5W | 1996 |
| PowerPC 604e          | 233 | 10.3               | 7.3    | 16.7W          | 1997 |
| PowerPC 604r          | 375 | 15.9               | 10.1   | (350MHz) 8.0W  | 1997 |
| PowerPC 740           | 300 | 12.2               | 7.1    | 3.4W           | 1998 |
| PowerPC 750           | 500 | 23.9               | 14.6   | 6.0W           | 1999 |
| PowerPC X704          | 533 | ca. 12             | ca. 10 | ~80W           | 1997 |
| PowerPC 7400          | 450 | 21.4               | 20.4   | (400MHz) 5.0W  | 1999 |
| Pentium               | 200 | 5.2                | 4.3    | (120MHz) 10.xW | 1995 |
| Pentium MMX           | 233 | 7.1                | 5.2    | 17.xW          | 1997 |
| Pentium Pro/256k      | 200 | 8.2                | 6.2    | 28.xW          | 1996 |
| Pentium Pro/1MB       | 200 | 8.7                | 6.8    | 43.xW          | 1997 |
| Celeron               | 500 | 17.9               | 12.9   | 27.0W          | 1999 |
| Pentium II            | 450 | 17.2               | 12.9   | 27.1W          | 1998 |
| Pentium II Xeon/2MB   | 450 | 19.7               | 15.0   | 46.7W          | 1998 |
| Pentium III           | 600 | 24.0               | 15.9   | 34.5W          | 1999 |
| Pentium III/E         | 800 | 38.4               | 28.9   | 26.2W          | 1999 |
| Pentium III Xeon/512k | 550 | 23.6               | 16.9   | 34.0W          | 1999 |
| AMD K7 Athlon         | 750 | 32.8               | 24.3   | 50W@700MHz     | 1999 |
| Alpha 21164PC         | 583 | 16.7               | 20.7   | ~45W           | 1997 |
| Alpha 21164           | 667 | 20.8               | 32.4   | 54W            | 1998 |
| Alpha 21264           | 667 | 31.8               | 49.0   | (600MHz) 109W  | 1999 |
| HP PA-RISC 8000       | 180 | 11.8               | 20.2   | ~40W           | 1996 |
| HP PA-RISC 8200       | 240 | 16.4               | 25.3   | ???            | 1997 |
| HP PA-RISC 8500       | 440 | 34.0               | 51.4   | ???            | 1998 |
| Sun UltraSPARC II     | 450 | 19.6               | 27.1   | (250MHz) ~25W  | 1998 |
| Sun UltraSPARC IIi    | 360 | 15.2               | 19.9   | ???            | 1998 |
| Sun UltraSPARC III    | 600 | 35++               | 60++   | ???            | 1998 |
| SGI MIPS R10000       | 250 | 14.7               | 24.5   | (180MHz) ~30W  | 1997 |

## Power consumption microprocessor trends

- Smaller feature sizes → integration of a larger number of components on a chip → reduction of signal propagation delays → higher clock frequencies.
- But this brings an overall increase in power dissipation:
  - → overheating, which degrades performance and reduces chip lifetime, and portability problems.
  - → low-power design is identified as a critical technological need.
- For microprocessors there exists an analytical relationship between power consumption, area and clock frequency that potentially limits integration density:

$$P_{\mu P} = 0.063\,\frac{\text{W}}{\text{cm}^2\,\text{MHz}}\,A\,f_{clk}$$
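As a quick numerical illustration of this relationship, the sketch below evaluates it for one operating point. The function name and the example die area and clock frequency are hypothetical values chosen only for illustration; the 0.063 W/(cm² MHz) coefficient is the one quoted above, with the area taken in cm² and the clock in MHz.

```python
def microprocessor_power_w(area_cm2, f_clk_mhz, k_w_per_cm2_mhz=0.063):
    """Estimate microprocessor power in watts from P = k * A * f_clk.

    area_cm2  : die area in cm^2
    f_clk_mhz : clock frequency in MHz
    k         : empirical coefficient in W / (cm^2 * MHz), 0.063 in the text
    """
    return k_w_per_cm2_mhz * area_cm2 * f_clk_mhz


if __name__ == "__main__":
    # Hypothetical example: a 2 cm^2 die clocked at 500 MHz (about 63 W).
    print(microprocessor_power_w(2.0, 500.0))
```

Because the estimate is linear in both area and frequency, doubling either one doubles the predicted power, which is why the relationship limits integration density.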

- There is a strong need to minimise power consumption when designing complex microelectronic digital circuits and systems:

  For example, portable computing and wireless communication devices impose very tight power constraints on the design, while real-time digital processing (data, video, audio) requires high computational resources to meet the requirements of the application.


### Power consumption microprocessor trends

- Current design flows supply designers with advanced tools, from system specification to mask layout, based on stepwise refinement, which allow the specification at each stage to be optimised using tools at different levels of abstraction.
- However, the contributors to overall power consumption differ from component to component. That is, power optimisation is an inherently application-specific problem that must be analysed carefully for each component.



Power distribution of three designs: low-end microprocessor for embedded use, high-end CPU and MPEG2 decoder






## Power consumption sources in CMOS circuits



Switching power is the power dissipated by the current that charges the capacitance at the output node of a gate.

Total switching power of a circuit can be expressed as

$$P_{sw} = \frac{1}{2} f_{clk} V_{DD}^2 \sum_{signal \ y} \alpha(y) C(y) = C_{eff} f_{clk} V_{DD}^2$$

- Switching power is the most important contribution to power consumption. It can be reduced by:
  - Reducing the power supply voltage. But  $f_{clk} \propto 0.7 \frac{V_t}{V_{DD}}$
  - Minimising the fan-in logic feeding signal y.
  - Reducing the physical capacitance C(y) to be switched by signal y, and the wiring capacitance. This is technology dependent, and wiring length does not scale down proportionally to the feature size of the technology.
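The switching-power expression above can be evaluated directly. The following is a minimal sketch: the function name, the list-of-pairs representation and the example activities and capacitances are assumptions made for illustration, not values from the slides.

```python
def switching_power(f_clk_hz, v_dd, signals):
    """Total switching power P_sw = C_eff * f_clk * V_DD^2.

    signals is an iterable of (alpha, capacitance_farads) pairs, one per
    signal y, where alpha is the switching activity alpha(y) and the
    capacitance is C(y). C_eff = 1/2 * sum(alpha(y) * C(y)).
    """
    c_eff = 0.5 * sum(alpha * cap for alpha, cap in signals)
    return c_eff * f_clk_hz * v_dd ** 2


if __name__ == "__main__":
    # Hypothetical circuit with two signals.
    signals = [(0.5, 10e-15), (0.1, 20e-15)]
    p_high = switching_power(100e6, 3.3, signals)
    p_low = switching_power(100e6, 1.65, signals)
    # Halving V_DD cuts the switching power to a quarter.
    print(p_high, p_low, p_low / p_high)
```

The quadratic dependence on $V_{DD}$ is what makes supply reduction the most effective of the three options listed, at the price of lower speed.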


## Power consumption sources in CMOS circuits

#### Delay in MOS devices

- One effective way of reducing power in MOS devices is to lower the supply voltage. However, the delay of MOS devices increases as the supply voltage decreases.
- For long-channel devices:

$$t_d = \frac{Q}{I_D} = \frac{C_L V_{DD}}{I_D} = k' \frac{C_L V_{DD}}{\left(V_{DD} - V_t\right)^2}$$

- For short-channel devices, $I_D \propto (V_{DD} - V_t)^{\alpha}$, with $\alpha = 1.3$ for sub-half-micrometre MOSFETs over a wide range of technologies, from $L_{eff} = 0.4\,\mu m$ down to $L_{eff} = 0.1\,\mu m$.
- Then:

$$t_d = \tilde{k} \frac{C_L V_{DD}}{\left(V_{DD} - V_t\right)^{\alpha}}$$
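To see how quickly delay grows as the supply is lowered, the delay expression can be evaluated relative to a reference supply, so that the constant $\tilde{k}$ and the load $C_L$ cancel. This is a sketch under assumed values: the threshold voltage of 0.7 V and the supply voltages are hypothetical, and only $\alpha = 1.3$ comes from the text.

```python
def relative_delay(v_dd, v_t, alpha=1.3):
    """Delay up to a constant factor: V_DD / (V_DD - V_t)^alpha."""
    if v_dd <= v_t:
        raise ValueError("V_DD must exceed V_t for the device to switch")
    return v_dd / (v_dd - v_t) ** alpha


def delay_penalty(v_dd, v_ref, v_t, alpha=1.3):
    """Delay at v_dd divided by the delay at the reference supply v_ref."""
    return relative_delay(v_dd, v_t, alpha) / relative_delay(v_ref, v_t, alpha)


if __name__ == "__main__":
    # Hypothetical values: V_t = 0.7 V, reference supply 3.3 V.
    for v in (3.3, 2.5, 1.8, 1.2):
        print(f"V_DD = {v:.1f} V -> delay x{delay_penalty(v, 3.3, 0.7):.2f}")
```

Setting `alpha=2.0` gives the long-channel case, which penalises low supplies more strongly than the short-channel exponent does.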














## Power consumption sources in CMOS circuits

- When the energy performance of the two designs is evaluated at the same speed (the delay is held constant and is taken to scale as $1/V_{DD}$), the supply voltage of the scaled version is

$$V_N = \frac{1 + \frac{\alpha}{N}}{1 + \alpha} V_{ref}$$

where N is the transistor size of the sped-up circuit relative to a transistor of unit size and, as the energy expression that follows implies, $\alpha = C_p / C_{ref}$ is the ratio of the parasitic capacitance $C_p$ to the reference capacitance $C_{ref}$.

- The energy consumed by the first stage is then

$$\text{Energy}(N) = \left(C_p + N C_{ref}\right) V_N^2 = \frac{N C_{ref}\left(1 + \frac{\alpha}{N}\right)^3 V_{ref}^2}{\left(1 + \alpha\right)^2}$$

- Analysis of the expression shows that:
  - The lowest-power case occurs at $\alpha = 0$, that is, without parasitic capacitance.
  - At high values of $\alpha$ (significant interconnect capacitance) there is an optimum value of N.
- It follows that determining an optimum supply voltage is key to minimising power consumption.
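The optimum size can be located numerically. The sketch below assumes $\alpha = C_p / C_{ref}$, which is what makes the two forms of the energy expression agree, and normalises the energy by $C_{ref} V_{ref}^2$. The function names, the grid search and the sample values of $\alpha$ are illustrative choices. Differentiating the normalised expression suggests a minimum at $N = 2\alpha$ whenever $2\alpha > 1$, and the scan is consistent with that.

```python
def normalised_energy(n, alpha):
    """Energy(N) / (C_ref * V_ref^2) = N * (1 + alpha/N)^3 / (1 + alpha)^2."""
    return n * (1.0 + alpha / n) ** 3 / (1.0 + alpha) ** 2


def best_size(alpha, n_max=20.0, step=0.01):
    """Scan sizes N >= 1 on a grid and return the one with the lowest energy."""
    count = int(round((n_max - 1.0) / step)) + 1
    candidates = [1.0 + i * step for i in range(count)]
    return min(candidates, key=lambda n: normalised_energy(n, alpha))


if __name__ == "__main__":
    # alpha = 0 means no parasitic capacitance; larger alpha means more wiring.
    for alpha in (0.0, 0.5, 1.5, 3.0):
        n_opt = best_size(alpha)
        print(f"alpha = {alpha}: N_opt ~ {n_opt:.2f}, "
              f"energy = {normalised_energy(n_opt, alpha):.3f}")
```

For small $\alpha$ the minimum-size device (N = 1) is the most energy-efficient choice; upsizing only pays off when parasitic capacitance dominates.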










### Optimisation techniques



#### Levels of abstraction from system to circuit design

- The circuit description is transformed and manipulated at different levels of abstraction:
  - System level. The system is described in terms of software, hardware and memory components, with algorithms that perform a certain functionality.
  - Behavioral or architectural level. Individual components are described in terms of their algorithmic behavior. Descriptions are usually built using specific languages such as VHDL or Verilog, or general-purpose languages such as C.
  - Register transfer level. The hardware is described in terms of arithmetic modules, registers, multiplexers and interconnect to steer the data flow.
  - Gate level. The functionality of the circuit is described in terms of a netlist or a set of Boolean equations.
  - Transistor level. The circuit is described in terms of its transistor network structure.
  - Physical level. The circuit is described in terms of the mask layout to be fabricated.
- The partitioning of the system into components has a great impact on power consumption → power has to be taken into consideration as early as possible in the design flow.


### Optimisation techniques



#### Power consumption optimisation at the system level

- Algorithm selection must be made so as to best meet the design constraints. The power consumption of an algorithm depends on its characteristics (overall complexity and the complexity of its basic operations).
  - Power consumption due to capacitance switching at the algorithmic level can be reduced if the computation can be performed with fewer operations:
    - Example: vector quantization, a lossy compression technique for coding video data. For a vector size of 16, each distortion metric calculation involves 16 memory accesses, 16 subtractions, 16 multiplications and 16 additions.
    - For each input vector, the codeword that minimises the distortion metric is chosen.
    - Choice of algorithm for implementation:

|                          | #memory accesses | #multiplications | #adds | #subs |
|--------------------------|------------------|------------------|-------|-------|
| Full search              | 4096             | 4096             | 3840  | 4096  |
| Tree search              | 256              | 256              | 240   | 264   |
| Differential tree search | 136              | 128              | 128   | 0     |
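Part of the table can be reproduced with a simple operation count. The sketch below rests on several assumptions not stated in the slides: a codebook of 256 codewords (consistent with 4096 = 256 × 16), a binary tree search that compares two candidate codewords at each of its 8 levels, and 15 accumulations to sum the 16 products of one distortion calculation (which is what matches the #adds column). The subtraction column and the differential tree search are not modelled.

```python
def distortion_ops(vector_size):
    """Operations for one distortion calculation on a vector of this size."""
    return {
        "memory_accesses": vector_size,
        "multiplications": vector_size,
        "additions": vector_size - 1,  # accumulate the products
    }


def search_ops(n_distortions, vector_size):
    """Total operations when n_distortions codewords are evaluated."""
    per_codeword = distortion_ops(vector_size)
    return {name: count * n_distortions for name, count in per_codeword.items()}


def full_search_ops(codebook_size, vector_size):
    """Full search evaluates every codeword in the codebook."""
    return search_ops(codebook_size, vector_size)


def tree_search_ops(codebook_size, vector_size):
    """Binary tree search evaluates two codewords per tree level."""
    if codebook_size < 2 or codebook_size & (codebook_size - 1):
        raise ValueError("codebook size must be a power of two, at least 2")
    levels = codebook_size.bit_length() - 1  # log2 of the codebook size
    return search_ops(2 * levels, vector_size)


if __name__ == "__main__":
    print("full search:", full_search_ops(256, 16))
    print("tree search:", tree_search_ops(256, 16))
```

The count shows where the saving comes from: the tree search evaluates 16 codewords instead of 256, so every operation class shrinks by the same factor of 16.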


### Optimisation techniques



- Memory access and management. Data transfers to memory consume far more power than a word multiplication.
- Hardware/software partitioning. At a high level, decisions have to be made on whether to execute a process in hardware or in software, in terms of flexibility, timing, performance and power consumption.

  Tiwari et al. showed that the power consumption of software can also be optimised.

  Partitioning is important in order to minimise the number of off-chip operations, since off-chip operations consume a significant amount of power.

  Many processors used in systems-on-chip are nowadays available as synthesizable cores.

- Quiescent unit shutdown is used at different levels of abstraction.

  With mutually exclusive processes, the clock of a process can be shut down while that process is not operating.
- Voltage scaling. Power consumption depends on the square of the supply voltage, so lowering the supply voltage is an efficient means of lowering power consumption.

  The resulting loss of circuit throughput can be compensated architecturally at the cost of increased area: parallel implementation and pipelining.
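The trade between area and supply voltage can be sketched numerically. The example below assumes the short-channel delay model $t_d \propto V_{DD}/(V_{DD}-V_t)^{\alpha}$ given earlier, N identical parallel units each running at 1/N of the original clock, and no capacitance overhead for routing or multiplexing. The reference supply of 3.3 V and the threshold of 0.7 V are hypothetical. Under those assumptions total switched capacitance times frequency is unchanged, so the power ratio reduces to $(V_{new}/V_{ref})^2$.

```python
def relative_delay(v_dd, v_t, alpha):
    """Delay up to a constant factor: V_DD / (V_DD - V_t)^alpha."""
    return v_dd / (v_dd - v_t) ** alpha


def scaled_supply(n_parallel, v_ref, v_t, alpha=1.3):
    """Lowest supply at which each unit may be n_parallel times slower.

    Solved by bisection; the delay falls monotonically with V_DD for alpha >= 1.
    """
    target = n_parallel * relative_delay(v_ref, v_t, alpha)
    lo, hi = v_t + 1e-9, v_ref
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if relative_delay(mid, v_t, alpha) > target:
            lo = mid  # too slow: the supply must be higher
        else:
            hi = mid  # fast enough: try a lower supply
    return hi


def power_ratio(n_parallel, v_ref, v_t, alpha=1.3):
    """Power of the parallel design relative to the reference design."""
    v_new = scaled_supply(n_parallel, v_ref, v_t, alpha)
    return (v_new / v_ref) ** 2


if __name__ == "__main__":
    for n in (1, 2, 4):
        v = scaled_supply(n, 3.3, 0.7)
        print(f"N = {n}: V_DD ~ {v:.2f} V, power x{power_ratio(n, 3.3, 0.7):.2f}")
```

A real design would see smaller savings, because the extra routing and multiplexing add capacitance that this sketch ignores.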


### Optimisation techniques



#### Power consumption optimisation at the behavioural level

- Algorithm transformation. Slow operations are replaced by faster ones.
  - For example, multiplication by constants is replaced by shift-and-add operations.
  - Prepare the behavioural description for later, efficient power-optimisation transformations.
- Clock scheduling. Imposing a maximum delay within a clock cycle helps to reduce the power consumed by the system.
- Eliminating redundant computation.
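The shift-and-add transformation mentioned above can be illustrated in a few lines. This is a minimal sketch using the plain binary expansion of a non-negative constant; the function name is hypothetical, and hardware flows often use more economical recodings (for example signed-digit forms) that are not shown here.

```python
def shift_add_multiply(x, constant):
    """Multiply x by a non-negative integer constant using shifts and adds.

    Returns (product, number_of_additions). One shifted copy of x is
    generated per set bit of the constant, and the copies are summed.
    """
    if constant < 0:
        raise ValueError("constant must be non-negative")
    result = 0
    additions = 0
    first_term = True
    shift = 0
    while constant:
        if constant & 1:
            term = x << shift
            if first_term:
                result = term  # the first term needs no adder
                first_term = False
            else:
                result += term
                additions += 1
        constant >>= 1
        shift += 1
    return result, additions


if __name__ == "__main__":
    # 10 = 0b1010, so 13 * 10 = (13 << 1) + (13 << 3): one addition.
    print(shift_add_multiply(13, 10))
```

In hardware the shifts are free (they are only wiring), so a constant with k set bits costs k - 1 adders instead of a full multiplier.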

Winograd introduces incremental refinement structures for signal-processing transformations, with control strategies for low power, and applies them to FIR filter design. The number of filter taps used is varied dynamically to provide stop-band attenuation in proportion to a simple estimate of the time-varying energy in the undesired components of the input signal.

The approach lowers the number of taps used to produce each output sample in correspondence with the processing task to be performed. That is, by switching on only a number of taps proportional to the required stopband attenuation, power savings can be achieved.
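The control idea can be sketched as follows. This is only an illustration of varying the number of active taps with an energy estimate and counting the resulting multiply-accumulate operations; the linear control policy, the function names and the example coefficients are assumptions. It does not reproduce the incremental refinement filter structures themselves, which are designed so that using fewer sections still yields a valid filter.

```python
def active_taps(stopband_energy, full_scale_energy, n_taps, min_taps=1):
    """Hypothetical policy: taps switched on grow linearly with the estimate."""
    fraction = stopband_energy / full_scale_energy
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to [0, 1]
    return max(min_taps, round(fraction * n_taps))


def fir_output(recent_samples, taps, n_active):
    """Compute one FIR output sample using only the first n_active taps.

    recent_samples[0] is the newest input sample. Returns the output and
    the number of multiply-accumulate operations actually performed.
    """
    n = min(n_active, len(taps), len(recent_samples))
    y = sum(taps[k] * recent_samples[k] for k in range(n))
    return y, n


if __name__ == "__main__":
    taps = [0.5, 0.25, 0.125, 0.125]  # illustrative coefficients
    samples = [1.0, 2.0, 3.0, 4.0]
    for energy in (0.0, 0.5, 1.0):
        n = active_taps(energy, 1.0, len(taps))
        print(energy, fir_output(samples, taps, n))
```

Fewer active taps mean fewer switched multipliers and adders per output sample, which is where the power saving comes from.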


### Optimisation techniques





The incremental refinement structure, along with an adaptation strategy, was applied to two speech signals that had been frequency-division multiplexed (FDM): one signal lay in the passband region of the lowpass filter and the other in the stopband region. The sampling rate for the FDM speech was 16 kHz.

The figure shows FDM speech demultiplexing using low-power frequency-selective filtering:

- a) Speech signal in the passband region.
- b) Speech signal in the stopband region.
- c) Number of filter sections used by the adaptive filtering technique.
















































