#### FINAL EXAMINATION - SOLUTIONS (Average score = 68/100)

### Problem 1 - (20 points – This problem is required)

a.) An interconnect line is 10 mm long and has a resistance of  $54m\Omega/\mu m$  and a capacitance of  $0.1 \text{fF}/\mu m$  and is driven by a 2X inverter (an inverter with an 8 $\lambda$  PMOS and a 4 $\lambda$  NMOS where  $\lambda = 0.1 \mu m$ ). What is the total delay of the circuit from the input of the inverter to the end of the interconnect line?

b.) Find the number of buffers (round off to the nearest integer) and the size of these buffers to minimize the delay of the 10 mm interconnect through buffer insertion.

c.) Since the X2 buffer probably cannot drive the insertion buffer at the input directly, use a fanout of 4 (FO4) to design a cascade of inverters that will allow the X2 buffer at the input to drive the first insertion buffer with minimum delay.

d.) For your design above, which includes the cascade of inverters at the input followed by the buffer insertion in the 10 mm interconnect, compute the delay from the 2X inverter at the input to the end of the interconnect. Assume that the average FO4 delay of the cascaded inverters is 100ps.

<u>Solution</u>

a.)



$$\tau \approx (6.25 \mathrm{k}\Omega)(0.5 \mathrm{pF}) + (6.79 \mathrm{k}\Omega)(0.5 \mathrm{pF}) = 6.52 \mathrm{ns}$$
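The arithmetic for part (a) can be reproduced with a short script. This is a minimal sketch that assumes the lumped model implied by the numbers above: the 2X inverter contributes 12.5 kΩ/2, and the 1 pF wire capacitance is split into two 0.5 pF halves on either side of the 540 Ω wire resistance. The variable names are illustrative.

```python
# Elmore-delay estimate for the unbuffered 10 mm line (part a).
# Assumed model: half the wire capacitance sees only the driver resistance,
# the other half sees the driver plus the full wire resistance.
R_EQN = 12.5e3                  # ohms, equivalent resistance of a 1X inverter
length_um = 10_000              # 10 mm expressed in micrometers
r_wire = 0.054 * length_um      # 54 mOhm/um -> 540 ohms total
c_wire = 0.1e-15 * length_um    # 0.1 fF/um -> 1 pF total

r_driver = R_EQN / 2            # 2X inverter
tau = r_driver * (c_wire / 2) + (r_driver + r_wire) * (c_wire / 2)
print(f"tau = {tau * 1e9:.2f} ns")   # tau = 6.52 ns
```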

b.) Find the number of inserted buffers, N, and the size of each buffer, M.

$$N = \sqrt{\frac{R_{wire}C_{wire}/2}{R_{eqn}(C_J + C_G)(1 + \beta)}} = \sqrt{\frac{0.5 \cdot 540 \cdot 1\text{pF}}{12.5\text{k}\Omega(1\text{fF} \cdot 0.2 \cdot 3 + 2\text{fF} \cdot 0.2 \cdot 3)}} = \sqrt{\frac{270\text{ps}}{22.5\text{ps}}} = 3.46 \approx \underline{3}$$
$$M = \sqrt{\frac{R_{eqn}}{C_G(1 + \beta)} \cdot \frac{C_{int}}{R_{int}}} = \sqrt{\frac{12.5\text{k}\Omega}{2\text{fF} \cdot 0.2 \cdot 3} \cdot \frac{0.1\text{fF}/\mu\text{m}}{0.054\Omega/\mu\text{m}}}$$
$$= \sqrt{1.042 \times 10^{19} \cdot 1.85 \times 10^{-15}} = 138.9 \approx \underline{140}$$
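As a cross-check, the two expressions for N and M can be evaluated directly. The sketch below assumes the per-unit-width capacitances (1 fF/µm self, 2 fF/µm gate), the 0.2 µm width, and (1 + β) = 3 that appear in the substitutions above. Evaluated without intermediate rounding, M comes out near 139, which still rounds to the X140 used in the solution.

```python
import math

R_EQN = 12.5e3      # ohms, minimum-size inverter
C_J = 1e-15         # F/um, self (junction) capacitance per unit width
C_G = 2e-15         # F/um, gate capacitance per unit width
W = 0.2             # um, width used in the solution
BETA = 2            # PMOS/NMOS width ratio, so (1 + beta) = 3
r_int = 0.054       # ohm/um
c_int = 0.1e-15     # F/um
length_um = 10_000  # 10 mm

r_wire = r_int * length_um
c_wire = c_int * length_um

# Number of inserted buffers and their size relative to a 1X inverter
n_opt = math.sqrt((r_wire * c_wire / 2) / (R_EQN * (C_J + C_G) * W * (1 + BETA)))
m_opt = math.sqrt(R_EQN / (C_G * W * (1 + BETA)) * c_int / r_int)
print(f"N = {n_opt:.2f}, M = {m_opt:.1f}")   # N = 3.46, M = 138.9
```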

c.) Find the number of cascaded inverters to go from X2 to X140 with a fanout of 4.

$$n = \frac{\ln(C_{load}/C_{in})}{\ln f} = \frac{\ln(140/2)}{\ln 4} = 3.06 \approx \underline{3}$$
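The stage count and the ideal FO4 sizes follow from the same formula. This is a small sketch; the `sizes` list is only an illustration of the geometric progression starting at X2, not part of the original solution.

```python
import math

c_ratio = 140 / 2        # X140 load driven from an X2 inverter
fanout = 4
n_exact = math.log(c_ratio) / math.log(fanout)
n_stages = round(n_exact)

# Ideal FO4 progression: each stage is 4 times larger than the previous one
sizes = [2 * fanout**k for k in range(n_stages + 1)]
print(f"n = {n_exact:.2f} -> {n_stages} stages, ideal sizes: {sizes}")
# n = 3.06 -> 3 stages, ideal sizes: [2, 8, 32, 128]
```

The last entry is the ideal load of the third inverter; in this design that load is the X140 insertion buffer.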

The design of the driving inverters and insertion buffers is shown below.




Note that for the optimum delay, the size of the buffer being driven by the X32 buffer should be X128. However, it has been increased to X140 to be the first insertion buffer. The alternative would be to use a different fanout or add another buffer which would increase the delay. We will assume the above design represents the best choice under the conditions given.

d.) The model of one stage of the buffer insertion is given below.



$$R_{X140} = \frac{12.5\text{k}\Omega}{140} \approx 90\Omega, \ C_{X140(out)} = 140(3 \cdot W \cdot C_{self}) = 140 \cdot 3 \cdot 0.4 \cdot 1 = 168\text{fF},$$

$$C_{X140(in)} = 140(3 \cdot W \cdot C_g) = 140 \cdot 3 \cdot 0.4 \cdot 2 = 336\text{fF}, \ C_{wire} = 1\text{pF}$$

The delay of the buffer insertion stage modeled above is,

$$\tau_{buffer\ insertion + wire/3} = (90)(168\text{fF} + 167\text{fF}) + (90+180)(167\text{fF} + 336\text{fF}) = 166 \text{ps}$$

The delay of the design is expressed as

$$\text{Delay} = 3\,FO4 + 3\left(Buffer + \frac{Wire}{3}\right) = 3 \cdot 100\text{ps} + 3(166\text{ps}) = \underline{798\text{ps} = 0.798\text{ns}}$$
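The stage delay and the total can be checked numerically. The sketch below uses the rounded component values from the solution (90 Ω, 168 fF, 336 fF, 167 fF) and assumes the wire segment is modeled as a π network, which is what the two-term delay expression implies.

```python
# One buffer-insertion stage: an X140 buffer driving one third of the wire,
# loaded by the next X140 buffer.
r_buf = 90.0            # ohms, 12.5k/140 rounded as in the solution
c_out = 168e-15         # F, X140 output (self) capacitance
c_in = 336e-15          # F, X140 input (gate) capacitance
r_seg = 540.0 / 3       # ohms, one third of the wire resistance
c_half = 167e-15        # F, half of one third of the 1 pF wire, rounded

tau_stage = r_buf * (c_out + c_half) + (r_buf + r_seg) * (c_half + c_in)

fo4 = 100e-12           # s, average FO4 delay given in the problem
total = 3 * fo4 + 3 * tau_stage
print(f"stage = {tau_stage * 1e12:.0f} ps, total = {total * 1e12:.0f} ps")
# stage = 166 ps, total = 798 ps
```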

# Problem 2 – (20 points – This problem is optional)

(a.) Estimate the values of  $t_{PLH}$  and  $t_{PHL}$  for the following pseudo-NMOS inverter that does not have an external load on it (only its own self-capacitance). Assume that a ramp input is applied.

(b.) Usually  $t_{PLH} >> t_{PHL}$  for this type of inverter. Give two reasons why the p-channel device is not made larger so that  $t_{PLH} = t_{PHL}$ .

<u>Solution</u>

a.)

$$t_{PLH}(\text{step}) = R_p C_{eff} = R_{eqp} \left(\frac{L_p}{W_p}\right) 5 \cdot C_{eff} = 30 \text{k}\Omega(0.2)(5)1\text{fF} = 30\text{ps}$$

$$t_{PHL}(\text{step}) = R_n C_{eff} = R_{eqn} \left(\frac{L_n}{W_n}\right) 5 \cdot C_{eff} = 12.5 \text{k}\Omega \left(\frac{0.2}{4}\right) (5)1\text{fF} = 3.125 \text{ps}$$

$$t_{PLH}(\text{ramp}) = 1.5 \ t_{PLH}(\text{step}) = 1.5(30 \text{ps}) = \underline{45 \text{ps}}$$

$$t_{PHL}(\text{ramp}) = 1.5 \ t_{PHL}(\text{step}) = 1.5(3.125\text{ps}) = \underline{4.69\text{ps}}$$

b.) Two possible reasons are:

1.) It increases V<sub>OL</sub> (the gate is ratioed, so a stronger pull-up raises the output low level).

2.) It requires more area.
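The delay estimates in part (a) reduce to a few multiplications. This sketch takes the resistance factors and the 5 × 1 fF effective capacitance exactly as they are substituted in the solution, and applies the 1.5 ramp-input factor; the names are illustrative.

```python
C_EFF = 5 * 1e-15            # F, effective self-capacitance used above (5 x 1 fF)
r_p = 30e3 * 0.2             # ohms, PMOS: 30 kOhm times the 0.2 factor used above
r_n = 12.5e3 * (0.2 / 4)     # ohms, NMOS: 12.5 kOhm times L_n/W_n = 0.2/4
RAMP_FACTOR = 1.5            # ramp-input correction applied in the solution

t_plh = RAMP_FACTOR * r_p * C_EFF
t_phl = RAMP_FACTOR * r_n * C_EFF
print(f"t_PLH = {t_plh * 1e12:.1f} ps, t_PHL = {t_phl * 1e12:.4f} ps")
print(f"t_PLH / t_PHL = {t_plh / t_phl:.1f}")
```

The ratio of roughly ten between the two delays is the asymmetry that part (b) asks about.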



# Problem 3 – (20 points – This problem is optional)

For the dynamic D-latch shown, compute the output voltages at Q and  $\overline{Q}$  for the given input when the *CLK* goes high ( $V_{DD}$ ). Assume 0.18µm CMOS technology, W = L = 200nm, and  $V_{DD} = 1.8$ V. Use the velocity saturation models for the transistors.

<u>Solution</u>

The first task is to find  $V_Q$  which is complicated by the bulk-source voltage not being zero. We know that,



$$V_T = V_{TO} + \gamma \sqrt{V_{SB} + |2\phi_F|} - \gamma \sqrt{|2\phi_F|}$$

or

$$V_Q = 1.8 - V_T = 1.8 - 0.5 - \gamma \sqrt{V_Q + |2\phi_F|} + \gamma \sqrt{|2\phi_F|}$$

$$V_Q = 1.3 - 0.3 \sqrt{V_Q + 0.84} + 0.3 \sqrt{0.84} = 1.575 - 0.3 \sqrt{V_Q + 0.84}$$

Solving by quadratic or iterating gives  $\underline{V_Q = 1.1516\text{V}} \implies V_{SG} = 0.6484\text{V}$ . We see from this value that the PMOS is saturated and the NMOS is active. Therefore,
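The iterative route can be sketched as a fixed-point loop on the body-effect equation. The values γ = 0.3, |2φ_F| = 0.84 V, and V_TO = 0.5 V are the ones substituted above; the starting guess and the iteration count are arbitrary choices.

```python
import math

GAMMA = 0.3        # V^0.5, body-effect coefficient
PHI2 = 0.84        # V, |2*phi_F|
VDD, VT0 = 1.8, 0.5

v_q = VDD - VT0    # initial guess: ignore the body effect
for _ in range(50):
    # Threshold voltage with V_SB = V_Q, then the node voltage it implies
    v_t = VT0 + GAMMA * (math.sqrt(v_q + PHI2) - math.sqrt(PHI2))
    v_q = VDD - v_t

print(f"V_Q = {v_q:.4f} V, V_SG = {VDD - v_q:.4f} V")
# V_Q = 1.1516 V, V_SG = 0.6484 V
```

The loop converges quickly because the update changes by only about a tenth of any error in the current guess.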

$$\frac{2Wv_{sat}C_{ox}(V_{SG}-0.5)^{2}}{(V_{SG}-0.5)+E_{c}L} = \frac{W}{L} \frac{\mu_{e}C_{ox}}{\left(1+\frac{\overline{V_{Q}}}{E_{c}L}\right)} \left[ (V_{Q}-V_{TO})\overline{V_{Q}} - \frac{\overline{V_{Q}}^{2}}{2} \right]$$

$$\left(1+\frac{\overline{V_{Q}}}{1.2}\right) \left(\frac{2(0.2\times10^{-4})(8\times10^{6})(0.648-0.5)^{2}}{(0.148+0.5)(270)}\right) = 0.652\overline{V_{Q}} - \frac{\overline{V_{Q}}^{2}}{2}$$

$$\left(1+\frac{\overline{V_{Q}}}{1.2}\right) (0.0259) = 0.652\overline{V_{Q}} - \frac{\overline{V_{Q}}^{2}}{2} \rightarrow \overline{V_{Q}}^{2} - 1.303\overline{V_{Q}} + 0.052 = 0$$

$$\overline{V_{Q}} = 0.6516 \pm 0.5\sqrt{1.303^{2} - 4\cdot0.052} = 0.6516 \pm 0.6103 \rightarrow 0.6516 - 0.6103 = 41\text{mV}$$

$$\therefore \ \underline{\overline{V_{Q}} = 41\text{mV}}$$
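The final quadratic can be solved numerically as a check. This sketch uses the coefficients exactly as written above. The smaller root is kept because the larger one (about 1.26 V) exceeds $V_Q - V_{TO} \approx 0.65$ V, which contradicts the non-saturated NMOS equation used to set up the problem.

```python
import math

b, c = -1.303, 0.052            # coefficients of V^2 + b*V + c = 0 from above
disc = math.sqrt(b * b - 4 * c)
roots = ((-b - disc) / 2, (-b + disc) / 2)

v_qbar = min(roots)             # physically consistent root
print(f"roots = {roots[0]:.4f} V and {roots[1]:.4f} V")
print(f"V_Qbar = {v_qbar * 1e3:.0f} mV")   # V_Qbar = 41 mV
```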

## Problem 4 – (20 points – This problem is optional)

Use 0.18µm technology for this problem.

(a.) For the NAND gate shown, size the transistors to deliver a switching threshold of  $V_S = 0.75$ V. Place the device sizes (W) on the schematic in units of nanometers assuming L = 200nm. Choose the W such that  $R_{pullup}$  is the same as the standard inverter.

(b.) The voltage transfer curve of this gate is shown below for various combinations of inputs. Provide an explanation as to why this occurs. How would you adjust the expression of  $V_S$  to account for this effect?



$$\therefore \quad 0.25\chi = 0.55 \quad \rightarrow \chi = 2.2$$

$$\chi^2 = \frac{\mu_n W_n}{\mu_p W_p} \longrightarrow 2.2^2 = \frac{540 W_n}{180 W_p} \longrightarrow \frac{W_n}{W_p} = \frac{180}{540} 2.2^2 = 1.613$$

To get the standard pull-up resistance,  $W_p$  must be 400nm. Therefore,

 $\underline{W_p = 400\text{nm}}$  and  $\underline{W_n = 1.613 \cdot 400\text{nm} = 645\text{nm}}$

Another way to work this problem is to assume  $L_n = L_p = L$ . Therefore,

$$\chi = \sqrt{\frac{E_{cp}W_n}{E_{cn}W_p}} = 2\sqrt{\frac{W_n}{W_p}} = 2.2 \qquad \longrightarrow \qquad \frac{W_n}{W_p} = 1.21$$

Choosing  $W_p = 400$ nm gives  $W_n = 484$ nm. Both sets of sizes are acceptable answers, although they differ slightly.
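Both sizing routes can be compared side by side. The sketch below starts from $0.25\chi = 0.55$ and uses the mobility values (540 and 180) and the factor of 2 from the critical-field ratio exactly as they appear in the two derivations; the variable names are illustrative.

```python
chi = 0.55 / 0.25            # from 0.25*chi = 0.55
w_p = 400                    # nm, chosen to match the standard pull-up

# Mobility-based model: chi^2 = (mu_n * W_n) / (mu_p * W_p)
ratio_mobility = chi**2 * 180 / 540
# Critical-field model: chi = sqrt(E_cp*W_n / (E_cn*W_p)) = 2*sqrt(W_n/W_p)
ratio_field = (chi / 2) ** 2

print(f"W_n = {ratio_mobility * w_p:.0f} nm (mobility model)")        # 645 nm
print(f"W_n = {ratio_field * w_p:.0f} nm (critical-field model)")     # 484 nm
```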

(b.) Compared with B, A moves to the left because of the body effect on the NMOS transistor.

With both inputs tied together the pull-down is weaker because neither input is at  $V_{DD}$  initially.



## Problem 5 – (20 points – This problem is optional)

What is the minimum possible delay through the following circuit from node A to node E, and how would you size the gates? You can leave the delay and sizes in unitless quantities. Assume that the two NAND gates at the output are the same size.



## Problem 6 – (20 points – This problem is optional)

A dynamic logic gate is shown. The pre-charge device has a W/L of 5, the n-channel devices have a W/L of 3, and the inverter has a pull-up of 4 and a pull-down of 1. (a.) What is the logic function of the gate at the output of the inverter? (b.) Why is the output inverter skewed? (c.) Using the device sizes above, what is the logical effort of input B of the first stage of the domino logic (when its immediate output is falling), and the inverter (when its output is rising). Compute these two values separately.



**Solution** 

(a.) f = (B+D)(A+C+E)

(b.) To make the risetime faster.

(c.) *B* input:

The *W/L* of each transistor is 3 times the minimum size so the resistance of each of these transistors is R/3. However, there are three of them in series in the pulldown path so the effective output resistance is 3R/3 = R. The input capacitance is 5 times the minimum capacitance. Therefore,

$$LE_B = \frac{(R)(5C_g)}{3RC_g} = 5/3 \quad \rightarrow \underline{LE_B = 5/3}$$

Inverter input:

For the standard inverter, the input capacitance is  $3C_g$  and the output resistance is R. However, the pullup is twice as big as usual so its output resistance is R/2 instead of R. Therefore,

$$LE_{inv} = \frac{(R/2)(5C_g)}{3RC_g} = 5/6 \qquad \rightarrow \underline{LE_{inv} = 5/6}$$
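Both logical-effort values are ratios against the same reference inverter ($R_{out} = R$, $C_{in} = 3C_g$), so they can be written as one small helper. This is a sketch with a hypothetical function name; the resistance and capacitance multiples are the ones stated in the solution.

```python
from fractions import Fraction


def logical_effort(r_out, c_in):
    """Logical effort relative to a reference inverter with R_out = R, C_in = 3*C_g.

    r_out is given in units of R and c_in in units of C_g.
    """
    return Fraction(r_out) * Fraction(c_in) / 3


# Input B: three W/L = 3 devices in series -> 3 * (R/3) = R; C_in taken as 5*C_g
le_b = logical_effort(Fraction(3, 3), 5)
# Skewed inverter: pull-up of 4 (twice the standard 2) -> R/2; C_in = (4 + 1)*C_g
le_inv = logical_effort(Fraction(1, 2), 5)
print(le_b, le_inv)   # 5/3 5/6
```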

# Problem 7 – (20 points – This problem is optional)

(a.) Estimate the worst-case fall propagation delay,  $t_{PHL}$ , for the circuits below. Assume that each gate is only loaded by its own self-capacitance and a step function is applied at the inputs. The values beside each transistor are W/L ratios. Identify the fastest and slowest gate given these configurations. Assume that the channel length of all transistors is  $0.2\mu$ m.

(b.) Now assume that there is a single 100fF loading of each of the gates above (and the self capacitances are now zero). Compute the  $t_{PHL}$  delays for the gates and again identify the fastest and the slowest gates.



<u>Solution</u>

(a.)

Inverter:  $\tau = 0.7RC_{inv} = 0.7(12.5k\Omega)(3)(0.2\mu m)(1fF/\mu m) = 5.25ps$ 

NAND:  $\tau = 0.7RC_{nand} = 0.7(12.5k\Omega)(8)(0.4\mu m)(1fF/\mu m) = 28ps$ 

NOR:  $\tau = 0.7RC_{nor} = 0.7(12.5k\Omega/2)(10)(0.4\mu m)(1fF/\mu m) = 17.5ps$ 

Therefore the inverter is the fastest and the NAND gate the slowest.

(b.)

Inverter:  $\tau = 0.7RC_{inv} = 0.7(12.5k\Omega)(100\text{fF}) = 875\text{ps}$ 

NAND:  $\tau = 0.7RC_{nand} = 0.7(12.5k\Omega)(100fF) = 875ps$ 

NOR:  $\tau = 0.7RC_{nor} = 0.7(12.5k\Omega/2)(100fF) = 438ps$ 

Therefore the NOR is the fastest and the NAND gate and inverter the slowest.
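The two rankings can be reproduced from the values used above. In this sketch each gate is described by its pull-down resistance, the width multiple, and the unit width in µm, taken directly from the substitutions in parts (a) and (b); the dictionary layout is only an illustration.

```python
R = 12.5e3          # ohms, equivalent NMOS resistance
C_PER_UM = 1e-15    # F/um, self-capacitance per unit width
C_LOAD = 100e-15    # F, external load in part (b)

# gate: (pull-down resistance, width multiple, unit width in um)
gates = {
    "inverter": (R, 3, 0.2),
    "NAND": (R, 8, 0.4),
    "NOR": (R / 2, 10, 0.4),
}

# Part (a): each gate loaded only by its own self-capacitance
self_loaded = {g: 0.7 * r * n * w * C_PER_UM for g, (r, n, w) in gates.items()}
# Part (b): self-capacitance ignored, fixed 100 fF load
ext_loaded = {g: 0.7 * r * C_LOAD for g, (r, _, _) in gates.items()}

for g in gates:
    print(f"{g}: {self_loaded[g] * 1e12:.2f} ps (self), "
          f"{ext_loaded[g] * 1e12:.1f} ps (100 fF)")
print("fastest / slowest (self):",
      min(self_loaded, key=self_loaded.get), "/",
      max(self_loaded, key=self_loaded.get))
print("fastest (100 fF):", min(ext_loaded, key=ext_loaded.get))
```

With a fixed external load only the pull-down resistance matters, which is why the ranking changes between the two parts.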