



## GLOBAL JOURNAL OF RESEARCHES IN ENGINEERING: F ELECTRICAL AND ELECTRONICS ENGINEERING

Volume 17 Issue 7 Version 1.0 Year 2017

Type: Double Blind Peer Reviewed International Research Journal

Publisher: Global Journals Inc. (USA)

Online ISSN: 2249-4596 & Print ISSN: 0975-5861

# A Pseudo-PMOS Logic for Realizing Wide Fan-in NAND Gates

By Sherif M. Sharroush

*Port Said University*

**Abstract-** Wide fan-in logic gates, when implemented in static complementary CMOS logic, incur a significant area overhead, consume a large amount of power, and have a large propagation delay. In this paper, a pseudo-PMOS logic is presented for the realization of wide fan-in NAND gates in a manner similar to the realization of wide fan-in NOR gates using the pseudo-NMOS logic. The circuit design issues of this family are discussed. Also, it is compared with the conventional CMOS logic from the points of view of the area, the average propagation delay, the average power consumption, and the logic swing using a proper figure of merit. The effects of technology scaling and process variations on this family are investigated. Simulation results obtained with the 45 nm CMOS technology verify the enhancement in performance.

**Keywords:** area, energy-delay product, power consumption, power-delay product, propagation delay, pseudo-PMOS logic, wide fan-in.

**GJRE-F Classification:** FOR Code: 090699





## I. INTRODUCTION

CMOS circuits that are implemented using the universal NAND or NOR gates may contain a large number of serially connected NMOS or PMOS transistors if the fan-in is wide. Refer to Fig. 1 for such a wide fan-in logic circuit with $n$ inputs realized in CMOS logic. The main problem associated with such circuits is the large propagation delay. This is due to the large $RC$ time constant associated with charging or discharging the parasitic capacitances at the output node as well as those at the internal nodes. Also, the current-driving capability of the transistors degrades due to the reduction of their effective gate voltages and their drain-to-source voltages, in addition to the threshold-voltage increase due to the body effect. To make matters worse, such gates, when realized in logic-circuit families such as domino CMOS and pseudo-NMOS, suffer from the contention current, thus requiring a large area for the pull-down network (PDN) in order to have an acceptable noise margin and speed. Wide fan-in gates may be needed, for example, in the multi-input exclusive-OR gates required in parity-check and error-correction circuits, in some built-in testing circuits, and in barrel shifters [1].

For realizing wide fan-in NOR gates, it is better to use the pseudo-NMOS logic-circuit family in which the long chain of the PMOS transistors is substituted by an always-activated PMOS transistor. In this paper, the pseudo-PMOS logic-circuit family is adopted in a similar manner in realizing wide fan-in NAND gates in which the long chain of the NMOS transistors is substituted by an always-activated NMOS transistor. The performance of this family is investigated and compared with the conventional CMOS logic. The remainder of this paper is organized as follows: A survey of the previous work related to the problem at hand is presented in Section II. The pseudo-PMOS logic is presented qualitatively in Section III. The circuit design issues and the comparison of this family with the conventional static CMOS realization are presented quantitatively in Section IV. The effects of the technology scaling and the process variations on this family are discussed in Sections V and VI, respectively. The enhancement in performance is verified by simulation in Section VII. Finally, the paper is concluded in Section VIII.



**Fig. 1:** An  $n$ -input NAND gate using the static complementary CMOS logic circuit family with the internal capacitances indicated.

## II. PREVIOUS WORK

In this section, some of the previous techniques for enhancing the performance of wide fan-in gates are reviewed. M. M. Khellah et al. [2] proposed a technique to lower both the dynamic-switching power consumption and the time delay of wide fan-in dynamic gates. This technique depends on generating a low-swing signal at the output node by charging and discharging a small dummy capacitor. By virtue of the principle of charge sharing, a small swing is created on the gate output; finally, this swing is amplified to full rail using a suitable sense amplifier. Several techniques for reducing the power consumption of wide fan-in gates can be found in [3, 4, 5]. Lowering the power consumption by reordering schemes is usually associated with a delay penalty, as reordering usually moves the late-arriving inputs farther away from the gate output.

A novel conditional isolation technique for reducing the evaluation time of wide fan-in domino gates was proposed by W. H. Chiu et al. [6]. This technique also reduces both the subthreshold and the gate-oxide leakage currents simultaneously. According to [6], reductions in the total static power by 36%, in the dynamic power by 49.14%, and in the delay time by 60.27% compared to the conventional domino gate can be achieved. H. Mostafa et al. proposed adopting novel negative-capacitance circuits in order to reduce the delay variability under process variations [7]. According to this technique, the timing yield was improved from 50% to 100% for a 64-input wide dynamic OR gate at the expense of an excess power overhead.

In [8], K. Mohanram et al. proposed reordering the inputs, exploiting the symmetry of the circuit with respect to its inputs, in order to minimize the switching activity and hence the power consumption. An average reduction of 16% was achieved in power consumption using this scheme. This technique allows for a tradeoff between the complexity of the computation and the quality of the final output. Also, in order to reduce the glitching-power consumption, an extra dimension is added to the complexity of the problem (specifically, the pipelining) in order to obtain the inputs of the circuit at nearly the same instant. A. A. George et al. achieved a better noise immunity and a reduced leakage current without any degradation in speed for wide fan-in domino gates by comparing the worst-case leakage current of the pull-up network (PUN) with a mirrored version of this current [9].

F. Moradi et al. proposed a technique that enhances the performance of wide fan-in domino gates by employing a footer transistor that is initially off in the evaluation phase, thus reducing leakage [10]. Also, the proposed scheme reduces the contention between the keeper transistor and the PDN during the evaluation phase. K. Rajasri et al. proposed a 256-bit comparator by adopting a novel technique called the current-comparison domino circuit, thus reducing both the time delay and the leakage-power consumption [11]. Anamika et al. adopted the stacking effect to reduce the leakage-power consumption for wide fan-in circuits [12].

Finally, the reader is referred to [13, 14] for techniques that depend on novel circuits having the same output as the conventional wide fan-in circuit but with improved performance. In the next section, the pseudo-PMOS logic is presented.

## III. THE PSEUDO-PMOS LOGIC-CIRCUIT FAMILY

The idea of the pseudo-PMOS logic is simply as follows: It is well known from DeMorgan's law that

$$\overline{A_1 A_2 \dots A_n} = \overline{A_1} + \overline{A_2} + \dots + \overline{A_n} \quad (1)$$

That is, logic "0" is obtained at the output of a circuit if all the inputs are at logic "1" and logic "1" is obtained at the output if any of the inputs is at logic "0." This can be implemented as well known by the series connection of NMOS transistors in the PDN and the parallel connection of PMOS transistors in the PUN. However, the right-hand side of Eq. (1) can be implemented simply using a parallel connection of PMOS transistors in the PUN. Now, refer to Fig. 2 for illustration of the pseudo-PMOS logic with  $n$  inputs.



Fig. 2: The pseudo-PMOS logic for realizing wide fan-in NAND gates.



Fig. 3: The pseudo-PMOS logic with the use of two cascaded inverters.

If at least one of the inputs is deactivated, then the corresponding PMOS transistor will conduct. Due to the continuous conduction of the NMOS transistor,  $M_N$ , there is a voltage division between these two devices. The equivalent resistance of  $M_N$  can be adjusted by properly choosing the biasing voltage,  $V_B$ , or adjusting its strength through the aspect ratio,  $(W/L)_n$ , or the threshold voltage,  $V_{thn}$ .

Now, if all the inputs are activated, all the PMOS devices will be deactivated. Thus, the parasitic capacitance at the output node is discharged with no contention from the PMOS parallel network and the output is at logic "0" as it must be. The main advantage of the pseudo-PMOS logic is that increasing the number of the inputs merely increases the parasitic capacitance at the output node, thus not affecting the performance of this family significantly. On the other hand, increasing the number of the inputs in the CMOS logic has a significantly deleterious effect on its performance; a point that was discussed in the preceding section and is returned to in Section IV.
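The qualitative operation described above can be captured in a purely logic-level sketch. The function names below are hypothetical, and the model assumes that $M_N$ is sized so that a single conducting PMOS device already yields a valid logic "1"; it says nothing about voltages or delays.

```python
from itertools import product


def pseudo_pmos_nand(inputs):
    """Logic-level model of the pseudo-PMOS NAND gate.

    A PMOS device conducts when its input is 0, and M_N always conducts.
    Returns (output, number_of_conducting_pmos).
    """
    conducting_pmos = sum(1 for a in inputs if a == 0)
    # With no PMOS conducting, M_N discharges the output node to logic 0.
    # Otherwise the divider is assumed to hold the output at logic 1.
    output = 1 if conducting_pmos > 0 else 0
    return output, conducting_pmos


def nand(inputs):
    """Reference NAND: 0 only when every input is 1."""
    return 0 if all(inputs) else 1


def matches_nand(n):
    """Exhaustively compare the model with the reference for n inputs."""
    return all(
        pseudo_pmos_nand(bits)[0] == nand(bits)
        for bits in product((0, 1), repeat=n)
    )
```

The second return value indicates how many PMOS devices share the pull-up current; the worst case for the output-high level is a count of one.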

Instead of applying a separate biasing voltage, $V_B$, a voltage equal to the adopted power supply, $V_{DD}$, can be applied to $M_N$ while either increasing its threshold voltage or lowering its aspect ratio to obtain a larger equivalent resistance at the lower arm of the voltage divider. Lowering the aspect ratio can be implemented by connecting multiple transistors in series, as $n$ serially connected transistors, each with aspect ratio $(W/L)$, are equivalent to a single transistor with aspect ratio $(W/nL)$ [15]. Applying $V_{DD}$ in this way avoids the need to generate a separate voltage. An important note that is worth mentioning here concerns the proper choice of the threshold voltage of $M_N$, $V_{thn}$. Increasing this voltage certainly slows down the discharging process of $C_L$, but it also makes the equivalent resistance of $M_N$ larger. Thus, the output voltage resulting from the voltage division is larger, with the result that the output-high level and consequently the logic swing are larger. Also, the low-to-high transition is faster. These conflicting requirements are returned to in Sections IV and VII.

In order to resolve this contradiction, the scheme of Fig. 3 can be used, in which two cascaded inverters are added. The benefits gained from adding these two inverters are a rail-to-rail voltage swing at the output and reduced rise and fall times of the output waveform. This in turn reduces the short-circuit power consumption in the driven stages. If the previously mentioned parameters are properly chosen, then the voltage at the input of the first inverter, $V_{CL}$, will be relatively high (larger than the threshold voltage of the first inverter, $V_{thinv1}$) in case only one input is deactivated. If more than one input is deactivated, then more than one PMOS device will be activated, with the result that the equivalent resistance of the upper part of the voltage divider decreases. The voltage at the input of the first inverter then becomes larger than in the previous case, and the output voltage is again at logic "1." The price paid, however, is the dc current drawn through the first inverter, the short-circuit power consumption of the two inverters, and the additional propagation delays of the two added inverters. Also, the change of $V_{CL}$ with respect to the threshold voltage of the first inverter due to process variations affects the reliability of the scheme; a point that is returned to in Section VI. *Throughout this paper, the scheme of Fig. 3 is adopted unless otherwise specified.* Note also that in order to inhibit the large power consumption in the standby state due to the continuous current drawn from $V_{DD}$ to ground, the signal, $V_B$, must be connected to the standby signal. Thus, the path from $V_{DD}$ to ground becomes open during the standby interval.

Two important notes are in order here. The first one is that the pseudo-PMOS logic family can be used in realizing any logic circuit with series or parallel connections in the PUNs or PDNs. In this case, the PUN is the same as that of the conventional CMOS logic. The pseudo-PMOS logic is obviously not suitable for realizing logic circuits containing serially connected PMOS transistors in their PUNs as it requires a significant area overhead. The second note is that the quantitative analysis of the next section can be applied equally well to the pseudo-NMOS logic after substituting the acronyms associated with the NMOS devices by those of the PMOS ones and vice versa.



Fig. 4: The circuit schematic representing the worst-case scenario

## IV. CIRCUIT DESIGN ISSUES

In this section, the circuit design issues of the pseudo-PMOS logic are discussed quantitatively from six aspects. The first one is the proper choice of the strength of the NMOS transistor,  $M_N$ , and the threshold voltage of the first inverter,  $V_{thinv1}$  (if used). The second, third, fourth, and fifth aspects concern the comparisons between the pseudo-PMOS logic and the conventional CMOS logic from the points of view of the area, the average propagation delay, the average power consumption, and the logic swing. Finally, a figure of merit that includes these metrics is defined and adopted in comparing the performance of the pseudo-PMOS logic and conventional CMOS logic.

### a) The Proper Choice of the Strength of the NMOS Transistor

In determining the proper range for the values of  $V_{thinv1}$ ,  $(W/L)_n$ ,  $V_{thn}$ , and  $V_B$ , we adopt the worst-case scenario. The worst-case scenario is the assumption of only one deactivated input because it represents the minimum strength for the PMOS parallel combination and thus the highest equivalent resistance. If the pseudo-PMOS logic operates properly under this condition, it can be ensured to operate properly for all possible input combinations.

Refer now to Fig. 4 for this scenario.  $M_N$  operates in the saturation region for typical values of the adopted NMOS-transistor parameters in the worst case just described. Since  $V_{CL1}$ , the final steady-state voltage across  $C_L$  (the parasitic capacitance at the input of the first inverter), is chosen to be larger than  $V_{thinv1}$ , it is expected to be larger than  $V_{DD}/2$ . Thus, for the typical values of the PMOS-transistor parameters,  $M_P$  is expected to operate in the triode region as its  $V_D$  (drain voltage) is larger than its  $V_G + |V_{thp}|$ , where  $V_G$  and  $V_{thp}$  are the gate and threshold voltages of  $M_P$ , respectively. If the PMOS device is assumed to operate in the deep-triode region, then after equating the currents of the NMOS and PMOS devices in which the Shichman-Hodges square-law MOSFET model is adopted [16], we obtain

$$\frac{1}{2}k_n' \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 (1 + \lambda_n V_{CL1}) = k_p' \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|) (V_{DD} - V_{CL1}) \quad (2)$$

where  $k_n'$ ,  $(W/L)_n$ , and  $V_{thn}$  are the process-transconductance parameter, the aspect ratio, and the threshold voltage of NMOS devices, and  $k_p'$ ,  $(W/L)_p$ , and  $V_{thp}$  are their PMOS counterparts.  $\lambda_n$  is the channel-length modulation effect parameter of NMOS devices. After simple mathematical manipulations, we readily obtain

$$V_{CL1} = \frac{k_p' \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|) V_{DD} - \frac{1}{2}k_n' \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2}{\frac{1}{2}k_n' \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 \lambda_n + k_p' \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|)} \quad (3)$$
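As a rough numerical illustration, the current balance of Eq. (2) can be solved for $V_{CL1}$ and checked by back-substitution. This is only a sketch: the function names and all device values below are hypothetical placeholders chosen for illustration, not parameters taken from the paper or from any 45 nm model.

```python
def v_cl1(kn, wl_n, v_b, v_thn, lam_n, kp, wl_p, v_dd, v_thp_abs):
    """Steady-state voltage across C_L with exactly one PMOS conducting.

    Solves Eq. (2): a * (1 + lam_n * V) = b * (V_DD - V) for V.
    """
    a = 0.5 * kn * wl_n * (v_b - v_thn) ** 2  # M_N saturation current, lambda = 0
    b = kp * wl_p * (v_dd - v_thp_abs)        # M_P deep-triode conductance
    return (b * v_dd - a) / (a * lam_n + b)


def balance_residual(v, kn, wl_n, v_b, v_thn, lam_n, kp, wl_p, v_dd, v_thp_abs):
    """NMOS current minus PMOS current at node voltage v (zero at the solution)."""
    i_n = 0.5 * kn * wl_n * (v_b - v_thn) ** 2 * (1 + lam_n * v)
    i_p = kp * wl_p * (v_dd - v_thp_abs) * (v_dd - v)
    return i_n - i_p


# Hypothetical example values (A/V^2, V, 1/V), for illustration only.
example = dict(kn=300e-6, wl_n=1.0, v_b=0.8, v_thn=0.4, lam_n=0.1,
               kp=100e-6, wl_p=2.0, v_dd=0.8, v_thp_abs=0.32)
v_example = v_cl1(**example)
```

With these placeholder values the node settles above $V_{DD}/2$, which is the situation assumed when $M_P$ is taken to be in the deep-triode region.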

Alternatively, each of the NMOS and PMOS devices,  $M_N$  and  $M_P$ , can be replaced by its equivalent resistance. The equivalent resistance,  $R_{MP}$ , of  $M_P$  in the deep-triode region is [17]

$$R_{MP} = \frac{1}{k_p' \left( \frac{W}{L} \right)_p (V_{SG} - |V_{thp}|)} = \frac{1}{k_p' \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|)} \quad (4)$$

However, the equivalent resistance of  $M_N$ , let it be  $R_{MN}$ , can be written as the ratio between the average drain-to-source voltage and the average drain current, thus

$$R_{MN} = \frac{\frac{1}{2}(V_{CL1} + 0)}{\frac{1}{2} \left[ \frac{1}{2}k_n' \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 (1 + \lambda_n V_{CL1}) + 0 \right]} = \frac{V_{CL1}}{\frac{1}{2}k_n' \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 (1 + \lambda_n V_{CL1})} \quad (5)$$

The voltage,  $V_{CL1}$ , can be found simply from the voltage division between  $R_{MN}$  and  $R_{MP}$  as follows:

$$V_{CL1} = \frac{R_{MN} V_{DD}}{R_{MN} + R_{MP}} \quad (6)$$
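As a short consistency check (a sketch added for illustration, using only the symbols already defined), the divider of Eq. (6) can be shown to reproduce the current balance of Eq. (2). Rearranging Eq. (6) gives

$$V_{CL1} R_{MP} = R_{MN} (V_{DD} - V_{CL1})$$

and substituting $R_{MP}$ from Eq. (4) and $R_{MN}$ from Eq. (5) yields

$$\frac{V_{CL1}}{k_p' \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|)} = \frac{V_{CL1} (V_{DD} - V_{CL1})}{\frac{1}{2}k_n' \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 (1 + \lambda_n V_{CL1})}$$

Cancelling the nonzero factor $V_{CL1}$ and cross-multiplying recovers Eq. (2), so both routes lead to the same value of $V_{CL1}$.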

After substituting $R_{MN}$ and $R_{MP}$ into Eq. (6), the expression for $V_{CL1}$ can be obtained. Now, requiring $V_{CL1}$ to be larger than $V_{thinv1}$ results in the following inequality (from which the strength of the NMOS device can be determined):

$$\frac{k_p' \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|) V_{DD} - \frac{1}{2}k_n' \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2}{\frac{1}{2}k_n' \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 \lambda_n + k_p' \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|)} > V_{thinv1} \quad (7)$$

$V_{thinv1}$  in turn can be evaluated from [17]

$$V_{thinv1} = \frac{\sqrt{\frac{k_p' \left( \frac{W}{L} \right)_{p1}}{k_n' \left( \frac{W}{L} \right)_{n1}}} \left( V_{DD} - |V_{thp1}| \right) + V_{thn1}}{1 + \sqrt{\frac{k_p' \left( \frac{W}{L} \right)_{p1}}{k_n' \left( \frac{W}{L} \right)_{n1}}}} \quad (8)$$

where  $(W/L)_{n1}$ ,  $(W/L)_{p1}$ ,  $V_{thn1}$ , and  $V_{thp1}$  are the aspect ratios and the threshold voltages of the constituting NMOS and PMOS transistors of the first inverter, respectively.
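The threshold of the first inverter can be evaluated numerically with the standard square-law expression $V_{thinv1} = [r(V_{DD} - |V_{thp1}|) + V_{thn1}]/(1 + r)$, where $r = \sqrt{k_p'(W/L)_{p1} / k_n'(W/L)_{n1}}$. The sketch below assumes that form; the function name and the numbers in the example are hypothetical.

```python
from math import sqrt


def inverter_threshold(kn, wl_n1, v_thn1, kp, wl_p1, v_thp1_abs, v_dd):
    """Switching threshold of a CMOS inverter under the square-law model."""
    r = sqrt((kp * wl_p1) / (kn * wl_n1))  # ratio of PMOS to NMOS strength
    return (r * (v_dd - v_thp1_abs) + v_thn1) / (1.0 + r)


# Hypothetical matched inverter: equal strengths and equal threshold magnitudes.
v_thinv1_matched = inverter_threshold(
    kn=300e-6, wl_n1=1.0, v_thn1=0.3,
    kp=100e-6, wl_p1=3.0, v_thp1_abs=0.3, v_dd=0.8,
)
```

For a matched inverter the threshold sits at $V_{DD}/2$, so the design condition reduces to requiring $V_{CL1} > V_{DD}/2$ in the worst case; weakening the PMOS device lowers the threshold and relaxes that condition.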

Before leaving this subsection, an important note follows: It is obvious from the qualitative discussion of the pseudo-PMOS logic that increasing the strength of the NMOS device,  $M_N$ , causes the low-to-high and the high-to-low propagation delays to increase and decrease, respectively, i.e. to change in opposite directions. Thus, it can be concluded that there is an optimum value for the parameters determining this strength such as the threshold voltage and the aspect ratio at which the average propagation delay is at its minimum. This is really the case and this point is confirmed in Subsection C.

### b) The Area Comparison

In comparing the areas of the pseudo-PMOS logic with the CMOS logic, we adopt the approximation that the area of a certain transistor is equal to its channel area [17]. Adopting the convention that the size of the PMOS transistor is twice that of the NMOS one in order to compensate for the mobility difference and that each of the  $n$  NMOS transistors in the series connection has an aspect ratio of  $n$  in order to compensate for the degradation in delay [17], then the areas of the conventional and the proposed logic-circuit families,  $A_c$  and  $A_p$ , can be approximated by

$$A_c = WL(n^2 + 2n) \quad (9)$$

$$A_p = (2n + 7)WL \quad (10)$$



**Fig. 5:** The plots of the approximated areas of the CMOS logic and the pseudo-PMOS logic versus the number of the inputs.

The plots of  $A_c$  and  $A_p$  versus  $n$  for  $W = L = 45$  nm are shown in Fig. 5. It can be concluded from this rough estimation of the area that the area overhead of the two-cascaded inverters is justified when the number of the inputs exceeds 2. Had we adopted the version of Fig. 2 for the pseudo-PMOS logic, the area of this family would have been smaller than that of the CMOS logic for all values of  $n$ .
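The crossover quoted above follows directly from Eqs. (9) and (10). A minimal sketch, with areas expressed in units of $WL$ by default and hypothetical function names:

```python
def area_cmos(n, w=1.0, l=1.0):
    """Eq. (9): n series NMOS of aspect ratio n plus n parallel PMOS of ratio 2."""
    return w * l * (n ** 2 + 2 * n)


def area_pseudo_pmos(n, w=1.0, l=1.0):
    """Eq. (10): n PMOS of ratio 2, M_N, and two inverters (scheme of Fig. 3)."""
    return w * l * (2 * n + 7)


def first_n_with_smaller_pseudo_pmos_area(max_n=64):
    """Smallest fan-in at which the pseudo-PMOS area drops below the CMOS area."""
    for n in range(1, max_n + 1):
        if area_pseudo_pmos(n) < area_cmos(n):
            return n
    return None
```

Equating the two expressions gives $n^2 = 7$, so the pseudo-PMOS scheme of Fig. 3 becomes the smaller one from $n = 3$ onward, and the gap grows quadratically with $n$.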

### c) The Average Propagation-Delay Comparison

The average propagation delay according to the pseudo-PMOS logic is defined as

$$t_{pavg} = \frac{t_{PLH_p} + t_{PHL_p}}{2} \quad (11)$$

where $t_{PLH_p}$ and $t_{PHL_p}$ are the low-to-high and the high-to-low propagation delays according to the pseudo-PMOS logic, respectively. To determine $t_{PLH_p}$, refer to the circuit shown in Fig. 4. This circuit also represents the worst case from the point of view of the delay, as the charging current of $C_L$ is at its smallest and thus the estimated value of $t_{PLH_p}$ is at its largest. The time delay, $t_{PLH_p}$, contains three subcomponents, $t_{PLH_{p1}}$, $t_{PLH_{p2}}$, and $t_{PLH_{p3}}$. These are the time delay required to charge $C_L$ to a certain steady-state value that depends on the relative strengths of the activated PMOS device and the always-activated NMOS device, the high-to-low propagation delay of the first inverter, and the low-to-high propagation delay of the second inverter, respectively. The first subcomponent can be approximated by [17]

$$t_{PLH_{p1}} = \frac{C_L \Delta V_{CL}}{i_{chavg}} \quad (12)$$

where $\Delta V_{CL}$ is the voltage change of $V_{CL}$ and $i_{chavg}$ is the average charging current of $C_L$. To determine $C_L$, we adopt the assumption that the aspect ratio of the PMOS device is twice that of the NMOS one in order to compensate for the difference in their mobilities and assume that the parasitic capacitance associated with each terminal of the minimum-sized NMOS transistor is $C$ [18]. Then $C_L$ can be approximated as

$$C_L = \left[ 3 + 2n + \left( \frac{W}{L} \right)_n \right] C \quad (13)$$

where $(W/L)_n$ is the aspect ratio of $M_N$. $\Delta V_{CL}$ is equal to $0.5V_{CL1}$ (adopting the 50% criterion) and $i_{chavg}$ can be found from

$$i_{chavg} = i_{MPavg} - i_{MNavg}, \quad (14)$$

where $i_{MPavg}$ and $i_{MNavg}$ are the average currents of $M_P$ and $M_N$, respectively. The last two currents can be found as follows:

$$i_{MPavg} = \frac{i_{MP}(at V_{CL} = 0) + i_{MP}(at V_{CL} = V_{CL1})}{2} \quad (15)$$

$$\therefore i_{MPavg} = \frac{1}{2} \left[ \frac{1}{2} k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|)^2 (1 + \lambda_p V_{DD}) + k_p \left( \frac{W}{L} \right)_p \left[ (V_{DD} - |V_{thp}|)(V_{DD} - V_{CL1}) - \frac{1}{2} (V_{DD} - V_{CL1})^2 \right] \right] \quad (16)$$

$i_{MNavg}$ is given by

$$i_{MNavg} = \frac{i_{MN}(at V_{CL} = 0) + i_{MN}(at V_{CL} = V_{CL1})}{2} \quad (17)$$

$$\therefore i_{MNavg} = \frac{1}{4} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 (1 + \lambda_n V_{CL1}). \quad (18)$$

Substituting these two average currents into Eqs. (12) and (14) results in

$$t_{PLH_{p1}} = \frac{\left[ 3 + 2n + \left( \frac{W}{L} \right)_n \right] C (0.5 V_{CL1})}{\frac{1}{2} \left[ \frac{1}{2} k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|)^2 (1 + \lambda_p V_{DD}) + k_p \left( \frac{W}{L} \right)_p \left[ (V_{DD} - |V_{thp}|)(V_{DD} - V_{CL1}) - \frac{1}{2} (V_{DD} - V_{CL1})^2 \right] \right] - \frac{1}{4} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 (1 + \lambda_n V_{CL1})} \quad (19)$$
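The first low-to-high subcomponent can be evaluated step by step from Eqs. (12) to (18). The following is a sketch under the same square-law assumptions; the function names and every numerical value are hypothetical and serve only to show how the pieces combine.

```python
def load_capacitance(n, wl_n, c):
    """Eq. (13): parasitic capacitance at the input of the first inverter."""
    return (3 + 2 * n + wl_n) * c


def i_mp_avg(kp, wl_p, v_dd, v_thp_abs, lam_p, v_cl1):
    """Eqs. (15)-(16): average of the M_P current at V_CL = 0 and V_CL = V_CL1."""
    v_ov = v_dd - v_thp_abs
    i_sat = 0.5 * kp * wl_p * v_ov ** 2 * (1 + lam_p * v_dd)  # V_CL = 0
    v_sd = v_dd - v_cl1
    i_tri = kp * wl_p * (v_ov * v_sd - 0.5 * v_sd ** 2)       # V_CL = V_CL1
    return 0.5 * (i_sat + i_tri)


def i_mn_avg(kn, wl_n, v_b, v_thn, lam_n, v_cl1):
    """Eqs. (17)-(18): average M_N current, taken as zero at V_CL = 0."""
    return 0.25 * kn * wl_n * (v_b - v_thn) ** 2 * (1 + lam_n * v_cl1)


def t_plh_p1(n, wl_n, c, v_cl1, i_mp, i_mn):
    """Eq. (12) with Delta V_CL = 0.5 * V_CL1 and Eq. (14) for the current."""
    i_ch = i_mp - i_mn
    if i_ch <= 0:
        raise ValueError("average charging current is not positive")
    return load_capacitance(n, wl_n, c) * 0.5 * v_cl1 / i_ch


# Hypothetical example values, for illustration only.
i_p = i_mp_avg(kp=100e-6, wl_p=2.0, v_dd=0.8, v_thp_abs=0.32, lam_p=0.1, v_cl1=0.54)
i_n = i_mn_avg(kn=300e-6, wl_n=1.0, v_b=0.8, v_thn=0.4, lam_n=0.1, v_cl1=0.54)
t_8 = t_plh_p1(n=8, wl_n=1.0, c=1e-15, v_cl1=0.54, i_mp=i_p, i_mn=i_n)
t_16 = t_plh_p1(n=16, wl_n=1.0, c=1e-15, v_cl1=0.54, i_mp=i_p, i_mn=i_n)
```

Because the fan-in enters only through $C_L$, doubling $n$ from 8 to 16 scales this subcomponent by the capacitance ratio $36/20$ rather than quadratically.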

The other two subcomponents are given by

$$t_{PLH_{p2}} = \frac{C_{out} \Delta V_{out}}{i_{avg}} \quad (20)$$

where  $C_{out}$ ,  $V_{out}$ , and  $i_{avg}$  are the parasitic capacitance at the output of the first inverter, the voltage at the output of the first inverter, and the average discharging current of  $C_{out}$ , respectively. Keeping in mind that the logic “1” feeding the first inverter is  $V_{CL1}$ , then

$$i_{avg} = \frac{1}{4} k_n \left( \frac{W}{L} \right)_{n1} (V_{CL1} - V_{thn1})^2 (1 + \lambda_n V_{DD}) \quad (21)$$

$C_{out}$ and $\Delta V_{out}$ are equal to $6C$ and $V_{DD}$, respectively. So,

$$t_{PLH_{p2}} = \frac{6C V_{DD}}{\frac{1}{4} k_n \left( \frac{W}{L} \right)_{n1} (V_{CL1} - V_{thn1})^2 (1 + \lambda_n V_{DD})} \quad (22)$$

$t_{PLH_{p3}}$ is given by [17]

$$t_{PLH_{p3}} = \frac{2C_{out}}{k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|)} \left[ \frac{|V_{thp}|}{V_{DD} - |V_{thp}|} + \frac{1}{2} \ln \left( \frac{3V_{DD} - 4|V_{thp}|}{V_{DD}} \right) \right] \quad (23)$$

Here, $C_{out}$ is the parasitic capacitance at the output of the second inverter and is given by $3C + C_{fan}$, where $C_{fan}$ is the parasitic capacitance due to the fan-out.

Now, $t_{PHL_p}$ also contains three subcomponents, $t_{PHL_{p1}}$, $t_{PHL_{p2}}$, and $t_{PHL_{p3}}$, which are the time delay required to discharge $C_L$ from $V_{CL1}$ to 0 V, the low-to-high propagation delay of the first inverter, and the high-to-low propagation delay of the second inverter, respectively. $t_{PHL_{p1}}$ can be found from

$$t_{PHL_{p1}} = \frac{C_L \Delta V_{CL}}{i_{disavg}} \quad (24)$$

where  $i_{disavg}$  is the average discharging current through  $M_N$  and is given by

$$i_{disavg} = \frac{i_{MN} (at V_{CL} = V_{CL1}) + i_{MN} (at V_{CL} = 0)}{2} \quad (25)$$

Taking $i_{MN}$ to be zero at $V_{CL} = 0$ and equal to its saturation value at $V_{CL} = V_{CL1}$ results in

$$\therefore i_{disavg} = \frac{1}{4} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 (1 + \lambda_n V_{CL1}) \quad (26)$$

So,  $t_{PHL_{p1}}$  is given by

$$t_{PHL_{p1}} = \frac{2C_L V_{CL1}}{k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 (1 + \lambda_n V_{CL1})} \quad (27)$$
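Equation (27) can be cross-checked against Eq. (24) evaluated with $\Delta V_{CL} = 0.5V_{CL1}$ and the average current of Eq. (26). The sketch below does this numerically; the function names and values are hypothetical.

```python
def i_dis_avg(kn, wl_n, v_b, v_thn, lam_n, v_cl1):
    """Eq. (26): average discharging current through M_N."""
    return 0.25 * kn * wl_n * (v_b - v_thn) ** 2 * (1 + lam_n * v_cl1)


def t_phl_p1_from_definition(c_l, v_cl1, i_avg):
    """Eq. (24) with the 50% criterion, Delta V_CL = 0.5 * V_CL1."""
    return c_l * 0.5 * v_cl1 / i_avg


def t_phl_p1(c_l, v_cl1, kn, wl_n, v_b, v_thn, lam_n):
    """Eq. (27): closed form of the same delay."""
    return 2 * c_l * v_cl1 / (kn * wl_n * (v_b - v_thn) ** 2 * (1 + lam_n * v_cl1))


# Hypothetical example values, for illustration only.
mn = dict(kn=300e-6, wl_n=1.0, v_b=0.8, v_thn=0.4, lam_n=0.1)
t_closed = t_phl_p1(c_l=20e-15, v_cl1=0.54, **mn)
t_defined = t_phl_p1_from_definition(20e-15, 0.54, i_dis_avg(v_cl1=0.54, **mn))
```

For a fixed $C_L$ this delay falls as $M_N$ is made stronger, which is the opposite of the trend in $t_{PLH_{p1}}$; note that $C_L$ itself grows with $(W/L)_n$ through Eq. (13), so the reduction is somewhat less than proportional.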

$t_{PHL_{p2}}$ and $t_{PHL_{p3}}$ were estimated in [17] and were found to be, respectively,

$$t_{PHL_{p2}} = \frac{2C_{out}}{k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|)} \left[ \frac{|V_{thp}|}{V_{DD} - |V_{thp}|} + \frac{1}{2} \ln \left( \frac{3V_{DD} - 4|V_{thp}|}{V_{DD}} \right) \right] \quad (28)$$

and

$$t_{PHL_{p3}} = \frac{2C_{out}}{k_n \left( \frac{W}{L} \right)_n (V_{DD} - V_{thn})} \left[ \frac{V_{thn}}{V_{DD} - V_{thn}} + \frac{1}{2} \ln \left( \frac{3V_{DD} - 4V_{thn}}{V_{DD}} \right) \right] \quad (29)$$
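The inverter-delay expressions of this subsection share one functional form, so a single helper can serve for either transistor type. The sketch below assumes the square-law expression exactly as written in Eq. (29); `inverter_delay` is a hypothetical name, and the threshold is passed as a magnitude.

```python
from math import log


def inverter_delay(c_out, k, wl, v_dd, v_th):
    """Propagation delay of an inverter edge driven by one transistor.

    k is the process-transconductance parameter, wl the aspect ratio, and
    v_th the threshold-voltage magnitude of the conducting device.
    """
    v_ov = v_dd - v_th
    log_arg = (3 * v_dd - 4 * v_th) / v_dd
    if v_ov <= 0 or log_arg <= 0:
        # The expression is only defined for v_th below 0.75 * v_dd.
        raise ValueError("threshold too large for this delay expression")
    return (2 * c_out / (k * wl * v_ov)) * (v_th / v_ov + 0.5 * log(log_arg))
```

Calling it with the PMOS parameters gives the low-to-high delays, and with the NMOS parameters the high-to-low delay; in both cases the result is proportional to the load capacitance.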

As discussed in Subsection A and illustrated in Figs. 6 and 7, there is an optimum value for the aspect ratio and the threshold voltage of  $M_N$  at which  $t_{pavg}$  is at its minimum. Figs. 6 and 7 show the plots of the average propagation delay of the pseudo-PMOS logic versus the aspect ratio and the threshold voltage of  $M_N$ , respectively, according to the scheme of Fig. 3 with  $V_{thn} = 0.25$  V,  $V_{thp} = -0.32$  V,  $V_{DD} = 0.8$  V, and  $C = 1$  fF. As shown in these two figures, the optimum average propagation delay occurs at  $(W/L)_n$  and  $V_{thn}$  equal to 5.2 and 0.58 V, respectively.



Fig. 6: The plot of the average propagation delay of the pseudo-PMOS logic versus the aspect ratio of  $M_N$ ,  $(W/L)_n$ .



Fig. 7: The plot of the average propagation delay of the pseudo-PMOS logic versus the threshold voltage of  $M_N$ ,  $V_{thn}$ .

Now, refer to Fig. 8 for the plots of the average propagation delay versus the number of the inputs according to the analysis and the simulation results adopting the scheme of Fig. 2 and estimating the time delays up to the 50% point.



Fig. 8: The average propagation delay versus the number of the inputs according to the analysis and the simulation results.

Now, the average propagation delay according to the CMOS logic,  $t_{pavgc}$ , can also be defined in a similar way as the average of the low-to-high and the high-to-low propagation delays,  $t_{PLHc}$  and  $t_{PHLc}$ , respectively, as follows:

$$t_{pavgc} = \frac{t_{PLHc} + t_{PHLc}}{2} \quad (30)$$

Toward a simplified evaluation of these two time delays, each NMOS and PMOS transistor in the CMOS-logic circuit is represented by an equivalent resistance, $R_N$ and $R_P$, respectively. In [19], the equivalent resistances of the NMOS and PMOS transistors are approximated by

$$R_N = \frac{\alpha_n}{(W/L)_n} \quad (31)$$

and

$$R_P = \frac{\alpha_p}{(W/L)_p} \quad (32)$$

respectively, where $\alpha_n$ and $\alpha_p$ are process-dependent parameters for the NMOS and PMOS devices, respectively. Simulation results reveal that the best estimates for $\alpha_n$ and $\alpha_p$ are $2.5 \Omega$ and $18 \text{ k}\Omega$, respectively, for the 45 nm CMOS technology. In estimating the low-to-high propagation delay, we adopt the worst case, in which only one input is at logic "0" and this input drives the lowermost NMOS transistor, so that all the internal capacitances must be charged. Applying Elmore's delay formula [20, 21] to the conventional CMOS circuit shown in Fig. 1 results in the following estimations for $t_{PLHc}$ and $t_{PHLc}$:

$$t_{PLHc} = (\ln 2) [R_P C_1 + (R_P + R_N) C_2 + (R_P + 2R_N) C_3 + \dots + (R_P + (n-1)R_N) C_n] \quad (33)$$

and

$$t_{PHLc} = (\ln 2) [nR_N C_1 + (n-1)R_N C_2 + (n-2)R_N C_3 + \dots + R_N C_n] \quad (34)$$

According to the estimation of the parasitic capacitances adopted in this paper, we have:  $C_1 = C_{fan} + 3nC$ ,  $C_2 = C_3 = \dots = C_n = 2nC$ . After substituting for the values of these capacitances into Eqs. (33) and (34), we obtain

$$t_{PLHc} = (\ln 2) [R_P C_{fan} + n(2n+1)R_P C + n^2(n-1)R_N C] \quad (35)$$

and

$$t_{PHLc} = (\ln 2) R_N [nC_{fan} + n^2(n+2)C] \quad (36)$$
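The closed forms of Eqs. (35) and (36) can be checked against the term-by-term Elmore sums of Eqs. (33) and (34). The sketch below performs that comparison; the function names are hypothetical, and the resistance and capacitance values used in any call are placeholders.

```python
from math import log


def elmore_sums(n, r_n, r_p, c, c_fan):
    """Eqs. (33)-(34) summed term by term for an n-input CMOS NAND gate."""
    # C_1 is the output node; C_2 .. C_n are the internal nodes of the NMOS chain.
    caps = [c_fan + 3 * n * c] + [2 * n * c] * (n - 1)
    t_plh = log(2) * sum((r_p + k * r_n) * caps[k] for k in range(n))
    t_phl = log(2) * sum((n - k) * r_n * caps[k] for k in range(n))
    return t_plh, t_phl


def elmore_closed_form(n, r_n, r_p, c, c_fan):
    """Eqs. (35)-(36): the same delays after substituting the capacitances."""
    t_plh = log(2) * (r_p * c_fan + n * (2 * n + 1) * r_p * c
                      + n ** 2 * (n - 1) * r_n * c)
    t_phl = log(2) * r_n * (n * c_fan + n ** 2 * (n + 2) * c)
    return t_plh, t_phl
```

Both delays grow as $n^3$ for a fixed per-transistor resistance, which is the scaling that the pseudo-PMOS scheme avoids by removing the series NMOS chain.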

### d) The Average Power-Consumption Comparison

The average power consumption is the average of the power consumptions in cases of low-to-high and high-to-low transitions. The power consumption of the pseudo-PMOS logic contains the static and the dynamic power components. The static power is consumed in the low-to-high transition only: in the first inverter, because both of its devices are activated, and in the path through the activated PMOS devices and the always-activated NMOS device, $M_N$. The input voltage of the first inverter certainly depends on the number of the activated inputs. So, in order to simplify the dc-power estimation, we assume that the first inverter's input is at $V_{DD}/2$ and that this inverter is matched so that its output will also be at $V_{DD}/2$. The dc-power consumption estimated in this way is certainly an overestimate, as the dc current of the first inverter is at its maximum when the inverter's input is at $V_{DD}/2$ [17]. Thus, the range of the number of the inputs over which the pseudo-PMOS logic is better than the CMOS logic is expected to be larger than that estimated. In case of low-to-high transition, the dc current of the first inverter is (where LH indicates low-to-high transition)

$$I_{DCLH} = \frac{1}{2} k_n \left( \frac{W}{L} \right)_n \left( \frac{V_{DD}}{2} - V_{thn} \right)^2 \left( 1 + \lambda_n \frac{V_{DD}}{2} \right) \quad (37)$$

The dc-power consumption of the first inverter in the low-to-high transition is thus

$$P_{DCLH} = \frac{V_{DD}}{2} k_n \left( \frac{W}{L} \right)_n \left( \frac{V_{DD}}{2} - V_{thn} \right)^2 \left( 1 + \lambda_n \frac{V_{DD}}{2} \right) \quad (38)$$

Similarly, the dc-power consumption through $M_N$ can be written as

$$P_{DCLH2} = \frac{V_{DD}}{2} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 (1 + \lambda_n V_{CL1}) \quad (39)$$

In the other case (high-to-low transition), all the inputs are activated with the result that all the PMOS devices in the parallel connection become equivalent to an open circuit. So, there is no dc power consumption in this transition (HL indicates high-to-low transition),

$$i_{DCHL} = 0 \quad (40)$$

Averaging over the two transitions, with the low-to-high contribution taken as the sum of Eqs. (38) and (39) and with no dc power consumed in the high-to-low transition, the dc-power consumption is

$$P_{DC} = \frac{P_{DCLH} + P_{DCHL}}{2} = \frac{V_{DD}}{4} k_n \left( \frac{W}{L} \right)_n \left[ \left( \frac{V_{DD}}{2} - V_{thn} \right)^2 \left( 1 + \lambda_n \frac{V_{DD}}{2} \right) + (V_B - V_{thn})^2 (1 + \lambda_n V_{CL1}) \right] \quad (42)$$

The average dynamic-switching power consumption is the average of those associated with the low-to-high and the high-to-low transitions. In the worst-case low-to-high transition, all except one of the inputs are activated; thus, the dynamic-switching power consumption associated with the charging of the parasitic capacitances at the input of the first inverter, the output of the second inverter, and the gate terminals can be written as

$$P_{dLH} = \alpha f V_{DD}^2 C_{out} + \alpha f V_{DD} V_{CL1} C_L + \alpha f V_{DD}^2 [2(n-1)] C \quad (43)$$

where $\alpha$ is the switching activity and $f$ is the frequency of operation. The corresponding value in case of high-to-low transition is

$$P_{dHL} = \alpha f V_{DD}^2 C_{out} + \alpha f V_{DD}^2 (2n) C \quad (44)$$

Thus, the average dynamic-switching power consumption associated with the parasitic capacitances at the previously mentioned nodes is

$$P_{davg} = \frac{P_{dLH} + P_{dHL}}{2} \quad (45)$$

Note that all these capacitances have the same switching activities and the same frequencies of operation. The second type of the dynamic-power consumption is the short-circuit power consumption associated with the two inverters, $P_{sc1avg}$ and $P_{sc2avg}$. Assuming that the two inverters are matched, then the

$$P_{dLHc} = \alpha f V_{DD}^2 [2C(n-1) + nC(n-1)] + \alpha f V_{DD}^2 [C_1 + C_2 + \dots + C_n] = \alpha f V_{DD}^2 [C_{fan} + 3nC + C(n-1)(3n+2)] \quad (48)$$

During the high-to-low transition, all the inputs are activated and thus the associated power consumption is

$$P_{DCHL} = 0 \quad (41)$$

The average dc power consumption is thus

$$P_{scavg1} = \frac{\alpha k_n \left( \frac{W}{L} \right)_n \tau_1 f (V_{DD} - 2V_{thn})^3}{12} \quad (46)$$

where  $\tau_1$  is the rise or fall time of the voltage that feeds the first inverter and is equal to twice the average of the low-to-high and the high-to-low propagation delays at the input of the first inverter.  $P_{scavg2}$  can be written as

$$P_{scavg2} = \frac{\alpha k_n \left( \frac{W}{L} \right)_n \tau_2 f (V_{DD} - 2V_{thn})^3}{12} \quad (47)$$

where  $\tau_2$  is the rise or fall time of the voltage that feeds the second inverter and is equal to twice the average of the low-to-high and the high-to-low propagation delays at the input of the second inverter. Certainly, the short-circuit power consumption is zero in case  $V_{DD}$  is smaller than  $V_{thn} + |V_{thp}|$  as there is no time interval during which both the NMOS and PMOS devices conduct simultaneously [21].

The average dynamic-switching power consumption of the CMOS logic is also taken as the average of that in cases of low-to-high and high-to-low transitions. Adopting the worst case in case of low-to-high transition (illustrated in Subsection C), the power required to charge the internal capacitances at the gate terminals as well as at the internal nodes is

$$P_{dHLC} = \alpha f V_{DD}^2 (n^2 C + 2nC) \quad (49)$$
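To see how the dynamic-switching terms of the two families scale with the number of inputs, the following Python sketch evaluates Eqs. (43) to (45) for the pseudo-PMOS logic and the average of Eqs. (48) and (49) for the CMOS logic. It uses the closed form on the right-hand side of Eq. (48). The function names and every numerical value are hypothetical and serve only as an illustration. The dc and short-circuit components are deliberately left out, so the output is not the total power of either family.

```python
def pseudo_pmos_dynamic_power(alpha, f, v_dd, v_cl1, n, c, c_out, c_l):
    """Average dynamic-switching power of the pseudo-PMOS gate, Eqs. (43)-(45)."""
    p_lh = alpha * f * (
        v_dd**2 * c_out + v_dd * v_cl1 * c_l + v_dd**2 * 2 * (n - 1) * c
    )  # Eq. (43)
    p_hl = alpha * f * (v_dd**2 * c_out + v_dd**2 * 2 * n * c)  # Eq. (44)
    return 0.5 * (p_lh + p_hl)  # Eq. (45)


def cmos_dynamic_power(alpha, f, v_dd, n, c, c_fan):
    """Average of Eqs. (48) and (49) for the conventional CMOS gate."""
    p_lh = alpha * f * v_dd**2 * (c_fan + 3 * n * c + c * (n - 1) * (3 * n + 2))
    p_hl = alpha * f * v_dd**2 * (n**2 * c + 2 * n * c)
    return 0.5 * (p_lh + p_hl)


# Hypothetical operating point and capacitances (illustration only).
common = dict(alpha=0.5, f=10e6, v_dd=0.8, c=1e-15)
for n in (2, 4, 8, 16):
    p_p = pseudo_pmos_dynamic_power(
        n=n, v_cl1=0.74, c_out=10e-15, c_l=5e-15, **common
    )
    p_c = cmos_dynamic_power(n=n, c_fan=10e-15, **common)
    print(f"n = {n:2d}: pseudo-PMOS {p_p * 1e9:8.2f} nW, CMOS {p_c * 1e9:8.2f} nW")
```

The pseudo-PMOS expression grows linearly with $n$, whereas the CMOS expression contains terms in $n^2$, which is why the gap between the two widens for wide fan-in gates.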

### e) The Logic Swing

The logic swing,  $LS$ , at the output node is simply equal to the difference between the output high and low levels. Since  $C_L$  is discharged to 0 V with no contention from the PMOS device,  $LS$  is equal to  $V_{CL1}$ . Refer to Fig. 9 for the plots of  $LS$  versus  $V_{thn}$  according to the analysis and the simulation results (of Fig. 2). Certainly, the logic swing of the CMOS logic is  $V_{DD}$ .



Fig. 9: The change of the logic swing with the threshold voltage of  $M_N$ ,  $V_{thn}$ , according to the analysis and the simulation results.

### f) The Figure of Merit

In order to evaluate the performance of the pseudo-PMOS logic compared to the CMOS logic, we define a figure of merit, $FOM$, that includes the four previously estimated metrics: the area, the average propagation delay, the average power consumption, and the logic swing. Since the first three metrics should be minimized while the logic swing should be maximized, the $FOM$ is defined for the conventional and the proposed logic families as

$$FOM_c = \frac{LS_c}{A_c t_{avgc} P_{avgc}} \quad (50)$$

and

$$FOM_p = \frac{LS_p}{A_p t_{avgp} P_{avgp}} \quad (51)$$

respectively. Thus, the larger the $FOM$, the better the performance. Refer to Figs. 10, 11, and 12 for the plots of the figures of merit of the CMOS logic and the pseudo-PMOS logic (the version of Fig. 2) versus $n$, $f$, and $V_{DD}$, respectively. The parameters adopted with these plots are $n = 8$, $C_{fan} = 10$ fF, and $f = 10$ MHz. As shown in Fig. 10, the pseudo-PMOS logic outperforms the CMOS logic when $n$ exceeds 8. This can be attributed to the degradation of the CMOS logic beyond this value due to the limitations discussed in Section I. The pseudo-PMOS logic degrades more gently, as mentioned in Section III, because its degradation is merely due to the increased number of PMOS transistors and the associated parasitic effects at the output node. The same can be said about Fig. 11, in which the performance of the conventional CMOS logic degrades faster than that of the pseudo-PMOS logic with increasing $f$. This is due to the numerous parasitic capacitances that must be charged and discharged in the conventional CMOS-logic realization. According to Fig. 11, the pseudo-PMOS logic is better than the conventional CMOS logic when $f$ exceeds 15 MHz. According to Fig. 12, the $FOM$ of the pseudo-PMOS logic exhibits an optimum versus $V_{DD}$, occurring at $V_{DD} = 0.7685$ V, because changing $V_{DD}$ has conflicting effects. Specifically, increasing $V_{DD}$ improves the logic swing and the propagation delay at the expense of a larger power consumption. In a nutshell, the pseudo-PMOS logic has a smaller area and average propagation delay but larger power consumption and slightly smaller noise margin compared with the CMOS logic.
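The figure of merit of Eqs. (50) and (51) is straightforward to evaluate once the four metrics are known. The following Python sketch shows one way to do so. The `GateMetrics` container, the function name, and the numbers passed to it are all hypothetical placeholders and are not results from this paper.

```python
from dataclasses import dataclass


@dataclass
class GateMetrics:
    logic_swing: float  # LS, in volts
    area: float         # A, in any consistent unit (e.g. number of transistors)
    avg_delay: float    # t_avg, in seconds
    avg_power: float    # P_avg, in watts


def figure_of_merit(m: GateMetrics) -> float:
    """FOM = LS / (A * t_avg * P_avg), as in Eqs. (50) and (51)."""
    return m.logic_swing / (m.area * m.avg_delay * m.avg_power)


# Placeholder metrics for illustration only (not measured or simulated values).
cmos = GateMetrics(logic_swing=0.8, area=16.0, avg_delay=200e-12, avg_power=2e-6)
pseudo = GateMetrics(logic_swing=0.74, area=13.0, avg_delay=120e-12, avg_power=3e-6)

for name, metrics in (("CMOS", cmos), ("pseudo-PMOS", pseudo)):
    print(f"{name:12s} FOM = {figure_of_merit(metrics):.3e}")
```

Because the $FOM$ is a ratio of quantities with different units, only its relative value between the two families at the same operating point is meaningful, and both must be evaluated with the same units.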



Fig. 10: The plots of  $FOM_c$  and  $FOM_p$  versus the number of the inputs,  $n$ .



Fig. 11: The plots of  $FOM_c$  and  $FOM_p$  versus the frequency of switching,  $f$ .



Fig. 12: The plot of  $FOM_p$  versus the power-supply voltage,  $V_{DD}$ .

## V. EFFECT OF TECHNOLOGY SCALING

In this section, the effect of technology scaling on the pseudo-PMOS logic is investigated. The following effects are investigated: velocity saturation, mobility degradation, reduction of the  $V_{DD}/V_{thn}$  ratio, increased process variations, and channel-length modulation.

### a) Velocity Saturation and Mobility Degradation

These two effects act to reduce the current of the MOS transistor for the same applied voltages. Thus, the time required to develop a certain voltage at the first-inverter input or at the output of the circuit increases. However, this effect is common to both the CMOS logic and the pseudo-PMOS logic. It must be noted that the mobility-degradation effect is more pronounced in NMOS transistors than in PMOS ones. Thus, the sizing of the PMOS transistors is expected to decrease with technology scaling. Since most of the area of the pseudo-PMOS logic is due to the PMOS network, the area advantage becomes more pronounced.

### b) Reduction of the $V_{DD}/V_{thn}$ Ratio

To limit the performance degradation that accompanies the reduction of $V_{DD}$, $V_{thn}$ is also reduced with technology scaling, but at a smaller rate. The ratio $V_{DD}/V_{thn}$ is thus expected to decrease with technology scaling. This reduces the short-circuit power consumption, which enhances the performance of the pseudo-PMOS logic.
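The link between the $V_{DD}/V_{thn}$ ratio and the short-circuit power can be made concrete with Eq. (46). The following Python sketch evaluates that expression for a matched inverter while $V_{DD}$ is lowered at a fixed $V_{thn}$. The function name and all parameter values are hypothetical. The zero returned for $V_{DD} \le 2V_{thn}$ follows the remark in Section IV that no short-circuit current flows when $V_{DD}$ is smaller than $V_{thn} + |V_{thp}|$.

```python
def short_circuit_power(alpha, k_n, w_over_l_n, tau, f, v_dd, v_thn):
    """Average short-circuit power of a matched inverter, Eqs. (46)/(47).

    A matched inverter is assumed, so |V_thp| = V_thn.
    """
    headroom = v_dd - 2.0 * v_thn
    if headroom <= 0.0:
        # Both devices never conduct at the same time: no short-circuit power.
        return 0.0
    return alpha * k_n * w_over_l_n * tau * f * headroom**3 / 12.0


# Hypothetical parameters (illustration only).
params = dict(alpha=0.5, k_n=300e-6, w_over_l_n=2.0, tau=50e-12, f=500e6)
v_thn = 0.3
for v_dd in (1.2, 1.0, 0.8, 0.6):
    p_sc = short_circuit_power(v_dd=v_dd, v_thn=v_thn, **params)
    print(f"V_DD/V_thn = {v_dd / v_thn:4.2f}: P_sc = {p_sc * 1e9:7.2f} nW")
```

The cubic dependence on $V_{DD} - 2V_{thn}$ means that even a modest drop in the ratio removes most of the short-circuit component.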

### c) Increased Process Variations

The effect of the process variations increases with technology scaling. This has the effect of narrowing the range within which the threshold voltage of the first inverter lies for the proper operation of the pseudo-PMOS logic. This seems to be the most important degradation associated with the pseudo-PMOS logic as it reduces its reliability if the version of Fig. 3 were used. This effect, however, has no counterpart in the CMOS logic.

### d) The Channel-Length Modulation Effect

The Early voltage modeling the dependence of the drain current on the drain-to-source voltage is proportional to the channel length [17]. So, it decreases with technology scaling with the result that the drain current increases. This effect is more pronounced in the CMOS logic due to the division of the voltage across the stacked devices which has no counterpart in the pseudo-PMOS logic. Also, the slope of the voltage-transfer characteristics of the inverters in the transition region decreases. So, for a certain difference that is developed at the inverter input, there is a smaller value for the logic swing at the inverter output. This indicates that a smaller propagation delay is associated with the two cascaded inverters.

## VI. EFFECT OF PROCESS VARIATIONS

In this section, the effect of the process variations on the reliability of the pseudo-PMOS logic is investigated quantitatively. Specifically, the variations of the aspect ratio and the threshold voltage of the NMOS and PMOS devices composing the voltage divider are taken into account with their effects on the first inverter's input voltage quantified. The equation describing the voltage at the input of the first inverter was derived in Section IV and is repeated here for convenience as follows:

$$V_{CL1} = \frac{k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|) V_{DD} - \frac{1}{2} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2}{\frac{1}{2} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 \lambda_n + k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|)} \quad (52)$$

Let the variation in the aspect ratio of the NMOS device be  $\Delta(W/L)_n$ , then after substituting  $(W/L)_n$  by  $(W/L)_n + \Delta(W/L)_n$  into Eq. (52), neglecting the terms containing  $(\Delta(W/L)_n)^2$ , using the approximation

$$\frac{1}{1+x} \approx 1-x \text{ for } x \ll 1 \quad (53)$$

and performing some algebraic manipulations, we get the percentage variation of the voltage, $V_{CL1}$, due to the change of $(W/L)_n$ (let it be $\Delta V_{CL11}/V_{CL1}$) as shown in Eq. (54).

The percentage variations of $V_{CL1}$ due to each of $\Delta V_{thn}$, $\Delta(W/L)_p$, and $\Delta V_{thp}$ (denoted by $\Delta V_{CL12}/V_{CL1}$, $\Delta V_{CL13}/V_{CL1}$, and $\Delta V_{CL14}/V_{CL1}$, respectively) can be evaluated in a similar manner and are given by Eqs. (55), (56), and (57).

Refer to Figs. 13, 14, 15, and 16 for the plots of the percentage variations of $V_{CL1}$ due to those in $(W/L)_n$, $V_{thn}$, $(W/L)_p$, and $V_{thp}$, respectively. The plots show that the variations in the threshold voltages of $M_P$ and $M_N$ have the largest effect on $V_{CL1}$. If these variations cannot be tolerated, the two inverters must be dispensed with and the scheme of Fig. 2 adopted instead.



$$\frac{\Delta V_{CL11}}{V_{CL1}} = -\Delta \left( \frac{W}{L} \right)_n \left[ \frac{\frac{1}{2} k_n (V_B - V_{thn})^2}{k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|) V_{DD} - \frac{1}{2} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2} + \frac{\frac{1}{2} k_n (V_B - V_{thn})^2 \lambda_n}{\frac{1}{2} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 \lambda_n + k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|)} \right] \quad (54)$$

$$\frac{\Delta V_{CL12}}{V_{CL1}} = \Delta V_{thn} \left[ \frac{k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})}{k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|) V_{DD} - \frac{1}{2} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2} + \frac{k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn}) \lambda_n}{\frac{1}{2} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 \lambda_n + k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|)} \right] \quad (55)$$

$$\frac{\Delta V_{CL13}}{V_{CL1}} = \Delta \left( \frac{W}{L} \right)_p \left[ \frac{k_p (V_{DD} - |V_{thp}|) V_{DD}}{k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|) V_{DD} - \frac{1}{2} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2} - \frac{k_p (V_{DD} - |V_{thp}|)}{\frac{1}{2} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 \lambda_n + k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|)} \right] \quad (56)$$

$$\frac{\Delta V_{CL14}}{V_{CL1}} = |\Delta V_{thp}| \left[ \frac{-k_p \left( \frac{W}{L} \right)_p V_{DD}}{k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|) V_{DD} - \frac{1}{2} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2} + \frac{k_p \left( \frac{W}{L} \right)_p}{\frac{1}{2} k_n \left( \frac{W}{L} \right)_n (V_B - V_{thn})^2 \lambda_n + k_p \left( \frac{W}{L} \right)_p (V_{DD} - |V_{thp}|)} \right] \quad (57)$$

Fig. 13: The percentage variation of the voltage, $V_{CL1}$, versus that of the aspect ratio of $M_N$.

Fig. 14: The percentage variation of the voltage, $V_{CL1}$, versus that of the threshold voltage of $M_N$.

Fig. 15: The percentage variation of the voltage, $V_{CL1}$, versus that of the aspect ratio of $M_P$.

Fig. 16: The percentage variation of the voltage, $V_{CL1}$, versus that of the threshold voltage of $M_P$.
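The first-order sensitivities of this section can be cross-checked numerically. The following Python sketch evaluates $V_{CL1}$ from Eq. (52), computes the relative variation due to a shift in $V_{thn}$ from Eq. (55), and compares it with a direct perturbation of Eq. (52). The helper names are hypothetical. The values of $V_{DD}$, $V_B$, and $V_{thn}$ follow Section VII, while $k_n$, $k_p$, the aspect ratios, $|V_{thp}|$, and $\lambda_n$ are assumed values chosen only for illustration.

```python
def v_cl1(kp, wl_p, kn, wl_n, v_dd, v_thp_abs, v_b, v_thn, lam_n):
    """Voltage at the input of the first inverter, Eq. (52)."""
    a = 0.5 * kn * wl_n * (v_b - v_thn) ** 2   # saturation-current factor of M_N
    b = kp * wl_p * (v_dd - v_thp_abs)         # conductance factor of the PMOS device
    return (b * v_dd - a) / (a * lam_n + b)


def rel_variation_vthn(p, d_vthn):
    """First-order relative change of V_CL1 for a shift d_vthn, Eq. (55)."""
    g = p["kn"] * p["wl_n"] * (p["v_b"] - p["v_thn"])
    a = 0.5 * g * (p["v_b"] - p["v_thn"])
    b = p["kp"] * p["wl_p"] * (p["v_dd"] - p["v_thp_abs"])
    numerator = b * p["v_dd"] - a        # numerator of Eq. (52)
    denominator = a * p["lam_n"] + b     # denominator of Eq. (52)
    return d_vthn * (g / numerator + g * p["lam_n"] / denominator)


def rel_variation_numeric(p, name, delta):
    """Relative change of V_CL1 obtained by perturbing one parameter directly."""
    shifted = dict(p, **{name: p[name] + delta})
    return (v_cl1(**shifted) - v_cl1(**p)) / v_cl1(**p)


# V_DD, V_B, and V_thn as in Section VII; the remaining values are assumptions.
PARAMS = dict(kp=100e-6, wl_p=4.0, kn=300e-6, wl_n=1.0,
              v_dd=0.8, v_thp_abs=0.3, v_b=0.8, v_thn=0.7, lam_n=0.1)

for d in (1e-4, 1e-3, 1e-2):
    analytic = rel_variation_vthn(PARAMS, d)
    numeric = rel_variation_numeric(PARAMS, "v_thn", d)
    print(f"dV_thn = {d:.0e} V: Eq. (55) {analytic:.3e}, direct {numeric:.3e}")
```

The two results agree closely for small shifts and drift apart as the shift becomes comparable to the overdrive voltage $V_B - V_{thn}$, which marks the limit of the linearization based on Eq. (53).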

## VII. SIMULATION RESULTS

In this section, the pseudo-PMOS logic (Fig. 2) is verified by simulation adopting the 45 nm CMOS technology with $V_{DD} = 0.8$ V [22]. Minimum-sized devices and a frequency of operation equal to 500 MHz are assumed. As a compromise between the enhancement of the logic swing and the degradation in the high-to-low propagation delay with increasing $V_{thn}$, $M_N$ is chosen with a threshold voltage equal to 0.7 V and biased by $V_B = V_{DD}$. In evaluating the low-to-high and the high-to-low propagation delays, the 50% criterion is adopted. The worst-case scenarios are also adopted.

Refer to Figs. 17, 18, and 19 for the low-to-high, the high-to-low, and the average propagation delays, respectively, versus the number of the inputs,  $n$ , for the conventional and pseudo-PMOS logic families. The low-to-high propagation delay according to the pseudo-PMOS logic is found to be smaller than that of the conventional one for all values of  $n$ . The superiority in performance of the pseudo-PMOS logic during the low-to-high transition is attributed to the need to charge all the internal capacitances of the pull-down network in the conventional CMOS stack. Although the contention current of  $M_N$  slows down the charging of  $C_L$  in the pseudo-PMOS logic, it does not affect the performance considerably.

On the other hand, the high-to-low transition of the pseudo-PMOS logic is faster than that of the conventional CMOS logic when  $n$  exceeds 4. The average propagation delay of the pseudo-PMOS logic is smaller than that of the conventional CMOS logic when  $n$  exceeds 3. Finally, note that the degradation in the logic swing compared to the conventional CMOS logic is approximately 63 mV, i.e. only 7.8%, in the worst case.



Fig. 17: The plots of the low-to-high propagation delays according to the conventional CMOS logic and the pseudo-PMOS logic versus the number of the inputs.



Fig. 18: The plots of the high-to-low propagation delays according to the conventional CMOS logic and the pseudo-PMOS logic versus the number of the inputs.



Fig. 19: The plots of the average propagation delays according to the conventional CMOS logic and the pseudo-PMOS logic versus the number of the inputs.



Fig. 20: The plots of the average power consumption according to the conventional CMOS logic and the pseudo-PMOS logic versus the number of the inputs.



Fig. 21: The plots of the average power-delay products according to the conventional CMOS logic and the pseudo-PMOS logic versus the number of the inputs.



Fig. 22: The plots of the average energy-delay products according to the conventional CMOS logic and the pseudo-PMOS logic versus the number of the inputs.

The average power consumption, the average power-delay products (PDPs), and the average energy-delay products (EDPs) are plotted versus $n$ in Figs. 20, 21, and 22, respectively, for the conventional and proposed logic families. The average power consumption of the CMOS logic rises with $n$ at a faster rate compared to that of the pseudo-PMOS logic due to the need to charge the internal capacitances of the CMOS logic circuit. The PDP and the EDP of the pseudo-PMOS logic, however, are smaller than their counterparts of the CMOS logic when $n$ exceeds 6 and 5, respectively.
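For reference, the two composite metrics plotted in Figs. 21 and 22 can be computed from the average power and the average delay as sketched below in Python. The text does not spell out the definitions, so the usual ones are assumed here: the PDP is the product of the average power and the average delay, and the EDP is the PDP multiplied by the average delay once more. The function names and the sample numbers are hypothetical.

```python
def power_delay_product(avg_power, avg_delay):
    """PDP = P_avg * t_avg (joules), the usual definition assumed here."""
    return avg_power * avg_delay


def energy_delay_product(avg_power, avg_delay):
    """EDP = PDP * t_avg (joule-seconds), the usual definition assumed here."""
    return power_delay_product(avg_power, avg_delay) * avg_delay


# Placeholder operating point for illustration only (not a simulated result).
avg_power = 5e-6    # W
avg_delay = 150e-12  # s
print(f"PDP = {power_delay_product(avg_power, avg_delay):.3e} J")
print(f"EDP = {energy_delay_product(avg_power, avg_delay):.3e} J*s")
```

Since the EDP weights the delay twice, a family with a delay advantage crosses over at a smaller $n$ in the EDP than in the PDP, in line with the thresholds of 5 and 6 reported above.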

## VIII. CONCLUSIONS

The pseudo-PMOS logic family was adopted for realizing wide fan-in CMOS circuits containing long stacks of NMOS transistors. The area, propagation delay, power consumption, power-delay and energy-delay products of this family were compared with those of the conventional CMOS logic. The pseudo-PMOS logic showed superior performance from the points of view of the average propagation delay, power-delay product, and energy-delay product when the number of the inputs exceeds 3, 6, and 5, respectively. In fact, the pseudo-PMOS logic had a smaller area and average propagation delay but larger average power consumption and slightly smaller noise margin (by about 7.8% in the worst case compared to the conventional CMOS logic). According to the estimation performed in this paper using a proper figure of merit, the pseudo-PMOS logic is better than the CMOS logic when the number of the inputs exceeds 8.

## REFERENCES

1. K. Martin, *Digital Integrated Circuit Design*, Oxford University Press, New York, 2000.
2. M. M. Khellah and M. I. Elmasry, "Use of Charge Sharing to Reduce Energy Consumption in Wide Fan-In Gates," Proceedings of the IEEE International Symposium on Circuits and Systems, 31 May - 3 Jun. 1998.
3. S. C. Prasad and K. Roy, "Transistor Reordering for Power Minimization under Delay Constraint," ACM Transactions on Design Automation Elect. Syst., Vol. 1, No. 2, Pages: 280 - 300, Apr. 1996.
4. E. Musoll and J. Cortadella, "Optimizing CMOS Circuits for Low Power Using Transistor Reordering," Proceedings of European Design and Test Conference, Pages: 219 - 223, 1996.
5. C. Tan and J. Allen, "Minimization of Power in VLSI Circuits Using Transistor Sizing, Input Ordering, and Statistical Power Estimation," Proceedings of International Workshop Low-Power Design, Pages: 75 – 80, 1994.
6. W. H. Chiu and H. R. Lin, "A Conditional Isolation Technique for Low-Energy and High-Performance Wide Domino Gates," IEEE Region 10 Conference, 30 Oct. - 2 Nov. 2007.
7. H. Mostafa, M. Anis, and M. Elmasry, "Novel Timing Yield Improvement Circuits for High-Performance Low-Power Wide Fan-In Dynamic OR Gates," IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 58, Issue: 8, Pages: 1785 – 1797, Aug. 2011.
8. K. Mohanram and N. A. Touba, "Lowering Power Consumption in Concurrent Checkers via Input Ordering," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 12, Issue: 11, Pages: 1234 - 1243, Nov. 2004.
9. A. A. George and A. R. Sankar, "Current Comparison Based High Speed Domino Circuits," National Conference on Science, Engineering, and Technology (NCSET), Vol. 4, Issue: 6, Pages: 102 – 105, 2016.
10. F. Moradi, D. T. Wisland, H. Mahmoodi, and T. V. Cao, "High Speed and Leakage-Tolerant Domino Circuits for High Fan-in Applications in 70nm CMOS Technology," Proceedings of the 7<sup>th</sup> International Caribbean Conference on Devices, Circuits, and Systems, Mexico, 28 - 30 Apr., 2008.
11. K. Rajasri, M. Manikandan, A. J. Dhanaseely, and M. Nishanthi, "Low Leakage High Speed Domino Circuit For Wide Fan-in Equality Comparator," International Journal of Advanced Research in Electronics and Communication Engineering (IJARECE), Vol. 4, Issue: 3, Mar. 2015.
12. Anamika and G. N. Chiranjeevi, "Low Power Wide Fan In Domino OR Logics," International Journal of Current Engineering and Scientific Research, (IJCESR), Vol. 3, Issue: 7, Pages: 39 – 43, 2016.
13. X. Kavousianos and D. Nikolos, "Novel Single and Double Output TSC Berger Code Checkers," Proceedings of VLSI Test Symposium, Pages: 348 – 353, 1998.
14. C. Metra, M. Favalli, and B. Ricco, "Tree Checkers for Applications with Low Power-Delay Requirements," Proceedings of International Symposium on Defect and Fault Tolerance VLSI Systems, Pages: 213 – 220, 1996.
15. B. Razavi, *Design of Analog CMOS Integrated Circuits*, Second Edition, McGraw-Hill, New York, 2016.
16. H. Shichman and D. Hodges, "Modeling and Simulation of Insulated-Gate Field-Effect Transistor Switching Circuits," IEEE Journal of Solid-State Circuits, Vol. SC-3, No. 3, Pages: 285 - 289, Sep. 1968.
17. A. S. Sedra and K. C. Smith, *Microelectronic Circuits*, Seventh Edition, Oxford University Press, New York, 2015.
18. N. H. E. Weste and D. M. Harris, *CMOS VLSI Design: A Circuits and Systems Perspective*, Fourth Edition, Addison-Wesley, Massachusetts, USA, 2011.
19. D. A. Hodges, H. G. Jackson, and R. A. Saleh, *Analysis and Design of Digital Integrated Circuits: In Deep Submicron Technology*, Third Edition, McGraw Hill, Singapore, 2004.
20. W. C. Elmore, "The Transient Response of Damped Linear Networks with Particular Regard to Wideband Amplifiers," Journal of Applied Physics, Vol. 19, Pages: 55 – 63, Jan. 1948.
21. J. E. Ayers, *Digital Integrated Circuits: Analysis and Design*, CRC Press, Boca Raton, USA, 2005.
22. Predictive Technology Model (PTM), <http://ptm.asu.edu>.